Performing Operations on a Hudi Table Using Spark SQL
This section applies only to MRS 3.5.0-LTS or later.
Scenarios
This section describes how to perform operations on a Hudi table using Spark SQL.
Prerequisites
You have created a user and added the user to user groups hadoop (primary group) and hive on FusionInsight Manager.
Procedure
- Download and install the Hudi client. For details, see Installing a Client.
Currently, Hudi is integrated into Spark, so you only need to download the Spark client from FusionInsight Manager. In this example, the client installation directory is /opt/client.
- Log in to the node where the client is installed as user root and run the following command:
cd /opt/client
- Run the following commands to load environment variables:
source bigdata_env
source Hudi/component_env
kinit <created user>
- You need to change the password of the created user upon the first login, and then run the kinit command again to log in.
- In normal mode (Kerberos authentication disabled), you do not need to run the kinit command.
- If multiple services are installed, run source Spark/component_env and then source Hudi/component_env after running source bigdata_env.
- Start Spark SQL.
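Once the Spark SQL session is running, you can operate on a Hudi table directly with SQL statements. The following is a minimal sketch using standard Hudi Spark SQL syntax; the table name (hudi_demo), columns, and table properties are illustrative assumptions, not values from this document:

```sql
-- Create a copy-on-write (cow) Hudi table; the primary key and
-- pre-combine field names below are example choices.
CREATE TABLE IF NOT EXISTS hudi_demo (
  id INT,
  name STRING,
  price DOUBLE,
  ts BIGINT
) USING hudi
TBLPROPERTIES (
  type = 'cow',
  primaryKey = 'id',
  preCombineField = 'ts'
);

-- Write a record; Hudi upserts on the primary key.
INSERT INTO hudi_demo VALUES (1, 'a1', 20.0, 1000);

-- Read it back.
SELECT id, name, price FROM hudi_demo;
```

Setting preCombineField tells Hudi which column decides the winner when two writes carry the same primary key, which is why a timestamp-like column is a common choice.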