Operating a Hudi Table Using spark-sql

This section applies only to MRS 3.5.0-LTS and later versions.

This section describes how to use Hudi through spark-sql.

You have created a user and added the user to user groups hadoop (primary group) and hive on Manager.

Download and install the Hudi client. For details, see Installing a Client.

Hudi is currently integrated into Spark, so you only need to download the Spark client from Manager. The following steps use /opt/client as an example client installation directory.
Log in to the node where the client is installed as user root and run the following command to go to the client installation directory:

cd /opt/client
Run the following commands to load environment variables:

source bigdata_env

source Hudi/component_env

kinit Created user
- Change the password of the created user first, and then run the kinit command again to log in.
- In normal mode (Kerberos authentication disabled), you do not need to run the kinit command.
- If multiple services are installed, run source bigdata_env first, then source Spark/component_env, and then source Hudi/component_env.
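The environment setup above can be collected into one shell function. This is a minimal sketch, not part of the client: the function name load_hudi_env and its two arguments are hypothetical, it assumes the directory layout described in this section (bigdata_env, Spark/component_env, and Hudi/component_env under the client installation directory), and it uses `.` as the portable spelling of source.

```shell
# Hypothetical helper: load the client environment in the order described above.
#   $1  client installation directory (defaults to the example /opt/client)
#   $2  user created on Manager; leave empty in normal mode to skip kinit
load_hudi_env() {
  client_dir="${1:-/opt/client}"
  krb_user="${2:-}"

  cd "$client_dir" || return 1

  # Base environment variables first.
  . ./bigdata_env || return 1

  # If the Spark component environment is present, load it before Hudi's.
  if [ -f Spark/component_env ]; then
    . ./Spark/component_env || return 1
  fi
  . ./Hudi/component_env || return 1

  # Security mode only: authenticate as the created user (prompts for a password).
  if [ -n "$krb_user" ]; then
    kinit "$krb_user" || return 1
  fi
}
```

For example, `load_hudi_env /opt/client` suits normal mode, while passing the created user's name as a second argument also runs kinit. Define and call the function in the same shell session that will start spark-sql, because the environment variables apply only to that session.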
Run the spark-sql command to start spark-sql, and then run the following SQL statements:
- Create a Hudi table.
  create table if not exists hudi_table2 (id int,name string,price double) using hudi options (type = 'cow',primaryKey = 'id',preCombineField = 'price');
- Insert data.
  insert into hudi_table2 select 1,1,1;
  
  insert into hudi_table2 select 2,1,1;
- Update data.
  update hudi_table2 set name=3 where id=1;
- Delete data.
  delete from hudi_table2 where id=2;
- Query data.
  select * from hudi_table2;
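The statements above can also be run non-interactively. The following is a minimal sketch that writes them to a file and passes the file to spark-sql with its -f option. The function names and the default path /tmp/hudi_demo.sql are hypothetical, and the sketch assumes that the environment variables have already been loaded in the current shell so that spark-sql is on the PATH.

```shell
# Hypothetical helper: write the statements from this section to the file given as $1.
write_hudi_demo_sql() {
  cat > "$1" <<'EOF'
create table if not exists hudi_table2 (id int,name string,price double) using hudi options (type = 'cow',primaryKey = 'id',preCombineField = 'price');
insert into hudi_table2 select 1,1,1;
insert into hudi_table2 select 2,1,1;
update hudi_table2 set name=3 where id=1;
delete from hudi_table2 where id=2;
select * from hudi_table2;
EOF
}

# Hypothetical helper: generate the file and run it in one spark-sql invocation.
run_hudi_demo() {
  sql_file="${1:-/tmp/hudi_demo.sql}"
  write_hudi_demo_sql "$sql_file" || return 1
  spark-sql -f "$sql_file"
}
```

Calling `run_hudi_demo` on a table that starts empty should end with a query result containing only the record whose id is 1, with name updated to 3, because the record with id 2 is deleted before the query runs.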