SparkSQL table creation parameter specifications
Rules
- When creating a table, you must specify primaryKey and preCombineField.
Hudi tables support data updates and idempotent writes. Both capabilities rely on a primary key in each record to identify duplicate data and update operations. If primaryKey is not specified, the table loses its data update capability. If preCombineField is not specified, duplicate primary keys will occur.
Parameter | Description | Value | Remarks |
---|---|---|---|
primaryKey | Primary key of the Hudi table. | On demand | Mandatory. It can be a composite primary key but must be globally unique. |
preCombineField | Pre-combine field. Multiple records with the same primary key are merged based on this field. | On demand | Mandatory. Records with the same primary key are merged by this field. Only one field can be specified. |
- Do not set hoodie.datasource.hive_sync.enable to false during table creation.
If this parameter is set to false, newly written partitions are not synchronized to Hive Metastore. Query engines then miss data when reading the table, because the information about the newly written partitions is absent.
- Do not set the Hudi index type to INMEMORY.
This index type is intended for testing only. Using it in a production environment causes duplicate data.
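The rules above can be checked mechanically before a CREATE TABLE statement is submitted. The following is a minimal sketch, not part of Hudi: `check_table_options` is a hypothetical helper, and it assumes that the options are available as a plain dictionary of strings and that the index type is set through the `hoodie.index.type` key.

```python
from typing import Dict, List


def check_table_options(options: Dict[str, str]) -> List[str]:
    """Return a list of rule violations for a set of Hudi table options.

    Hypothetical helper for illustration only; it mirrors the rules in
    this section and does not call any Hudi or Spark API.
    """
    problems: List[str] = []

    # Rule 1: primaryKey and preCombineField are both mandatory.
    for required in ("primaryKey", "preCombineField"):
        if not options.get(required):
            problems.append(f"{required} must be specified")

    # preCombineField accepts a single field only.
    if "," in options.get("preCombineField", ""):
        problems.append("preCombineField accepts only one field")

    # Rule 2: Hive sync must not be disabled.
    if options.get("hoodie.datasource.hive_sync.enable", "").lower() == "false":
        problems.append("hoodie.datasource.hive_sync.enable must not be false")

    # Rule 3: the INMEMORY index is for testing only.
    # Assumption: the index type is configured via hoodie.index.type.
    if options.get("hoodie.index.type", "").upper() == "INMEMORY":
        problems.append("index type INMEMORY must not be used")

    return problems


good = {"type": "mor", "primaryKey": "id", "preCombineField": "comb"}
bad = {
    "primaryKey": "id",
    "hoodie.datasource.hive_sync.enable": "false",
    "hoodie.index.type": "inmemory",
}

print(check_table_options(good))
print(check_table_options(bad))
```

An empty list means the options satisfy all three rules. Each string in a non-empty list names one violated rule.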
Table creation example
create table data_partition(id int, comb int, col0 int, yy int, mm int, dd int)
using hudi                          -- Specify the Hudi data source.
partitioned by (yy, mm, dd)         -- Specify the partitions. Multi-level partitioning is supported.
location '/opt/log/data_partition'  -- Specify the path. If no path is specified, the table is created in the Hive warehouse.
options(
  type = 'mor',             -- Table type: mor or cow.
  primaryKey = 'id',        -- Primary key. It can be a composite primary key but must be globally unique.
  preCombineField = 'comb'  -- Pre-combine field. Records with the same primary key are merged by this field. Currently, only one field can be specified.
)
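To illustrate what primaryKey='id' and preCombineField='comb' mean for the example table, the sketch below simulates the merge in plain Python. It is a conceptual model and not Hudi code: `precombine` is a hypothetical function, the sample rows are invented for illustration, and the tie-breaking rule (the later-arriving record wins) is an assumption of this sketch.

```python
from typing import Dict, Iterable, List

Record = Dict[str, int]


def precombine(
    records: Iterable[Record],
    primary_key: str = "id",
    precombine_field: str = "comb",
) -> List[Record]:
    """Keep one record per primary key: the one with the largest
    pre-combine value. Conceptual sketch only."""
    latest: Dict[int, Record] = {}
    for rec in records:
        key = rec[primary_key]
        current = latest.get(key)
        # Sketch assumption: on equal pre-combine values,
        # the later-arriving record replaces the earlier one.
        if current is None or rec[precombine_field] >= current[precombine_field]:
            latest[key] = rec
    return list(latest.values())


# Invented sample batch: three versions of id=1 and one record for id=2.
batch = [
    {"id": 1, "comb": 10, "col0": 100},
    {"id": 1, "comb": 12, "col0": 101},
    {"id": 2, "comb": 5, "col0": 200},
    {"id": 1, "comb": 11, "col0": 102},
]

merged = precombine(batch)
for row in merged:
    print(row)
```

Only the id=1 record with comb=12 survives, even though a record with comb=11 arrives later. This is why the pre-combine field should be a value that increases with each update, such as a version number or an event timestamp.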