Partition Concurrency Control

This section applies only to MRS 3.3.0 or later.

Each write task checks for write conflicts by using the modified-partition information stored in commits that are still in the inflight state. Concurrent partition write is implemented on this basis.
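The conflict check described above can be pictured as a set-overlap test on partition paths. The following is a conceptual sketch only, not Hudi's internal implementation: the names InflightCommit and PartitionConflict, and the representation of partitions as plain strings, are assumptions made for illustration.

```scala
// Conceptual sketch: NOT Hudi's internal API. All names here are hypothetical.
// Assumption: each inflight commit records the set of partitions it modifies.
final case class InflightCommit(instantTime: String, partitions: Set[String])

object PartitionConflict {
  /** Returns the inflight commits whose modified partitions overlap
    * the partitions that the current writer intends to modify. */
  def conflictsWith(
      writerPartitions: Set[String],
      inflight: Seq[InflightCommit]): Seq[InflightCommit] =
    inflight.filter(commit => commit.partitions.intersect(writerPartitions).nonEmpty)
}
```

Under this model, a writer that touches only partitions absent from every inflight commit gets an empty result and can proceed, while a writer that shares at least one partition with an inflight commit is reported as conflicting with that commit.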

Locking during concurrent writes is implemented with ZooKeeper locks. You do not need to configure any additional parameters.

Precautions

Concurrent write control for partitions is built on concurrent write control for a single table, so largely the same constraints apply.

Currently, concurrent partition write is supported only in Spark.

To prevent a large number of concurrent requests from consuming too many ZooKeeper resources, Hudi applies a quota limit to its use of ZooKeeper. To adjust the quota, modify the zk.quota.number parameter of Spark on the server. The default value is 500000, and the minimum value is 5. This parameter does not control the number of concurrent tasks. It only limits the access pressure on ZooKeeper.

Using Partition Concurrency

Set hoodie.support.partition.lock to true to enable concurrent partition write.

Example:

Enable concurrent partition write in Spark datasource mode:

// upsert_data is an existing DataFrame that holds the rows to upsert.
upsert_data.write.format("hudi")
  .option("hoodie.datasource.write.table.type", "COPY_ON_WRITE")
  .option("hoodie.datasource.write.precombine.field", "col2")
  .option("hoodie.datasource.write.recordkey.field", "primary_key")
  .option("hoodie.datasource.write.partitionpath.field", "col0")
  .option("hoodie.upsert.shuffle.parallelism", 4)
  .option("hoodie.datasource.write.hive_style_partitioning", "true")
  // Enable concurrent partition write.
  .option("hoodie.support.partition.lock", "true")
  .option("hoodie.table.name", "tb_test_cow")
  .mode("Append")
  .save("/tmp/huditest/tb_test_cow")
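When several jobs write to the same table, it may help to build the write options in one place so that every job enables the partition lock consistently. The following is a minimal sketch under that assumption: the object HudiPartitionWriteOptions and its build function are hypothetical helpers, and only the option keys are taken from the example above.

```scala
// Hypothetical helper for illustration; only the option keys come from the
// datasource example above.
object HudiPartitionWriteOptions {
  def build(
      tableName: String,
      recordKeyField: String,
      precombineField: String,
      partitionField: String,
      partitionLock: Boolean = true): Map[String, String] =
    Map(
      "hoodie.table.name" -> tableName,
      "hoodie.datasource.write.table.type" -> "COPY_ON_WRITE",
      "hoodie.datasource.write.recordkey.field" -> recordKeyField,
      "hoodie.datasource.write.precombine.field" -> precombineField,
      "hoodie.datasource.write.partitionpath.field" -> partitionField,
      "hoodie.datasource.write.hive_style_partitioning" -> "true",
      // Enables concurrent partition write.
      "hoodie.support.partition.lock" -> partitionLock.toString
    )
}

// Usage (requires Spark and Hudi on the classpath; upsert_data is a DataFrame):
// upsert_data.write.format("hudi")
//   .options(HudiPartitionWriteOptions.build("tb_test_cow", "primary_key", "col2", "col0"))
//   .mode("Append")
//   .save("/tmp/huditest/tb_test_cow")
```

The map is passed through DataFrameWriter.options, which accepts a Map[String, String], so each job applies the same settings without repeating the individual option calls.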

Enable concurrent partition write in Spark SQL mode:

set hoodie.support.partition.lock=true;
insert into hudi_table1 select 1,1,1;

