What Should I Do If Checkpointing Is Slow in RocksDBStateBackend Mode When the Data Volume Is Large
Question
What should I do if checkpointing is slow in RocksDBStateBackend mode when the data volume is large?
Cause Analysis
A customized window is used and the window state is ListState, with many values stored under the same key. Each new value triggers a RocksDB merge operation, and when the window calculation fires, all values under the key are read.
- The resulting RocksDB access pattern is merge() > merge() ... > merge() > read(). Reading data in this pattern is time-consuming, as shown in Figure 1.
- The source operator sends a large amount of data in a short period of time, and all the data has the same key. The window operator cannot process the data fast enough, so barriers accumulate in the buffer. Snapshot preparation takes too long, and the window operator cannot report snapshot completion to the CheckpointCoordinator within the specified time. Therefore, the CheckpointCoordinator concludes that the snapshot has failed, as shown in Figure 2.
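The cost asymmetry between merge() and read() can be illustrated with a small simulation. The sketch below does not use the Flink or RocksDB APIs; it is a hypothetical model in which each merge() cheaply appends an encoded value under a key, while read() must decode every accumulated value at once, so read cost grows with the number of preceding merges.

```java
import java.util.*;

// Hypothetical model of RocksDB's merge-operator behavior for ListState:
// writes append without decoding; a read decodes the whole list.
public class MergeReadCost {
    private final Map<String, StringBuilder> store = new HashMap<>();

    // Append-only write, analogous to RocksDB's merge operator:
    // cheap because nothing already stored is decoded.
    public void merge(String key, String value) {
        store.computeIfAbsent(key, k -> new StringBuilder())
             .append(value).append(',');
    }

    // The read decodes every value accumulated under the key,
    // so its cost is proportional to the number of merges.
    public List<String> read(String key) {
        StringBuilder sb = store.get(key);
        if (sb == null) return Collections.emptyList();
        return Arrays.asList(sb.toString().split(","));
    }

    public static void main(String[] args) {
        MergeReadCost state = new MergeReadCost();
        for (int i = 0; i < 100_000; i++) {
            state.merge("hotKey", "v" + i);   // many values, one key
        }
        // A single read materializes all 100,000 values at once.
        System.out.println(state.read("hotKey").size()); // prints 100000
    }
}
```

This mirrors the failure mode above: writes stay fast even as the hot key grows, so the problem only surfaces when the window fires and the full list must be read.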
Answer
This problem is caused by a defect in RocksDB, a third-party package that Flink depends on. You are advised to switch the checkpoint state backend to FsStateBackend.
Set the state backend to FsStateBackend in the application code as follows:
env.setStateBackend(new FsStateBackend("hdfs://hacluster/flink/checkpoint/"));
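For context, the line above can be placed in a job's main method as sketched below. This is a minimal sketch, not a complete job: the checkpoint interval is an example value, and the HDFS URI is the one from this document; adjust both for your cluster.

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FsStateBackendJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Enable checkpointing (interval in ms; 60_000 is an example value).
        env.enableCheckpointing(60_000);

        // Keep working state on the TaskManager heap and write snapshots
        // to HDFS, avoiding the RocksDB merge()/read() pattern described
        // in the cause analysis.
        env.setStateBackend(
                new FsStateBackend("hdfs://hacluster/flink/checkpoint/"));

        // ... define sources, windows, and sinks, then call env.execute(...)
    }
}
```

Note that FsStateBackend keeps all working state in TaskManager memory, so ensure the heap is sized for the expected state volume.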