Configuration Rules
Flink Job Parameter Configuration Specifications
The following table describes the rules for configuring Flink job parameters.
Parameter | Mandatory | Description | Recommended Value |
---|---|---|---|
-c | Yes | Main class name | Set this parameter as needed. |
-ynm | Yes | Flink YARN job name | Set this parameter as needed. |
execution.checkpointing.interval | Yes | Interval for triggering checkpoints, in ms. Can be added using -yD. | 60000 |
execution.checkpointing.timeout | Yes | Checkpoint timeout. Can be added using -yD. The default value is 30 minutes. | 30min |
parallelism.default | No | Job parallelism. Can be added using -yD, for example, to increase the parallelism available to join operators. The default value is 1. | Set this parameter based on site requirements. |
table.exec.state.ttl | Yes | TTL of Flink state (for example, join state). Can be added using -yD. The default value is 0. | Set this parameter based on site requirements. |
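As an illustration of how the parameters in the table fit together on one command line, the following is a minimal sketch that assumes the job is submitted to YARN with `flink run -m yarn-cluster`. The class name, job name, JAR path, parallelism, and state TTL value are hypothetical placeholders, not recommendations. The script only prints the command by default, so it can be reviewed before anything is submitted.

```shell
#!/bin/sh
# Hypothetical placeholders; replace them with your own values.
MAIN_CLASS="com.example.MyFlinkJob"
JOB_NAME="my-flink-job"
JOB_JAR="/path/to/my-flink-job.jar"

# With DRY_RUN=1 (the default in this sketch) the command is printed, not run.
DRY_RUN="${DRY_RUN:-1}"

submit_job() {
  # Assemble the command as positional parameters so quoting is preserved.
  # Assumption: YARN submission via "-m yarn-cluster".
  # parallelism.default=2 and table.exec.state.ttl=86400000 (ms) are
  # illustrative values only; set them based on site requirements.
  set -- flink run -m yarn-cluster \
    -ynm "$JOB_NAME" \
    -yD execution.checkpointing.interval=60000 \
    -yD execution.checkpointing.timeout=30min \
    -yD parallelism.default=2 \
    -yD table.exec.state.ttl=86400000 \
    -c "$MAIN_CLASS" \
    "$JOB_JAR"

  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

submit_job
```

Running the script with `DRY_RUN=0` would execute the assembled command instead of printing it, which requires a configured Flink client.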
Checkpoint Interval Should Be Longer Than the Checkpoint Execution Duration
The checkpoint execution duration depends on checkpoint data volume. The larger the data volume, the longer the execution duration.
Checkpoint Timeout Duration Should Be Longer Than the Checkpoint Interval
The checkpoint interval is the time between two checkpoint triggers, and the checkpoint timeout is the maximum time a single checkpoint is allowed to take. If a checkpoint runs longer than the checkpoint timeout, the job fails.
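The two rules above amount to a simple ordering: execution duration < checkpoint interval < checkpoint timeout. The following sketch checks that ordering before a job is submitted. The function name `check_checkpoint_settings` is hypothetical, all values are in milliseconds, and the execution duration is assumed to be a figure you have observed for your own job (20000 ms here is only an example).

```shell
#!/bin/sh
# Usage: check_checkpoint_settings <duration_ms> <interval_ms> <timeout_ms>
# Prints "ok" and returns 0 when duration < interval < timeout.
check_checkpoint_settings() {
  duration_ms=$1
  interval_ms=$2
  timeout_ms=$3

  if [ "$interval_ms" -le "$duration_ms" ]; then
    echo "interval (${interval_ms} ms) must exceed execution duration (${duration_ms} ms)" >&2
    return 1
  fi
  if [ "$timeout_ms" -le "$interval_ms" ]; then
    echo "timeout (${timeout_ms} ms) must exceed interval (${interval_ms} ms)" >&2
    return 1
  fi
  echo "ok"
}

# Recommended values from the table: interval 60000 ms, timeout 30 min.
# The 20000 ms execution duration is an assumed observation.
check_checkpoint_settings 20000 60000 1800000
```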
If CDC is used, changelog must be enabled for both reads from and writes to Hudi tables.
To ensure accurate Flink calculations when CDC is used, Hudi tables must retain the +I, +U, -U, and -D change types. Enable changelog both when writing data to a Hudi table and when reading data from that same table.
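As a sketch of what enabling changelog on both sides could look like, the helper below prints a Flink SQL table definition that sets the Hudi connector option `changelog.enabled`. The function name, table names, columns, and path are hypothetical, and the `MERGE_ON_READ` table type is an assumption made for this example. The point is that the writer and the reader definitions reference the same path and both carry the option.

```shell
#!/bin/sh
# Hypothetical storage path shared by the writer and the reader.
HUDI_PATH="hdfs:///tmp/hudi/orders"

# Prints a Flink SQL DDL for the table name given as $1.
# Columns are illustrative only.
hudi_changelog_ddl() {
  cat <<EOF
CREATE TABLE $1 (
  order_id STRING,
  amount   DOUBLE,
  ts       TIMESTAMP(3),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'hudi',
  'path' = '${HUDI_PATH}',
  'table.type' = 'MERGE_ON_READ',
  'changelog.enabled' = 'true'
);
EOF
}

# One definition for the job that writes, one for the job that reads.
hudi_changelog_ddl hudi_sink
hudi_changelog_ddl hudi_source
```

A reading job would typically add its own read options on top of this definition; those are omitted here because they depend on the site.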