Updated on 2024-08-30 GMT+08:00

Parameter Specifications for Spark Incremental Reads of Hudi

Rules

Before running an incremental query, you must set the query mode of the current table to incremental mode. After the query finishes, you must reset the query mode of the table.

If the query mode is not reset after the incremental query completes, the table is still read in incremental mode, and subsequent real-time queries are affected.

Example

-- Read the current table in incremental mode.
set hoodie.tableName.consume.mode=INCREMENTAL;
-- Specify the start commit of the incremental pull.
set hoodie.tableName.consume.start.timestamp=20201227153030;
-- Specify the end commit of the incremental pull. If this parameter is not set, the latest commit is used.
set hoodie.tableName.consume.end.timestamp=20210308212318;
-- Filter the results by start.timestamp and end.timestamp. If end.timestamp is not set, filter by start.timestamp only.
select * from tableName where `_hoodie_commit_time` > '20201227153030' and `_hoodie_commit_time` <= '20210308212318';
-- Incremental mode has been used, so the query mode must be reset.
set hoodie.tableName.consume.mode=SNAPSHOT;
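
The example above sets both ends of the commit range. As a minimal sketch of the other case the rules describe, the following variant leaves end.timestamp unset, so the pull runs up to the latest commit and the results are filtered by start.timestamp only. The table name tableName and the timestamp are the same placeholders used above, not values from a real table.

```sql
-- Read the current table in incremental mode.
set hoodie.tableName.consume.mode=INCREMENTAL;
-- Specify only the start commit; consume.end.timestamp is deliberately not set in this sketch.
set hoodie.tableName.consume.start.timestamp=20201227153030;
-- With no end commit, filter the results by start.timestamp only.
select * from tableName where `_hoodie_commit_time` > '20201227153030';
-- Reset the query mode so that later queries on the table are not affected.
set hoodie.tableName.consume.mode=SNAPSHOT;
```

In both variants the last statement is the same: the query mode is set back to SNAPSHOT as soon as the incremental query has finished.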