Parameter Specifications for Incremental Reading of Hudi Table Data with Spark

Rules

Before running an incremental query, set the table's query mode to incremental (INCREMENTAL). After the query completes, reset the table's query mode to snapshot (SNAPSHOT).

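If the query mode is not reset after the incremental query, the table stays in incremental mode, and subsequent real-time queries on it are affected: they may return only the incremental data instead of the latest snapshot.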

Example

set hoodie.tableName.consume.mode=INCREMENTAL;  -- Read the current table in incremental mode.
set hoodie.tableName.consume.start.timestamp=20201227153030;  -- Specify the start commit of the incremental pull.
set hoodie.tableName.consume.end.timestamp=20210308212318;  -- Specify the end commit of the incremental pull. If this parameter is not specified, the latest commit is used.
select * from tableName where `_hoodie_commit_time`>'20201227153030' and `_hoodie_commit_time`<='20210308212318';  -- Filter the results by start.timestamp and end.timestamp. If end.timestamp is not specified, filter by start.timestamp only.
set hoodie.tableName.consume.mode=SNAPSHOT;  -- Reset the query mode after the incremental query.
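
The example above sets both start.timestamp and end.timestamp. The following is a minimal sketch of the other case described in the comments, where end.timestamp is not specified and the latest commit is used as the end of the pull. It assumes that tableName stands in for the actual table name, that the timestamp is the illustrative value from the example above, and that no end.timestamp was set earlier in the same session.

```sql
-- Sketch only: tableName is a placeholder for the actual Hudi table name.
set hoodie.tableName.consume.mode=INCREMENTAL;
set hoodie.tableName.consume.start.timestamp=20201227153030;
-- end.timestamp is not set, so the latest commit is used as the end commit.
-- Filter on start.timestamp only.
select * from tableName where `_hoodie_commit_time`>'20201227153030';
-- Reset the query mode so that later real-time queries are not affected.
set hoodie.tableName.consume.mode=SNAPSHOT;
```

As in the full example, the last statement resets the query mode whether or not end.timestamp was set.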
