Updated on 2022-12-14 GMT+08:00

Hudi Table Schema

When writing data, Hudi generates a Hudi table based on attributes such as the storage path, table name, and partition structure.

Hudi table data files can be stored in the OS file system or distributed file system such as HDFS. To ensure analysis performance and data reliability, HDFS is generally used for storage. Using HDFS as an example, Hudi table storage files are classified into two types.

  • The .hoodie folder stores the log files related to file merging.

  • The path containing _partition_key stores actual data files and metadata by partition.

    Hudi data files of are stored in Parquet base files and Avro log files.