Updated on 2025-02-22 GMT+08:00

Overview

Scope

This section defines the rules for designing and developing lakehouse solutions and stream and batch processing solutions with DLI-Hudi. The rules apply to table design and management and to job development in Hudi development scenarios.

It covers:

  • Data table design
  • Resource configuration
  • Performance tuning
  • Common troubleshooting
  • Typical parameter settings

Terms

This section uses the following terms:

  • Rule: a principle that must be followed during programming.
  • Suggestion: a principle that should be considered during programming.
  • Description: an explanation of the rule or suggestion in question.
  • Example: positive and negative examples of a rule or suggestion.

Application Scope

  • These specifications apply to designing, developing, testing, and maintaining data storage and processing jobs based on DLI-Hudi.
  • These design and development specifications are based on Spark 3.3.1 and Hudi 0.11.0.