Updated on 2025-12-17 GMT+08:00
Doris Data Import Suggestions
This topic describes the technical suggestions for importing Doris data.
Data Import
- [Mandatory] Do not frequently perform update, delete, or truncate operations. Run such operations at most once every few minutes. A delete operation must specify a partition condition or primary key column.
- [Mandatory] Avoid frequently importing data with statements such as INSERT INTO tbl1 VALUES("1"),("a");. Although this method may work for occasional imports of small amounts of data, use Stream Load, Broker Load, or the Flink Connector for frequent imports of large amounts of data.
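As a minimal sketch of the Stream Load alternative: Doris Stream Load is an HTTP PUT sent to a frontend node. The helper below only builds the request object (it does not send it); the host name, database, table, and label values are illustrative, and 8030 is assumed to be the FE's default HTTP port.

```python
import base64
import urllib.request


def build_stream_load_request(fe_host, db, table, data: bytes,
                              label, user="root", password=""):
    # Stream Load is an HTTP PUT against the FE's
    # /api/{db}/{table}/_stream_load endpoint.
    url = f"http://{fe_host}:8030/api/{db}/{table}/_stream_load"
    req = urllib.request.Request(url, data=data, method="PUT")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    # A unique label lets Doris deduplicate retries of the same batch.
    req.add_header("label", label)
    req.add_header("column_separator", ",")  # field delimiter of the CSV payload
    return req


# Example: one CSV batch for table demo.tbl1 (send with urllib.request.urlopen).
request = build_stream_load_request("fe.example.com", "demo", "tbl1",
                                    b"1,a\n2,b\n", "batch_001")
```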
- [Optional] When Flink writes data to Doris in real time, set the checkpoint interval based on the data volume of each batch. If each batch is too small, a large number of small files will be generated. The recommended interval is 60s.
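For reference, the 60s suggestion above maps to the standard Flink checkpointing setting, which can be put in flink-conf.yaml (the key name is Flink's; the value is this guide's recommendation):

```yaml
# flink-conf.yaml: checkpoint every 60s so each sink batch is large
# enough to avoid generating many small files in Doris
execution.checkpointing.interval: 60s
```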
- [Optional] Import data in batches at a low frequency. The average interval between imports to a single table must be greater than 30s. Import 10,000 to 100,000 rows of data each time, at a recommended interval of 60s.
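The batching rule above can be sketched as a small buffer that flushes when either the row threshold or the minimum interval is reached. This is an illustrative client-side pattern, not a Doris API; the flushed batch would then be handed to Stream Load or Broker Load.

```python
import time


class BatchBuffer:
    """Accumulate rows and flush when either the row threshold or the
    minimum interval (seconds) is reached, whichever comes first."""

    def __init__(self, flush_rows=100_000, interval_s=60, clock=time.monotonic):
        self.flush_rows = flush_rows      # suggested 10,000-100,000 rows per batch
        self.interval_s = interval_s      # suggested 60s between imports
        self.clock = clock                # injectable clock, eases testing
        self.rows = []
        self.last_flush = clock()

    def add(self, row):
        # Returns a batch to import when a flush is due, else None.
        self.rows.append(row)
        if (len(self.rows) >= self.flush_rows
                or self.clock() - self.last_flush >= self.interval_s):
            return self.flush()
        return None

    def flush(self):
        batch, self.rows = self.rows, []
        self.last_flush = self.clock()
        return batch  # hand this batch to Stream Load / Broker Load
```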
- [Optional] Do not use INSERT INTO ... VALUES as the main data write mode. Stream Load or Broker Load is recommended for batch data import.
- [Optional] If downstream jobs depend on or query the data imported with INSERT INTO ... WITH LABEL XXX ... SELECT, first check whether the imported data is visible.
Run the show load where label='xxx' SQL command to check the status of the INSERT task. The imported data is queryable only when the status is VISIBLE.
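The visibility check above can be automated with a polling loop. This is a sketch under assumptions: run_query stands for any callable that executes SQL over the MySQL protocol and returns rows as dicts, and the result column is assumed to be named "State" with the VISIBLE/CANCELLED values described in this guide.

```python
import time


def wait_until_visible(run_query, label, timeout_s=60, poll_s=2,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll `show load where label='...'` until the job's status is VISIBLE.

    `run_query` is a hypothetical callable that executes SQL and returns
    rows as dicts (e.g. a thin wrapper over a MySQL-protocol client).
    Returns True once the data is visible, False on timeout.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        rows = run_query(f"show load where label='{label}'")
        state = rows[0].get("State") if rows else None
        if state == "VISIBLE":
            return True          # imported data is now queryable downstream
        if state == "CANCELLED":
            raise RuntimeError(f"load job {label} was cancelled")
        sleep(poll_s)            # not yet visible; wait and poll again
    return False
```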
- [Optional] Use Stream Load for imports up to 10 GB and Broker Load for imports up to 100 GB.
- [Optional] Do not use Routine Load to import data. Instead, use Flink to consume Kafka data and write it to Doris. This limits the amount of data imported in a single batch and avoids generating a large number of small files. If Routine Load is already in use, set max_tolerable_backend_down_num to 1 on the FE to improve reliability before you switch the import method.
Parent topic: Doris Usage Specifications