Updated on 2025-12-17 GMT+08:00

Doris Data Change Rules

This topic describes the rules and suggestions for changing Doris data.

Data Modification Rules

  • [Mandatory] Applications must not directly execute delete or update statements for data modifications. Instead, employ the CDC's upsert method.
    • The upsert method is suitable for infrequent operations, such as updates every few minutes.
    • Specify partition conditions when using delete statements.
  • [Mandatory] Use INSERT INTO tbl1 VALUES ("1"), ("a") for small-scale data imports. For larger imports, opt for Doris-provided methods like Stream Load, Broker Load, Spark Load, or Flink Connector.
  • [Optional] For extended SQL operations, set session variables in hint mode using SELECT /*+ SET_VAR(query_timeout = xxx*/ from table. Avoid altering global system variables.
  • [Optional] Avoid frequently adding or deleting fields in tables during peak hours. Reserve fields for future use when you create tables. If fields must be added or deleted, or field types and comments must be modified, stop writing and modifying tasks on the target table during off-peak hours and then re-create a table.
    1. Create a table. The structure of the table is the same as that of the table you want to modify. Add new fields to the new table, delete unnecessary fields, or change field types.
    2. Specify fields and insert them to the newly created table.
      INSERT INTO Newly created table SELECT Specified fields FROM Existing table whose columns need to be modified;

      To prevent high CPU or memory usage and minimize the impact on query service, you can import data to a new table in batches based on time if the table has a significant amount of data. The command is as follows:

      insert into tab1 select col from tab where date <= xx;
    3. Swap the names of two tables.
      ALTER TABLE [db.]tbl1 REPLACE WITH TABLE tbl2 [PROPERTIES('swap' = 'true')];