Updated on 2026-05-20 GMT+08:00

Doris

Doris is a high-performance, highly scalable distributed analytical database. It supports real-time data writes and fast queries, and is suitable for multi-dimensional analysis and report generation over massive amounts of data.

DataArts Migration can migrate data to and from MRS Doris and CloudTable Doris on Huawei Cloud.

How It Works

Doris Reader reads data over native JDBC. Doris Writer writes data over JDBC or, for higher write efficiency, over StreamLoad.
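To make the StreamLoad path more concrete, the following Python sketch builds, without sending, an HTTP request in the shape that the Apache Doris Stream Load interface accepts: rows are uploaded with PUT to /api/<db>/<table>/_stream_load on the FE HTTP port. The host, database, table, credentials, and label are hypothetical, and the sketch does not claim to mirror how DataArts Migration issues the request internally.

```python
import base64
import urllib.request


def build_stream_load_request(host, db, table, user, password, rows,
                              label, port=8030, use_https=False):
    """Build (but do not send) a Doris Stream Load request for CSV rows."""
    scheme = "https" if use_https else "http"
    url = f"{scheme}://{host}:{port}/api/{db}/{table}/_stream_load"
    # Rows become comma-separated lines; real data would need proper escaping.
    body = "\n".join(",".join(str(v) for v in row) for row in rows).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        "label": label,            # a label lets Doris reject a duplicate load
        "column_separator": ",",
        "format": "csv",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Hypothetical endpoint and credentials, for illustration only.
req = build_stream_load_request(
    host="doris-fe.example.com", db="demo_db", table="orders",
    user="migration_user", password="secret",
    rows=[(1, "alice", 9.5), (2, "bob", 12.0)], label="orders_batch_0001",
)
print(req.get_method(), req.full_url)
```

A real client also has to follow the redirect from the FE node to a BE node while keeping the credentials, which is why this sketch stops at building the request.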

Preparation and Constraints

  • Network requirements

    The Doris data source must be able to communicate with CDM so that data can be transmitted. For details, see Enabling Network Connectivity.

  • Required permissions
    • MRS Doris read and write permissions
      • Read permission: Grant the read-only permission of MRS Doris to the IAM user or user group of DataArts Migration through a system policy such as MRS ReadOnlyAccess. You can also create a custom policy to grant read permissions such as SELECT.
      • Write permission: Grant the write permission of MRS Doris to the IAM user or user group of DataArts Migration through a system policy such as MRS CommonOperations or MRS FullAccess. You can also create a custom policy to grant write permissions such as INSERT INTO TABLE and CREATE TABLE.
    • CloudTable Doris read and write permissions
      • Read permission: Grant the ReadOnlyAccess system policy of CloudTable to the IAM user or user group of DataArts Migration, or create a custom policy to grant read permissions such as SELECT.
      • Write permission: Grant the CommonOperations or FullAccess system policy of CloudTable to the IAM user or user group of DataArts Migration, or create a custom policy to grant write permissions such as INSERT INTO TABLE and CREATE TABLE.
  • Enabling ports
    • JDBC port (9030): Ensure that the JDBC port 9030 of the Doris service has been enabled so that DataArts Migration can connect to the Doris database through JDBC and read and write data.
    • StreamLoad port (8030): If data is written using StreamLoad, ensure that the StreamLoad port 8030 of the Doris service has been enabled so that DataArts Migration can write data to Doris efficiently.
    • StreamLoad port (8050): If data is written using StreamLoad and HTTPS encryption is enabled, ensure that the StreamLoad HTTPS port 8050 of the Doris service has been enabled so that DataArts Migration can write data to Doris securely.
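The port rules above can be checked before a job is created. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the check only shows reachability from the machine that runs it, which may differ from the network path that DataArts Migration uses.

```python
import socket


def required_ports(use_stream_load=False, use_https=False):
    """List the Doris ports a job needs, following the rules above."""
    ports = {"JDBC": 9030}
    if use_stream_load:
        ports["StreamLoad"] = 8030
        if use_https:
            ports["StreamLoad HTTPS"] = 8050
    return ports


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, use_stream_load=False, use_https=False):
    """Map each required port name to whether it is reachable from this machine."""
    needed = required_ports(use_stream_load, use_https)
    return {name: port_is_open(host, port) for name, port in needed.items()}


print(required_ports(use_stream_load=True, use_https=True))
```

Calling check_ports with the Doris FE address (for example, a hypothetical doris-fe.example.com) returns one True or False entry per required port.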

Driver Usage

  • The MySQL driver is recommended.
  • Version mapping between Doris and the driver:
    • Doris versions earlier than 2.0: MySQL 5.x driver is required.
    • Doris version 2.0 and later: MySQL 8.0.27 driver is required.
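The version mapping above can be expressed as a small lookup. The sketch below is illustrative only: the function names are hypothetical, the JDBC URL assumes the default JDBC port 9030 mentioned earlier, and the driver class names are those of MySQL Connector/J 5.x and 8.x.

```python
def mysql_driver_for(doris_version):
    """Map a Doris version string such as '1.2.7' or '2.1.3' to the driver line."""
    major = int(doris_version.split(".")[0])
    return "MySQL 8.0.27" if major >= 2 else "MySQL 5.x"


def driver_class_for(doris_version):
    """Return the Connector/J driver class that matches the driver line."""
    if mysql_driver_for(doris_version) == "MySQL 8.0.27":
        return "com.mysql.cj.jdbc.Driver"   # Connector/J 8.x
    return "com.mysql.jdbc.Driver"          # Connector/J 5.x


def jdbc_url(host, database, port=9030):
    """Build a MySQL-protocol JDBC URL for the Doris JDBC port."""
    return f"jdbc:mysql://{host}:{port}/{database}"


# Hypothetical host and database names.
for version in ("1.2.7", "2.0", "2.1.3"):
    print(version, mysql_driver_for(version), driver_class_for(version))
print(jdbc_url("doris-fe.example.com", "demo_db"))
```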

Supported Field Types

Different Doris versions support different data types. The following table lists the supported Doris fields. For details about all the field types supported by Doris of each version, see the official Doris documentation.

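Category    Field Type        Read    Write
Numeric     SMALLINT          √       √
Numeric     INT               √       √
Numeric     BIGINT            √       √
Numeric     LARGEINT          √       √
Numeric     FLOAT             √       √
Numeric     DOUBLE            √       √
Numeric     DECIMAL           √       √
Numeric     DECIMALV3         √       √
Time        DATE              √       √
Time        DATETIME          √       √
Time        DATEV2            √       √
Time        DATETIMEV2        √       √
Character   CHAR              √       √
Character   VARCHAR           √       √
Character   STRING            √       √
Character   TEXT              √       √
Other       POINT             x       x
Other       JSON              √       √
Other       ARRAY             x       x
Other       JSONB             x       x
Other       HLL               x       x
Other       BITMAP            x       x
Other       QUANTILE_STATE    x       x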
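Before configuring a job, it can help to scan a table definition for field types that the table above marks with x. The following sketch assumes the schema is available as a column-to-type mapping. The function names and the sample schema are hypothetical.

```python
import re

# Field types marked "x" for both read and write in the table above.
UNSUPPORTED_TYPES = {"POINT", "ARRAY", "JSONB", "HLL", "BITMAP", "QUANTILE_STATE"}


def base_type(column_type):
    """Strip length, precision, or element type: 'DECIMAL(10,2)' -> 'DECIMAL'."""
    match = re.match(r"\s*([A-Za-z_0-9]+)", column_type)
    return match.group(1).upper() if match else ""


def unsupported_columns(schema):
    """Return the (column, type) pairs whose base type is marked unsupported."""
    return [(name, ctype) for name, ctype in schema.items()
            if base_type(ctype) in UNSUPPORTED_TYPES]


# Hypothetical table definition.
schema = {
    "id": "BIGINT",
    "amount": "DECIMAL(10,2)",
    "name": "varchar(64)",
    "tags": "ARRAY<INT>",
    "uv": "HLL",
}
print(unsupported_columns(schema))
```

Columns reported by this check would need to be excluded or converted in the source query before migration.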

Supported Migration Scenarios

DataArts Migration supports the following offline synchronization modes:

  • Single table synchronization

    DataArts Migration can synchronize a single table or file when ingesting data into a data lake or migrating data to the cloud.

  • Database and table shard synchronization

    DataArts Migration can synchronize data from multiple databases and tables when ingesting data into a data lake or migrating data to the cloud.

  • Entire DB migration

    DataArts Migration can synchronize data from an entire on-premises database when ingesting data into a data lake or migrating data to the cloud.

Database and table shard synchronization and entire DB migration are not supported in all regions. The following table lists the supported Doris migration scenarios.

Supported Migration Scenario

Single Table Read

Single Table Write

Database/Table Shard Read

Database/Table Shard Write

Entire DB Read

Entire DB Write

Supported

x

x

x

Core Capabilities

  • Connection configuration

    Configuration Item                      Supported         Description
    Supported protocols                     JDBC/StreamLoad   DataArts Migration can exchange data with Doris through JDBC or StreamLoad. JDBC is suitable for general database operations. StreamLoad writes data more efficiently and is suitable for quickly importing large amounts of data.
    HTTPS support                           √                 DataArts Migration can exchange data with Doris over HTTPS, which protects data security and integrity during transmission.
    Connection configuration optimization   √                 Connection parameters such as connectTimeout can be tuned to improve connection performance.

  • Read capabilities

    Configuration Item      Supported   Description
    Incremental read        √           Incremental data can be read using WHERE conditions or SQL statements.
    Read mode               √           The database table mode and the SQL statement mode are supported. The database table mode reads data from a specified table. The SQL statement mode queries data flexibly to meet complex requirements.
    Shard concurrency       √           Data is sharded horizontally by common fields or partitions and extracted concurrently by multiple threads, which improves throughput and efficiency.
    Custom fields           √           You can add computed columns, constant columns, or masking functions to tasks to meet service-specific requirements.
    Dirty data processing   √           Abnormal data can be written to the dirty data bucket to prevent job failures caused by a small amount of abnormal data.

  • Write capabilities

    Configuration Item                           Supported          Description
    Write mode                                   JDBC/STREAM_LOAD   DataArts Migration can write data to Doris through JDBC or StreamLoad. JDBC is suitable for general database operations. StreamLoad writes data more efficiently and is suitable for quickly importing large amounts of data.
    Pre- and post-import processing              √                  Operations such as preSql and truncate can clean and process data before and after data import.
    Optimization of the number of written rows   √                  In JDBC mode, you can tune the Batch Size parameter in the connection configuration to optimize write performance. In STREAM_LOAD mode, you can tune the StreamLoad configuration parameters to optimize write performance.
    Dirty data processing                        x                  Abnormal data cannot be written to the dirty data bucket.
    Concurrent write                             √                  Data can be written concurrently to improve write efficiency.
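The incremental read and shard concurrency capabilities listed above can be pictured as follows: a key range is split into contiguous slices, each slice becomes one query that can also carry an incremental WHERE condition, and the queries run on a thread pool. This is a minimal sketch under stated assumptions, not the service's implementation. The table name, key column, and filter are hypothetical, the shard key is assumed to be an integer column, and fetch stands in for whatever function actually runs a query.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_ranges(min_key, max_key, shards):
    """Split the inclusive range [min_key, max_key] into contiguous slices."""
    if max_key < min_key:
        return []
    total = max_key - min_key + 1
    shards = max(1, min(shards, total))
    size, extra = divmod(total, shards)
    ranges, start = [], min_key
    for i in range(shards):
        # The first `extra` shards take one additional key each.
        end = start + size + (1 if i < extra else 0) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def shard_queries(table, key, min_key, max_key, shards, increment=None):
    """Build one SELECT per shard, optionally ANDed with an incremental filter."""
    queries = []
    for lo, hi in shard_ranges(min_key, max_key, shards):
        where = f"{key} >= {lo} AND {key} <= {hi}"
        if increment:
            where = f"({increment}) AND {where}"
        queries.append(f"SELECT * FROM {table} WHERE {where}")
    return queries


def extract_concurrently(queries, fetch, max_workers=4):
    """Run fetch(query) for each shard on a thread pool, keeping shard order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, queries))


# Hypothetical table, key column, and incremental condition.
queries = shard_queries("orders", "id", 1, 10, 2,
                        increment="update_time >= '2026-05-19 00:00:00'")
for query in queries:
    print(query)
```

Building SQL by string formatting keeps the sketch short. Production code would validate identifiers and bind values as parameters instead.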

Creating a Data Source

Create a data source in Management Center. For details, see Configuring Data Connection Parameters.

Creating an Offline Data Migration Job

Create a Doris migration job in DataArts Factory. For details, see Creating an Offline Processing Migration Job.