Updated on 2024-10-23 GMT+08:00

Overview of Real-Time Processing Migration Jobs

DataArts Studio provides real-time data synchronization, which lets you synchronize data changes in some or all tables of a source database to a destination database in real time. This keeps data in the destination database consistent with data in the source database.

Real-time processing migration jobs are available in Beijing4, Shanghai1, Singapore, and Guangzhou, and will be available in other regions soon. To use this function, you must apply for trustlist membership. To apply, contact customer service or technical support.

Functions

Real-time synchronization includes three basic capabilities: real-time reading, transformation, and writing. They interact with each other through system-defined intermediate data formats.

Real-time synchronization tasks support various types of data sources. In some scenarios, you can synchronize the entire database or all incremental data in real time, and synchronize multiple tables at a time.
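The read, transform, and write stages described above can be pictured with a minimal sketch. It is only an illustration, not the DataArts Migration implementation: the ChangeRecord type stands in for the system-defined intermediate format (whose real structure is not described here), and the function names and the in-memory destination are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical intermediate format shared by the three stages."""
    table: str
    op: str                 # "INSERT", "UPDATE", or "DELETE"
    key: int
    row: Dict[str, object]


def read_changes(source_log: Iterable[Tuple[str, str, int, dict]]) -> Iterator[ChangeRecord]:
    """Reading: convert raw source change events into intermediate records."""
    for table, op, key, row in source_log:
        yield ChangeRecord(table, op, key, dict(row))


def transform(records: Iterable[ChangeRecord],
              fn: Callable[[ChangeRecord], ChangeRecord]) -> Iterator[ChangeRecord]:
    """Transformation: apply a per-record function to the stream."""
    for record in records:
        yield fn(record)


def write_changes(records: Iterable[ChangeRecord],
                  destination: Dict[str, Dict[int, dict]]) -> None:
    """Writing: apply each record to an in-memory stand-in for the destination."""
    for record in records:
        table = destination.setdefault(record.table, {})
        if record.op == "DELETE":
            table.pop(record.key, None)
        else:
            table[record.key] = record.row


def uppercase_name(record: ChangeRecord) -> ChangeRecord:
    """Example transformation: upper-case the 'name' field if present."""
    row = dict(record.row)
    if "name" in row:
        row["name"] = str(row["name"]).upper()
    return ChangeRecord(record.table, record.op, record.key, row)


# Simulated source change events, in commit order.
source_log = [
    ("users", "INSERT", 1, {"name": "ada"}),
    ("users", "INSERT", 2, {"name": "bob"}),
    ("users", "UPDATE", 1, {"name": "ada l"}),
    ("users", "DELETE", 2, {}),
]
destination: Dict[str, Dict[int, dict]] = {}
write_changes(transform(read_changes(source_log), uppercase_name), destination)
print(destination)
```

Because the stages only share the intermediate record type, any reader can in principle be combined with any writer, which is what allows different source and destination types to be paired.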

Figure 1 Working mechanism

Table 1 Basic functions

Function

Description

Data synchronization between data sources

Real-time synchronization supports various types of data sources. You can combine multiple input and output data sources to form a synchronization link. For details, see Supported Data Sources.

Data synchronization in a complex network environment

Data can be synchronized between cloud databases, local IDCs, and databases on ECSs. You can select a proper synchronization method based on the network environment of your database to connect the data source to the resource group. Before configuring a synchronization task, ensure that the DataArts Migration resource group can communicate with the data source and destination. For details about how to configure the database environment and network connection, see Configuring Real-Time Network Connections.

Data synchronization scenarios

Incremental data can be synchronized in real time from one table to another table, from database/table shards to one table, and from an entire database (multiple tables) to multiple tables.

  • Real-time synchronization of incremental data from one table to another
  • Table shards of databases from multiple sources can be migrated to one destination table. The mapping between source databases/tables and the destination table can be flexibly configured.
  • Real-time synchronization of incremental data from an entire database to multiple destination tables.
    • The change logs of an entire database can be synchronized to the destination for the collection of real-time logs.
    • Multiple tables in multiple databases of an instance can be configured at a time. A maximum of 50 destination tables can be configured in a task.

Real-time synchronization task configuration

You can synchronize a single table in real time or collect real-time data from an entire database through simple task configuration, without writing any code. The following configurations are supported:

  • Real-time synchronization of incremental data in a table:

    Field mapping, additional fields, and UDF conversion

  • Database/Table shard
  • Real-time synchronization of data in an entire database:
    • Database/Table name matching rule
    • Automatic table creation
    • Assigning values to destination fields

      By default, real-time synchronization maps each source field to the destination field with the same name. Fields that cannot be mapped are not synchronized. You can also add fields to destination tables and assign constants or variables to them.

    • Defining the DDL message processing policy

      The source database may produce many types of DDL operations. For real-time synchronization, you can set a policy for how each type of DDL message is synchronized to the destination based on your requirements.

Real-time synchronization task O&M

You can configure monitoring alarms for synchronization tasks.
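Several of the configuration options in Table 1 can be illustrated with a small sketch. Everything in it is hypothetical: the rule syntax, the function names, and the DDL policy values ("sync", "ignore", "error") are assumptions chosen for the example and do not reflect the actual console settings. Only the default same-name field mapping and the limit of 50 destination tables come from the table.

```python
import re
from typing import Dict, Iterable, List, Optional

MAX_DESTINATION_TABLES = 50  # limit per task, as stated in Table 1

# Hypothetical name matching rule: source "database.table" regex -> destination table.
SHARD_RULES: Dict[str, str] = {
    r"^shop_\d+\.orders_\d+$": "orders_all",
}

# Hypothetical DDL processing policy: DDL message type -> action.
DDL_POLICY: Dict[str, str] = {
    "ADD_COLUMN": "sync",
    "DROP_TABLE": "ignore",
    "TRUNCATE": "error",
}


def route_table(source_db: str, source_table: str,
                rules: Dict[str, str] = SHARD_RULES) -> Optional[str]:
    """Return the destination table for a source shard, or None if no rule matches."""
    qualified = f"{source_db}.{source_table}"
    for pattern, destination in rules.items():
        if re.match(pattern, qualified):
            return destination
    return None


def map_fields(source_row: Dict[str, object], dest_columns: List[str],
               assigned: Dict[str, object]) -> Dict[str, object]:
    """Map same-name fields, drop unmapped source fields, then add assigned values."""
    row = {col: source_row[col] for col in dest_columns if col in source_row}
    row.update(assigned)
    return row


def handle_ddl(ddl_type: str, policy: Dict[str, str] = DDL_POLICY) -> str:
    """Look up the action for a DDL message; unknown types are ignored in this sketch."""
    action = policy.get(ddl_type, "ignore")
    if action == "error":
        raise RuntimeError(f"DDL message {ddl_type} is configured to stop the task")
    return action


def check_destination_count(destination_tables: Iterable[str]) -> int:
    """Reject a task that targets more than the allowed number of destination tables."""
    count = len(set(destination_tables))
    if count > MAX_DESTINATION_TABLES:
        raise ValueError(f"{count} destination tables exceeds {MAX_DESTINATION_TABLES}")
    return count


# Two shards from different source databases land in one destination table.
print(route_table("shop_01", "orders_003"))
print(route_table("shop_02", "orders_117"))
# "tmp" has no same-name destination column, so it is not synchronized;
# "source_db" is an added destination field that receives an assigned value.
print(map_fields({"id": 1, "name": "a", "tmp": 9},
                 ["id", "name", "source_db"], {"source_db": "shop_01"}))
```

In this sketch the assigned value is a constant per call; a variable (for example, the source database name or a processing timestamp) would be passed in the same way.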

Basic Features

Real-time processing migration jobs provide support for big data development and have the following features:

  • Timeliness: Data can be migrated within seconds.
  • Reliability: Mechanisms such as recovery upon exceptions and retry ensure data consistency and accuracy.
  • Diversity:
    • Diverse data sources: Multiple data sources can be selected at the source and destination.
    • Diverse links: Some links support full and incremental synchronization, and some links support database and table shards.
  • Maintainability: Job monitoring and logs are supported, enabling O&M engineers to locate faults.
  • Ease of use: You only need to configure the necessary information on the console.
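The reliability feature above mentions recovery upon exceptions and retry. The following is a generic sketch of that idea, assuming a position-based checkpoint and a bounded number of retries per event. It does not describe how DataArts Migration implements recovery; the function name and parameters are hypothetical.

```python
import time
from typing import Callable, List


def sync_with_retry(events: List[str], apply: Callable[[str], None],
                    checkpoint: int = 0, max_retries: int = 3,
                    backoff_seconds: float = 0.0) -> int:
    """Apply events in order starting at `checkpoint` and return the new checkpoint.

    A failed event is retried up to `max_retries` times with exponential backoff.
    If it still fails, the exception propagates and the caller can restart
    later from the last checkpoint it saved.
    """
    position = checkpoint
    while position < len(events):
        for attempt in range(max_retries + 1):
            try:
                apply(events[position])
                break
            except Exception:
                if attempt == max_retries:
                    raise
                time.sleep(backoff_seconds * (2 ** attempt))
        position += 1
    return position


# Simulated destination writer that fails once on event "b".
applied: List[str] = []
remaining_failures = {"b": 1}


def flaky_apply(event: str) -> None:
    if remaining_failures.get(event, 0) > 0:
        remaining_failures[event] -= 1
        raise ConnectionError("transient failure")
    applied.append(event)


new_checkpoint = sync_with_retry(["a", "b", "c"], flaky_apply)
print(applied, new_checkpoint)
```

Retrying in place preserves event order, which matters for consistency. The sketch assumes that applying an event twice is harmless (for example, an upsert keyed on the primary key), because a restart from a saved checkpoint may replay events that were already written.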