Updated on 2022-09-02 GMT+08:00

Usage Introduction

Function Description

Fast Data Integration (FDI) is the data integration component of ROMA Connect and supports data integration and conversion between multiple data sources. ROMA Connect has the following advantages for data integration:

  • Access of multiple data source types

    By default, ROMA Connect can connect to many types of data sources, such as relational databases, big data storage, semi-structured storage, and message systems. For details about the supported data source types, see Data Sources Supported by Data Integration Task.

    If the default data source types provided by ROMA Connect cannot meet your data integration requirements, you can customize data sources. For details on how to customize a data source, see Connecting to a Custom Data Source.

  • Flexible integration modes

    ROMA Connect supports the following integration modes:

    • Scheduled: ROMA Connect periodically obtains data from the source based on a task schedule and then integrates the data to the destination.
    • Real-time: ROMA Connect integrates data generated at the source to the destination in real time.

    For details about the data source types supported in the two integration modes, see Connecting to Data Sources.

  • Custom mapping rules

    When converting data fields from the source to the destination, you can customize mapping rules. For example, you can replicate one column in the source data to multiple columns and then integrate these columns to the destination.

  • Data integration between different network environments

    FDI can integrate data between a source and a destination that reside in different network environments that are not interconnected. For example, if the source data source is in an on-premises data center and the destination data source is in a VPC on the cloud, FDI can access both data sources and integrate data across the two network environments.

  • Resumable transmission of real-time tasks

    After a fault on the source or destination is rectified or a task is manually restarted, FDI automatically resumes data collection from the last interrupted position to prevent data loss.
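
The custom mapping rule mentioned above (replicating one source column to multiple destination columns) can be illustrated with a minimal Python sketch. This is not ROMA Connect code or its configuration format: the function apply_mapping, the rule structure, and the column names id, phone, contact_phone, and billing_phone are all hypothetical and serve only to show the idea.

```python
from typing import Callable, Dict, List

# Hypothetical types: a row is a dict of column name -> value, and a mapping
# rule set is a dict of destination column -> function of the source row.
Row = Dict[str, object]
MappingRules = Dict[str, Callable[[Row], object]]


def apply_mapping(rows: List[Row], rules: MappingRules) -> List[Row]:
    """Build destination rows by evaluating every rule against each source row."""
    return [{dest: rule(row) for dest, rule in rules.items()} for row in rows]


# Assumed example: the source column "phone" is replicated to two
# destination columns, while "id" is mapped to "user_id".
rules: MappingRules = {
    "user_id": lambda r: r["id"],
    "contact_phone": lambda r: r["phone"],
    "billing_phone": lambda r: r["phone"],
}

source_rows: List[Row] = [{"id": 1, "phone": "555-0100"}]
print(apply_mapping(source_rows, rules))
```

Expressing each destination column as its own rule keeps one-to-many mappings simple: replicating a source column only requires a second rule that reads the same field.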

Process Flow

The following figure shows how data integration is performed using ROMA Connect.

Figure 1 Using ROMA Connect for data integration

  1. Create a ROMA Connect instance and an integration application.
  2. Access data sources.

    Access data sources at the source and destination to ensure that data can be read from the source and written to the destination.

  3. Create an integration task.

    A data integration task defines detailed rules for data integration from the source to the destination. The rules include the data source types at the source and destination, the mapping rules of data fields, and the filtering conditions for data integration. ROMA Connect allows you to create the following data integration tasks:

    • Common Data Integration Task: Supports all default data source types and two integration modes: scheduled and real-time. For database data sources, only one data table at the source can be integrated to one data table at the destination each time.
    • Composite Data Integration Task: Uses Change Data Capture (CDC) to implement real-time, incremental synchronization of data from the source to the destination. Multiple data tables at the source can be integrated to multiple data tables at the destination. Currently, the following relational databases are supported: Oracle, MySQL, and SQL Server. For details, see CDC Configurations.
  4. Start the integration task.
    • After a scheduled task is started, ROMA Connect integrates data on a scheduled basis. During the first execution, all source data that meets the conditions is integrated to the destination. In subsequent executions, either all data that meets the conditions or only incremental data is integrated, depending on the task configuration.
    • After a real-time task is started, ROMA Connect continuously detects data changes at the source. During the first execution, all source data that meets the conditions is integrated to the destination. Subsequently, only new data will be integrated to the destination each time.
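
The "full data first, incremental data afterwards" behavior described in step 4 can be sketched in Python as follows. This is only an illustration under stated assumptions, not how ROMA Connect is implemented: the function run_once, the updated_at column, and the watermark value that tracks the last integrated position are hypothetical, and filtering conditions are omitted.

```python
from typing import Dict, List, Optional

Row = Dict[str, int]


def run_once(source: List[Row], destination: List[Row],
             watermark: Optional[int]) -> Optional[int]:
    """Copy rows newer than the watermark and return the updated watermark.

    A watermark of None marks the first execution, so every row is copied.
    """
    new_rows = [r for r in source
                if watermark is None or r["updated_at"] > watermark]
    destination.extend(new_rows)
    if new_rows:
        watermark = max(r["updated_at"] for r in new_rows)
    return watermark


source: List[Row] = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 20}]
destination: List[Row] = []

# First execution: no watermark yet, so all source rows are integrated.
watermark = run_once(source, destination, None)

# A new row appears at the source; the next execution copies only that row.
source.append({"id": 3, "updated_at": 30})
watermark = run_once(source, destination, watermark)

print([r["id"] for r in destination], watermark)
```

Because the position is stored outside the copy step, a run that starts after an interruption can continue from the last recorded watermark instead of copying everything again.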