Updated on 2024-07-17 GMT+08:00

Configuring Data Integration Between Systems

Prerequisites

  • The network where the service system is located can communicate with the network of ROMA Connect.

    If the ROMA Connect instances communicate with each other through the public network, an elastic IP address (EIP) must be bound to each instance.

  • ROMA Connect supports the data source types of both the source and destination service system databases.

    For details about the data sources used for integration, see Data Sources Supported by ROMA Connect.

  • ROMA Connect has the permission to write data to the destination database.
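The network prerequisite above can be spot-checked before any data source is configured. The following is a minimal sketch, assuming it is run from a host in the same network as the ROMA Connect instance. The function name can_reach and the addresses in the usage note are hypothetical. The check only confirms that a TCP port accepts connections and does not replace the Check Connectivity button in the console.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_reach("192.0.2.10", 9092) would probe a Kafka broker and can_reach("192.0.2.20", 3306) a MySQL server. Both addresses are placeholders.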

Configuring a Data Integration Task

  1. Create an integration application.

    All resources in the ROMA Connect instance must belong to an integration application. Before creating other resources, make sure an integration application is available. If an integration application is available, skip this step.

    1. Log in to the ROMA Connect console. On the Instances page, click View Console next to a specific instance.
    2. In the navigation pane on the left, choose Integration Applications. In the upper right corner of the page, click Create Integration Application.
    3. In the dialog box displayed, set Name and click OK.
  2. Connect to a data source.

    Configure the connections between ROMA Connect and the service system databases so that ROMA Connect can read data from and write data to them.

    The access configuration varies depending on the data source type. The following procedure uses Kafka as the source data source and MySQL as the destination data source. For details about other data source types, see Connecting to Data Sources.

    Connecting to the Kafka data source at the source:

    1. In the navigation pane on the left, choose Data Sources. In the upper right corner of the page, click Access Data Source.
    2. On the Default tab page, select Kafka and click Next.
    3. Configure the data source connection information.
      Table 1 Data source connection information

      Parameter
        Description

      Name
        Enter a custom data source name. Following a naming convention makes the data source easier to find later.

      Integration Application
        Select the integration application to which the data source belongs.

      Description
        Enter a description of the data source.

      Connection Address
        Enter the IP address and port number for connecting to Kafka.

        If Kafka has multiple brokers, click Add Address to enter the connection address of each broker.

      Enable SSL
        Select whether to use SSL authentication for the connection between ROMA Connect and Kafka.

      SSL Username/Application Key
        Mandatory when Enable SSL is set to Yes.

        Enter the username used for SSL authentication.

      SSL Password/Application Secret
        Mandatory when Enable SSL is set to Yes.

        Enter the password used for SSL authentication.

    4. After setting the parameters for the data source, click Check Connectivity to test the data source connectivity.
      • If the test result is Data source connected successfully, go to the next step.
      • If the test result is Failed to connect to the data source, check the data source status and connection parameters, and click Recheck until the connection is successful.
    5. Click Create.

    Connecting to the MySQL data source at the destination:

    1. On the Data Sources page, click Access Data Source in the upper right corner.
    2. On the Default tab page, select MySQL and click Next.
    3. Configure the data source connection information.
      Table 2 Data source connection information

      Parameter
        Description

      Name
        Enter a custom data source name. Following a naming convention makes the data source easier to find later.

      Integration Application
        Select the integration application to which the data source belongs.

      Description
        Enter a description of the data source.

      Connection Mode
        Select a database connection mode.

        • Default: The system automatically assembles the data source connection string from the values you configure.
        • Professional: You enter the data source connection string in JDBC format yourself.

      Connection Address
        Mandatory when Connection Mode is set to Default.

        Enter the IP address and port number for connecting to the database.

      Database Name
        Mandatory when Connection Mode is set to Default.

        Enter the name of the database to be accessed.

      Encoding Format
        Available when Connection Mode is set to Default.

        Enter the encoding format used by the database.

      Timeout Interval(s)
        Available when Connection Mode is set to Default.

        Enter the timeout interval for connecting to the database, in seconds.

      Connection String
        Mandatory when Connection Mode is set to Professional.

        Enter the JDBC connection string of the MySQL database, for example, jdbc:mysql://{hostname}:{port}/{dbname}.

        • {hostname} indicates the connection address of the database.
        • {port} indicates the port number for connecting to the database.
        • {dbname} indicates the name of the database to be connected.

      Username
        Enter the username for connecting to the database.

      Password
        Enter the password for connecting to the database.

    4. After setting the parameters for the data source, click Check Connectivity to test the data source connection.
      • If the test result is Data source connected successfully, go to the next step.
      • If the test result is Failed to connect to the data source, check the data source status and connection parameters, and click Recheck until the connection is successful.
    5. Click Create.
  3. Create a data integration task.
    ROMA Connect uses a data integration task to read data from the source, convert the data structure, and write the converted data to the destination.
    1. In the navigation pane on the left, choose Fast Data Integration > Task Management. On the displayed page, click Create Common Task.
    2. On the Create Task page, configure basic task information.
      Table 3 Basic task configuration

      Parameter
        Description

      Task Name
        Enter the task name as planned. Following a naming convention makes the task easier to find later.

      Description
        Enter a brief description of the task.

      Integration Mode
        Select the mode of data integration.

        • Scheduled: The data integration task runs according to a schedule and integrates data from the source to the destination.
        • Real-Time: The data integration task continuously detects data updates at the source and integrates them to the destination in real time.

        When Kafka is used as the source data source, only real-time tasks are supported. In this case, select Real-Time.

      Tag
        Add a tag to classify tasks for quick search.

      Enterprise Project
        Select the enterprise project to which the task belongs. In this example, retain the default value default.

    3. Configure source information.
      Table 4 Source information

      Parameter
        Description

      Instance
        Select the ROMA Connect instance that is being used.

      Integration Application
        Select the integration application to which the source Kafka data source belongs. This is the integration application configured on the Access Data Source page.

      Data Source Type
        Select Kafka.

      Data Source Name
        Select the Kafka data source that has been configured on the Access Data Source page.

      Topic Name
        Select the name of the topic whose data is to be obtained.

      Parse
        Specifies whether ROMA Connect parses the obtained source data.

        • If you select Yes, ROMA Connect parses the obtained source data based on the configured parsing rules and then integrates the data to the destination.
        • If you select No, ROMA Connect transparently transmits the obtained source data to the destination without parsing it.

        In this practice, the data structure of the source data must be converted before the data is written to the destination database. Select Yes.

      Data Root Field
        Specifies the path of the upper-layer fields shared by all metadata in the JSON or XML data obtained from the source. This parameter is not required in this practice.

      Data Type
        Select the format of the data obtained from the Kafka data source. It must match the format of the data actually stored in Kafka.

      Offset
        Select whether to integrate data starting from the earliest message or from the latest message.

      Metadata
        Specifies each bottom-layer key-value data element in the JSON or XML data obtained from the source that is to be integrated to the destination.

        • Alias: user-defined metadata name.
        • Type: data type of the metadata. The value must be the same as the data type of the corresponding parameter in the source data.
        • Parsing Path: Set this parameter to the complete path of the metadata because the data root field is not set.

        For example, in the JSON data {"a": {"b": "xx", "c": "xx"}}, the elements b and c are bottom-layer data elements, and their parsing paths are a.b and a.c, respectively.

      Time Zone
        Select the time zone used by the Kafka data source so that ROMA Connect can identify the data timestamp. The default time zone is GMT+8:00.

    4. Configure destination information.
      Table 5 Destination information

      Parameter
        Description

      Instance
        Set this parameter to the ROMA Connect instance that is being used. After the source instance is configured, the destination instance is automatically associated and does not need to be configured.

      Integration Application
        Select the integration application to which the destination MySQL data source belongs. This is the integration application configured on the Access Data Source page.

      Data Source Type
        Select MySQL.

      Data Source Name
        Select the MySQL data source that has been configured on the Access Data Source page.

      Table
        Select the table in the MySQL database to which data is to be written.

        After selecting a data table, click Select Table Field and select the columns to which data is to be written.

      Batch Number Field
        Select a field in the destination table whose type is String and whose length is greater than 14 characters. The batch number field cannot be the same as any destination field in the mapping information.

        The field is filled with a random number that identifies a batch. All data inserted in the same batch shares the same batch number, which can be used to locate or roll back that batch.

      Clear Table
        When this function is enabled, the destination table is cleared before tasks are scheduled.

    5. Configure the data mapping rule from the source to the destination.

      Click Automatic Mapping. The mapping rules between the source and destination data fields are created automatically. If the source and destination fields do not match, manually select the corresponding source field for each destination field.

    6. Click Save.
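The source, destination, and mapping settings above describe a read, convert, and write flow. The following is a minimal sketch of that flow under stated assumptions, not ROMA Connect's implementation. The aliases b_value and c_value, the destination columns col_b and col_c, the field name batch_no, and the 16-digit batch number format are all hypothetical. Only the parsing paths a.b and a.c come from the example in Table 4.

```python
import json
import secrets
from typing import Any

# Metadata as in Table 4: alias -> complete parsing path (no data root field set).
METADATA = {"b_value": "a.b", "c_value": "a.c"}

# Mapping rule: source alias -> destination column (hypothetical column names).
MAPPING = {"b_value": "col_b", "c_value": "col_c"}


def resolve(record: dict[str, Any], path: str) -> Any:
    """Walk a dot-separated parsing path through nested JSON objects."""
    node: Any = record
    for key in path.split("."):
        node = node[key]
    return node


def new_batch_number() -> str:
    """Random 16-digit string; the real batch number format is an assumption."""
    return f"{secrets.randbelow(10**16):016d}"


def to_rows(messages: list[str], batch_field: str = "batch_no") -> list[dict[str, Any]]:
    """Parse JSON messages and build destination rows that share one batch number."""
    batch = new_batch_number()
    rows = []
    for raw in messages:
        record = json.loads(raw)
        # Resolve each parsing path, then rename the alias to its destination column.
        row = {MAPPING[alias]: resolve(record, path) for alias, path in METADATA.items()}
        row[batch_field] = batch
        rows.append(row)
    return rows
```

Calling to_rows with two JSON messages returns two rows whose col_b and col_c values come from a.b and a.c, and whose batch_no values are identical. A shared batch number is what makes it possible to locate or roll back one batch in the destination table.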

Starting a Data Integration Task

After a data integration task is created, its Task Status is Stopped by default. Click Start to start the task manually.

  • After a real-time task is started, ROMA Connect continuously detects data changes at the source. During the first execution, all source data that meets the conditions is integrated to the destination. After that, only new data is integrated.
  • After a scheduled task is started, ROMA Connect integrates data according to the schedule. During the first execution, all source data that meets the conditions is integrated to the destination. On later executions, either all data that meets the conditions or only incremental data is integrated, depending on the task configuration.
  1. Start the data integration task.

    Select the task to be started and click Start above the task list. After the task is started, Task Status changes to Started.

  2. While the started task is running, its status is Executing. When the status indicates that the execution was successful, the data integration task is complete.
  3. After the execution is complete, you can view the integrated and synchronized data in the data table of the destination database.
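The difference between the first execution and later executions of a real-time task can be illustrated with a toy model. This sketch is not how ROMA Connect is implemented. TopicLog and RealTimeTask are hypothetical names, the source is an in-memory list instead of a Kafka topic, and the task is assumed to start from the earliest message.

```python
class TopicLog:
    """In-memory stand-in for a source topic: an append-only list of messages."""

    def __init__(self) -> None:
        self.messages: list[str] = []

    def append(self, message: str) -> None:
        self.messages.append(message)


class RealTimeTask:
    """Toy model of a real-time task that remembers how far it has read."""

    def __init__(self, source: TopicLog) -> None:
        self.source = source
        self.position = 0  # index of the next message to integrate

    def run_once(self) -> list[str]:
        """Return the messages not yet integrated and advance the position."""
        batch = self.source.messages[self.position:]
        self.position += len(batch)
        return batch
```

If the topic holds two messages when the task first runs, the first run_once call returns both. After one more message is appended, the next call returns only that message, and a further call returns an empty list.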