Updated on 2024-11-20 GMT+08:00

Creating a Link Between CDM and a Data Source

Scenario

Before creating a data migration job, create a link to enable the CDM cluster to read data from and write data to a data source. A migration job requires a source link and a destination link. For details on the data sources that can be exported (source links) and imported (destination links) in different migration modes (table/file migration), see Supported Data Sources.

The link configurations depend on the data source. This section describes how to create these links.

Constraints

  • If the connected data source changes (for example, the MRS cluster capacity is expanded), you need to edit and save the link again.
  • Do not change the password or user of the data source while a job is running. If you do so, the new password will not take effect immediately and the job will fail.

Prerequisites

  • A CDM cluster is available.
  • The CDM cluster can communicate with the destination data source.
    • If the destination data source is an on-premises database, you need the Internet or Direct Connect. When using the Internet, ensure that an EIP has been bound to the CDM cluster, the security group of CDM allows outbound traffic from the host where the off-cloud data source is located, the host where the data source is located can access the Internet, and the connection port has been enabled in the firewall rules.
    • If the destination data source is a cloud service (such as DWS, MRS, and ECS), the following requirements must be met for network interconnection:
      • If the CDM cluster and the cloud service are in different regions, a public network or a dedicated connection is required for enabling communication between the CDM cluster and the cloud service. If the Internet is used for communication, ensure that an EIP has been bound to the CDM cluster, the host where the data source is located can access the Internet, and the port has been enabled in the firewall rules.
      • If the CDM cluster and the cloud service are in the same region, VPC, subnet, and security group, they can communicate with each other by default. If they are in the same VPC but in different subnets or security groups, you must configure routing rules and security group rules. For details about how to configure routing rules, see Configuring Routing Rules. For details about how to configure security group rules, see Configuring Security Group Rules.
      • The cloud service instance and the CDM cluster belong to the same enterprise project. If they do not, you can modify the enterprise project of the workspace.
  • You have obtained the URL and the account for accessing the data source. The account has been granted read and write permissions on the data source.
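
The network prerequisites above can be partially checked before you create a link. The following is a minimal sketch, not part of CDM: it only tests whether a TCP port on the data source accepts connections. The function name can_reach and the example address are illustrative assumptions. The result is meaningful only when the script runs on a host whose network position resembles that of the CDM cluster (for example, an ECS in the same VPC and subnet).

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (hypothetical address and port; replace with your data source):
#   can_reach("192.0.2.10", 3306)
```

A successful check shows only that the port is open. It does not verify the account, its permissions, or the enterprise project settings.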

Creating Links

  1. Log in to the CDM console and choose Cluster Management in the left navigation pane.

    Alternatively, log in to the DataArts Studio console by following the instructions in Accessing the DataArts Studio Instance Console. On the DataArts Studio console, locate a workspace and click DataArts Migration to access the CDM console.

    Figure 1 Cluster list

    The Source column is displayed only when you access the DataArts Migration page from the DataArts Studio console.

  2. Locate the row that contains the target cluster and click Job Management in the Operation column. On the displayed page, click the Links tab and then click Create Link. On the page shown in Figure 2, select a connector.

    The connectors are classified based on the type of the data source to be connected. All supported data source types are displayed.

    Figure 2 Selecting a connector type

  3. Select a data source and click Next. The following describes how to create a MySQL link.

    The link parameters of different data sources vary. Table 1 describes the link parameters.
    Table 1 Link parameters

    | Connector | Description |
    |---|---|
    | RDS for PostgreSQL, RDS for SQL Server, PostgreSQL, Microsoft SQL Server | Because the JDBC drivers used by these relational databases are the same, the parameters to be configured are also the same and are described in PostgreSQL/SQLServer Link Parameters. |
    | Data Warehouse Service | For details about the parameters, see GaussDB(DWS) Link Parameters. |
    | SAP HANA | For details about the parameters, see SAP HANA Link Parameters. |
    | Dameng database | For details about the parameters, see Dameng Database Link Parameters. |
    | MySQL | For details about the parameters, see RDS for MySQL/MySQL Database Link Parameters. |
    | Oracle | For details about the parameters, see Oracle Database Link Parameters. |
    | Database Sharding | For details about the parameters, see Shard Link Parameters. |
    | Object Storage Service (OBS) | For details about the parameters, see OBS Link Parameters. |
    | MRS HDFS, FusionInsight HDFS, Apache HDFS | If the data source is HDFS of MRS, Apache Hadoop, or FusionInsight HD, see HDFS Link Parameters. |
    | MRS HBase, FusionInsight HBase, Apache HBase | If the data source is HBase of MRS, Apache Hadoop, or FusionInsight HD, see HBase Link Parameters. |
    | MRS Hive, FusionInsight Hive, Apache Hive | If the data source is Hive on MRS, Apache Hadoop, or FusionInsight HD, see Hive Link Parameters. |
    | CloudTable Service | If the data source is CloudTable, see CloudTable Link Parameters. |
    | FTP, SFTP | If the data source is an FTP or SFTP server, see FTP/SFTP Link Parameters. |
    | HTTP | This connector is used to read files with an HTTP/HTTPS URL, such as public files on third-party object storage systems and web disks. When creating an HTTP link, you only need to configure the link name. The URL is configured during job creation. |
    | MongoDB | If the data source is a local MongoDB, see MongoDB Link Parameters. |
    | Document Database Service (DDS) | If the data source is DDS, see DDS Link Parameters. |
    | Redis, Distributed Cache Service | If the data source is Redis or DCS, see Redis Link Parameters. |
    | MRS Kafka, Apache Kafka | If the data source is MRS Kafka or Apache Kafka, see Kafka Link Parameters. |
    | Data Ingestion Service | If the data source is DIS, see DIS Link Parameters. |
    | Cloud Search Service (CSS), Elasticsearch | If the data source is CSS or Elasticsearch, see CSS Link Parameters. |
    | Data Lake Insight | If the data source is DLI, see DLI Link Parameters. |
    | DMS Kafka | If the data source is DMS Kafka, see DMS Kafka Link Parameters. |
    | Cassandra | If the data source is Cassandra, see Cassandra Link Parameters. NOTE: Cassandra is not supported in version 2.9.3.300 or later. |
    | MRS Hudi | For details about the parameters, see MRS Hudi Link Parameters. |
    | MRS ClickHouse | For details about the parameters, see MRS ClickHouse Link Parameters. |
    | ShenTong database | For details about the parameters, see ShenTong Database Link Parameters. |

    Currently, the following data sources are in the open beta test (OBT) phase: FusionInsight HDFS, FusionInsight HBase, FusionInsight Hive, SAP HANA, Document Database Service, CloudTable Service, Cassandra, DMS Kafka, Cloud Search Service, Database Sharding, and ShenTong Database.

  4. After configuring the link parameters, click Test to check whether the link is available. Alternatively, click Save, and the system automatically tests the link.

    If the network connection is poor or the data source is large, the link test may take 30 to 60 seconds.

Managing Links

CDM allows you to perform the following operations on created links:
  • Deleting a link: You can delete links that are not used by any job.
  • Editing a link: You can modify link parameters but cannot reselect the connector. To modify a link, you need to re-enter the password for accessing the data source.
  • Testing connectivity: You can test the connectivity of a saved link.
  • Viewing the JSON file of a link: You can view the parameters of a link in JSON format.
  • Editing the JSON file of a link: You can modify the parameters of a link in JSON format.
  • Viewing the backend link: You can view the backend link corresponding to a link. For example, you can query details about the backend link if it is enabled.
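
Because only links that are not used by any job can be deleted, it can help to list such links before a cleanup. The following is a minimal sketch that works on data you have already collected. The job field names from-link-name and to-link-name and all sample names are assumptions made for illustration; check them against the job definitions in your environment.

```python
def unused_links(link_names, jobs):
    """Return the link names that no job references as source or destination."""
    used = set()
    for job in jobs:
        # Assumed field names; adjust them to match your job definitions.
        used.add(job.get("from-link-name"))
        used.add(job.get("to-link-name"))
    return [name for name in link_names if name not in used]


# Hypothetical sample data.
links = ["mysql_src", "obs_dst", "old_ftp"]
jobs = [
    {"name": "mysql2obs", "from-link-name": "mysql_src", "to-link-name": "obs_dst"},
]
print(unused_links(links, jobs))  # prints ['old_ftp']
```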

Before managing a link, ensure that the link is not used by any job to avoid affecting job execution. The procedure for managing links is as follows:

  1. Log in to the management console and choose Service List > Cloud Data Migration. On the CDM console, choose Cluster Management in the left navigation pane. Locate the row that contains the target cluster and click Job Management in the Operation column. On the displayed page, click the Links tab.
  2. On the Links page, locate the target link and perform the required operation.

    • Deleting a link: Click Delete in the Operation column to delete a link. Alternatively, select the links that are not used by any job and click Delete Link above the list to delete them.
    • Editing the link: Click the link name or click Edit in the Operation column to access the page for modifying the link. When modifying the link, you need to re-enter the password for accessing the data source.
    • Testing connectivity of the link: Click Test Connectivity in the Operation column.
    • Viewing the JSON file of the link: In the Operation column, choose More > View Link JSON to view link parameters in JSON format.
    • Editing the JSON file of the link: In the Operation column, choose More > Edit Link JSON to modify link parameters in JSON format.
    • Viewing the backend link: In the Operation column, choose More > View Backend Link to view the backend link corresponding to the link.
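
If you copy the JSON of a link out of the console, for example to compare two links or to attach it to a ticket, you may want to mask credentials first. The following is a minimal sketch under stated assumptions: the layout of the sample (a list of name/value pairs under inputs) and the parameter names are hypothetical and may differ from the JSON that CDM displays, and the keyword list is only a starting point.

```python
import copy

# Substrings that mark a parameter name as sensitive (assumed list).
SENSITIVE = ("password", "secret")


def mask_secrets(node):
    """Return a deep copy of node with sensitive name/value pairs masked."""
    masked = copy.deepcopy(node)
    _mask_in_place(masked)
    return masked


def _mask_in_place(node):
    if isinstance(node, dict):
        name = str(node.get("name", "")).lower()
        if "value" in node and any(key in name for key in SENSITIVE):
            node["value"] = "******"
        for child in node.values():
            _mask_in_place(child)
    elif isinstance(node, list):
        for child in node:
            _mask_in_place(child)


# Hypothetical link JSON, for illustration only.
link = {
    "name": "mysql_link",
    "inputs": [
        {"name": "linkConfig.host", "value": "192.0.2.10"},
        {"name": "linkConfig.password", "value": "example-only"},
    ],
}
print(mask_secrets(link)["inputs"][1]["value"])  # prints ******
```

The function works on a copy, so the original dictionary stays unchanged.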