Updated on 2025-09-15 GMT+08:00

Example Typical Scenario: Configure Network Connectivity Between DLI and Data Sources on a Private Network

Scenario

To access a data source on a private network (such as MRS, RDS, CSS, Kafka, or GaussDB(DWS)), DLI must be able to reach the VPC of that data source. An enhanced datasource connection provides this connectivity by establishing a VPC peering connection between DLI and the VPC of the target service.

This section describes how to configure the network connectivity between DLI and data sources on a private network using an enhanced datasource connection.

Procedure

  1. Obtain data source information: Record the private IP address, port number, VPC, subnet, and security group of the data source.
  2. Obtain the CIDR block of the elastic resource pool: Record the CIDR block of the DLI elastic resource pool.
  3. Allow access from the DLI CIDR block: Add an inbound rule to the data source's security group that allows access from the DLI CIDR block.
  4. Create an enhanced datasource connection: On the DLI console, create an enhanced datasource connection to set up a peering connection between DLI and the data source so that the two networks can communicate.
  5. Test network connectivity: From the DLI queue, test the network connectivity to the data source.
Figure 1 Configure network connectivity between DLI and data sources on a private network

Preparations

Step 1: Obtain the Private IP Address, Port Number, and Security Group of an External Data Source

Record the network information of the data source. You will need it in the subsequent steps.

Typically, the following information needs to be recorded:

  • VPC and subnet: used to configure the enhanced datasource connection.
  • Private IP address and port number: used to test the network connectivity between DLI and the data source.
  • Security group: used to add the inbound rule that allows access from the DLI CIDR block.

For details about how to obtain the network information of common data sources, see Table 2.

Table 2 Methods of obtaining network information of each data source

Data Source

Method of Obtaining Network Information

DMS for Kafka

  1. Log in to the Kafka management console. In the navigation pane on the left, choose DMS for Kafka.
  2. In the navigation pane on the left, choose Kafka Instances. On the displayed page, click the name of the desired Kafka instance. The basic information page of the instance is displayed.
    • In the Connection area, obtain the private network addresses of the instance.
    • In the Network area, obtain the VPC and subnet of the instance.
    • In the Network area, obtain the security group of the instance.

RDS

  1. Log in to the RDS management console. In the navigation pane on the left, choose Instances.
  2. On the displayed page, click the name of a desired instance and check its connection information.

    Record the private IP address, VPC, subnet, database port, and security group of the RDS instance.

CSS

  1. Log in to the CSS management console. In the navigation pane on the left, choose Clusters > Elasticsearch.
  2. On the displayed page, click the name of the created CSS cluster.
  3. On the Cluster Information page, obtain the Private Network Address, VPC, Subnet, and Security Group.

GaussDB(DWS)

  1. Log in to the GaussDB(DWS) management console. In the navigation pane on the left, choose Clusters.
  2. On the displayed page, click the name of the created GaussDB(DWS) cluster.
  3. On the Basic Information tab, locate the Connection Information pane and obtain the private IP address and port number of the DB instance. In the Network pane, obtain the VPC, subnet, and security group information.

MRS HBase

An MRS 3.x cluster is used as an example.

  1. Log in to the MRS management console. In the navigation pane on the left, choose Clusters > Active Clusters.
  2. On the displayed page, click the name of a desired cluster.
  3. On the dashboard, obtain the VPC, subnet, and security group from the Basic Information pane.
  4. Creating a job that connects DLI to MRS HBase requires the ZooKeeper instance addresses and port of the MRS cluster, as well as the host information of the cluster. Obtain them as follows:
    1. On MRS Manager, choose Cluster > Name of the desired cluster > Services > ZooKeeper. Click the Instance tab and obtain the ZooKeeper host information such as the host name and service IP address.
    2. On MRS Manager, choose Cluster and click the name of the desired cluster. Choose Services > ZooKeeper. Click the Configurations tab and select All Configurations, search for the clientPort parameter, and obtain its value, that is, the ZooKeeper port number.
    3. Log in to any MRS node as user root over SSH.
    4. Run the following command to obtain the host information of the MRS cluster. Copy and save the output.

      cat /etc/hosts

      An example query result is as follows:

Step 2: Obtain the CIDR Block of the DLI Elastic Resource Pool

  1. Log in to the DLI management console.
  2. In the navigation pane on the left, choose Resources > Resource Pool.
  3. On the displayed page, select the elastic resource pool you want to check.
  4. Click to expand the basic information card of the elastic resource pool and view the pool's VPC CIDR block.

Step 3: Add a Rule to the Security Group of the External Data Source to Allow Access from the DLI Queue

  1. Log in to the VPC console.
  2. In the navigation pane on the left, choose Access Control > Security Groups.
  3. Click the name of the security group to which the external data source belongs.

    Obtain the security group name of the data source on the management console of the data source by referring to Step 1: Obtain the Private IP Address, Port Number, and Security Group of an External Data Source.

  4. On the Inbound Rules tab, add a rule to allow access from the CIDR block of the DLI elastic resource pool.

    Set the inbound rule parameters based on Table 3.

    Figure 2 Adding an inbound rule
    Table 3 Inbound rule parameters

    Parameter | Description | Example Value
    Priority | The security group rule priority. The value ranges from 1 to 100. The default value is 1, which is the highest priority. A smaller value indicates a higher priority. | 1
    Action | Action of the security group rule. | Select Allow.
    Protocol & Port | Network protocol: All, TCP, UDP, ICMP, or GRE. Port: port or port range over which traffic can reach your instance, from 1 to 65535. | In this example, select TCP. Leave the port blank or set it to the data source port obtained in Step 1: Obtain the Private IP Address, Port Number, and Security Group of an External Data Source.
    Type | Type of IP addresses. | IPv4
    Source | Allows access from IP addresses or instances in another security group. | In this example, enter the elastic resource pool CIDR block obtained in Step 2: Obtain the CIDR Block of the DLI Elastic Resource Pool.
    Description | Supplementary information about the security group rule. This parameter is optional. | -
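As a rough illustration of what the inbound rule must achieve, the following Python sketch checks that a rule's source CIDR block covers the entire elastic resource pool CIDR block and that its port range includes the data source port. The function name `rule_allows` and the sample values (`172.16.0.0/18`, port `3306`) are hypothetical and are not taken from any console or API. The check runs locally on the values you recorded and does not query the security group.

```python
import ipaddress
from typing import Optional, Tuple


def rule_allows(
    rule_source_cidr: str,
    rule_ports: Optional[Tuple[int, int]],
    pool_cidr: str,
    data_source_port: int,
) -> bool:
    """Return True if a TCP inbound rule admits traffic from the whole pool.

    rule_ports is an inclusive (low, high) range, or None when the port
    field of the rule is left blank (all ports).
    """
    source = ipaddress.ip_network(rule_source_cidr, strict=False)
    pool = ipaddress.ip_network(pool_cidr, strict=False)

    # Every address in the pool CIDR block must fall inside the rule's source.
    covers_pool = pool.subnet_of(source)

    # A blank port field matches any port; otherwise the port must be in range.
    covers_port = (
        rule_ports is None or rule_ports[0] <= data_source_port <= rule_ports[1]
    )
    return covers_pool and covers_port


if __name__ == "__main__":
    # Hypothetical values: replace them with the CIDR block from Step 2
    # and the data source port from Step 1.
    print(rule_allows("172.16.0.0/18", None, "172.16.0.0/18", 3306))
```

A rule whose source is narrower than the pool CIDR block, or whose port range excludes the data source port, makes the function return `False`, which is the kind of misconfiguration that would typically show up later as a failed connectivity test.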

Step 4: Create an Enhanced Datasource Connection

  1. Log in to the DLI management console. In the navigation pane on the left, choose Datasource Connections. On the displayed page, click the Enhanced tab and then click Create.
  2. In the displayed dialog box, set the following parameters:
  3. Click OK. Click the name of the created datasource connection to view its status. You can perform subsequent steps only after the connection status changes to Active.
  4. To connect to MRS HBase, you need to add MRS host information. The procedure is as follows:
    1. On the Datasource Connections page, click the Enhanced tab and locate the row that contains the created enhanced datasource connection. Click More > Modify Host in the Operation column.
    2. In the dialog box that appears, enter the MRS host information obtained in Step 1: Obtain the Private IP Address, Port Number, and Security Group of an External Data Source in the Host Information box.
      Figure 3 Modifying host information
    3. Click OK.
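The saved output of `cat /etc/hosts` usually contains loopback entries and comments that are not needed as host information. The following Python sketch shows one way to reduce the output to one `IP address host name` pair per line. It assumes that the Host Information box accepts entries in that form, which you should confirm on the console. The function names and the sample addresses and host names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical sample of saved /etc/hosts output from an MRS node.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
192.168.0.10 node-master1.example.com node-master1
# managed entries
192.168.0.11 node-core1.example.com  # core node
"""


def parse_hosts(text: str) -> List[Tuple[str, str]]:
    """Return (ip, first_host_name) pairs from /etc/hosts-style text."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line, or an address without a host name
        ip, name = fields[0], fields[1]
        if ip.startswith("127.") or ip == "::1":
            continue  # skip loopback entries
        entries.append((ip, name))
    return entries


def to_host_information(text: str) -> str:
    """Format the parsed entries as one 'ip host_name' pair per line."""
    return "\n".join(f"{ip} {name}" for ip, name in parse_hosts(text))


if __name__ == "__main__":
    print(to_host_information(SAMPLE_HOSTS))
```

Only the first host name on each line is kept. If your jobs refer to the nodes by a later alias on the same line, adapt `parse_hosts` accordingly.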

Step 5: Test Network Connectivity

  1. In the navigation pane on the left, choose Resources > Queue Management. On the displayed page, locate the desired queue, click More in the Operation column, and select Test Address Connectivity.
  2. In the displayed dialog box, enter the IP address and port number of the data source in the address box and click Test. Use the values obtained in Step 1: Obtain the Private IP Address, Port Number, and Security Group of an External Data Source. If the test succeeds, the queue can access the data source.

    For MRS HBase, enter the address in the format ZooKeeper IP address:ZooKeeper port or ZooKeeper host name:ZooKeeper port.