Help Center> Data Lake Insight> Getting Started> Creating an Enhanced Datasource Connection to RDS
Updated on 2024-05-29 GMT+08:00

Creating an Enhanced Datasource Connection to RDS

Scenario

This example creates an enhanced datasource connection for a Spark SQL job to access RDS database tables.

Procedure

Create an enhanced datasource connection to access RDS. In this example, you need to create an RDS MySQL instance, an RDS database table, and a DLI enhanced datasource connection, and then access RDS database tables through a Spark SQL job. The procedure is as follows:

Step 1: Create an RDS MySQL Instance

Step 2: Create an RDS Database Table

Step 3: Create a Queue

Step 4: Create an Enhanced Datasource Connection

Step 5: Submit a SQL Job

Step 1: Create an RDS MySQL Instance

The sample job name is JobSample, and RDS is used as the data source. For details about how to create an RDS MySQL instance, see Getting Started with RDS for MySQL.
  1. Log in to the RDS console.
  2. In the upper left corner of the management console, select the target region and project.
  3. On the Instance Management page, click Buy DB Instance.
  4. On the displayed page, select a billing mode and configure information about your DB instance. Then, click Next.
    Set basic information as follows:
    • Billing Mode: Pay-per-use
    • Region: The region where your RDS resources will be located. You can change it on the creation page, or go back to the Instance Management page and change it in the upper left corner.
    • DB Instance Name: Retain the default value.
    • DB Engine: MySQL
    • DB Engine Version: 8.0
    • DB Instance Type: Single
    • Storage Type: Cloud SSD
    • AZ: default value
    • Time Zone: default value
    • Instance Class: default value
    • Storage Space: default value
    • Disk Encryption: Disable
    • VPC: A dedicated virtual network in which your RDS DB instances are located. For details about how to create a VPC and subnet, see "Creating a VPC" in Virtual Private Cloud User Guide. If you need to create and use a subnet in an existing VPC, see "Creating a Subnet for the VPC" in Virtual Private Cloud User Guide.
      • The created VPC and the DB instance must be in the same region.
      • Retain the default settings unless otherwise specified.
    • Database Port: The database port of the read replica and the primary DB instance must be the same.
    • Security Group: For details about how to create a security group, see Creating a Security Group. For details about how to add rules to a security group, see Adding a Security Group Rule.
    • Password: Configure
    • Administrator: root
    • Administrator Password: The password can contain 8 to 32 characters and contain at least three types of the following: uppercase letters, lowercase letters, digits, and special characters (~!@#$*%^-_=+()?,&).
    • Confirm Password: Must be the same as Administrator Password.
    • Parameter Template: default values
    • Table Name: Set the table names case insensitive.
    • Enterprise Project: default
    • Tag: Skip this parameter.
    • Required Duration: Select a duration as needed.
    • Quantity: Set to 1.
    • Read Replica: Skip
  5. Click Next. The confirmation page is displayed.
  6. Click Submit.
  7. To view and manage the DB instance, go to the Instance Management page.

    During the creation process, the DB instance status is Creating. When the creation process is complete, the instance status will change to Available. You can view the detailed progress and result of the task on the Task Center page.

Step 2: Create an RDS Database Table

  1. Log in to the RDS console.
  2. In the upper left corner of the management console, select the target region and project.
  3. On the Instance Management page, locate the created DB instance and view its floating IP address.
    Figure 1 Viewing the floating IP address
  4. Click More > Log In in the Operation column. On the displayed page, enter the username and password for logging in to the instance and click Test Connection. After Connection is successful is displayed, click Log In.
    Figure 2 RDS console
    Figure 3 Logging in to an Instance
  5. Click Create Database. In the displayed dialog box, enter database name dli_demo. Then, click OK.
    Figure 4 Creating a database
  6. Choose SQL Operation > SQL Query and run the following SQL statement to create a MySQL table for test:
    CREATE TABLE `dli_demo`.`tabletest` (
    	`id` VARCHAR(32) NOT NULL,
    	`name` VARCHAR(32) NOT NULL,
    	PRIMARY KEY (`id`)
    )	ENGINE = InnoDB
    	DEFAULT CHARACTER SET = utf8mb4;

Step 3: Create a Queue

To run a DLI SQL job for datasource connections, you cannot use the existing default queue. Instead, you need to create a SQL queue, for example, a queue named test. For details, see Creating a Queue.

  1. Go to the DLI console.
  2. On the Overview page of the DLI management console, click Buy Queue in the upper right corner.
  3. Configure parameters.
    • Billing Mode: Pay-per-use
    • Region: Retain the default value.
    • Project: Retain the default value or select one as you need.
    • Name: test
    • Queue Usage: Select For SQL and select Dedicated Resource Mode.
    • AZ Mode: Single AZ
    • Specifications: 16 CUs
    • Enterprise Project: default
    • Description: Leave it blank.
    • Advanced Settings: Custom
    • CIDR Block: The configured CIDR block cannot conflict with the RDS subnet CIDR block.
    • Queue Type: Basic
    • Tags: Leave it blank.
  4. Click Buy to confirm the configuration.
  5. Submit the request.

Step 4: Create an Enhanced Datasource Connection

For details about datasource connections, see Datasource Connections > Enhanced Datasource Connections in the Data Lake Insight User Guide.

  • The enhanced datasource connection function supports only pay-per-use queues.
  • The CIDR block of the DLI queue bound with a datasource connection cannot overlap with the CIDR block of the data source.
  • Datasource connections cannot be created for the default queue.
  • To access a datasource connection table, you need to use the queue for which a datasource connection has been created.
  1. In the navigation pane of the DLI management console, choose Resources > Queue Management. The created SQL queue test is displayed in the queue list.
  2. In the navigation pane of the DLI management console, choose Global Configuration > Service Authorization. On the displayed page, select VPC Administrator, and click Update to grant the DLI user the permission to access VPC resources. The permission is used to create a VPC peering connection.
    Figure 5 Updating agency permissions
  3. In the navigation pane of the DLI management console, choose Datasource Connections.
  4. Click the Enhanced tab and click Create in the upper left corner. Set the following parameters:
    • Connection Name: dlirds
    • Resource Pool: test

      If you are not sure about the queue to be bound when creating an enhanced datasource connection, you do not need to bind it. After you create the datasource connection, click More in its Operation column and select Bind Queue to bind it to a queue. For details, see Binding a Queue.

    • VPC: Select the VPC where MySQL DB instance is located.

      On the Instance Management page of the RDS console, click the target instance. On the displayed page, choose Connection Information > VPC to obtain the VPC information.

    • Subnet: Select the subnet where the MySQL DB instance is located.
    • Host Information: (Optional) When connecting to the HBase cluster of MRS, enter the host name and IP address of the ZooKeeper instance.

      On the Instance Management page of the RDS console, click the target DB instance. On the displayed page, choose Connection Information > Subnet to obtain the subnet information.

    • Tags: Tags are used to identify cloud resources. A tag includes the tag key and tag value.
  5. Click OK.
  6. In the Enhanced tab, click the created connection dlirds to view its VPC Peering ID and Connection Status. If the connection status is Active, the connection is successful.
  7. Test the connectivity between the queue and the DB instance.
    1. On the Queue Management page, locate the target queue. In the Operation column, click More and select Test Address Connectivity.
      Figure 6 Testing address connectivity
    2. Enter the floating IP address of the DB instance.

      On the Instance Management page, click the target DB instance. On the displayed page, choose Connection Information > Floating IP Address to obtain the floating IP address.

      If the address is reachable, the queue and the DB instance are successfully connected over the network.

      Figure 7 Test result

      If the test result shows that the instance is unreachable, modify the security group rules of the VPC to which the instance belongs. The operations are as follows:

      • On the DLI management console, click Resources > Queue Management, select the bound queue, and click the arrow next to the queue name to view the network segment information of the queue.
      • On the Instance Management page of the RDS console, click the instance name. In the Connection Information area, locate Database Port to obtain the port number of the RDS DB instance.
      • In the Connection Information area, locate the Security Group and click the security group name to go to the management page. Click the Inbound Rules tab and click Add Rule. Set the protocol to TCP, the port to the RDS DB instance port, and Source to the CIDR block of the DLI queue. Click OK.
        Figure 8 VPC security group rule
      • After the configuration is complete, test the network connectivity again.

Step 5: Submit a SQL Job

This section uses a SQL job as an example to describe how to access an RDS table using a datasource connection. For sample code, see Using the Spark Job to Access Data Sources of Datasource Connections > Connecting to RDS in the Data Lake Insight Development Guide.

For details, see Creating and Submitting a Spark SQL Job.

  1. On the DLI management console, click SQL Editor in the navigation pane on the left. The SQL Editor page is displayed.
  2. In the editing window on the right of the SQL Editor page, enter the following SQL statement to create database db1 and click Execute.
    create database db1;
  3. On the top of the editing window, choose the test queue and the db1 database. Enter the following SQL statements to create a table, insert data to the table, and query the data. Click Execute.
    View the query result to verify that the query is successful and the datasource connection works.
    CREATE TABLE IF NOT EXISTS rds_test USING JDBC OPTIONS (
    'url' = 'jdbc:mysql://{{ip}}:{{port}}',  // Private IP address and port of RDS
      'driver' = 'com.mysql.jdbc.Driver',
    'dbtable' = 'dli_demo.tabletest', // Name of the created DB instance and table name
      'pwd_auth_name'="xxxxx" // Name of the datasource authentication of the password type created on DLI. If datasource authentication is used, you do not need to set the username and password for the job.
    )
    
    insert into rds_test VALUES ('123','abc');
    
    
    SELECT * from rds_test;