Updated on 2023-06-02 GMT+08:00

Creating MRS SparkSQL Data Connections

Before establishing a data connection to MRS SparkSQL, ensure that the following conditions are met:

  • You have created an MRS cluster of a version earlier than 2.x with Spark deployed, and Kerberos authentication is disabled for the cluster. DLV does not support MRS SparkSQL data sources that have Kerberos authentication enabled.
  • You have obtained the MRS SparkSQL address.
  • A CDM cluster is used as the network agent so that DLV can communicate with MRS. Ensure that an available CDM cluster exists in the same region, AZ, and VPC as the MRS data source. In addition, the CDM cluster and the MRS data source must be in the same security group, or security group rules must allow them to communicate with each other.

    In CDM, you only need to create a CDM cluster; no other operations are required.
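
The network conditions above (same region, AZ, and VPC, plus a shared security group or a rule that allows traffic) can be sketched as a small pre-flight check. This is only an illustrative sketch: the `Placement` class, its field names, and `placement_issues` are hypothetical and are not part of any DLV, CDM, or MRS API. The values would have to be read manually from the CDM and MRS consoles.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Placement:
    """Where a cluster lives. Field names are illustrative assumptions."""
    region: str
    az: str
    vpc_id: str
    security_groups: frozenset = field(default_factory=frozenset)


def placement_issues(cdm, mrs, rule_allows_traffic=False):
    """Return a list of problems; an empty list means the check passed."""
    issues = []
    # Region, AZ, and VPC must all match between the CDM cluster and MRS.
    for attr, label in (("region", "region"), ("az", "AZ"), ("vpc_id", "VPC")):
        if getattr(cdm, attr) != getattr(mrs, attr):
            issues.append(f"CDM and MRS are in different {label}s")
    # Either one security group is shared, or a rule permits the traffic.
    shares_group = bool(cdm.security_groups & mrs.security_groups)
    if not shares_group and not rule_allows_traffic:
        issues.append("no shared security group and no rule allowing traffic")
    return issues


if __name__ == "__main__":
    # Placeholder values, not real resource IDs.
    cdm = Placement("region-1", "az-1", "vpc-a", frozenset({"sg-1"}))
    mrs = Placement("region-1", "az-1", "vpc-a", frozenset({"sg-1"}))
    print(placement_issues(cdm, mrs))
```

An empty list means every listed condition holds. The check says nothing about whether the CDM cluster is actually available, which still has to be confirmed in the CDM console.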

Creating an MRS SparkSQL Data Connection

  1. Log in to the DLV console.
  2. On the Data page, click the workspace drop-down list at the top of the page, select the workspace to be accessed, and click New Data Connection.

    Figure 1 Creating a data connection

  3. In the New Data Connection dialog box, set Data Source Type to MRS SparkSQL and set the related parameters.

    Figure 2 Creating an MRS SparkSQL data connection

    Table 1 describes the MRS SparkSQL data connection parameters.

    Table 1 MRS SparkSQL data connection parameters

    | Parameter | Description |
    |-----------|-------------|
    | Name | Name of the data connection. Must contain 1 to 32 characters and contain only letters, digits, hyphens (-), and underscores (_). |
    | Cluster Name | Name of the MRS cluster. |
    | Domain Name | After an MRS cluster is selected, the preferred private IP address of the cluster will be automatically matched. The domain name cannot be changed. |
    | Agent | Select an available connection agent, such as VPC connection, CDM proxy, or Internet. NOTE: MRS SparkSQL is not a fully managed service and thus cannot be directly connected to fully managed DLV. A CDM cluster serves as an agent that enables communication between them. |
    | Database | You can select a database from the Database drop-down list. |

  4. Click OK.
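
The naming rule for the Name parameter in Table 1 (1 to 32 characters; letters, digits, hyphens, and underscores only) can be checked before the dialog box is filled in. The following is a minimal sketch: the function name `is_valid_connection_name` is hypothetical, and it assumes that "letters" means ASCII letters, which the table does not state explicitly.

```python
import re

# Rule from Table 1: 1 to 32 characters, only letters, digits, "-", and "_".
# Assumption: "letters" is read as ASCII letters (A-Z, a-z).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,32}")


def is_valid_connection_name(name: str) -> bool:
    """Return True if the whole string satisfies the naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    print(is_valid_connection_name("mrs_sparksql-01"))  # True
    print(is_valid_connection_name("has space"))        # False
    print(is_valid_connection_name("a" * 33))           # False
```

`fullmatch` is used instead of `match` so that a valid prefix followed by a disallowed character is rejected.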

Using MRS SparkSQL Data Sources

You can configure and use MRS SparkSQL data sources by referring to the instructions in Using DWS Data Sources.