Updated on 2025-01-10 GMT+08:00

Connecting to DLI and Submitting SQL Jobs Using JDBC

Scenario

On Linux or Windows, you can use JDBC to connect to the DLI server.

  • Jobs submitted to DLI using JDBC are executed on the Spark engine.
  • After the function reconstruction of JDBC 2.X, query results can only be accessed from DLI job buckets. To use this feature, the following conditions must be met:
    • On the DLI management console, choose Global Configuration > Project to configure the job bucket.
    • Starting May 2024, new users can directly use DLI's function to write query results into buckets without needing to whitelist it.

      Users who started using DLI before May 2024 must submit a service ticket to request whitelisting before they can use this function.

DLI supports 13 data types. Each type can be mapped to a JDBC type. If JDBC is used to connect to the server, you must use the mapped Java type. Table 1 describes the mapping relationships.
Table 1 Data type mapping

DLI Data Type     JDBC Type    Java Type
INT               INTEGER      java.lang.Integer
STRING            VARCHAR      java.lang.String
FLOAT             FLOAT        java.lang.Float
DOUBLE            DOUBLE       java.lang.Double
DECIMAL           DECIMAL      java.math.BigDecimal
BOOLEAN           BOOLEAN      java.lang.Boolean
SMALLINT/SHORT    SMALLINT     java.lang.Short
TINYINT           TINYINT      java.lang.Short
BIGINT/LONG       BIGINT       java.lang.Long
TIMESTAMP         TIMESTAMP    java.sql.Timestamp
CHAR              CHAR         java.lang.Character
VARCHAR           VARCHAR      java.lang.String
DATE              DATE         java.sql.Date
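
The snippet below is a minimal sketch of how these mappings surface in client code. It assumes an open Connection conn (created as described in the Procedure below); the table t and its columns are hypothetical and only illustrate the ResultSet getter that corresponds to each Java type.

// Sketch only: read the mapped Java types from a ResultSet.
// Assumes an open java.sql.Connection "conn"; table "t" with columns
// (id INT, name STRING, price DECIMAL, created TIMESTAMP, dt DATE) is hypothetical.
try (Statement stmt = conn.createStatement();
     ResultSet rs = stmt.executeQuery("SELECT id, name, price, created, dt FROM t")) {
    while (rs.next()) {
        int id = rs.getInt("id");                                  // INT -> java.lang.Integer
        String name = rs.getString("name");                        // STRING -> java.lang.String
        java.math.BigDecimal price = rs.getBigDecimal("price");    // DECIMAL -> java.math.BigDecimal
        java.sql.Timestamp created = rs.getTimestamp("created");   // TIMESTAMP -> java.sql.Timestamp
        java.sql.Date dt = rs.getDate("dt");                       // DATE -> java.sql.Date
    }
}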

Prerequisites

Before using JDBC, perform the following operations:

  1. Get authorized.

    DLI uses Identity and Access Management (IAM) to implement fine-grained permissions for your enterprise-level tenants. IAM provides identity authentication, permissions management, and access control, helping you securely access your HUAWEI CLOUD resources.

    With IAM, you can use your HUAWEI CLOUD account to create IAM users for your employees, and assign permissions to the users to control their access to specific resource types.

    Currently, roles (coarse-grained authorization) and policies (fine-grained authorization) are supported. For details about permissions and authorization operations, see the Data Lake Insight User Guide.

  2. Create a queue. Choose Resources > Queue Management. On the displayed page, click Buy Queue in the upper right corner. On the Buy Queue page, set Type to For general purpose, that is, the compute resources used by Spark jobs.

    If the user who creates the queue is not an administrator, the queue can be used only after being authorized by the administrator. For details about how to assign permissions, see Queue Permission Management.

Procedure

  1. Install JDK 1.7 or later on the computer where JDBC is used, and configure environment variables.
  2. Obtain the DLI JDBC driver package huaweicloud-dli-jdbc-<version>.zip by referring to Downloading and Installing the JDBC Driver Package. Decompress the package to obtain huaweicloud-dli-jdbc-<version>-jar-with-dependencies.jar.
  3. On the computer using JDBC, add huaweicloud-dli-jdbc-<version>-jar-with-dependencies.jar to the classpath of the Java project.
  4. DLI JDBC provides two authentication modes, namely, token and AK/SK, to connect to DLI. For how to obtain the token and AK/SK, see Authentication.
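
    The following sketch shows how each mode maps to the attribute items described in Table 3; the bracketed values are placeholders, not real credentials, and reading the AK/SK from environment variables is one possible approach.

    // Sketch only: AK/SK-based authentication; read the AK/SK from environment variables rather than hard-coding them.
    Properties akskInfo = new Properties();
    akskInfo.setProperty("authenticationmode", "aksk");
    akskInfo.setProperty("regionname", "<region name>");
    akskInfo.setProperty("accesskey", System.getenv("AK"));
    akskInfo.setProperty("secretkey", System.getenv("SK"));

    // Sketch only: token-based authentication (supported by dli-jdbc-1.x).
    Properties tokenInfo = new Properties();
    tokenInfo.setProperty("authenticationmode", "token");
    tokenInfo.setProperty("token", "<token obtained as described in Authentication>");
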
  5. Call Class.forName() to load the DLI JDBC driver.

    Class.forName("com.huawei.dli.jdbc.DliDriver");

  6. Call the getConnection method of DriverManager to create a connection.

    Connection conn = DriverManager.getConnection(String url, Properties info);

    JDBC configuration items are passed using the URL. For details, see Table 2. JDBC configuration items can be separated by semicolons (;) in the URL, or you can dynamically set the attribute items using the Info object. For details, see Table 3.
    Table 2 Database connection parameters

    url
      The URL format is as follows:

      jdbc:dli://<endPoint>/<projectId>?<key1>=<val1>;<key2>=<val2>...

      • endPoint indicates the DLI domain name, and projectId indicates the project ID.

        To obtain the endpoint of DLI, see Regions and Endpoints. To obtain the project ID, log in to the public cloud, hover over your account name, and choose My Credentials from the shortcut menu.

      • Other configuration items are appended after the question mark (?) as key=value pairs separated by semicolons (;). They can also be passed using the Info object.

    Info
      The Info object passes user-defined configuration items. If Info does not pass any attribute item, you can set it to null. The format is as follows: info.setProperty("Attribute item", "Attribute value").

    Table 3 Attribute items

    queuename
      Mandatory: Yes
      Default value: -
      Description: Queue name of DLI.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    databasename
      Mandatory: No
      Default value: -
      Description: Name of a database.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    authenticationmode
      Mandatory: No
      Default value: token
      Description: Authentication mode. Currently, token- and AK/SK-based authentication modes are supported.
      Supported dli-jdbc: dli-jdbc-1.x

    accesskey
      Mandatory: Yes
      Default value: -
      Description: AK that acts as the authentication key. For how to obtain the AK, see Authentication.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    secretkey
      Mandatory: Yes
      Default value: -
      Description: SK that acts as the authentication key. For how to obtain the SK, see Authentication.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    regionname
      Mandatory: Yes, if authenticationmode is set to aksk
      Default value: -
      Description: Region name. For details, see Regions and Endpoints.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    token
      Mandatory: Yes, if authenticationmode is set to token
      Default value: -
      Description: Token used for token-based authentication. For details, see Authentication.
      Supported dli-jdbc: dli-jdbc-1.x

    charset
      Mandatory: No
      Default value: UTF-8
      Description: JDBC encoding mode.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    usehttpproxy
      Mandatory: No
      Default value: false
      Description: Whether to use the access proxy.
      Supported dli-jdbc: dli-jdbc-1.x

    proxyhost
      Mandatory: Yes, if usehttpproxy is set to true
      Default value: -
      Description: Access proxy host.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    proxyport
      Mandatory: Yes, if usehttpproxy is set to true
      Default value: -
      Description: Access proxy port.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    dli.sql.checkNoResultQuery
      Mandatory: No
      Default value: false
      Description: Whether to allow invoking the executeQuery API to execute statements (for example, DDL) that do not return results.
        • false: invoking the executeQuery API is allowed.
        • true: invoking the executeQuery API is not allowed.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    jobtimeout
      Mandatory: No
      Default value: 300
      Description: Timeout for job submission. Unit: second
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    directfetchthreshold
      Mandatory: No
      Default value: 1000
      Description: Threshold for the number of directly returned results. Check whether the number of returned results exceeds the threshold based on service requirements.
      Supported dli-jdbc: dli-jdbc-1.x
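
    As a sketch of the two equivalent ways to pass these attribute items, the snippet below appends some items to the URL after the question mark (?) and sets others on the Info object; <endPoint>, <projectId>, and the other bracketed values are placeholders.

    // Sketch only: attribute items can go in the URL (key=value pairs separated by ";") or in the Info object.
    String url = "jdbc:dli://<endPoint>/<projectId>?queuename=<queue>;databasename=<db>;charset=UTF-8";
    Properties info = new Properties();                       // add the authentication items from step 4 as well
    info.setProperty("jobtimeout", "600");                    // optional: job submission timeout, in seconds
    info.setProperty("dli.sql.checkNoResultQuery", "true");   // optional: reject executeQuery for statements without results
    info.setProperty("usehttpproxy", "true");                 // optional: access DLI through a proxy
    info.setProperty("proxyhost", "<proxy host>");
    info.setProperty("proxyport", "<proxy port>");
    Connection conn = DriverManager.getConnection(url, info);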

  7. Create a Statement object, set related parameters, and submit Spark SQL to DLI.

    Statement statement = conn.createStatement();

    statement.execute("SET dli.sql.spark.sql.forcePartitionPredicatesOnPartitionedTable.enabled=true");

    statement.execute("select * from tb1");

  8. Obtain the result.

    ResultSet rs = statement.getResultSet();

  9. Display the result.

    while (rs.next()) {
        int a = rs.getInt(1);
        int b = rs.getInt(2);
    }
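
    If the column count or types are not known in advance, the standard JDBC metadata API can be used instead of fixed-position getters; the following is a small sketch under that assumption.

    // Sketch only: print every column of every row using ResultSetMetaData.
    ResultSetMetaData meta = rs.getMetaData();
    int columnCount = meta.getColumnCount();
    while (rs.next()) {
        StringBuilder row = new StringBuilder();
        for (int i = 1; i <= columnCount; i++) {
            if (i > 1) {
                row.append(",");
            }
            row.append(rs.getString(i));   // getString is sufficient for displaying any column type
        }
        System.out.println(row);
    }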

  10. Close the connection.

    conn.close();

Example

  • Hard-coded or plaintext AK and SK pose significant security risks. To ensure security, encrypt your AK and SK, store them in configuration files or environment variables, and decrypt them when needed.
  • In this example, the AK and SK are read from environment variables. Set the AK and SK environment variables in the local environment first; the code obtains them using System.getenv("AK") and System.getenv("SK").
import java.sql.*;
import java.util.Properties;

public class DLIJdbcDriverExample {

    public static void main(String[] args) throws ClassNotFoundException, SQLException {
        Connection conn = null;
        try {
            Class.forName("com.huawei.dli.jdbc.DliDriver");
            String url = "jdbc:dli://<endpoint>/<projectId>?databasename=db1;queuename=testqueue";
            Properties info = new Properties();
            info.setProperty("authenticationmode", "aksk");
            info.setProperty("regionname", "<real region name>");
            info.setProperty("accesskey", "<System.getenv("AK")>");
            info.setProperty("secretkey", "<System.getenv("SK")>");
            conn = DriverManager.getConnection(url, info);
            Statement statement = conn.createStatement();
            statement.execute("select * from tb1");
            ResultSet rs = statement.getResultSet();
            int line = 0;
            while (rs.next()) {
                line ++;
                int a = rs.getInt(1);
                int b = rs.getInt(2);
                System.out.println("Line:" + line + ":" + a + "," + b);
            }
            statement.execute("SET dli.sql.spark.sql.forcePartitionPredicatesOnPartitionedTable.enabled=true");
            statement.execute("describe tb1");
            ResultSet rs1 = statement.getResultSet();
            line = 0;
            while (rs1.next()) {
                line ++;
                String a = rs1.getString(1);
                String b = rs1.getString(2);
                System.out.println("Line:" + line + ":" + a + "," + b);
            }
        } catch (SQLException ex) {
            // Report the failure instead of silently swallowing the exception.
            ex.printStackTrace();
        } finally {
            if (conn != null) {
                conn.close();
            }
        }
    }
}