Updated on 2025-01-10 GMT+08:00

Connecting to DLI and Submitting SQL Jobs Using JDBC

Scenario

On Linux or Windows, you can use JDBC to connect to the DLI server.

  • Jobs submitted to DLI using JDBC are executed on the Spark engine.
  • After the function reconstruction of JDBC 2.X, query results can only be accessed from DLI job buckets. To use this feature, the following conditions must be met:
    • On the DLI management console, choose Global Configuration > Project to configure the job bucket.
    • Starting May 2024, new users can directly use DLI's function to write query results into buckets without needing to whitelist it.

      Users who started using DLI before May 2024 must submit a service ticket to request whitelisting before they can use this function.

DLI supports 13 data types. Each type can be mapped to a JDBC type. If JDBC is used to connect to the server, you must use the mapped Java type. Table 1 describes the mapping relationships.
Table 1 Data type mapping

DLI Data Type     JDBC Type    Java Type
INT               INTEGER      java.lang.Integer
STRING            VARCHAR      java.lang.String
FLOAT             FLOAT        java.lang.Float
DOUBLE            DOUBLE       java.lang.Double
DECIMAL           DECIMAL      java.math.BigDecimal
BOOLEAN           BOOLEAN      java.lang.Boolean
SMALLINT/SHORT    SMALLINT     java.lang.Short
TINYINT           TINYINT      java.lang.Short
BIGINT/LONG       BIGINT       java.lang.Long
TIMESTAMP         TIMESTAMP    java.sql.Timestamp
CHAR              CHAR         java.lang.Character
VARCHAR           VARCHAR      java.lang.String
DATE              DATE         java.sql.Date
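
The snippet below is a minimal sketch of how these mappings surface in client code. It assumes an open Connection conn (created as described in the Procedure below); the table t and its columns are hypothetical and only illustrate the ResultSet getter that corresponds to each Java type.

// Sketch only: read the mapped Java types from a ResultSet.
// Assumes an open java.sql.Connection "conn"; table "t" with columns
// (id INT, name STRING, price DECIMAL, created TIMESTAMP, dt DATE) is hypothetical.
try (Statement stmt = conn.createStatement();
     ResultSet rs = stmt.executeQuery("SELECT id, name, price, created, dt FROM t")) {
    while (rs.next()) {
        int id = rs.getInt("id");                                  // INT -> java.lang.Integer
        String name = rs.getString("name");                        // STRING -> java.lang.String
        java.math.BigDecimal price = rs.getBigDecimal("price");    // DECIMAL -> java.math.BigDecimal
        java.sql.Timestamp created = rs.getTimestamp("created");   // TIMESTAMP -> java.sql.Timestamp
        java.sql.Date dt = rs.getDate("dt");                       // DATE -> java.sql.Date
    }
}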

Prerequisites

Before using JDBC, perform the following operations:

  1. Get authorized.

    DLI uses Identity and Access Management (IAM) to implement fine-grained permissions for your enterprise-level tenants. IAM provides identity authentication, permissions management, and access control, helping you securely access your HUAWEI CLOUD resources.

    With IAM, you can use your HUAWEI CLOUD account to create IAM users for your employees, and assign permissions to the users to control their access to specific resource types.

    Currently, roles (coarse-grained authorization) and policies (fine-grained authorization) are supported. For details about permissions and authorization operations, see the Data Lake Insight User Guide.

  2. Create a queue. Choose Resources > Queue Management. On the displayed page, click Buy Queue in the upper right corner. On the Buy Queue page, set Type to For general purpose, that is, the compute resources used by Spark jobs.

    If the user who creates the queue is not an administrator, the queue can be used only after being authorized by the administrator. For details about how to assign permissions, see Queue Permission Management.

Procedure

  1. Install JDK 1.7 or later on the computer where JDBC is used, and configure environment variables.
  2. Obtain the DLI JDBC driver package huaweicloud-dli-jdbc-<version>.zip by referring to Downloading and Installing the JDBC Driver Package. Decompress the package to obtain huaweicloud-dli-jdbc-<version>-jar-with-dependencies.jar.
  3. On the computer using JDBC, add huaweicloud-dli-jdbc-<version>-jar-with-dependencies.jar to the classpath of the Java project.
  4. DLI JDBC provides two authentication modes, namely, token and AK/SK, to connect to DLI. For how to obtain the token and AK/SK, see Authentication.
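
    The following sketch shows how each mode maps to the attribute items described in Table 3; the bracketed values are placeholders, not real credentials, and reading the AK/SK from environment variables is one possible approach.

    // Sketch only: AK/SK-based authentication; read the AK/SK from environment variables rather than hard-coding them.
    Properties akskInfo = new Properties();
    akskInfo.setProperty("authenticationmode", "aksk");
    akskInfo.setProperty("regionname", "<region name>");
    akskInfo.setProperty("accesskey", System.getenv("AK"));
    akskInfo.setProperty("secretkey", System.getenv("SK"));

    // Sketch only: token-based authentication (supported by dli-jdbc-1.x).
    Properties tokenInfo = new Properties();
    tokenInfo.setProperty("authenticationmode", "token");
    tokenInfo.setProperty("token", "<token obtained as described in Authentication>");
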
  5. Call Class.forName() to load the DLI JDBC driver.

    Class.forName("com.huawei.dli.jdbc.DliDriver");

  6. Call the getConnection method of DriverManager to create a connection.

    Connection conn = DriverManager.getConnection(String url, Properties info);

    JDBC configuration items are passed using the URL. For details, see Table 2. JDBC configuration items can be separated by semicolons (;) in the URL, or you can dynamically set the attribute items using the Info object. For details, see Table 3.
    Table 2 Database connection parameters

    url
      The URL format is as follows:

      jdbc:dli://<endPoint>/<projectId>?<key1>=<val1>;<key2>=<val2>...

      • endPoint indicates the DLI domain name, and projectId indicates the project ID.

        To obtain the endpoint of DLI, see Regions and Endpoints. To obtain the project ID, log in to the public cloud, hover over your account name, and choose My Credentials from the shortcut menu.

      • Other configuration items are appended after the question mark (?) as key=value pairs separated by semicolons (;). They can also be passed using the Info object.

    Info
      The Info object passes user-defined configuration items. If Info does not pass any attribute item, you can set it to null. The format is as follows: info.setProperty("Attribute item", "Attribute value").

    Table 3 Attribute items

    queuename
      Mandatory: Yes
      Default value: -
      Description: Queue name of DLI.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    databasename
      Mandatory: No
      Default value: -
      Description: Name of a database.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    authenticationmode
      Mandatory: No
      Default value: token
      Description: Authentication mode. Currently, token- and AK/SK-based authentication modes are supported.
      Supported dli-jdbc: dli-jdbc-1.x

    accesskey
      Mandatory: Yes
      Default value: -
      Description: AK that acts as the authentication key. For how to obtain the AK, see Authentication.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    secretkey
      Mandatory: Yes
      Default value: -
      Description: SK that acts as the authentication key. For how to obtain the SK, see Authentication.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    regionname
      Mandatory: Yes, if authenticationmode is set to aksk
      Default value: -
      Description: Region name. For details, see Regions and Endpoints.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    token
      Mandatory: Yes, if authenticationmode is set to token
      Default value: -
      Description: Token used for token-based authentication. For details, see Authentication.
      Supported dli-jdbc: dli-jdbc-1.x

    charset
      Mandatory: No
      Default value: UTF-8
      Description: JDBC encoding mode.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    usehttpproxy
      Mandatory: No
      Default value: false
      Description: Whether to use the access proxy.
      Supported dli-jdbc: dli-jdbc-1.x

    proxyhost
      Mandatory: Yes, if usehttpproxy is set to true
      Default value: -
      Description: Access proxy host.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    proxyport
      Mandatory: Yes, if usehttpproxy is set to true
      Default value: -
      Description: Access proxy port.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    dli.sql.checkNoResultQuery
      Mandatory: No
      Default value: false
      Description: Whether to allow invoking the executeQuery API to execute statements (for example, DDL) that do not return results.
        • false: invoking the executeQuery API is allowed.
        • true: invoking the executeQuery API is not allowed.
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    jobtimeout
      Mandatory: No
      Default value: 300
      Description: Timeout for job submission. Unit: second
      Supported dli-jdbc: dli-jdbc-1.x, dli-jdbc-2.x

    directfetchthreshold
      Mandatory: No
      Default value: 1000
      Description: Threshold for the number of directly returned results. Check whether the number of returned results exceeds the threshold based on service requirements.
      Supported dli-jdbc: dli-jdbc-1.x
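
    As a sketch of the two equivalent ways to pass these attribute items, the snippet below appends some items to the URL after the question mark (?) and sets others on the Info object; <endPoint>, <projectId>, and the other bracketed values are placeholders.

    // Sketch only: attribute items can go in the URL (key=value pairs separated by ";") or in the Info object.
    String url = "jdbc:dli://<endPoint>/<projectId>?queuename=<queue>;databasename=<db>;charset=UTF-8";
    Properties info = new Properties();                       // add the authentication items from step 4 as well
    info.setProperty("jobtimeout", "600");                    // optional: job submission timeout, in seconds
    info.setProperty("dli.sql.checkNoResultQuery", "true");   // optional: reject executeQuery for statements without results
    info.setProperty("usehttpproxy", "true");                 // optional: access DLI through a proxy
    info.setProperty("proxyhost", "<proxy host>");
    info.setProperty("proxyport", "<proxy port>");
    Connection conn = DriverManager.getConnection(url, info);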

  7. Create a Statement object, set related parameters, and submit Spark SQL to DLI.

    Statement statement = conn.createStatement();

    statement.execute("SET dli.sql.spark.sql.forcePartitionPredicatesOnPartitionedTable.enabled=true");

    statement.execute("select * from tb1");

  8. Obtain the result.

    ResultSet rs = statement.getResultSet();

  9. Display the result.

    while (rs.next()) {
        int a = rs.getInt(1);
        int b = rs.getInt(2);
    }
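
    If the column count or types are not known in advance, the standard JDBC metadata API can be used instead of fixed-position getters; the following is a small sketch under that assumption.

    // Sketch only: print every column of every row using ResultSetMetaData.
    ResultSetMetaData meta = rs.getMetaData();
    int columnCount = meta.getColumnCount();
    while (rs.next()) {
        StringBuilder row = new StringBuilder();
        for (int i = 1; i <= columnCount; i++) {
            if (i > 1) {
                row.append(",");
            }
            row.append(rs.getString(i));   // getString is sufficient for displaying any column type
        }
        System.out.println(row);
    }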

  10. Close the connection.

    conn.close();

Example

  • Hard-coded or plaintext AK and SK pose significant security risks. To ensure security, encrypt your AK and SK, store them in configuration files or environment variables, and decrypt them when needed.
  • In this example, the AK and SK are read from environment variables. Set the AK and SK environment variables in the local environment first; the code obtains them using System.getenv("AK") and System.getenv("SK").
import java.sql.*;
import java.util.Properties;

public class DLIJdbcDriverExample {

    public static void main(String[] args) throws ClassNotFoundException, SQLException {
        Connection conn = null;
        try {
            Class.forName("com.huawei.dli.jdbc.DliDriver");
            String url = "jdbc:dli://<endpoint>/<projectId>?databasename=db1;queuename=testqueue";
            Properties info = new Properties();
            info.setProperty("authenticationmode", "aksk");
            info.setProperty("regionname", "<real region name>");
            info.setProperty("accesskey", "<System.getenv("AK")>");
            info.setProperty("secretkey", "<System.getenv("SK")>");
            conn = DriverManager.getConnection(url, info);
            Statement statement = conn.createStatement();
            statement.execute("select * from tb1");
            ResultSet rs = statement.getResultSet();
            int line = 0;
            while (rs.next()) {
                line ++;
                int a = rs.getInt(1);
                int b = rs.getInt(2);
                System.out.println("Line:" + line + ":" + a + "," + b);
            }
            statement.execute("SET dli.sql.spark.sql.forcePartitionPredicatesOnPartitionedTable.enabled=true");
            statement.execute("describe tb1");
            ResultSet rs1 = statement.getResultSet();
            line = 0;
            while (rs1.next()) {
                line ++;
                String a = rs1.getString(1);
                String b = rs1.getString(2);
                System.out.println("Line:" + line + ":" + a + "," + b);
            }
        } catch (SQLException ex) {
            // Report the failure instead of silently swallowing the exception.
            ex.printStackTrace();
        } finally {
            if (conn != null) {
                conn.close();
            }
        }
    }
}