Scala Sample Code

Updated on 2022-09-14 GMT+08:00

Function Description

A user-defined client uses the JDBC API to submit a data analysis task and return the results.

Sample Code

  1. Define the SQL statements. Each entry must be a single statement and must not contain a semicolon (";"). Example:

    import scala.collection.mutable.ArrayBuffer

    // Each element is one statement; none may contain ";".
    val sqlList = new ArrayBuffer[String]
    sqlList += "CREATE TABLE CHILD (NAME STRING, AGE INT) " +
      "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
    sqlList += "LOAD DATA INPATH '/home/data' INTO TABLE CHILD"
    sqlList += "SELECT * FROM child"
    sqlList += "DROP TABLE child"

    NOTE:

    • Place the data file of the sample project in the home directory of the host where JDBCServer is located.
    • Ensure that the user and user group of the local data file are the same as those of the created table.
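
    Because each entry must be a single statement without ";", a small check before submission can catch mistakes early. The following is a minimal sketch: isSingleStatement is a hypothetical helper that is not part of the sample project, and it naively rejects any semicolon, including one inside a quoted literal.

```scala
import scala.collection.mutable.ArrayBuffer

object SqlListCheck {
  // Hypothetical helper: accepts a statement only if it is non-empty and contains no ';'.
  // Naive by design: a ';' inside a quoted literal is rejected as well.
  def isSingleStatement(sql: String): Boolean =
    sql.trim.nonEmpty && !sql.contains(";")

  def main(args: Array[String]): Unit = {
    val sqlList = new ArrayBuffer[String]
    sqlList += "SELECT * FROM child"
    sqlList += "SELECT * FROM child; DROP TABLE child" // two statements in one entry

    // Report which entries satisfy the single-statement rule.
    for (sql <- sqlList) {
      val verdict = if (isSingleStatement(sql)) "ok" else "rejected"
      println(s"$verdict: $sql")
    }
  }
}
```

    Running such a check before the statements are sent keeps a malformed entry from failing halfway through the batch.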

  2. Build the JDBC URL.

    NOTE:

    In HA mode, the host and port part of the URL must be ha-cluster.

    For a normal cluster, change lines 61 and 62 of com.huawei.bigdata.spark.examples.ThriftServerQueriesTest.scala in the sample code from

    val sb = new StringBuilder("jdbc:hive2://ha-cluster/default" + securityConfig)

    to

    val sb = new StringBuilder("jdbc:hive2://ha-cluster/default")

    val HA_CLUSTER_URL = "ha-cluster"
    val sb = new StringBuilder(s"jdbc:hive2://$HA_CLUSTER_URL/default;")
    val url = sb.toString()
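
    The two URL forms described in the note can be sketched as one helper. This is an illustration only: buildUrl and its securityConfig parameter are assumed names, and ";<securityConfig>" is a placeholder, not the real security configuration string used by the sample project.

```scala
object JdbcUrlSketch {
  val HaClusterUrl = "ha-cluster"

  // Hypothetical helper: appends the security configuration only when one is supplied.
  // Passing None yields the form of the URL described for a normal cluster.
  def buildUrl(securityConfig: Option[String]): String = {
    val sb = new StringBuilder(s"jdbc:hive2://$HaClusterUrl/default")
    securityConfig.foreach(cfg => sb.append(cfg))
    sb.toString()
  }

  def main(args: Array[String]): Unit = {
    println(buildUrl(None))
    println(buildUrl(Some(";<securityConfig>"))) // placeholder value, not a real setting
  }
}
```

    Modeling the security configuration as an Option keeps both cluster types on one code path instead of requiring an edit to the source file.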

  3. Load the Hive JDBC driver.

    Class.forName("org.apache.hive.jdbc.HiveDriver").newInstance()
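
    If the driver JAR is missing, Class.forName throws ClassNotFoundException. A minimal sketch of checking for the class first is shown below; isOnClasspath is a hypothetical helper, not part of the sample project.

```scala
import scala.util.Try

object DriverCheck {
  // Returns true if the named class can be loaded from the classpath.
  // Try captures the ClassNotFoundException raised for a missing class.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  def main(args: Array[String]): Unit = {
    val driver = "org.apache.hive.jdbc.HiveDriver"
    if (isOnClasspath(driver)) println(s"Found $driver")
    else println(s"$driver is missing; check that the Hive JDBC JAR is on the classpath")
  }
}
```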

  4. Obtain the JDBC connection, execute the HiveQL statements, print the queried column names and results to the console, and close the JDBC connection.

    If the network is congested, configure a timeout for the connection between the client and JDBCServer so that the client does not hang while waiting indefinitely for a response from the server. Configure it as follows:

    Before calling DriverManager.getConnection to obtain the JDBC connection, call DriverManager.setLoginTimeout(n) to set the timeout. n is the time to wait for the server to respond, in seconds. Its type is Int and its default value is 0, which means the wait never times out.

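    import java.sql.{Connection, DriverManager, PreparedStatement}

    var connection: Connection = null
    var statement: PreparedStatement = null
    try {
      connection = DriverManager.getConnection(url)
      // sqlList is the statement list defined in step 1.
      for (sql <- sqlList) {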
        println(s"---- Begin executing sql: $sql ----")
        statement = connection.prepareStatement(sql)
    
        val result = statement.executeQuery()
    
        val resultMetaData = result.getMetaData
        val colNum = resultMetaData.getColumnCount
        for (i <- 1 to colNum) {
          print(resultMetaData.getColumnLabel(i) + "\t")
        }
        println()
    
        while (result.next()) {
          for (i <- 1 to colNum) {
            print(result.getString(i) + "\t")
          }
          println()
        }
        println(s"---- Done executing sql: $sql ----")
      }
    } finally {
      if (null != statement) {
        statement.close()
      }
    
      if (null != connection) {
        connection.close()
      }
    }
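
    The timeout configuration described in this step can be sketched as follows. The value of 30 seconds and the helper name configureLoginTimeout are assumptions chosen for illustration; pick a value that suits your network.

```scala
import java.sql.DriverManager

object LoginTimeoutSketch {
  // Hypothetical helper: sets the login timeout (in seconds) and returns the value now in effect.
  def configureLoginTimeout(seconds: Int): Int = {
    DriverManager.setLoginTimeout(seconds)
    DriverManager.getLoginTimeout
  }

  def main(args: Array[String]): Unit = {
    // Assumed value for illustration: wait at most 30 seconds. 0 would mean no limit.
    val effective = configureLoginTimeout(30)
    println(s"Login timeout is now $effective s")

    // DriverManager.getConnection(url) would follow here, as in the code above.
  }
}
```

    The call must come before DriverManager.getConnection, because the timeout is read when the connection attempt is made.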
