Updated on 2024-08-12 GMT+08:00

Using the JDBC interface to submit a data analysis task

Function

This section describes how to use the JDBC sample program to complete data analysis tasks.

Example Codes

Use the Hive JDBC interface to submit a data analysis task. This sample program is stored in JDBCExample.java of hive-examples/hive-jdbc-example. The following modules implement this function:
  1. Read the property file of the HiveServer client. The hiveclient.properties file is saved in the hive-jdbc-example/src/main/resources directory of the JDBC example program provided by Hive.
    Properties clientInfo = null;
    String userdir = System.getProperty("user.dir") + File.separator
    + "conf" + File.separator;
    InputStream fileInputStream = null;
    try{
    clientInfo = new Properties();
    //hiveclient.properties indicates the client configuration file.
    //Obtain the hiveclient.properties file from the config directory in the directory where the client installation package of the corresponding instance is decompressed and upload the file to the hive-jdbc-example/src/main/resources directory of the JDBC sample project.
    String hiveclientProp = userdir + "hiveclient.properties" ;
    File propertiesFile = new File(hiveclientProp);
    fileInputStream = new FileInputStream(propertiesFile);
    clientInfo.load(fileInputStream);
    }catch (Exception e) {
    throw new IOException(e);
    }finally{
    if(fileInputStream != null){
    fileInputStream.close();
    fileInputStream = null;
    }
    }
  2. Obtain the IP address and port list of ZooKeeper, the cluster authentication mode, the SASL configuration of HiveServers, node names of HiveServers in ZooKeeper, the discovery mode from the client to the server, and the principal server process for user authentication. The configuration sample code is automatically read from the hiveclient.properties file.
    //The format of zkQuorum is xxx.xxx.xxx.xxx:2181,xxx.xxx.xxx.xxx:2181,xxx.xxx.xxx.xxx:2181"; 
    //xxx.xxx.xxx.xxx is the IP address of the node where ZooKeeper resides. The default port is 24002. 
        zkQuorum = clientInfo.getProperty("zk.quorum");
        auth = clientInfo.getProperty("auth");
        sasl_qop = clientInfo.getProperty("sasl.qop");
        zooKeeperNamespace = clientInfo.getProperty("zooKeeperNamespace");
        serviceDiscoveryMode = clientInfo.getProperty("serviceDiscoveryMode");
        principal = clientInfo.getProperty("principal"); 
         
  3. Define HQL. HQL must be a single statement and cannot contain ";".
         // Define HQL. HQL cannot contain ";" 
         String[] sqls = {"CREATE TABLE IF NOT EXISTS employees_info(id INT,name STRING)", 
                "SELECT COUNT(*) FROM employees_info", "DROP TABLE employees_info"};     
  4. Build JDBC URL.

    You can also implement pre-authentication without the need of providing the account and keytab file path. If IBM JDK is used to run Hive applications, pre-authentication in JDBC sample code 2 must be implemented.

    The following is an example of the JDBC URL composed of code snippets:

    jdbc:hive2://xxx.xxx.xxx.xxx:2181,xxx.xxx.xxx.xxx:2181,xxx.xxx.xxx.xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sasl.qop=auth-conf;auth=KERBEROS;principal=hive/hadoop.<system domain name>@<system domain name>;

    To obtain the system domain name, log in to FusionInsight Manager, choose System > Permission > Domain and Mutual Trust, and check the Local Domain parameter.

    hive/hadoop.the system domain name indicates the user name. All letters in the system domain name contained in the user name are in lowercase. For example, if Local Domain is set to 9427068F-6EFA-4833-B43E-60CB641E5B6C.COM, the user name is hive/hadoop.9427068f-6efa-4833-b43e-60cb641e5b6c.com.

         // Concat JDBC URL  
           // Normal mode 
           sBuilder.append(";serviceDiscoveryMode=")  
                   .append(serviceDiscoveryMode) 
                   .append(";zooKeeperNamespace=") 
                   .append(zooKeeperNamespace)
                   .append(";auth=none;"); 
  5. Load the Hive JDBC driver.
       // Load the Hive JDBC driver. 
       Class.forName(HIVE_DRIVER);
  6. Obtain the JDBC connection, confirm the HQL type (DDL/DML), call ports to run the HQL statement, return the queried column name and results to the console, and close the JDBC connection.
     
       Connection connection = null; 
         try { 
           // Obtain the JDBC connection. 
           // If the normal mode is used, the second parameter needs to be set to a correct username. Otherwise, the anonymous user will be used for login.
           connection = DriverManager.getConnection(url, "", ""); 
              
         // Create a table 
         // To import data to a table after the table is created, you can use the LOAD statement. For example, import data from the HDFS to the table. 
           //load data inpath '/tmp/employees.txt' overwrite into table employees_info; 
           execDDL(connection,sqls[0]); 
           System.out.println("Create table success!"); 
             
         // Query 
         execDML(connection,sqls[1]); 
      
         // Delete the table
         execDDL(connection,sqls[2]); 
         System.out.println("Delete table success!"); 
       } 
       finally {
         // Close the JDBC connection. 
           if (null != connection) { 
             connection.close(); 
           }
      
     public static void execDDL(Connection connection, String sql) 
       throws SQLException { 
         PreparedStatement statement = null; 
         try { 
           statement = connection.prepareStatement(sql); 
           statement.execute(); 
         } 
         finally { 
           if (null != statement) { 
             statement.close(); 
           } 
         } 
       } 
      
      
       public static void execDML(Connection connection, String sql) throws SQLException { 
         PreparedStatement statement = null; 
         ResultSet resultSet = null; 
         ResultSetMetaData resultMetaData = null; 
          
         try { 
           // Run the HQL statement. 
           statement = connection.prepareStatement(sql); 
           resultSet = statement.executeQuery(); 
            
           // Return the queried column name to the console. 
           resultMetaData = resultSet.getMetaData(); 
           int columnCount = resultMetaData.getColumnCount(); 
           for (int i = 1; i <= columnCount; i++) { 
             System.out.print(resultMetaData.getColumnLabel(i) + '\t'); 
           } 
           System.out.println(); 
            
           // Return the results to the console. 
           while (resultSet.next()) { 
             for (int i = 1; i <= columnCount; i++) { 
               System.out.print(resultSet.getString(i) + '\t'); 
             } 
             System.out.println(); 
           } 
         } 
         finally { 
           if (null != resultSet) { 
             resultSet.close(); 
           } 
            
           if (null != statement) { 
             statement.close(); 
           } 
         } 
       }