Updated on 2024-04-02 GMT+08:00

JDBCServer Interface

Overview

JDBCServer is another implementation of HiveServer2 in Hive. It uses Spark SQL at the bottom layer to process SQL statements, and therefore delivers better performance than Hive.

JDBCServer is a JDBC interface. Users can log in to JDBCServer and access Spark SQL data through JDBC. When JDBCServer starts, it launches a Spark SQL application, and all clients connected through JDBC share the resources of this application, which means that different users can share data. JDBCServer also starts a listener that waits for JDBC client connections and submits their queries once they are connected. Therefore, when configuring JDBCServer, you must configure at least its host name and port. If Hive data is required, you must also provide the URIs of the Hive metastore.

By default, JDBCServer starts a JDBC service on port 22550 of the installation node. (To change the port, configure the hive.server2.thrift.port parameter.) You can connect to JDBCServer and run SQL statements either by using Beeline or by running JDBC client code.

For more information about JDBCServer, visit the Spark official website: http://spark.apache.org/docs/3.1.1/sql-programming-guide.html#distributed-sql-engine.

Beeline

For the Beeline connection methods provided by the open-source community, visit https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients.

The following command is an example of connecting with Beeline:

sh CLIENT_HOME/spark/bin/beeline -u "jdbc:hive2://<zkNode1_IP>:<zkNode1_Port>,<zkNode2_IP>:<zkNode2_Port>,<zkNode3_IP>:<zkNode3_Port>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=sparkthriftserver2x;"

  • <zkNode1_IP>:<zkNode1_Port>,<zkNode2_IP>:<zkNode2_Port>,<zkNode3_IP>:<zkNode3_Port> indicates the ZooKeeper URLs. Multiple URLs are separated by commas. For example: 192.168.81.37:2181,192.168.195.232:2181,192.168.169.84:2181.
  • sparkthriftserver2x indicates the ZooKeeper directory from which a random JDBCServer instance is selected for the client to connect to.
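The same connection string can be assembled in client code. The following is a minimal sketch that reproduces the URL format of the Beeline command above; the class name JdbcServerUrl and its build method are hypothetical helpers introduced for illustration, and the ZooKeeper addresses are the example values from the list above.

```java
import java.util.Arrays;
import java.util.List;

public class JdbcServerUrl {
    /**
     * Joins ZooKeeper "ip:port" entries with commas and appends the
     * service-discovery parameters used in the Beeline example.
     * Hypothetical helper, not part of any JDBCServer API.
     */
    public static String build(List<String> zkNodes, String namespace) {
        if (zkNodes == null || zkNodes.isEmpty()) {
            throw new IllegalArgumentException("At least one ZooKeeper node is required");
        }
        return "jdbc:hive2://" + String.join(",", zkNodes)
                + "/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=" + namespace + ";";
    }

    public static void main(String[] args) {
        // Example addresses taken from the parameter description above.
        List<String> zkNodes = Arrays.asList(
                "192.168.81.37:2181", "192.168.195.232:2181", "192.168.169.84:2181");
        System.out.println(build(zkNodes, "sparkthriftserver2x"));
    }
}
```

The printed string can be passed to Beeline with -u or used as the URL in JDBC client code.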

JDBC Client Codes

You can log in to JDBCServer by using JDBC client code and access Spark SQL data. For details, see Accessing the Spark SQL Through JDBC.

Enhanced Features

Compared with the open-source community version, Huawei provides two enhanced features: the JDBCServer HA solution and a configurable timeout for connections to JDBCServer.

  • The JDBCServer HA solution is described as follows:

    When multiple active JDBCServer nodes provide services at the same time, a new client connection is routed to another active node if one node fails, ensuring continuous service for the cluster. The operations are the same whether you use Beeline or JDBC client code.

  • Configure the timeout of the connection between the client and JDBCServer.
    • Beeline

      During network congestion, this feature prevents Beeline from hanging while it waits indefinitely for a response from the server. To use it:

      When starting Beeline, add --socketTimeOut=n, where n is the timeout for waiting for the server's response. The unit is seconds and the default value is 0 (never time out). Set the maximum waiting time as required.

    • JDBC Client Codes

      During network congestion, this feature prevents the client from hanging while it waits indefinitely for a response from the server. To use it:

      Before obtaining the JDBC connection with the DriverManager.getConnection method, call the DriverManager.setLoginTimeout(n) method to configure the timeout, where n is the timeout for waiting for the server's response. The unit is seconds and the type is int. The default value is 0 (never time out). Set the maximum waiting time as required.
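The call order described above can be sketched as follows. This is a minimal illustration, not the product's reference client: the class name JdbcLoginTimeout and its connect method are hypothetical, the URL is supplied by the caller, and the sketch assumes that the Hive JDBC driver is on the classpath and that any required authentication parameters are already part of the URL.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class JdbcLoginTimeout {
    /**
     * Sets the DriverManager login timeout (in seconds) and then opens the
     * connection. A value of 0 means no timeout.
     */
    public static Connection connect(String url, int timeoutSeconds) throws SQLException {
        if (timeoutSeconds < 0) {
            throw new IllegalArgumentException("Timeout must not be negative");
        }
        // Must be called before getConnection so that the setting applies to this connection.
        DriverManager.setLoginTimeout(timeoutSeconds);
        return DriverManager.getConnection(url);
    }

    public static void main(String[] args) throws SQLException {
        // args[0] is the JDBC URL of JDBCServer; 30 seconds is an arbitrary example value.
        try (Connection connection = connect(args[0], 30)) {
            System.out.println("Connection open: " + !connection.isClosed());
        }
    }
}
```

Note that DriverManager.setLoginTimeout is a JVM-wide setting, so it also affects connections that other code in the same process opens afterwards.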