Updated on 2025-08-22 GMT+08:00

Configuring Hive to Read Data from a Relational Database

Scenario

Hive allows users to create external tables that are associated with other relational databases. Such external tables read data from the associated relational databases and support JOIN operations with other Hive tables.

Currently, Hive can read data from DB2 and Oracle relational databases.
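
For example, once the external table ora_test from the procedure below has been created, it can be joined with a native Hive table. The following HiveQL is only a sketch; user_info is a hypothetical Hive table with name and age columns:

  SELECT o.id, u.age
  FROM ora_test o
  JOIN user_info u ON (o.id = u.name);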

Prerequisites

The Hive client has been installed. For details about how to download and install the cluster client, see Installing an MRS Cluster Client.

Procedure

  1. Log in to the node where the client is installed as the Hive client installation user.
  2. Run the following command to go to the client installation directory:

    cd Client installation directory

    For example, if the client is installed in the /opt/client directory, run the following command:

    cd /opt/client

  3. Run the following command to configure environment variables:

    source bigdata_env

  4. Check whether Kerberos authentication is enabled for the cluster.

    • If yes, run the following command to authenticate the user:
      kinit Hive service user
    • If no, go to 5.
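
    For example, if Kerberos authentication is enabled and the Hive service user is hiveuser (a hypothetical user name used here for illustration), run the following command:

      kinit hiveuser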

  5. Run the following command to upload the JDBC driver JAR package of the target relational database to an HDFS directory:

    hdfs dfs -put Local path of the JAR package HDFS directory for storing the JAR package

    For example, to upload the Oracle driver JAR package in /opt to the /tmp directory in HDFS, run the following command:

    hdfs dfs -put /opt/ojdbc6.jar /tmp
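
    (Optional) You can then list the target HDFS directory to confirm that the JAR package has been uploaded:

    hdfs dfs -ls /tmp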

  6. Create an external table on the Hive client to associate with the relational database. The following example associates with an Oracle database running on Oracle Linux 6:

    1. Log in to the Hive client.
      beeline
    2. If Kerberos authentication is enabled for the cluster (in security mode), configure the admin permission for the user who creates the Hive table.
      set role admin;
    3. Add the driver JAR package of the target relational database. The driver JAR package varies depending on the database. Change the path of the JAR package as required.
      ADD JAR hdfs:///tmp/ojdbc6.jar;
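      You can optionally run the following command to confirm that the driver JAR package has been added to the current session:
      LIST JARS;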
    4. Create a Hive table. The Hive table must contain one more column than the query result returned by the relational database; the extra column (rownum in the following example) is used for paging queries.
      CREATE EXTERNAL TABLE ora_test
      (id STRING, rownum STRING)
      STORED BY 'com.qubitproducts.hive.storage.jdbc.JdbcStorageHandler'
      TBLPROPERTIES (
      "qubit.sql.database.type" = "ORACLE",
      "qubit.sql.jdbc.url" = "jdbc:oracle:thin:@//10.163.xxx.xxx:1521/mydb",
      "qubit.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
      "qubit.sql.query" = "select name from aaa",
      "qubit.sql.column.mapping" = "id=name",
      "qubit.sql.dbcp.username" = "test",
      "qubit.sql.dbcp.password" = "xxx");
      Table 1 Parameters for creating a Hive table

      qubit.sql.database.type: Type of the associated relational database, for example, ORACLE.

      qubit.sql.jdbc.url: JDBC URL for connecting to the relational database. The URL format varies depending on the database, for example, jdbc:oracle:thin:@//10.163.xxx.xxx:1521/mydb.

      qubit.sql.jdbc.driver: Class name of the relational database driver, for example, oracle.jdbc.OracleDriver.

      qubit.sql.query: SQL query statement executed in the relational database. The query result is returned to the Hive table.

      qubit.sql.column.mapping: (Optional) Mapping between Hive table columns and relational database table columns.

      qubit.sql.dbcp.username: Username for connecting to the relational database.

      qubit.sql.dbcp.password: Password for connecting to the relational database.

      Commands containing authentication passwords pose security risks. Disable the command recording function (history) before running such commands to prevent information leakage.
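
      After the table is created, you can query it from the same Beeline session to verify that data is read from the relational database. This is an optional, minimal check; the LIMIT value is arbitrary:
      SELECT * FROM ora_test LIMIT 10;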