Configuring Hive to Read Data from a Relational Database
Scenario
Hive allows users to create external tables to associate with other relational databases. External tables read data from associated relational databases and support Join operations with other tables in Hive.
Currently, Hive can read data from DB2 and Oracle relational databases.
Prerequisites
The Hive client has been installed. For details about how to download and install the cluster client, see Installing an MRS Cluster Client.
Procedure
- Log in to the node where the client is installed as the Hive client installation user.
- Run the following command to go to the client installation directory:
cd Client installation directory
For example, if the client is installed in the /opt/client directory, run the following command:
cd /opt/client
- Run the following command to configure environment variables:
source bigdata_env
- Check whether Kerberos authentication is enabled for the cluster.
- If yes, run the following command to authenticate the user:
kinit Hive service user
- If no, go to 5.
- If yes, run the following command to authenticate the user:
- Run the following command to upload the driver JAR package of the relational database to be associated to an HDFS directory.
hdfs dfs -put Directory where the JAR package is located HDFS directory for storing the JAR package
For example, to upload the Oracle driver JAR package in /opt to the /tmp directory in HDFS, run the following command:
hdfs dfs -put /opt/ojdbc6.jar /tmp
- Create an external table on the Hive client to associate with the relational database, as shown in the following example. The following is an example of associating with an Oracle Linux 6 database:
- Log in to the Hive client.
beeline
- If Kerberos authentication is enabled for the cluster (in security mode), configure the admin permission for the user who creates the Hive table.
set role admin;
- Add the driver JAR package of the target relational database. The driver JAR package varies depending on the database. Change the path of the JAR package as required.
ADD JAR hdfs:///tmp/ojdbc6.jar;
- Create a Hive table. Ensure that the table has one more column than that returned by the database. This column is used for pagination query.
CREATE EXTERNAL TABLE ora_test (id STRING,rownum string) STORED BY 'com.qubitproducts.hive.storage.jdbc.JdbcStorageHandler' TBLPROPERTIES ( "qubit.sql.database.type" = "ORACLE", "qubit.sql.jdbc.url" = "jdbc:oracle:thin:@//10.163.xxx.xxx:1521/mydb", "qubit.sql.jdbc.driver" = "oracle.jdbc.OracleDriver", "qubit.sql.query" = "select name from aaa", "qubit.sql.column.mapping" = "id=name", "qubit.sql.dbcp.username" = "test", "qubit.sql.dbcp.password" = "xxx");
Table 1 Parameters for creating a Hive table Parameter
Description
qubit.sql.database.type
The type of the associated relational database, for example, ORACLE.
qubit.sql.jdbc.url
URL for connecting to a relational database through JDBC. The URL format varies depending on the database, for example, jdbc:oracle:thin:@//10.163.xxx.xxx:1521/mydb.
qubit.sql.jdbc.driver
Name of the relational database driver class, for example, oracle.jdbc.OracleDriver.
qubit.sql.query
Query SQL statements executed in the relational database. The result is returned to the Hive table.
qubit.sql.column.mapping
(Optional) Mapping between Hive table columns and relational database table columns
qubit.sql.dbcp.username
Username for connecting to the relational database.
qubit.sql.dbcp.password
Password for connecting to the relational database.
Commands containing authentication passwords pose security risks. Disable the command recording function (history) before running such commands to prevent information leakage.
- Log in to the Hive client.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot