Configuring Hive to Access HBase Data
Scenario
Hive on HBase lets you query and manipulate data stored in HBase through Hive SQL. It combines HBase's efficient storage and real-time read/write capabilities with Hive's SQL query capabilities, providing a flexible and efficient way to process HBase data.
In MRS, Hive can access and process data stored in HBase through internal and external tables. This section describes how to use Hive on MRS to process MRS HBase data.
Prerequisites
- A cluster client has been installed. For how to install the client, see Installing a Client. In the following operations, the client is installed in the /opt/hadoopclient directory. Change the path as required.
- If Kerberos authentication is enabled for the cluster (security mode), a user for creating Hive on HBase tables has been created. The user has been added to the hive user group and granted the following HBase permissions:
- If Ranger authentication is enabled for HBase, configure the permissions to create tables, and write data into and read data from the tables for the user. For details, see Adding a Ranger Access Permission Policy for HBase.
- If Ranger authentication is disabled for HBase, configure the permissions to create tables, and write data into and read data from the tables for the user. For details, see Creating an HBase Permission Role.
Hive Accessing HBase Through Internal Tables
If the table does not yet exist in HBase, you can create it in Hive. Hive automatically writes the table structure and data to HBase. The following example describes how to create a table in Hive to access HBase.
- Log in to the node where the client is installed as the client installation user and run the following commands to configure environment variables and authenticate the user:
Go to the client installation directory.
cd Client installation directory
Load the environment variables.
source bigdata_env
Authenticate the user. Skip this step for clusters with Kerberos authentication disabled.
kinit Hive service user
- Log in to the Hive client.
beeline
- Create an HBase table in Hive, insert data into the table, and view the table data.
- Create an HBase table, for example, hive_hbase_table, in Hive.
create table hive_hbase_table(id int, name string) stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties ("hbase.columns.mapping" = ":key,cf1:name") tblproperties ("hbase.table.name" = "hive_hbase_table");
- Insert data into the table.
insert into table hive_hbase_table values(12,'abab');
- View the table data.
select * from hive_hbase_table;
The table data is as follows:
Figure 1 Viewing Hive table data
- Exit the Hive client.
!q
- Log in to the HBase client.
hbase shell
- Check whether the table has been created in HBase.
describe 'hive_hbase_table'
If the command output shown in Figure 2 is displayed, the hive_hbase_table table has been created in HBase using Hive.
- Check whether there is data written by Hive in HBase.
scan 'hive_hbase_table'
If the data matches the row inserted in step 3.b, the table created in Hive and its data have been written to HBase.
Figure 3 Viewing data in the HBase table
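The interactive steps above can also be scripted. The following is a minimal sketch, not part of the original procedure: it saves the Hive statements from step 3 to a file and runs them only if the client is present. The CLIENT_DIR default and the HQL_FILE location are assumptions for illustration. `beeline -f` and `hbase shell -n` are standard options, but whether the beeline command on your client connects without further arguments in non-interactive mode should be verified on your cluster.

```shell
#!/bin/sh
# Sketch only. CLIENT_DIR and HQL_FILE are assumed values, not mandated by the procedure.
# In security mode, run kinit as the Hive service user before running this script.
CLIENT_DIR="${CLIENT_DIR:-/opt/hadoopclient}"
HQL_FILE="${TMPDIR:-/tmp}/hive_hbase_table.hql"

# Save the Hive statements from step 3 to a script file.
cat > "$HQL_FILE" <<'EOF'
create table hive_hbase_table(id int, name string)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
with serdeproperties ("hbase.columns.mapping" = ":key,cf1:name")
tblproperties ("hbase.table.name" = "hive_hbase_table");
insert into table hive_hbase_table values(12,'abab');
select * from hive_hbase_table;
EOF

if [ -f "$CLIENT_DIR/bigdata_env" ]; then
  # Load the client environment, run the statements, then check the result in HBase.
  . "$CLIENT_DIR/bigdata_env"
  beeline -f "$HQL_FILE"
  echo "scan 'hive_hbase_table'" | hbase shell -n
else
  # No client on this node: leave the statements on disk for later use.
  echo "Client not found in $CLIENT_DIR; statements saved to $HQL_FILE"
fi
```

Keeping the statements in a file makes the run repeatable, but the create statement fails if hive_hbase_table already exists, so drop the table first when rerunning.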
Hive Accessing HBase Through External Tables
If a table already exists in HBase and you want to access it using Hive, create an external table in Hive that maps to the HBase table. You can then query the HBase table through Hive.
- Log in to the node where the client is installed as the client installation user and run the following commands to configure environment variables and authenticate the user:
Go to the client installation directory.
cd Client installation directory
Load the environment variables.
source bigdata_env
Authenticate the user. Skip this step for clusters with Kerberos authentication disabled.
kinit Hive service user
- Log in to the HBase client.
hbase shell
- Create a table in HBase and query table data.
- Press Ctrl + C to exit the HBase client.
- Log in to the Hive client.
beeline
- Create an external table in Hive and map it to the table in HBase.
create external table hbase_table(key int,col1 string,col2 string) stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties("hbase.columns.mapping" = "f:col1,f:col2") tblproperties("hbase.table.name" = "hbase_table", "hbase.mapred.output.outputtable" = "hbase_table");
- View the hbase_table data in Hive.
select * from hbase_table;
If the data matches the data inserted in step 3.b, the external table in Hive is mapped to the HBase table correctly.
Figure 5 Viewing Hive table data
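Step 3 of this procedure creates and populates the HBase table but does not list the commands. The following is a minimal sketch of what they might look like, assuming a table named hbase_table with a single column family f, to match the mapping used in the create external table statement. The row key 1, the values value1 and value2, and the HBASE_CMDS file location are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch only. The row key, cell values, and HBASE_CMDS path are hypothetical.
HBASE_CMDS="${TMPDIR:-/tmp}/hbase_table_setup.txt"

# HBase shell commands: create the table with column family f,
# write one row with two columns, then read the table back.
cat > "$HBASE_CMDS" <<'EOF'
create 'hbase_table', 'f'
put 'hbase_table', '1', 'f:col1', 'value1'
put 'hbase_table', '1', 'f:col2', 'value2'
scan 'hbase_table'
EOF

if command -v hbase >/dev/null 2>&1; then
  # Feed the commands to the HBase shell in non-interactive mode.
  hbase shell -n < "$HBASE_CMDS"
else
  echo "hbase command not found; commands saved to $HBASE_CMDS"
fi
```

The same four commands can be typed directly at the hbase shell prompt. A numeric row key is used here because the Hive table declares its key column as int.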
Helpful Links
- To use Hive on HBase after cross-cluster mutual trust is configured, see Configuring Hive on HBase in Across Clusters with Mutual Trust Enabled.
- To delete one or more data records that meet specific conditions from the HBase table associated with Hive, see Deleting Single-Row Records from Hive on HBase.