Updated on 2025-08-15 GMT+08:00

Using the Hive Client

Scenario

This section describes how to use the Hive client in O&M or service scenarios.

Prerequisites

You have installed the MRS cluster client. For details about how to install the client, see Installing a Client. In the following operations, the client is installed in the /opt/hadoopclient directory. Change the directory as required.

Video Tutorial

This video demonstrates how to use the Hive client to create a foreign table stored in HDFS, insert data, query data, and delete a table after an MRS cluster with Kerberos authentication enabled is created and the client is installed.

The UI may vary depending on the version. The video tutorial is for reference only.

Using the Hive Client

  1. Log in to the node where the client is installed as the client installation user.
  2. Run the following command to go to the client installation directory:

    cd /opt/hadoopclient

  3. Run the following command to configure environment variables:

    source bigdata_env

  4. Log in to the Hive client based on the cluster authentication mode.

    • If Kerberos authentication is enabled for the cluster (security mode), run the following commands to authenticate the user and log in to the Hive client. The user must have permission to create Hive tables, for example, a user added to both the hive (primary group) and hadoop user groups. For details about how to create such a user, see Creating a Hive User and Binding the User to a Role.

      Authenticate the user.

      kinit Component service user

      Log in to the Hive client.

      beeline
    • If Kerberos authentication is disabled for the cluster (in normal mode), run the following command to log in to the Hive client. If no component service user is specified, the current OS user is used to log in to the Hive client.
      beeline -n Component service user

  5. Run the following command to create a table, for example, test:

    create table test(id int,name string);

  6. Run the following command to insert data to the table:

    insert into table test(id,name) values(11,'A');

  7. Run the following command to query table data:

    select * from test;

  8. Run the following command to delete the Hive table:

    drop table test;

  9. Run the following command to exit the Hive client:

    !q
    • Exit the beeline client by running the !q command rather than by pressing Ctrl+C. Otherwise, the temporary files generated by the connection cannot be cleaned up, leaving a large number of junk files behind.
    • If you need to enter multiple statements in one beeline session, separate them with semicolons (;) and set entireLineAsCommand to false.

      Setting method: before starting beeline, run the beeline --entireLineAsCommand=false command; if beeline is already running, run the !set entireLineAsCommand false command.

      After this setting, any semicolon (;) in a statement that does not mark the end of the statement must be escaped, for example, select concat_ws('\;', collect_set(col1)) from tbl.
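The interactive steps above can also be scripted. The following sketch assumes the same /opt/hadoopclient installation; the user name hiveuser and the script path /tmp/test_table.sql are hypothetical examples, and the beeline -f invocation runs the whole table lifecycle in one non-interactive call:

```shell
# Environment setup and (security mode only) authentication, as in steps 2-4:
#   cd /opt/hadoopclient
#   source bigdata_env
#   kinit hiveuser        # security mode only; hiveuser is a placeholder

# Collect the table operations from steps 5-8 into a HiveQL script file:
cat > /tmp/test_table.sql <<'EOF'
create table test(id int, name string);
insert into table test(id, name) values(11, 'A');
select * from test;
drop table test;
EOF

# Run the whole script in a single non-interactive beeline call:
#   beeline -f /tmp/test_table.sql
```

Scripting the session this way avoids the interactive semicolon and !q pitfalls noted above, since beeline exits on its own after the file finishes.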

Common Hive Client Commands

  • You can also use the following commands to perform Hive table operations on the HCatalog client:
    hcat -e "cmd"

    cmd must be a Hive DDL statement. For example:

    hcat -e "show tables"

    Notes:

    • To use the HCatalog client, choose More > Download Client on the service page to download the clients of all services. This restriction does not apply to the Beeline client.
    • Due to permission model incompatibility, tables created using the HCatalog client cannot be accessed on the HiveServer client, but can be accessed on the WebHCat client.
    • If you use the HCatalog client in a cluster with Kerberos authentication disabled, the system will execute DDL commands as the current OS user.
  • The following table lists common Hive Beeline commands.

    For more commands, see https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommands.

    Table 1 Common Hive Beeline commands

    set <key>=<value>
      Sets the value of a specific configuration variable (key). Note that beeline does not highlight misspelled variable names.

    set
      Prints the list of configuration variables overridden by the user or Hive.

    set -v
      Prints all Hadoop and Hive configuration variables.

    add FILE[S] <filepath> <filepath>*
    add JAR[S] <filepath> <filepath>*
    add ARCHIVE[S] <filepath> <filepath>*
      Adds one or more files, JAR files, or ARCHIVE files to the resource list of the distributed cache.

    add FILE[S] <ivyurl> <ivyurl>*
    add JAR[S] <ivyurl> <ivyurl>*
    add ARCHIVE[S] <ivyurl> <ivyurl>*
      Adds one or more files, JAR files, or ARCHIVE files to the resource list of the distributed cache using an Ivy URL in the ivy://group:module:version?query_string format.

    list FILE[S]
    list JAR[S]
    list ARCHIVE[S]
      Lists the resources that have been added to the distributed cache.

    list FILE[S] <filepath>*
    list JAR[S] <filepath>*
    list ARCHIVE[S] <filepath>*
      Checks whether the given resources have been added to the distributed cache.

    delete FILE[S] <filepath>*
    delete JAR[S] <filepath>*
    delete ARCHIVE[S] <filepath>*
      Deletes resources from the distributed cache.

    delete FILE[S] <ivyurl> <ivyurl>*
    delete JAR[S] <ivyurl> <ivyurl>*
    delete ARCHIVE[S] <ivyurl> <ivyurl>*
      Deletes the resources that were added using <ivyurl> from the distributed cache.

    reload
      Makes HiveServer2 detect JAR file changes (additions, deletions, and updates) in the path specified by the hive.reloadable.aux.jars.path parameter, without restarting HiveServer2.

    dfs <dfs command>
      Runs the specified dfs command.

    <query string>
      Executes a Hive query and prints the result to the standard output.
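As an illustration of the resource commands above, a beeline session might look like the following. The JAR path and the Ivy coordinates (org.apache.hive:hive-contrib) are placeholders for this sketch, not values required by this document:

```
add jar /opt/hadoopclient/udfs/my_udf.jar;
add jar ivy://org.apache.hive:hive-contrib:3.1.0;
list jars;
delete jar ivy://org.apache.hive:hive-contrib:3.1.0;
```

These commands take effect only for the current session; resources added this way are not visible to other beeline connections.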