Configuring Permissions for Spark SQL Service User

Scenarios

Spark SQL services may depend on other components during development. For example, Spark on HBase requires HBase permissions. This section describes how to configure the permissions that allow Spark SQL to access HBase.

Prerequisites

The Spark client has been installed in a directory, for example, /opt/client.
You have obtained a user account with the MRS cluster administrator permissions, for example, admin.

Procedure

Spark on HBase authorization
After the permissions are assigned, you can use SQL-like statements to access HBase tables from Spark SQL. The following procedure uses granting a user the permission to query HBase tables as an example.
1. Log in to FusionInsight Manager, choose Cluster > Services > Spark2x, click Configurations and then All Configurations, search for spark.yarn.security.credentials.hbase.enabled, and change the value to true.
2. On Manager, create a role, for example, hive_hbase_create, and grant the permission to create HBase tables to the role.
  In the Configure Resource Permission table, choose Name of the desired cluster > HBase > HBase Scope > global. Select create of the namespace default, and click OK.
  
  In this example, the table is saved in the default database of Hive, for which the CREATE permission is already available. To save the table in a Hive database other than default, also perform the following operation:
  
  In the Configure Resource Permission table, choose Name of the desired cluster > Hive > Hive Read Write Privileges, select CREATE for the desired database, and click OK.
3. On Manager, create a role, for example, hive_hbase_submit, and grant the permission to submit tasks to the Yarn queue.
  In the Configure Resource Permission table, choose Name of the desired cluster > Yarn > Scheduling Queue > root. Select Submit of default, and click OK.
4. On Manager, create a human-machine user, for example, hbase_creates_user, add the user to the hive group, and bind the hive_hbase_create and hive_hbase_submit roles to the user so that the user can create Spark SQL and HBase tables.
5. Log in to the node where the client is installed as the client installation user.
6. Run the following commands to configure environment variables:
  Load the environment variables.
```
source /opt/client/bigdata_env
```
  Load the component environment variables.
```
source /opt/client/Spark2x/component_env
```
7. Run the following command to authenticate the user:
```
kinit hbase_creates_user
```
8. Run the following command to enter the shell environment on the Spark JDBCServer client:
```
/opt/client/Spark2x/spark/bin/beeline -u "jdbc:hive2://<zkNode1_IP>:<zkNode1_Port>,<zkNode2_IP>:<zkNode2_Port>,<zkNode3_IP>:<zkNode3_Port>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=sparkthriftserver2x;user.principal=spark2x/hadoop.<System domain name>@<System domain name>;saslQop=auth-conf;auth=KERBEROS;principal=spark2x/hadoop.<System domain name>@<System domain name>;"
```
9. Run the following command to create a table in Spark SQL and HBase. This example creates the hbaseTable table:
```
create table hbaseTable (id string, name string, age int) using org.apache.spark.sql.hbase.HBaseSource options (hbaseTableName "table1", keyCols "id", colsMapping = ", name=cf1.cq1, age=cf1.cq2");
```
  The created Spark SQL table and the HBase table are stored in the Hive database default and the HBase namespace default, respectively.
10. On Manager, create a role, for example, hive_hbase_select, and grant the role the permission to query the Spark on HBase table hbaseTable and the HBase table hbaseTable.
  - In the Configure Resource Permission table, choose Name of the desired cluster > HBase > HBase Scope > global > default. Select read for the hbaseTable table, and click OK to grant the table query permission to the HBase role.
  - Edit the role. In the Configure Resource Permission table, choose Name of the desired cluster > HBase > HBase Scope > global > hbase. Select Execute for hbase:meta, and click OK.
  - Edit the role. In the Configure Resource Permission table, choose Name of the desired cluster > Hive > Hive Read Write Privileges > default. Select SELECT for the hbaseTable table, and click OK.
11. On Manager, create a human-machine user, for example, hbase_select_user, add the user to the hive group, and bind the hive_hbase_select role to the user for querying Spark SQL and HBase tables.
12. Run the following commands to configure environment variables:
  Load the environment variables.
```
source /opt/client/bigdata_env
```
  Load the component environment variables.
```
source /opt/client/Spark2x/component_env
```
13. Run the following command to authenticate the user:
```
kinit hbase_select_user
```
14. Run the following command to enter the shell environment on the Spark JDBCServer client:
```
/opt/client/Spark2x/spark/bin/beeline -u "jdbc:hive2://<zkNode1_IP>:<zkNode1_Port>,<zkNode2_IP>:<zkNode2_Port>,<zkNode3_IP>:<zkNode3_Port>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=sparkthriftserver2x;user.principal=spark2x/hadoop.<System domain name>@<System domain name>;saslQop=auth-conf;auth=KERBEROS;principal=spark2x/hadoop.<System domain name>@<System domain name>;"
```
15. Run the following command to use a Spark SQL statement to query HBase table data:
```
select * from hbaseTable;
```
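
The JDBCServer connection string used in steps 8 and 14 is long and easy to mistype. The following is a minimal sketch of a shell helper that assembles it from two inputs. The function name build_jdbc_url and the sample addresses in the comments are hypothetical and are not part of the MRS client; replace them with the ZooKeeper nodes and the Spark2x principal of your own cluster.

```shell
# Hypothetical helper (not shipped with the client): assemble the JDBCServer
# URL in the same layout as the beeline commands in steps 8 and 14.
#   $1: comma-separated ZooKeeper node list in <IP>:<Port> form,
#       for example "192.0.2.1:2181,192.0.2.2:2181,192.0.2.3:2181" (assumed values)
#   $2: the Spark2x service principal, in the form
#       spark2x/hadoop.<System domain name>@<System domain name>
build_jdbc_url() {
  # The principal is written twice because the URL carries it in both the
  # user.principal and the principal parameters.
  printf 'jdbc:hive2://%s/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=sparkthriftserver2x;user.principal=%s;saslQop=auth-conf;auth=KERBEROS;principal=%s;' "$1" "$2" "$2"
}
```

After loading the environment variables and running kinit, the result can be passed to beeline with the -u option, for example by storing it in a variable first. This sketch only builds the string; it does not check that the ZooKeeper nodes are reachable or that the principal exists.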