Updated on 2022-11-18 GMT+08:00

Creating a Spark SQL Role

Scenario

This section describes how the MRS cluster administrator creates and configures a Spark SQL role on Manager. A Spark SQL role can be granted the Spark administrator permission or permissions to operate on table data.

To create a database in Hive, a user only needs to join the hive group; no role has to be granted. Users have all permissions on the databases and tables that they create in Hive or HDFS. They can create tables, select, delete, insert, or update data, and grant permissions to other users so that those users can access the tables and the corresponding HDFS directories and files. By default, created databases and tables are stored in the /user/hive/warehouse directory of HDFS.

  • If the current component uses Ranger for permission control, you need to configure permission management policies based on Ranger. For details, see Adding a Ranger Access Permission Policy for Spark2x.
  • After Ranger authentication is enabled or disabled on Spark2x, you need to restart Spark2x and download the client again or update the client configuration file spark/conf/spark-defaults.conf.

    Enable Ranger authentication: spark.ranger.plugin.authorization.enable=true

    Disable Ranger authentication: spark.ranger.plugin.authorization.enable=false
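
Before restarting Spark2x or redistributing the client, it can help to confirm which value a given spark-defaults.conf actually carries. The following is a minimal sketch, not part of the product: the function name ranger_auth_state is hypothetical, and it assumes the file stores the setting as either key=value (as shown above) or key followed by whitespace and the value.

```shell
# Hypothetical helper: report the Ranger authentication state recorded in a
# spark-defaults.conf file. Prints "enabled", "disabled", or "unset".
ranger_auth_state() {
  local conf_file="$1"
  local value
  # Keep the last uncommented assignment of the key; accept "key=value" and "key value".
  value=$(sed -n -E \
    's/^[[:space:]]*spark\.ranger\.plugin\.authorization\.enable[[:space:]]*[[:space:]=][[:space:]]*([^[:space:]]+)[[:space:]]*$/\1/p' \
    "$conf_file" | tail -n 1)
  case "$value" in
    true)  echo "enabled" ;;
    false) echo "disabled" ;;
    *)     echo "unset" ;;
  esac
}

# Example call (the path is an assumption; adjust it to your client installation directory):
#   ranger_auth_state /opt/client/Spark2x/spark/conf/spark-defaults.conf
```

The check only reads the file. It does not replace the restart and client update described above.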

Procedure

  1. Log in to Manager, and choose System > Permission > Role.
  2. Click Create Role, and enter a role name and description.
  3. Set Configure Resource Permission. For details, see Table 1.
    • Hive Admin Privilege: Hive administrator permissions.
    • Hive Read Write Privileges: Hive data table management permission, which is the operation permission to set and manage the data of created tables.
      • Hive role management supports granting the administrator permission and the permissions to access tables and views. It does not support granting database permissions.
      • The permissions of the Hive administrator do not include the permission to manage HDFS.
      • If a database contains a large number of tables, or the tables contain a large number of files, granting permissions may take a while. For example, if a table contains 10,000 files, granting permissions takes about 2 minutes.
      Table 1 Setting a role

      Task

      Operation

      Hive administrator permission

      In the Configure Resource Permission table, choose Name of the desired cluster > Hive and select Hive Admin Privilege.

      After a user is bound to the Hive administrator role, the user must perform the following steps each time before carrying out maintenance operations:
      1. Log in to the node where the Spark2x client is installed as the client installation user.
      2. Run the following commands to configure environment variables. The example assumes that the Spark2x client installation directory is /opt/client:

        source /opt/client/bigdata_env

        source /opt/client/Spark2x/component_env

      3. Run the following command to perform user authentication:

        kinit <Hive service user>

      4. Run the following command to log in to the client tool:

        /opt/client/Spark2x/spark/bin/beeline -u "jdbc:hive2://<zkNode1_IP>:<zkNode1_Port>,<zkNode2_IP>:<zkNode2_Port>,<zkNode3_IP>:<zkNode3_Port>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=sparkthriftserver2x;user.principal=spark2x/hadoop.<system domain name>@<system domain name>;saslQop=auth-conf;auth=KERBEROS;principal=spark2x/hadoop.<system domain name>@<system domain name>;"

        NOTE:
        • <zkNode1_IP>:<zkNode1_Port>, <zkNode2_IP>:<zkNode2_Port>, <zkNode3_IP>:<zkNode3_Port> indicates the ZooKeeper URL, for example, 192.168.81.37:2181,192.168.195.232:2181,192.168.169.84:2181.
        • sparkthriftserver2x indicates the ZooKeeper directory from which the client connects to a randomly selected ThriftServer or ProxyThriftServer.
        • To obtain the current system domain name, log in to Manager, choose System > Permission > Domain and Mutual Trust, and view the value of Local Domain. spark2x/hadoop.<system domain name> is the username. All letters of the system domain name in the username are lowercase. For example, if Local Domain is set to 9427068F-6EFA-4833-B43E-60CB641E5B6C.COM, the username is spark2x/hadoop.9427068f-6efa-4833-b43e-60cb641e5b6c.com.
      5. Run the following command to update the administrator permissions:

        set role admin;

      Setting the permission to query a table of another user in the default database

      1. In the Configure Resource Permission table, choose Name of the desired cluster > Hive > Hive Read Write Privileges.
      2. Click the name of the specified database in the database list. Tables in the database are displayed.
      3. In the Permission column of the specified table, select SELECT.

      Setting the permission to import data to a table of another user in the default database

      1. In the Configure Resource Permission table, choose Name of the desired cluster > Hive > Hive Read Write Privileges.
      2. Click the name of the specified database in the database list. Tables in the database are displayed.
      3. In the Permission column of the specified table, select DELETE and INSERT.
  4. Click OK.
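
The connection string used to log in to the client tool in the Hive administrator task is long and easy to mistype, so it can be assembled from its parts. The sketch below is illustrative only: the function names spark2x_username and spark2x_jdbc_url are hypothetical, the ZooKeeper quorum is passed as a single comma-separated string, and the value after @ is assumed to be the Local Domain exactly as displayed on Manager. Only the username part is lowercased, following the note above.

```shell
# Hypothetical helper: derive the Spark2x username from the Local Domain value.
# The domain name inside the username is converted to lowercase.
spark2x_username() {
  local domain="$1"
  printf 'spark2x/hadoop.%s\n' "$(printf '%s' "$domain" | tr '[:upper:]' '[:lower:]')"
}

# Hypothetical helper: assemble the beeline JDBC URL.
#   $1: ZooKeeper quorum, for example ip1:port1,ip2:port2,ip3:port3
#   $2: system domain name (Local Domain) as shown on Manager
spark2x_jdbc_url() {
  local zk_quorum="$1"
  local domain="$2"
  local principal
  # Assumption: the part after @ is the Local Domain, passed through unchanged.
  principal="$(spark2x_username "$domain")@${domain}"
  printf 'jdbc:hive2://%s/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=sparkthriftserver2x;user.principal=%s;saslQop=auth-conf;auth=KERBEROS;principal=%s;\n' \
    "$zk_quorum" "$principal" "$principal"
}

# Example call (values are placeholders):
#   /opt/client/Spark2x/spark/bin/beeline -u "$(spark2x_jdbc_url '192.168.81.37:2181,192.168.195.232:2181,192.168.169.84:2181' '<system domain name>')"
```

Run the environment variable and kinit steps first. The helpers only build strings and do not perform authentication.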