Updated on 2023-08-31 GMT+08:00

Preparing the Configuration Files for Connecting to the Cluster

User Information for Cluster Authentication

For an MRS cluster with Kerberos authentication enabled, prepare a user who has operation permissions on the relevant components. The program authenticates as this user.

The following Hive permission configuration example is for reference only. Adjust it as needed.

  1. Log in to FusionInsight Manager.
  2. Choose Cluster > Services > Hive. In the upper right corner of the displayed page, click More and check whether the Enable Ranger button is grayed out.

    • If the button is grayed out, create a user and grant related operation permissions to the user in Ranger.
      1. Choose System > Permission > User and click Create. On the displayed page, create a machine-machine user, for example, developuser.

        Add the hive user group to User Group.

      2. Log in to the Ranger management page as the Ranger administrator rangeradmin.

        The default password of user rangeradmin is Rangeradmin@123. For details, see User Account List.

      3. On the homepage, click Settings and choose Roles.
      4. Click the role whose Role Name is admin. In the Users area, click Select User and select the user name created in 2.a.
      5. Click Add Users, select Is Role Admin in the row where the username is located, and click Save.

        If an HDFS path is specified when you perform operations on a database or table, grant the HDFS path permission to the user in Ranger by referring to Table 2.

    • If the button is available, create a user and grant related operation permissions to the user on Manager.
      1. Choose System > Permission > Role. On the displayed page, click Create Role.
        1. Enter the role name, for example, developrole.
        2. In the Configure Resource Permission table, choose Name of the desired cluster > Yarn > Scheduler Queue > root. Select Submit and Admin for the default queue, and click OK.

          If an HDFS path is specified when you perform operations on a database or table, grant the HDFS path permission to the user in Manager by referring to Table 2.

      2. Choose User in the navigation pane and click Create on the displayed page. Create a machine-machine user, for example, developuser.
        • Add the hive user group to User Group.
        • Add the new role created in 2.a to Role.

  3. Log in to FusionInsight Manager as user admin and choose System > Permission > User. In the Operation column of developuser, choose More > Download Authentication Credential. Save the file and decompress it to obtain the user.keytab and krb5.conf files of the user.
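
Before the credentials are used by a program, it can help to confirm that decompression actually produced both files. The following is a minimal sketch under stated assumptions, not part of the MRS tooling: the function name check_credential_files and its directory argument are hypothetical, and the only facts taken from the steps above are the two file names.

```shell
# Hypothetical helper: confirm that a directory holds the two credential
# files obtained in step 3 (user.keytab and krb5.conf) and that neither
# file is empty. Returns 0 when both are present, 1 otherwise.
check_credential_files() {
    local dir="$1" name status=0
    for name in user.keytab krb5.conf; do
        # -s is true only for a file that exists and has a size above zero
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            status=1
        fi
    done
    return "$status"
}
```

Call it with the directory where the credential package was decompressed, for example check_credential_files /path/to/credentials (a placeholder path). On a node with the MIT Kerberos client tools installed, running kinit -kt user.keytab developuser and then klist is a common way to confirm that the keytab works; the exact principal name in your cluster may differ.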

Preparing the Configuration Files of the Running Environment

During development or a test run, the program needs cluster configuration files to connect to an MRS cluster. These typically include the cluster component configuration files and the user files used for security authentication. You can obtain both from the created MRS cluster.

Nodes used for program debugging or running must be able to communicate with the nodes within the MRS cluster, and the host names of the cluster nodes must be configured in the hosts file of those nodes.
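
One way to confirm the hosts requirement is to look up each cluster host name in the hosts file before running the program. The sketch below is an illustration only: the function name missing_hosts is hypothetical, and it assumes the usual hosts file layout of an IP address followed by one or more names, with # starting a comment.

```shell
# Hypothetical helper: print every given host name that has no entry in
# the given hosts file. Prints nothing when all names are present.
missing_hosts() {
    local hosts_file="$1" host
    shift
    for host in "$@"; do
        awk -v h="$host" '
            { sub(/#.*/, "") }    # strip comments; awk re-splits the fields
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit !found }   # exit status 0 only when the name was seen
        ' "$hosts_file" || echo "$host"
    done
}
```

For example, missing_hosts /etc/hosts followed by the host names listed in the hosts file of the downloaded client package prints the names that still need to be added. The comparison is case-sensitive and does not test network reachability, so a follow-up ping or connection test is still worthwhile.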

  • Scenario 1: Prepare the configuration files required for debugging in the local Windows development environment.
    1. Log in to FusionInsight Manager and choose Cluster > Dashboard > More > Download Client. Set Select Client Type to Configuration Files Only. Select the platform type based on the type of the node where the client is to be installed (select x86_64 for the x86 architecture and aarch64 for the Arm architecture) and click OK. After the client files are packaged and generated, download the client to the local PC as prompted and decompress it.

      For example, if the client file package is FusionInsight_Cluster_1_Services_Client.tar, decompress it to obtain FusionInsight_Cluster_1_Services_ClientConfig_ConfigFiles.tar. Then, decompress this file.

    2. Go to the Hive\config directory where the client is decompressed and obtain the configuration files listed in Table 1.
      Table 1 Configuration files

      | File | Description |
      | --- | --- |
      | hiveclient.properties | Configuration parameters for Hive client connection |
      | core-site.xml | Hadoop client configuration parameters |

    3. Copy the hosts file content from the decompression directory to the hosts file of the local PC.
      • If you need to debug the application in the local Windows environment, ensure that the local PC can communicate with the hosts listed in the hosts file.
      • If your PC cannot communicate with the network plane where the MRS cluster is deployed, you can bind an EIP to access the MRS cluster. For details, see How Do I Access Hive of the Cluster in Security Mode on Windows Using EIPs?.
      • The local hosts file in a Windows environment is stored, for example, in C:\WINDOWS\system32\drivers\etc\hosts.
  • Scenario 2: Prepare the configuration files required for running the program in a Linux environment.
    1. Install the client on a node.

      For example, the client installation directory is /opt/client.

      The difference between the client time and the cluster time must be less than 5 minutes.

    2. Obtain the configuration files.
      1. Log in to FusionInsight Manager and choose Cluster > Dashboard > More > Download Client. Set Select Client Type to Configuration Files Only. Select the platform type based on the type of the node where the client is to be installed (select x86_64 for the x86 architecture and aarch64 for the Arm architecture), select Save to Path, and click OK. Download the client configuration file to the active OMS node of the cluster.
      2. Log in to the active OMS node as user root, go to the directory where the client configuration file is stored (/tmp/FusionInsight-Client/ by default), decompress the software package, and obtain the configuration files listed in Table 1 from the Hive/config directory.

        For example, if the client software package is FusionInsight_Cluster_1_Services_Client.tar and the download path is /tmp/FusionInsight-Client on the active OMS node, run the following commands:

        cd /tmp/FusionInsight-Client

        tar -xvf FusionInsight_Cluster_1_Services_Client.tar

        tar -xvf FusionInsight_Cluster_1_Services_ClientConfig_ConfigFiles.tar

        cd FusionInsight_Cluster_1_Services_ClientConfig_ConfigFiles

    3. Check the network connection of the client node.

      During the client installation, the system automatically configures the hosts file on the client node. Check whether the /etc/hosts file contains the host names of the nodes in the cluster. If it does not, manually copy the content of the hosts file in the decompression directory to the hosts file on the client node so that the node can communicate with each host in the cluster.
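
Scenario 2 also requires the client time to be within 5 minutes of the cluster time. The comparison itself can be sketched as below; this is an assumption-laden illustration rather than an MRS utility. The function name skew_within_limit is hypothetical, both timestamps are assumed to be epoch seconds, and how you read the cluster time is left open.

```shell
# Hypothetical helper: succeed when two epoch timestamps (in seconds)
# differ by less than the limit. The limit defaults to 300 seconds,
# which is the 5-minute bound stated for the client and cluster clocks.
skew_within_limit() {
    local client="$1" cluster="$2" limit="${3:-300}" diff
    diff=$(( client - cluster ))
    # Take the absolute value so the order of the arguments does not matter
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -lt "$limit" ]
}
```

A typical call passes the output of date +%s on the client node as the first argument and a timestamp obtained from a cluster node as the second. A non-zero exit status means the clocks should be synchronized before the program attempts Kerberos authentication.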

Hive Operation Permissions

Before application development, the user must be added to the hive user group. Additional operation permissions must be obtained from the system administrator. For details about permission requirements, see Table 2. To run the example programs, you must have the CREATE permission on the default database.

Table 2 Required permissions

| Operation Type/Object | Operation | Required Permissions |
| --- | --- | --- |
| DATABASE | CREATE DATABASE dbname [LOCATION "hdfs_path"] | If the HDFS path hdfs_path is specified, the ownership and RWX permission of hdfs_path are required. |
| DATABASE | DROP DATABASE dbname | The ownership of database dbname is required. |
| DATABASE | ALTER DATABASE dbname SET OWNER user_or_role | The admin permission is required. |
| TABLE | CREATE TABLE table_a | The CREATE permission on the database is required. |
| TABLE | CREATE TABLE table_a AS SELECT table_b | The CREATE permission on the database and the SELECT permission on table_b are required. |
| TABLE | CREATE TABLE table_a LIKE table_b | The CREATE permission on the database is required. |
| TABLE | CREATE [EXTERNAL] TABLE table_a LOCATION "hdfs_path" | The CREATE permission on the database, and the ownership and RWX permission of hdfs_path in HDFS are required. |
| TABLE | DROP TABLE table_a | The ownership of table_a is required. |
| TABLE | ALTER TABLE table_a SET LOCATION "hdfs_path" | The ownership of table_a, and the ownership and RWX permission of hdfs_path in HDFS are required. |
| TABLE | ALTER TABLE table_a SET FILEFORMAT | The ownership of table_a is required. |
| TABLE | TRUNCATE TABLE table_a | The ownership of table_a is required. |
| TABLE | ANALYZE TABLE table_a COMPUTE STATISTICS | The SELECT and INSERT permissions on table_a are required. |
| TABLE | SHOW TBLPROPERTIES table_a | The SELECT permission on table_a is required. |
| TABLE | SHOW CREATE TABLE table_a | The SELECT permission WITH GRANT OPTION on table_a is required. |
| ALTER | ALTER TABLE table_a ADD COLUMN | The ownership of table_a is required. |
| ALTER | ALTER TABLE table_a REPLACE COLUMN | The ownership of table_a is required. |
| ALTER | ALTER TABLE table_a RENAME | The ownership of table_a is required. |
| ALTER | ALTER TABLE table_a SET SERDE | The ownership of table_a is required. |
| ALTER | ALTER TABLE table_a CLUSTER BY | The ownership of table_a is required. |
| PARTITION | ALTER TABLE table_a ADD PARTITION partition_spec LOCATION "hdfs_path" | The INSERT permission on table_a, and the ownership and RWX permission of hdfs_path in HDFS are required. |
| PARTITION | ALTER TABLE table_a DROP PARTITION partition_spec | The DELETE permission on table_a is required. |
| PARTITION | ALTER TABLE table_a PARTITION partition_spec SET LOCATION "hdfs_path" | The ownership of table_a, and the ownership and RWX permission of hdfs_path in HDFS are required. |
| PARTITION | ALTER TABLE table_a PARTITION partition_spec SET FILEFORMAT | The ownership of table_a is required. |
| LOAD | LOAD INPATH 'hdfs_path' INTO TABLE table_a | The INSERT permission on table_a, and the ownership and RWX permission of hdfs_path in HDFS are required. |
| INSERT | INSERT TABLE table_a SELECT FROM table_b | The INSERT permission on table_a, the SELECT permission on table_b, and the SUBMIT permission on the default Yarn queue are required. |
| SELECT | SELECT * FROM table_a | The SELECT permission on table_a is required. |
| SELECT | SELECT FROM table_a JOIN table_b | The SELECT permission on table_a and table_b, and the SUBMIT permission on the default Yarn queue are required. |
| SELECT | SELECT FROM (SELECT FROM table_a UNION ALL SELECT FROM table_b) | The SELECT permission on table_a and table_b, and the SUBMIT permission on the default Yarn queue are required. |
| EXPLAIN | EXPLAIN [EXTENDED\|DEPENDENCY] query | The RX permission on related table directories is required. |
| VIEW | CREATE VIEW view_name AS SELECT ... | The SELECT permission WITH GRANT OPTION on related tables is required. |
| VIEW | ALTER VIEW view_name RENAME TO new_view_name | The ownership of view_name is required. |
| VIEW | DROP VIEW view_name | The ownership of view_name is required. |
| FUNCTION | CREATE [TEMPORARY] FUNCTION function_name AS 'class_name' | The admin permission is required. |
| FUNCTION | DROP [TEMPORARY] FUNCTION function_name | The admin permission is required. |
| MACRO | CREATE TEMPORARY MACRO macro_name ... | The admin permission is required. |
| MACRO | DROP TEMPORARY MACRO macro_name | The admin permission is required. |

  • A user who has the Hive admin permission and the corresponding HDFS directory permissions can perform all of the preceding operations.
  • If the component uses Ranger for permission control, you need to configure Ranger policies for permission management.