Updated on 2023-08-17 GMT+08:00

Using Clusters with Kerberos Authentication Enabled

This topic describes how to use a security cluster (a cluster with Kerberos authentication enabled) to run MapReduce, Spark, and Hive programs.

In MRS 3.x, Presto does not support Kerberos authentication.

You can get started by reading the following topics:

  1. Creating a Security Cluster and Logging In to Manager
  2. Creating a Role and a User
  3. Running a MapReduce Program
  4. Running a Spark Program
  5. Running a Hive Program

Creating a Security Cluster and Logging In to Manager

  1. Create a security cluster. For details, see Buying a Custom Cluster. Enable Kerberos Authentication, configure Password, and confirm the password. This password is used to log in to Manager. Keep it secure.

    Figure 1 Setting security cluster parameters

  2. Log in to the MRS console.
  3. In the navigation pane on the left, choose Active Clusters and click the target cluster name on the right to access the cluster details page.
  4. Click Access Manager on the right of MRS Manager to log in to Manager.

    • If you have bound an EIP when creating the cluster, perform the following operations:
      1. Add a security group rule. By default, your public IP address used for accessing port 9022 is filled in the rule. If you want to view, modify, or delete a security group rule, click Manage Security Group Rule.
        • The automatically generated public IP address may differ from your local IP address. This is normal, and no action is required.
        • If port 9022 is a Knox port, you must allow access to port 9022 of Knox before you can access Manager.
      2. Select I confirm that xx.xx.xx.xx is a trusted public IP address and MRS Manager can be accessed using this IP address.
        Figure 2 Accessing Manager
    • If you have not bound an EIP when creating the cluster, perform the following operations:
      1. Select an EIP from the drop-down list or click Manage EIP to buy one.
      2. Add a security group rule. By default, your public IP address used for accessing port 9022 is filled in the rule. If you want to view, modify, or delete a security group rule, click Manage Security Group Rule.
        • The automatically generated public IP address may differ from your local IP address. This is normal, and no action is required.
        • If port 9022 is a Knox port, you must allow access to port 9022 of Knox before you can access MRS Manager.
      3. Select I confirm that xx.xx.xx.xx is a trusted public IP address and MRS Manager can be accessed using this IP address.
      Figure 3 Accessing Manager

  5. Click OK. The Manager login page is displayed. To allow other users to access Manager, add their IP addresses as trusted IP addresses by referring to Accessing Manager.
  6. Enter the default username admin and the password you set when creating the cluster, and click Log In.

Creating a Role and a User

For clusters with Kerberos authentication enabled, perform the following steps to create a user and assign permissions to the user to run programs.

  1. On Manager, choose System > Permission > Role.

    Figure 4 Role

  2. Click Create Role. For details, see Creating a Role.

    Figure 5 Creating a role

    Specify the following information:

    • Enter a role name, for example, mrrole.
    • In Configure Resource Permission, select the cluster to be operated, choose Yarn > Scheduler Queue > root, and select Submit and Admin in the Permission column. After you finish configuration, do not click OK but click the name of the target cluster shown in the following figure and then configure other permissions.
      Figure 6 Configuring resource permissions for Yarn
    • Choose HBase > HBase Scope. Locate the row that contains global, and select create, read, write, and execute in the Permission column. After you finish configuration, do not click OK but click the name of the target cluster shown in the following figure and then configure other permissions.
      Figure 7 Configuring resource permissions for HBase
    • Choose HDFS > File System > hdfs://hacluster/ and select Read, Write, and Execute in the Permission column. After you finish configuration, do not click OK but click the name of the target cluster shown in the following figure and then configure other permissions.
      Figure 8 Configuring resource permissions for HDFS
    • Choose Hive > Hive Read Write Privileges, select Select, Delete, Insert, and Create in the Permission column, and click OK.
      Figure 9 Configuring resource permissions for Hive

  3. Choose System. In the navigation pane on the left, choose Permission > User Group > Create User Group to create a user group for the sample project, for example, mrgroup. For details, see Creating a User Group.

    Figure 10 Creating a user group

  4. Choose System. In the navigation pane on the left, choose Permission > User > Create to create a user for the sample project. For details, see Creating a User.

    • Enter a username, for example, test. If you want to run a Hive program, enter hiveuser in Username.
    • Set User Type to Human-Machine.
    • Enter a password. This password will be used when you run the program.
    • In User Group, add mrgroup and supergroup.
    • Set Primary Group to supergroup and bind the mrrole role to obtain the permission.

      Click OK.

    Figure 11 Creating a user

  5. Choose System. In the navigation pane on the left, choose Permission > User, locate the row that contains user test, and select Download Authentication Credential from the More drop-down list. Save the downloaded package and decompress it to obtain the keytab and krb5.conf files.

    Figure 12 Downloading the authentication credential
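
The decompression in step 5 can be scripted. The following is a minimal sketch, assuming that the downloaded package is a tar archive and that the keytab file inside it is named user.keytab. Both are assumptions, so adjust them to match the package you actually downloaded. The function name unpack_credentials is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: unpack a downloaded credential package and check its contents.
# Assumptions (not confirmed by this guide): the package is a tar archive
# and the keytab file inside it is named user.keytab.

unpack_credentials() {
  local archive="$1" dest="$2" f
  mkdir -p "$dest" || return 1
  tar -xf "$archive" -C "$dest" || return 1
  # Fail early if either expected file is missing after extraction.
  for f in user.keytab krb5.conf; do
    if [ ! -f "$dest/$f" ]; then
      echo "missing: $f" >&2
      return 1
    fi
  done
  # A keytab is a credential, so restrict it to the owner.
  chmod 600 "$dest/user.keytab"
  echo "credentials ready in $dest"
}
```

Call the function with the path of the downloaded package and a target directory. It returns a non-zero status if extraction fails or if either file is missing.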

Running a MapReduce Program

This section describes how to run a MapReduce program in security cluster mode.

Prerequisites

You have compiled the program and prepared data files, for example, mapreduce-examples-1.0.jar, input_data1.txt, and input_data2.txt. For details about MapReduce program development and data preparations, see MapReduce Introduction.

Procedure

  1. Use a remote login tool (for example, MobaXterm) to log in to the master node of the security cluster over SSH through the EIP.
  2. After the login is successful, run the following commands to create the test folder in the /opt/Bigdata/client directory and create the conf folder in the test directory:

    cd /opt/Bigdata/client
    mkdir test
    cd test
    mkdir conf

  3. Use an upload tool (for example, WinSCP) to copy mapreduce-examples-1.0.jar, input_data1.txt, and input_data2.txt to the test directory, and copy the keytab and krb5.conf files obtained in step 5 of Creating a Role and a User to the conf directory.
  4. Run the following commands to configure environment variables and authenticate the created user, for example, test:

    cd /opt/Bigdata/client
    source bigdata_env
    export YARN_USER_CLASSPATH=/opt/Bigdata/client/test/conf/
    kinit test

    Enter the password as prompted. Upon the first login, change the password as prompted. If no error message is displayed, Kerberos authentication is complete.

  5. Run the following commands to upload the data to HDFS:

    cd test
    hdfs dfs -mkdir /tmp/input
    hdfs dfs -put input_data* /tmp/input

  6. Run the following command to run the program:

    yarn jar mapreduce-examples-1.0.jar com.huawei.bigdata.mapreduce.examples.FemaleInfoCollector /tmp/input /tmp/mapreduce_output

    In the preceding command:

    /tmp/input is the input path in HDFS.

    /tmp/mapreduce_output is the output path in HDFS. This directory must not exist before the program runs. Otherwise, an error is reported.

  7. After the program is executed successfully, run the hdfs dfs -ls /tmp/mapreduce_output command to view the result. The command output is as follows.

    Figure 13 Program running result
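
Because the job fails if the output directory already exists, it can help to check the path before submitting. The sketch below wraps the command from step 6 with such a check. It relies on hdfs dfs -test -e, which returns 0 when the path exists. The function names are hypothetical, and the sketch assumes that you have already sourced bigdata_env, run kinit, and changed to the test directory.

```shell
#!/usr/bin/env bash
# Sketch: refuse to submit the job when the HDFS output path already exists.

# Returns 0 if the path does not exist in HDFS, 1 if it does.
output_path_is_free() {
  local path="$1"
  if hdfs dfs -test -e "$path"; then
    echo "output path already exists: $path" >&2
    return 1
  fi
  return 0
}

# Submits the example job only after the output path check passes.
run_mapreduce_example() {
  local input="$1" output="$2"
  output_path_is_free "$output" || return 1
  yarn jar mapreduce-examples-1.0.jar \
    com.huawei.bigdata.mapreduce.examples.FemaleInfoCollector \
    "$input" "$output"
}
```

For example, run_mapreduce_example /tmp/input /tmp/mapreduce_output submits the job from step 6, or stops with a message if /tmp/mapreduce_output is left over from an earlier run.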

Running a Spark Program

This section describes how to run a Spark program in security cluster mode.

Prerequisites

You have compiled the program and prepared data files, for example, FemaleInfoCollection.jar, input_data1.txt, and input_data2.txt. For details about Spark program development and data preparations, see Spark Application Development Overview.

Procedure

  1. Use a remote login tool (for example, MobaXterm) to log in to the master node of the security cluster over SSH through the EIP.
  2. After the login is successful, run the following commands to create the test folder in the /opt/Bigdata/client directory and create the conf folder in the test directory:

    cd /opt/Bigdata/client
    mkdir test
    cd test
    mkdir conf

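  3. Use an upload tool (for example, WinSCP) to copy FemaleInfoCollection.jar, input_data1.txt, and input_data2.txt to the test directory, and copy the keytab and krb5.conf files obtained in step 5 of Creating a Role and a User to the conf directory. Make sure that the JAR file name in the test directory matches the name used in the spark-submit command in step 6 (FemaleInfoCollection-1.0.jar).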
  4. Run the following commands to configure environment variables and authenticate the created user, for example, test:

    cd /opt/Bigdata/client
    source bigdata_env
    export YARN_USER_CLASSPATH=/opt/Bigdata/client/test/conf/
    kinit test

    Enter the password as prompted. If no error message is displayed, Kerberos authentication is complete.

  5. Run the following commands to upload the data to HDFS:

    cd test
    hdfs dfs -mkdir /tmp/input
    hdfs dfs -put input_data* /tmp/input

  6. Run the following commands to run the program:

    cd /opt/Bigdata/client/Spark/spark
    bin/spark-submit --class com.huawei.bigdata.spark.examples.FemaleInfoCollection --master yarn-client /opt/Bigdata/client/test/FemaleInfoCollection-1.0.jar /tmp/input

  7. After the program is run successfully, the following information is displayed.

    Figure 14 Program running result
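
Kerberos tickets expire, so a job submitted long after kinit may fail authentication. As a hedged sketch, the helper below checks the ticket cache with klist -s, which prints nothing and reports validity through its exit status, and runs kinit only when needed. The function name ensure_ticket is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: re-authenticate only when the ticket cache is missing or expired.

ensure_ticket() {
  local principal="$1"
  # klist -s exits with 0 when the cache holds a valid ticket.
  if klist -s; then
    echo "ticket cache is valid"
  else
    # No valid ticket: kinit prompts for the password of the principal.
    kinit "$principal"
  fi
}
```

Calling ensure_ticket test before the spark-submit command in step 6 prompts for a password only when the cached ticket can no longer be used.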

Running a Hive Program

This section describes how to run a Hive program in security cluster mode.

Prerequisites

You have compiled the program and prepared data files, for example, hive-examples-1.0.jar, input_data1.txt, and input_data2.txt. For details about Hive program development and data preparations, see Hive Application Development Overview.

Procedure

  1. Use a remote login tool (for example, MobaXterm) to log in to the master node of the security cluster over SSH through the EIP.
  2. After the login is successful, run the following commands to create the test folder in the /opt/Bigdata/client directory and create the conf folder in the test directory:

    cd /opt/Bigdata/client
    mkdir test
    cd test
    mkdir conf

  3. Use an upload tool (for example, WinSCP) to copy hive-examples-1.0.jar, input_data1.txt, and input_data2.txt to the test directory, and copy the keytab and krb5.conf files obtained in step 5 of Creating a Role and a User to the conf directory.
  4. Run the following commands to configure environment variables and authenticate the created user, for example, test:

    cd /opt/Bigdata/client
    source bigdata_env
    export YARN_USER_CLASSPATH=/opt/Bigdata/client/test/conf/
    kinit test

    Enter the password as prompted. If no error message is displayed, Kerberos authentication is complete.

  5. Run the following commands to run the program:

    chmod +x /opt/hive_examples -R
    cd /opt/hive_examples
    java -cp .:hive-examples-1.0.jar:/opt/hive_examples/conf:/opt/Bigdata/client/Hive/Beeline/lib/*:/opt/Bigdata/client/HDFS/hadoop/lib/* com.huawei.bigdata.hive.example.ExampleMain

  6. After the program is run successfully, the following information is displayed.

    Figure 15 Program running result
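
The classpath in step 5 is long and easy to mistype. One way to keep it readable is to assemble it from its parts, as in the sketch below. The build_classpath and run_hive_example helpers are hypothetical, and the entries are the ones from the command in step 5. The wildcard entries are kept in single quotes so that the shell passes them through unchanged and Java expands them to the JAR files in each directory.

```shell
#!/usr/bin/env bash
# Sketch: build the Java classpath from separate entries.

# Joins all arguments with ':' (the first character of IFS).
build_classpath() {
  local IFS=:
  printf '%s\n' "$*"
}

# Runs the Hive example with the classpath entries from step 5.
run_hive_example() {
  local cp
  cp=$(build_classpath . hive-examples-1.0.jar /opt/hive_examples/conf \
    '/opt/Bigdata/client/Hive/Beeline/lib/*' \
    '/opt/Bigdata/client/HDFS/hadoop/lib/*')
  cd /opt/hive_examples || return 1
  java -cp "$cp" com.huawei.bigdata.hive.example.ExampleMain
}
```

Listing one entry per argument makes it easier to spot a missing directory or a misplaced colon than in the single-line form.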