
Configuring a Cluster with Storage and Compute Separated

Updated at: Mar 25, 2021 GMT+08:00

MRS allows you to store data in OBS and use an MRS cluster for data computing only, which separates storage from compute. You can create an IAM agency that enables ECS to automatically obtain a temporary AK/SK to access OBS. This prevents the AK/SK from being exposed in a configuration file.

This function is available for the Hadoop, Hive, Spark, HBase, Presto, and Flink components in clusters of MRS 1.9.2 or later.
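
Because the agency supplies temporary credentials, the client configuration should not need a static AK/SK. The following is a minimal sketch of a check for leftover static keys. The property names fs.obs.access.key and fs.obs.secret.key (taken from the OBS Hadoop connector) and the function name has_static_obs_keys are assumptions that this guide does not state, so adjust them to your environment.

```shell
#!/bin/sh
# Sketch: report whether a Hadoop configuration file declares static OBS keys.
# Assumption: the keys would appear as fs.obs.access.key / fs.obs.secret.key.

# Returns 0 if the file declares either property, 1 otherwise
# (including when the file cannot be read).
has_static_obs_keys() {
    conf_file="$1"
    [ -r "$conf_file" ] || return 1
    grep -Eq '<name>[[:space:]]*fs\.obs\.(access|secret)\.key[[:space:]]*</name>' "$conf_file"
}

# Example (the file path is hypothetical):
#   has_static_obs_keys /path/to/core-site.xml && echo "static AK/SK found"
```

The check only looks for the property declaration. It does not inspect the value, so an empty placeholder entry is also reported.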

(Optional) Step 1: Create an ECS Agency with OBS Access Permissions

  • MRS presets MRS_ECS_DEFAULT_AGENCY in the agency list of IAM so that you can select this agency when creating a cluster. This agency has the OBS OperateAccess permissions and the CES FullAccess (only available for users who have enabled fine-grained policies), CES Administrator, and KMS Administrator permissions in the region where the cluster resides. Do not modify MRS_ECS_DEFAULT_AGENCY on IAM.
  • If you want to use the preset agency, skip the step for creating an agency. If you want to use a custom agency, perform the following steps to create an agency. (To create or modify an agency, you must have the Security Administrator permission.)
  1. Log in to the IAM console.
  2. Choose Agencies. On the displayed page, click Create Agency.
  3. Enter an agency name, for example, mrs_ecs_obs.
  4. Set Agency Type to Cloud service and select ECS BMS to authorize ECS or BMS to invoke OBS. See Figure 1.
  5. Set Validity Period to Unlimited.
    Figure 1 Creating an agency
  6. Click Assign Permissions in the Permissions area.
  7. On the displayed page, search for the OBS OperateAccess policy. Select it, and click OK, as shown in Figure 2.
    Figure 2 Setting permissions
  8. Click OK.

Step 2: Create a Cluster with Storage and Compute Separated

You can configure an agency when creating a cluster or bind an agency to an existing cluster to separate storage and compute. This section uses a cluster with Kerberos authentication enabled as an example.

Configuring an agency when creating a cluster:

  1. Log in to the MRS management console.
  2. Click Buy Cluster and select the Custom Config tab.
    Figure 3 Custom purchase of a cluster
  3. On the Custom Config tab page, set software parameters.
    • Region: Use the default value.
    • Cluster Name: You can use the default name. However, you are advised to include a project name abbreviation or a date so that the cluster is easy to remember and to tell apart from other clusters.
    • Cluster Version: Select MRS 1.9.2 or later. (MRS 1.9.2 or later supports OBS access through an agency.)
    • Cluster Type: Select Analysis cluster or Hybrid cluster and select all components.
    • Kerberos Authentication: This function is enabled by default. You can enable or disable it.
    • Username: The default username is admin, which is used to log in to MRS Manager.
    • Password: Set a password for user admin.
    • Confirm Password: Enter the password again.
  4. Click Next and set hardware parameters.
    • Billing Mode: Select Pay-per-use.
    • AZ: Use the default value.
    • VPC: Use the default value.
    • Subnet: Use the default value.
    • Security Group: Use the default value.
    • EIP: Use the default value.
    • Enterprise Project: Use the default value.
    • CPU Architecture: Use the default value.
    • Cluster Node: Select the number of cluster nodes and node specifications based on site requirements.
    • Login Mode: Select a method for logging in to ECSs. In this example, select Password.
    • Username: The default username is root, which is used to remotely log in to ECSs.
    • Password: Set a password for user root.
    • Confirm Password: Enter the password again.
  5. Click Next and set related parameters.
  6. In this example, configure an agency and leave other parameters blank. For details about how to configure other parameters, see (Optional) Advanced Configuration.

    Agency: Select the agency created in (Optional) Step 1: Create an ECS Agency with OBS Access Permissions or MRS_ECS_DEFAULT_AGENCY preset in IAM.

    Figure 4 Configuring an agency
  7. Click Buy Now and wait until the cluster is created.

Configuring an agency for an existing cluster:

  1. Log in to the MRS management console. In the left navigation pane, choose Clusters > Active Clusters.
  2. Click the name of the cluster to enter its details page.
  3. On the Dashboard tab page, click Click to synchronize on the right side of IAM User Sync to synchronize IAM users.
  4. On the Dashboard tab page, click Manage Agency on the right side of Agency to select an agency and click OK to bind it. Alternatively, click Create Agency to go to the IAM console to create an agency and select it.
    Figure 5 Binding an agency

Step 3: Create an OBS Bucket for Storing Data

  1. Log in to OBS Console.
  2. Click Create Bucket in the upper right corner.
  3. Enter an OBS bucket name, for example, mrs-word001.

    Use the default values for other parameters.

  4. Click Create Now.
  5. In the bucket list on OBS Console, click the bucket name to go to the bucket details page.
  6. In the navigation pane, choose Objects and create the program and input folders.
    • program: stores program packages.
    • input: stores input data.
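
As an alternative to the console, the two folders can also be created from a cluster node after an agency is bound to the cluster and the client environment is loaded (source /opt/client/bigdata_env). The following is a sketch only. The function name create_obs_folders is hypothetical, and mrs-word001 is the example bucket name used in this guide.

```shell
#!/bin/sh
# Sketch: create the program and input folders in an OBS bucket
# with the Hadoop client. Assumes the hadoop command is on PATH.

create_obs_folders() {
    bucket="$1"
    for dir in program input; do
        # -p keeps the call from failing if the folder already exists
        hadoop fs -mkdir -p "obs://${bucket}/${dir}" || return 1
    done
}

# Example:
#   create_obs_folders mrs-word001
```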

Step 4: Verify Cluster Access to OBS

  1. Log in to a Master node as user root. For details, see Logging In to an ECS.
  2. Run the following command to set the environment variables:

    source /opt/client/bigdata_env

  3. Verify that Hadoop can access OBS.
    1. View the file list in bucket mrs-word001.

      hadoop fs -ls obs://mrs-word001/

    2. Check whether the file list is returned. If it is returned, OBS access is successful.
      Figure 6 Returned file list
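
The Hadoop check above can also be scripted by relying on the exit status of hadoop fs -ls instead of reading the listing. The following is a minimal sketch. The function name check_obs_listing is hypothetical, and the client environment is assumed to be loaded already.

```shell
#!/bin/sh
# Sketch: succeed only if the root of the given OBS bucket can be listed.

check_obs_listing() {
    bucket="$1"
    if hadoop fs -ls "obs://${bucket}/" >/dev/null 2>&1; then
        echo "OBS access OK: ${bucket}"
    else
        echo "OBS access FAILED: ${bucket}" >&2
        return 1
    fi
}

# Example:
#   check_obs_listing mrs-word001
```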
  4. Verify that Hive can access OBS.
    1. If Kerberos authentication is enabled for the cluster, run the following command to authenticate the current user. The user must have permission to create Hive tables. For details about how to configure a role with this permission, see Creating a Role. For details about how to create a user and bind a role to the user, see Creating a User. If Kerberos authentication is disabled for the cluster, skip this step.

      kinit {MRS cluster user}

      Example: kinit hiveuser

    2. Run the client command of the Hive component.

      beeline

    3. Access the OBS directory in beeline. For example, run the following command to create a Hive table and specify that data is stored in the test_obs directory of bucket mrs-word001:

      create table test_obs(a int, b string) row format delimited fields terminated by "," stored as textfile location "obs://mrs-word001/test_obs";

    4. Run the following command to query all tables. If table test_obs is displayed in the command output, OBS access is successful.

      show tables;

      Figure 7 Returned table name
    5. Press Ctrl+C to exit the Hive beeline.
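
The table check in the Hive steps above can also be run without an interactive session. The following sketch assumes that the beeline command on the MRS client forwards the standard Beeline -e option, which this guide does not confirm. The function names table_listed and check_hive_table are hypothetical.

```shell
#!/bin/sh
# Sketch: check non-interactively that a table appears in SHOW TABLES output.

# Reads SHOW TABLES output on stdin and returns 0 if the table name
# appears as a whole word on some line.
table_listed() {
    grep -qw -- "$1"
}

# Assumption: the client's beeline command accepts -e and connects
# to Hive without further arguments.
check_hive_table() {
    beeline -e "show tables;" 2>/dev/null | table_listed "$1"
}

# Example:
#   check_hive_table test_obs && echo "OBS access OK"
```

Matching on whole words keeps a table such as test_obs2 from being mistaken for test_obs.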
  5. Verify that Spark can access OBS.
    1. Run the client command of the Spark component.

      spark-beeline

    2. Access OBS in spark-beeline. For example, create table test in the obs://mrs-word001/table/ directory.

      create table test(id int) location 'obs://mrs-word001/table/';

    3. Run the following command to query all tables. If table test is displayed in the command output, OBS access is successful.

      show tables;

      Figure 8 Returned table name
    4. Press Ctrl+C to exit the Spark beeline.
  6. Verify that Presto can access OBS.
    • For normal clusters with Kerberos authentication disabled
      1. Run the following command to connect to the client:

        presto_cli.sh

      2. On the Presto client, run the following statement to create a schema and set location to an OBS path:

        CREATE SCHEMA hive.demo01 WITH (location = 'obs://mrs-word001/presto-demo002/');

      3. Create a table in the schema. The table data is stored in the OBS bucket. The following is an example.

        CREATE TABLE hive.demo01.demo_table WITH (format = 'ORC') AS SELECT * FROM tpch.sf1.customer;

        Figure 9 Return result
      4. Run exit to exit the client.
    • For security clusters with Kerberos authentication enabled
      1. Log in to MRS Manager and create a role with the Hive Admin Privilege permissions, for example, prestorole. For details about how to create a role, see Creating a Role.
      2. Create a user, for example, presto001, that belongs to the Presto and Hive groups, and bind the role created in step 1 to the user. For details about how to create a user, see Creating a User.
      3. Authenticate the current user.

        kinit presto001

      4. Download the user credential.
        1. For MRS 2.1.0 or earlier, on MRS Manager, choose System > Manage User. In the row of the new user, choose More > Download Authentication Credential.
          Figure 10 Downloading the Presto user authentication credential
        2. For MRS 3.x or later, on FusionInsight Manager, choose System > Permission > User. In the row of the new user, choose More > Download Authentication Credential.
          Figure 11 Downloading the Presto user authentication credential
      5. Decompress the downloaded user credential file, and save the obtained krb5.conf and user.keytab files to the client directory, for example, /opt/client/Presto/.
      6. Run the following command to obtain a user principal:

        klist -kt /opt/client/Presto/user.keytab

      7. For clusters with Kerberos authentication enabled, run the following command to connect to the Presto Server of the cluster:

        presto_cli.sh --krb5-config-path {krb5.conf file path} --krb5-principal {user principal} --krb5-keytab-path {user.keytab file path} --user {presto username}

        • krb5.conf file path: Replace it with the file path set in step 5, for example, /opt/client/Presto/krb5.conf.
        • user.keytab file path: Replace it with the file path set in step 5, for example, /opt/client/Presto/user.keytab.
        • user principal: Replace it with the result returned in step 6.
        • presto username: Replace it with the name of the user created in step 2, for example, presto001.

        Example: presto_cli.sh --krb5-config-path /opt/client/Presto/krb5.conf --krb5-principal presto001@xxx_xxx_xxx_xxx.COM --krb5-keytab-path /opt/client/Presto/user.keytab --user presto001

      8. On the Presto client, run the following statement to create a schema and set location to an OBS path:

        CREATE SCHEMA hive.demo01 WITH (location = 'obs://mrs-word001/presto-demo002/');

      9. Create a table in the schema. The table data is stored in the OBS bucket. The following is an example.

        CREATE TABLE hive.demo01.demo_table WITH (format = 'ORC') AS SELECT * FROM tpch.sf1.customer;

        Figure 12 Return result
      10. Run exit to exit the client.
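
The user principal needed for --krb5-principal can be pulled out of the klist -kt output instead of being copied by hand. The following is a minimal sketch. The function name keytab_principals is hypothetical, and it assumes the usual klist layout in which each entry line starts with a numeric KVNO and ends with the principal.

```shell
#!/bin/sh
# Sketch: print the distinct principals found in "klist -kt" output.
# Reads the klist output on stdin. Header lines are skipped because
# their first field is not a number.

keytab_principals() {
    awk '$1 ~ /^[0-9]+$/ { print $NF }' | sort -u
}

# Example:
#   klist -kt /opt/client/Presto/user.keytab | keytab_principals
```

A keytab usually holds several entries for the same principal (one per encryption type), which is why duplicates are removed.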
  7. Verify that Flink can access OBS.
    1. On the Dashboard tab page, click Click to synchronize on the right side of IAM User Sync to synchronize IAM users.
    2. After user synchronization is complete, choose Jobs > Create on the cluster details page to create a Flink job. In Parameters, enter parameters in --input <Job input path> --output <Job output path> format. You can click OBS to select a job input path, and enter a job output path that does not exist, for example, obs://mrs-word001/output/. See Figure 13.
      Figure 13 Creating a Flink job
    3. On OBS Console, go to the output path specified during job creation. If the output directory is automatically created and contains the job execution results, OBS access is successful.
      Figure 14 Flink job execution result
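
Because the Flink job output path must not exist yet, it can help to check the path before filling in Parameters. The following sketch builds the parameter string in the --input ... --output ... format described above and refuses an output path that already exists. The function name flink_job_params is hypothetical, and the Hadoop client environment is assumed to be loaded.

```shell
#!/bin/sh
# Sketch: print Flink job parameters, rejecting an existing output path.

flink_job_params() {
    input="$1"
    output="$2"
    # hadoop fs -test -e returns 0 when the path exists
    if hadoop fs -test -e "$output"; then
        echo "output path already exists: $output" >&2
        return 1
    fi
    printf '%s\n' "--input $input --output $output"
}

# Example:
#   flink_job_params obs://mrs-word001/input/ obs://mrs-word001/output/
```

A non-zero status from hadoop fs -test -e is treated as "path does not exist", so run the Hadoop access check first to rule out a permission problem.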

Reference

For details about how to control permissions to access OBS, see Configuring Fine-Grained Permissions for MRS Multi-User Access to OBS.
