Updated on 2025-03-17 GMT+08:00

Interconnecting Guardian with OBS

Scenario

This section describes how to enable decoupled storage and compute for Guardian. Once enabled, Guardian can offer temporary authentication credentials for components such as HDFS, Hive, Spark, Loader, and HetuEngine to access OBS in decoupled storage and compute scenarios.

To interconnect Guardian with OBS, do as follows:

  1. Creating an OBS Parallel File System
  2. Creating an Agency for a Regular Account
  3. Creating a Cloud Service Agency and Binding It to a Cluster
  4. Granting Guardian Permissions to Access OBS
  5. Enabling Cascading Authorization for Hive Tables
  6. Configuring a Recycle Bin Cleanup Policy

Prerequisites

  • Components such as Guardian, Ranger, and Hadoop have been installed in the cluster.
  • If Guardian is installed after components such as Hadoop, HetuEngine, Hive, and Spark are installed, you need to redownload the Guardian client and refresh the default client for job submission on the management plane.
  • If Kerberos authentication is not enabled for the current cluster, the user who accesses OBS must belong to the supergroup user group. To add the user to this group, log in to FusionInsight Manager, choose System > Permission > User, locate the user in the user list, and click Modify. In User Group, add supergroup.
  • The AccessLabel function must be enabled on OBS. For how to enable it, contact OBS O&M personnel.

Impact on the System

  • Once you finish the configuration, you will need to either refresh the original client's configuration or reinstall the client.
  • To submit jobs on the MRS console, log in to the active OMS node as user omm and run sh /opt/executor/bin/refresh-client-config.sh to refresh the cluster's built-in client.

Creating an OBS Parallel File System

  1. Log in to the OBS console.
  2. Choose Parallel File Systems > Create Parallel File System.
  3. Enter a file system name, for example, guardian-obs.

    Use the enterprise project selected during MRS cluster creation and set other parameters as needed.

  4. Click Create Now.

Creating an Agency for a Regular Account

  1. Log in to the Huawei Cloud management console.
  2. In the service list, choose Management & Governance > Identity and Access Management.
  3. Choose Agencies. On the displayed page, click Create Agency.
  4. On the Create Agency page, set the following parameters and click Done:
    • Agency Name: Enter an agency name, for example, agency-MRS-to-OBS.
    • Agency Type: Select Account.
    • Delegated Account: Enter the cloud account that you registered with your mobile number. It cannot be a federated user or an IAM user created using your cloud account.
    • Validity Period: Select Unlimited.
  5. In the displayed dialog box, click Authorize. On the displayed page, click Create Policy.
    On the Create Policy page, set the following parameters and click Next:
    • Policy Name: Enter a policy name, for example, guardian-policy.
    • Policy View: Select JSON.
    • Policy Content: Configure the parameter as follows:
      {
          "Version": "1.1",
          "Statement": [
              {
                  "Effect": "Allow",
                  "Action": [
                      "obs:bucket:GetBucketLocation",
                      "obs:bucket:ListBucketMultipartUploads",
                      "obs:object:GetObject",
                      "obs:object:ModifyObjectMetaData",
                      "obs:object:DeleteObject",
                      "obs:object:ListMultipartUploadParts",
                      "obs:bucket:HeadBucket",
                      "obs:object:AbortMultipartUpload",
                      "obs:bucket:ListBucket",
                      "obs:object:PutObject",
                      "obs:object:GetAccessLabel",
                      "obs:object:DeleteAccessLabel",
                      "obs:object:PutAccessLabel",
                      "obs:bucket:ListAllMyBuckets"
                  ],
                  "Resource": [
                      "OBS:*:*:bucket:guardian-obs",
                      "OBS:*:*:object:*"
                  ]
              }
          ]
      }

      In the preceding configuration, Resource indicates that all resources of the configured parallel file system can be accessed. guardian-obs indicates the name of the OBS parallel file system created in Creating an OBS Parallel File System.

  6. Click Next. On the Select Policy/Role page, select the policy created in 5.
  7. Click Next, select All resources, click Show More, select Global resources, and click OK.
  8. View and record the agency ID.
    Figure 1 Viewing an agency ID
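
The custom policy in step 5 is tied to a single parallel file system name. As a minimal sketch, assuming you want to produce the same policy content for a different file system name before pasting it into the JSON view, the following Python builds the document. The helper name build_obs_policy is hypothetical and is not part of any Huawei Cloud SDK; the action list and resource patterns are copied from the policy content in this section.

```python
import json

# Actions copied from the policy content in step 5 of this section.
OBS_ACTIONS = [
    "obs:bucket:GetBucketLocation",
    "obs:bucket:ListBucketMultipartUploads",
    "obs:object:GetObject",
    "obs:object:ModifyObjectMetaData",
    "obs:object:DeleteObject",
    "obs:object:ListMultipartUploadParts",
    "obs:bucket:HeadBucket",
    "obs:object:AbortMultipartUpload",
    "obs:bucket:ListBucket",
    "obs:object:PutObject",
    "obs:object:GetAccessLabel",
    "obs:object:DeleteAccessLabel",
    "obs:object:PutAccessLabel",
    "obs:bucket:ListAllMyBuckets",
]


def build_obs_policy(file_system_name: str) -> dict:
    """Hypothetical helper: return the policy document for one parallel file system."""
    if not file_system_name:
        raise ValueError("file_system_name must not be empty")
    return {
        "Version": "1.1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(OBS_ACTIONS),
                "Resource": [
                    # Bucket-level resource: the parallel file system itself.
                    f"OBS:*:*:bucket:{file_system_name}",
                    # Object-level resource: same pattern as in the documented policy.
                    "OBS:*:*:object:*",
                ],
            }
        ],
    }


# Print the policy for the example file system used in this section.
print(json.dumps(build_obs_policy("guardian-obs"), indent=4))
```

The printed JSON can be pasted into Policy Content. Only the bucket name in the first Resource entry changes between file systems.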

Creating a Cloud Service Agency and Binding It to a Cluster

  1. Log in to the Huawei Cloud management console.
  2. In the service list, choose Management & Governance > Identity and Access Management.
  3. Choose Agencies. On the displayed page, click Create Agency.
  4. Set Agency Name. For example, enter mrs_ecs_obs.
  5. Set Agency Type to Cloud service and select Elastic Cloud Server (ECS) and Bare Metal Server (BMS) to authorize ECS or BMS to access OBS.
    Figure 2 Creating an agency
  6. Set Validity Period to Unlimited and click Done.
  7. In the displayed dialog box, click Authorize. On the displayed page, click Create Policy.
    On the Create Policy page, set the following parameters and click Next:
    • Policy Name: Enter a policy name, for example, guardian-assume-policy.
    • Policy View: Select JSON.
    • Policy Content: Enter the following content. {Agency ID} indicates the ID recorded in 8.
      {
          "Version": "1.1",
          "Statement": [
              {
                  "Action": [
                      "iam:agencies:assume"
                  ],
                  "Resource": {
                      "uri": [
                          "/iam/agencies/{Agency ID}"
                      ]
                  },
                  "Effect": "Allow"
              }
          ]
      }
  8. Click Next. On the Select Policy/Role page, select the policy created in 7.
  9. Click Next, click Show More, select Global services, and click OK.
  10. In the displayed dialog box, click OK to start authorization. Click Finish after the message "Authorization successful." is displayed.
  11. Log in to the MRS console. In the navigation pane on the left, choose Active Clusters.
  12. Click the name of the target cluster to access its details page.
  13. On the Dashboard tab, click Synchronize next to IAM User Sync to synchronize IAM users.
  14. On the Dashboard tab, click Manage Agency next to Agency. In the displayed dialog box, select the agency you created, for example, mrs_ecs_obs, and click OK.
    Figure 3 Binding an agency
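
The policy content in step 7 contains the {Agency ID} placeholder, which is easy to leave unreplaced. As a minimal sketch, assuming you keep the recorded agency ID in a script or variable, the following Python fills in the placeholder and rejects an unreplaced value. The helper name build_assume_policy and the sample ID are hypothetical; the policy structure is copied from step 7.

```python
import json


def build_assume_policy(agency_id: str) -> dict:
    """Hypothetical helper: return the agency-assume policy for one agency ID."""
    agency_id = agency_id.strip()
    # Reject an empty value or a value that still looks like the placeholder.
    if not agency_id or "{" in agency_id or "}" in agency_id:
        raise ValueError("replace {Agency ID} with the recorded agency ID")
    return {
        "Version": "1.1",
        "Statement": [
            {
                "Action": ["iam:agencies:assume"],
                "Resource": {"uri": [f"/iam/agencies/{agency_id}"]},
                "Effect": "Allow",
            }
        ],
    }


# "0123456789abcdef" is a made-up ID used only for illustration.
print(json.dumps(build_assume_policy("0123456789abcdef"), indent=4))
```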

Granting Guardian Permissions to Access OBS

  1. Log in to FusionInsight Manager, choose Cluster > Services > Guardian, click Configurations, and then click All Configurations. On the displayed page, search for and modify the following parameters:

    | Parameter | Description | Example Value |
    |---|---|---|
    | fs.obs.guardian.accesslabel.enabled | Whether to enable AccessLabel on OBS, which allows Guardian to connect to OBS. | true |
    | fs.obs.guardian.enabled | Whether to enable Guardian. | true |
    | fs.obs.delegation.token.providers | Delegation token generator. When fs.obs.guardian.enabled is set to true, you need to set both com.huawei.mrs.dt.MRSDelegationTokenProvider and com.huawei.mrs.dt.GuardianDTProvider. | com.huawei.mrs.dt.MRSDelegationTokenProvider and com.huawei.mrs.dt.GuardianDTProvider |
    | token.server.access.label.agency.name | Name of the specified IAM agency, which is the one created in Creating an Agency for a Regular Account. | agency-MRS-to-OBS |

  2. Save the service configuration, choose More > Restart Configuration-Expired Instances on the FusionInsight Manager home page, and restart all service instances whose configurations have expired as prompted.
  3. To submit jobs on the MRS console, log in to the active OMS node as user omm and run the following command to refresh the built-in client configuration:

    sh /opt/executor/bin/refresh-client-config.sh
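
The four Guardian parameters depend on each other: both delegation token providers are needed when fs.obs.guardian.enabled is true, and the agency name must be set. As a minimal sketch, assuming you have copied the values from the Manager page into a Python dictionary by hand, the following check lists anything that does not match this section. The function name check_guardian_config is hypothetical, and the providers are passed as a list because the sketch makes no assumption about how the real parameter value is delimited.

```python
# Provider class names taken from the parameter description in this section.
REQUIRED_PROVIDERS = {
    "com.huawei.mrs.dt.MRSDelegationTokenProvider",
    "com.huawei.mrs.dt.GuardianDTProvider",
}


def check_guardian_config(config: dict) -> list:
    """Hypothetical helper: return a list of problems found in the given values."""
    problems = []
    # Both switches are expected to be true for Guardian to connect to OBS.
    for key in ("fs.obs.guardian.accesslabel.enabled", "fs.obs.guardian.enabled"):
        if str(config.get(key, "")).lower() != "true":
            problems.append(f"{key} should be true")
    # Both delegation token providers must be present.
    providers = set(config.get("fs.obs.delegation.token.providers", []))
    for name in sorted(REQUIRED_PROVIDERS - providers):
        problems.append(f"fs.obs.delegation.token.providers is missing {name}")
    # The agency name must not be empty.
    if not config.get("token.server.access.label.agency.name"):
        problems.append("token.server.access.label.agency.name is not set")
    return problems


example = {
    "fs.obs.guardian.accesslabel.enabled": "true",
    "fs.obs.guardian.enabled": "true",
    "fs.obs.delegation.token.providers": [
        "com.huawei.mrs.dt.MRSDelegationTokenProvider",
        "com.huawei.mrs.dt.GuardianDTProvider",
    ],
    "token.server.access.label.agency.name": "agency-MRS-to-OBS",
}
print(check_guardian_config(example))
```

An empty list means the values match the example values in this section. The check does not contact the cluster; it only compares the values you supply.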

Enabling Cascading Authorization for Hive Tables

  1. Log in to FusionInsight Manager, choose Cluster > Services > Ranger, and click Configurations.
  2. Search for ranger.ext.authorization.cascade.enable and set it to true.
  3. Click Save.
  4. Click Instance and select all RangerAdmin instances. Click More and select Restart Instance. Enter the password and click OK to restart all RangerAdmin instances.

You can only enable OBS cascading authorization for clusters with Kerberos authentication enabled.

Configuring a Recycle Bin Cleanup Policy

  1. Log in to the OBS console.
  2. In the navigation pane on the left, choose Resources > Parallel File Systems. On the displayed page, click the name of the file system created in Creating an OBS Parallel File System.
  3. In the navigation pane on the left, choose Basic Configurations > Lifecycle Rules. On the displayed page, click Create to create a lifecycle rule for the /user/.Trash directory.

    Once you have configured decoupled storage and compute for a cluster, you must create lifecycle rules for relevant directories. Otherwise, there is a risk of running out of storage space and incurring additional storage costs.

    Table 1 Parameters for creating a lifecycle rule

    | Parameter | Description | Example Value |
    |---|---|---|
    | Status | Whether to enable the lifecycle rule. | Enabled |
    | Rule Name | Enter a rule name, which is used to identify different lifecycle configurations. | rule-test |
    | Prefix | Prefix of the objects to which the lifecycle rule applies. Typically, the prefix of the recycle bin directory of MRS components is /user/.Trash. | user/.Trash |
    | Transition to Infrequent Access After (Days) | Number of days after an object's last update before the rule transitions it to infrequent access storage. The minimum value is 30. | 30 days |
    | Transition to Archive After (Days) | Number of days after an object's last update before the rule transitions it to archive storage. If you are setting both this parameter and Transition to Infrequent Access After (Days), make sure this parameter value is at least 30 days greater than the value of Transition to Infrequent Access After (Days). If you are only setting this parameter, assign any value to it as needed. | 31 days |
    | Delete Files After (Days) | Number of days after an object's last update before it expires and is automatically deleted by OBS based on the rule. The value of this parameter must be greater than the values of the two transition parameters. | 32 days |
    | Delete Fragments After (Days) | Number of days after a fragment is created before it expires and is automatically deleted by OBS based on the rule. | 30 days |

  4. Click OK.

    To modify, disable, or enable a lifecycle rule, locate the rule and click Edit, Disable, or Enable in the Operation column, respectively.
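
The day values in a lifecycle rule constrain each other, as stated in the Description column of Table 1. As a minimal sketch, assuming you want to check a set of values before entering them on the console, the following Python applies those stated constraints. The function name check_lifecycle_days is hypothetical, None means that a parameter is left unset, and the check covers only the rules written in this section, not every rule OBS may enforce.

```python
from typing import Optional


def check_lifecycle_days(
    infrequent: Optional[int],
    archive: Optional[int],
    delete: Optional[int],
) -> list:
    """Hypothetical helper: return a list of violated constraints."""
    problems = []
    # Transition to Infrequent Access After (Days): minimum value is 30.
    if infrequent is not None and infrequent < 30:
        problems.append("infrequent access transition must be at least 30 days")
    # When both transitions are set, archive must be at least 30 days later.
    if infrequent is not None and archive is not None and archive < infrequent + 30:
        problems.append("archive transition must be at least 30 days after infrequent access")
    # Delete Files After (Days) must be greater than both transition values.
    if delete is not None:
        for label, value in (("infrequent access", infrequent), ("archive", archive)):
            if value is not None and delete <= value:
                problems.append(f"delete must be greater than the {label} transition")
    return problems


# Illustrative values chosen for this sketch; they are not taken from Table 1.
print(check_lifecycle_days(infrequent=30, archive=60, delete=90))
```

An empty list means the values satisfy the constraints described in Table 1.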