Updated on 2025-08-11 GMT+08:00

Accessing OBS Using Hive Through Guardian

After interconnecting Guardian with OBS as described in Disabling Ranger OBS Path Authentication for Guardian or Enabling Ranger OBS Path Authentication for Guardian, you can use the Hive client to create tables whose data is stored in an OBS parallel file system.

Prerequisites

If you interconnected Guardian with OBS by following Enabling Ranger OBS Path Authentication for Guardian, ensure that you have read and write permissions on the OBS path in Ranger. For details about how to grant the permissions, see Configuring Ranger Permissions.

Interconnecting Hive with OBS

MRS clusters allow Hive to connect to OBS through Metastore.

Interconnecting Hive with OBS through Metastore

  1. Ensure that storage-compute decoupling has been configured by referring to Enabling Ranger OBS Path Authentication for Guardian.
  2. Log in to FusionInsight Manager of the MRS cluster.

    For details about how to log in to FusionInsight Manager, see Accessing MRS Manager.

  3. Choose Cluster > Services > Hive and click Configurations.
  4. Search for hive.metastore.warehouse.dir in the search box and set its value to an OBS path, for example, obs://hivetest/user/hive/warehouse/, where hivetest is the OBS file system name.

    Figure 1 hive.metastore.warehouse.dir configuration

  5. Click Save to save the configuration, choose Cluster > Services, and restart the Hive service in the service list.
  6. Update the client configuration file.

    1. Log in to the node where the Hive client is located and run the following command to modify hivemetastore-site.xml in the Hive client configuration file directory:
      vi Client installation directory/Hive/config/hivemetastore-site.xml
    2. Change the value of hive.metastore.warehouse.dir to the corresponding OBS path, for example, obs://hivetest/user/hive/warehouse/.
      Figure 2 Configuring the OBS path
    3. Change the value of hive.metastore.warehouse.dir of hivemetastore-site.xml in the HCatalog client configuration file directory to the corresponding OBS path, for example, obs://hivetest/user/hive/warehouse/.
      vi Client installation directory/Hive/HCatalog/conf/hivemetastore-site.xml

  7. Go to the Hive Beeline CLI, create a database, and verify that its location is an OBS path.

    1. Go to the client installation directory.
      cd Client installation directory
    2. Load the environment variables.
      source bigdata_env
    3. Authenticate the user. Skip this step for clusters with Kerberos authentication disabled.
      kinit User performing Hive operations
    4. Log in to the Hive client.
      beeline
    5. Create a database.
      create database testdb1;
    6. Check the location of the database.
      show create database testdb1;
      Figure 3 Viewing the location of the newly created Hive database
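
As a quick check after step 6, the following shell sketch reads hive.metastore.warehouse.dir from a client configuration file and reports whether the value is an OBS path. The function names warehouse_dir and is_obs_warehouse are illustrative and are not part of the MRS client. The sketch assumes the file uses the usual Hadoop-style property layout, with the value on a single line after the name.

```shell
#!/bin/sh
# Print the value configured for hive.metastore.warehouse.dir in the given
# Hadoop-style XML file. Assumption: the <value> element does not span
# multiple lines and follows the matching <name> element.
warehouse_dir() {
  awk '
    /<name>hive\.metastore\.warehouse\.dir<\/name>/ { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")     # drop everything up to the opening tag
      sub(/<\/value>.*/, "")   # drop the closing tag and anything after it
      print
      exit
    }
  ' "$1"
}

# Succeed only if the configured warehouse directory starts with obs://.
is_obs_warehouse() {
  case "$(warehouse_dir "$1")" in
    obs://*) return 0 ;;
    *)       return 1 ;;
  esac
}
```

For example, calling is_obs_warehouse on hivemetastore-site.xml in both the Hive and the HCatalog client configuration directories should succeed once both files have been changed to the OBS path.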

Configuring Ranger Permissions

  • Granting the read and write permissions on OBS paths to the hive user group
    1. Log in to the Ranger web UI as the Ranger administrator. On the home page, click the component plug-in name OBS in the EXTERNAL AUTHORIZATION area, and assign the Read and Write permissions on the OBS storage path to the hive user group. After this operation succeeds, all users in the hive user group can access the Hive data warehouse path.

      For example, assign the Read and Write permissions on the obs://hivetest/user/hive/warehouse/ directory to the hive user group:

      Figure 4 Granting the hive user group permissions for reading and writing OBS paths
    2. Choose Settings > Roles, click Add New Role, and create a role whose Role Name is hive.
      Figure 5 Creating a hive role
  • Granting the read and write permissions on OBS paths to a custom user group
    1. Log in to FusionInsight Manager and choose System > Permission > User Group. On the displayed page, click Create User Group.
    2. Create a user group without a role, for example, hiveobs1, and bind the user group to the corresponding user.
    3. Log in to the Ranger management page as the rangeradmin user.
    4. On the home page, click the component plug-in name OBS in the EXTERNAL AUTHORIZATION area.
    5. Grant the Read and Write permissions on the OBS storage path to the hiveobs1 user group. In this case, all users bound to the hiveobs1 user group can access the Hive data warehouse path.
      Figure 6 Granting the custom Hive user group permissions for reading and writing OBS paths
  • Creating a database, table, or partition in a custom location and granting read and write permissions on OBS paths
    1. Log in to the Ranger web UI as the Ranger administrator.
    2. On the home page, click the component plug-in name OBS in the EXTERNAL AUTHORIZATION area, and assign the Read and Write permissions on the OBS storage path to the user group of the corresponding user.

      For example, assign the Read and Write permissions on the obs://obs-test/test/ directory to the hgroup1 user group, as shown in the following figure.

      Figure 7 Granting the user group permissions for reading and writing OBS paths
    3. On the home page, click the component plug-in name Hive in the HADOOP SQL area, and add a URL policy that grants the Read and Write permissions on the OBS path to the user group of the corresponding user. For details, see Adding a Ranger Access Permission Policy for Hive.

      For example, create the hive_url_policy URL policy for the hgroup1 user group and assign the Read and Write permissions on the obs://obs-test/test/ directory to the user group, as shown in the following figure.

      Figure 8 Creating a URL policy with permissions for reading and writing OBS paths
    4. Log in to the beeline client and set Location to the OBS file system path when creating a table.

      Go to the client installation directory.

      cd Client installation directory

      Load the environment variables.

      source bigdata_env

      Authenticate the user. Skip this step for clusters with Kerberos authentication disabled.

      kinit User performing Hive operations

      Log in to the Hive client.

      beeline

      For example, to create a table named test whose Location is obs://obs-test/test/Database name/Table name, run the following command:

      create external table test(name string) location "obs://obs-test/test/Database name/Table name";
  • To authorize a view, you need to grant permissions on the view and on the path of the physical table that the view is based on.
  • Cascading authorization can be performed only on databases and tables, not on partitions. If a partition path is not under the table path, you need to manually authorize the partition path.
  • Cascading authorization is not supported for Deny Conditions in Hive Ranger policies. That is, a Deny Conditions permission restricts only the table permission and does not generate a corresponding permission on the HDFS/OBS storage source.
  • HDFS Ranger policy permissions take precedence over the HDFS/OBS storage source permissions generated by cascading authorization. If an HDFS Ranger permission has already been set for the HDFS storage source of a table, the cascading permission does not take effect.
  • After cascading authorization, alter operations cannot be performed on tables whose storage source is OBS. To perform an alter operation, grant the Read and Write permissions on the parent directory of the OBS table path to the corresponding user group.
  • Before configuring permission policies for OBS paths on Ranger, ensure that the AccessLabel function has been enabled for OBS. If the function is not enabled, manually enable it. For details, contact OBS O&M personnel.
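
The table-creation command in the custom-location procedure above can also be assembled from its parts, which keeps the Location string free of accidental spaces. This is only a sketch: create_table_sql is a hypothetical helper, not an MRS or Hive command, and the database name testdb used in the note below is an illustrative value.

```shell
#!/bin/sh
# Print a CREATE EXTERNAL TABLE statement whose location is
# <base OBS path>/<database name>/<table name>.
# Arguments: $1 = base OBS path, $2 = database name, $3 = table name.
create_table_sql() {
  base=${1%/}   # drop one trailing slash so the parts join with a single "/"
  printf 'create external table %s(name string) location "%s/%s/%s";\n' \
    "$3" "$base" "$2" "$3"
}
```

Running create_table_sql obs://obs-test/test/ testdb test prints a statement whose location is obs://obs-test/test/testdb/test. The printed statement can then be run in the beeline session opened in the procedure.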