Hive Tables Supporting Cascading Authorization

This topic applies only to MRS 3.3.0 or later. Before using this function, ensure that the following conditions are met:

The AccessLabel function must be enabled on OBS. For details about how to enable this function, contact OBS O&M personnel.
To ensure successful OBS authorization for the table, the following conditions must be met when using OBS as the storage source:
- The Guardian service must have been installed in the cluster.
- Tables stored in OBS can only be authorized to user groups.
- OBS cascading authorization can be used only in clusters with Kerberos authentication enabled.

Scenario

Enabling cascading authorization for a cluster makes authorization much easier to use. You only need to authorize service tables once on the Ranger page, and the system automatically associates fine-grained permissions with the data storage source. You do not need to know the storage path of the tables or perform secondary authorization. This also covers tables in storage-compute decoupled scenarios, which otherwise require the table and its storage path to be authorized separately. Cascading authorization for Hive tables works as follows:

After Ranger cascading authorization is enabled, to authorize a table you only need to create a Hive policy for it in Ranger. You do not need to perform secondary authorization on the table's storage source.
When the storage source of an authorized database or table changes, the database or table is periodically associated with the new storage source (HDFS or OBS) to generate corresponding permissions.

Cascading authorization is not supported for views.
Cascading authorization can be performed only on databases and tables, not on partitions. If a partition path is not under the table path, you need to authorize the partition path manually.
Cascading authorization is not supported for Deny Conditions in a Hive Ranger policy. That is, a Deny Conditions permission restricts only the table permission and does not generate any permission on the HDFS/OBS storage source.
A policy whose database is * and table is * cannot be created in Hive Ranger.
An HDFS Ranger policy takes precedence over the HDFS/OBS storage source permission generated by cascading authorization. If an HDFS Ranger permission has already been set for the HDFS storage source of the table, the cascading permission does not take effect.
The ALTER operation cannot be performed on a table that is stored in OBS and authorized only through cascading authorization. To use this operation, grant the Read and Write permissions on the parent directory of the OBS table path to the corresponding user group.
A user group name can contain up to 52 characters and only digits (0 to 9), letters (A to Z or a to z), underscores (_), and number signs (#). Otherwise, the policy fails to be added. For how to modify the user group information, see Creating a User Group.
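
Two of the constraints above can be checked before a policy is created. The following Python sketch is illustrative only: is_valid_group_name and partition_needs_manual_auth are hypothetical helper names, not part of Ranger or MRS, and the db1/t1 paths are made-up examples. It assumes that a group name must be non-empty and that paths are plain strings needing no normalization beyond trailing slashes.

```python
import re

# Allowed per the note above: digits, letters, underscores, and number signs,
# up to 52 characters. Assumption: the name must contain at least one character.
_GROUP_NAME_RE = re.compile(r"[0-9A-Za-z_#]{1,52}")


def is_valid_group_name(name: str) -> bool:
    """Return True if the user group name satisfies the documented limits."""
    return _GROUP_NAME_RE.fullmatch(name) is not None


def partition_needs_manual_auth(table_path: str, partition_path: str) -> bool:
    """Return True if the partition path lies outside the table path.

    Such a partition is not covered by cascading authorization, so its path
    has to be authorized manually.
    """
    # Compare with a trailing slash so that ".../t1" does not match ".../t10".
    table_prefix = table_path.rstrip("/") + "/"
    partition = partition_path.rstrip("/") + "/"
    return not partition.startswith(table_prefix)


if __name__ == "__main__":
    print(is_valid_group_name("hgroup1"))      # True
    print(is_valid_group_name("hive-group"))   # False: hyphen is not allowed
    print(partition_needs_manual_auth(
        "obs://obs-test/test/db1/t1",
        "obs://obs-test/test/db1/t1/dt=2024-01-01"))  # False: inside table path
```

The prefix comparison is deliberately simple. It does not resolve symbolic links or relative segments, so treat its result as a hint rather than a guarantee.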

Enabling Cascading Authorization

Log in to FusionInsight Manager, choose Cluster > Services > Ranger, and click Configurations.
Search for the ranger.ext.authorization.cascade.enable parameter and set it to true.
Click Save.
Click Instance and select all RangerAdmin instances. Click More and select Restart Instance. Enter the password, and click OK to restart all RangerAdmin instances.

Connecting to the HDFS Storage Source

The HDFS storage source does not need to be configured.

Connecting to the OBS Storage Source

Setting the location to an OBS path when creating a table
1. Ensure that the storage and compute decoupling has been configured. For details, see "Interconnecting with OBS Using the Guardian Service".
2. Log in to the Ranger management page as the Ranger administrator rangeradmin. On the home page, click OBS in the EXTERNAL AUTHORIZATION area, click Add New Policy, and assign the Read and Write permissions on the OBS storage path to the user group to which the corresponding user belongs. For details, see Adding a Ranger Access Permission Policy for OBS.
  For example, assign the Read and Write permissions on the obs://obs-test/test/ directory to the hgroup1 user group, as shown in the following figure.
3. On the home page, click the component plug-in name Hive in the HADOOP SQL area. On the Access page, click Add New Policy to add a URL policy that assigns the Read and Write permissions on OBS storage paths to the user group to which the corresponding user belongs. For details, see Adding a Ranger Access Permission Policy for Hive.
  For example, create the hive_url_policy URL policy for the hgroup1 user group and assign the Read and Write permissions on the obs://obs-test/test/ directory to the user group, as shown in the following figure.
4. Log in to the beeline client and set Location to the OBS file system path when creating a table.
  cd Client installation directory
  
  kinit Component operation user
  
  beeline
  
  For example, to create a table named test whose Location is obs://obs-test/test/Database name/Table name, run the following command:
  
  create table test(name string) location "obs://obs-test/test/Database name/Table name";

Interconnecting Hive with OBS through Metastore
1. Ensure that the storage and compute decoupling has been configured. For details, see "Interconnecting with OBS Using the Guardian Service".
2. Log in to FusionInsight Manager, choose Cluster > Services > Hive, and click Configurations.
3. Search for hive.metastore.warehouse.dir in the search box and change the parameter value to an OBS path, for example, obs://hivetest/user/hive/warehouse/. hivetest indicates the OBS file system name.
  Figure 1 hive.metastore.warehouse.dir configuration
4. Save the configuration, choose Cluster > Services, and restart the Hive service in the service list.
5. Update the client configuration file.
  1. Log in to the node where the Hive client is located and run the following command to modify hivemetastore-site.xml in the Hive client configuration file directory:
    vi Client installation directory/Hive/config/hivemetastore-site.xml
  2. Change the value of hive.metastore.warehouse.dir to the corresponding OBS path, for example, obs://hivetest/user/hive/warehouse/.
  3. Change the value of hive.metastore.warehouse.dir of hivemetastore-site.xml in the HCatalog client configuration file directory to the corresponding OBS path, for example, obs://hivetest/user/hive/warehouse/.
    vi Client installation directory/Hive/HCatalog/conf/hivemetastore-site.xml
  4. Log in to the Ranger management page as the Ranger administrator rangeradmin. On the home page, click OBS in the EXTERNAL AUTHORIZATION area, click Add New Policy, and assign the Read and Write permissions on the OBS storage path to the user group to which the corresponding user belongs.
    For example, assign the Read and Write permissions on the obs://hivetest/user/hive/warehouse/ directory to the hgroup1 user group:
  5. Choose Settings > Roles, click Add New Role, and create a role whose Role Name is hive.
6. Go to the Hive Beeline CLI, create a table, and ensure that the location is an OBS path.
  cd Client installation directory
  
  kinit Component operation user
  
  beeline
  
  create table test(name string);
  
  desc formatted test;
  If the current database is located in HDFS, tables created in it without an explicit location are also stored in HDFS. To change this default behavior, update the database location to point to OBS.
  
  The procedure is as follows:
  1. Query the location of the database.
    show create database obs_test;
  2. Modify the database location.
    alter database obs_test set location 'obs://test1/';
    
    Run the show create database obs_test command to check whether the location of the database points to OBS.
  3. Modify the table location.
    alter table user_info set location 'obs://test1/';
    
    If the table contains data, migrate the original data file to the new location.
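
To confirm where a table or database is stored, you can inspect the location value reported by desc formatted or show create database. The sketch below is a hypothetical Python helper (describe_location is not a Hive or MRS API) that splits such a value into storage type, file system name, and path. It assumes the location is a well-formed URI like the examples in this topic. The hdfs://hacluster/... value is an assumed example and may differ in your cluster.

```python
from urllib.parse import urlsplit


def describe_location(location: str) -> dict:
    """Split a table or database location into its parts.

    For an OBS location, "file_system" is the OBS file system name.
    For an HDFS location, it is the nameservice or NameNode address.
    """
    parts = urlsplit(location)
    scheme = parts.scheme.lower()
    if scheme == "obs":
        storage = "OBS"
    elif scheme == "hdfs":
        storage = "HDFS"
    else:
        storage = "OTHER"
    return {
        "storage": storage,
        "file_system": parts.netloc,
        "path": parts.path,
    }


if __name__ == "__main__":
    # Location taken from the hive.metastore.warehouse.dir example above.
    print(describe_location("obs://hivetest/user/hive/warehouse/"))
    # Assumed HDFS example; the nameservice name is illustrative.
    print(describe_location("hdfs://hacluster/user/hive/warehouse/test"))
```

If the reported storage is HDFS after a table is created, the database location probably still points to HDFS and can be changed as described in the procedure above.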