Accessing OBS Using YARN Through Guardian
After you connect Guardian to OBS as described in Disabling Ranger OBS Path Authentication for Guardian or Enabling Ranger OBS Path Authentication for Guardian, you can run YARN jobs on the cluster client to access OBS.
Prerequisites
If Guardian was connected to OBS as described in Enabling Ranger OBS Path Authentication for Guardian, ensure that you have read and write permissions on the OBS path in Ranger. For details about how to grant the permissions, see Configuring Ranger Permissions.
Interconnecting YARN with OBS
- Log in to the node where the YARN client is installed as the client installation user.
- Run the following command to switch to the client installation directory:
cd Client installation directory
- Run the following command to configure environment variables:
source bigdata_env
- If Kerberos authentication is enabled for the cluster, run the following command to authenticate the user. The user must have read and write permissions on the OBS directory. Skip this step if Kerberos authentication is disabled for the cluster.
kinit User performing HDFS operations
- Explicitly specify the OBS file system to be accessed in the YARN command line.
- Access the OBS file system.
hdfs dfs -ls obs://OBS parallel file system name/path
- Create a directory in the OBS file system.
hdfs dfs -mkdir obs://OBS parallel file system name/hadoop1
- Execute the YARN task to access OBS.
yarn jar Client installation directory/HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi -Dmapreduce.job.hdfs-servers=NAMESERVICE -fs obs://OBS parallel file system name 1 1
NAMESERVICE indicates the NameService in HDFS. The default value is hdfs://hacluster. If there are multiple NameServices, separate them with commas (,).
Example:
yarn jar /opt/hadoopclient/HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi -Dmapreduce.job.hdfs-servers=hdfs://hacluster -fs obs://bucketname 1 1
- Run the following command to write data to OBS:
yarn jar Client installation directory/HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar teragen 100 obs://OBS parallel file system name/hadoop1/teragen1
- Run the following command to copy data from OBS to HDFS:
hadoop distcp obs://OBS parallel file system name/hadoop1/teragen1 /tmp
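The client-side steps above can be collected into a small Bash sketch. It is not part of the product and has not been validated against a real cluster. The function names (join_nameservices, run, obs_smoke_test) and the DRY_RUN switch are assumptions made for illustration, and /opt/hadoopclient and bucketname are the example values used above. By default the sketch only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: the values below are placeholders, not product defaults.
CLIENT_DIR="${CLIENT_DIR:-/opt/hadoopclient}"  # client installation directory (example value)
OBS_FS="${OBS_FS:-bucketname}"                 # OBS parallel file system name (example value)
KRB_USER="${KRB_USER:-}"                       # set only if Kerberos authentication is enabled
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print commands, 0 = execute them

# Join one or more NameService URIs with commas for mapreduce.job.hdfs-servers.
join_nameservices() {
  local IFS=,
  printf '%s\n' "$*"
}

# Print the command in dry-run mode; otherwise run it in the current shell.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# List the OBS file system, create a directory, and run the example pi job.
# Arguments: NameService URIs (defaults to hdfs://hacluster).
obs_smoke_test() {
  if [ "$#" -eq 0 ]; then set -- hdfs://hacluster; fi
  local servers
  servers="$(join_nameservices "$@")"
  run cd "$CLIENT_DIR" || return 1
  run source ./bigdata_env || return 1
  if [ -n "$KRB_USER" ]; then
    run kinit "$KRB_USER" || return 1
  fi
  run hdfs dfs -ls "obs://$OBS_FS/" || return 1
  run hdfs dfs -mkdir "obs://$OBS_FS/hadoop1" || return 1
  run yarn jar "$CLIENT_DIR"/HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    pi "-Dmapreduce.job.hdfs-servers=$servers" -fs "obs://$OBS_FS" 1 1
}
```

For example, obs_smoke_test hdfs://hacluster hdfs://ns2 (where hdfs://ns2 is a hypothetical second NameService) prints the command sequence with -Dmapreduce.job.hdfs-servers=hdfs://hacluster,hdfs://ns2. Set DRY_RUN=0 to execute the commands, and set KRB_USER on clusters with Kerberos authentication enabled.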
Changing the Log Level of OBS Client
- Go to the hadoop directory.
cd Client installation directory/HDFS/hadoop/etc/hadoop
- Edit the file log4j.properties.
vi log4j.properties
Add the following OBS log level configuration to the file and save it.
log4j.logger.org.apache.hadoop.fs.obs=WARN
log4j.logger.com.obs=WARN
- Run the following command:
tail -4 log4j.properties
If the command output matches Figure 1 and includes the two lines you added, the log level has been changed.
Figure 1 Adding an OBS log level
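If you prefer not to edit the file by hand, the same change can be sketched as a small Bash helper that appends the two logger lines only when they are missing. The function names ensure_line and set_obs_log_level are assumptions made for illustration. The helper does not rewrite an existing entry that sets a different level, and it assumes the file ends with a newline.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present.
ensure_line() {
  local line="$1" file="$2"
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Add the two OBS logger settings from this section to a log4j.properties file.
# Arguments: path to the file, optional log level (defaults to WARN).
set_obs_log_level() {
  local file="$1" level="${2:-WARN}"
  ensure_line "log4j.logger.org.apache.hadoop.fs.obs=$level" "$file" || return 1
  ensure_line "log4j.logger.com.obs=$level" "$file"
}
```

A call such as set_obs_log_level /opt/hadoopclient/HDFS/hadoop/etc/hadoop/log4j.properties (using the example client path) can be repeated safely, because existing lines are not duplicated. You can then confirm the result with tail -4 log4j.properties as described above.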
Configuring Ranger Permissions
- Log in to FusionInsight Manager and choose System > Permission > User Group. On the displayed page, click Create User Group to create a user group without any roles, for example, obs_hadoop1.
For details about how to log in to MRS Manager, see Accessing MRS Manager.
- Return to FusionInsight Manager and choose System > Permission > User. On the displayed page, click Create User to create a user that is associated with the obs_hadoop1 user group and the default role, for example, hadoopuser1.
- Log in to the Ranger management page as the rangeradmin user.
- On the home page, click the component plug-in name OBS in the EXTERNAL AUTHORIZATION area.
- Click Add New Policy and grant the Read and Write permissions on the desired OBS paths to the user group created in the first step (for example, obs_hadoop1).
The following figure shows the configurations needed for adding the Read and Write permissions on obs://OBS parallel file system name/hadoop1 to user group obs_hadoop1.
Figure 2 Granting the new user group permissions for reading and writing OBS paths
Before configuring permission policies for OBS paths on Ranger, ensure that the AccessLabel function has been enabled for OBS. If the function is not enabled, manually enable it. For details, contact OBS O&M personnel.
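After the policy takes effect, one way to confirm the grant is to list the path and write a small probe file as the new user. The following Bash sketch uses only standard hdfs dfs subcommands. The function name check_obs_rw and the probe file name are assumptions made for illustration. A failure may also mean that the path does not exist, and deleting the probe file may require a permission beyond Read and Write.

```shell
#!/usr/bin/env bash
# Check that the current user can list an OBS directory and write to it.
# Authenticate as the user being checked first (kinit on Kerberos-enabled clusters).
check_obs_rw() {
  local dir="${1%/}"                  # strip a trailing slash, if any
  local probe="$dir/.rw_probe_$$"     # hypothetical temporary file name
  if ! hdfs dfs -ls "$dir" >/dev/null; then
    echo "cannot list: $dir" >&2
    return 1
  fi
  if ! hdfs dfs -touchz "$probe"; then
    echo "cannot write: $dir" >&2
    return 1
  fi
  # Best-effort cleanup of the probe file.
  hdfs dfs -rm -skipTrash "$probe" >/dev/null
  echo "OK: can list and write $dir"
}
```

For example, after running kinit hadoopuser1, call check_obs_rw with the OBS path that the policy covers, such as obs://bucketname/hadoop1 (bucketname is the example file system name used earlier).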