Accessing OBS Using Flink Through Guardian
After Guardian is interconnected with OBS (see Disabling Ranger OBS Path Authentication for Guardian or Enabling Ranger OBS Path Authentication for Guardian), Flink jobs can access the OBS parallel file system. This section describes how to submit a Flink WordCount job that reads its input from OBS and writes its output to OBS.
Prerequisites
- Flink jobs run on YARN, so ensure that YARN is connected to OBS before interconnecting Flink with OBS.
- The provided OBS parallel file system name/file name must include the full directory hierarchy.
- If Guardian is connected to OBS by following Enabling Ranger OBS Path Authentication for Guardian, ensure that you have the read and write permissions on the OBS path in Ranger. For details about how to grant the permissions, see Ranger Permission Configuration.
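The path requirement above can be checked before a job is submitted. The following is a minimal sketch, not part of the product tooling: it assumes that "full directory hierarchy" means a URI of the form obs://<file system name>/<path>, and the function name is_full_obs_path is hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumption: a complete OBS path has the form
# obs://<parallel file system name>/<directory or file path>.
# is_full_obs_path is a hypothetical helper, not an MRS or OBS command.
is_full_obs_path() {
  case "$1" in
    obs:///*) return 1 ;;     # file system name is empty
    obs://?*/?*) return 0 ;;  # name, a slash, and a non-empty path
    *) return 1 ;;            # missing scheme, name, or path
  esac
}

# Example: report whether a candidate path looks complete.
if is_full_obs_path "obs://demo-fs/tmp/flinkjob/test"; then
  echo "path looks complete"
else
  echo "path is incomplete"
fi
```

The check is purely syntactic. It does not confirm that the file system exists or that Ranger grants access to the path.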
Interconnecting Flink with OBS
- Log in to the Flink client installation node as the client installation user.
For details about how to download and install the cluster client, see Installing an MRS Cluster Client.
- Initialize environment variables.
source Client installation directory/bigdata_env
- Configure the Flink client. For details, see Using the Flink Client.
- Start a session.
- Normal cluster (Kerberos authentication disabled)
yarn-session.sh -nm "session-name" -d
- Security cluster (Kerberos authentication enabled)
- If the paths of the flink.keystore and flink.truststore files are relative ones:
Run the following command in the directory at the same level as ssl to start the session. ssl/ is a relative path.
Go to the conf directory.
cd Client installation directory/Flink/flink/conf
Start a session.
yarn-session.sh -t ssl/ -nm "session-name" -d
If the session starts successfully, information similar to the following is displayed:
...
Cluster started: Yarn cluster with application id application_1624937999496_0017
JobManager Web Interface: http://192.168.1.150:32261
- If the paths of the flink.keystore and flink.truststore files are absolute paths, run the following commands to start a session:
Go to the conf directory.
cd Client installation directory/Flink/flink/conf
Start a session.
yarn-session.sh -nm "session-name" -d
- If Kerberos authentication is enabled for the cluster, authenticate the user. Skip this step if Kerberos authentication is disabled.
kinit Username
- Add the target OBS file system to the Flink CLI and execute the analysis program.
- Create a test file named test in the /tmp directory.
echo -e 'test' >/tmp/test
- Create a directory for storing the test file in the OBS parallel file system.
hdfs dfs -mkdir -p obs://OBS parallel file system name/tmp/flinkjob
- Upload the test file to the OBS parallel file system.
hdfs dfs -put /tmp/test obs://OBS parallel file system name/tmp/flinkjob/
- Run WordCount.jar.
flink run Client installation directory/Flink/flink/examples/batch/WordCount.jar -input obs://OBS parallel file system name/tmp/flinkjob/test -output obs://OBS parallel file system name/tmp/flinkjob/output
- Log in to the OBS console and view the execution result of WordCount.jar in the output path specified when running WordCount.jar (step 6.d).
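The commands in step 6 all share the same OBS file system name and client installation directory, so it can help to generate them from two values and review them before running anything. The following is a minimal sketch under that assumption: print_wordcount_commands is a hypothetical helper, and demo-fs and /opt/client are placeholder values, not values from this guide. The script only prints the commands and does not execute them.

```shell
#!/bin/sh
# Sketch only: prints the step 6 commands for review; nothing is executed.
# print_wordcount_commands is a hypothetical helper.
# Usage: print_wordcount_commands <OBS parallel file system name> <client installation directory>
print_wordcount_commands() {
  job_dir="obs://$1/tmp/flinkjob"
  jar="$2/Flink/flink/examples/batch/WordCount.jar"

  # a. Create the local test file.
  printf '%s\n' "echo -e 'test' >/tmp/test"
  # b. Create the OBS directory that stores the test file.
  printf '%s\n' "hdfs dfs -mkdir -p ${job_dir}"
  # c. Upload the test file to OBS.
  printf '%s\n' "hdfs dfs -put /tmp/test ${job_dir}/"
  # d. Run WordCount.jar with OBS input and output paths.
  printf '%s\n' "flink run ${jar} -input ${job_dir}/test -output ${job_dir}/output"
}

# Placeholder values; replace them with your own before use.
print_wordcount_commands "demo-fs" "/opt/client"
```

After checking the printed commands against your environment, run them one at a time on the client node in the same shell where bigdata_env was sourced.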
Ranger Permission Configuration
- Log in to FusionInsight Manager of the MRS cluster.
For details about how to log in to FusionInsight Manager, see Accessing MRS Manager.
- Choose System > Permission > User Group. On the displayed page, click Create User Group.
- Create a user group without a role, for example, obs_flink, and bind the user group to the corresponding user.
- Log in to the Ranger management page as the rangeradmin user.
- On the home page, click the OBS component plug-in name in the EXTERNAL AUTHORIZATION area.
- Click Add New Policy and grant the Read and Write permissions on the OBS paths to the user group created in step 3. If no OBS path exists, create one in advance. The OBS path cannot contain wildcard characters (*).
Figure 1 Granting the Flink user group permissions for reading and writing OBS paths
Before configuring permission policies for OBS paths on Ranger, ensure that the AccessLabel function has been enabled for OBS. If the function is not enabled, manually enable it. For details, contact OBS O&M personnel.