Updated on 2025-12-10 GMT+08:00

Accessing OBS Using Flink Through Guardian

After interconnecting Guardian with OBS by following Disabling Ranger OBS Path Authentication for Guardian or Enabling Ranger OBS Path Authentication for Guardian, you can access the OBS parallel file system from Flink jobs. This section describes how to submit a Flink WordCount job.

Prerequisites

  • Before interconnecting Flink with OBS, ensure that YARN is connected to OBS, because Flink jobs run on YARN.
  • Any OBS path you provide (parallel file system name and file name) must include the full directory hierarchy.
  • If you interconnected Guardian with OBS by referring to Enabling Ranger OBS Path Authentication for Guardian, ensure that you have the read and write permissions on the OBS path in Ranger. For details about how to obtain the permissions, see Ranger Permission Configuration.
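
The second prerequisite can be checked before a path is passed to any command. The following is a minimal sketch under one interpretation of that prerequisite: a path is accepted only if it has the obs:// scheme, a non-empty parallel file system name, and a non-empty path below it. The function name is_full_obs_path is hypothetical and is not part of any MRS or OBS tool.

```shell
# Sketch only: is_full_obs_path is a hypothetical helper, not an MRS/OBS command.
# Returns 0 if the argument looks like obs://<file system name>/<path>, else 1.
is_full_obs_path() {
  case "$1" in
    obs://*/*) ;;            # scheme, then at least one slash after the name
    *) return 1 ;;
  esac
  rest="${1#obs://}"         # strip the scheme
  fs_name="${rest%%/*}"      # text before the first slash
  path="${rest#*/}"          # text after the first slash
  [ -n "$fs_name" ] && [ -n "$path" ]
}
```

For example, obs://my-fs/tmp/flinkjob/test passes the check, while obs://my-fs and /tmp/flinkjob/test do not.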

Interconnecting Flink with OBS

  1. Log in to the Flink client installation node as the client installation user.
  2. Run the following command to initialize environment variables:

    source Client installation directory/bigdata_env

  3. Configure the Flink client.
  4. Start the session.

    • Normal cluster (Kerberos authentication disabled)
      yarn-session.sh -nm "session-name" -d
    • Security cluster (Kerberos authentication enabled)
      • If the paths of the flink.keystore and flink.truststore files are relative ones:

        Because ssl/ is a relative path, run the start command from the directory that contains the ssl directory.

        Switch to the conf directory.

        cd Client installation directory/Flink/flink/conf

        Start a session.

        yarn-session.sh -t ssl/ -nm "session-name" -d

        If the session starts successfully, information similar to the following is displayed:

        ...
        Cluster started: Yarn cluster with application id application_1624937999496_0017
        JobManager Web Interface: http://192.168.1.150:32261
      • If the paths of the flink.keystore and flink.truststore files are absolute ones, run the following commands to start a session:

        Switch to the conf directory.

        cd Client installation directory/Flink/flink/conf

        Start a session.

        yarn-session.sh -nm "session-name" -d

  5. If Kerberos authentication is enabled for the cluster, run the following command to authenticate the user. Skip this step if Kerberos authentication is disabled.

    kinit Username

  6. Specify the target OBS file system in the Flink CLI commands and run the analysis program.

    1. Create the test file /tmp/test.
      echo -e 'test' >/tmp/test
    2. Create a directory for storing the test file in the OBS parallel file system.
      hdfs dfs -mkdir -p obs://Parallel file system name/tmp/flinkjob
    3. Upload the test file to the OBS parallel file system.
      hdfs dfs -put /tmp/test obs://Parallel file system name/tmp/flinkjob/
    4. Run WordCount.jar.
      flink run Client installation directory/Flink/flink/examples/batch/WordCount.jar -input obs://Parallel file system name/tmp/flinkjob/test -output obs://Parallel file system name/tmp/flinkjob/output

  7. Log in to the OBS console and check the execution result of WordCount.jar in the output path specified in step 6.d.
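
Steps 6.b to 6.d can be chained in a small script so that the parallel file system name is typed only once. The following is a minimal sketch, not an official tool. The variables CLIENT_DIR, OBS_FS, LOCAL_FILE, and DRY_RUN, their default values, and the helper functions run and submit_wordcount are all assumptions introduced for illustration. It assumes that bigdata_env has been sourced, that the session from step 4 is running, and that the local test file already exists.

```shell
# Sketch only: every variable and function below is hypothetical.
CLIENT_DIR="${CLIENT_DIR:-/opt/client}"   # assumed client installation directory
OBS_FS="${OBS_FS:-my-parallel-fs}"        # assumed parallel file system name
LOCAL_FILE="${LOCAL_FILE:-/tmp/test}"     # local test file from step 6.a
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the OBS directory, upload the test file, and submit WordCount.jar.
submit_wordcount() {
  job_dir="obs://$OBS_FS/tmp/flinkjob"
  run hdfs dfs -mkdir -p "$job_dir"
  run hdfs dfs -put "$LOCAL_FILE" "$job_dir/"
  run flink run "$CLIENT_DIR/Flink/flink/examples/batch/WordCount.jar" \
    -input "$job_dir/$(basename "$LOCAL_FILE")" -output "$job_dir/output"
}
```

With the default DRY_RUN=1, calling submit_wordcount only prints the three commands so that the paths can be reviewed. Setting DRY_RUN=0 before the call executes them.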

Ranger Permission Configuration

  1. Log in to FusionInsight Manager of the MRS cluster.

    For details about how to log in to FusionInsight Manager, see Accessing MRS Manager.

  2. Choose System > Permission > User Group. On the displayed page, click Create User Group.
  3. Create a user group without a role, for example, obs_flink, and add the corresponding user to the user group.
  4. Log in to the Ranger management page as user rangeradmin.
  5. On the home page, click the component plug-in name OBS in the EXTERNAL AUTHORIZATION area.
  6. Click Add New Policy and grant the Read and Write permissions on the OBS paths to the user group created in step 3. If no OBS path exists, create one in advance. An OBS path cannot contain the wildcard character (*).

    Figure 1 Granting the Flink user group permissions for reading and writing OBS paths

    Before configuring permission policies for OBS paths in Ranger, ensure that the AccessLabel function has been enabled for OBS. To enable it, contact OBS O&M personnel.
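
Because an OBS path in a policy cannot contain the wildcard character (*), a path can be screened before it is entered on the Ranger page. The following is a minimal sketch. The function name has_wildcard is hypothetical and is not part of Ranger or OBS.

```shell
# Sketch only: has_wildcard is a hypothetical helper, not a Ranger/OBS command.
# Returns 0 if the argument contains a literal asterisk, else 1.
has_wildcard() {
  case "$1" in
    *\**) return 0 ;;   # a literal * appears somewhere in the path
    *) return 1 ;;
  esac
}
```

For example, has_wildcard "obs://my-fs/tmp/*" succeeds, which indicates that the path must be replaced with an explicit one before it is used in a policy.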