Updated on 2024-04-11 GMT+08:00

Interconnecting Flink with OBS

Interconnecting with OBS

  1. Log in to the Flink client installation node as the client installation user.
  2. Run the following command to initialize environment variables:

    source Client installation directory/bigdata_env

  3. Configure the Flink client.
  4. Start a session.

    • Normal cluster (Kerberos authentication disabled)

      yarn-session.sh -nm "session-name" -d

    • Security cluster (Kerberos authentication enabled)
      • If the flink.keystore and flink.truststore file paths are relative paths:

        Run the following commands from the directory that contains the ssl directory to start the session. Here, ssl/ is a relative path.

        cd /opt/hadoopclient/Flink/flink/conf/

        yarn-session.sh -t ssl/ -nm "session-name" -d

        ...
        Cluster started: Yarn cluster with application id application_1624937999496_0017
        JobManager Web Interface: http://192.168.1.150:32261
      • If the flink.keystore and flink.truststore file paths are absolute paths:

        Run the following commands to start a session:

        cd /opt/hadoopclient/Flink/flink/conf/

        yarn-session.sh -nm "session-name" -d

  5. For a security cluster, run the following command to perform user authentication. If Kerberos authentication is not enabled for the current cluster, you do not need to run this command.

    kinit Username

  6. Explicitly specify the OBS file system to be accessed on the Flink command line.

    echo -e 'test' >/tmp/test

    hdfs dfs -mkdir -p obs://Parallel file system name/tmp/flinkjob

    hdfs dfs -put /tmp/test obs://Parallel file system name/tmp/flinkjob/

    flink run Client installation directory/Flink/flink/examples/batch/WordCount.jar -input obs://Parallel file system name/tmp/flinkjob/test -output obs://Parallel file system name/tmp/flinkjob/output
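The commands in step 6 can be wrapped in a small script so that the parallel file system name is set in one place. The following is a minimal sketch, not part of the product: PFS_NAME, CLIENT_DIR, and DRY_RUN are hypothetical variables, my-pfs is a placeholder name, and the script assumes that /tmp/test already exists. By default it only prints the commands it would run.

```shell
# Sketch: parameterized version of the WordCount-on-OBS commands in step 6.
# PFS_NAME, CLIENT_DIR, and DRY_RUN are hypothetical variables, not MRS settings.
PFS_NAME="${PFS_NAME:-my-pfs}"                 # placeholder parallel file system name
CLIENT_DIR="${CLIENT_DIR:-/opt/hadoopclient}"  # client installation directory
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print commands only, 0 = execute

# Build an obs:// URI from a path relative to the file system root.
obs_path() {
  printf 'obs://%s/%s\n' "$PFS_NAME" "${1#/}"
}

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the job directory, upload the input file, and submit WordCount.
wordcount_on_obs() {
  job_dir="$(obs_path tmp/flinkjob)"
  run hdfs dfs -mkdir -p "$job_dir"
  run hdfs dfs -put /tmp/test "$job_dir/"
  run flink run "$CLIENT_DIR/Flink/flink/examples/batch/WordCount.jar" \
    -input "$job_dir/test" -output "$job_dir/output"
}

wordcount_on_obs
```

To execute the commands instead of printing them, set DRY_RUN=0 and PFS_NAME to the real parallel file system name after sourcing bigdata_env.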

  • Flink jobs run on Yarn. Before configuring Flink to interconnect with the OBS file system, ensure that Yarn can already access the OBS file system properly.
  • OBS paths take the form obs://Name of the OBS parallel file system/File name. The path must be specified down to the directory level.
  • If Kerberos authentication has been enabled (security mode) for the cluster, grant the Read and Write permissions on OBS paths to component users in Ranger by referring to Ranger Permission Configuration.
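Before submitting a job, a quick client-side check can catch missing environment variables or permissions early. The following is a sketch under stated assumptions: check_obs_access is a hypothetical helper, not a command shipped with the client. It verifies only that the Hadoop client on this node can list the root of the parallel file system. It does not prove that Yarn containers can reach OBS.

```shell
# Preflight sketch: confirm that this client can list the OBS parallel file system.
# check_obs_access is a hypothetical helper name, not part of the MRS client.
check_obs_access() {
  pfs_name="$1"
  if [ -z "$pfs_name" ]; then
    echo "usage: check_obs_access <parallel-file-system-name>" >&2
    return 2
  fi
  # The hdfs command is only on PATH after bigdata_env has been sourced.
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs not found; source the client's bigdata_env first" >&2
    return 1
  fi
  # Listing the root fails if the configuration or permissions are wrong.
  if hdfs dfs -ls "obs://$pfs_name/" >/dev/null; then
    echo "OBS access OK: obs://$pfs_name/"
  else
    echo "cannot list obs://$pfs_name/" >&2
    return 1
  fi
}
```

For example, run check_obs_access followed by the parallel file system name after sourcing bigdata_env (and after kinit on a security cluster). A nonzero exit status means the check failed.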

Ranger Permission Configuration

  1. Log in to FusionInsight Manager and choose System > Permission > User Group. On the displayed page, click Create User Group.
  2. Create a user group without a role, for example, obs_flink, and bind the user group to the corresponding user.
  3. Log in to the Ranger management page as the rangeradmin user.
  4. On the home page, click the component plug-in name OBS in the EXTERNAL AUTHORIZATION area.
  5. Click Add New Policy to grant the Read and Write permissions on OBS paths to the user group created in step 2. If no OBS path exists, create one in advance (the wildcard character * is not allowed).