Updated on 2023-08-31 GMT+08:00

Preparing for Security Authentication

Scenario

In a cluster with the security mode enabled, the components must be mutually authenticated before communicating with each other to ensure communication security.

Submitting a Flink application requires Flink to communicate with Yarn and HDFS, so security authentication must be configured for the application before it is submitted.

Flink supports authentication and encrypted transmission. This section describes how to prepare for the authentication and encrypted transmission.

Security Authentication

Figure 1 Authentication mode

Flink supports the following authentication modes:

  • Kerberos authentication is used between Flink Yarn client and Yarn ResourceManager, JobManager and ZooKeeper, JobManager and HDFS, TaskManager and HDFS, Kafka and TaskManager, and TaskManager and ZooKeeper.
  • Security cookie authentication is used between Flink Yarn client and JobManager, JobManager and TaskManager, and TaskManager and TaskManager.
  • Internal authentication of Yarn is used between Yarn ResourceManager and ApplicationMaster (AM).
    • Flink JobManager and Yarn ApplicationMaster are in the same process.
    • If security mode is enabled, you must use both Kerberos authentication and security cookie authentication.
Table 1 Authentication methods

Authentication Mode

Configuration Method

Kerberos authentication (Currently, only keytab is supported.)

  1. Download the user keytab file created in User Information for Cluster Authentication from FusionInsight Manager and save it to a directory on the node where the Flink client is deployed.
  2. Configure the following items in the flink-conf.yaml file (Client installation/Flink/flink/conf/flink-conf.yaml):
    1. jobmanager.web.access-control-allow-origin: service IP address of the node where the client is installed; jobmanager.web.allow-access-address: IP address of the Master node
      jobmanager.web.access-control-allow-origin: xx.xx.xxx.xxx,xx.xx.xxx.xxx,xx.xx.xxx.xxx
      jobmanager.web.allow-access-address: xx.xx.xxx.xxx,xx.xx.xxx.xxx,xx.xx.xxx.xxx
      NOTE:

      The service IP address of a node outside the cluster is the IP address of the ECS where the client is installed. To obtain the service IP address of a node in the cluster, perform the following steps:

      In the navigation pane of the MRS console, choose Clusters > Active Clusters, and click a cluster name to switch to the cluster details page. In the Nodes tab, view the IP address of the node where the client is installed.

    2. Keytab path
      security.kerberos.login.keytab: /home/flinkuser/keytab/flinkuser.keytab
      NOTE:

      /home/flinkuser/keytab/ indicates the directory for storing the keytab file.

    3. Username for running the job
      security.kerberos.login.principal: flinkuser
    4. Kerberos authentication configuration required in HA mode when ZooKeeper is configured
      zookeeper.sasl.disable: false
      security.kerberos.login.contexts: Client
    5. Kerberos authentication (if needed) between the Kafka client and Kafka broker
      security.kerberos.login.contexts: Client,KafkaClient
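
The Kerberos items above can be checked before a job is submitted. The following is a minimal sketch, not part of the Flink client: check_kerberos_conf is a hypothetical helper, and the demo file is a throwaway copy of the example values from this section rather than a real flink-conf.yaml.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which Kerberos keys are absent from a flink-conf.yaml.
# A key counts as set when a line starts with "key:" and carries a non-empty value.
check_kerberos_conf() {
  local conf="$1" key missing=0
  for key in security.kerberos.login.keytab \
             security.kerberos.login.principal \
             security.kerberos.login.contexts; do
    if ! awk -v k="$key" '
          index($0, k ":") == 1 {
            v = substr($0, length(k) + 2)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (v != "") found = 1
          }
          END { exit !found }' "$conf"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Demo against a temporary file holding the example values from this section.
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
security.kerberos.login.keytab: /home/flinkuser/keytab/flinkuser.keytab
security.kerberos.login.principal: flinkuser
zookeeper.sasl.disable: false
security.kerberos.login.contexts: Client,KafkaClient
EOF
check_kerberos_conf "$demo_conf" && echo "Kerberos items present"
```

The check only confirms that the keys are present. It does not verify that the keytab file exists or that the principal matches the keytab.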

Security cookie authentication

  1. Place the generate_keystore.sh script in the bin directory on the Flink client and run it to generate the security cookie and the flink.keystore and flink.truststore files. For details, see Authentication and Encryption.

    Run the sh generate_keystore.sh command and enter the user-defined password. The password cannot contain #.

    NOTE:

    After the script is executed, the flink.keystore and flink.truststore files are generated in the conf directory on the Flink client. In the flink-conf.yaml file, set the following configuration items:

    • Set security.ssl.keystore to the absolute path of the flink.keystore file.
    • Set security.ssl.truststore to the absolute path of the flink.truststore file.
    • Set security.cookie to a random password automatically generated by the generate_keystore.sh script.
    • By default, security.ssl.encrypt.enabled is set to false in the flink-conf.yaml file. With this setting, the generate_keystore.sh script sets security.ssl.key-password, security.ssl.keystore-password, and security.ssl.truststore-password to the password entered when the script is called.
    • For MRS 3.x or later, if ciphertext is required and security.ssl.encrypt.enabled is set to true in the flink-conf.yaml file, the generate_keystore.sh script will not set security.ssl.key-password, security.ssl.keystore-password, and security.ssl.truststore-password. To obtain the values, use the Manager plaintext encryption API by running curl -k -i -u Username:Password -X POST -HContent-type:application/json -d '{"plainText":"Password"}' 'https://x.x.x.x:28443/web/api/v2/tools/encrypt'.

      In the preceding command, Username:Password indicates the username and password for logging in to the system. The value of "plainText" is the password that was used to call the generate_keystore.sh script. x.x.x.x indicates the floating IP address of Manager.

  2. Set security.enable to true and configure security cookie.
    security.cookie: ae70acc9-9795-4c48-ad35-8b5adc8071744f605d1d-2726-432e-88ae-dd39bfec40a9
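
The two rules in this part (the password must not contain #, and security.enable must be true with a cookie set) can be checked in advance. The following is a minimal sketch under stated assumptions: password_is_allowed, conf_value, and cookie_auth_is_configured are hypothetical helpers, not functions shipped with the Flink client, and they assume flat "key: value" lines in flink-conf.yaml.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; none of them ships with the Flink client.

# The password passed to generate_keystore.sh must not contain '#'.
password_is_allowed() {
  case "$1" in
    *'#'*) return 1 ;;
    *)     return 0 ;;
  esac
}

# Print the value of a top-level "key: value" line.
# Usage: conf_value <conf-file> <key>
conf_value() {
  awk -v k="$2" 'index($0, k ":") == 1 {
    v = substr($0, length(k) + 2)
    gsub(/^[ \t]+|[ \t]+$/, "", v)
    print v
    exit
  }' "$1"
}

# Succeeds when security.enable is true and security.cookie is non-empty.
cookie_auth_is_configured() {
  [ "$(conf_value "$1" security.enable)" = "true" ] &&
    [ -n "$(conf_value "$1" security.cookie)" ]
}

password_is_allowed 'Example_Passw0rd' && echo "password accepted"
password_is_allowed 'bad#password' || echo "password rejected: contains #"
```

A call such as cookie_auth_is_configured conf/flink-conf.yaml could then be run from the Flink client directory after generate_keystore.sh has finished.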

Internal authentication of Yarn

No configuration is required for this authentication mode.

One Flink cluster belongs to only one user. One user can create multiple Flink clusters.

Encrypted Transmission

Figure 2 Encrypted transmission of Flink

Flink supports the following encrypted transmission modes:

  • Encrypted transmission inside Yarn is used between Flink Yarn client and Yarn ResourceManager, and Yarn ResourceManager and JobManager.
  • SSL transmission is used between Flink Yarn client and JobManager, JobManager and TaskManager, and TaskManager and TaskManager.
  • Internal encrypted transmission of Hadoop is used between JobManager and HDFS, TaskManager and HDFS, JobManager and ZooKeeper, and TaskManager and ZooKeeper.

No configuration is required for the internal encryption of Yarn and Hadoop. Only SSL needs to be configured.

To configure SSL encrypted transmission, perform the following steps to configure the flink-conf.yaml file on the client:

  1. Turn on the SSL switch and set the SSL encryption algorithms. Table 2 describes the parameters. Set them as needed.
    Table 2 Parameters

    Parameter | Example Value | Description
    security.ssl.enabled | true | Enabling the SSL function
    akka.ssl.enabled | true | Enabling Akka SSL
    blob.service.ssl.enabled | true | Enabling SSL for the BLOB channel
    taskmanager.data.ssl.enabled | true | Enabling SSL for communications between TaskManagers
    security.ssl.algorithms | TLS_DHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_DHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 | Setting SSL encryption algorithms

    Enabling SSL for data transmission between TaskManagers may degrade system performance.

  2. In the bin directory of the Flink client, run sh generate_keystore.sh <password>. For details, see Authentication and Encryption. The configuration items in Table 3 are set by default. You can change them as needed.
    Table 3 Parameters

    Parameter | Example Value | Description
    security.ssl.keystore | ${path}/flink.keystore | Path for storing the keystore. flink.keystore indicates the name of the keystore file generated by the generate_keystore.sh tool.
    security.ssl.keystore-password | - | A user-defined password of keystore.
    security.ssl.key-password | - | A user-defined password of the SSL key.
    security.ssl.truststore | ${path}/flink.truststore | Path for storing the truststore. flink.truststore indicates the name of the truststore file generated by the generate_keystore.sh tool.
    security.ssl.truststore-password | - | A user-defined password of truststore.

    The path directory is a user-defined directory that stores the SSL keystore and truststore files. The commands differ depending on whether the path is relative or absolute. For details, see steps 3 and 4.

  3. If the keystore or truststore file path is a relative path, that path must be directly accessible from the Flink client directory where the command is executed. Either of the following methods can be used to transmit the keystore and truststore files:
    • Add the -t option to the Flink yarn-session.sh command to transmit the keystore and truststore files to each execution node. The following is an example:

      cd /opt/client/Flink/flink

      ./bin/yarn-session.sh -t ssl/

    • Add the -yt option to the flink run command to transfer the keystore and truststore files to the execution nodes. The following is an example:

      ./bin/flink run -yt ssl/ -ys 3 -m yarn-cluster -c com.huawei.SocketWindowWordCount ../lib/flink-eg-1.0.jar --hostname r3-d3 --port 9000

      • In the example, ssl/ is a user-defined subdirectory in the Flink client directory, which is used to store configuration files of the SSL keystore and truststore.
      • The relative path of ssl/ must be accessible from the current path where the Flink client command is run.
  4. If the keystore or truststore file path is an absolute path, the keystore and truststore files must exist in that absolute path on the Flink client and on the Yarn nodes.
    Either of the following methods can be used to execute applications. The -t or -yt option does not need to be added to transmit the keystore and truststore files.
    • Run the Flink yarn-session.sh command to execute applications. The following is an example command:

      ./bin/yarn-session.sh

    • Run the flink run command to execute applications. The following is an example command:

      ./bin/flink run -ys 3 -m yarn-cluster -c com.huawei.SocketWindowWordCount ../lib/flink-eg-1.0.jar --hostname r3-d3 --port 9000
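
Steps 3 and 4 differ only in whether the keystore directory has to be shipped with the job. The following is a minimal sketch of that decision, assuming the keystore and truststore sit in the same directory: keystore_ship_args is a hypothetical helper that prints the -yt argument for a relative security.ssl.keystore path and prints nothing for an absolute one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the extra "flink run" argument from the keystore path.
keystore_ship_args() {
  case "$1" in
    /*)  ;;                                   # absolute path: files already exist on all nodes
    */*) printf -- '-yt %s/\n' "${1%/*}" ;;  # relative path: ship the containing directory
    *)   return 1 ;;                          # bare file name: not handled in this sketch
  esac
}

# Relative path, as in step 3: the ssl/ directory is shipped to the execution nodes.
echo "./bin/flink run $(keystore_ship_args ssl/flink.keystore) -ys 3 -m yarn-cluster"
```

For yarn-session.sh, the same directory would be passed with -t instead of -yt.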