Updated on 2022-11-18 GMT+08:00

Scenarios

Assume that a Flink service receives one word record every second.

Develop a Flink application that adds a prefix to the content of each message and outputs the result.

Data Planning

Sample project data of Flink is stored in Kafka. A user with Kafka permission can send data to Kafka and receive data from it.

  1. Ensure that the required cluster services, including HDFS, YARN, Flink, and Kafka, are successfully installed.
  2. Create a topic.
    1. Configure the user's permission to create topics on the server.
      Set the Kafka Broker configuration parameter allow.everyone.if.no.acl.found to true, as shown in Figure 1. Then restart the Kafka service.
      Figure 1 Configuring permissions of topics on the server
    2. Run a command on the Linux command line to create a topic. Before running the command, run the kinit command, for example, kinit flinkuser, for authentication.

      The user flinkuser must have permission to create Kafka topics. For details, see Preparing the Developer Account.

      The command format is as follows:

      bin/kafka-topics.sh --create --zookeeper {zkQuorum}/kafka --partitions {partitionNum} --replication-factor {replicationNum} --topic {Topic}

      Table 1

      Parameter           Description
      {zkQuorum}          ZooKeeper cluster information. The format is IP:port.
      {partitionNum}      The number of partitions for the topic.
      {replicationNum}    The number of copies of each partition for the topic.
      {Topic}             The topic name.

      Assume that the IP:port values of the ZooKeeper cluster nodes are 10.96.101.32:2181, 10.96.101.251:2181, 10.96.101.177:2181, and 10.91.8.160:2181, and that the topic is named topic1. The command for creating the topic is as follows:
      bin/kafka-topics.sh --create --zookeeper 10.96.101.32:2181,10.96.101.251:2181,10.96.101.177:2181,10.91.8.160:2181/kafka --partitions 5 --replication-factor 1 --topic topic1
  3. Security authentication

    You can use Kerberos authentication, SSL encryption, or the combined Kerberos + SSL mode.

    • Configurations about Kerberos authentication
      1. Configuration on the client.

        In the configuration file flink-conf.yaml, add configurations about Kerberos authentication as follows:

        security.kerberos.login.keytab: /home/demo/flink/release/flink-1.12.2/keytab/user.keytab
        security.kerberos.login.principal: flinkuser
        security.kerberos.login.contexts: Client,KafkaClient
        security.kerberos.login.use-ticket-cache: false
      2. Running parameters

        Running parameters about the SASL_PLAINTEXT protocol are as follows:

        --topic topic1 --bootstrap.servers 10.96.101.32:21007 --security.protocol SASL_PLAINTEXT --sasl.kerberos.service.name kafka --kerberos.domain.name hadoop.<System domain name>.com

        In the preceding parameters, 10.96.101.32:21007 indicates the IP address and port of the Kafka server.
    • SSL encryption
      • Configuration on the server.
        Set ssl.mode.enable to true, as shown in Figure 2.
        Figure 2 Configuration on the server
      • Configuration on the client.
        1. Log in to FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > Kafka > More, and click Download Client on the displayed Service Status page to download the Kafka client, as shown in Figure 3.
          Figure 3 Configuration on the client
        2. Use the ca.crt certificate in the root directory of the client to generate the truststore for the client. Run the following command:
          keytool -noprompt -import -alias myservercert -file ca.crt -keystore truststore.jks 

          The command execution result is similar to the following:

        3. Running parameters.

          Use the following running parameters. Ensure that the value of ssl.truststore.password is the same as the password you entered when creating the truststore.

          --topic topic1 --bootstrap.servers 10.96.101.32:9093 --security.protocol SSL --ssl.truststore.location /home/zgd/software/FusionInsight_XXX_Kafka_ClientConfig/truststore.jks --ssl.truststore.password huawei

          In the preceding parameters, 10.96.101.32:9093 indicates the IP address and port of the Kafka server, and XXX indicates the FusionInsight version.

    • Configuration about Kerberos + SSL mode
      After completing the preceding client and server configurations for Kerberos and SSL, change the port number and protocol type in the running parameters to enable the Kerberos + SSL mode.
      --topic topic1 --bootstrap.servers 10.96.101.32:21009 --security.protocol SASL_SSL --sasl.kerberos.service.name kafka --kerberos.domain.name hadoop.<System domain name>.com --ssl.truststore.location /home/zgd/software/FusionInsight_XXX_Kafka_ClientConfig/truststore.jks --ssl.truststore.password huawei
      In the preceding parameters, 10.96.101.32:21009 indicates the IP address and port of the Kafka server, and XXX indicates the FusionInsight version.
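
The running parameters for the three modes above are all passed as "--key value" pairs. Flink applications can read such arguments with ParameterTool.fromArgs; the sketch below is only a plain-Java stand-in that needs no Flink dependency, so that the checks can be tried in isolation. The class name RunningParamsSketch and the helpers parse, requiredKeys, and missingKeys are hypothetical, and the list of keys expected for each security.protocol value is inferred from the three parameter sets shown above, not taken from an official specification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class RunningParamsSketch {

    // Parse "--key value" pairs into Properties (plain-Java stand-in, not Flink's own parser).
    public static Properties parse(String[] args) {
        Properties props = new Properties();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("--") || i + 1 >= args.length) {
                throw new IllegalArgumentException("Expected \"--key value\" at position " + i);
            }
            props.setProperty(args[i].substring(2), args[i + 1]);
        }
        return props;
    }

    // Keys that the examples in this section supply for a given security.protocol value.
    public static List<String> requiredKeys(String protocol) {
        List<String> keys = new ArrayList<>(
                Arrays.asList("topic", "bootstrap.servers", "security.protocol"));
        if (protocol.startsWith("SASL_")) {   // Kerberos: SASL_PLAINTEXT or SASL_SSL
            keys.add("sasl.kerberos.service.name");
            keys.add("kerberos.domain.name");
        }
        if (protocol.endsWith("SSL")) {       // SSL encryption: SSL or SASL_SSL
            keys.add("ssl.truststore.location");
            keys.add("ssl.truststore.password");
        }
        return keys;
    }

    // Return the expected keys that are absent from the parsed parameters.
    public static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : requiredKeys(props.getProperty("security.protocol", ""))) {
            if (props.getProperty(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // SSL-mode running parameters copied from this section.
        String[] sample = {
            "--topic", "topic1",
            "--bootstrap.servers", "10.96.101.32:9093",
            "--security.protocol", "SSL",
            "--ssl.truststore.location",
            "/home/zgd/software/FusionInsight_XXX_Kafka_ClientConfig/truststore.jks",
            "--ssl.truststore.password", "huawei"
        };
        Properties props = parse(sample);
        System.out.println("Missing keys: " + missingKeys(props));
    }
}
```

Running a check of this kind before submitting the job can catch a parameter set that mixes modes, for example a Kerberos + SSL port combined with the SASL_PLAINTEXT protocol and no truststore settings.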

Development Approach

  1. Start the Flink Kafka Producer to send data to Kafka.
  2. Start the Flink Kafka Consumer to receive data from Kafka. Ensure that the Kafka Consumer uses the same topic as the Kafka Producer.
  3. Add a prefix to the data content and print the result.
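
The three steps above can be illustrated with a minimal in-memory sketch. It does not use Flink or Kafka: a BlockingQueue stands in for the Kafka topic, and the class name PrefixPipelineSketch, the method names, the sample words, and the prefix text "Flink says " are all hypothetical, since this section does not specify the actual prefix or record contents. In a real job, the prefixing logic would sit in a map function applied to the stream read from the Kafka consumer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PrefixPipelineSketch {

    // Hypothetical prefix; the source does not state the actual prefix text.
    public static final String PREFIX = "Flink says ";

    // Step 1 stand-in: send each word to the "topic", pausing between records.
    public static void produce(BlockingQueue<String> topic, List<String> words, long intervalMillis) {
        for (String word : words) {
            topic.add(word);
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop producing
                return;
            }
        }
    }

    // Steps 2 and 3 stand-in: read everything from the same "topic" and add the prefix.
    public static List<String> consumeWithPrefix(BlockingQueue<String> topic) {
        List<String> received = new ArrayList<>();
        topic.drainTo(received);
        List<String> prefixed = new ArrayList<>();
        for (String record : received) {
            prefixed.add(PREFIX + record);
        }
        return prefixed;
    }

    public static void main(String[] args) {
        // The producer and consumer share one queue, mirroring the shared Kafka topic.
        BlockingQueue<String> topic1 = new LinkedBlockingQueue<>();
        // One word record per second, as in the scenario.
        produce(topic1, Arrays.asList("apple", "banana", "cherry"), 1000L);
        for (String line : consumeWithPrefix(topic1)) {
            System.out.println(line);
        }
    }
}
```

Passing the same queue to both methods corresponds to step 2's requirement that the consumer and the producer use the same topic; with different queues, the consumer would receive nothing.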