Updated on 2024-08-10 GMT+08:00

Flink Kafka Sample Application Development Roadmap

Scenarios

Assume that a Flink service receives one word record every second.

Develop a Flink application that adds a prefix to the content of each message and outputs the result.

Data Planning

Sample project data of Flink is stored in Kafka. A user with Kafka permission can send data to Kafka and receive data from it.

  1. Ensure that the cluster components, including HDFS, YARN, Flink, and Kafka, are successfully installed.
  2. Create a topic.
    1. Configure the permission for users to create topics on the server.
      Set the Kafka Broker configuration parameter allow.everyone.if.no.acl.found to true, as shown in Figure 1. Then restart the Kafka service.
      Figure 1 Configuring permissions of topics on the server
    2. Run a Linux command to create a topic. Before running the command, ensure that the kinit command, for example, kinit flinkuser, has been run for authentication.

      The user flinkuser must have permission to create Kafka topics. For details, see Preparing MRS Application Development User.

      The command format is as follows:

      bin/kafka-topics.sh --create --zookeeper {zkQuorum}/kafka --partitions {partitionNum} --replication-factor {replicationNum} --topic {Topic}

      Table 1

      Parameter           Description
      {zkQuorum}          ZooKeeper cluster information. The format is IP:port.
      {partitionNum}      The number of partitions for the topic.
      {replicationNum}    The number of replicas of each partition for the topic.
      {Topic}             The topic name.

      Assume that the IP:port values of the ZooKeeper cluster nodes are 10.96.101.32:2181, 10.96.101.251:2181, 10.96.101.177:2181, and 10.91.8.160:2181, and that the topic name is topic1. The command for creating the topic is as follows:
      bin/kafka-topics.sh --create --zookeeper 10.96.101.32:2181,10.96.101.251:2181,10.96.101.177:2181,10.91.8.160:2181/kafka --partitions 5 --replication-factor 1 --topic topic1
  3. Security authentication

    Kerberos authentication, SSL encryption, or the Kerberos + SSL mode can be used.

    • Kerberos authentication configuration
      1. Configuration on the client

        In the configuration file flink-conf.yaml, add configurations about Kerberos authentication as follows:

        security.kerberos.login.keytab: /home/demo/flink/release/flink-1.12.2/keytab/user.keytab
        security.kerberos.login.principal: flinkuser
        security.kerberos.login.contexts: Client,KafkaClient
        security.kerberos.login.use-ticket-cache: false
      2. Running parameters

        Running parameters for the SASL_PLAINTEXT protocol are as follows:

        --topic topic1 --bootstrap.servers 10.96.101.32:21007 --security.protocol SASL_PLAINTEXT --sasl.kerberos.service.name kafka --kerberos.domain.name hadoop.<System domain name>.com

        In this example, 10.96.101.32:21007 is the IP address:port of the Kafka server, and <System domain name> is a placeholder for the system domain name of the cluster.
    • SSL encryption
      • Configuration on the server
        Set ssl.mode.enable to true, as shown in Figure 2.
        Figure 2 Configuration on the server
      • Configuration on the client.
        1. Log in to FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > Kafka > More, and click Download Client on the displayed Service Status page to download the Kafka client, as shown in Figure 3.
          Figure 3 Configuration on the client
        2. Use the ca.crt certificate in the client root directory to generate the truststore for the client. Run the following command:
          keytool -noprompt -import -alias myservercert -file ca.crt -keystore truststore.jks 

          The command execution result is similar to the following:

        3. Running parameters

          Use the following running parameters. Ensure that the value of ssl.truststore.password is the same as the password you entered when creating the truststore.

          --topic topic1 --bootstrap.servers 10.96.101.32:9093 --security.protocol SSL --ssl.truststore.location /home/zgd/software/FusionInsight_XXX_Kafka_ClientConfig/truststore.jks --ssl.truststore.password xxx

          In this example, 10.96.101.32:9093 is the IP address:port of the Kafka server, XXX is the FusionInsight version, and xxx is the truststore password.
    • Kerberos + SSL mode configuration
      After completing the preceding client and server configurations for Kerberos and SSL, change the port number and protocol type in the running parameters to enable the Kerberos + SSL mode.
      --topic topic1 --bootstrap.servers 10.96.101.32:21009 --security.protocol SASL_SSL --sasl.kerberos.service.name kafka --kerberos.domain.name hadoop.<System domain name>.com --ssl.truststore.location /home/zgd/software/FusionInsight_XXX_Kafka_ClientConfig/truststore.jks --ssl.truststore.password xxx
      In this example, 10.96.101.32:21009 is the IP address:port of the Kafka server, <System domain name> is a placeholder for the system domain name of the cluster, XXX is the FusionInsight version, and xxx is the truststore password.
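
The running parameters listed for the three authentication modes are sequences of --key value pairs. As a rough illustration of how an application might read them, the following plain-Java sketch parses such a list into a map and copies everything except the topic into a Properties object that could be handed to a Kafka client. The class RunningParams and its methods are hypothetical names introduced here. The actual sample project may parse its arguments differently, for example with Flink's ParameterTool, which this sketch does not use. Whether a given key is understood by the Kafka client depends on the client build and is not checked here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper that parses "--key value" running parameters. */
public class RunningParams {

    /** Parses arguments of the form "--key value" into an ordered map. */
    public static Map<String, String> parse(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("Expected --key but found: " + args[i]);
            }
            String key = args[i].substring(2);
            // Every key must be followed by a value, not by another key.
            if (i + 1 >= args.length || args[i + 1].startsWith("--")) {
                throw new IllegalArgumentException("Missing value for --" + key);
            }
            params.put(key, args[++i]);
        }
        return params;
    }

    /** Copies every parameter except "topic" into client properties (assumption: no key filtering). */
    public static Properties toClientProperties(Map<String, String> params) {
        Properties props = new Properties();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (!"topic".equals(entry.getKey())) {
                props.setProperty(entry.getKey(), entry.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Values taken from the SASL_PLAINTEXT example in this document.
        String[] demo = {
            "--topic", "topic1",
            "--bootstrap.servers", "10.96.101.32:21007",
            "--security.protocol", "SASL_PLAINTEXT",
            "--sasl.kerberos.service.name", "kafka"
        };
        Map<String, String> params = parse(demo);
        Properties props = toClientProperties(params);
        System.out.println("topic = " + params.get("topic"));
        System.out.println("security.protocol = " + props.getProperty("security.protocol"));
    }
}
```

Switching between the Kerberos, SSL, and Kerberos + SSL modes then only changes the argument list, not the parsing code.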

Development Approach

  1. Start the Flink Kafka Producer to send data to Kafka.
  2. Start the Flink Kafka Consumer to receive data from Kafka. Ensure that the topic of the Kafka Consumer is the same as that of the Kafka Producer.
  3. Add a prefix to the data content and print the result.
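
The data flow of these three steps can be sketched in plain Java without a cluster. The example below is only a stand-in, not the Flink or Kafka API: an in-memory queue plays the role of the Kafka topic, and the class name PrefixPipelineSketch, its methods, and the prefix text are all hypothetical, since this document does not specify the actual prefix. The one-second send interval from the scenario is omitted to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Plain-Java stand-in for the producer, consumer, and prefix steps. */
public class PrefixPipelineSketch {

    // Hypothetical prefix; the real sample may use a different one.
    static final String PREFIX = "Flink says: ";

    /** Step 3: add the prefix to one message. */
    public static String addPrefix(String message) {
        return PREFIX + message;
    }

    /** Step 1: the "producer" writes each word to the in-memory topic. */
    public static void produce(Queue<String> topic, List<String> words) {
        for (String word : words) {
            topic.add(word);
        }
    }

    /** Steps 2 and 3: the "consumer" drains the topic and prefixes each message. */
    public static List<String> consume(Queue<String> topic) {
        List<String> results = new ArrayList<>();
        String message;
        while ((message = topic.poll()) != null) {
            results.add(addPrefix(message));
        }
        return results;
    }

    public static void main(String[] args) {
        // Producer and consumer share the same queue, mirroring the shared topic name.
        Queue<String> topic1 = new ConcurrentLinkedQueue<>();
        produce(topic1, Arrays.asList("hello", "flink", "kafka"));
        for (String line : consume(topic1)) {
            System.out.println(line);
        }
    }
}
```

In the real application the queue would be replaced by a Kafka topic, and the producer and consumer would run as separate Flink jobs that are given the same topic name.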