Updated on 2024-10-11 GMT+08:00

Consumer Is Initialized Successfully, But the Specified Topic Message Cannot Be Obtained from Kafka

Symptom

An MRS cluster is installed, and ZooKeeper, Flume, Kafka, Storm, and Spark are installed in the cluster.

Messages of the specified Kafka topic cannot be consumed using Storm, Spark, Flume, or user-written Consumer code.

Possible Causes

  1. The Kafka service is abnormal.
  2. The IP address for ZooKeeper connection is incorrectly set.
  3. "ConsumerRebalanceFailedException" is thrown.
  4. "ClosedChannelException" caused by network problems is thrown.

Cause Analysis

In this section, Storm, Spark, Flume, and user-written Consumer code are collectively referred to as the Consumer.

  1. Check the Kafka service status:
    • MRS Manager: Log in to MRS Manager and choose Services > Kafka. Check the Kafka status. The status is Good, and the monitoring metrics are correctly displayed.
    • FusionInsight Manager: Log in to FusionInsight Manager and choose Cluster > Name of the target cluster > Service > Kafka.

      Check the Kafka status. It is found that the status is good and the monitoring metrics are correctly displayed.

  2. Check whether data can be normally consumed through the Kafka client.

    Suppose the client has been installed in the /opt/client directory, test is the topic name to be consumed, and the IP address of ZooKeeper is 192.168.234.231.

    cd /opt/client
    source bigdata_env
    kinit admin
    kafka-topics.sh --zookeeper 192.168.234.231:2181/kafka --describe --topic test
    kafka-console-consumer.sh --topic test --zookeeper 192.168.234.231:2181/kafka --from-beginning

    If data can be consumed, the cluster service is running properly.

  3. Check the Consumer configurations. In the following examples, the address for connecting to ZooKeeper is incorrect.
    • Flume
      server.sources.Source02.type=org.apache.flume.source.kafka.KafkaSource
      server.sources.Source02.zookeeperConnect=192.168.234.231:2181
      server.sources.Source02.topic = test
      server.sources.Source02.groupId = test_01
    • Spark
      val zkQuorum = "192.168.234.231:2181"
    • Storm
      BrokerHosts brokerHosts = new ZKHosts("192.168.234.231:2181");
    • Consumer API
      zookeeper.connect="192.168.234.231:2181"

    In MRS, the root ZNode under which Kafka stores its data on ZooKeeper is /kafka, which differs from the open-source default. The address that Kafka uses to connect to ZooKeeper is therefore 192.168.234.231:2181/kafka.

    However, the Consumer is configured to connect to ZooKeeper at 192.168.234.231:2181, without the /kafka path. Therefore, it cannot correctly obtain the Kafka topic information.

    For details about the solution, see step 1 in Solution.

  4. Check Consumer logs. The logs contain "ConsumerRebalanceFailedException".
    2016-02-03 15:55:32,557 | ERROR | [ZkClient-EventThread-75- 192.168.234.231:2181/kafka] |  Error handling event ZkEvent[New session event sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@34b41dfe]  | org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:77)
    kafka.common.ConsumerRebalanceFailedException: pc-zjqbetl86-1454482884879-2ec95ed3 can't rebalance after 4 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:633)
    at kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener.handleNewSession(ZookeeperConsumerConnector.scala:487)
    at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
    at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)

    The exception shows that the Consumer did not complete the rebalance within the specified number of retries. As a result, no Kafka topic partitions are allocated to the Consumer, and it cannot consume messages.

    For details about the solution, see step 3 in Solution.

  5. Check Consumer logs. The logs contain "Fetching topic metadata with correlation id 0 for topics [Set(test)] from broker [id:26,host:192-168-234-231,port:9092] failed" and "ClosedChannelException".
    [2016-03-04 03:33:53,047] INFO Fetching metadata from broker id:26,host: 192-168-234-231,port:9092 with correlation id 0 for 1 topic(s) Set(test) (kafka.client.ClientUtils$)
    [2016-03-04 03:33:55,614] INFO Connected to 192-168-234-231:21005 for producing (kafka.producer.SyncProducer)
    [2016-03-04 03:33:55,614] INFO Disconnecting from 192-168-234-231:21005 (kafka.producer.SyncProducer)
    [2016-03-04 03:33:55,615] WARN Fetching topic metadata with correlation id 0 for topics [Set(test)] from broker [id:26,host: 192-168-234-231,port:21005] failed (kafka.client.ClientUtils$)
    java.nio.channels.ClosedChannelException
    at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73)
    at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:113)
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:58)
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:93)
    at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
    [2016-03-04 03:33:55,615] INFO Disconnecting from 192-168-234-231:21005 (kafka.producer.SyncProducer)

    The exception shows that the Consumer cannot obtain metadata from the Kafka Broker node 192-168-234-231 and therefore cannot connect to the correct Broker to obtain messages.

  6. Check the network conditions. If the network is normal, check whether the mapping between the hostname and the IP address is configured.
    • Linux

      Run the cat /etc/hosts command.

      Figure 1 Example 1
    • Windows

      Open C:\Windows\System32\drivers\etc\hosts.

      Figure 2 Example 2

      For details about the solution, see step 4 in Solution.
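
The hostname mapping check in step 6 can also be scripted. The following Python sketch parses the text of a hosts file and returns the IP address mapped to a given hostname, if any. The function name find_host_mapping and the sample file content are illustrative assumptions, not part of MRS or Kafka; the hostname only mirrors the Broker name shown in the logs above.

```python
from typing import Optional


def find_host_mapping(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the IP address mapped to hostname in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        fields = entry.split()
        ip, names = fields[0], fields[1:]
        # Hostnames are compared case-insensitively.
        if hostname.lower() in (name.lower() for name in names):
            return ip
    return None


if __name__ == "__main__":
    # Hypothetical hosts-file content, for illustration only.
    sample = """
    127.0.0.1        localhost
    # 192.168.234.231  192-168-234-231   (commented out, so ignored)
    """
    # No active mapping exists in the sample, so this prints None.
    print(find_host_mapping(sample, "192-168-234-231"))
```

To check a real Linux host, pass the content of /etc/hosts as hosts_text. A result of None suggests that the mapping described in step 6 is missing.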

Solution

  1. The IP address for connecting to ZooKeeper is incorrectly configured.
  2. Change the ZooKeeper connection address in the Consumer configuration so that it is consistent with the MRS configuration, that is, append the /kafka path.

    • Flume
      server.sources.Source02.type=org.apache.flume.source.kafka.KafkaSource
      server.sources.Source02.zookeeperConnect=192.168.234.231:2181/kafka
      server.sources.Source02.topic = test
      server.sources.Source02.groupId = test_01
    • Spark
      val zkQuorum = "192.168.234.231:2181/kafka"
    • Storm
      BrokerHosts brokerHosts = new ZKHosts("192.168.234.231:2181/kafka");
    • Consumer API
      zookeeper.connect="192.168.234.231:2181/kafka"

  3. Rebalance is abnormal.

    When multiple Consumers in the same consumer group are started one after another and consume data from multiple partitions at the same time, load balancing (rebalance) is performed among the Consumers if there are fewer Consumers than partitions.

    The ephemeral node that a Consumer creates on ZooKeeper determines which partition of which topic that Consumer is allowed to read and write. The path is /consumers/consumer-group-xxx/owners/topic-xxx/x.

    After load balancing is triggered, the partition assignment of the original Consumers is recalculated and they release the partitions they occupy, which takes a while. Therefore, new Consumers may fail to preempt the partitions.

    Table 1 Parameters

    Name                          Function                                                      Default Value
    rebalance.max.retries         Maximum number of rebalance retries                           4
    rebalance.backoff.ms          Interval for each rebalance retry (ms)                        2000
    zookeeper.session.timeout.ms  Maximum time allowed to create a session with ZooKeeper (ms)  15000

    Set the preceding parameters to higher values. The following is for your reference:

    zookeeper.session.timeout.ms = 45000
    rebalance.max.retries = 10
    rebalance.backoff.ms = 5000

    Parameter setting must comply with the following rule:

    rebalance.max.retries * rebalance.backoff.ms > zookeeper.session.timeout.ms

  4. The network is abnormal.

    The mapping between the hostname and the IP address is not configured in the hosts file. As a result, information cannot be obtained when the Broker is accessed by hostname.

  5. Add the hostname to the hosts file and make it correspond to the IP address.

    • Linux
      Figure 3 Example 3
    • Windows
      Figure 4 Example 4
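
The parameter rule in step 3 can be verified with simple arithmetic before the values are applied. The following Python sketch checks the rule for the default values in Table 1 and for the reference values given in step 3. The function name rebalance_window_ok is a hypothetical helper, not a Kafka or MRS API.

```python
def rebalance_window_ok(max_retries: int, backoff_ms: int, session_timeout_ms: int) -> bool:
    """Check rebalance.max.retries * rebalance.backoff.ms > zookeeper.session.timeout.ms."""
    return max_retries * backoff_ms > session_timeout_ms


if __name__ == "__main__":
    # Default values from Table 1: 4 * 2000 = 8000, which is not greater than 15000.
    print(rebalance_window_ok(4, 2000, 15000))   # False
    # Reference values from step 3: 10 * 5000 = 50000, which is greater than 45000.
    print(rebalance_window_ok(10, 5000, 45000))  # True
```

The default values do not satisfy the rule, whereas the reference values do.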