
Using Kafka Clients

Consumers

  1. Ensure that the owner thread does not exit abnormally. Otherwise, the client may fail to initiate consumption requests and the consumption will be blocked.
  2. Commit messages only after they have been processed. If a message is committed before it is processed and the processing then fails, that message cannot be polled again.
  3. A large number of OFFSET_COMMIT requests causes high CPU usage. For example, if a poll returns 1000 messages and each message is committed separately, the commit TPS is 1000 times that of the poll; the smaller the message body, the larger this difference. Therefore, you are not advised to commit every message separately. Instead, commit a specific number of messages in batches, or enable enable.auto.commit. Note that if the client breaks down, cached consumption offsets that have not yet been committed may be lost, resulting in repeated consumption, so choose the commit batch size based on service requirements (see the consumer sketch after this list).
  4. A consumer cannot frequently join or leave a group. Otherwise, the consumer will frequently perform rebalancing, which blocks consumption.
  5. The number of consumers in a group cannot be greater than the number of partitions in the topic. Otherwise, the extra consumers will not be assigned any partitions and will not poll any messages.
  6. Ensure that the consumer polls at regular intervals to keep sending heartbeats to the server. If the consumer stops sending heartbeats for long enough, the consumer session will time out and the consumer will be considered to have stopped. This will also block consumption.
  7. Ensure that there is a limitation on the size of messages buffered locally to avoid an out-of-memory (OOM) situation.
  8. Set the timeout for the consumer session to 30 seconds: session.timeout.ms=30000.
  9. Kafka cannot guarantee exactly-once delivery, and messages may occasionally be delivered more than once. Therefore, ensure that message processing is idempotent on the service side.
  10. Always close the consumer before exiting. Otherwise, other consumers in the same group may be blocked until the timeout set by session.timeout.ms expires.
  11. Do not start a consumer group name with a special character, such as a number sign (#). Otherwise, monitoring data of the consumer group cannot be displayed.
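
The following is a minimal sketch of a consumer that follows these suggestions, written with the Java client. The broker address, topic name, and consumer group name are placeholders: automatic commit is disabled, offsets are committed once per poll after the records have been processed, session.timeout.ms is set to 30000, the number of records returned by each poll is capped, and the consumer is closed before exiting.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BatchCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker address and group name; replace them with your own.
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Commit offsets manually, in batches, only after the records have been processed.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        // Set the consumer session timeout to 30 seconds.
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
        // Cap the number of records returned by one poll to limit local buffering.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "500");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        try {
            consumer.subscribe(Collections.singletonList("example-topic"));
            while (true) {
                // Poll at regular intervals; a consumer that stops polling for too long
                // is considered dead and triggers a rebalance.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
                for (ConsumerRecord<String, String> record : records) {
                    process(record);
                }
                // One OFFSET_COMMIT request per poll instead of one per message.
                if (!records.isEmpty()) {
                    consumer.commitSync();
                }
            }
        } finally {
            // Close the consumer so the group rebalances immediately instead of
            // waiting for session.timeout.ms to expire.
            consumer.close();
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // Service logic goes here; keep it idempotent in case a message is redelivered.
    }
}

Committing once per poll keeps the OFFSET_COMMIT rate proportional to the poll rate rather than to the message rate.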

Producers

  1. Synchronous replication: Set acks to all.
  2. Retry message sending: Set retries to 3.
  3. Optimize message sending: For latency-sensitive messages, set linger.ms to 0. For latency-insensitive messages, set linger.ms to a value ranging from 100 to 1000.
  4. Ensure that the producer has sufficient JVM memory to avoid blockages.
  5. Set the message timestamp to the current local time. Messages whose timestamp is a future time will fail to be aged (deleted) when the retention period expires. See the producer sketch after this list.
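
The following is a minimal sketch of these producer settings with the Java client. The broker address, topic name, key, and value are placeholders; linger.ms is set to 0 here on the assumption that the service is latency-sensitive.

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class RecommendedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker address; replace it with your own.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Synchronous replication: wait until all in-sync replicas acknowledge the write.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // Retry failed sends 3 times.
        props.put(ProducerConfig.RETRIES_CONFIG, "3");
        // Latency-sensitive service: send immediately. Use 100-1000 for latency-insensitive services.
        props.put(ProducerConfig.LINGER_MS_CONFIG, "0");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Use the current local time as the message timestamp.
            ProducerRecord<String, String> record = new ProducerRecord<>(
                    "example-topic", null, System.currentTimeMillis(), "key", "value");
            producer.send(record);
        }
    }
}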

Topics

Recommended topic configurations: Use 3 replicas, enable synchronous replication, and set the minimum number of in-sync replicas (min.insync.replicas) to 2. The minimum number of in-sync replicas must be smaller than the number of replicas of the topic. Otherwise, if one replica becomes unavailable, messages cannot be produced.
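
For example, a topic that follows these recommendations could be created with the Java AdminClient as sketched below. The broker address, topic name, and partition count are placeholders.

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateRecommendedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder broker address; replace it with your own.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions and 3 replicas; min.insync.replicas=2 is smaller than the
            // replica count, so production with acks=all survives the loss of one replica.
            NewTopic topic = new NewTopic("example-topic", 3, (short) 3)
                    .configs(Map.of("min.insync.replicas", "2"));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}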

You can enable or disable automatic topic creation. If it is enabled, a topic will be automatically created with 3 partitions and 3 replicas when a message is produced in or consumed from a topic that does not exist.

The recommended maximum number of partitions for a topic is 100.

Each topic can have 3 replicas (the number of replicas cannot be modified once configured).

Other Suggestions

Maximum number of connections: 3000

Maximum size of a message: 10 MB

Access Kafka using SASL_SSL. Ensure that your DNS service can resolve an IP address to a domain name, or map all Kafka broker IP addresses to host names in the hosts file, so that Kafka clients do not have to perform reverse resolution themselves. Otherwise, connections may fail to be established.
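
The following sketch shows the kind of client properties typically used for SASL_SSL access with the Java client. The broker domain names, SASL mechanism, credentials, and truststore path are placeholders that depend on how your instance is set up; these properties are added on top of the normal producer or consumer configuration.

import java.util.Properties;

public class SaslSslClientConfig {
    public static Properties build() {
        Properties props = new Properties();
        // Use broker domain names that your DNS service or hosts file can resolve; placeholders below.
        props.put("bootstrap.servers", "broker1.example.com:9093,broker2.example.com:9093");
        props.put("security.protocol", "SASL_SSL");
        // The SASL mechanism (for example, PLAIN or SCRAM-SHA-512) depends on the instance settings.
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"your-username\" password=\"your-password\";");
        // Trust store containing the broker certificate; path and password are placeholders.
        props.put("ssl.truststore.location", "/path/to/client.truststore.jks");
        props.put("ssl.truststore.password", "truststore-password");
        return props;
    }
}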

Apply for disk space larger than twice the size of the service data multiplied by the number of replicas. In other words, keep at least 50% of the disk space free.

Avoid frequent full GC in JVM. Otherwise, message production and consumption will be blocked.