Updated on 2022-02-22 GMT+08:00

Kafka HA Usage Description

Kafka High Reliability and Availability

The Kafka message transmission assurance mechanism guarantees message delivery once the required parameters are set. The parameters can be tuned to meet different performance and reliability requirements.

  • Kafka high availability and high performance

    If HA and high performance are required, configure parameters listed in the following table.

    Parameter: unclean.leader.election.enable
    Default Value: true
    Description: Specifies whether a replica that is not in the ISR can be selected as the leader. If this parameter is set to true, data may be lost.

    Parameter: auto.leader.rebalance.enable
    Default Value: true
    Description: Specifies whether the leader automated balancing function is used. If this parameter is set to true, the controller periodically balances the leader of each partition on all nodes and assigns the leader to a replica with a higher priority.

    Parameter: acks
    Default Value: 1
    Description: Specifies the acknowledgments that the Producer requires from the leader before a request is considered complete. This parameter affects message reliability and performance.

    • If this parameter is set to 0, the Producer does not wait for any response from the server and considers the message sent successfully as soon as it is sent.
    • If this parameter is set to 1, the leader responds as soon as it has written the data to its own log, without waiting for the other replicas to be written. In this case, if the leader becomes abnormal after it acknowledges the message but before replica synchronization is complete, data will be lost.
    • If this parameter is set to -1 (all), the message is acknowledged only after all in-sync replicas have confirmed it. If min.insync.replicas is also configured, the message is written to at least that many replicas. In this case, as long as one in-sync replica remains active, the record is not lost.

    NOTE:

    This parameter is configured in the Kafka client configuration file.

    Parameter: min.insync.replicas
    Default Value: 1
    Description: Specifies the minimum number of replicas to which data must be written when acks is set to -1 for the Producer.

    Impact of HA and high performance configurations:

    After HA and high performance are configured, data reliability decreases. Specifically, data may be lost if disks or nodes are faulty.
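The HA and high-performance settings above can be written out as configuration fragments. The following is a minimal sketch, not a definitive layout: the values are the defaults listed above, the split into broker-side and producer-side settings follows the note that acks is a client setting, and the names BROKER_HA_PERF, PRODUCER_HA_PERF, and to_properties are hypothetical.

```python
# Sketch only: values are the defaults listed in this section.
# The dictionary and function names are hypothetical.

# Settings applied on the Kafka service (broker) side.
BROKER_HA_PERF = {
    "unclean.leader.election.enable": "true",
    "auto.leader.rebalance.enable": "true",
    "min.insync.replicas": "1",
}

# Settings applied in the Kafka client (Producer) configuration file.
PRODUCER_HA_PERF = {
    "acks": "1",
}


def to_properties(settings):
    """Render a dict as key=value lines, one per setting, in insertion order."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    print("# service side")
    print(to_properties(BROKER_HA_PERF))
    print("# producer side")
    print(to_properties(PRODUCER_HA_PERF))
```

Keeping the two groups separate makes it harder to place acks in the service configuration by mistake, where it would have no effect on the Producer.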

  • Kafka high reliability configuration

    If high data reliability is required, configure parameters listed in the following table.

    Parameter: unclean.leader.election.enable
    Recommended Value: false
    Description: Specifies whether a replica that is not in the ISR list can be elected as the leader.

    Parameter: acks
    Recommended Value: -1
    Description: Specifies the acknowledgments that the Producer requires from the leader before a request is considered complete.

    If this parameter is set to -1, the message is considered successfully received only after all replicas in the ISR list have confirmed receipt. The min.insync.replicas parameter must also be set to ensure that multiple replicas are written successfully. As long as one replica is active, the record is not lost.

    NOTE:

    This parameter is configured in the Kafka client configuration file.

    Parameter: min.insync.replicas
    Recommended Value: 2
    Description: Specifies the minimum number of replicas to which data must be written when acks is set to -1 for the Producer.

    Ensure that the value of min.insync.replicas is equal to or less than that of replication.factor.

    Impact of high reliability configurations:

    • Deteriorated performance

      A message is acknowledged only after all replicas in the ISR list have received it and the minimum number of replicas has been written successfully. As a result, the latency of a single message increases and the processing capability of the client decreases. The actual performance depends on onsite test data.

    • Reduced availability

      A replica that is not in the ISR list cannot be elected as a leader. If the leader goes offline and other replicas are not in the ISR list, the partition remains unavailable until the leader node recovers.

      A write must be confirmed by all replicas in the ISR list, and the ISR list must contain at least the minimum number of replicas. When the node hosting a replica of a partition is faulty, the minimum number of in-sync replicas may no longer be met. As a result, service writes fail.
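The availability trade-off described above can be illustrated with a small model. This is a simplified sketch under stated assumptions, not Kafka's actual implementation: it only captures the rule that a write with acks set to -1 is rejected when the ISR list is smaller than min.insync.replicas, and all function names are hypothetical.

```python
def validate_replica_settings(replication_factor, min_insync_replicas):
    """Reject settings where min.insync.replicas exceeds replication.factor."""
    if min_insync_replicas < 1:
        raise ValueError("min.insync.replicas must be at least 1")
    if min_insync_replicas > replication_factor:
        raise ValueError("min.insync.replicas must not exceed replication.factor")


def write_accepted(isr_size, min_insync_replicas, acks):
    """Simplified model: is a produce request accepted for one partition?"""
    if isr_size < 1:
        return False  # no in-sync replica can act as the leader
    if acks == -1:
        # acks=-1 additionally requires enough replicas in the ISR list
        return isr_size >= min_insync_replicas
    return True  # acks=0 or acks=1 only needs the leader


def tolerable_replica_failures(replication_factor, min_insync_replicas):
    """Replicas that can be offline while acks=-1 writes still succeed."""
    validate_replica_settings(replication_factor, min_insync_replicas)
    return replication_factor - min_insync_replicas


if __name__ == "__main__":
    # Three replicas with min.insync.replicas=2, as recommended above.
    print(tolerable_replica_failures(3, 2))
    print(write_accepted(isr_size=2, min_insync_replicas=2, acks=-1))
    print(write_accepted(isr_size=1, min_insync_replicas=2, acks=-1))
```

With replication.factor set to 2 and min.insync.replicas set to 2, the model tolerates no replica failure at all, which is why the replication factor is usually chosen larger than min.insync.replicas.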

Configuration Impact

Evaluate reliability and performance requirements based on service scenarios and configure the parameters accordingly.

  • For valuable data, you are advised to configure RAID 1 or RAID 5 for the Kafka data directory disks to improve data reliability in case a single disk becomes faulty.
  • The acks parameter is named differently in different Producer APIs.
    • New Producer API

      Indicates the interface defined in org.apache.kafka.clients.producer.KafkaProducer. The acks parameter name remains unchanged for this API.

    • Old Producer API

      Indicates the interface defined in kafka.producer.Producer. The acks parameter is named request.required.acks for this API.
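The naming difference between the two Producer APIs can be captured in a small lookup. This is an illustrative sketch only: the property names come from the text above, while the "new" and "old" labels and the function name are hypothetical.

```python
# Property name that carries the acks setting in each Producer API.
# The "new"/"old" keys are labels chosen for this sketch.
ACKS_PROPERTY = {
    "new": "acks",                   # org.apache.kafka.clients.producer.KafkaProducer
    "old": "request.required.acks",  # kafka.producer.Producer
}


def producer_acks_setting(api, acks):
    """Return the single-entry config dict that sets acks for the given API."""
    if api not in ACKS_PROPERTY:
        raise ValueError(f"unknown Producer API label: {api!r}")
    return {ACKS_PROPERTY[api]: str(acks)}


if __name__ == "__main__":
    print(producer_acks_setting("new", -1))
    print(producer_acks_setting("old", -1))
```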

  • For parameters that can be modified at the topic level, the service-level configurations are used by default. These parameters can be configured separately for each topic based on its reliability requirements.

    For example, you can configure the reliability parameters of the topic named test.

    kafka-topics.sh --zookeeper 192.168.1.205:2181/kafka --alter --topic test --config unclean.leader.election.enable=false --config min.insync.replicas=2

    192.168.1.205 indicates the ZooKeeper service IP address.
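When the same change has to be applied to several topics, the command above can be assembled programmatically. The following is a minimal sketch that only builds the argument list using the flags shown in the command above. The function name is hypothetical, and actually running the command (for example with subprocess.run) assumes that kafka-topics.sh is on the PATH of a Kafka client node.

```python
import shlex


def build_alter_topic_command(zookeeper, topic, configs):
    """Build the kafka-topics.sh argument list for topic-level settings."""
    command = ["kafka-topics.sh", "--zookeeper", zookeeper,
               "--alter", "--topic", topic]
    for key, value in configs.items():
        # One --config flag per topic-level parameter.
        command += ["--config", f"{key}={value}"]
    return command


if __name__ == "__main__":
    command = build_alter_topic_command(
        "192.168.1.205:2181/kafka",
        "test",
        {"unclean.leader.election.enable": "false", "min.insync.replicas": "2"},
    )
    # Print the command line instead of executing it.
    print(shlex.join(command))
```

Returning a list rather than a single string lets the result be passed to subprocess.run without shell quoting issues.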

  • If modifying a service-level configuration requires a restart of Kafka, you are advised to modify the service-level configuration on the change page.