Updated on 2023-09-15 GMT+08:00

Handling Uneven Service Data

Introduction

Kafka divides each topic into multiple partitions for distributed message storage. Each partition has one or more replicas distributed across different brokers, and each replica stores a full copy of the partition's data. Messages are synchronized among replicas. The following figure shows the relationships between topics, partitions, replicas, and brokers.

Service data can become unevenly distributed among brokers and partitions, which lowers the performance and resource utilization of Kafka clusters.

Causes of uneven service data

  • The traffic of some topics is much heavier than that of others.
  • Producers specify partitions when sending messages, so partitions that are never specified stay empty.
  • Producers specify message keys, which routes messages with the same key to specific partitions.
  • The system re-implements the partition allocation policy, and the new policy is flawed.
  • New Kafka brokers have been added but have no partitions allocated to them.
  • Cluster changes trigger leader replica switchovers and migrations, which increase the data on some brokers.
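The effect of message keys on partition load can be illustrated with a small simulation. This is a minimal sketch, not Kafka's actual partitioner: it uses CRC32 as a stand-in hash (Kafka's default partitioner hashes keys with murmur2), and the key names, message counts, and partition count are hypothetical.

```python
import zlib
from collections import Counter


def pick_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition with hash(key) mod N.

    CRC32 is a stand-in here; Kafka's default partitioner uses murmur2.
    """
    return zlib.crc32(key) % num_partitions


def partition_counts(keys, num_partitions):
    """Return the number of messages that land in each partition."""
    counts = Counter(pick_partition(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical workload: one hot key produces 900 of 1000 messages.
NUM_PARTITIONS = 6
keys = [b"tenant-1"] * 900 + [f"tenant-{i}".encode() for i in range(2, 102)]

counts = partition_counts(keys, NUM_PARTITIONS)
average = sum(counts) / NUM_PARTITIONS
skew = max(counts) / average  # 1.0 would mean a perfectly even spread

print("messages per partition:", counts)
print("busiest partition holds %.1fx the average load" % skew)
```

Because every message with the same key maps to the same partition, the partition that receives the hot key holds at least 900 messages no matter which hash function is used.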

Solution

Handling uneven service data:

  • Optimize the topic design. If a topic carries a very large volume of data, split the data across multiple topics.
  • Have producers send messages evenly across partitions.
  • When creating topics, distribute leader replicas evenly across brokers.
  • Reassign partitions. Kafka supports partition reassignment, which lets you move replicas to different brokers to balance the load among brokers. For details, see Reassigning Partitions.
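As an illustration of distributing leader replicas and reassigning partitions, the sketch below builds a reassignment plan in the JSON layout accepted by the open-source kafka-reassign-partitions.sh tool (--reassignment-json-file). It uses a simple round-robin placement, which is an assumption made for this example and not the algorithm Kafka itself uses. The topic name, broker IDs, and the build_reassignment helper are hypothetical. On a managed Kafka service, reassignment may be done through the console instead.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Build a round-robin reassignment plan (hypothetical helper).

    The first broker in each "replicas" list is the preferred leader,
    so rotating the starting broker spreads leaders across brokers.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds the number of brokers")
    n = len(broker_ids)
    partitions = []
    for p in range(num_partitions):
        replicas = [broker_ids[(p + r) % n] for r in range(replication_factor)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical cluster: topic "orders", 6 partitions, brokers 0-2, 2 replicas.
plan = build_reassignment("orders", 6, [0, 1, 2], 2)
leaders = [entry["replicas"][0] for entry in plan["partitions"]]

print(json.dumps(plan, indent=2))
print("preferred leaders per partition:", leaders)
```

With 6 partitions and 3 brokers, each broker is the preferred leader of 2 partitions and holds 4 replicas in total. Review any generated plan before applying it, because moving replicas copies data between brokers and adds load while the reassignment runs.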