Updated on 2025-02-14 GMT+08:00

Kafka Production Rate and CPU Usage

This section describes performance tests on Distributed Message Service (DMS) for Kafka. The performance is measured by the message production rate on the client side and CPU usage on the server side. The tests cover the following scenarios:

  • Scenario 1 (batch size): same Kafka instance, same topics, different batch.size settings
  • Scenario 2 (cross-AZ or intra-AZ production): same Kafka instance, same topics, different AZ settings for the client and server
  • Scenario 3 (number of replicas): same Kafka instance, different numbers of replicas
  • Scenario 4 (synchronous or asynchronous replication): same Kafka instance, topics with different replication settings
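
Table 1 Test parameters

| Partitions | Replicas | Synchronous Replication | batch.size | Cross-AZ Production |
|------------|----------|-------------------------|------------|---------------------|
| 3          | 1        | No                      | 1 KB       | No                  |
| 3          | 1        | No                      | 16 KB      | No                  |
| 3          | 1        | No                      | 1 KB       | Yes                 |
| 3          | 3        | Yes                     | 1 KB       | No                  |
| 3          | 3        | No                      | 1 KB       | No                  |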

Environment

Perform the following steps to set up the test environment.

  1. Purchase a Kafka instance with the following parameters and retain the default settings for the others. For details about how to purchase one, see Buying a Kafka Instance.
    • Region: CN-Hong Kong
    • AZ: Select 1.
    • Version: Select 2.7.
    • Architecture: Select Cluster.
    • Broker Flavor: Select kafka.2u4g.cluster.
    • Brokers: Enter 3.
    • Storage Space per Broker: Select Ultra-high I/O and enter 200.
    • VPC: Select a VPC.
    • Subnet: Select a subnet.
    • Security Group: Select a security group.
    • Access Mode: Retain the default settings.
    • Instance Name: Enter "kafka-test".
    • Enterprise Project: Select default.

    After the purchase, obtain Address (Private Network, Plaintext) on the instance details page.

  2. Create three topics with parameters specified as follows for the purchased Kafka instance. For details, see Creating a Kafka Topic.
    • Topic-01: 3 partitions, 1 replica, asynchronous replication
    • Topic-02: 3 partitions, 3 replicas, asynchronous replication
    • Topic-03: 3 partitions, 3 replicas, synchronous replication
  3. Obtain the test tool.

    Obtain Kafka CLI 2.7.2.

  4. Purchase servers for the client.

    Buy two ECSs with the following configurations. For details about how to purchase an ECS, see Purchasing a Custom ECS.
    • One ECS is 4 vCPUs | 8 GB, runs Linux, and is configured with the same region, AZ, VPC, subnet, and security group as the Kafka instance.
    • The other ECS is 4 vCPUs | 8 GB, runs Linux, and is configured with the same region, VPC, subnet, and security group but a different AZ from the Kafka instance.

    Perform the following operations on the ECSs:

    • Install a JDK and configure the environment variables JAVA_HOME and PATH. The path below is an example; use the path of your own JDK installation.
      export JAVA_HOME=/root/jdk1.8.0_231
      export PATH=$JAVA_HOME/bin:$PATH
    • Download Kafka CLI 2.7.2 and decompress it.
      tar -zxf kafka_2.12-2.7.2.tgz
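Before starting a long test run, it can help to confirm that the client ECS is ready. The following is a minimal sketch, not part of the original procedure: KAFKA_BIN, require_cmd, and preflight are hypothetical names introduced here for illustration, and the default directory assumes the archive was decompressed in the current directory.

```shell
# Hypothetical variable: directory that holds the Kafka CLI scripts.
KAFKA_BIN="${KAFKA_BIN:-./kafka_2.12-2.7.2/bin}"

# Return 0 if the named command is found on PATH, non-zero otherwise.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Check every prerequisite and report all problems instead of stopping at the first.
preflight() {
    status=0
    if ! require_cmd java; then
        echo "java not found on PATH; check JAVA_HOME and PATH" >&2
        status=1
    fi
    if [ ! -x "$KAFKA_BIN/kafka-producer-perf-test.sh" ]; then
        echo "kafka-producer-perf-test.sh not found in $KAFKA_BIN" >&2
        status=1
    fi
    return "$status"
}
```

Running preflight on each ECS returns a non-zero status and prints the missing items if the JDK or the Kafka CLI is not set up as described above.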

Script

./kafka-producer-perf-test.sh --producer-props bootstrap.servers=${connection address} acks=1 batch.size=${batch.size} linger.ms=0 --topic ${topic name} --num-records ${num-records} --record-size 1024 --throughput 102400
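  • bootstrap.servers: address of the Kafka instance obtained in step 1 of Environment.
  • acks: message synchronization policy. acks=1 indicates asynchronous replication (the leader acknowledges a message without waiting for the followers), and acks=-1 indicates synchronous replication (the leader waits for all in-sync replicas).
  • batch.size: maximum size of the messages sent in each batch, in bytes.
  • linger.ms: maximum time the producer waits for more messages before sending a batch, in milliseconds. 0 means that batches are sent without waiting.
  • topic: topic name set in step 2 of Environment.
  • num-records: total number of messages to be sent.
  • record-size: size of each message, in bytes.
  • throughput: upper limit on the number of messages sent per second.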
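The test cases differ only in acks, batch.size, the topic, and num-records, so the command can be generated from those values. The following sketch prints the commands instead of running them. perf_cmd and SERVERS are hypothetical names introduced here; the addresses are the ones used in the scenarios below and must be replaced with the address of your own instance.

```shell
# Print the kafka-producer-perf-test.sh command for one test case.
# Arguments: bootstrap servers, acks, batch.size (bytes), topic, num-records.
# linger.ms, record-size, and throughput are fixed to the values used in this test.
perf_cmd() {
    printf './kafka-producer-perf-test.sh --producer-props bootstrap.servers=%s acks=%s batch.size=%s linger.ms=0 --topic %s --num-records %s --record-size 1024 --throughput 102400\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Example addresses taken from the scenarios; replace with your own.
SERVERS="192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092"

perf_cmd "$SERVERS" 1 1024 Topic-01 8000000      # batch.size = 1 KB
perf_cmd "$SERVERS" 1 16384 Topic-01 100000000   # batch.size = 16 KB
perf_cmd "$SERVERS" 1 1024 Topic-02 4000000      # three replicas, asynchronous
perf_cmd "$SERVERS" -1 1024 Topic-03 1000000     # three replicas, synchronous
```

Printing the commands first makes it easy to review each parameter set before running it from the kafka_2.12-2.7.2/bin directory.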

Procedure

Scenario 1: Varied Batch Sizes

  1. Log in to the client server, go to the kafka_2.12-2.7.2/bin directory, and run the following scripts.

    Set batch.size to 1 KB, and run the following script:
    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-01 --num-records 8000000 --record-size 1024 --throughput 102400

    Result:

    8000000 records sent, 34128.673632 records/sec (33.33 MB/sec), 879.91 ms avg latency, 4102.00 ms max latency, 697 ms 50th, 2524 ms 95th, 2888 ms 99th, 4012 ms 99.9th.

    Message production rate: 34,128 records/second

    Set batch.size to 16 KB, and run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=16384 linger.ms=0 --topic Topic-01 --num-records 100000000 --record-size 1024 --throughput 102400

    Result:

    100000000 records sent, 102399.318430 records/sec (100.00 MB/sec), 4.72 ms avg latency, 914.00 ms max latency, 1 ms 50th, 5 ms 95th, 162 ms 99th, 398 ms 99.9th.

    Message production rate: 102,399 records/second

  2. Log in to the Kafka console and click the name of the test instance.
  3. In the navigation pane, choose Monitoring.
  4. On the Brokers tab page, view the CPU usage of the server nodes.

    Figure 1 broker-0 CPU usage (batch.size = 1 KB)

    CPU usage: 58.10%

    Figure 2 broker-0 CPU usage (batch.size = 16 KB)

    CPU usage: 24.10%

    Figure 3 broker-1 CPU usage (batch.size = 1 KB)

    CPU usage: 56.70%

    Figure 4 broker-1 CPU usage (batch.size = 16 KB)

    CPU usage: 25.00%

    Figure 5 broker-2 CPU usage (batch.size = 1 KB)

    CPU usage: 53.30%

    Figure 6 broker-2 CPU usage (batch.size = 16 KB)

    CPU usage: 23.30%
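The message production rate is read from the summary line that the tool prints in step 1. The following sketch extracts that figure and checks it against the MB/sec value in the same line, assuming the summary format shown in the results above. rate_from_result and mb_per_sec are hypothetical helper names.

```shell
# Extract the records/sec figure from a kafka-producer-perf-test.sh summary line.
rate_from_result() {
    printf '%s\n' "$1" | LC_ALL=C awk '{
        for (i = 2; i <= NF; i++)
            if ($i == "records/sec") { print $(i - 1); exit }
    }'
}

# Convert records/sec to MB/sec for a record size in bytes (1 MB = 1024 * 1024 bytes).
mb_per_sec() {
    LC_ALL=C awk -v rate="$1" -v size="$2" \
        'BEGIN { printf "%.2f\n", rate * size / (1024 * 1024) }'
}

line='8000000 records sent, 34128.673632 records/sec (33.33 MB/sec), 879.91 ms avg latency, 4102.00 ms max latency, 697 ms 50th, 2524 ms 95th, 2888 ms 99th, 4012 ms 99.9th.'

rate=$(rate_from_result "$line")
echo "$rate"                 # 34128.673632
mb_per_sec "$rate" 1024      # 33.33
```

With a record size of 1024 bytes, the computed MB/sec value matches the one reported by the tool for the 1 KB run.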

Scenario 2: Cross-AZ or Intra-AZ Production

  1. Log in to the client server, go to the kafka_2.12-2.7.2/bin directory, and run the following scripts.

    Configure the same AZ for the client and the instance, and run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-01 --num-records 8000000 --record-size 1024 --throughput 102400

    Result:

    8000000 records sent, 34128.673632 records/sec (33.33 MB/sec), 879.91 ms avg latency, 4102.00 ms max latency, 697 ms 50th, 2524 ms 95th, 2888 ms 99th, 4012 ms 99.9th.

    Message production rate: 34,128 records/second

    Configure different AZs for the client and the instance, and run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-01 --num-records 4000000 --record-size 1024 --throughput 102400

    Result:

    4000000 records sent, 8523.042044 records/sec (8.32 MB/sec), 3506.20 ms avg latency, 11883.00 ms max latency, 1817 ms 50th, 10621 ms 95th, 11177 ms 99th, 11860 ms 99.9th.

    Message production rate: 8,523 records/second

  2. Log in to the Kafka console and click the name of the test instance.
  3. In the navigation pane, choose Monitoring.
  4. On the Brokers tab page, view the CPU usage of the server nodes.

    Figure 7 broker-0 CPU usage (same AZ)

    CPU usage: 58.10%

    Figure 8 broker-0 CPU usage (different AZs)

    CPU usage: 17.20%

    Figure 9 broker-1 CPU usage (same AZ)

    CPU usage: 56.70%

    Figure 10 broker-1 CPU usage (different AZs)

    CPU usage: 16.70%

    Figure 11 broker-2 CPU usage (same AZ)

    CPU usage: 53.30%

    Figure 12 broker-2 CPU usage (different AZs)

    CPU usage: 18.80%
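Because CPU usage is read separately for each broker, a single average per test case can make the two cases easier to compare. The following sketch averages the readings listed above. avg is a hypothetical helper name, and the average is an illustration rather than a metric reported by the console.

```shell
# Print the arithmetic mean of the arguments, rounded to two decimal places.
avg() {
    printf '%s\n' "$@" | LC_ALL=C awk '{ sum += $1 } END { if (NR > 0) printf "%.2f\n", sum / NR }'
}

avg 58.10 56.70 53.30   # same AZ: 56.03
avg 17.20 16.70 18.80   # different AZs: 17.57
```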

Scenario 3: Varied Numbers of Replicas

  1. Log in to the client server, go to the kafka_2.12-2.7.2/bin directory, and run the following scripts.

    For the one-replica topic, run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-01 --num-records 8000000 --record-size 1024 --throughput 102400

    Result:

    8000000 records sent, 34128.673632 records/sec (33.33 MB/sec), 879.91 ms avg latency, 4102.00 ms max latency, 697 ms 50th, 2524 ms 95th, 2888 ms 99th, 4012 ms 99.9th.

    Message production rate: 34,128 records/second

    For the three-replica topic, run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-02 --num-records 4000000 --record-size 1024 --throughput 102400

    Result:

    4000000 records sent, 14468.325219 records/sec (14.13 MB/sec), 2069.99 ms avg latency, 7911.00 ms max latency, 846 ms 50th, 6190 ms 95th, 6935 ms 99th, 7879 ms 99.9th.

    Message production rate: 14,468 records/second

  2. Log in to the Kafka console and click the name of the test instance.
  3. In the navigation pane, choose Monitoring.
  4. On the Brokers tab page, view the CPU usage of the server nodes.

    Figure 13 broker-0 CPU usage (one replica)

    CPU usage: 58.10%

    Figure 14 broker-0 CPU usage (three replicas)

    CPU usage: 86.70%

    Figure 15 broker-1 CPU usage (one replica)

    CPU usage: 56.70%

    Figure 16 broker-1 CPU usage (three replicas)

    CPU usage: 80.60%

    Figure 17 broker-2 CPU usage (one replica)

    CPU usage: 53.30%

    Figure 18 broker-2 CPU usage (three replicas)

    CPU usage: 86.20%
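One rough way to relate these figures is to count the writes across the cluster: with three replicas, each message is stored on three brokers. The following sketch multiplies the client-side rate by the number of replicas. replica_writes is a hypothetical helper name, and the estimate ignores all other sources of broker load, so treat it as an approximation only.

```shell
# Estimate cluster-wide message writes per second: client rate x number of replicas.
replica_writes() {
    echo $(( $1 * $2 ))
}

replica_writes 34128 1   # Topic-01, one replica: 34128
replica_writes 14468 3   # Topic-02, three replicas: 43404
```

Under this estimate, the three-replica topic causes more writes across the cluster than the one-replica topic even though the client-side rate is lower, which is in line with the higher CPU usage observed.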

Scenario 4: Synchronous/Asynchronous Replication

  1. Log in to the client server, go to the kafka_2.12-2.7.2/bin directory, and run the following scripts.

    For asynchronous replication, run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-02 --num-records 4000000 --record-size 1024 --throughput 102400

    Result:

    4000000 records sent, 14468.325219 records/sec (14.13 MB/sec), 2069.99 ms avg latency, 7911.00 ms max latency, 846 ms 50th, 6190 ms 95th, 6935 ms 99th, 7879 ms 99.9th.

    Message production rate: 14,468 records/second

    For synchronous replication, run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=-1 batch.size=1024 linger.ms=0 --topic Topic-03 --num-records 1000000 --record-size 1024 --throughput 102400

    Result:

    1000000 records sent, 3981.937930 records/sec (3.89 MB/sec), 7356.98 ms avg latency, 19013.00 ms max latency, 6423 ms 50th, 14381 ms 95th, 18460 ms 99th, 18975 ms 99.9th.

    Message production rate: 3,981 records/second

  2. Log in to the Kafka console and click the name of the test instance.
  3. In the navigation pane, choose Monitoring.
  4. On the Brokers tab page, view the CPU usage of the server nodes.

    Figure 19 broker-0 CPU usage (asynchronous replication)

    CPU usage: 86.70%

    Figure 20 broker-0 CPU usage (synchronous replication)

    CPU usage: 60.00%

    Figure 21 broker-1 CPU usage (asynchronous replication)

    CPU usage: 80.60%

    Figure 22 broker-1 CPU usage (synchronous replication)

    CPU usage: 55.20%

    Figure 23 broker-2 CPU usage (asynchronous replication)

    CPU usage: 86.20%

    Figure 24 broker-2 CPU usage (synchronous replication)

    CPU usage: 50.00%

Result

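Table 2 Test results

| Partitions | Replicas | Synchronous Replication | batch.size | Cross-AZ Production | Message Production Rate on the Client Side (Records/Second) | CPU Usage on the Server Side (broker-0) | CPU Usage on the Server Side (broker-1) | CPU Usage on the Server Side (broker-2) |
|------------|----------|-------------------------|------------|---------------------|-------------------------------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|
| 3          | 1        | No                      | 1 KB       | No                  | 34,128                                                      | 58.10%                                  | 56.70%                                  | 53.30%                                  |
| 3          | 1        | No                      | 16 KB      | No                  | 102,399                                                     | 24.10%                                  | 25.00%                                  | 23.30%                                  |
| 3          | 1        | No                      | 1 KB       | Yes                 | 8,523                                                       | 17.20%                                  | 16.70%                                  | 18.80%                                  |
| 3          | 3        | Yes                     | 1 KB       | No                  | 3,981                                                       | 60.00%                                  | 55.20%                                  | 50.00%                                  |
| 3          | 3        | No                      | 1 KB       | No                  | 14,468                                                      | 86.70%                                  | 80.60%                                  | 86.20%                                  |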

Based on the test results, the following conclusions are drawn (for reference only):

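  • When batch.size increases 16-fold (from 1 KB to 16 KB), the message production rate rises from 34,128 to 102,399 records/second, and the CPU usage per broker falls from 53.30% to 58.10% down to 23.30% to 25.00%. The 16 KB result is close to the --throughput limit of 102,400 records/second, so it may have been bounded by the client-side limit rather than by the instance.
  • Compared with cross-AZ production, intra-AZ production significantly increases the message production rate (34,128 versus 8,523 records/second) and the CPU usage (53.30% to 58.10% versus 16.70% to 18.80%).
  • When the number of replicas changes from 1 to 3, the message production rate decreases significantly (from 34,128 to 14,468 records/second), and the CPU usage increases (from 53.30% to 58.10% up to 80.60% to 86.70%).
  • Compared with synchronous replication, asynchronous replication increases the message production rate (14,468 versus 3,981 records/second) and the CPU usage (80.60% to 86.70% versus 50.00% to 60.00%).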
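
The size of each effect can be expressed as a ratio of the message production rates in Table 2. The following sketch computes those ratios. ratio is a hypothetical helper name, and the results apply only to this test setup.

```shell
# Print a / b rounded to two decimal places.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ratio 102399 34128   # batch.size 16 KB vs 1 KB: 3.00
ratio 34128 8523     # intra-AZ vs cross-AZ: 4.00
ratio 34128 14468    # 1 replica vs 3 replicas: 2.36
ratio 14468 3981     # asynchronous vs synchronous replication: 3.63
```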