Updated on 2024-05-24 GMT+08:00

Kafka Production Rate and CPU Usage

Scenarios

This section describes performance tests on Distributed Message Service (DMS) for Kafka. The performance is measured by the message production rate on the client side and CPU usage on the server side. The tests cover the following scenarios:

  • Scenario 1 (batch size): same Kafka instance, same topics, different batch.size settings
  • Scenario 2 (cross-AZ or intra-AZ production): same Kafka instance, same topics, different AZ settings for the client and server
  • Scenario 3 (number of replicas): same Kafka instance, different numbers of replicas
  • Scenario 4 (synchronous or asynchronous replication): same Kafka instance, topics with different replication settings
Table 1 Test parameters

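| Partitions | Replicas | Synchronous Replication | batch.size | Cross-AZ Production |
|------------|----------|-------------------------|------------|---------------------|
| 3          | 1        | No                      | 1 KB       | No                  |
| 3          | 1        | No                      | 16 KB      | No                  |
| 3          | 1        | No                      | 1 KB       | Yes                 |
| 3          | 3        | Yes                     | 1 KB       | No                  |
| 3          | 3        | No                      | 1 KB       | No                  |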
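Each row of Table 1 corresponds to one producer run. As a rough sketch, the mapping from a row to the topic and producer settings could be written as a small shell function. The function name row_to_params is hypothetical. The topic names are the ones created in the Environment section, and the acks values follow the Script section (acks=1 for asynchronous replication and acks=-1 for synchronous replication). The cross-AZ column is not represented, because it depends only on which client ECS runs the command.

```shell
#!/bin/sh
# Hypothetical helper: map one Table 1 row to a topic, an acks value,
# and batch.size in bytes.
# Arguments: replicas (1|3), synchronous replication (Yes|No), batch.size in KB.
row_to_params() {
  replicas=$1 sync=$2 batch_kb=$3
  case "$replicas/$sync" in
    1/No)  topic=Topic-01 acks=1 ;;
    3/No)  topic=Topic-02 acks=1 ;;
    3/Yes) topic=Topic-03 acks=-1 ;;
    *) echo "unsupported combination: $replicas/$sync" >&2; return 1 ;;
  esac
  echo "$topic acks=$acks batch.size=$((batch_kb * 1024))"
}

# The four distinct topic/producer combinations in Table 1.
row_to_params 1 No 1
row_to_params 1 No 16
row_to_params 3 Yes 1
row_to_params 3 No 1
```

The combination of one replica and synchronous replication is rejected because no test topic is created for it.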

Environment

Perform the following steps to set up the test environment.

  1. Purchase a Kafka instance with parameters specified as follows. For details about how to purchase one, see Buying a Kafka Instance.
    • Region: CN-Hong Kong
    • Project: CN-Hong Kong
    • AZ: Select one AZ.
    • Instance Name: Enter "kafka-test".
    • Enterprise Project: Select default.
    • Version: Select 2.7.
    • Broker Flavor: Select kafka.2u4g.cluster.
    • Brokers: 3
    • Storage space: ultra-high I/O, 200 GB
    • Capacity Threshold Policy: Select Automatically delete.
    • VPC
    • Security Group
    • Private Network Access: Enable Plaintext Access.
    • Public Network Access: Do not enable it.
    • Advanced Settings: Do not enable Smart Connect and Automatic Topic Creation.

    After the purchase is complete, obtain the address of the Kafka instance on the instance details page.

  2. Create three topics with parameters specified as follows for the purchased Kafka instance. For details, see Creating a Kafka Topic.
    • Topic-01: 3 partitions, 1 replica, asynchronous replication
    • Topic-02: 3 partitions, 3 replicas, asynchronous replication
    • Topic-03: 3 partitions, 3 replicas, synchronous replication
  3. Obtain the test tool.

    Obtain Kafka CLI v2.7.2.

  4. Purchase servers for the client.
    Buy two ECSs with the following configurations. For details about how to purchase an ECS, see Purchasing an ECS.
    • One ECS has 4 vCPUs and 8 GB of memory, runs Linux, and is in the same region, AZ, VPC, subnet, and security group as the Kafka instance.
    • The other ECS has 4 vCPUs and 8 GB of memory, runs Linux, and is in the same region, VPC, subnet, and security group as the Kafka instance but in a different AZ.

    Perform the following operations on the ECSs:

    • Install the JDK and configure the environment variables JAVA_HOME and PATH.
      export JAVA_HOME=/root/jdk1.8.0_231
      export PATH=$JAVA_HOME/bin:$PATH
    • Download Kafka CLI v2.7.2 and decompress it.
      tar -zxf kafka_2.12-2.7.2.tgz
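Before running the tests, it may help to confirm that each client ECS is ready. The following is only a minimal sketch: the helper names require_cmd and require_exec are hypothetical, and the example path assumes that the archive was decompressed in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Hypothetical helper: succeed only if the given file exists and is executable.
require_exec() {
  if [ -x "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or not executable: $1" >&2
    return 1
  fi
}

# Example calls on the client ECS (commented out so that the sketch has
# no side effects when it is sourced):
# require_cmd java
# require_exec kafka_2.12-2.7.2/bin/kafka-producer-perf-test.sh
```

If either call reports a missing item, recheck JAVA_HOME, PATH, and the directory in which the Kafka CLI was decompressed.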

Script

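./kafka-producer-perf-test.sh --producer-props bootstrap.servers=${connection address} acks=1 batch.size=${batch.size} linger.ms=0 --topic ${topic name} --num-records ${num-records} --record-size 1024 --throughput 102400
  • bootstrap.servers: address of the Kafka instance obtained in step 1 of Environment.
  • acks: message synchronization policy. acks=1 indicates asynchronous replication (the leader acknowledges a message without waiting for the followers), and acks=-1 indicates synchronous replication (the leader acknowledges a message only after all in-sync replicas have received it).
  • batch.size: maximum size of the messages sent to one partition in each batch, in bytes.
  • linger.ms: how long the producer waits for more messages before sending a batch, in milliseconds. The value 0 means that batches are sent without waiting.
  • topic: name of a topic created in step 2 of Environment.
  • num-records: total number of messages to be sent.
  • record-size: size of each message, in bytes.
  • throughput: maximum number of messages sent per second. The tests in this section use 102400.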
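Because only a few parameters change between test runs, the command can be generated from a template. The sketch below prints the command instead of running it, so it can be reviewed first. The function name build_perf_cmd is hypothetical, and the broker addresses are the ones used in the Procedure section. linger.ms, record-size, and throughput are fixed to the values used throughout this section.

```shell
#!/bin/sh
# Hypothetical helper: print the perf-test command for one test run.
# Arguments: bootstrap servers, topic, acks, batch.size in bytes, num-records.
build_perf_cmd() {
  servers=$1 topic=$2 acks=$3 batch_size=$4 num_records=$5
  printf './kafka-producer-perf-test.sh --producer-props bootstrap.servers=%s acks=%s batch.size=%s linger.ms=0 --topic %s --num-records %s --record-size 1024 --throughput 102400\n' \
    "$servers" "$acks" "$batch_size" "$topic" "$num_records"
}

# Example: the run with batch.size = 1 KB against Topic-01.
build_perf_cmd 192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 Topic-01 1 1024 8000000
```

To execute the generated command, run it from the kafka_2.12-2.7.2/bin directory on the client ECS.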

Procedure

Scenario 1: Varied Batch Sizes

  1. Log in to the client server, go to the kafka_2.12-2.7.2/bin directory, and run the following scripts.

    Set batch.size to 1 KB, and run the following script:
    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-01 --num-records 8000000 --record-size 1024 --throughput 102400

    Result:

    8000000 records sent, 37696.729809 records/sec (36.81 MB/sec), 796.54 ms avg latency, 3838.00 ms max latency, 322 ms 50th, 2282 ms 95th, 2745 ms 99th, 3593 ms 99.9th.

    Message production rate: 37,697 records/second

    Set batch.size to 16 KB, and run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=16384 linger.ms=0 --topic Topic-01 --num-records 100000000 --record-size 1024 --throughput 102400

    Result:

    100000000 records sent, 102399.318430 records/sec (100.00 MB/sec), 4.62 ms avg latency, 751.00 ms max latency, 1 ms 50th, 3 ms 95th, 164 ms 99th, 406 ms 99.9th.

    Message production rate: 102,399 records/second

  2. Log in to the Kafka console and click the name of the test instance.
  3. In the navigation pane, choose Monitoring.
  4. On the Brokers tab page, view the CPU usage of the server nodes.

    Figure 1 broker-0 CPU usage (batch.size = 1 KB)

    CPU usage: 54%

    Figure 2 broker-0 CPU usage (batch.size = 16 KB)

    CPU usage: 26%

    Figure 3 broker-1 CPU usage (batch.size = 1 KB)

    CPU usage: 55%

    Figure 4 broker-1 CPU usage (batch.size = 16 KB)

    CPU usage: 25%

    Figure 5 broker-2 CPU usage (batch.size = 1 KB)

    CPU usage: 51.65%

    Figure 6 broker-2 CPU usage (batch.size = 16 KB)

    CPU usage: 36.45%
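The message production rate quoted above is the records/sec value in the summary line printed by the tool, rounded to an integer. A minimal sketch of extracting it, and of checking the MB/sec figure against the 1024-byte record size, might look as follows. The helper names rate_from_summary and mb_per_sec are hypothetical, and 1 MB is taken to be 1024 × 1024 bytes.

```shell
#!/bin/sh
# Hypothetical helper: read one perf-test summary line on stdin and print
# the records/sec value rounded to the nearest integer.
rate_from_summary() {
  awk '{ for (i = 2; i <= NF; i++) if ($i == "records/sec") { printf "%.0f\n", $(i - 1); exit } }'
}

# Hypothetical helper: convert records/sec to MB/sec for a given record size
# in bytes, taking 1 MB as 1024 * 1024 bytes.
mb_per_sec() {
  awk -v rate="$1" -v size="$2" 'BEGIN { printf "%.2f\n", rate * size / (1024 * 1024) }'
}

# Summary line of the run with batch.size = 1 KB.
line_1k='8000000 records sent, 37696.729809 records/sec (36.81 MB/sec), 796.54 ms avg latency, 3838.00 ms max latency, 322 ms 50th, 2282 ms 95th, 2745 ms 99th, 3593 ms 99.9th.'

echo "$line_1k" | rate_from_summary   # rate of the 1 KB run
mb_per_sec 37696.729809 1024          # MB/sec of the 1 KB run
mb_per_sec 102399.318430 1024         # MB/sec of the 16 KB run
```

With the two results above, the sketch prints 37697, 36.81, and 100.00, which matches the rate and MB/sec values reported by the tool.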

Scenario 2: Cross-AZ or Intra-AZ Production

  1. Log in to the client server, go to the kafka_2.12-2.7.2/bin directory, and run the following scripts.

    Configure the same AZ for the client and the instance, and run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-01 --num-records 8000000 --record-size 1024 --throughput 102400

    Result:

    8000000 records sent, 37696.729809 records/sec (36.81 MB/sec), 796.54 ms avg latency, 3838.00 ms max latency, 322 ms 50th, 2282 ms 95th, 2745 ms 99th, 3593 ms 99.9th.

    Message production rate: 37,697 records/second

    Configure different AZs for the client and the instance, and run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-01 --num-records 4000000 --record-size 1024 --throughput 102400

    Result:

    4000000 records sent, 15358.152107 records/sec (15.00 MB/sec), 1944.09 ms avg latency, 8179.00 ms max latency, 13 ms 50th, 6049 ms 95th, 6549 ms 99th, 8086 ms 99.9th.

    Message production rate: 15,358 records/second

  2. Log in to the Kafka console and click the name of the test instance.
  3. In the navigation pane, choose Monitoring.
  4. On the Brokers tab page, view the CPU usage of the server nodes.

    Figure 7 broker-0 CPU usage (same AZ)

    CPU usage: 54%

    Figure 8 broker-0 CPU usage (different AZs)

    CPU usage: 28%

    Figure 9 broker-1 CPU usage (same AZ)

    CPU usage: 55%

    Figure 10 broker-1 CPU usage (different AZs)

    CPU usage: 28%

    Figure 11 broker-2 CPU usage (same AZ)

    CPU usage: 51.65%

    Figure 12 broker-2 CPU usage (different AZs)

    CPU usage: 23.35%
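The per-broker readings can be condensed into one figure per run by averaging them. This average is not part of the original test report. It is shown only as a convenience, and the helper name avg_cpu is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the mean of the given CPU usage percentages
# (one argument per broker, without the % sign), to two decimal places.
avg_cpu() {
  printf '%s\n' "$@" | awk '{ sum += $1 } END { if (NR > 0) printf "%.2f\n", sum / NR }'
}

avg_cpu 54 55 51.65   # client and instance in the same AZ
avg_cpu 28 28 23.35   # client and instance in different AZs
```

For the readings above, the averages are 53.55 for the same-AZ run and 26.45 for the cross-AZ run.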

Scenario 3: Varied Numbers of Replicas

  1. Log in to the client server, go to the kafka_2.12-2.7.2/bin directory, and run the following scripts.

    For the one-replica topic, run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-01 --num-records 8000000 --record-size 1024 --throughput 102400

    Result:

    8000000 records sent, 37696.729809 records/sec (36.81 MB/sec), 796.54 ms avg latency, 3838.00 ms max latency, 322 ms 50th, 2282 ms 95th, 2745 ms 99th, 3593 ms 99.9th.

    Message production rate: 37,697 records/second

    For the three-replica topic, run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-02 --num-records 4000000 --record-size 1024 --throughput 102400

    Result:

    4000000 records sent, 15245.877896 records/sec (14.89 MB/sec), 1963.88 ms avg latency, 7471.00 ms max latency, 306 ms 50th, 5854 ms 95th, 6682 ms 99th, 7439 ms 99.9th.

    Message production rate: 15,246 records/second

  2. Log in to the Kafka console and click the name of the test instance.
  3. In the navigation pane, choose Monitoring.
  4. On the Brokers tab page, view the CPU usage of the server nodes.

    Figure 13 broker-0 CPU usage (one replica)

    CPU usage: 54%

    Figure 14 broker-0 CPU usage (three replicas)

    CPU usage: 86%

    Figure 15 broker-1 CPU usage (one replica)

    CPU usage: 55%

    Figure 16 broker-1 CPU usage (three replicas)

    CPU usage: 87%

    Figure 17 broker-2 CPU usage (one replica)

    CPU usage: 51.65%

    Figure 18 broker-2 CPU usage (three replicas)

    CPU usage: 87.10%

Scenario 4: Synchronous/Asynchronous Replication

  1. Log in to the client server, go to the kafka_2.12-2.7.2/bin directory, and run the following scripts.

    For asynchronous replication, run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=1 batch.size=1024 linger.ms=0 --topic Topic-02 --num-records 4000000 --record-size 1024 --throughput 102400

    Result:

    4000000 records sent, 15245.877896 records/sec (14.89 MB/sec), 1963.88 ms avg latency, 7471.00 ms max latency, 306 ms 50th, 5854 ms 95th, 6682 ms 99th, 7439 ms 99.9th.

    Message production rate: 15,246 records/second

    For synchronous replication, run the following script:

    ./kafka-producer-perf-test.sh --producer-props bootstrap.servers=192.168.0.69:9092,192.168.0.42:9092,192.168.0.66:9092 acks=-1 batch.size=1024 linger.ms=0 --topic Topic-03 --num-records 1000000 --record-size 1024 --throughput 102400

    Result:

    1000000 records sent, 5180.783438 records/sec (5.06 MB/sec), 5692.27 ms avg latency, 10312.00 ms max latency, 5579 ms 50th, 7538 ms 95th, 9481 ms 99th, 10219 ms 99.9th.

    Message production rate: 5,181 records/second

  2. Log in to the Kafka console and click the name of the test instance.
  3. In the navigation pane, choose Monitoring.
  4. On the Brokers tab page, view the CPU usage of the server nodes.

    Figure 19 broker-0 CPU usage (asynchronous replication)

    CPU usage: 86%

    Figure 20 broker-0 CPU usage (synchronous replication)

    CPU usage: 62%

    Figure 21 broker-1 CPU usage (asynchronous replication)

    CPU usage: 87%

    Figure 22 broker-1 CPU usage (synchronous replication)

    CPU usage: 61%

    Figure 23 broker-2 CPU usage (asynchronous replication)

    CPU usage: 87.10%

    Figure 24 broker-2 CPU usage (synchronous replication)

    CPU usage: 59.40%

Result

Table 2 Testing results

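| Partitions | Replicas | Synchronous Replication | batch.size | Cross-AZ Production | Message Production Rate on the Client Side (Records/Second) | CPU Usage on the Server Side (broker-0) | CPU Usage on the Server Side (broker-1) | CPU Usage on the Server Side (broker-2) |
|------------|----------|-------------------------|------------|---------------------|-------------------------------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|
| 3          | 1        | No                      | 1 KB       | No                  | 37,697                                                      | 54%                                     | 55%                                     | 51.65%                                  |
| 3          | 1        | No                      | 16 KB      | No                  | 102,399                                                     | 26%                                     | 25%                                     | 36.45%                                  |
| 3          | 1        | No                      | 1 KB       | Yes                 | 15,358                                                      | 28%                                     | 28%                                     | 23.35%                                  |
| 3          | 3        | Yes                     | 1 KB       | No                  | 5,181                                                       | 62%                                     | 61%                                     | 59.40%                                  |
| 3          | 3        | No                      | 1 KB       | No                  | 15,246                                                      | 86%                                     | 87%                                     | 87.10%                                  |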

Based on the test results, the following conclusions are drawn (for reference only):

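  • When batch.size increases 16-fold (from 1 KB to 16 KB), the message production rate increases from 37,697 to 102,399 records/second, and the CPU usage of every broker decreases. The 16 KB run is close to the --throughput limit of 102,400 records/second, so the limit may have capped the measured rate.
  • Compared with cross-AZ production, intra-AZ production significantly increases the message production rate (37,697 versus 15,358 records/second) and the CPU usage (51.65% to 55% versus 23.35% to 28%).
  • When the number of replicas changes from 1 to 3, the message production rate decreases significantly (from 37,697 to 15,246 records/second), and the CPU usage increases (from between 51.65% and 55% to between 86% and 87.10%).
  • Compared with synchronous replication, asynchronous replication increases the message production rate (15,246 versus 5,181 records/second) and the CPU usage (86% to 87.10% versus 59.40% to 62%).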
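To put rough numbers on these conclusions, the production rates in Table 2 can be compared pairwise. The sketch below uses a hypothetical helper named ratio, and the inputs are the rounded rates from Table 2. Because the 16 KB run is close to the --throughput limit of 102,400 records/second, its ratio may understate the gain from a larger batch.size.

```shell
#!/bin/sh
# Hypothetical helper: print a / b rounded to two decimal places.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { if (b == 0) exit 1; printf "%.2f\n", a / b }'
}

ratio 102399 37697   # batch.size 16 KB versus 1 KB
ratio 37697 15358    # intra-AZ versus cross-AZ production
ratio 37697 15246    # one replica versus three replicas
ratio 15246 5181     # asynchronous versus synchronous replication
```

With these inputs, the ratios are 2.72, 2.45, 2.47, and 2.94 respectively. Like the conclusions themselves, they are for reference only and apply to this test environment.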