Updated on 2025-09-03 GMT+08:00

Kafka Data Migration Overview

Migrating Kafka services means connecting message producers and consumers to a new Kafka instance. If required, persisted message data can also be migrated to the new instance. Kafka services are typically migrated in the following two scenarios:

  • Migrating services to the cloud without downtime

    Services with strict continuity requirements cannot tolerate long downtime, so they must be migrated to the cloud smoothly.

  • Re-deploying services on the cloud

    A Kafka instance deployed within a single AZ is not capable of cross-AZ disaster recovery. For higher reliability, you can re-deploy services to an instance that is deployed across multiple AZs.

Scheme Overview

Table 1 Migration scheme overview

Migration Scheme: Migrate production first, then consumption (without migrating data).

Migration Tool: -

Pros:

  • This is a common migration solution in the industry. The procedure is simple and no additional plugins are required.
  • In this scheme, the migration process is controlled by services.
  • Orderly message consumption can be ensured.

Cons:

  • There is latency during the switchover. The consumer needs to consume the original Kafka messages before consuming the new ones.
  • After the consumption service is migrated, messages may be stacked in the new Kafka instance.

Migration Scheme: Consume messages from both Kafka instances and migrate the production later (without migrating data).

Migration Tool: -

Pros:

  • The procedure is simple and no additional plugins are required.
  • Messages are not stacked because source and target messages can be consumed at the same time.

Cons:

  • Early on in the migration, data is consumed from both the original and new Kafka instances, so the messages may not be consumed in the order that they are produced.

Migration Scheme: Migrate data using MirrorMaker, then consumption, then production.

Migration Tool: MirrorMaker

Pros:

  • The new Kafka instance synchronizes full historical data from the original Kafka instance and real-time incremental data.

Cons:

  • After the consumer is migrated to the new Kafka instance, it may consume historical messages repeatedly. The consumer should support idempotent messages.

Preparation

  1. Configure the network environment.

    Before accessing a Kafka instance over a private network, configure a security group with the following parameters.

    Table 2 Security group rules

    Direction: Inbound
    Protocol: TCP
    Port: 9092
    Source: IP address or IP address group of the Kafka client
    Description:

    • Accessing a Kafka instance within a VPC (without SSL)
    • Accessing a Kafka instance using a peering connection across VPCs (without SSL)

    Direction: Inbound
    Protocol: TCP
    Port: 9093
    Source: IP address or IP address group of the Kafka client
    Description:

    • Accessing a Kafka instance over a private network within a VPC (with SSL)
    • Accessing a Kafka instance using a peering connection across VPCs (with SSL)

  2. Create the target Kafka instance.

    The specifications of the target instance must not be lower than those of the original instance. For more information, see Buying a Kafka Instance.

  3. Create a topic in the target Kafka instance.

    For each topic in the original Kafka instance, create a topic with the same configuration in the target instance, including the topic name, number of replicas, number of partitions, message aging time, and whether synchronous replication and flushing are enabled. For more information, see Creating a Kafka Topic.
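
The topic check in step 3 can be sketched as a small comparison script. This is only an illustration: the function name, the setting keys, and the sample topics are hypothetical, and the dictionaries stand in for settings that you would read from each instance yourself (for example, from the console).

```python
# Hypothetical setting names; adapt them to however you export topic settings.
COMPARED_KEYS = ("partitions", "replicas", "retention_hours",
                 "sync_replication", "sync_flush")


def diff_topic_settings(original, target):
    """Return {topic: {key: (original_value, target_value)}} for every mismatch.

    A topic that exists only in the original instance is reported as
    {"missing": True}.
    """
    mismatches = {}
    for topic, src in original.items():
        dst = target.get(topic)
        if dst is None:
            mismatches[topic] = {"missing": True}
            continue
        changed = {
            key: (src.get(key), dst.get(key))
            for key in COMPARED_KEYS
            if src.get(key) != dst.get(key)
        }
        if changed:
            mismatches[topic] = changed
    return mismatches


# Sample data (made up for illustration).
original = {
    "orders": {"partitions": 6, "replicas": 3, "retention_hours": 72,
               "sync_replication": True, "sync_flush": False},
    "audit": {"partitions": 3, "replicas": 3, "retention_hours": 168,
              "sync_replication": False, "sync_flush": False},
}
target = {
    "orders": {"partitions": 3, "replicas": 3, "retention_hours": 72,
               "sync_replication": True, "sync_flush": False},
}

report = diff_topic_settings(original, target)
print(report)
```

An empty report would suggest that the target topics match the original ones for the compared settings.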

Migration Scheme 1: Migrating the Production First, then Consumption (Data Not Migrated)

Migrate the message production service to the new Kafka instance first. After this step, the original Kafka instance no longer receives new messages. Once all messages in the original Kafka instance have been consumed, migrate the message consumption service to the new Kafka instance so that it consumes messages from that instance.

This is a common migration scheme. It is simple and easy to control on the service side. Message order is preserved during the migration, so this scheme is suitable for scenarios with strict ordering requirements. However, latency may occur because consumers must finish consuming all data in the original instance before they start consuming from the new one.

  1. Change the Kafka connection address of the producer to that of the new Kafka instance.
  2. Restart the production service so that the producer can send new messages to the new Kafka instance.
  3. Check the consumption progress of each consumer group in the original Kafka instance until all data in the original Kafka instance is consumed.
  4. Change the Kafka connection addresses of the consumers to those of the new Kafka instance.
  5. Restart the consumption service so that consumers can consume messages from the new Kafka instance.
  6. Check whether consumers consume messages properly from the new Kafka instance.
  7. The migration is complete.
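
Step 3 above (checking that the original instance has been fully consumed) can be sketched as a lag calculation. This is a minimal sketch, not a definitive implementation: the function names and sample offsets are hypothetical, and the offsets are assumed to have been collected separately, for example from the console or from the LAG column printed by the open-source `kafka-consumer-groups.sh --describe` tool. The check is only meaningful after the producer has been switched, because the end offsets must no longer be growing.

```python
def remaining_messages(end_offsets, committed_offsets):
    """Per-partition lag: log end offset minus committed offset.

    Keys are (topic, partition) tuples. A partition without a committed
    offset is treated as entirely unconsumed (an assumption made here so
    that the check errs on the side of "not drained").
    """
    return {
        tp: max(end - committed_offsets.get(tp, 0), 0)
        for tp, end in end_offsets.items()
    }


def is_drained(end_offsets, committed_offsets):
    """True when the consumer group has no lag on any partition."""
    lags = remaining_messages(end_offsets, committed_offsets)
    return all(lag == 0 for lag in lags.values())


# Sample offsets for one consumer group (made up for illustration).
end_offsets = {("orders", 0): 120, ("orders", 1): 95}
still_consuming = {("orders", 0): 120, ("orders", 1): 90}
caught_up = {("orders", 0): 120, ("orders", 1): 95}

print(remaining_messages(end_offsets, still_consuming))
print(is_drained(end_offsets, still_consuming))
print(is_drained(end_offsets, caught_up))
```

Run the check for every consumer group that reads from the original instance, and migrate a consumer only after its group reports no lag.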

Migration Scheme 2: Consuming from Both Instances and Migrating the Production Later (Data Not Migrated)

Use multiple consumers for the consumption service. Some consume messages from the original Kafka instance, and others consume messages from the new Kafka instance. Then, migrate the production service to the new Kafka instance so that all messages can be consumed in time.

For a certain period of time, the consumption service consumes messages from both the original and new Kafka instances. Consumption from the new Kafka instance starts before the production service is migrated, so there is no latency. However, early on in the migration, data is consumed from both instances, so messages may not be consumed in the order in which they were produced. This scheme is suitable for services that require low latency but do not require strict message ordering.

  1. Start new consumer clients, set their Kafka connection addresses to those of the new Kafka instance, and consume data from the new Kafka instance.

    Original consumer clients must continue running. Messages are consumed from both the original and new Kafka instances.

  2. Change the Kafka connection address of the producer to that of the new Kafka instance.
  3. Restart the producer client to migrate the production service to the new Kafka instance.
  4. After the production service is migrated, check whether the consumers connected to the new Kafka instance are consuming messages properly.
  5. After all data in the original Kafka instance is consumed, stop the original consumer clients.
  6. The migration is complete.
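
The ordering caveat of this scheme can be illustrated with a small simulation. It is a toy model under stated assumptions: the function name is hypothetical, messages are plain sequence numbers, and the two groups of consumers are modeled as strictly alternating between instances, whereas real interleaving depends on timing.

```python
from collections import deque


def consume_from_both(original_backlog, new_backlog):
    """Alternate between the new and the original instance, one message at a time.

    Each instance keeps its own internal order, but the combined order
    seen by the consumption service no longer matches production order.
    """
    queues = [deque(new_backlog), deque(original_backlog)]
    consumed = []
    while any(queues):  # an empty deque is falsy
        for queue in queues:
            if queue:
                consumed.append(queue.popleft())
    return consumed


# Messages 1-3 were produced before the producer switched instances and are
# still waiting in the original instance; 4-6 were produced afterwards.
original_backlog = [1, 2, 3]
new_backlog = [4, 5, 6]

order = consume_from_both(original_backlog, new_backlog)
print(order)  # [4, 1, 5, 2, 6, 3]
```

Message 4 is handled before message 1 even though it was produced later, which is why this scheme suits only services that do not depend on strict ordering.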

Migration Scheme 3: Migrating Data Using MirrorMaker First, then Consumption, and then Production

Use MirrorMaker to synchronize the two Kafka instances, then migrate the consumers, and finally the producers, to the new Kafka instance.

This scheme depends on MirrorMaker, which synchronizes data from the original Kafka instance to the new Kafka instance. After data synchronization is complete, migrate the consumers to the new Kafka instance and then the producers. This scheme suits scenarios where the producer cannot be stopped and end-to-end latency must stay low, but a small amount of repeated message consumption is acceptable.

  1. Synchronize messages between two Kafka instances using MirrorMaker. For details, see Using MirrorMaker to Synchronize Data Across Clusters.
  2. Change the Kafka connection addresses of the consumers to those of the new Kafka instance.
  3. Restart the consumption service so that consumers can consume messages from the new Kafka instance.
  4. Check whether consumers consume messages properly from the new Kafka instance.
  5. Change the Kafka connection address of the producer to that of the new Kafka instance.
  6. Restart the producer client to migrate the production service to the new Kafka instance.
  7. After the production service is migrated, check whether the consumers connected to the new Kafka instance are consuming messages properly.
  8. The migration is complete.
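
Because consumers may receive some historical messages again after step 3, message handling should be idempotent. The following is a minimal sketch of one way to achieve this, assuming that every message carries a unique business ID set by the producer. The class name, the ID format, and the in-memory set are all hypothetical; a real service would need a durable store for processed IDs.

```python
class IdempotentHandler:
    """Apply each message at most once, keyed by a unique message ID."""

    def __init__(self, apply):
        self._apply = apply   # callback that performs the real work
        self._seen = set()    # assumption: in-memory only, lost on restart

    def handle(self, message_id, payload):
        """Return True if the payload was applied, False if it was a repeat."""
        if message_id in self._seen:
            return False
        self._apply(payload)
        self._seen.add(message_id)
        return True


processed = []
handler = IdempotentHandler(processed.append)

results = [
    handler.handle("order-1", "create order 1"),
    handler.handle("order-2", "create order 2"),
    handler.handle("order-1", "create order 1"),  # replayed after migration
]
print(results)
print(processed)
```

The replayed message is recognized by its ID and skipped, so repeated consumption after the switchover does not change the result.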

How Do I Migrate Persisted Data Along with Services?

You can migrate persisted data, including messages that have already been consumed, from the original instance to a new instance by using the open-source tool MirrorMaker. MirrorMaker acts as a consumer of the original Kafka instance and as a producer to the new Kafka instance, copying messages from one to the other. For details, see Using MirrorMaker to Synchronize Data Across Clusters.
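
As a rough illustration, the following sketch generates a minimal configuration for MirrorMaker 2, the version shipped with open-source Apache Kafka 2.4 and later. Whether this version and these settings match your environment is an assumption; the function name, cluster aliases, and addresses are hypothetical, and the referenced guide remains the authoritative procedure.

```python
def build_mm2_properties(source_servers, target_servers, topics=".*",
                         source_alias="original", target_alias="new"):
    """Render a minimal MirrorMaker 2 properties file as text.

    Only one-way replication from the source alias to the target alias
    is enabled.
    """
    lines = [
        f"clusters = {source_alias}, {target_alias}",
        f"{source_alias}.bootstrap.servers = {source_servers}",
        f"{target_alias}.bootstrap.servers = {target_servers}",
        f"{source_alias}->{target_alias}.enabled = true",
        f"{source_alias}->{target_alias}.topics = {topics}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical addresses; replace them with your instances' connection addresses.
config = build_mm2_properties("192.168.0.10:9092", "192.168.1.10:9092",
                              topics="orders")
print(config)
```

With open-source Kafka, a file like this is typically passed to `bin/connect-mirror-maker.sh`. Note that the default replication policy of MirrorMaker 2 prefixes mirrored topic names with the source alias, so check how topic names are handled before switching consumers.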