This topic describes how to use the open source migration tool to migrate data from a self-managed Apache Kafka cluster to an ApsaraMQ for Kafka instance.
Migration process
Usage notes
If you want to migrate the metadata and message data of a self-managed Apache Kafka cluster deployed on Alibaba Cloud, we recommend that you purchase an ApsaraMQ for Kafka instance in the same region as the self-managed Apache Kafka cluster, deploy it in the same virtual private cloud (VPC), and then migrate the metadata and message data in the VPC.
In this topic, the metadata and message data of a self-managed Apache Kafka cluster are migrated to an Internet- and VPC-connected ApsaraMQ for Kafka instance.
Step 1: Evaluate specifications
ApsaraMQ for Kafka provides the specification evaluation feature that assesses and recommends the ApsaraMQ for Kafka instance specifications required by a migration task based on the information of the self-managed Apache Kafka cluster, such as cluster traffic, disk capacity, and disk type. For more information, see Evaluate specifications.
Step 2: Purchase an instance
Purchase an ApsaraMQ for Kafka instance based on the evaluated instance specifications and deploy it. For more information, see Purchase and deploy an Internet- and VPC-connected instance.
Step 3: Migrate topics and consumer groups
Implement migration
Log on to the server of your self-managed Apache Kafka cluster.
Download and install JDK 8 or 11. For more information, see Java Downloads.
Download the migration tool: kafka-migration-assessment.jar.
Migrate topics and consumer groups using the following methods.
Migrate topics
In the directory where the migration tool is located, run the following command to perform a precheck on the topics to be migrated:
java -jar kafka-migration-assessment.jar TopicMigrationFromZk \
    --sourceZkConnect 192.168.XX.XX \
    --destAk <yourdestAccessKeyId> \
    --destSk <yourdestAccessKeySecret> \
    --destRegionId <yourdestRegionId> \
    --destInstanceId <yourdestInstanceId>
Parameter
Description
sourceZkConnect
The IP address of the source ZooKeeper cluster.
destAk
The AccessKey ID of the Alibaba Cloud account to which the destination ApsaraMQ for Kafka instance belongs.
destSk
The AccessKey Secret of the Alibaba Cloud account to which the destination ApsaraMQ for Kafka instance belongs.
destRegionId
The region ID of the destination ApsaraMQ for Kafka instance.
destInstanceId
The ID of the destination ApsaraMQ for Kafka instance.
Sample returned result to be confirmed:
13:40:08 INFO - Begin to migrate topics:[test]
13:40:08 INFO - Total topic number:1
13:40:08 INFO - Will create topic:test, isCompactTopic:false, partition number:1
Run the following command to migrate topics:
java -jar kafka-migration-assessment.jar TopicMigrationFromZk \
    --sourceZkConnect 192.168.XX.XX \
    --destAk <yourdestAccessKeyId> \
    --destSk <yourdestAccessKeySecret> \
    --destRegionId <yourdestRegionId> \
    --destInstanceId <yourdestInstanceId> \
    --commit
Parameter
Description
commit
Commits the migration task.
Sample returned result after committing the migration task:
13:51:12 INFO - Begin to migrate topics:[test]
13:51:12 INFO - Total topic number:1
13:51:13 INFO - cmd=TopicMigrationFromZk, request=null, response={"code":200,"requestId":"7F76C7D7-AAB5-4E29-B49B-CD6F1E0F508B","success":true,"message":"operation success"}
13:51:13 INFO - TopicCreate success, topic=test, partition number=1, isCompactTopic=false
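Before rerunning the command with --commit, it can help to turn the precheck output into a structured list for review. The following is a minimal sketch, assuming the log lines keep exactly the shape of the sample precheck output shown earlier. The names PlannedTopic and parse_planned_topics are hypothetical helpers and are not part of the migration tool.

```python
import re
from typing import List, NamedTuple


class PlannedTopic(NamedTuple):
    name: str
    compact: bool
    partitions: int


# Matches precheck lines such as:
#   13:40:08 INFO - Will create topic:test, isCompactTopic:false, partition number:1
# The pattern is an assumption derived from the sample output only.
_PLAN_RE = re.compile(
    r"Will create topic:(?P<name>[^,]+), "
    r"isCompactTopic:(?P<compact>true|false), "
    r"partition number:(?P<partitions>\d+)"
)


def parse_planned_topics(log_text: str) -> List[PlannedTopic]:
    """Return one PlannedTopic per 'Will create topic' line in the log."""
    planned = []
    for line in log_text.splitlines():
        match = _PLAN_RE.search(line)
        if match:
            planned.append(
                PlannedTopic(
                    name=match["name"].strip(),
                    compact=match["compact"] == "true",
                    partitions=int(match["partitions"]),
                )
            )
    return planned


if __name__ == "__main__":
    sample = (
        "13:40:08 INFO - Begin to migrate topics:[test]\n"
        "13:40:08 INFO - Total topic number:1\n"
        "13:40:08 INFO - Will create topic:test, isCompactTopic:false, "
        "partition number:1\n"
    )
    for topic in parse_planned_topics(sample):
        print(topic)
```

Reviewing the parsed list makes it easier to spot an unexpected partition count or a compacted topic before anything is created on the destination instance.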
Migrate consumer groups
Create a configuration file named kafka.properties.
The kafka.properties file is used to initialize a Kafka consumer to obtain the consumer offsets of the self-managed Apache Kafka cluster. The file contains the following content:
## The endpoint.
bootstrap.servers=localhost:9092
## The group ID. To ensure that consumption starts from the first message, the group cannot contain consumer offset information.
group.id=XXX
## If no security settings are required, you do not need to configure the following parameters.
## The SASL-based authentication.
#sasl.mechanism=PLAIN
## The access protocol.
#security.protocol=SASL_SSL
## The path of the Secure Sockets Layer (SSL) root certificate.
#ssl.truststore.location=/Users/***/Documents/code/aliware-kafka-demos/main/resources/kafka.client.truststore.jks
## The SSL password.
#ssl.truststore.password=***
## The SASL path.
#java.security.auth.login.config=/Users/***/kafka-java-demo/vpc-ssl/src/main/resources/kafka_client_jaas.conf
In the directory where the migration tool is located, run the following command to perform a precheck on the consumer groups to be migrated:
java -jar kafka-migration-assessment.jar ConsumerGroupMigrationFromTopic \
    --propertiesPath /usr/local/kafka_2.12-2.4.0/config/kafka.properties \
    --destAk <yourAccessKeyId> \
    --destSk <yourAccessKeySecret> \
    --destRegionId <yourRegionId> \
    --destInstanceId <yourInstanceId>
Parameter
Description
propertiesPath
The path of the kafka.properties configuration file.
destAk
The AccessKey ID of the Alibaba Cloud account to which the destination ApsaraMQ for Kafka instance belongs.
destSk
The AccessKey Secret of the Alibaba Cloud account to which the destination ApsaraMQ for Kafka instance belongs.
destRegionId
The region ID of the destination ApsaraMQ for Kafka instance.
destInstanceId
The ID of the destination ApsaraMQ for Kafka instance.
Sample returned result to be confirmed:
15:29:45 INFO - Will create consumer groups:[XXX, test-consumer-group]
Run the following command to migrate consumer groups:
java -jar kafka-migration-assessment.jar ConsumerGroupMigrationFromTopic \
    --propertiesPath /usr/local/kafka_2.12-2.4.0/config/kafka.properties \
    --destAk <yourAccessKeyId> \
    --destSk <yourAccessKeySecret> \
    --destRegionId <yourRegionId> \
    --destInstanceId <yourInstanceId> \
    --commit
Parameter
Description
commit
Commits the migration task.
Sample returned result after committing the migration task:
15:35:51 INFO - cmd=ConsumerGroupMigrationFromTopic, request=null, response={"code":200,"requestId":"C9797848-FD4C-411F-966D-0D4AB5D12F55","success":true,"message":"operation success"}
15:35:51 INFO - ConsumerCreate success, consumer group=XXX
15:35:57 INFO - cmd=ConsumerGroupMigrationFromTopic, request=null, response={"code":200,"requestId":"3BCFDBF2-3CD9-4D48-92C3-385C8DBB9709","success":true,"message":"operation success"}
15:35:57 INFO - ConsumerCreate success, consumer group=test-consumer-group
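When many consumer groups are migrated at once, scanning the commit output by eye becomes error-prone. The sketch below checks a captured log for failed responses and lists the groups that were reported as created. It assumes that each response is a JSON object at the end of its log line and that success lines use the wording shown in the sample output. The function names are hypothetical and are not part of the migration tool.

```python
import json
from typing import Dict, List, Set

_RESPONSE_MARKER = "response="
_CREATED_MARKER = "ConsumerCreate success, consumer group="


def failed_responses(log_text: str) -> List[Dict]:
    """Return every response whose code is not 200 or whose success is not true."""
    failures = []
    for line in log_text.splitlines():
        _, marker, payload = line.partition(_RESPONSE_MARKER)
        if not marker:
            continue
        # Assumption: the JSON object is the last item on the line.
        response = json.loads(payload)
        if response.get("code") != 200 or response.get("success") is not True:
            failures.append(response)
    return failures


def created_groups(log_text: str) -> Set[str]:
    """Return the consumer groups named in 'ConsumerCreate success' lines."""
    groups = set()
    for line in log_text.splitlines():
        _, marker, name = line.partition(_CREATED_MARKER)
        if marker:
            groups.add(name.strip())
    return groups


if __name__ == "__main__":
    sample = (
        '15:35:51 INFO - cmd=ConsumerGroupMigrationFromTopic, request=null, '
        'response={"code":200,"success":true,"message":"operation success"}\n'
        "15:35:51 INFO - ConsumerCreate success, consumer group=XXX\n"
    )
    planned = {"XXX", "test-consumer-group"}  # taken from the precheck output
    print("failed responses:", failed_responses(sample))
    print("not yet created:", sorted(planned - created_groups(sample)))
```

Comparing the created set against the list printed by the precheck gives a quick answer to whether every planned group reached the destination instance.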
View the migration progress
Log on to the ApsaraMQ for Kafka console. In the Resource Distribution section of the Overview page, select the region where the ApsaraMQ for Kafka instance that you want to manage resides.
In the left-side navigation pane, click Migration. On the page that appears, click the Metadata Import tab.
On the Metadata Import tab, find the destination ApsaraMQ for Kafka instance in your migration task and view the migration progress of the topics and consumer groups.
Verify the results
Log on to the ApsaraMQ for Kafka console. In the Resource Distribution section of the Overview page, select the region where the ApsaraMQ for Kafka instance that you want to manage resides.
In the left-side navigation pane, click Instances.
On the Instances page, click the name of the instance that you want to manage.
View the topics and groups on the instance.
In the left-side navigation pane, click Topics. On the page that appears, view the created topics on the instance.
In the left-side navigation pane, click Groups. On the page that appears, view the created groups on the instance.
(Optional) Step 4: Migrate data
Kafka provides the mirroring feature for you to back up data in Kafka clusters. You can use MirrorMaker to copy data in the source cluster to the destination cluster. As shown in the following figure, MirrorMaker uses a built-in consumer to consume messages from the source self-managed Apache Kafka cluster, then uses a built-in producer to send the messages to the destination ApsaraMQ for Kafka cluster. For more information, see Mirroring data between clusters & Geo-replication.
Prerequisites
Topics are migrated.
Usage notes
Topic names must be consistent.
The number of partitions can be different.
Data in the same partition may not remain in the same partition after migration.
By default, messages with the same key are distributed in the same partition.
When a node fails, normal messages may be out of order while partitionally ordered messages can retain their order.
If both the self-managed Apache Kafka cluster and the ApsaraMQ for Kafka instance use password-based authentication and the two passwords are inconsistent, migration is not supported.
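The notes on partitions follow from how keyed messages are placed: the partition is derived from a hash of the key modulo the partition count, so the same key can map to a different partition when the destination topic has a different number of partitions. The sketch below illustrates only that arithmetic. It uses zlib.crc32 as a stand-in hash, which is an assumption made for illustration; Kafka's default partitioner uses a different hash function (murmur2), so the partition numbers printed here will not match a real cluster.

```python
import zlib
from typing import Iterable, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition as hash(key) mod partition count.

    zlib.crc32 is a stand-in for illustration only, not Kafka's hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def keys_that_move(
    keys: Iterable[str], source_partitions: int, dest_partitions: int
) -> List[str]:
    """Return keys whose partition differs between the two partition counts."""
    return [
        key
        for key in keys
        if partition_for(key, source_partitions)
        != partition_for(key, dest_partitions)
    ]


if __name__ == "__main__":
    keys = [f"order-{i}" for i in range(20)]
    moved = keys_that_move(keys, source_partitions=3, dest_partitions=4)
    print(f"{len(moved)} of {len(keys)} keys land in a different partition")
```

With equal partition counts no key moves under this scheme, which is why keeping the partition count identical on both sides is the simplest way to preserve per-key placement.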
Implement migration
You can choose to access from the Internet or a VPC.
Access from the Internet
Download the SSL certificate: mix.4096.client.truststore.jks.
Configure the kafka_client_jaas.conf file.
KafkaClient {
    org.apache.kafka.common.security.plain.PlainLoginModule required
    username="your username"
    password="your password";
};
Configure the producer.properties file.
## The SSL endpoint of the ApsaraMQ for Kafka instance. You can obtain the endpoint in the ApsaraMQ for Kafka console.
bootstrap.servers=XXX.XXX.XXX.XXX:9093
## The data compression method.
compression.type=none
## Configure the truststore using the SSL certificate file downloaded earlier in this procedure.
ssl.truststore.location=kafka.client.truststore.jks
ssl.truststore.password=KafkaOnsClient
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
## The following parameter is required only if you want to use Simple Authentication and Security Layer (SASL) authentication for an ApsaraMQ for Kafka 2.x instance.
ssl.endpoint.identification.algorithm=
Configure the java.security.auth.login.config file.
export KAFKA_OPTS="-Djava.security.auth.login.config=kafka_client_jaas.conf"
Run the following command to start the migration process:
sh bin/kafka-mirror-maker.sh --consumer.config config/consumer.properties --producer.config config/producer.properties --whitelist topicName
Access from a VPC
Configure the consumer.properties file.
## The endpoint of the self-managed Apache Kafka cluster.
bootstrap.servers=XXX.XXX.XXX.XXX:9092
## The strategy for assigning partitions to consumers.
partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
## The name of the consumer group.
group.id=test-consumer-group
Configure the producer.properties file.
## The default endpoint of the ApsaraMQ for Kafka instance. You can obtain the endpoint in the ApsaraMQ for Kafka console.
bootstrap.servers=XXX.XXX.XXX.XXX:9092
## The data compression method.
compression.type=none
Run the following command to start the migration process:
sh bin/kafka-mirror-maker.sh --consumer.config config/consumer.properties --producer.config config/producer.properties --whitelist topicName
Verify the results
You can use one of the following methods to verify whether MirrorMaker is running as expected:
Check the consumer progress of the self-managed Apache Kafka cluster using the kafka-consumer-groups.sh command. The --new-consumer option exists only in Apache Kafka versions earlier than 2.0; omit it on later versions.
bin/kafka-consumer-groups.sh --new-consumer --describe --bootstrap-server <endpoint of the self-managed Apache Kafka cluster> --group test-consumer-group
Send messages to the self-managed Apache Kafka cluster. In the ApsaraMQ for Kafka console, check the partition status of the topic, and check whether the total number of messages in the current broker is correct. You can view the content of a message in the ApsaraMQ for Kafka console. For more information, see Query messages.
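For the first verification method, the per-partition lag reported for the MirrorMaker consumer group is the number to watch: lag that stays near zero while producers keep writing suggests that MirrorMaker is keeping up. The sketch below sums the LAG column per topic from captured --describe output. It assumes a whitespace-separated table whose header row contains TOPIC and LAG columns; the exact column set varies between Kafka versions, and the sample text in the example is hypothetical, not real tool output.

```python
from typing import Dict, List, Optional


def lag_by_topic(describe_output: str) -> Dict[str, int]:
    """Sum the LAG column per topic from tabular `--describe` style output.

    Columns are located by header name, so additional columns are ignored.
    Rows whose LAG value is not a number (for example "-") are skipped.
    """
    totals: Dict[str, int] = {}
    header: Optional[List[str]] = None
    for line in describe_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "TOPIC" in fields and "LAG" in fields:
            header = fields
            continue
        if header is None:
            continue
        topic_idx = header.index("TOPIC")
        lag_idx = header.index("LAG")
        if len(fields) <= max(topic_idx, lag_idx):
            continue
        lag = fields[lag_idx]
        if lag.isdigit():
            topic = fields[topic_idx]
            totals[topic] = totals.get(topic, 0) + int(lag)
    return totals


if __name__ == "__main__":
    # Hypothetical sample, shaped like the assumed table layout.
    sample = (
        "TOPIC  PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG\n"
        "test   0          100             100             0\n"
        "test   1          90              100             10\n"
    )
    print(lag_by_topic(sample))
```

Running the describe command twice a few minutes apart and comparing the totals shows whether the backlog is shrinking.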
What to do next
Enable new consumer groups for the ApsaraMQ for Kafka instance to consume messages in the instance.
Enable new producers for the ApsaraMQ for Kafka instance, shut down the original producers, and allow the original consumer groups to continue consuming messages in the self-managed Apache Kafka cluster.
After the original consumer groups have consumed all messages in the self-managed Apache Kafka cluster, shut down those consumer groups and then the self-managed Apache Kafka cluster.