This topic describes how to use Data Transmission Service (DTS) to migrate data from a PolarDB for PostgreSQL (Compatible with Oracle) cluster to a Message Queue for Apache Kafka instance.
Prerequisites
The value of the wal_level parameter is set to logical for the source PolarDB for PostgreSQL (Compatible with Oracle) cluster. This adds the information required for logical decoding to the write-ahead log (WAL). For more information, see Set cluster parameters. A verification sketch follows this list.
A destination Message Queue for Apache Kafka instance is created. The available storage space of the destination instance must be larger than the storage space occupied by data in the source PolarDB for PostgreSQL (Compatible with Oracle) cluster.
Note: For information about the supported database versions, see Migration solutions.
A topic is created in the destination Message Queue for Apache Kafka instance to receive the migrated data. For more information, see Step 1: Create a topic.
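You can verify the wal_level prerequisite before you configure the task. The following is a minimal sketch in Python, assuming the psycopg2 package and PostgreSQL-protocol access to the cluster; the endpoint, port, database name, and credentials are placeholders that you must replace with your own values.

```python
# Minimal pre-flight check (sketch). Assumes psycopg2 is installed and the
# placeholder connection values below are replaced with your own.
import psycopg2

conn = psycopg2.connect(
    host="pc-xxxxxxxx.o.polardb.rds.aliyuncs.com",  # placeholder cluster endpoint
    port=1521,                 # adjust to your cluster's port
    dbname="mydb",             # placeholder database name
    user="migration_user",     # placeholder privileged account
    password="your_password",  # placeholder password
)
with conn.cursor() as cur:
    cur.execute("SHOW wal_level;")
    wal_level = cur.fetchone()[0]
    print(f"wal_level = {wal_level}")
    # DTS logical decoding requires wal_level to be logical.
    assert wal_level == "logical", "Set wal_level to logical before you configure DTS."
conn.close()
```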
Precautions
Type | Description |
Source database limits | |
Other limits | |
Billing
Migration type | Link configuration fees | Data transfer cost |
Schema migration and full data migration | Free of charge. | You are charged for data transfer when you migrate data out of Alibaba Cloud over the public network. For more information, see Billing overview. |
Incremental data migration | Charged. For more information, see Billing overview. | Same as above. |
SQL operations that support incremental migration
Operation type | SQL statement |
DML | INSERT, UPDATE, DELETE |
DDL | |
Permissions required for database accounts
Database | Required permissions | Account creation and authorization method |
PolarDB for PostgreSQL (Compatible with Oracle) cluster | Privileged account | |
Procedure
Go to the migration task list page of the destination region. You can use one of the following methods.
From the DTS console
Log on to the Data Transmission Service (DTS) console.
In the navigation pane on the left, click Data Migration.
In the upper-left corner of the page, select the region where the migration instance is located.
From the DMS console
Note: The actual operations may vary based on the mode and layout of the DMS console. For more information, see Simple mode console and Customize the layout and style of the DMS console.
Log on to the Data Management (DMS) console.
In the top navigation bar, choose .
To the right of Data Migration Tasks, select the region where the migration instance is located.
Click Create Task to go to the task configuration page.
Configure the source and destination databases.
Note: For information about how to obtain the parameters of the destination Message Queue for Apache Kafka instance, see Configure the parameters of a Message Queue for Apache Kafka instance.
Category
Configuration
Description
N/A
Task Name
DTS automatically generates a task name. We recommend that you specify a descriptive name for easy identification. The name does not have to be unique.
Source Database
Select Existing Connection
To use a database instance that has already been added to DTS (created or saved), select it from the drop-down list. The database information below is then populated automatically.
Note: In the DMS console, this parameter is named Select a DMS database instance.
If you have not registered the database instance with the system, or do not need to use a registered instance, manually configure the database information below.
Database Type
Select PolarDB (Compatible with Oracle).
Access Method
Select Alibaba Cloud Instance.
Instance Region
Select the region where the source PolarDB for PostgreSQL (Compatible with Oracle) cluster resides.
Replicate Data Across Alibaba Cloud Accounts
This example migrates data within the same Alibaba Cloud account. Select No.
Instance ID
Select the ID of the source PolarDB for PostgreSQL (Compatible with Oracle) cluster.
Database Name
Enter the name of the database that contains the objects to be migrated from the source PolarDB for PostgreSQL (Compatible with Oracle) cluster.
Database Account
Enter the database account of the source PolarDB for PostgreSQL (Compatible with Oracle) cluster. For information about the required permissions, see Permissions required for database accounts.
Database Password
Enter the password that corresponds to the database account.
Destination Database
Select Existing Connection
To use a database instance that has already been added to DTS (created or saved), select it from the drop-down list. The database information below is then populated automatically.
Note: In the DMS console, this parameter is named Select a DMS database instance.
If you have not registered the database instance with the system, or do not need to use a registered instance, manually configure the database information below.
Database Type
Select Kafka.
Access Method
Select Express Connect, VPN Gateway, or Smart Access Gateway.
Note: Here, the Message Queue for Apache Kafka instance is configured as a self-managed Kafka database for the migration instance.
Instance Region
Select the region where the destination Message Queue for Apache Kafka instance resides.
Connected VPC
Select the ID of the virtual private cloud (VPC) to which the destination Message Queue for Apache Kafka instance belongs.
Domain Name or IP
Enter an IP address from the Default Endpoint of the destination Message Queue for Apache Kafka instance.
Port Number
Enter the service port of the destination Message Queue for Apache Kafka instance. The default port is 9092.
Database Account and Database Password
You do not need to configure these parameters in this example.
Kafka Version
Select the version of the Kafka instance.
Encryption
Select Non-encrypted or SCRAM-SHA-256 based on your business and security requirements.
Topic
From the drop-down list, select the topic that receives the migrated data.
Use Kafka Schema Registry
Kafka Schema Registry is a metadata service layer that provides a RESTful interface for storing and retrieving Avro schemas.
No: Do not use Kafka Schema Registry.
Yes: Use Kafka Schema Registry. You must enter the URL or IP address of the Avro schema registered in Kafka Schema Registry in the URL or IP Address of Schema Registry text box.
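If you set Use Kafka Schema Registry to Yes, you can confirm that the registry endpoint is reachable before you proceed. The following is a minimal sketch, assuming a Confluent-compatible Schema Registry REST interface and Python with the requests package; the registry URL is a placeholder.

```python
# Sanity check for a Schema Registry endpoint (sketch).
# Assumes a Confluent-compatible REST interface; the URL is a placeholder.
import requests

registry_url = "http://192.168.0.10:8081"  # placeholder registry address

# List the registered subjects to confirm that the registry responds.
resp = requests.get(f"{registry_url}/subjects", timeout=5)
resp.raise_for_status()
print("Registered subjects:", resp.json())
```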
After you complete the configuration, click Test Connectivity and Proceed at the bottom of the page.
Note: Ensure that the IP address segment of the DTS service is automatically or manually added to the security settings of the source and destination databases to allow access from DTS servers. For more information, see Add DTS server IP addresses to a whitelist.
If the source or destination database is a self-managed database (the Access Method is not Alibaba Cloud Instance), you must also click Test Connectivity in the CIDR Blocks of DTS Servers dialog box that appears.
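Independently of the DTS connectivity test, you can check from a machine in the connected VPC that the Kafka endpoint is reachable at the TCP level. The following is a minimal sketch in Python; the host and port are placeholders. Note that this probe runs from your machine, whereas DTS connects from its own IP address segments, which is why the whitelist settings above are still required.

```python
# TCP reachability probe for the Kafka endpoint (sketch).
# The host and port are placeholders; replace them with your own values.
import socket

host, port = "192.168.0.20", 9092
try:
    with socket.create_connection((host, port), timeout=5):
        print(f"TCP connection to {host}:{port} succeeded.")
except OSError as exc:
    print(f"Cannot reach {host}:{port}: {exc}")
```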
Configure the task objects.
On the Configure Objects page, configure the objects to be migrated.
Configuration
Description
Migration Types
To perform only a full migration, select both Schema Migration and Full Data Migration.
To perform a zero-downtime migration, select Schema Migration, Full Data Migration, and Incremental Data Migration.
Note: If the Access Method of the destination Kafka instance is Alibaba Cloud Instance, Schema Migration is not supported.
If you do not select Incremental Data Migration, do not write new data to the source instance during data migration to ensure data consistency.
Processing Mode of Conflicting Tables
Precheck and Report Errors: Checks whether tables with the same names exist in the destination database. If no tables with the same names exist, the precheck item is passed. If tables with the same names exist, an error is reported during the precheck phase, and the data migration task does not start.
Note: If a table in the destination database has the same name as a table to be migrated and cannot be deleted or renamed, you can use the object name mapping feature to rename the table that is migrated. For more information, see Object name mapping.
Ignore Errors and Proceed: Skips the check for tables with the same names.
Warning: Selecting Ignore Errors and Proceed may cause data inconsistency and business risks. For example:
If the table schemas are consistent and a record in the destination database has the same primary key value as a record in the source database:
During full migration, DTS keeps the record in the destination cluster. The record from the source database is not migrated to the destination database.
During incremental migration, DTS does not keep the record in the destination cluster. The record from the source database overwrites the record in the destination database.
If the table schemas are inconsistent, only some columns of data may be migrated, or the migration may fail. Proceed with caution.
Data Format in Kafka
Select the desired data format for storage in the Kafka instance.
If you select Canal JSON, see Canal JSON for parameter descriptions and examples.
Note: Currently, only the China (Qingdao) and China (Beijing) regions support selecting Canal JSON.
If you select DTS Avro, you must parse the data based on the DTS Avro schema definition. For more information, see DTS Avro schema definition and DTS Avro deserialization sample code. A consumer sketch for inspecting delivered messages follows this table.
If you select Shareplex JSON, see Shareplex Json for parameter descriptions and examples.
Kafka Data Compression Format
Select the compression format for Kafka messages based on your requirements.
LZ4 (Default): Low compression ratio, high compression speed.
GZIP: High compression ratio, low compression speed.
Note: CPU usage is high.
Snappy: Medium compression ratio, medium compression speed.
Policy for Shipping Data to Kafka Partitions
Select the desired policy.
Message acknowledgement mechanism
Select the desired message acknowledgment mechanism.
Topic That Stores DDL Information
Select a topic from the drop-down list to store DDL information.
Note: If you do not select a topic, the DDL information is stored in the topic that receives data by default.
Capitalization of Object Names in Destination Instance
You can configure the capitalization policy for the names of migrated objects, such as databases, tables, and columns, in the destination instance. The default value is DTS default policy. You can also choose to keep object names consistent with the default policy of the source or destination database. For more information, see Case sensitivity of object names in the destination database.
Source Objects
In the Source Objects box, select the objects to migrate, and then click the rightwards arrow icon to move them to the Selected Objects box.
Note: You can select tables as the objects to be migrated.
Selected Objects
No extra configuration is needed for this example. You can use the mapping feature to set the topic name, number of topic partitions, and partition key for the source table in the destination Kafka instance. For more information, see Mapping information.
Note: If you use the object name mapping feature, the migration of other objects that depend on this object may fail.
To select the SQL operations for incremental migration, right-click the migration object in the Selected Objects section, and select the desired SQL operations in the dialog box that appears.
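After the task is running, you can inspect the messages that DTS delivers to the destination topic to confirm the configured data format. The following is a minimal consumer sketch, assuming Python with the kafka-python package; the topic name and bootstrap server are placeholders. DTS Avro payloads must still be decoded against the DTS Avro schema definition, whereas Canal JSON and Shareplex JSON payloads are plain JSON bytes.

```python
# Minimal consumer for inspecting delivered messages (sketch).
# Assumes the kafka-python package; topic and endpoint are placeholders.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "migration_topic",                      # placeholder destination topic
    bootstrap_servers="192.168.0.20:9092",  # placeholder Kafka endpoint
    auto_offset_reset="earliest",
    consumer_timeout_ms=10000,              # stop after 10 s without messages
)
for message in consumer:
    # Print the first 200 bytes of each payload for a quick format check.
    print(message.topic, message.partition, message.offset, message.value[:200])
consumer.close()
```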
Click Next: Advanced Settings to configure advanced parameters.
Configuration
Description
Dedicated Cluster for Task Scheduling
By default, DTS schedules tasks on a shared cluster. You do not need to select one. If you want more stable tasks, you can purchase a dedicated cluster to run DTS migration tasks.
Retry Time for Failed Connections
After the migration task starts, if the connection to the source or destination database fails, DTS reports an error and immediately starts continuous retry attempts. The default retry duration is 720 minutes. You can also customize the retry time within a range of 10 to 1440 minutes. We recommend that you set it to more than 30 minutes. If DTS reconnects to the source and destination databases within the set time, the migration task automatically resumes. Otherwise, the task fails.
Note: For multiple DTS instances that share the same source or destination, the network retry time is determined by the setting of the last created task.
Because you are charged for the task during the connection retry period, we recommend that you customize the retry time based on your business needs, or release the DTS instance as soon as possible after the source and destination database instances are released.
Retry Time for Other Issues
After the migration task starts, if other non-connectivity issues occur in the source or destination database (such as a DDL or DML execution exception), DTS reports an error and immediately starts continuous retry attempts. The default retry duration is 10 minutes. You can also customize the retry time within a range of 1 to 1440 minutes. We recommend that you set it to more than 10 minutes. If the related operations succeed within the set retry time, the migration task automatically resumes. Otherwise, the task fails.
Important: The value of Retry Time for Other Issues must be less than the value of Retry Time for Failed Connections.
Enable Throttling for Full Data Migration
During the full migration phase, DTS consumes read and write resources of the source and destination databases, which may increase the database load. You can throttle the full migration task as needed: set Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s) to reduce the pressure on the destination database.
Note: This configuration item is available only if you select Full Data Migration for Migration Types.
You can also adjust the full migration speed after the migration instance is running.
Enable Throttling for Incremental Data Migration
You can also throttle the incremental migration task as needed: set RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s) to reduce the pressure on the destination database.
Note: This configuration item is available only if you select Incremental Data Migration for Migration Types.
You can also adjust the incremental migration speed after the migration instance is running.
Environment Tag
You can select an environment tag to identify the instance if needed. This is not required for this example.
Configure ETL
Choose whether to enable the extract, transform, and load (ETL) feature. For more information, see What is ETL? Valid values:
Yes: Enables the ETL feature. Enter data processing statements in the code editor. For more information, see Configure ETL in a data migration or data synchronization task.
No: Disables the ETL feature.
Monitoring and Alerting
Select whether to set alerts and receive alert notifications based on your business needs.
No: Does not set an alert.
Yes: Sets an alert. You must also set the alert threshold and alert notifications. The system sends an alert notification if the migration fails or the latency exceeds the threshold.
Save the task and run a precheck.
To view the parameters for configuring this instance when you call the API operation, move the pointer over the Next: Save Task Settings and Precheck button and click Preview OpenAPI parameters in the bubble.
If you do not need to view or have finished viewing the API parameters, click Next: Save Task Settings and Precheck at the bottom of the page.
Note: Before the migration task starts, a precheck is performed. The task starts only after it passes the precheck.
If the precheck fails, click View Details next to the failed check item, fix the issue based on the prompt, and then run the precheck again.
If a warning is reported during the precheck:
For check items that cannot be ignored, click View Details next to the failed item, fix the issue based on the prompt, and then run the precheck again.
For check items that can be ignored and do not need to be fixed, you can click Confirm Alert Details, and then click Ignore, OK, and Precheck Again in sequence to skip the alert item and run the precheck again. If you skip an alert item, data inconsistency may occur and your business may be exposed to risks.
Purchase the instance.
When the Success Rate is 100%, click Next: Purchase Instance.
On the Purchase page, select the link specification for the data migration instance. For more information, see the following table.
Category
Parameter
Description
New Instance Class
Resource Group Settings
Select the resource group to which the instance belongs. The default value is default resource group. For more information, see What is Resource Management?
Instance Class
DTS provides migration specifications with different performance levels. The link specification affects the migration speed. You can select a specification based on your business scenario. For more information, see Data migration link specifications.
After the configuration is complete, read and select the check box for Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, and in the OK dialog box that appears, click OK.
You can view the progress of the migration instance on the Data Migration Tasks list page.
Note: If the migration instance does not include an incremental migration task, it stops automatically. After the instance stops, its Status is Completed.
If the migration instance includes an incremental migration task, it does not stop automatically, and the incremental migration task continues to run. While the incremental migration task is running normally, the Status of the instance is Running.
Mapping information
In the Selected Objects area, hover the mouse pointer over the destination topic name at the table level.
Click Edit for the destination topic.
In the Edit Table dialog box, you can configure the mapping information.
Note: At the schema level, the dialog box is named Edit Schema and contains fewer configurable parameters. At the table level, the dialog box is named Edit Table.
If the migration granularity is not an entire schema, you cannot modify the Name of target Topic or Number of Partitions in the Edit Schema dialog box.
Configuration
Description
Name of target Topic
The name of the destination topic to which the source table is migrated. By default, this is the topic that you selected in the Destination Database section when you configured the source and destination databases.
Important: If the destination database is a Message Queue for Apache Kafka instance, the specified topic name must exist in the destination Kafka instance. Otherwise, the data migration fails. If the destination database is a self-managed Kafka database and the migration instance includes a schema migration task, DTS attempts to create the topic you specify in the destination database.
If you modify the Name of target Topic, the data is written to the topic you specify.
Filter Conditions
For more information, see Set filter conditions.
Number of Partitions
The number of partitions for the destination topic to which data is written.
Partition Key
This parameter is required when Policy for Shipping Data to Kafka Partitions is set to Ship Data to Separate Partitions Based on Hash Values of Primary Keys. Specify one or more columns as the partition key. DTS calculates a hash value from these columns and delivers each row to a partition of the destination topic based on the hash value. If you do not specify a partition key, this delivery policy does not take effect during the incremental write phase. A conceptual sketch of hash-based assignment follows this procedure.
Note: You can select Partition Key only in the Edit Table dialog box.
Click OK.
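To illustrate the Partition Key setting, the following is a conceptual sketch of hash-based partition assignment in Python. It is not DTS's internal algorithm, and the column names and partition count are hypothetical. It demonstrates the property the policy relies on: rows with the same partition-key values always map to the same partition, which preserves their relative order within that partition.

```python
# Conceptual illustration of hash-based partition assignment (sketch).
# This is not DTS's internal algorithm; names and values are hypothetical.
import hashlib

def assign_partition(row: dict, key_columns: list, num_partitions: int) -> int:
    """Map a row to a stable partition index based on its partition-key columns."""
    key = "|".join(str(row[column]) for column in key_columns)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

row = {"order_id": 1001, "customer": "alice"}
print(assign_partition(row, ["order_id"], num_partitions=12))  # same key -> same partition
```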