Data Transmission Service (DTS) streams change data from a Db2 for LUW (Linux, UNIX, and Windows) database into a self-managed Kafka cluster using CDC replication. Use this guide to configure the synchronization task from prerequisites to a running task.
Prerequisites
Before you begin, ensure that you have:
A Kafka cluster running version 0.10.1.0 to 2.7.0
Enough free storage on the Kafka cluster to hold all data in the source Db2 for LUW database (required for full data synchronization)
Database administrator permissions on the source Db2 for LUW database
Log archiving enabled on the source Db2 for LUW database — set LOGARCHMETH1 or LOGARCHMETH2 (or both). See logarchmeth1 - Primary log archive method configuration parameter and logarchmeth2 - Secondary log archive method configuration parameter
Limitations
Foreign keys
DTS does not synchronize foreign keys. Cascade and delete operations on the source database are not replicated to the destination.
Source database limits
| Limit | Details |
|---|---|
| Outbound bandwidth | The source server must have sufficient outbound bandwidth. Insufficient bandwidth reduces synchronization speed. |
| Primary key or unique constraints | Tables to be synchronized must have PRIMARY KEY or UNIQUE constraints with all fields unique. Without these, the destination may contain duplicate records. |
| Table count per task | If you select tables as objects and plan to rename tables or columns in the destination, a single task supports up to 5,000 tables. Exceeding this limit causes a request error — split into multiple tasks or synchronize at the database level instead. |
| Log retention for incremental-only tasks | Retain logs for more than 24 hours. If DTS cannot read the logs, the task may fail or data inconsistency may occur. |
| Log retention for full + incremental tasks | Retain logs for at least seven days before starting the task. After full synchronization completes, you can reduce the retention period to more than 24 hours. If you do not meet these retention requirements, the reliability and performance guarantees in the DTS SLA do not apply. |
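The primary-key requirement above matters because, without a unique key, the destination has no way to match an incoming row to an existing one. The following sketch uses hypothetical data (not DTS internals) to show why re-delivered rows — for example, when a task resumes and re-reads part of the source — duplicate without a key but apply cleanly with one:

```python
# Simulate the same batch of rows arriving at the destination twice
# (hypothetical example; DTS delivery internals are not public).
rows = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]

# Destination table WITHOUT a primary key: every row is a plain append.
no_key_table = []
for _ in range(2):                      # the batch arrives twice
    no_key_table.extend(rows)

# Destination table WITH a primary key: rows are upserts keyed on "id".
keyed_table = {}
for _ in range(2):
    for row in rows:
        keyed_table[row["id"]] = row    # second delivery overwrites; no duplicate

print(len(no_key_table))   # 4 — duplicates
print(len(keyed_table))    # 2 — idempotent
```

This is why DTS warns that tables without PRIMARY KEY or UNIQUE constraints may end up with duplicate records in the destination.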
CDC-specific limits
DTS uses Db2 for LUW CDC replication technology for incremental data. This technology has its own restrictions — see General data restrictions for SQL Replication.
Other limits
Schedule synchronization during off-peak hours. Full data synchronization consumes read and write resources on both source and destination databases and may increase server load.
After full synchronization, the destination tablespace may be larger than the source because concurrent INSERT operations cause fragmentation.
Write data to the destination only through DTS during synchronization to prevent data inconsistency. After synchronization completes, you can run DDL statements online using Data Management (DMS) — see Perform lock-free DDL operations.
If a primary/secondary switchover occurs on the source while the task is running, the task fails.
If the destination ApsaraMQ for Kafka instance is scaled during synchronization, restart the instance.
Synchronization latency
DTS calculates latency based on the timestamp of the latest synchronized record in the destination versus the current source timestamp. If no DML operations occur on the source for an extended period, the reported latency may be inaccurate. Run a DML operation on the source to refresh the latency value.
If you synchronize an entire database, create a heartbeat table. DTS updates the heartbeat table every second to keep the latency reading accurate.
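The latency behavior described above can be modeled as the gap between "now" on the source and the commit timestamp of the last record applied to the destination. This is an illustrative model, not DTS code: a quiet source makes the gap grow even though nothing is pending, while a heartbeat row written every second keeps the last-applied timestamp fresh.

```python
import datetime as dt

def reported_latency(now: dt.datetime, last_applied: dt.datetime) -> dt.timedelta:
    """Latency as monitoring of this kind would compute it (illustrative)."""
    return now - last_applied

now = dt.datetime(2024, 1, 1, 12, 0, 0)

# No DML for 10 minutes: the last applied record is stale, so the
# reported latency balloons even though replication is fully caught up.
quiet_source = dt.datetime(2024, 1, 1, 11, 50, 0)
print(reported_latency(now, quiet_source))   # 0:10:00

# With a heartbeat row updated every second, the last applied record is
# at most about one second old, so the reading stays accurate.
heartbeat = dt.datetime(2024, 1, 1, 11, 59, 59)
print(reported_latency(now, heartbeat))      # 0:00:01
```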
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free of charge |
| Incremental data synchronization | Charged — see Billing overview |
Supported synchronization topologies
One-way one-to-one synchronization
One-way one-to-many synchronization
One-way cascade synchronization
One-way many-to-one synchronization
For details, see Synchronization topologies.
SQL operations that can be synchronized
| Operation type | SQL statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
Create a synchronization task
Step 1: Open the Data Synchronization Tasks page
Log on to the Data Management (DMS) console.
In the top navigation bar, click Data + AI.
In the left-side navigation pane, choose DTS > Data Synchronization.
The navigation path may vary by console mode and layout. See Simple mode and Customize the layout and style of the DMS console. You can also go directly to the Data Synchronization Tasks page.
Step 2: Select the region
On the right side of Data Synchronization Tasks, select the region where your synchronization instance resides.
In the new DTS console, select the region from the top navigation bar.
Step 3: Configure source and destination databases
Click Create Task. In the wizard, configure the following parameters.
Task information
| Parameter | Description |
|---|---|
| Task Name | A name for the DTS task. DTS generates a name automatically. Specify a descriptive name to help identify the task — uniqueness is not required. |
Source database
| Parameter | Description |
|---|---|
| Select a DMS database instance | Select an existing database instance, or leave blank and configure manually. If you select an existing instance, DTS auto-fills the remaining parameters. |
| Database Type | Select DB2 for LUW. |
| Connection Type | Select the access method based on where the source database is deployed. This example uses Self-managed Database on ECS. If your source is a self-managed database, set up the network environment first — see Preparation overview. |
| Instance Region | The region where the source Db2 for LUW database resides. |
| Replicate Data Across Alibaba Cloud Accounts | Whether to synchronize data across Alibaba Cloud accounts. This example uses No. |
| ECS Instance ID | The ID of the Elastic Compute Service (ECS) instance hosting the source database. |
| Port Number | The service port of the source Db2 for LUW database. Default: 50000. |
| Database Name | The name of the source Db2 for LUW database. |
| Database Account | The username for connecting to the source database. The account requires database administrator permissions. |
| Database Password | The password for the database account. |
Destination database
| Parameter | Description |
|---|---|
| Select a DMS database instance | Select an existing database instance, or leave blank and configure manually. |
| Database Type | Select Kafka. |
| Connection Type | Select the access method based on where the Kafka cluster is deployed. This example uses Self-managed Database on ECS. See Preparation overview for network setup requirements. |
| Instance Region | The region where the destination Kafka cluster resides. |
| ECS Instance ID | The ID of the ECS instance hosting the Kafka cluster. For a multi-node cluster, select any one node — DTS automatically discovers topic information for all nodes. |
| Port Number | The service port of the Kafka cluster. Default: 9092. |
| Database Account | The username for connecting to the Kafka cluster. Leave blank if authentication is not enabled. |
| Database Password | The password for the Kafka account. Leave blank if authentication is not enabled. |
| Kafka Version | The version of the self-managed Kafka cluster. For version 1.0 or later, select Later Than 1.0. |
| Encryption | The connection encryption method. Select Non-encrypted or SCRAM-SHA-256 based on your business and security requirements. |
| Topic | The destination topic. Select from the drop-down list. |
| Topic That Stores DDL Information | The topic for storing DDL information. If left blank, DDL information is stored in the topic specified by Topic. |
| Use Kafka Schema Registry | Whether to use Kafka Schema Registry for Avro schema storage and retrieval via a RESTful API. Select No to skip, or Yes and provide the URL or IP address of your Kafka Schema Registry. |
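The connection parameters above map onto a standard Kafka client configuration. The sketch below is an assumption-laden illustration, not DTS internals: the property names follow the common librdkafka-style convention, and the host, port, and credentials are placeholders. It shows how the Encryption choice (Non-encrypted vs. SCRAM-SHA-256) changes the client settings:

```python
from typing import Optional

def kafka_client_config(host: str, port: int,
                        user: Optional[str] = None,
                        password: Optional[str] = None) -> dict:
    """Build a Kafka client property map (illustrative only)."""
    config = {"bootstrap.servers": f"{host}:{port}"}
    if user is None:
        # "Non-encrypted" with no authentication: plaintext listener.
        config["security.protocol"] = "PLAINTEXT"
    else:
        # "SCRAM-SHA-256": SASL authentication on the connection.
        config.update({
            "security.protocol": "SASL_PLAINTEXT",
            "sasl.mechanism": "SCRAM-SHA-256",
            "sasl.username": user,
            "sasl.password": password,
        })
    return config

print(kafka_client_config("192.0.2.10", 9092))                        # no auth
print(kafka_client_config("192.0.2.10", 9092, "dts_user", "secret"))  # SCRAM
```

If authentication is not enabled on the cluster, leave Database Account and Database Password blank, which corresponds to the plaintext branch above.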
Step 4: Test connectivity
Click Test Connectivity and Proceed at the bottom of the page.
DTS automatically adds its server CIDR blocks to the security settings of Alibaba Cloud database instances and ECS-hosted databases. For databases in data centers or on third-party clouds, manually add the DTS server CIDR blocks to the database whitelist — see Add the CIDR blocks of DTS servers.
Adding DTS CIDR blocks to whitelists or security group rules introduces security exposure. Before proceeding, take preventive measures such as: strengthening username and password security, restricting exposed ports, authenticating API calls, auditing whitelist and security group rules regularly, and removing unauthorized CIDR blocks. For higher security, connect the database to DTS over Express Connect, VPN Gateway, or Smart Access Gateway.
Step 5: Configure objects and advanced settings
Basic settings
| Parameter | Description |
|---|---|
| Synchronization Types | By default, Incremental Data Synchronization is selected. Also select Schema Synchronization and Full Data Synchronization. DTS runs full synchronization first to copy existing data, which serves as the baseline for incremental synchronization. |
| Processing Mode of Conflicting Tables | How DTS handles destination tables that share names with source tables: Precheck and Report Errors (default) — fails the precheck if identical table names exist; Clear Destination Table — clears data from matching destination tables before synchronization (use with caution); Ignore Errors and Proceed — skips the name conflict check. If you choose this option, data inconsistency may occur: during full synchronization, existing records with matching primary keys are retained; during incremental synchronization, existing records are overwritten. If schemas differ, some columns may not be synchronized or the task may fail. To resolve name conflicts without deleting destination tables, use the object name mapping feature — see Map object names. |
| Data Format in Kafka | The format for records stored in the destination Kafka topic. Default: DTS Avro. For format details, see Data formats in a message queue. |
| Policy for Shipping Data to Kafka Partitions | How DTS routes records to Kafka partitions. See Specify the policy for synchronizing data to Kafka partitions. |
| Capitalization of Object Names in Destination Instance | Controls whether database, table, and column names in the destination are uppercased or lowercased. Default: DTS default policy. See Specify the capitalization of object names in the destination instance. |
| Source Objects | Select objects from the Source Objects section and click the right-arrow icon to move them to Selected Objects. You can select columns, tables, or databases. Selecting tables or columns excludes views, triggers, and stored procedures. |
| Selected Objects | To rename a single object in the destination, right-click it in this section — see Map the name of a single object. To rename multiple objects at once, click Batch Edit — see Map multiple object names at a time. To filter rows by SQL conditions, right-click an object and specify WHERE conditions — see Specify filter conditions. |
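The partition-shipping policy above determines ordering guarantees in the destination topic. A common approach — sketched here as an assumption rather than DTS's exact algorithm — hashes a routing key such as the table name or primary-key value, so all changes for one table (or one row) land in the same partition and stay ordered:

```python
import hashlib

def route_to_partition(routing_key: str, num_partitions: int) -> int:
    """Map a routing key (e.g. 'db.table' or a primary-key value) to a
    partition. A stable hash sends the same key to the same partition
    every time; illustrative only."""
    digest = hashlib.md5(routing_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# All changes to one table go to one partition, preserving their order.
p1 = route_to_partition("sales.orders", 12)
p2 = route_to_partition("sales.orders", 12)
assert p1 == p2          # deterministic routing

print(route_to_partition("sales.orders", 12))
print(route_to_partition("sales.customers", 12))
```

Routing by table keeps per-table ordering; routing by primary key spreads load across partitions while keeping per-row ordering. See the linked policy topic for the options DTS actually offers.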
Advanced settings
| Parameter | Description |
|---|---|
| Monitoring and Alerting | Whether to enable alerting for task failures or high synchronization latency. Select No to skip, or Yes and configure the alert threshold and notification settings — see Configure monitoring and alerting when you create a DTS task. |
| Retry Time for Failed Connections | How long DTS retries failed connections after the task starts. Range: 10–1440 minutes. Default: 720 minutes. We recommend that you set this to more than 30 minutes. If DTS reconnects within this window, the task resumes; otherwise, it fails. If multiple tasks share the same source or destination database, the shortest retry window takes effect. Note that DTS charges for the instance during retry attempts. |
| Configure ETL | Whether to enable extract, transform, and load (ETL) processing. Select Yes to enter data transformation statements in the code editor — see Configure ETL in a data migration or data synchronization task. Select No to skip. For an ETL overview, see What is ETL? |
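The Retry Time for Failed Connections behavior can be modeled as a deadline loop: DTS keeps retrying until the connection succeeds or the window elapses. This is a simplified sketch — the retry interval and scheduling are assumptions, since DTS's actual scheduler is not public:

```python
def run_with_retry(connect, max_retry_minutes: int = 720) -> bool:
    """Retry a failed connection until it succeeds or the retry window
    (default 720 minutes) elapses; illustrative model only."""
    elapsed = 0
    interval = 1                       # retry every minute (assumed)
    while elapsed <= max_retry_minutes:
        if connect(elapsed):
            return True                # reconnected: the task resumes
        elapsed += interval
    return False                       # window exhausted: the task fails

# A source that recovers after 30 simulated minutes of downtime.
recovers = lambda elapsed: elapsed >= 30
print(run_with_retry(recovers))                               # True

# A source that never recovers within a 10-minute window.
print(run_with_retry(lambda _: False, max_retry_minutes=10))  # False
```

This is also why the recommended window is more than 30 minutes: a shorter window can fail the task during routine maintenance on the source, while you remain billed for the instance during the retries.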
Step 6: Run the precheck
Click Next: Save Task Settings and Precheck.
To preview the API parameters for this task configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
DTS runs a precheck before starting the task. If any item fails:
Click View Details next to the failed item, resolve the issue, and run the precheck again.
For alert items that can be ignored: click Confirm Alert Details, then Ignore in the dialog, click OK, and click Precheck Again. Ignoring alerts may lead to data inconsistency.
Step 7: Purchase the instance
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the purchase page, configure the following:
| Parameter | Description |
|---|---|
| Billing Method | Subscription — pay upfront for a fixed term (1–9 months, or 1, 2, 3, or 5 years). More cost-effective for long-term use. Pay-as-you-go — billed hourly. Suitable for short-term use. Release the instance when no longer needed to stop charges. |
| Resource Group Settings | The resource group for this instance. Default: default resource group. See What is Resource Management? |
| Instance Class | The synchronization throughput class. Select based on your data volume and latency requirements. See Instance classes of data synchronization instances. |
| Subscription Duration | The subscription term. Available only for the Subscription billing method. |
Read and select Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start. In the confirmation dialog, click OK.
The task appears in the task list. Monitor its progress there.