Use Data Transmission Service (DTS) to continuously synchronize data from a self-managed SQL Server database to a PolarDB for MySQL cluster — including schema, full historical data, and ongoing changes.
Prerequisites
Before you begin, ensure that you have:
A self-managed SQL Server database running a version supported by DTS. For supported versions, see Overview of data synchronization scenarios.
A PolarDB for MySQL cluster already created. See Purchase an Enterprise Edition cluster or Purchase a subscription cluster.
Enough available storage on the PolarDB for MySQL cluster to hold all data from the source SQL Server database
If the source database meets any of the following conditions, split the task into multiple subtasks:
Contains more than 10 databases
Backs up logs more frequently than once per hour
Executes more than 100 DDL statements per hour per database
Writes logs at more than 20 MB/s per database
Requires change data capture (CDC) to be enabled for more than 1,000 tables
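To check the storage prerequisite, you can query how much space the source database currently occupies. A minimal T-SQL sketch (the database name `mytestdata` is a placeholder; `sys.master_files` reports size in 8 KB pages):

```sql
-- Estimate the allocated size of the source database in MB.
SELECT
    DB_NAME(database_id)  AS database_name,
    SUM(size) * 8 / 1024  AS allocated_mb
FROM sys.master_files
WHERE database_id = DB_ID('mytestdata')
GROUP BY database_id;
```

Because full data synchronization uses concurrent INSERT operations, plan for the destination tablespace to end up somewhat larger than this figure.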
Choose an incremental synchronization mode
Before configuring the task, decide which incremental synchronization mode fits your source database. This choice affects both the source database preparation steps and the limits that apply.
| Mode | Supported table types | Impact on source database | Best for |
|---|---|---|---|
| Hybrid log-based parsing | All table types, including heap tables, tables without primary keys, compressed tables, and tables with computed columns | Creates trigger dts_cdc_sync_ddl, heartbeat table dts_sync_progress, and DDL history table dts_cdc_ddl_history; enables CDC on the source database and specific tables | Most SQL Server deployments where the source can tolerate DTS writing artifacts |
| Incremental synchronization based on source database logs | Tables with clustered indexes and primary key columns only; no heap tables, tables without primary keys, compressed tables, or tables with computed columns | Creates trigger dts_cdc_sync_ddl, heartbeat table dts_sync_progress, and DDL history table dts_cdc_ddl_history in the source database | Sources where you cannot modify the database |
| Polling CDC instances | Amazon RDS for SQL Server, Microsoft Azure SQL Database, Microsoft Azure SQL Managed Instance, Microsoft Azure SQL Server on Virtual Machine, and Google Cloud SQL for SQL Server | DTS enables database-level and table-level CDC; causes brief table locks (a few seconds) | Third-party cloud SQL Server databases |
For ApsaraDB RDS for SQL Server instances running the Web edition, you must use Incremental synchronization based on source database logs.
To check whether your tables are heap tables, tables without primary keys, compressed tables, or tables with computed columns, see the FAQ.
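As a quick first pass before consulting the FAQ, the standard SQL Server catalog views can flag these table types. A hedged T-SQL sketch (the output column names are illustrative):

```sql
-- Flag heap tables, tables without primary keys, compressed tables,
-- and tables with computed columns in the current database.
SELECT
    t.name AS table_name,
    MAX(CASE WHEN i.index_id = 0 THEN 1 ELSE 0 END)          AS is_heap,
    CASE WHEN OBJECTPROPERTY(t.object_id, 'TableHasPrimaryKey') = 0
         THEN 1 ELSE 0 END                                   AS has_no_primary_key,
    MAX(CASE WHEN p.data_compression <> 0 THEN 1 ELSE 0 END) AS is_compressed,
    CASE WHEN EXISTS (SELECT 1 FROM sys.computed_columns c
                      WHERE c.object_id = t.object_id)
         THEN 1 ELSE 0 END                                   AS has_computed_columns
FROM sys.tables t
JOIN sys.indexes i    ON i.object_id = t.object_id
JOIN sys.partitions p ON p.object_id = t.object_id
GROUP BY t.name, t.object_id;
```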
Billing
| Synchronization type | Cost |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
Supported synchronization topologies
One-way one-to-one synchronization
One-way one-to-many synchronization
One-way many-to-one synchronization
For all supported topologies, see Synchronization topologies.
SQL operations that can be synchronized
| Operation type | Statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | CREATE TABLE, ADD COLUMN, DROP COLUMN, DROP TABLE, CREATE INDEX, DROP INDEX |
The following DDL operations are not synchronized:
CREATE TABLE for partitioned tables or tables that contain functions
DDL operations involving user-defined types
Online DDL operations
Transactional DDL statements
Limitations
Source database limitations
Tables must have a PRIMARY KEY or UNIQUE constraint, with all fields unique. Without this, the destination database may contain duplicate records.
If you select tables as objects and rename them in the destination database: maximum 5,000 tables per task. For more tables, configure multiple tasks or synchronize at the database level.
Maximum 10 databases per synchronization task. For more, configure multiple tasks.
DTS uses the `fn_log` function to read source logs. Do not clear source database logs before the task completes; doing so may cause the task to fail.
Data logging requirements:
The backup mode must be set to Full, and a full physical backup must be performed.
For incremental-only synchronization: retain logs for at least 24 hours.
For full plus incremental synchronization: retain logs for at least 7 days. After full synchronization completes, you can reduce the retention to 24 hours.
If the source is a read-only instance, DDL operations cannot be synchronized.
If the source database is an Azure SQL Database, a data synchronization task can synchronize data from only one database.
For ApsaraDB RDS for SQL Server instances: disable Transparent Data Encryption (TDE) before starting the task. See Disable TDE.
During schema synchronization and full data synchronization, do not execute DDL statements. Doing so causes the task to fail.
For full data synchronization, use the `READ_COMMITTED_SNAPSHOT` transaction isolation mode on the source database to prevent shared locks from blocking data writes. If you skip this, data inconsistency or instance failures may occur and are not covered by the DTS service level agreement (SLA).
CDC requirements (applies when CDC needs to be enabled on source tables):
The `srvname` field in `sys.sysservers` must match the return value of the `SERVERPROPERTY` function.
For self-managed SQL Server databases, the database owner must be the `sa` user. For ApsaraDB RDS for SQL Server, the database owner must be the `sqlsa` user.
Enterprise edition: SQL Server 2008 or later is required.
Standard edition: SQL Server 2016 SP1 or later is required.
If you run SQL Server 2017 (Standard or Enterprise edition), update to a newer version.
The number of CDC-enabled tables in a task cannot exceed the maximum supported by DTS. If it does, the precheck fails.
If a single field in a CDC-enabled table exceeds 64 KB, run:
```sql
exec sp_configure 'max text repl size', -1;
```
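The `READ_COMMITTED_SNAPSHOT` requirement above can be applied with a short script. A hedged T-SQL sketch, assuming the source database is named `mytestdata` (a placeholder):

```sql
-- Enable READ_COMMITTED_SNAPSHOT before full data synchronization.
ALTER DATABASE mytestdata
SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Verify the setting took effect.
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'mytestdata';
```

`WITH ROLLBACK IMMEDIATE` terminates open transactions so the change does not wait indefinitely; run it during off-peak hours.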
Mode-specific limitations
Hybrid log-based parsing:
Do not add or remove columns within a 10-minute window. For example, the following sequence within 10 minutes causes a task error:
```sql
ALTER TABLE test_table DROP COLUMN Flag;
ALTER TABLE test_table ADD Remark nvarchar(50) NOT NULL DEFAULT('');
```
You cannot run `SELECT INTO` or `TRUNCATE` on CDC-enabled tables. Do not delete the triggers that DTS creates.
The CDC component retains incremental data for 3 days by default. To change the retention period, run:
```sql
exec console.sys.sp_cdc_change_job @job_type = 'cleanup', @retention = <time>;
```
`<time>` is the retention period in minutes. If daily incremental records exceed 10 million, set `<time>` to `1440`.
Keep the CDC-enabled table count at or below 1,000 per task to avoid latency or instability.
Set the maximum write rate for CDC-enabled tables to 1,000 records per second.
Enabling CDC causes brief table locks (a few seconds) on the source database.
Incremental synchronization based on source database logs:
Only tables with clustered indexes that include primary key columns are supported.
Polling CDC instances:
The DTS account must have the `sysadmin` role to enable database-level CDC. For table-level CDC, a privileged account is required.
Microsoft Azure SQL Database (vCore model): all databases are supported. Azure SQL Database (DTU model): only databases with an S3 or higher service tier are supported.
Amazon RDS for SQL Server: privileged accounts can enable CDC at the database level via stored procedures.
Clustered columnstore indexes are not supported for CDC.
Maximum 1,000 tables per task.
Cannot add or remove columns more than twice per minute.
Do not modify CDC instances on the source database during synchronization — this may cause data loss.
CDC retains incremental data for 3 days by default. Use the same retention command as hybrid mode.
Enabling CDC causes brief table locks (a few seconds) on the source database.
General limitations
Schema synchronization supports databases, schemas, and tables. Views, triggers, and stored procedures are not synchronized in cross-database-engine scenarios.
Unsupported data types: `TIMESTAMP`, `CURSOR`, `ROWVERSION`, `SQL_VARIANT`, `HIERARCHYID`, `POLYGON`, `GEOMETRY`, and `GEOGRAPHY`.
If the synchronized data contains 4-byte characters (rare characters or emojis), the destination database and tables must use the `utf8mb4` character set. Set the `character_set_server` parameter on the destination database to `utf8mb4`.
DTS does not support complex DDL operations, DDL operations involving user-defined types, online DDL operations, or transactional DDL statements.
Do not perform reindexing during incremental synchronization — this may cause the task to fail and data may be lost. DTS also cannot synchronize DDL operations related to primary keys on CDC-enabled tables.
If DTS fails to write DDL statements to the destination database, the task continues running. View the failed DDL statements in task logs. See View task logs.
Full data synchronization uses concurrent INSERT operations, which causes fragmentation in destination tables. After full synchronization completes, the destination tablespace is larger than the source.
Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses the read and write resources of the source and destination databases, which may increase the loads on the database servers.
Do not write data from other sources to the destination database during synchronization. Doing so causes data inconsistency.
You cannot add or remove databases when modifying synchronization objects after the task starts.
If the source instance is an ApsaraDB RDS for SQL Server instance, DTS creates an account named `rdsdt_dtsacct`. Do not delete this account or change its password while the task runs. For more information, see System accounts.
If a DTS task fails, DTS technical support attempts to restore it within 8 hours. During restoration, the task may be restarted and task parameters (not database parameters) may be modified.
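The 4-byte character requirement in the list above can be verified on the destination side. A hedged sketch in MySQL syntax, run against the PolarDB for MySQL cluster (`dstdb` is a placeholder database name):

```sql
-- Check the server-level default character set.
SHOW VARIABLES LIKE 'character_set_server';

-- Create a destination database that can store 4-byte characters.
CREATE DATABASE dstdb
    CHARACTER SET utf8mb4
    COLLATE utf8mb4_general_ci;
```

On PolarDB for MySQL, `character_set_server` itself is typically changed through the console's parameter settings rather than with a SQL statement.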
Prepare the source database
Configure log settings on the self-managed SQL Server database before creating the synchronization task. If you are synchronizing from multiple databases, repeat these steps for each database.
Change the recovery model to Full. You can also do this in SQL Server Management Studio (SSMS). See View or Change the Recovery Model of a Database (SQL Server).

```sql
use master;
GO
ALTER DATABASE <database_name> SET RECOVERY FULL WITH ROLLBACK IMMEDIATE;
GO
```

Replace `<database_name>` with the name of your source database. Example:

```sql
use master;
GO
ALTER DATABASE mytestdata SET RECOVERY FULL WITH ROLLBACK IMMEDIATE;
GO
```

Create a logical backup of the source database. Skip this step if a logical backup already exists.

`<database_name>`: the source database name
`<physical_backup_device_name>`: the path and file name of the backup file

```sql
BACKUP DATABASE <database_name> TO DISK='<physical_backup_device_name>';
GO
```

Example:

```sql
BACKUP DATABASE mytestdata TO DISK='D:\backup\dbdata.bak';
GO
```

Create a log backup for the source database.

```sql
BACKUP LOG <database_name> TO DISK='<physical_backup_device_name>' WITH init;
GO
```

Example:

```sql
BACKUP LOG mytestdata TO DISK='D:\backup\dblog.bak' WITH init;
GO
```
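After running the steps above, you can confirm that the recovery model actually switched before creating the DTS task. A minimal T-SQL check (`mytestdata` is the example database name used above):

```sql
-- Confirm the recovery model of the source database.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'mytestdata';
```

The query should report `FULL` in the `recovery_model_desc` column.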
Configure the synchronization task
Go to the Data Synchronization Tasks page.
Log on to the Data Management (DMS) console.
In the top navigation bar, click Data + AI.
In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.
The navigation path may vary based on the DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console. You can also go directly to the Data Synchronization Tasks page.
Select the region where the synchronization instance resides. In the new DTS console, select the region in the top navigation bar.
Click Create Task. Configure the source and destination databases using the following parameters.
| Section | Parameter | Description |
|---|---|---|
| N/A | Task Name | A descriptive name to identify the task. Does not need to be unique. DTS generates a default name. |
| Source Database | Select a DMS database instance | Select an existing DMS database instance to auto-populate the parameters, or leave blank to configure manually. |
| | Database Type | Select SQL Server. |
| | Access Method | Select Express Connect, VPN Gateway, or Smart Access Gateway. |
| | Instance Region | The region where the self-managed SQL Server database resides. If it is a self-managed database, deploy the required network environment first. See Preparation overview. |
| | Replicate Data Across Alibaba Cloud Accounts | Select No for this scenario. |
| | Connected VPC | The ID of the virtual private cloud (VPC) where the source SQL Server database is deployed. |
| | IP address | The server IP address of the source database. |
| | Port Number | The service port. Default: 1433. |
| | Database Account | The account used to connect to the source database. Must have `sysadmin` role permissions. See CREATE USER and GRANT (Transact-SQL). |
| | Database Password | The password for the database account. |
| Destination Database | Select a DMS database instance | Select an existing DMS database instance to auto-populate the parameters, or leave blank to configure manually. |
| | Database Type | Select PolarDB for MySQL. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | The region where the PolarDB for MySQL cluster resides. |
| | PolarDB Cluster ID | The ID of the destination PolarDB for MySQL cluster. |
| | Database Account | The account used to connect to the destination cluster. Must have read and write permissions on the destination database. |
| | Database Password | The password for the database account. |

Click Test Connectivity and Proceed. DTS automatically adds its server CIDR blocks to the whitelist of Alibaba Cloud database instances and to the security group rules of Elastic Compute Service (ECS) instances.
For self-managed databases in data centers or on third-party cloud platforms, manually add the DTS server CIDR blocks to the database whitelist. See CIDR blocks of DTS servers.
Warning: Adding DTS CIDR blocks to whitelists or security group rules introduces security exposure. Before proceeding, take preventive measures such as using strong credentials, limiting exposed ports, authenticating API calls, and regularly auditing whitelist rules. For additional security, connect the database to DTS through Express Connect, VPN Gateway, or Smart Access Gateway.
Configure the objects to synchronize and advanced settings.
| Parameter | Description |
|---|---|
| Synchronization Types | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. Full data synchronization runs first to copy historical data, providing the baseline for incremental synchronization. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): the precheck fails if the destination contains tables with the same names as source tables. Use object name mapping if you need to rename destination tables. Ignore Errors and Proceed: skips the name conflict check. During full synchronization, records with conflicting primary or unique keys are skipped. During incremental synchronization, they overwrite destination records. If schemas differ, initialization may fail or partial data may be lost. |
| SQL Server Incremental Synchronization Mode | Select the mode you chose in Choose an incremental synchronization mode. The UI labels are: Log-based Parsing for Non-heap Tables and CDC-based Incremental Synchronization for Heap Tables (Hybrid Log-based Parsing), Incremental Synchronization Based on Logs of Source Database (Heap tables are not supported), and Polling and querying CDC instances for incremental synchronization. |
| Source Objects | Select the objects to synchronize and click the arrow icon to move them to Selected Objects. In this heterogeneous database scenario, only tables can be synchronized. |
| Selected Objects | To rename a single object in the destination, right-click it and use the rename option. See Map the name of a single object. To rename multiple objects at once, click Batch Edit. See Map multiple object names at a time. To select which SQL operations to synchronize for a specific table, right-click it and configure the operations. To filter rows by condition, right-click the object and specify WHERE conditions. See Specify filter conditions. |

Click Next: Advanced Settings and configure the following parameters.
| Parameter | Description |
|---|---|
| Monitoring and Alerting | Select Yes to receive notifications when the task fails or synchronization latency exceeds a threshold. Configure the alert threshold and notification settings. See Configure monitoring and alerting. |
| Retry Time for Failed Connections | How long DTS retries after a connection failure. Range: 10–1,440 minutes. Default: 720. Set to at least 30 minutes. If DTS reconnects within this period, the task resumes. If multiple tasks share the same source or destination database, the shortest retry time takes effect. When DTS retries a connection, you are charged for the DTS instance. We recommend that you specify the retry time range based on your business requirements, and release the DTS instance promptly after the source and destination instances are released. |
| Retry Time for Other Issues | How long DTS retries after DDL or DML failures. Range: 1–1,440 minutes. Default: 10. Set to at least 10 minutes. Must be less than Retry Time for Failed Connections. |
| Configure ETL | Select Yes to use extract, transform, and load (ETL) to transform data during synchronization. See What is ETL? and Configure ETL. |

Click Next: Save Task Settings and Precheck. To preview the API parameters for this configuration, hover over the button and click Preview OpenAPI parameters before clicking through.
DTS runs a precheck before the task starts. If any check fails, click View Details to see the cause and troubleshoot. Then click Precheck Again. If an alert item can be ignored, click Confirm Alert Details, then Ignore, and then Precheck Again. Ignoring an alert may result in data inconsistency.
Wait for the Success Rate to reach 100%, then click Next: Purchase Instance.
Configure the billing method and instance class for the synchronization instance.
| Section | Parameter | Description |
|---|---|---|
| New Instance Class | Billing Method | Subscription: pay upfront for a fixed term; more cost-effective for long-term use. Pay-as-you-go: billed hourly; suitable for short-term use. Release the instance when you no longer need it to stop billing. |
| | Resource Group Settings | The resource group for the synchronization instance. Default: default resource group. See What is Resource Management? |
| | Instance Class | The synchronization throughput class. See Instance classes of data synchronization instances. |
| | Subscription Duration | Available only for the Subscription billing method. Choose a term of one to nine months, or one, two, three, or five years. |

Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start. In the confirmation dialog, click OK.
The task appears in the task list. You can monitor synchronization progress there.
What's next
Synchronization topologies — set up one-to-many or many-to-one topologies
Map object names — rename objects in the destination database
Configure monitoring and alerting — set up latency and failure alerts
View task logs — inspect DDL failures and task activity