Replication tooling is often treated as a single capability, but moving data between systems involves several distinct engineering problems: transferring a snapshot, keeping two systems aligned as the source changes, and delivering change events to downstream consumers that may not be databases at all. Alibaba Cloud DTS exposes these as three task types: migration, synchronisation, and subscription, each implemented with a different execution model. Selecting the wrong type for a workload is a common source of pipeline defects.
Migration is a bounded, one-time transfer of schema and existing data rows from the source to the target. It suits region relocations, engine upgrades, and consolidations where the source is being decommissioned. Internally, the task executes three sequential phases: schema migration, full data migration, and incremental migration up to cutover.
The schema phase translates source DDL to target DDL, applying type mapping rules for heterogeneous pairs such as Oracle to PolarDB. Type mapping is the most frequent source of defects. Oracle NUMBER without precision, SQL Server DATETIME2 with sub-millisecond precision, and PostgreSQL JSONB columns each map to several possible target types. The pre-check report identifies unmapped types before execution; resolving these warnings upfront is materially cheaper than reconciling them after data has begun flowing.
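The kind of check a pre-check report performs can be sketched as a lookup against a table of candidate target types. The mapping below is purely illustrative and is not the DTS mapping table; the type candidates and the function name `precheck_columns` are assumptions made for this example.

```python
# Hypothetical candidate mappings, for illustration only (not the DTS rules).
CANDIDATE_TARGET_TYPES = {
    "NUMBER": ["DECIMAL(38,10)", "BIGINT", "DOUBLE"],
    "DATETIME2": ["DATETIME(6)", "VARCHAR(32)"],
    "JSONB": ["JSON", "LONGTEXT"],
    "VARCHAR2": ["VARCHAR"],
}


def precheck_columns(columns):
    """Return (column, reason) warnings for columns that need a manual decision.

    columns: iterable of (column_name, source_type) pairs.
    """
    warnings = []
    for name, source_type in columns:
        candidates = CANDIDATE_TARGET_TYPES.get(source_type.upper())
        if candidates is None:
            warnings.append((name, "unmapped"))    # no known target type
        elif len(candidates) > 1:
            warnings.append((name, "ambiguous"))   # several plausible targets
    return warnings


print(precheck_columns([("id", "NUMBER"), ("name", "VARCHAR2"), ("geo", "SDO_GEOMETRY")]))
```

Columns flagged as ambiguous are the ones worth pinning to an explicit target type before the task starts.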
Full migration copies current table contents, with internal parallelism across tables and partitioned worker threads within large tables. DTS records the source log position at the start of full migration; this becomes the resume point for the incremental phase, ensuring changes made during the copy window are not lost. The incremental phase continues until the operator initiates cutover, the moment application traffic is redirected to the target. The task ends after cutover; it is not designed for indefinite operation.
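The role of the recorded log position can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a description of DTS internals: the change-record shape, the integer positions, and the idempotent replay (upsert and tolerant delete) are all hypothetical choices made for the example.

```python
def apply_change(target, change):
    """Apply one change idempotently: upsert for insert/update, tolerant delete."""
    if change["op"] == "delete":
        target.pop(change["pk"], None)
    else:
        target[change["pk"]] = change["row"]


def run_migration(copied_rows, change_log, start_position):
    """Replay every change logged after the position recorded at copy start."""
    target = dict(copied_rows)
    for change in change_log:
        if change["position"] > start_position:
            apply_change(target, change)
    return target


# Row 2 was copied after its update had already happened; replaying the same
# update again is harmless because the replay is idempotent.
copied = {1: "a", 2: "b2"}
log = [
    {"position": 5, "op": "update", "pk": 1, "row": "a"},    # before copy start
    {"position": 11, "op": "update", "pk": 2, "row": "b2"},
    {"position": 12, "op": "insert", "pk": 3, "row": "c"},
    {"position": 13, "op": "delete", "pk": 1},
]
print(run_migration(copied, log, start_position=10))
```

Because the copy is not a single consistent snapshot, some replayed changes may already be reflected in the copied rows, which is why the sketch treats replay as idempotent.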
Synchronisation builds on the same mechanics but is designed for continuous operation. It runs the schema and full-data phases as an initialisation step, followed by a continuous incremental phase intended to remain stable for many months. It suits cross-region replicas, analytical replicas, disaster recovery standbys, and active-active topologies.
Object selection can be scoped to individual tables, views, or columns, with optional filters on the rows being replicated. Column filters are particularly useful for analytical replicas: source tables often carry audit fields or encrypted columns that the target does not need, and excluding them from the selection ensures that the data never reaches the target.
DDL replication scope is a frequent source of problems in long-running tasks. DTS replicates only a limited set of DDL statements, including CREATE TABLE, ALTER TABLE, ADD COLUMN, and DROP TABLE. RENAME TABLE and engine-specific DDL fall outside this scope. If development teams regularly run DDL outside the supported set, the best practice is to establish a schema-change process that restricts changes to the supported statements.
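The effect of column and row filters can be sketched as a projection applied to each row before it is written to the target. The function name, the row shape, and the example columns are hypothetical; in DTS these choices are made in the task's object selection rather than in user code.

```python
def replicate_row(row, columns, row_filter=None):
    """Return the projected row, or None if the row filter excludes it."""
    if row_filter is not None and not row_filter(row):
        return None
    # Only selected columns are carried over; everything else is dropped.
    return {col: row[col] for col in columns if col in row}


source_row = {"id": 1, "amount": 250, "region": "eu", "audit_user": "ops", "card_enc": "..."}
selected = ["id", "amount", "region"]

print(replicate_row(source_row, selected, row_filter=lambda r: r["region"] == "eu"))
```

Selecting columns by inclusion, as here, means a newly added sensitive column stays out of the replica until someone explicitly opts it in.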
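One way to enforce such a process is a review-time check that flags statements outside the supported set. The sketch below mirrors only the statements named in the paragraph above; the regular expressions are a rough approximation, the function name is hypothetical, and the authoritative list for a given engine pair should be taken from the DTS documentation.

```python
import re

# Statement prefixes named in the text as replicated; a deliberate simplification.
SUPPORTED_DDL = [
    re.compile(r"^\s*CREATE\s+TABLE\b", re.IGNORECASE),
    re.compile(r"^\s*ALTER\s+TABLE\b", re.IGNORECASE),
    re.compile(r"^\s*DROP\s+TABLE\b", re.IGNORECASE),
]


def unsupported_statements(statements):
    """Return the statements that match none of the supported prefixes."""
    return [
        stmt for stmt in statements
        if not any(pattern.match(stmt) for pattern in SUPPORTED_DDL)
    ]


change_set = [
    "CREATE TABLE orders_archive (id BIGINT PRIMARY KEY)",
    "alter table orders add column note VARCHAR(64)",
    "RENAME TABLE orders TO orders_old",
    "DROP TABLE tmp_import",
]
print(unsupported_statements(change_set))
```

A check like this fits naturally into a migration-review step, so an out-of-scope statement is caught before it runs on the source.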
For bidirectional active-active topologies, DTS prevents replication loops through server_id divergence and origin tagging. The dominant design risk is overlapping writes to the same row modified at both endpoints within the latency window. Conflict-handling modes (ignore, overwrite, task-stop) provide mechanical resolution, but none produce semantically correct results when conflicting writes carry independent business meaning. Partitioning write ownership at the row range, table, or shard level is a robust pattern; conflict handling serves as a safety net.
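The limits of mechanical conflict handling can be seen in a small sketch. The mode names follow the text; their behaviour here (keep the existing row, take the incoming row, or halt) is a simplified reading, and the modulo-based ownership rule is a hypothetical example of partitioning writes.

```python
def resolve_conflict(mode, existing_row, incoming_row):
    """Return the row the target keeps when both sides changed the same key."""
    if mode == "ignore":
        return existing_row       # keep the target's version, drop the incoming one
    if mode == "overwrite":
        return incoming_row       # incoming version replaces the target's
    if mode == "task-stop":
        raise RuntimeError("conflict detected; task halted for manual review")
    raise ValueError(f"unknown mode: {mode}")


def owning_region(customer_id, regions):
    """Hypothetical ownership rule: each key is writable in exactly one region."""
    return regions[customer_id % len(regions)]


REGIONS = ["region-a", "region-b"]
debit = {"id": 7, "balance": 90}     # one side subtracted 10 from a balance of 100
credit = {"id": 7, "balance": 120}   # the other side added 20 to the same balance

print(resolve_conflict("ignore", debit, credit)["balance"])
print(resolve_conflict("overwrite", debit, credit)["balance"])
print(owning_region(7, REGIONS))
```

Neither mode yields 110, the balance that reflects both updates. Routing every write for a given key to its owning region avoids the conflict in the first place.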
Latency, the difference between the source commit timestamp and target apply timestamp, is the primary observability metric. Sustained latency above operational tolerance typically indicates missing indexes on frequently updated target tables, target resource saturation, or source log growth exceeding target apply throughput. Per-table apply latency is exposed in the console, providing the appropriate surface for isolating slow tables from the rest of the pipeline.
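The latency definition above translates directly into a per-table calculation. The sketch assumes latency samples have already been collected as (table, source commit time, target apply time) tuples; how those samples are obtained, and the function name, are assumptions of this example rather than a DTS API.

```python
from datetime import datetime, timedelta


def slow_tables(samples, tolerance):
    """Return {table: worst_latency} for tables whose worst latency exceeds tolerance.

    samples: iterable of (table, source_commit_ts, target_apply_ts) tuples.
    """
    worst = {}
    for table, committed, applied in samples:
        lag = applied - committed
        if table not in worst or lag > worst[table]:
            worst[table] = lag
    return {table: lag for table, lag in worst.items() if lag > tolerance}


t0 = datetime(2024, 1, 1, 12, 0, 0)
samples = [
    ("orders", t0, t0 + timedelta(seconds=2)),
    ("orders", t0, t0 + timedelta(seconds=45)),
    ("users", t0, t0 + timedelta(seconds=1)),
]
print(slow_tables(samples, tolerance=timedelta(seconds=30)))
```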
Subscription operates on a different model: it has no target endpoint in the conventional sense. The task captures source change events and exposes them as a consumable stream over an SDK or Kafka-compatible interface, leaving the consumer to decide how each event is processed. The model suits cache invalidation services, search-index maintainers projecting changes into Elasticsearch, audit pipelines persisting changes to immutable storage, and event-driven workflows reacting to specific table updates.
The Kafka-compatible interface allows consumers built on standard Kafka tooling to subscribe without DTS-specific client code; the native SDK exposes consumption position directly, which suits consumers requiring fine-grained position management. Events are serialised in Canal JSON, Avro, or DTS Maxwell-compatible format, each carrying operation type, before and after row images, and source metadata.
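A consumer typically uses the before and after row images to work out what actually changed. The sketch below uses a hypothetical event shape with `op`, `before`, and `after` fields; it does not reproduce the exact schema of Canal JSON, Avro, or the Maxwell-compatible format.

```python
def changed_columns(event):
    """Return {column: (old, new)} for columns whose value differs in an update."""
    if event["op"] != "update":
        return {}
    before, after = event["before"], event["after"]
    return {
        col: (before.get(col), after[col])
        for col in after
        if before.get(col) != after[col]
    }


# Hypothetical event, as a consumer might see it after deserialisation.
event = {
    "op": "update",
    "table": "users",
    "before": {"id": 1, "name": "a", "age": 3},
    "after": {"id": 1, "name": "b", "age": 3},
}
print(changed_columns(event))
```

A cache invalidation service, for instance, could skip events where none of the cached columns appear in the result.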
Ordering is guaranteed per primary key: events affecting the same row arrive in source-commit order, but events affecting different rows may be parallelised. Consumers requiring strict global ordering must restrict themselves to a single partition or implement reordering against the embedded commit timestamp. The configurable retention window determines how long change events are kept; the value should be sized against the measured recovery time for the downstream consumer fleet rather than left at a default.
The three task types are not interchangeable. Migration is bounded and ends at cutover; synchronisation is continuous alignment between database endpoints; subscription is a change-event stream consumed by non-database systems. The decision is best made by examining the downstream system: if it is a database that should mirror the source for a defined period and then take over, migration fits; if it should mirror indefinitely, synchronisation fits; if it is anything else (a cache, a search index, an event processor), subscription fits, and the consumer owns the logic that translates events into its own state model.
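Reordering against the commit timestamp can be sketched as a merge of per-partition batches that are each already in commit order. The `commit_ts` field name and the batch-based approach are assumptions of this example; a live stream would additionally need a watermark to know when no earlier event can still arrive.

```python
import heapq


def merge_by_commit_time(*partitions):
    """Merge per-partition event lists, each sorted by commit_ts, into one ordered list."""
    return list(heapq.merge(*partitions, key=lambda event: event["commit_ts"]))


partition_0 = [{"commit_ts": 1, "pk": "a"}, {"commit_ts": 4, "pk": "a"}]
partition_1 = [{"commit_ts": 2, "pk": "b"}, {"commit_ts": 3, "pk": "b"}]

merged = merge_by_commit_time(partition_0, partition_1)
print([event["commit_ts"] for event in merged])
```

Events with equal timestamps keep the order of the partitions they came from, so ties across different rows are not resolved by this sketch.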
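The decision rule can be written down as a small function. The parameter names are hypothetical, and the function captures only the two questions posed above, not every factor a real selection would weigh.

```python
def choose_task_type(downstream_is_database, mirrors_indefinitely=False):
    """Pick a DTS task type from the nature of the downstream system."""
    if not downstream_is_database:
        # Caches, search indexes, event processors: the consumer owns the logic.
        return "subscription"
    return "synchronisation" if mirrors_indefinitely else "migration"


print(choose_task_type(downstream_is_database=True))
print(choose_task_type(downstream_is_database=True, mirrors_indefinitely=True))
print(choose_task_type(downstream_is_database=False))
```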
Disclaimer: The views expressed herein are for reference only and don’t necessarily represent the official views of Alibaba Cloud.