Migration, Synchronisation, or Subscription: Picking the Right DTS Task Type

Alibaba Cloud Data Transmission Service (DTS) supports three task types: migration, synchronisation, and subscription, each with distinct mechanics and use cases.

Replication tooling is often treated as a single capability, but moving data between systems involves several distinct engineering problems: transferring a snapshot, keeping two systems aligned as the source changes, and delivering change events to downstream consumers that may not be databases at all. Alibaba Cloud DTS exposes these as three task types: migration, synchronisation, and subscription, each implemented with a different execution model. Selecting the wrong type for a workload is a common source of pipeline defects.

Data Migration

Migration is a bounded, one-time transfer of schema and existing data rows from the source to the target. It suits region relocations, engine upgrades, and consolidations where the source is being decommissioned. Internally, the task executes three sequential phases: schema migration, full data migration, and incremental migration up to cutover.

The schema phase translates source DDL to target DDL, applying type mapping rules for heterogeneous pairs such as Oracle to PolarDB. Type mapping is the most frequent source of defects. Oracle NUMBER without precision, SQL Server DATETIME2 with sub-millisecond precision, and PostgreSQL JSONB columns each map to several possible target types. The pre-check report identifies unmapped types before execution; resolving these warnings upfront is materially cheaper than reconciling them after data has begun flowing.
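As a rough illustration of reviewing type mappings before execution, the sketch below flags columns whose source type has zero or several candidate target types. The mapping table, the `Column` class, and `find_ambiguous_columns` are hypothetical names invented for this example; they do not reproduce the actual DTS mapping rules or the pre-check report format.

```python
from dataclasses import dataclass

# Hypothetical mapping table for illustration only. Real mapping rules depend
# on the source/target engine pair and are not reproduced here.
CANDIDATE_TARGET_TYPES = {
    "NUMBER": ["NUMERIC", "BIGINT", "DOUBLE PRECISION"],
    "DATETIME2": ["TIMESTAMP(6)", "VARCHAR(32)"],
    "JSONB": ["JSON", "TEXT"],
    "VARCHAR2": ["VARCHAR"],
}


@dataclass(frozen=True)
class Column:
    table: str
    name: str
    source_type: str


def find_ambiguous_columns(columns):
    """Return (column, candidates) pairs that need a human decision.

    A column is flagged when its source type has no known mapping or
    more than one plausible target type.
    """
    flagged = []
    for col in columns:
        candidates = CANDIDATE_TARGET_TYPES.get(col.source_type.upper(), [])
        if len(candidates) != 1:
            flagged.append((col, candidates))
    return flagged


columns = [
    Column("orders", "amount", "NUMBER"),
    Column("orders", "note", "VARCHAR2"),
    Column("orders", "geo", "SDO_GEOMETRY"),
]

for col, candidates in find_ambiguous_columns(columns):
    print(f"{col.table}.{col.name}: {col.source_type} -> {candidates or 'UNMAPPED'}")
```

A list like this is cheap to produce from the source catalogue and gives the team a concrete set of decisions to make before the task starts.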

Full migration copies current table contents, with internal parallelism across tables and partitioned worker threads within large tables. DTS records the source log position at the start of full migration; this becomes the resume point for the incremental phase, ensuring changes during the copy window are not lost. The incremental phase continues until the operator initiates cutover, the moment application traffic is redirected to the target. The task ends after cutover; it is not designed for indefinite operation.
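The role of the recorded log position can be shown with a small in-memory simulation. This is a sketch of the general snapshot-plus-replay technique, not DTS internals: the dictionaries stand in for tables, the list stands in for the source log, and `apply_change` and `run_migration` are hypothetical helpers.

```python
def apply_change(rows, change):
    """Apply one change idempotently: upsert on a value, delete on None."""
    pk, value = change
    if value is None:
        rows.pop(pk, None)
    else:
        rows[pk] = value


def run_migration(source, log, writes_during_copy):
    """Simulate a full copy with concurrent writes, then incremental replay."""
    resume_position = len(log)  # recorded before the full copy starts
    target = {}
    keys = sorted(source)
    half = len(keys) // 2

    # First half of the full copy.
    for pk in keys[:half]:
        target[pk] = source[pk]

    # Application writes land on the source while the copy is in flight.
    for change in writes_during_copy:
        apply_change(source, change)
        log.append(change)

    # Second half of the full copy sees some, but not all, of those writes.
    for pk in keys[half:]:
        if pk in source:
            target[pk] = source[pk]

    # Incremental phase: replay everything logged since the recorded position.
    for change in log[resume_position:]:
        apply_change(target, change)
    return target, resume_position


source = {1: "a", 2: "b", 3: "c", 4: "d"}
log = [(1, "a")]  # history that predates the migration
writes = [(1, "a2"), (4, None), (5, "e")]
target, position = run_migration(source, log, writes)
print(target == source, position)
```

The replay is safe only because each change is applied idempotently: some changes made during the copy window are already reflected in the snapshot and are applied a second time without harm.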

Data Synchronisation

Synchronisation applies the same mechanics to continuous operation. It uses the same schema and full-data phases as initialisation, followed by a continuous incremental phase intended to remain stable for many months. It suits cross-region replicas, analytical replicas, disaster recovery standbys, and active-active topologies.

Object selection can be scoped to individual tables, views, or columns, with optional filters on the rows being replicated. Column filters are particularly useful for analytical replicas: source tables often carry audit or encrypted columns that the target does not need, and excluding them from the selection ensures that data never reaches the target.
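The effect of column and row filters can be sketched in a few lines. The function `replicate_row` and the column names are hypothetical and only model the outcome; in DTS the selection is configured on the task rather than written as code.

```python
def replicate_row(row, include_columns, row_filter=None):
    """Return the projected row, or None when the row filter excludes it."""
    if row_filter is not None and not row_filter(row):
        return None
    # Only explicitly selected columns ever leave the source side.
    return {name: row[name] for name in include_columns if name in row}


source_row = {
    "id": 7,
    "region": "eu",
    "total": 120,
    "audit_user": "ops",
    "card_token": "x9",
}

replica_row = replicate_row(
    source_row,
    include_columns=["id", "region", "total"],
    row_filter=lambda r: r["region"] == "eu",
)
print(replica_row)
```

Selecting columns by inclusion rather than exclusion means a sensitive column added to the source later is not replicated by accident.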

DDL replication scope is a frequent source of problems in long-running tasks. DTS replicates a limited set of DDL statements, including CREATE TABLE, ALTER TABLE (for example ADD COLUMN), and DROP TABLE. RENAME TABLE and engine-specific DDL fall outside this scope. If development teams routinely run DDL outside the supported set, the best practice is to establish a schema-change process that restricts changes to supported statements.
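One way to enforce such a process is a review gate that checks proposed DDL against an allowlist before it reaches the source. The sketch below is an assumption-laden example: the allowlist mirrors the statements named above, `is_replicated` is a hypothetical helper, and treating `ALTER TABLE ... RENAME TO` as out of scope is a conservative choice made for this example rather than documented DTS behaviour.

```python
import re

# Statement prefixes assumed to be in scope, taken from the list above.
REPLICATED_PREFIXES = ("CREATE TABLE", "ALTER TABLE", "DROP TABLE")


def normalise(statement):
    """Collapse whitespace and upper-case the statement for prefix matching."""
    return re.sub(r"\s+", " ", statement.strip()).upper()


def is_replicated(statement):
    """Return True if the statement falls inside the assumed replication scope."""
    text = normalise(statement)
    if not text.startswith(REPLICATED_PREFIXES):
        return False
    # A table rename expressed through ALTER TABLE is treated as out of scope
    # here (conservative assumption made for this example).
    if text.startswith("ALTER TABLE") and re.search(r"\bRENAME (TO|AS)\b", text):
        return False
    return True


proposed = [
    "ALTER TABLE orders ADD COLUMN note VARCHAR(64)",
    "RENAME TABLE orders TO orders_old",
    "ALTER TABLE orders RENAME TO orders_old",
]
for statement in proposed:
    verdict = "ok" if is_replicated(statement) else "needs manual handling"
    print(f"{verdict}: {statement}")
```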

For bidirectional active-active topologies, DTS prevents replication loops through server_id divergence and origin tagging. The dominant design risk is overlapping writes: the same row modified at both endpoints within the latency window. Conflict-handling modes (ignore, overwrite, task-stop) provide mechanical resolution, but none produces semantically correct results when conflicting writes carry independent business meaning. Partitioning write ownership at the row-range, table, or shard level is the more robust pattern; conflict handling serves as a safety net.
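Write-ownership partitioning can be as simple as a routing rule in the application layer. The following sketch assumes ownership by primary-key range; the endpoint names, the ranges, and the `route_write` helper are all hypothetical.

```python
# Hypothetical ownership map: each endpoint owns a disjoint primary-key range.
OWNERS = {
    "cn": range(0, 500_000),
    "sg": range(500_000, 1_000_000),
}


def owning_endpoint(row_id):
    """Return the endpoint that is allowed to write the given row."""
    for endpoint, id_range in OWNERS.items():
        if row_id in id_range:
            return endpoint
    raise ValueError(f"row id {row_id} has no owner")


def route_write(row_id, local_endpoint):
    """Decide whether a write is applied locally or forwarded to the owner."""
    owner = owning_endpoint(row_id)
    if owner == local_endpoint:
        return "apply-locally"
    return f"forward-to:{owner}"


print(route_write(10, "cn"))
print(route_write(600_000, "cn"))
```

Because a given row is only ever written at one endpoint, two endpoints cannot modify it within the same latency window, and the configured conflict-handling mode is reached only when the routing rule is bypassed.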

Latency, the difference between the source commit timestamp and target apply timestamp, is the primary observability metric. Sustained latency above operational tolerance typically indicates missing indexes on frequently updated target tables, target resource saturation, or source log growth exceeding target apply throughput. Per-table apply latency is exposed in the console, providing the appropriate surface for isolating slow tables from the rest of the pipeline.
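The latency definition above translates directly into a per-table calculation. This sketch assumes apply records are available as `(table, commit_ts, apply_ts)` tuples in seconds, which is a format invented for the example; the console exposes the metric without any such export.

```python
from collections import defaultdict


def per_table_latency(events):
    """Return the worst observed apply latency per table, in seconds."""
    latencies = defaultdict(list)
    for table, commit_ts, apply_ts in events:
        latencies[table].append(apply_ts - commit_ts)
    return {table: max(values) for table, values in latencies.items()}


def slow_tables(events, tolerance_seconds):
    """Return tables whose worst latency exceeds the operational tolerance."""
    worst = per_table_latency(events)
    return sorted(t for t, value in worst.items() if value > tolerance_seconds)


events = [
    ("orders", 100, 101),
    ("orders", 110, 118),
    ("users", 100, 100),
]
print(per_table_latency(events))
print(slow_tables(events, tolerance_seconds=5))
```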

Change Tracking (Subscription)

Subscription operates on a different model: it has no target endpoint in the conventional sense. The task captures source change events and exposes them as a consumable stream over an SDK or Kafka-compatible interface, leaving the consumer to decide how each event is processed. The model suits cache invalidation services, search-index maintainers projecting changes into Elasticsearch, audit pipelines persisting changes to immutable storage, and event-driven workflows reacting to specific table updates.

The Kafka-compatible interface allows consumers built on standard Kafka tooling to subscribe without DTS-specific client code; the native SDK exposes consumption position directly, which suits consumers requiring fine-grained position management. Events are serialised in Canal JSON, Avro, or DTS Maxwell-compatible format, each carrying operation type, before and after row images, and source metadata.
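As an example of what a consumer does with such an event, the sketch below derives cache keys to invalidate from one JSON change event. The field names (`type`, `table`, `isDdl`, `pkNames`, `data`, `old`) follow the flat-message layout used by the open-source Canal project and are an assumption here; the payload a subscription task actually emits should be checked before relying on them. `keys_to_invalidate` is a hypothetical helper.

```python
import json


def keys_to_invalidate(raw_event):
    """Return cache keys of the form 'table:pk' for every row in the event."""
    event = json.loads(raw_event)
    if event.get("isDdl"):
        return []  # schema changes carry no row images to invalidate
    table = event["table"]
    pk_names = event.get("pkNames") or []
    keys = []
    for row in event.get("data") or []:
        pk = ":".join(str(row[name]) for name in pk_names)
        keys.append(f"{table}:{pk}")
    return keys


# Sample event built by hand for illustration, not captured from a real task.
raw = json.dumps({
    "type": "UPDATE",
    "database": "shop",
    "table": "products",
    "isDdl": False,
    "pkNames": ["id"],
    "data": [{"id": "42", "price": "9.99"}],
    "old": [{"price": "12.00"}],
})
print(keys_to_invalidate(raw))
```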

Ordering is guaranteed per primary key: events affecting the same row arrive in source-commit order, but events affecting different rows may be parallelised. Consumers requiring strict global ordering must restrict themselves to a single partition or implement reordering against the embedded commit timestamp. The configurable retention window determines how long change events are kept; the value should be sized against the measured recovery time for the downstream consumer fleet rather than left at a default.
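Both options can be illustrated with a toy stream. The sketch assumes each event carries a primary key, a commit timestamp, and a sequence number to break timestamp ties; the `seq` field, the event shape, and the helper names are hypothetical, and the reordering step sorts a fully buffered batch, which a real consumer would replace with a bounded window.

```python
import zlib
from collections import defaultdict


def partition_for(primary_key, partition_count):
    """Map a primary key to a partition with a hash that is stable across runs."""
    return zlib.crc32(str(primary_key).encode("utf-8")) % partition_count


def fan_out(events, partition_count):
    """Distribute events so that all events for one key share a partition."""
    partitions = defaultdict(list)
    for event in events:
        partitions[partition_for(event["pk"], partition_count)].append(event)
    return partitions


def global_order(partitions):
    """Rebuild a single commit-ordered stream from a buffered batch."""
    merged = [event for part in partitions.values() for event in part]
    return sorted(merged, key=lambda e: (e["commit_ts"], e["seq"]))


events = [
    {"pk": "A", "commit_ts": 1, "seq": 0, "op": "INSERT"},
    {"pk": "B", "commit_ts": 1, "seq": 1, "op": "INSERT"},
    {"pk": "A", "commit_ts": 2, "seq": 0, "op": "UPDATE"},
    {"pk": "C", "commit_ts": 3, "seq": 0, "op": "INSERT"},
    {"pk": "A", "commit_ts": 4, "seq": 0, "op": "DELETE"},
]

partitions = fan_out(events, partition_count=3)
print(global_order(partitions) == events)
```

Hashing on the primary key keeps every event for a row in one partition, which preserves per-row order while still allowing different rows to be processed in parallel.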

Choosing the Right Task Type

The three task types are not interchangeable. Migration is bounded and ends at cutover; synchronisation is continuous alignment between database endpoints; subscription is a change-event stream consumed by non-database systems. The decision is best made by examining the downstream system: if it is a database that should mirror the source for a defined period and then take over, migration fits; if it should mirror indefinitely, synchronisation fits; if it is anything else (a cache, a search index, an event processor), subscription fits, and the consumer owns the logic that translates events into its own state model.
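The decision rule is small enough to write down as a function. This is only a restatement of the paragraph above; `choose_task_type` and its parameters are hypothetical names, not part of any DTS API.

```python
def choose_task_type(downstream_is_database, mirrors_indefinitely=False):
    """Pick a task type from two facts about the downstream system."""
    if not downstream_is_database:
        # Caches, search indexes, and event processors consume a change stream.
        return "subscription"
    if mirrors_indefinitely:
        return "synchronisation"
    # A database that mirrors for a defined period and then takes over.
    return "migration"


print(choose_task_type(downstream_is_database=True))
print(choose_task_type(downstream_is_database=True, mirrors_indefinitely=True))
print(choose_task_type(downstream_is_database=False))
```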

Disclaimer: The views expressed herein are for reference only and don’t necessarily represent the official views of Alibaba Cloud.
