DataWorks: Real-time sync tasks for single tables

Last Updated: Nov 14, 2025

DataWorks provides a real-time data synchronization feature. You can use this feature to synchronize data changes from a single table or an entire database to a destination database in real time. This ensures that the destination database remains consistent with the source.

Core features

Real-time synchronization supports the following capabilities.

  • Data synchronization between various data sources: Real-time synchronization supports various data sources. You can combine different input and output data sources to create a sync link. For more information, see Supported data sources and sync solutions.

  • Data synchronization in complex network environments: Real-time synchronization supports data synchronization in environments such as Alibaba Cloud databases, on-premises data centers, self-managed databases on ECS instances, or databases outside Alibaba Cloud. Before you configure the task, ensure that the resource group can connect to the source and destination. For more information about the configuration, see Network connectivity solutions.

Sync scenarios

Real-time synchronization supports synchronizing data in real time from a single table to a single destination table. It also supports synchronizing incremental data from sharded databases and tables to a single destination table.

  • Real-time incremental synchronization for a single table

    • Data Studio: You can configure single-table-to-single-table extract, transform, and load (ETL) synchronization using a drag-and-drop interface. This method supports data processing features such as data filtering, string replacement, and data masking.

    • Data Integration: You can configure single-table-to-single-table ETL synchronization using a wizard. In addition to various data processing features, Data Integration also supports advanced features such as data sampling, simulation runs, and advanced parameter settings.

  • Real-time full and incremental synchronization from sharded databases and tables to a single table

    Currently, this feature is supported only for synchronizing data from MySQL and PolarDB to MaxCompute. Sharding synchronization can merge tables that have the same schema from the source into a logical table and write the data to a single destination table.

Real-time sync task configuration

The following features are supported when you configure a real-time sync task. You can perform real-time ETL on data from a single table using simple configurations without writing code. For more information, see Configure a real-time sync task for a single table and Synchronize data from sharded databases and tables to MaxCompute.

Real-time synchronization for a single table:

  • Configuration method: You can configure the task through a drag-and-drop graphical user interface (GUI) or a wizard, without writing code. This allows novice users to get started easily.

  • Field mapping: You can map fields with the same name or in the same order. You can also customize field relationships. If a source table field does not have a corresponding field in the destination table, you can specify a dynamic field processing policy to add a column, ignore the field, or report an error. The sync task also lets you dynamically assign values to destination fields using constants, variables, and functions.

  • Data processing: You can process source data using features such as Data Filtering, String Replace, Data Masking, and JSON Parsing. The processed data is then written to the destination database.

  • Code debugging: You can sample data from the source data source and view the intermediate results of each data processing step. You can use the Simulation feature to simulate the final data output. The simulation data is not written to the destination table. This ensures that the debugging process does not affect your production data.
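The field mapping options described above can be pictured with a short Python sketch. This is only an illustration of the concepts, not DataWorks code: the function names and the policy labels `add_column`, `ignore`, and `error` are hypothetical stand-ins for the three dynamic field processing policies.

```python
from typing import Dict, List


def map_by_name(source_fields: List[str], dest_fields: List[str]) -> Dict[str, str]:
    """Map each source field to the destination field that has the same name."""
    dest = set(dest_fields)
    return {field: field for field in source_fields if field in dest}


def map_by_order(source_fields: List[str], dest_fields: List[str]) -> Dict[str, str]:
    """Pair fields by position; surplus fields on either side stay unmapped."""
    return dict(zip(source_fields, dest_fields))


def apply_dynamic_field_policy(
    source_fields: List[str], dest_fields: List[str], policy: str
) -> List[str]:
    """Return the destination field list after handling unmatched source fields.

    policy is a hypothetical label: "add_column", "ignore", or "error".
    """
    missing = [f for f in source_fields if f not in dest_fields]
    if not missing or policy == "ignore":
        return list(dest_fields)
    if policy == "add_column":
        return list(dest_fields) + missing
    if policy == "error":
        raise ValueError(f"unmapped source fields: {missing}")
    raise ValueError(f"unknown policy: {policy}")
```

For example, with source fields `id` and `age` and a destination that has only `id`, the `add_column` policy yields a destination with both fields, `ignore` leaves the destination unchanged, and `error` stops with an exception.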

Real-time synchronization for sharded databases and tables:

  • Logical table rule settings: You can use regular expressions to define the scope for searching and integrating source tables. These tables are then used as the sharded source and configured as a logical table. You can also set the mapping between the logical table and the destination table.

  • DDL and DML rule settings: You can set rules for how Data Definition Language (DDL) and Data Manipulation Language (DML) changes in the source affect the destination table. You can select specific responses for the destination table based on the change type.
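As a rough illustration of the logical table rule, the following Python sketch selects source tables with a regular expression and groups them into one logical table that maps to a single destination table. The table names, the pattern, and the `build_logical_table` helper are assumptions made for the example; the exact pattern syntax that DataWorks accepts may differ.

```python
import re
from typing import List


def build_logical_table(pattern: str, tables: List[str]) -> List[str]:
    """Return the tables whose full name matches the regular expression."""
    regex = re.compile(pattern)
    return sorted(t for t in tables if regex.fullmatch(t))


# Hypothetical sharded source: database shards named shard_N, tables named orders_NN.
source_tables = [
    "shard_0.orders_00",
    "shard_0.orders_01",
    "shard_1.orders_02",
    "shard_1.order_items_00",  # different schema, must not be merged
]

members = build_logical_table(r"shard_\d+\.orders_\d+", source_tables)

# Mapping from the logical table to one destination table (names are illustrative).
logical_to_destination = {"orders_logical": {"sources": members, "destination": "ods_orders"}}
```

Using a full match rather than a substring search keeps tables with similar prefixes, such as `order_items_00`, out of the logical table.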

Real-time sync task O&M

You can monitor sync tasks and configure alerts.

  • Resumable synchronization is supported. If a task is interrupted or data is lost because of an exception, you can specify a point in time from which to resume the task. This helps ensure data integrity.

  • You can configure monitoring and alerting for business latency, failover, DDL policies, and heartbeat checks. For more information, see Real-time sync task O&M.

  • DataWorks can send alert notifications to specified recipients by email, text message, phone call, or DingTalk. This helps you promptly identify and handle task exceptions.

  • Alert fatigue control is supported. To avoid generating many alerts in a short period, DataWorks lets you set a rule to send an alert notification only once within a specified interval.

  • Heartbeat detection is supported. The heartbeat alert feature is automatically enabled or disabled when the task starts or stops. If you manually disable the feature, the setting is retained.
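The alert fatigue control described above amounts to sending at most one notification per rule within a configured interval. The following Python sketch shows that idea under stated assumptions: the `AlertThrottle` class, its method names, and the rule identifiers are hypothetical and do not reflect how DataWorks implements the feature.

```python
from typing import Dict


class AlertThrottle:
    """Allow at most one notification per rule within a fixed interval."""

    def __init__(self, interval_seconds: float) -> None:
        self.interval = interval_seconds
        self._last_sent: Dict[str, float] = {}

    def should_send(self, rule_id: str, now: float) -> bool:
        """Return True and record the send time if the rule is outside its quiet period."""
        last = self._last_sent.get(rule_id)
        if last is not None and now - last < self.interval:
            return False  # still inside the quiet period: suppress the alert
        self._last_sent[rule_id] = now
        return True


throttle = AlertThrottle(interval_seconds=300)
first = throttle.should_send("business_latency", now=0.0)
repeat = throttle.should_send("business_latency", now=100.0)
after_interval = throttle.should_send("business_latency", now=300.0)
other_rule = throttle.should_send("heartbeat", now=100.0)
```

Each rule keeps its own timestamp, so a suppressed latency alert does not block a heartbeat alert raised in the same period.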

Note
  • Real-time sync tasks cannot be run from the Data Studio page. You must save and submit the real-time sync node, and then run the node from the Operation Center in the production environment.

  • Real-time sync tasks do not support synchronizing views.

Supported data sources

Important
  • The data sources supported by Data Studio and Data Integration partially overlap. If Data Integration supports the data source type that you need, we recommend that you create the real-time sync task in Data Integration.

  • Data Integration supports only specific pairings of source and destination data sources. To see the supported combinations, refer to the Sync Type options that are available when you configure the source and destination data sources.

Data Studio

Source: MySQL, DataHub, LogHub, Kafka, and PolarDB.

Destination: MaxCompute, Hologres, AnalyticDB for MySQL 3.0, Elasticsearch, DataHub, and Kafka.

Data processing: Data filtering, string replacement, and data masking.

Data Integration

Source: Kafka, Hologres, Oracle, LogHub, and DataHub.

Destination: ApsaraDB for OceanBase, Data Lake Formation (DLF), Doris, Hologres, MaxCompute, OSS, OSS-HDFS, StarRocks, and Tablestore.

Data processing: Data filtering, string replacement, data masking, JSON parsing, and field editing and assignment.
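To make the data processing steps listed above more concrete, the following Python sketch chains data filtering, string replacement, data masking, and JSON parsing over a stream of records. It is a conceptual illustration only: the record fields (`status`, `city`, `phone`, `payload`) and all function names are assumptions, and in DataWorks these steps are configured in the UI rather than written as code.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator

Record = Dict[str, Any]


def filter_records(records: Iterable[Record], keep: Callable[[Record], bool]) -> Iterator[Record]:
    """Data filtering: pass through only the records that satisfy the condition."""
    return (r for r in records if keep(r))


def replace_string(record: Record, field: str, old: str, new: str) -> Record:
    """String replacement: substitute text in one field, leaving the input untouched."""
    out = dict(record)
    out[field] = str(out[field]).replace(old, new)
    return out


def mask_field(record: Record, field: str, keep_last: int = 4) -> Record:
    """Data masking: hide all but the last keep_last characters of a field."""
    out = dict(record)
    value = str(out[field])
    visible = value[-keep_last:] if keep_last > 0 else ""
    out[field] = "*" * (len(value) - len(visible)) + visible
    return out


def parse_json_field(record: Record, field: str) -> Record:
    """JSON parsing: expand a JSON object stored as text into top-level fields."""
    out = dict(record)
    out.update(json.loads(out.pop(field)))
    return out


def process(records: Iterable[Record]) -> Iterator[Record]:
    """Apply the four steps in order to each record that passes the filter."""
    for r in filter_records(records, lambda r: r["status"] != "deleted"):
        r = replace_string(r, "city", "N/A", "unknown")
        r = mask_field(r, "phone")
        yield parse_json_field(r, "payload")
```

Each step returns a new record instead of modifying its input, which mirrors the idea of inspecting the intermediate result of every processing step during debugging.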

FAQ

For answers to frequently asked questions about real-time sync tasks, see FAQ about real-time synchronization.