What capabilities does single-table real-time synchronization support? - DataWorks

DataWorks lets you synchronize data changes from a source to a destination database in real time. You can synchronize a single table or an entire database, ensuring the destination consistently mirrors the source.

Core capabilities

The following table describes the core capabilities of real-time synchronization.

Capability	Description
Data synchronization between various data sources	Real-time synchronization supports a wide range of data sources. You can combine various input and output data sources to create data synchronization pipelines. For more information, see Data sources and synchronization.
Data synchronization across complex network environments	Real-time synchronization supports data transfer from environments such as Alibaba Cloud database services, on-premises IDCs, self-managed databases on Elastic Compute Service (ECS), and third-party cloud databases. Before you begin, ensure your resource group can connect to the source and destination. For configuration details, see Network connectivity solutions.
Synchronization scenarios	Real-time synchronization supports single-table to single-table synchronization and can consolidate incremental data from sharded tables into a single destination table. Data Development (Old): Configure single-table ETL synchronization tasks in a drag-and-drop interface. This mode supports data processing features such as data filtering, string replacement, and data masking. Data Integration and Data Development (New): Configure single-table ETL synchronization tasks in a wizard-based interface. In addition to a rich set of data processing features, this mode supports advanced functions such as data sampling, trial runs, and advanced parameters.
Real-time synchronization task configuration	You can perform single-table ETL through task configuration alone, with no coding required. For more information, see Configure a single-table real-time synchronization task. Single-table real-time synchronization offers the following configuration capabilities. Configuration mode: A low-code graphical interface, either drag-and-drop or wizard-based, that makes it easy to get started. Field mapping: Supports name-based and position-based mapping, as well as custom field relationships. If a source field has no counterpart in the destination table, you can define a dynamic handling policy that adds a new column, ignores the field, or reports an error. You can also dynamically assign constants, variables, and functions to destination fields. Data processing: Source data can be processed with features such as data filtering, string replacement, data masking, and JSON parsing before it is written to the destination. Debugging: You can sample data from the source and preview intermediate results at each processing step. Use the Trial Run feature to simulate the final data output. A trial run does not write data to the destination table, so you can debug without affecting your actual data.
Real-time synchronization task O&M	Resumable synchronization: If a task is interrupted or data is lost because of an anomaly, you can resume the task from a specific point in time to ensure data integrity. Monitoring and alerting: You can set up monitoring and alerts for service latency, failovers, DDL policies, and heartbeat checks. For more information, see O&M for real-time sync tasks. Alert channels: Alerts can be sent by email, SMS, phone call, and DingTalk so that you can detect and handle exceptions promptly. Alert fatigue control: To avoid a flood of notifications in a short period, you can configure rules that send only one alert for a specific issue within a defined time interval. Heartbeat check: The heartbeat alert is automatically enabled when the task starts and disabled when the task stops. If you manually disable the alert, it remains disabled.
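
The field mapping behavior described in the table can be pictured with a small sketch. This is not DataWorks code or a DataWorks API: the names UnmatchedPolicy, map_by_name, and map_by_position are hypothetical, and the sketch only illustrates how name-based and position-based mapping differ and how the three handling policies for unmatched source fields might behave.

```python
from enum import Enum


class UnmatchedPolicy(Enum):
    """Hypothetical names for the three handling policies in the table."""
    ADD_COLUMN = "add_column"
    IGNORE = "ignore"
    ERROR = "error"


def map_by_name(source_fields, dest_fields, policy):
    """Pair each source field with the destination field of the same name.

    Returns (mapping, new_columns). mapping is {source_field: dest_field};
    new_columns lists the columns the destination would have to gain.
    """
    dest = set(dest_fields)
    mapping, new_columns = {}, []
    for name in source_fields:
        if name in dest:
            mapping[name] = name
        elif policy is UnmatchedPolicy.ADD_COLUMN:
            # Treat the missing column as one to be created in the destination.
            mapping[name] = name
            new_columns.append(name)
        elif policy is UnmatchedPolicy.IGNORE:
            continue
        else:
            raise ValueError(f"source field {name!r} has no destination column")
    return mapping, new_columns


def map_by_position(source_fields, dest_fields):
    """Pair fields by ordinal position; surplus fields on either side stay unmapped."""
    return dict(zip(source_fields, dest_fields))


# Illustrative field lists, not taken from any real table.
source = ["id", "name", "email"]
dest = ["id", "name"]
mapping, new_columns = map_by_name(source, dest, UnmatchedPolicy.ADD_COLUMN)
```

With the ignore policy the same call would simply leave email out of the mapping, and with the error policy it would raise before any data is written.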

Note

Real-time synchronization tasks cannot be run from the Data Development interface. You must save and submit the real-time synchronization node, then run it in the production environment from O&M.
Real-time synchronization tasks cannot synchronize views.
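
The alert fatigue control listed under task O&M amounts to sending at most one notification per issue within a time window. The following is a minimal sketch of that idea only. The class AlertSuppressor and its parameters are hypothetical and do not reflect how DataWorks implements or configures alert rules.

```python
class AlertSuppressor:
    """Allow at most one notification per alert key within interval_seconds."""

    def __init__(self, interval_seconds):
        self.interval_seconds = interval_seconds
        self._last_sent = {}  # alert key -> timestamp of the last notification sent

    def should_send(self, key, now):
        """Return True if a notification for `key` may be sent at time `now`."""
        last = self._last_sent.get(key)
        if last is not None and now - last < self.interval_seconds:
            return False  # still inside the suppression window for this issue
        self._last_sent[key] = now
        return True


# Timestamps are plain seconds here to keep the example deterministic.
suppressor = AlertSuppressor(interval_seconds=300)
first = suppressor.should_send("latency", now=0)
repeat = suppressor.should_send("latency", now=100)
other_issue = suppressor.should_send("heartbeat", now=100)
after_window = suppressor.should_send("latency", now=300)
```

Each issue is tracked under its own key, so a suppressed latency alert does not hold back a heartbeat alert raised in the same window.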

Supported data sources

Important

Some data sources are supported in both Data Development and Data Integration. If the data source you need to use is available in Data Integration, we recommend that you create the real-time synchronization task there.
Not all source and destination data sources in Data Integration are compatible. For supported combinations, refer to the available Synchronization Type options when you configure the source and destination.

Data Development (Old)

Source: MySQL, DataHub, LogHub, Kafka, and PolarDB.

Destination: MaxCompute, Hologres, AnalyticDB for MySQL 3.0, Elasticsearch, DataHub, and Kafka.

Data processing: Data filtering, string replacement, and data masking.

Data Integration and Data Development (New)

Source: Kafka, Hologres, Oracle, LogHub, and DataHub.

Destination: ApsaraDB for OceanBase, Data Lake Formation (DLF), Doris, Hologres, MaxCompute, Object Storage Service (OSS), OSS-HDFS, StarRocks, Tablestore, and Lindorm.

Data processing: Data filtering, string replacement, data masking, JSON parsing, and field editing and assignment.
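
To make the data processing features listed above more concrete, the sketch below applies filtering, string replacement, masking, and JSON parsing to one record in sequence. It is an illustration only: in DataWorks these steps are configured in the task interface rather than coded, and every function name, field name, and sample value here is hypothetical.

```python
import json


def keep_record(record, min_amount):
    """Data filtering: keep only records whose amount meets the threshold."""
    return record["amount"] >= min_amount


def replace_string(record, field, old, new):
    """String replacement on one field; returns a new record."""
    return {**record, field: record[field].replace(old, new)}


def mask_field(record, field, visible=4):
    """Data masking: hide all but the last `visible` characters."""
    value = record[field]
    keep = min(visible, len(value))
    masked = "*" * (len(value) - keep) + value[len(value) - keep:]
    return {**record, field: masked}


def parse_json_field(record, field, keys):
    """JSON parsing: lift selected keys from a JSON string into top-level fields."""
    payload = json.loads(record[field])
    extracted = {key: payload.get(key) for key in keys}
    return {**record, **extracted}


def process(record):
    """Run the steps in order; return None when the record is filtered out."""
    if not keep_record(record, min_amount=10):
        return None
    record = replace_string(record, "city", "HZ", "Hangzhou")
    record = mask_field(record, "phone")
    record = parse_json_field(record, "extra", ["channel"])
    return record


# Hypothetical source record.
sample = {
    "amount": 25,
    "city": "HZ",
    "phone": "13800001234",
    "extra": '{"channel": "app", "version": 2}',
}
result = process(sample)
```

Each step returns a new record instead of modifying its input, which mirrors the idea of previewing intermediate results after every processing step.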

Get started

To create a single-table real-time synchronization task, see Configure a real-time synchronization task (legacy version) and Single-table real-time synchronization task configuration.

Frequently asked questions

For frequently asked questions about real-time synchronization tasks, see Real-time synchronization.