DataWorks lets you synchronize data changes from a source to a destination database in real time. You can synchronize a single table or an entire database, ensuring the destination consistently mirrors the source.
Core capabilities
The following figure illustrates the capabilities of real-time synchronization.
Capability | Description |
Data synchronization between various data sources | Real-time synchronization supports a wide range of data sources. You can combine various input and output data sources to create data synchronization pipelines. For more information, see Data sources and synchronization. |
Data synchronization across complex network environments | Real-time synchronization supports data transfer from environments such as Alibaba Cloud database services, on-premises IDCs, self-managed databases on Elastic Compute Service (ECS), and third-party cloud databases. Before you begin, ensure your resource group can connect to the source and destination. For configuration details, see Network connectivity solutions. |
Synchronization scenarios | Real-time synchronization supports synchronizing a single source table to a single destination table, and consolidating incremental data from sharded tables into a single destination table. |
Real-time synchronization task configuration | Real-time synchronization tasks support the following configuration capabilities. Single-table real-time synchronization: no coding is required, and you can perform single-table ETL with simple task configuration. For more information, see Configure a single-table real-time synchronization task. |
Real-time synchronization task O&M | You can set up monitoring and alerts for synchronization tasks. |
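The sharded-table scenario in the table above can be pictured with a minimal sketch. This is not DataWorks code: the event shape (`shard`, `op`, `key`, `row`) and the function names are hypothetical, and the sketch assumes that primary keys are unique across all shards. It only shows how incremental changes from several shards might be merged into one destination table.

```python
def apply_change(destination, event):
    """Apply one change event to the destination table (a dict keyed by primary key)."""
    if event["op"] == "delete":
        destination.pop(event["key"], None)
    else:
        # Inserts and updates are both treated as upserts on the primary key.
        destination[event["key"]] = event["row"]


def consolidate(events):
    """Merge an ordered stream of change events from all shards into one table."""
    destination = {}
    for event in events:
        apply_change(destination, event)
    return destination


# Hypothetical change stream from two shards, orders_00 and orders_01.
events = [
    {"shard": "orders_00", "op": "insert", "key": 1, "row": {"id": 1, "amount": 10}},
    {"shard": "orders_01", "op": "insert", "key": 2, "row": {"id": 2, "amount": 20}},
    {"shard": "orders_00", "op": "update", "key": 1, "row": {"id": 1, "amount": 15}},
    {"shard": "orders_01", "op": "delete", "key": 2, "row": None},
]
merged = consolidate(events)
```

If keys can collide across shards, the destination key would need to include the shard name as well.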
Real-time synchronization tasks cannot be run from the Data Development interface. You must save and submit the real-time synchronization node, then run it in the production environment from O&M.
Real-time synchronization tasks cannot synchronize views.
Supported data sources
Some data sources are supported in both Data Development and Data Integration. If the data source you need to use is available in Data Integration, we recommend that you create the real-time synchronization task there.
Not all source and destination data sources in Data Integration are compatible. For supported combinations, refer to the available Synchronization Type options when you configure the source and destination.
Data Development (Old)
Source: MySQL, DataHub, LogHub, Kafka, and PolarDB.
Destination: MaxCompute, Hologres, AnalyticDB for MySQL 3.0, Elasticsearch, DataHub, and Kafka.
Data processing: Data filtering, string replacement, and data masking.
Data Integration and Data Development (New)
Source: Kafka, Hologres, Oracle, LogHub, and DataHub.
Destination: ApsaraDB for OceanBase, Data Lake Formation (DLF), Doris, Hologres, MaxCompute, Object Storage Service (OSS), OSS-HDFS, StarRocks, Tablestore, and Lindorm.
Data processing: Data filtering, string replacement, data masking, JSON parsing, and field editing and assignment.
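To make the data processing options listed above more concrete, the following sketch mimics each one on a single in-memory record. It is an illustration only, not how DataWorks implements these steps: in the product they are set up through task configuration without coding, and every function name, field name, and rule below is hypothetical.

```python
import json


def keep_record(record):
    """Data filtering: pass only records whose status is 'paid' (hypothetical rule)."""
    return record.get("status") == "paid"


def replace_string(record, field, old, new):
    """String replacement: substitute a substring in one field."""
    return {**record, field: str(record[field]).replace(old, new)}


def mask_field(record, field, keep_last=4):
    """Data masking: hide all but the last keep_last characters of a field."""
    value = str(record[field])
    visible = value[-keep_last:] if keep_last > 0 else ""
    return {**record, field: "*" * (len(value) - len(visible)) + visible}


def parse_json_field(record, field, keys):
    """JSON parsing: lift selected keys of a JSON string field into top-level fields."""
    payload = json.loads(record[field])
    return {**record, **{key: payload.get(key) for key in keys}}


def assign_field(record, field, value):
    """Field editing and assignment: add or overwrite a field with a fixed value."""
    return {**record, field: value}


def process(record):
    """Run the steps in order; return None when the record is filtered out."""
    if not keep_record(record):
        return None
    record = replace_string(record, "city", "HZ", "Hangzhou")
    record = mask_field(record, "phone")
    record = parse_json_field(record, "payload", ["amount"])
    return assign_field(record, "pipeline", "demo")


row = {
    "status": "paid",
    "city": "HZ",
    "phone": "13812345678",
    "payload": '{"amount": 42, "currency": "CNY"}',
}
result = process(row)
```

Each step returns a new dictionary instead of modifying its input, so the steps can be reordered or dropped independently.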
Get started
To create a single-table real-time synchronization task, see Configure a real-time synchronization task (legacy version) and Single-table real-time synchronization task configuration.
Frequently asked questions
For frequently asked questions about real-time synchronization tasks, see Real-time synchronization.