The real-time data synchronization feature in DataWorks replicates data changes from a source to a destination database in real time. You can synchronize single tables or entire databases, ensuring your destination database remains consistent with the source.
Core capabilities
The following table describes the core capabilities of real-time synchronization.
Capability | Description |
Data synchronization across various data sources | Real-time synchronization supports a wide range of data sources. You can combine various source and destination data sources to build a synchronization pipeline. For more information, see Supported data sources and sync solutions. |
Data synchronization in complex network environments | Real-time synchronization supports various environments, including Alibaba Cloud databases, on-premises IDCs, self-managed databases on ECS, and databases from other cloud providers. Before you start, ensure network connectivity between the resource group and both the source and destination endpoints. For configuration details, see Network connectivity solutions. |
Use cases | Real-time synchronization supports both single-table-to-single-table synchronization and synchronizing incremental data from sharded sources to a single destination table. |
Real-time synchronization task configuration | When you configure a real-time synchronization task for a single table, you can use the built-in data processing capabilities (listed under Supported data sources) to perform codeless, real-time ETL on single-table data. For more information, see Configure a real-time sync task in Data Integration. |
Task O&M for real-time synchronization | You can configure monitoring and alerting for synchronization tasks. |
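To make the sharded-source use case in the table above more concrete, the following Python sketch shows one way incremental changes from several shards could be merged into a single destination table. It is a conceptual illustration only, not DataWorks code. The `ChangeEvent` type, the `apply_changes` function, the insert/update/delete event shape, and the assumption that primary keys are unique across shards are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record emitted by one source shard."""
    shard: str                  # name of the shard that produced the event
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key, assumed unique across shards
    row: Optional[dict] = None  # new row image; None for deletes


def apply_changes(destination: Dict[int, dict],
                  events: Iterable[ChangeEvent]) -> Dict[int, dict]:
    """Apply change events from any number of shards to one destination table."""
    for event in events:
        if event.op in ("insert", "update"):
            # Upsert: the most recent row image for a key wins.
            destination[event.key] = dict(event.row or {}, _shard=event.shard)
        elif event.op == "delete":
            # Deleting a key that is not present is a no-op.
            destination.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return destination


# Events arriving from two hypothetical shards of an "orders" table.
events = [
    ChangeEvent("orders_00", "insert", 1, {"status": "new"}),
    ChangeEvent("orders_01", "insert", 2, {"status": "new"}),
    ChangeEvent("orders_00", "update", 1, {"status": "paid"}),
    ChangeEvent("orders_01", "delete", 2),
]
merged = apply_changes({}, events)
```

In this sketch the destination is an in-memory dictionary keyed by primary key. A real task writes to the configured destination data source, and its ordering and conflict handling may differ from this simplification.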
You cannot run real-time synchronization tasks from the Data Studio interface. You must save and submit the real-time synchronization task and then run it in the Operation Center of your production environment.
Real-time synchronization tasks do not support synchronizing views.
Supported data sources
The data sources supported by Data Studio and Data Integration partially overlap. If your required data source type is available in Data Integration, we recommend using it to create your real-time synchronization task.
Not all source and destination data sources in Data Integration can be combined. Refer to the Sync Type options available during configuration to see the supported pairings.
Data Integration and Data Studio (new version)
Source: Kafka, Hologres, Oracle, LogHub, and DataHub.
Destination: ApsaraDB for OceanBase, Data Lake Formation (DLF), Doris, Hologres, Kafka, MaxCompute, Object Storage Service (OSS), OSS-HDFS, StarRocks, Tablestore, and Lindorm.
Data processing: Data Filtering, String Replacement, Data Masking, JSON Parsing, and field editing and assignment.
Data Studio (legacy)
Source: MySQL, DataHub, LogHub, Kafka, and PolarDB.
Destination: MaxCompute, Hologres, AnalyticDB for MySQL 3.0, Elasticsearch, DataHub, and Kafka.
Data processing: Data Filtering, String Replacement, and Data Masking.
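As a quick planning aid, the lists above can be encoded as a lookup that reports which environment lists both a given source and a given destination. The Python sketch below is an illustration built only from the lists on this page. The `SUPPORTED` dictionary and the `candidate_environments` function are hypothetical names, not part of any DataWorks API. Because not every listed source and destination can be combined, a match here is only a necessary condition. The Sync Type options shown during configuration remain the authoritative reference.

```python
# Source and destination lists copied from this page (not fetched from DataWorks).
SUPPORTED = {
    "Data Integration and Data Studio (new version)": {
        "sources": {"Kafka", "Hologres", "Oracle", "LogHub", "DataHub"},
        "destinations": {
            "ApsaraDB for OceanBase", "Data Lake Formation (DLF)", "Doris",
            "Hologres", "Kafka", "MaxCompute", "Object Storage Service (OSS)",
            "OSS-HDFS", "StarRocks", "Tablestore", "Lindorm",
        },
    },
    "Data Studio (legacy)": {
        "sources": {"MySQL", "DataHub", "LogHub", "Kafka", "PolarDB"},
        "destinations": {
            "MaxCompute", "Hologres", "AnalyticDB for MySQL 3.0",
            "Elasticsearch", "DataHub", "Kafka",
        },
    },
}


def candidate_environments(source: str, destination: str) -> list:
    """Return the environments that list both the source and the destination.

    A non-empty result does not guarantee that the pairing is supported;
    it only means both ends appear in that environment's lists.
    """
    return [
        name
        for name, lists in SUPPORTED.items()
        if source in lists["sources"] and destination in lists["destinations"]
    ]


kafka_to_maxcompute = candidate_environments("Kafka", "MaxCompute")
mysql_to_hologres = candidate_environments("MySQL", "Hologres")
oracle_to_elasticsearch = candidate_environments("Oracle", "Elasticsearch")
```

With the lists above, Kafka to MaxCompute appears in both environments, MySQL to Hologres appears only in Data Studio (legacy), and Oracle to Elasticsearch appears in neither.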
Get started
To create a real-time synchronization task for a single table, see Configure a real-time sync task in Data Integration or Configure a real-time sync task in DataStudio (legacy), depending on the environment you use.
FAQ
For answers to frequently asked questions about real-time synchronization tasks, see FAQ about real-time synchronization.