DataWorks: Supported data sources and synchronization solutions

Last Updated: Mar 26, 2026

DataWorks Data Integration moves data between databases, object storage, message queues, and SaaS systems through batch and real-time synchronization. This page helps you choose a synchronization approach and verify whether your data source supports the read or write operations you need.

Choose a synchronization approach

Two factors drive the choice:

  1. Latency — Is a daily update sufficient (T+1 batch), or do you need changes reflected within seconds or minutes (real-time)?

  2. Scale — Are you moving a handful of complex, heterogeneous tables, or replicating hundreds of uniform tables at once?

The following table maps those factors to the available approaches.

| Approach | Latency | Best for | Key trade-off |
| --- | --- | --- | --- |
| Single-table batch | T+1 or periodic | A small number of core tables with complex transformation logic, or non-standard sources such as APIs and log files | High configuration overhead and resource cost at scale; 100 single-table tasks consume roughly 100 CUs versus 2 CUs for one whole-database task |
| Whole-database batch | T+1 or periodic | Hundreds of homogeneous tables: building an Operational Data Store (ODS) layer, cloud migration, and periodic backups | No per-table transformation logic |
| Single-table real-time | Second-to-minute latency | Complex processing of real-time change streams from a single, critical table | Higher operational complexity than batch |
| Whole-database real-time | Second-to-minute latency | Real-time data warehouses, database disaster recovery, and real-time data lake integration | Requires Change Data Capture (CDC) support or a message queue source |
| Whole-database full and incremental | Full sync: batch; incremental: T+1 | Append-only targets (such as non-Delta MaxCompute tables) that cannot process CDC updates directly | Final merged state is visible only after the T+1 merge task completes |
| Serverless | T+1 or periodic | See Serverless synchronization task | — |
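
The two decision factors can be sketched as a small lookup. This is purely illustrative: the function name, parameters, and the ten-table cutoff are assumptions for this sketch, not a DataWorks API.

```python
def choose_approach(needs_realtime: bool, table_count: int,
                    target_applies_updates: bool = True) -> str:
    """Map the latency and scale factors to a synchronization approach.

    Illustrative sketch only; the threshold is an assumption, not a DataWorks rule.
    """
    many_tables = table_count > 10  # assumed cutoff between "a handful" and "hundreds"
    if needs_realtime and not target_applies_updates:
        # Append-only targets (e.g. non-Delta MaxCompute tables) cannot apply
        # CDC updates directly, so they take the Base + Log route.
        return "whole-database full and incremental"
    if needs_realtime:
        return "whole-database real-time" if many_tables else "single-table real-time"
    return "whole-database batch" if many_tables else "single-table batch"

# Replicating 200 tables with T+1 latency points at whole-database batch, which
# also avoids the roughly one-CU-per-task cost of 200 separate single-table tasks.
print(choose_approach(needs_realtime=False, table_count=200))  # whole-database batch
```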

Prerequisite for incremental batch synchronization: The source table must have a field that tracks changes, such as a timestamp column (gmt_modified) or an auto-incrementing ID. Without one, use periodic full synchronization instead.
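
A batch task then picks up one day's changes by filtering on that change-tracking field. The helper below is a minimal sketch of such a filter, assuming a `gmt_modified` timestamp column and a yyyymmdd business date like the value of DataWorks' `${bizdate}` scheduling parameter; the function itself is not part of any DataWorks API.

```python
from datetime import datetime, timedelta

def incremental_filter(change_column: str, bizdate: str) -> str:
    """Build a WHERE fragment selecting rows modified on `bizdate` (yyyymmdd)."""
    start = datetime.strptime(bizdate, "%Y%m%d")
    end = start + timedelta(days=1)  # half-open interval avoids missing end-of-day rows
    return (f"{change_column} >= '{start:%Y-%m-%d}' "
            f"AND {change_column} < '{end:%Y-%m-%d}'")

print(incremental_filter("gmt_modified", "20260326"))
# gmt_modified >= '2026-03-26' AND gmt_modified < '2026-03-27'
```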

Prerequisite for real-time synchronization: The source must support CDC or act as a message queue. For MySQL, binary logging must be enabled.
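
A quick way to verify a MySQL source is to inspect its server variables (the output of `SHOW VARIABLES`). The checker below encodes the usual CDC prerequisites, binary logging enabled and row-based format; treat the exact required values as an assumption to confirm against the MySQL data source documentation.

```python
def binlog_problems(variables: dict) -> list:
    """Return the settings that would block real-time (CDC) sync from MySQL.

    `variables` maps variable names to values, as returned by SHOW VARIABLES.
    The required values are common CDC prerequisites, not an official list.
    """
    problems = []
    if variables.get("log_bin", "OFF").upper() != "ON":
        problems.append("enable binary logging (log_bin=ON)")
    if variables.get("binlog_format", "").upper() != "ROW":
        problems.append("set binlog_format=ROW")
    return problems

print(binlog_problems({"log_bin": "ON", "binlog_format": "ROW"}))        # []
print(binlog_problems({"log_bin": "ON", "binlog_format": "STATEMENT"}))  # ['set binlog_format=ROW']
```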

How the whole-database full and incremental approach works

Append-only storage systems like non-Delta MaxCompute tables cannot process physical Update or Delete operations. Writing a CDC stream to them directly produces inconsistent data — for example, deleted rows remain visible in the target.

Data Integration addresses this with the Base + Log pattern, implemented as a whole-database full and incremental task:

  • A Base table holds the latest full snapshot.

  • A Log table captures the real-time CDC stream.

CDC changes are written to the Log table within minutes. On a T+1 schedule, the system automatically merges the Log table into the Base table to produce an updated full snapshot. This balances near-real-time data capture with the eventual consistency that batch data warehouses require.
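
The merge step can be sketched in a few lines. Here `base` maps primary keys to row images and `log` is the ordered CDC stream; this data model is invented for illustration, and DataWorks performs the equivalent merge automatically on its T+1 schedule.

```python
def merge_base_and_log(base: dict, log: list) -> dict:
    """Apply an ordered CDC log of (op, key, row) entries to a base snapshot."""
    merged = dict(base)
    for op, key, row in log:
        if op in ("insert", "update"):
            merged[key] = row        # upsert: the latest row image wins
        elif op == "delete":
            merged.pop(key, None)    # deleted rows disappear from the new snapshot
    return merged

base = {1: {"id": 1, "status": "open"}, 2: {"id": 2, "status": "open"}}
log = [("update", 1, {"id": 1, "status": "closed"}),
       ("delete", 2, None),
       ("insert", 3, {"id": 3, "status": "open"})]
print(merge_base_and_log(base, log))
# {1: {'id': 1, 'status': 'closed'}, 3: {'id': 3, 'status': 'open'}}
```

Note that the delete is applied for real: unlike writing the raw CDC stream to an append-only table, the merged snapshot no longer shows row 2.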

For setup details, see Whole-database full and incremental (near real-time) task.

Synchronization types at a glance

The following table summarizes each synchronization type by source granularity, target granularity, latency, and typical scenario.

| Type | Source granularity | Target granularity | Timeliness | Synchronization scenario |
| --- | --- | --- | --- | --- |
| Single-table batch | Single table | Single table/partition | T+1 or periodic | Periodic full or incremental synchronization |
| Sharding batch | Multiple tables with identical schema | Single table/partition | T+1 or periodic | Periodic full or incremental synchronization |
| Single-table real-time | Single table | Single table/partition | Second-to-minute latency | Change Data Capture (CDC) |
| Whole-database batch | Whole database or multiple tables | Matching tables and partitions | One-time or periodic | One-time or periodic full/incremental synchronization; supports an initial full synchronization followed by periodic incremental updates |
| Whole-database real-time | Whole database or multiple tables | Matching tables and partitions | Second-to-minute latency | Full synchronization + Change Data Capture (CDC) |
| Whole-database full and incremental | Whole database or multiple tables | Matching tables and partitions | Initial full synchronization: batch; subsequent incremental synchronization: T+1 | Full synchronization + Change Data Capture (CDC) |

Data source read/write capabilities

The tables below list every supported data source and its read/write capabilities for each synchronization type. Read means DataWorks can read from the data source; Write means it can write to it. A dash (—) indicates that the combination is not supported.

Sources are grouped by category to help you find your data source faster.

Alibaba Cloud databases

Data source Single-table batch Single-table real-time Whole-database batch Whole-database real-time Whole-database full and incremental
AnalyticDB for MySQL 2.0 Read/Write
AnalyticDB for MySQL 3.0 Read/Write Write Read Write
AnalyticDB for PostgreSQL Read/Write Read
ApsaraDB for OceanBase Read/Write Write Read/Write
ApsaraDB for Memcache Write
DataHub Read/Write Read/Write Write
Data Lake Formation Read/Write Write Write Write
Hologres Read/Write Read/Write Read/Write Write
Lindorm Read/Write Write Write
MaxCompute Read/Write Write Write Write Write
MaxGraph Write
MetaQ Read
Milvus Read/Write
Object Storage Service (OSS) Read/Write Write Write
OpenSearch Write
OSS-HDFS Read/Write Write Write
PolarDB Read/Write Read Read Read Read
PolarDB-X 2.0 Read/Write Read Read
Simple Log Service (SLS) Read/Write Read
Tablestore Read/Write Write

Relational databases

Data source Single-table batch Single-table real-time Whole-database batch Whole-database real-time Whole-database full and incremental
DB2 Read/Write Read
DM (Dameng) Read/Write Read
DRDS (PolarDB-X 1.0) Read/Write Read
KingbaseES (Renda Jingcang) Read/Write
MariaDB Read/Write
MySQL Read/Write Read Read Read Read
Oracle Read/Write Read Read Read Read
PostgreSQL Read/Write Read Read
SAP HANA Read/Write
SQL Server Read/Write Read
TiDB Read/Write
GBase8a Read/Write
Vertica Read/Write

Analytical and columnar databases

Data source Single-table batch Single-table real-time Whole-database batch Whole-database real-time Whole-database full and incremental
ClickHouse Read/Write Read
Doris Read/Write Write Read
StarRocks Read/Write Write Write Write

NoSQL and search

Data source Single-table batch Single-table real-time Whole-database batch Whole-database real-time Whole-database full and incremental
Elasticsearch Read/Write Write Write Write
HBase Read/Write (HBase20xsql: Read; HBase11xsql: Write)
MongoDB Read/Write Read
Redis Write
TSDB Write

Object storage and file systems

Data source Single-table batch Single-table real-time Whole-database batch Whole-database real-time Whole-database full and incremental
Amazon S3 Read/Write
Azure Blob Storage Read
COS Read
FTP Read/Write
HDFS Read/Write
Hive Read/Write Read/Write
HttpFile Read
TOS Read

Message queues

Data source Single-table batch Single-table real-time Whole-database batch Whole-database real-time Whole-database full and incremental
Kafka Read/Write Read/Write Write

Cloud data warehouses and analytics platforms

Data source Single-table batch Single-table real-time Whole-database batch Whole-database real-time Whole-database full and incremental
Amazon Redshift Read/Write
BigQuery Read
Databricks Read
Public dataset Read
Snowflake Read/Write

SaaS and APIs

Data source Single-table batch Single-table real-time Whole-database batch Whole-database real-time Whole-database full and incremental
RestAPI Read/Write
Salesforce Read/Write
Sensors Data (Shen Ce) Write

If your data source is not listed, use the RestAPI connector for sources that expose RESTful APIs.
