Data Integration provides three features: batch (offline) synchronization, real-time synchronization, and solution-based synchronization. Select a feature based on whether you want to synchronize data in batch mode or in real time, and whether you want to synchronize full data or incremental data. The data source types and the number of tables or databases involved also affect the choice. This topic compares the capabilities of the features along these dimensions and describes the core dimensions to consider when you select a feature.

Capabilities of the data synchronization features

The following table summarizes how each feature supports batch and real-time synchronization, full and incremental synchronization, and different numbers of tables and databases. Use the table to determine which features apply to your data synchronization scenario. Then, select the most suitable feature by referring to the core dimensions described in the next section.

Dimension Batch synchronization Real-time synchronization Solution-based synchronization (batch synchronization of data from all tables in a database) Solution-based synchronization (one-time full synchronization and real-time incremental synchronization)
Dimension 1: Batch synchronization and real-time synchronization Batch synchronization ×
Real-time synchronization × ×
Dimension 2: Incremental synchronization and full synchronization Full synchronization × √ (One-time full synchronization, periodical full synchronization, and one-time full synchronization and periodical incremental synchronization) √ (One-time full synchronization and real-time incremental synchronization)
Incremental synchronization √ (One-time incremental synchronization and periodical incremental synchronization)
Dimension 3: Number of tables or databases Single table (data synchronization from one table to another table) ×
Single database (data synchronization from multiple tables to multiple tables) ×
Sharded database (data synchronization from multiple tables to one table)
Note Some types of data sources support data synchronization from multiple tables to one table.
×
Dimension 4: Supported data sources
  • Batch synchronization: DataWorks provides readers and writers that read data from and write data to data sources. For more information about the supported data source types, readers, and writers, see Supported data source types, readers, and writers.
  • Real-time synchronization: Multiple types of data sources are supported, and you can synchronize data between different types of data sources. For more information about the supported data source types, see Plug-ins for data sources that support real-time synchronization.
  • Solution-based synchronization: DataWorks provides data synchronization solutions that synchronize data between multiple types of data sources in different scenarios. For more information about the supported data source types, see Supported data source types and read and write operations.
References
  • Batch synchronization: Overview of the batch synchronization feature
  • Real-time synchronization: Description of real-time data synchronization capabilities
  • Solution-based synchronization: Overview
Note
  • One-time full synchronization and real-time incremental synchronization: Full data is synchronized to the destination in a single run, and incremental data is then synchronized in real time.

    The first time you run the data synchronization solution, full data in one or more source tables is written to the specified partitions in the corresponding destination tables. After that, incremental data from the source tables is merged with the full data and written to the specified partitions in the destination tables in real time.

  • One-time full synchronization and periodical incremental synchronization: Full data is synchronized to the destination in a single run, and incremental data is then synchronized periodically.

    The first time you run the data synchronization solution, full data in one or more source tables is written to the specified partitions in the corresponding destination tables. After that, incremental data from the source tables is periodically written to the specified partitions in the destination tables.
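
The merge of incremental data and full data mentioned in the note above can be pictured with a minimal sketch. It is not how DataWorks implements the merge. It assumes that every row has a primary key and that each incremental change record carries an operation type (insert, update, or delete). The function name merge_full_and_incremental and the record shape are hypothetical and exist only for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

Row = Dict[str, object]
# Hypothetical change-record shape: (operation, primary key, row data or None for deletes)
Change = Tuple[str, int, Optional[Row]]


def merge_full_and_incremental(full: Dict[int, Row], changes: Iterable[Change]) -> Dict[int, Row]:
    """Apply change records, in order, to a copy of the full snapshot."""
    merged = dict(full)  # leave the original snapshot untouched
    for op, key, row in changes:
        if op in ("insert", "update"):
            merged[key] = row  # the latest version of a row wins
        elif op == "delete":
            merged.pop(key, None)  # ignore deletes for keys that are absent
        else:
            raise ValueError(f"unknown operation: {op}")
    return merged


# Full data written by the one-time full synchronization (assumed sample data)
full_snapshot = {1: {"name": "a"}, 2: {"name": "b"}}
# Incremental changes captured afterwards (assumed sample data)
change_log = [
    ("update", 2, {"name": "b2"}),
    ("insert", 3, {"name": "c"}),
    ("delete", 1, None),
]
result = merge_full_and_incremental(full_snapshot, change_log)
```

In this sketch the merged result keeps the updated row 2, adds row 3, and drops row 1, while the original snapshot is left unchanged.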

Core dimensions

Select the feature for your data synchronization solution or node based on the following core dimensions:

  • Supported data source types, readers, and writers

    Before you configure a data synchronization solution or node, check which features support the types of your source and destination data sources, and select a feature accordingly.

    Note In real-time data synchronization scenarios, you must also check which DML and DDL operations performed on the source are supported by each type of destination. For more information, see Supported DML and DDL operations.
  • Number of involved databases or tables

    The number of databases or tables that you read data from and the number of tables that you write data to also determine which feature you can use.

Appendix: Description of destination partitions

  • Batch synchronization

    In incremental synchronization scenarios, you can use the data backfill feature provided in Operation Center to write historical data to the corresponding time partition in the destination table. For more information, see Synchronize incremental data.

  • Real-time synchronization: Incremental data in a source table is written to the T-1 partition in a destination table in real time.
  • Solution-based synchronization
    The following entries describe how data is written for each synchronization solution:
    One-time full synchronization and periodical incremental synchronization:
    1. If the data synchronization solution is configured on day T, full data in a source table is written to the T-1 partition in a destination table in a single run.
    2. On day T+N, incremental data in the source table is periodically written to the T+N-1 partition in the destination table.
    One-time full synchronization and real-time incremental synchronization:
    1. If the synchronization solution is configured on day T, full data in a source table is written to the T-1 partition in a destination table in a single run. Incremental data from the source table is then merged with the full data and written to the T-1 partition in the destination table in real time.
    2. On day T+N, incremental data from the source table is merged with the full data and written to the T+N-1 partition in real time.
    Note If you use the one-click real-time synchronization to MaxCompute solution, incremental data in a source table is written to the Log table for incremental data on the day on which you configure the solution. On the next day, the incremental data is merged with the full data in the source table and written to the Base table for full data. For more information, see One-click real-time synchronization to MaxCompute.
    Periodical full synchronization: Full data in a source table is periodically written to the T-1 partition in a destination table.
    One-time full synchronization: If the data synchronization solution is configured on day T, full data in a source table is written to the T-1 partition in a destination table in a single run.
    Periodical incremental synchronization: Incremental data in a source table is periodically written to the T-1 partition in a destination table.
    One-time incremental synchronization: If the data synchronization solution is configured on day T, incremental data in a source table is written to the T-1 partition in a destination table in a single run.
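
The T-1 and T+N-1 partition rules above are simple date arithmetic, which the following minimal sketch works through. The function name destination_partition and the yyyymmdd partition format are assumptions made for illustration. The actual partition naming depends on how the destination table is configured.

```python
from datetime import date, timedelta


def destination_partition(config_day: date, n: int = 0, fmt: str = "%Y%m%d") -> str:
    """Return the partition written on day T+N, that is, the partition for day T+N-1.

    config_day is day T (the day the solution is configured); n is the number
    of days elapsed since T. The default format (yyyymmdd) is an assumption.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return (config_day + timedelta(days=n - 1)).strftime(fmt)


t = date(2024, 3, 1)                        # day T
first_run = destination_partition(t)        # T-1 partition
third_day = destination_partition(t, n=2)   # on day T+2, the T+1 partition
```

With day T set to March 1, 2024, the first run targets the partition for February 29, 2024, and the run on day T+2 targets the partition for March 2, 2024.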