DataWorks: Data Integration: Integration of data from various data sources

Last Updated: Aug 22, 2023

Data Integration is a stable, efficient, and scalable data synchronization service. It migrates and synchronizes data quickly and reliably among a wide range of heterogeneous data sources that reside in complex network environments.

Overview

DataWorks Data Integration supports batch synchronization, real-time synchronization, and full or incremental synchronization that combines batch and real-time synchronization.

  • You can configure a scheduling cycle for a batch synchronization node.

  • Data synchronization among more than 50 types of heterogeneous data sources such as relational databases, data warehouses, file storage systems, and message queues is supported.

  • Network connectivity solutions for data source connections in complex network environments are provided. You can use Data Integration to connect data sources that reside on the Internet, in data centers, or in virtual private clouds (VPCs).

  • Security control and O&M monitoring are supported to ensure that the data synchronization process is secure and controllable.

Core technology and architecture

  • Engine architecture (figure: Engine architecture). A star-shaped engine architecture is provided. After a data source is added to Data Integration, the data source can be connected to another data source in Data Integration to form a data synchronization link. Then, data can be synchronized between the data sources. For more information about the supported data sources, see "Supported data source types, Reader plug-ins, and Writer plug-ins" and "Data source types that support real-time synchronization".

  • Resource groups for Data Integration and network connectivity (figure: Data synchronization). Before you synchronize data, you must connect the data sources to your resource group for Data Integration, as shown in the preceding figure. DataWorks allows you to use exclusive or custom resource groups for Data Integration to synchronize data. You can select a resource group type based on your business scenario. For more information about the network connectivity solutions that can be used, see "Establish a network connection between a resource group and a data source".
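
As a rough illustration of the star-shaped engine architecture described above, the following Python sketch models each data source as one reader and one writer attached to a central hub. It is not the Data Integration implementation: the `SyncHub` class, the `memory_source` helper, and the source names are hypothetical, and in-memory lists stand in for real data sources.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Reader = Callable[[], Iterable[Record]]
Writer = Callable[[Iterable[Record]], int]


class SyncHub:
    """Hypothetical hub: every data source plugs in once, with one reader and one writer."""

    def __init__(self) -> None:
        self.readers: Dict[str, Reader] = {}
        self.writers: Dict[str, Writer] = {}

    def add_source(self, name: str, reader: Reader, writer: Writer) -> None:
        self.readers[name] = reader
        self.writers[name] = writer

    def links(self) -> List[Tuple[str, str]]:
        """All source-to-target pairs that can form a synchronization link."""
        return [(s, t) for s in self.readers for t in self.writers if s != t]

    def sync(self, source: str, target: str) -> int:
        """Pass the records from the source's reader to the target's writer."""
        return self.writers[target](self.readers[source]())


def memory_source(rows: List[Record]) -> Tuple[Reader, Writer]:
    """Build a reader/writer pair over an in-memory list (a stand-in for a real source)."""

    def read() -> Iterable[Record]:
        return list(rows)  # snapshot, so reads are unaffected by later writes

    def write(records: Iterable[Record]) -> int:
        written = 0
        for record in records:
            rows.append(dict(record))
            written += 1
        return written

    return read, write


# Three hypothetical sources; only the first one holds data initially.
database_rows: List[Record] = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
warehouse_rows: List[Record] = []
queue_rows: List[Record] = []

hub = SyncHub()
hub.add_source("database", *memory_source(database_rows))
hub.add_source("warehouse", *memory_source(warehouse_rows))
hub.add_source("queue", *memory_source(queue_rows))

copied = hub.sync("database", "warehouse")
print(copied, len(hub.links()))
```

The point of the design is that each new source needs only one reader and one writer to become linkable with every source already attached. In this sketch, three reader/writer pairs yield six directed synchronization links without any pair-specific code.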

Use scenarios

DataWorks Data Integration is suitable for data transmission scenarios such as data ingestion into data warehouses or data lakes, sharding, real-time data archiving, and data forwarding between clouds.

Billing

You may be charged the following fees for running data synchronization nodes in Data Integration:

Note

When you run data synchronization nodes in Data Integration, fees related to node configurations may be generated. These fees are not charged by DataWorks, and the bills for the fees are not generated in DataWorks. For example, running a data synchronization node may generate fees for the data sources, for the computing and storage features of the related compute engine instance, and for the network services that the node uses. Network service fees include fees for using Express Connect, Internet Shared Bandwidth, and Elastic IP Address (EIP).

Activate DataWorks

After you activate DataWorks of a specific edition, you can purchase a resource group for Data Integration based on your business requirements and select an appropriate network connectivity solution to develop a data synchronization node in Data Integration. For more information about how to use Data Integration, see Overview.