Data Integration

Data Integration is an all-in-one data synchronization platform. The platform supports online real-time and offline data exchange between all data sources, networks, and locations.

Data Integration is a distributed service that provides data transmission, data conversion, and synchronization. It is built on a distributed architecture with multiple modules (such as dirty data processing and flow control). Its features include support for multiple data sources, fast transmission, high reliability, scalability, and mass synchronization.

Benefits

Support for Multiple Disparate Data Sources
Data Integration supports data synchronization between more than 400 pairs of disparate data sources, including RDS databases, semi-structured storage, unstructured storage (such as audio, video, and images), NoSQL databases, and big data storage. Data Integration also supports real-time data reading and writing between data sources such as Oracle, MySQL, and DataHub.
Scheduled Tasks
Data Integration allows you to schedule offline tasks by setting a specific trigger time (including year, month, day, hour, and minute). Periodic incremental data extraction takes only a few steps to configure. Data Integration works with DataWorks data modeling, and the entire workflow integrates operations and maintenance.
Mass Upload to Cloud
Data Integration leverages the computing capability of Hadoop clusters to synchronize HDFS data from the clusters to MaxCompute. This is called Mass Cloud Upload. Data Integration can transmit up to 5 TB of data per day. The maximum transmission rate is 2 GB/s.
Monitoring and Alarms
With 19 built-in monitoring rules, Data Integration covers most monitoring scenarios. You can set alarm rules based on these monitoring rules. You can also pre-define the notification mode for task failures in Data Integration.

Features

  • Data Source Management

    Data sources and datasets define the source and destination of data. Building on them, Data Integration provides two types of data management plug-ins: the Reader plug-in reads data and the Writer plug-in writes data. Based on this framework, a set of simplified intermediate data transmission formats is used to exchange data between arbitrary structured and semi-structured data sources.

  • Local Data Collection

    Data Integration supports data synchronization in Alibaba Cloud classic networks and VPCs, as well as data collection in local IDCs.

  • Full Database Migration

    Full database migration is a tool provided by Data Integration that creates multiple data synchronization tasks at once and imports all data tables in a MySQL database to MaxCompute. With full database migration, you no longer need to create synchronization tasks one at a time.

  • Incremental Synchronization

    Data Integration supports filtering business data by date through a WHERE clause. Data with different dates is synchronized to the corresponding partitions of MaxCompute partitioned tables. By setting the synchronization interval to 1 hour or 10 minutes, Data Integration can perform quasi-real-time incremental synchronization.
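
The date-based incremental synchronization described above can be illustrated with a small, self-contained Python sketch. It is not Data Integration's actual API or configuration format: the in-memory rows, the column name gmt_modified, the ds=YYYYMMDD partition naming, and every function name are assumptions made purely for illustration. The sketch shows a Reader-style step that keeps only rows matching one business date and a Writer-style step that routes them to the partition for that date.

```python
from collections import defaultdict
from datetime import date

# Hypothetical source rows; a real Reader plug-in would pull these from a database.
SOURCE_ROWS = [
    {"order_id": 1, "amount": 30.0, "gmt_modified": date(2024, 1, 1)},
    {"order_id": 2, "amount": 12.5, "gmt_modified": date(2024, 1, 2)},
    {"order_id": 3, "amount": 99.0, "gmt_modified": date(2024, 1, 2)},
]


def build_where(column, business_date):
    """Render the date filter a sync task might place in its WHERE clause."""
    return f"{column} = '{business_date.strftime('%Y-%m-%d')}'"


def read_increment(rows, column, business_date):
    """Reader role: yield only the rows whose date column matches the business date."""
    for row in rows:
        if row[column] == business_date:
            yield row


def write_partition(target, business_date, records):
    """Writer role: append records to the partition for that date (assumed ds=YYYYMMDD)."""
    partition = f"ds={business_date.strftime('%Y%m%d')}"
    target[partition].extend(records)
    return partition


def sync_one_day(rows, target, column, business_date):
    """Run one incremental pass: filter by date, then write to the matching partition."""
    records = list(read_increment(rows, column, business_date))
    partition = write_partition(target, business_date, records)
    return partition, len(records)


# Stand-in for a partitioned target table: partition name -> list of rows.
target_table = defaultdict(list)
for day in (date(2024, 1, 1), date(2024, 1, 2)):
    where = build_where("gmt_modified", day)
    partition, count = sync_one_day(SOURCE_ROWS, target_table, "gmt_modified", day)
    print(f"WHERE {where} -> {partition}: {count} row(s)")
```

Running one pass per business date keeps each run small and repeatable. Shortening the interval to an hour or ten minutes would follow the same pattern with a finer-grained filter column.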

Pricing Overview

Overall

Data Integration offers one billing method: Pay-As-You-Go.

Pay-As-You-Go bills you for the DMUs actually used. You can activate or stop resources at any time as needed, with no maintenance costs.

If the system resource group is used when your synchronization task runs, the charge is USD 0.056 per DMU per hour.

If a custom resource group is used when your synchronization task runs, the charge is USD 0.022 per hour.
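
As a rough illustration of how the two hourly rates translate into a charge, the following Python sketch estimates the cost of a single task run. It rests on assumptions the text does not confirm: that billing scales linearly with run time with no minimum or rounding, and that the custom resource group rate applies per hour regardless of DMU count, since no per-DMU unit is stated for it. The function names are hypothetical.

```python
# Rates quoted in the pricing text (USD per hour).
SYSTEM_GROUP_RATE_PER_DMU_HOUR = 0.056
CUSTOM_GROUP_RATE_PER_HOUR = 0.022


def system_group_cost(dmu, hours):
    """Estimated charge for a run on the system resource group."""
    return dmu * hours * SYSTEM_GROUP_RATE_PER_DMU_HOUR


def custom_group_cost(hours):
    """Estimated charge for a run on a custom resource group.

    Assumption: the source gives no per-DMU unit for this rate,
    so only the run time is taken into account here.
    """
    return hours * CUSTOM_GROUP_RATE_PER_HOUR


# Example: a task that uses 2 DMUs and runs for 3 hours.
print(f"System resource group: USD {system_group_cost(dmu=2, hours=3):.3f}")
print(f"Custom resource group: USD {custom_group_cost(hours=3):.3f}")
```

Under these assumptions the example run costs about USD 0.336 on the system resource group and about USD 0.066 on a custom resource group. Actual bills may differ if billing increments or minimum charges apply.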