The offline integration feature of Data Management (DMS) provides a low-code tool that you can use to develop data processing tasks. You can combine various task nodes to form a data flow and configure periodic scheduling to process or synchronize data.
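As a conceptual aid only, a data flow can be pictured as a directed graph in which each task node runs after the nodes it depends on. The sketch below is not how DMS is implemented or exposed. The node names and the `execution_order` helper are hypothetical, and the ordering uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical data flow: each key is a task node, and each value is the set
# of nodes that must finish before that node can run.
data_flow = {
    "import_orders": set(),
    "import_customers": set(),
    "join_orders_customers": {"import_orders", "import_customers"},
    "export_report": {"join_orders_customers"},
}


def execution_order(flow):
    """Return one valid run order for the nodes of a data flow."""
    return list(TopologicalSorter(flow).static_order())


order = execution_order(data_flow)
print(order)
```

The two import nodes have no dependencies, so either may come first. The join node always follows both of them, and the export node always runs last.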
Supported database types
- MySQL: ApsaraDB RDS for MySQL, PolarDB for MySQL, MyBase for MySQL, PolarDB-X, AnalyticDB for MySQL V3.0, and MySQL databases that are not on Alibaba Cloud
- SQL Server: ApsaraDB RDS for SQL Server, MyBase for SQL Server, and SQL Server databases that are not on Alibaba Cloud
- PostgreSQL: ApsaraDB RDS for PostgreSQL, PolarDB for PostgreSQL, MyBase for PostgreSQL, AnalyticDB for PostgreSQL, and PostgreSQL databases that are not on Alibaba Cloud
- Hologres
  Note: Hologres is supported only for Data Import nodes.
- OSS
  Note: OSS is supported only for Data Output nodes.
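The two restrictions in this list can be captured in a small lookup, for example when you validate a planned data flow before building it in the console. This is a minimal sketch, not a DMS API. The `is_supported` helper and the constant names are hypothetical, and the sketch assumes that the database families listed without a note carry no node-type restriction.

```python
# Node-type restrictions taken from the notes in the list above.
NODE_RESTRICTIONS = {
    "Hologres": {"Data Import"},
    "OSS": {"Data Output"},
}

# Assumption: families listed without a note work with any node type.
UNRESTRICTED = {"MySQL", "SQL Server", "PostgreSQL"}


def is_supported(db_type, node_type):
    """Return True if db_type may be used with node_type under these rules."""
    if db_type in NODE_RESTRICTIONS:
        return node_type in NODE_RESTRICTIONS[db_type]
    return db_type in UNRESTRICTED


print(is_supported("Hologres", "Data Import"))
print(is_supported("OSS", "Data Import"))
```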
The offline integration feature supports the batch processing of data. You can use the feature in the following scenarios:
- You can build an offline data warehouse visually with this low-code tool. You can then use the data warehouse for ad hoc queries, multidimensional data analysis, data mining, and offline computing.
- You can process large volumes of complex data in scenarios such as refined enterprise operations, digital marketing, and intelligent recommendation.
- The offline integration feature is built on Spark SQL, so you can use it to significantly improve the efficiency of Spark SQL nodes on a Hadoop-based platform.
Procedure
- Log on to the DMS console.
- In the top navigation bar, click DTS. In the left-side navigation pane, choose Data integration > Data processing.
- Click Create Data Flow.
- In the Create Data Flow dialog box, set the Data Flow Name and Description parameters. Then, click OK.
- On the details page of the data flow, create nodes for the data flow. For more information, see Create a data flow.
- Click the blank area on the canvas to configure the data flow.
  - Click the Data Flow Information tab. In the Properties section, set the Data Flow Name, Description, Owner, and Stakeholders parameters.
  - In the Scheduling Settings section, turn on Enable Scheduling to schedule the data flow based on your needs. For more information, see Table 1.
  - Click the Advanced Settings tab and configure variables. For more information, see Configure time variables.
- Publish the data flow. For more information, see Publish a data flow.
- Optional: In the upper-right corner of the page, click Go to O&M to perform O&M operations on the data flow. For more information, see Manage a data flow.
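The variables configured on the Advanced Settings tab let a scheduled run refer to a date relative to its run time. The actual variable syntax is described in Configure time variables and is not reproduced here. The sketch below only illustrates the idea, assuming a hypothetical `${bizdate}` placeholder that resolves to the day before the run date. The `render` helper, the table name, and the column name are also hypothetical.

```python
from datetime import date, timedelta
from string import Template


def render(sql_template, run_date, offset_days=-1, fmt="%Y-%m-%d"):
    """Fill the hypothetical ${bizdate} placeholder for a given run date."""
    value = (run_date + timedelta(days=offset_days)).strftime(fmt)
    return Template(sql_template).substitute(bizdate=value)


sql = render(
    "SELECT * FROM orders WHERE order_date = '${bizdate}'",
    date(2024, 3, 1),
)
print(sql)  # SELECT * FROM orders WHERE order_date = '2024-02-29'
```

Keeping the offset and the format as parameters mirrors the general idea that a time variable combines a base time, an offset, and an output format.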