The task orchestration feature of Data Management (DMS) lets you orchestrate and schedule tasks. You can create a task flow that consists of one or more task nodes to implement complex scheduling and improve data development efficiency.
Supported database types
- Relational databases
  - MySQL: ApsaraDB RDS for MySQL, PolarDB for MySQL, MyBase for MySQL, PolarDB-X, and MySQL databases from other sources
  - SQL Server: ApsaraDB RDS for SQL Server, MyBase for SQL Server, and SQL Server databases from other sources
  - PostgreSQL: ApsaraDB RDS for PostgreSQL, PolarDB for PostgreSQL, MyBase for PostgreSQL, and PostgreSQL databases from other sources
  - OceanBase: ApsaraDB for OceanBase in MySQL mode, ApsaraDB for OceanBase in Oracle mode, and self-managed OceanBase databases
  - PolarDB for Oracle
  - Oracle
  - DM
  - Db2
- Data warehouses
  - AnalyticDB for MySQL
  - AnalyticDB for PostgreSQL
  - DLA
  - MaxCompute
  - Hologres
- Object storage: OSS
Task node types
| Category | Task node type | Description | References |
| --- | --- | --- | --- |
| Data integration | DTS data migration | Migrates the data of specific tables or all tables from one database to another. This type of node supports full data migration and can migrate both data and schemas. | Configure a DTS data migration node |
| Data integration | Batch Integration | Synchronizes data between data sources. You can use this type of node in scenarios such as data migration and data transmission. | Configure a batch integration node |
| Data processing | Single Instance SQL | Executes SQL statements in a specific relational database. | N/A |
| Data processing | Cross-Database Spark SQL | Uses the Spark engine to process and transmit large amounts of data across databases. This type of node applies to cross-database data synchronization and processing. A sketch of a node body is shown after this table. | Configure a cross-database Spark SQL node |
| Data processing | Cross-Database SQL | Uses dynamic SQL (DSQL) statements to query data across databases. You can use this type of node to analyze data across databases and migrate small amounts of data. | N/A |
| Data processing | DLA Serverless Spark | Configures Spark jobs based on the serverless Spark engine of Data Lake Analytics (DLA). | Create and run Spark jobs |
| Data processing | DLA Spark SQL | Uses SQL statements to submit jobs to the Spark clusters of DLA. | N/A |
| General operations | SQL Assignment for Single Instance | Assigns the data that is obtained by using a SELECT statement to the output variables of the node. The output variables can be used as the input variables of downstream nodes. A sketch is also shown after this table. | Configure an SQL assignment node |
| General operations | Conditional Branch | Makes conditional judgments in task flows. During the execution of a task flow, if the conditional expression of a conditional branch node evaluates to true, the subsequent tasks are run. Otherwise, the subsequent tasks are not run. | Configure a conditional branch node |
| General operations | Script | Uses Database Gateway-based script tasks to run scripts periodically or at a specific point in time. | Configure a script node |
| Status check | Check Whether Data Exists in Table After Specified Time | Checks whether incremental data exists in a table after a specific point in time. | N/A |
| Status check | Lindorm File Check | Checks whether a file exists in an ApsaraDB for Lindorm instance that supports Hadoop Distributed File System (HDFS). | N/A |
| Status check | Audit Task | Checks the data quality of a table. After you specify a quality rule for the table and a scheduling cycle for the audit task, DMS checks the data quality of the table and generates a report. | N/A |
| Status check | Check for Task Flow Dependency | Configures self-dependency for a task flow or dependencies across task flows. You can make the task flow depend on another task flow or a task node. | N/A |
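To give a sense of what a cross-database Spark SQL node body looks like, the following sketch aggregates order data from an online database into a warehouse table. Everything in it is an assumption for illustration: the aliases `order_db` and `dw_db` stand for two databases that you would add as references in the node's SQL editor, the table and column names are hypothetical, and `bizdate` is assumed to be a time variable defined in the task flow's variable settings.

```sql
-- Minimal sketch of a cross-database Spark SQL node body.
-- `order_db` and `dw_db` are hypothetical aliases for two referenced
-- databases; `${bizdate}` is an assumed task-flow time variable.
INSERT INTO dw_db.daily_order_stats
SELECT order_date,
       COUNT(*)    AS order_cnt,
       SUM(amount) AS total_amount
FROM   order_db.orders
WHERE  order_date = '${bizdate}'
GROUP BY order_date;
```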
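Similarly, here is a minimal sketch of an SQL assignment node, assuming a hypothetical `orders` table in a MySQL database. The column aliases of the single-row result become the output variables of the node.

```sql
-- Minimal sketch of an SQL assignment node (hypothetical table).
-- The query must return one row; each column alias becomes an
-- output variable available to downstream nodes.
SELECT COUNT(*)                 AS new_order_cnt,
       IFNULL(MAX(order_id), 0) AS max_order_id
FROM   orders
WHERE  gmt_create >= CURDATE();
```

A downstream conditional branch node could then evaluate an expression such as `${new_order_cnt} > 0` so that the rest of the task flow runs only when new data has arrived; the variable-reference syntax here is likewise assumed for illustration.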