Data Management: Overview

Last Updated: Dec 11, 2023

The task orchestration feature of Data Management (DMS) is used to orchestrate task nodes and execute tasks based on a timed scheduling policy or an event scheduling policy that you specify. You can create a task flow that contains one or more task nodes to implement complex scheduling and improve data development efficiency.
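As a rough illustration of what orchestrating task nodes means, the sketch below models a task flow as a dependency graph and runs each node only after its upstream nodes have finished. It is a minimal sketch and not the DMS API: `run_task_flow`, the node names, and the dependency mapping are all hypothetical, and scheduling (timed or event-based) is left out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_task_flow(nodes, dependencies):
    """Run each node after all of its upstream nodes and return the execution order.

    nodes: dict mapping a node name to a zero-argument callable (hypothetical tasks).
    dependencies: dict mapping a node name to the set of its upstream node names.
    """
    # Nodes without an entry in `dependencies` have no upstream nodes.
    graph = {name: dependencies.get(name, set()) for name in nodes}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        nodes[name]()
    return order


# Hypothetical three-node flow: extract -> transform -> load.
log = []
flow_nodes = {
    "extract": lambda: log.append("extract"),
    "transform": lambda: log.append("transform"),
    "load": lambda: log.append("load"),
}
flow_dependencies = {"transform": {"extract"}, "load": {"transform"}}

executed = run_task_flow(flow_nodes, flow_dependencies)
```

In this sketch a scheduler would call `run_task_flow` at each trigger. A cycle in the dependencies makes `TopologicalSorter` raise `CycleError`, which mirrors the requirement that a task flow be acyclic.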

Background information

The rapid development of the Internet and IoT has led to a surge in data volume, more diverse data sources, growing demand for data analysis, and increasingly complex business processes. As a result, manual data processing can no longer meet business requirements. An automated solution is required to build workflows for data processing, analysis, backup, and other data-related tasks.

DMS provides the task orchestration feature that facilitates automatic data processing. This feature helps improve data development efficiency, reduce the error rate, and improve data value and reliability.

Supported database types

  • Relational databases

    • MySQL: ApsaraDB RDS for MySQL, PolarDB for MySQL, ApsaraDB MyBase for MySQL, PolarDB for Xscale, and MySQL databases from other sources

    • SQL Server: ApsaraDB RDS for SQL Server, ApsaraDB MyBase for SQL Server, and SQL Server databases from other sources

    • PostgreSQL: ApsaraDB RDS for PostgreSQL, PolarDB for PostgreSQL, ApsaraDB MyBase for PostgreSQL, and PostgreSQL databases from other sources

    • OceanBase: ApsaraDB for OceanBase in MySQL mode, ApsaraDB for OceanBase in Oracle mode, and self-managed OceanBase databases

    • PolarDB for PostgreSQL (Compatible with Oracle)

    • Oracle

    • Dameng (DM)

    • Db2

  • NoSQL database: Lindorm

  • Data warehouses

    • AnalyticDB for MySQL

    • AnalyticDB for PostgreSQL

    • Data Lake Analytics (DLA)

    • MaxCompute

    • Hologres

  • Object storage: Object Storage Service (OSS)

Task node types

| Category | Task node type | Description | References |
| --- | --- | --- | --- |
| Data processing | Single Instance SQL | Executes SQL statements in a specific relational database. Note: If you enable the lock-free schema change feature for the specified database instance, DMS preferentially applies this feature when you run Single Instance SQL tasks. This prevents tables from being locked. For more information, see Enable the lock-free schema change feature. | N/A |
| General operations | SQL Assignment for Single Instance | Assigns the data that is obtained by using the SELECT statement to the output variables of the current node. The output variables can be used as the input variables of the downstream node. | Configure an SQL assignment node |
| General operations | Conditional Branch | Evaluates a condition in a task flow. During the execution of a task flow, if the conditional expression of a conditional branch node evaluates to true, the subsequent tasks are run. Otherwise, the subsequent tasks are not run. | Configure a conditional branch node |
| General operations | ECS Remote Commands | Runs shell, PowerShell, or batch scripts on a remote Elastic Compute Service (ECS) instance by using Cloud Assistant. | Configure an ECS remote commands node |
| Status checking | Check Whether Data Exists in Table After Specified Time | Checks whether incremental data exists in a table after a specific point in time. | N/A |
| Status checking | Audit Task | Checks the data quality of a table. After you specify a quality rule for the table and a scheduling cycle for the audit task, DMS checks the data quality of the table and generates a report. | N/A |
| Status checking | Check for Task Flow Dependency | Configures self-dependency for a task flow and dependencies across task flows. You can configure the task flow to depend on another task flow or a task node. | Configure a dependency check node for a task flow |
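The behavior of the SQL Assignment for Single Instance, Conditional Branch, and Check Whether Data Exists in Table After Specified Time nodes can be approximated in a few lines. The sketch below is an illustration only, not how DMS implements these nodes: it uses an in-memory SQLite database, and the `orders` table, its columns, and the helper functions `sql_assignment`, `conditional_branch`, and `has_data_after` are all hypothetical.

```python
import sqlite3


def sql_assignment(conn, query, params=()):
    """Run a SELECT and return its first row as a dict of output variables."""
    cursor = conn.execute(query, params)
    row = cursor.fetchone()
    if row is None:
        return {}
    columns = [col[0] for col in cursor.description]
    return dict(zip(columns, row))


def conditional_branch(variables, condition, downstream):
    """Run the downstream task only if the condition evaluates to true."""
    if condition(variables):
        return downstream(variables)
    return None


def has_data_after(conn, since):
    """Return True if the hypothetical orders table has rows created after `since`."""
    row = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE created_at > ?", (since,)
    ).fetchone()
    return row[0] > 0


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders (amount, created_at) VALUES (?, ?)",
    [(20.0, "2023-12-10 08:00:00"), (35.0, "2023-12-11 09:30:00")],
)

# Upstream node: the SELECT result becomes the node's output variables.
outputs = sql_assignment(
    conn, "SELECT COUNT(*) AS order_count, SUM(amount) AS total FROM orders"
)

# Downstream node runs only when the branch condition holds.
result = conditional_branch(
    outputs,
    condition=lambda v: v["order_count"] > 0,
    downstream=lambda v: f"report for {v['order_count']} orders",
)

# Incremental-data checks against two points in time.
fresh = has_data_after(conn, "2023-12-11 00:00:00")
stale = has_data_after(conn, "2023-12-12 00:00:00")
```

The timestamps are stored as ISO 8601 text, so comparing them as strings gives the same ordering as comparing them as times. A production check would normally rely on a native timestamp column instead.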

References