Airflow is an open-source workflow orchestration and scheduling tool designed for big data development. It supports job development, Directed Acyclic Graph (DAG) scheduling, and batch-oriented workflow monitoring. Airflow lets you define workflows in Python and provides a plugin mechanism that enables flexible integration with most external technologies and systems.
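Conceptually, a DAG run executes tasks in dependency order: a task starts only after all of its upstream tasks have finished. The following plain-Python sketch (not Airflow code; task names are illustrative) shows this idea as a topological sort over a task-dependency map:

```python
# Conceptual sketch of DAG scheduling: resolve a valid execution order
# for tasks given their upstream dependencies (Kahn's algorithm).
from collections import deque

def execution_order(deps):
    """Return a run order for tasks given {task: {upstream tasks}}."""
    # Track the unresolved upstream dependencies of every task.
    pending = {t: set(d) for t, d in deps.items()}
    ready = deque(t for t, d in pending.items() if not d)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        # Completing a task unblocks its downstream tasks.
        for t, d in pending.items():
            if task in d:
                d.remove(task)
                if not d and t not in order and t not in ready:
                    ready.append(t)
    if len(order) != len(pending):
        raise ValueError("cycle detected: not a DAG")
    return order

# extract >> transform >> load, with an audit task downstream of extract
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "audit": {"extract"},
}
print(execution_order(dag))  # → ['extract', 'transform', 'audit', 'load']
```

In a real Airflow deployment the scheduler performs this resolution for you; you only declare the dependencies in a DAG file.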
Currently, Airflow is in invitational preview. To use Airflow, contact DMS technical support.
Scenarios
Orchestrate data development tasks in Data Management (DMS), such as SQL execution and data purging.
Schedule batch processing tasks for AnalyticDB for MySQL Spark.
Schedule data integration tasks in Data Transmission Service (DTS).
Precautions
Airflow is available only in the Singapore region.
Overview
Prepare resources required for Airflow.
Prepare ApsaraDB RDS for PostgreSQL, Tair (Redis OSS-compatible), and Object Storage Service (OSS) instances, and enable Internet access for Airflow.
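These resources typically back standard Airflow configuration options: the metadata database, the Celery broker, and remote task logging. The fragment below is a sketch only; the section and key names are standard Airflow configuration options, but every connection string, host name, and bucket path is a placeholder you must replace with your own values, and OSS log storage depends on the Alibaba provider package being available in your deployment.

```ini
# airflow.cfg sketch — all endpoints and credentials are placeholders.

[database]
# Metadata database on ApsaraDB RDS for PostgreSQL.
sql_alchemy_conn = postgresql+psycopg2://airflow_user:password@rds-host:5432/airflow

[celery]
# Message broker on Tair (Redis OSS-compatible).
broker_url = redis://:password@tair-host:6379/0

[logging]
# Remote task logs in an OSS bucket (requires the Alibaba provider package).
remote_logging = True
remote_base_log_folder = oss://your-bucket/airflow/logs
```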
Create a dedicated Git account for operations in the code repository. Operations performed with this account are not visible to other users in the same workspace, including Alibaba Cloud accounts and RAM users.
Create a code repository in Git or other version control systems to store and manage Airflow DAG files.
Perform operations in the repository, such as editing and publishing code.
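A typical edit-and-publish cycle looks like the sketch below. The repository name, author identity, and file names are placeholders; in practice you would clone your own remote repository first and push to it at the end.

```shell
# Illustrative edit-and-publish cycle for a DAG repository
# (uses a local throwaway repository; names are placeholders).
git init -q airflow-dags && cd airflow-dags
git config user.email "you@example.com" && git config user.name "Your Name"
mkdir -p dags
printf '# placeholder DAG file\n' > dags/example_dag.py
git add dags/example_dag.py
git commit -qm "Add example DAG"
git log --oneline
# Publishing to the remote (assumes a configured origin): git push origin main
```

Once the commit is pushed, Airflow picks up the new or changed DAG files from the repository on its next sync.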
View Airflow execution status.
View the DAG scheduling status and monitor batch-oriented workflows in the Airflow workspace.
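Besides the web UI, Airflow 2.x exposes DAG run status through its stable REST API. The sketch below only builds an authenticated request for the DAG-run list endpoint; the host, credentials, and DAG ID are placeholders, and whether the REST API is reachable depends on how your workspace is deployed.

```python
# Sketch: query DAG run status via Airflow's stable REST API (Airflow 2.x).
# Host, credentials, and DAG ID below are placeholders.
import base64
import urllib.request

def dag_runs_request(base_url, dag_id, user, password):
    """Build an authenticated request for the DAG-run list endpoint."""
    url = f"{base_url}/api/v1/dags/{dag_id}/dagRuns"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})

req = dag_runs_request("http://airflow.example.com", "example_dag", "user", "pass")
print(req.full_url)  # http://airflow.example.com/api/v1/dags/example_dag/dagRuns
# To actually fetch the runs (network call, not executed here):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```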