Kubernetes clusters for distributed Argo workflows (workflow clusters, also known as Serverless Argo Workflows) are deployed on a serverless architecture. This cluster type uses Alibaba Cloud Container Compute Service (ACS) or Elastic Container Instance (ECI) to run Argo workflows. It optimizes the performance of the open-source workflow engine and tunes cluster configurations for efficient, elastic, and cost-effective scheduling of large-scale workflows. It also leverages BestEffort instances or preemptible elastic container instances to reduce costs. This topic describes the scenarios, benefits, architecture, and network design of workflow clusters.
Scenarios
Argo Workflows is a powerful cloud-native workflow engine and a graduated project of the Cloud Native Computing Foundation (CNCF). Graduation indicates that the project meets the highest maturity level in user adoption, security, and widespread use. Its primary use cases include batch data processing, machine learning pipelines, infrastructure automation, and CI/CD pipelines. It is widely adopted in industries such as autonomous driving, scientific computing, quantitative finance, and digital media.
Argo Workflows stands out in the field of batch task orchestration due to the following key features:
Cloud-native: Designed specifically for Kubernetes, where each task runs as a pod, making it one of the most popular workflow engines on Kubernetes.
Lightweight and scalable: Lightweight with no VM overhead. It is elastically scalable, capable of launching thousands of tasks in parallel.
Powerful orchestration capabilities: Various types of tasks can be orchestrated, including regular jobs, Spark jobs, Ray jobs, and TensorFlow jobs.
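As a minimal illustration of the pod-per-task model described above, a hypothetical Workflow manifest can define two steps as a DAG, with each step running in its own pod. The template and parameter names below are placeholders chosen for this sketch:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-example-
spec:
  entrypoint: main
  templates:
    # DAG template: "train" starts only after "prepare" completes.
    - name: main
      dag:
        tasks:
          - name: prepare
            template: echo
            arguments:
              parameters: [{name: message, value: prepare-data}]
          - name: train
            template: echo
            dependencies: [prepare]
            arguments:
              parameters: [{name: message, value: train-model}]
    # Leaf template: each invocation is scheduled as a separate pod.
    - name: echo
      inputs:
        parameters:
          - name: message
      container:
        image: alpine:3.18
        command: [echo, "{{inputs.parameters.message}}"]
```

Because workflow clusters follow the open-source Argo Workflows spec, manifests like this can be submitted unmodified.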
Serverless Argo Workflows benefits
Workflow clusters are developed based on open source Argo Workflows and comply with the standards of open source workflows. If you have Argo workflows running in existing Container Service for Kubernetes (ACK) clusters or Kubernetes clusters, you can seamlessly upgrade the clusters to workflow clusters without modifying the workflows.
By using workflow clusters, you can orchestrate workflows and run each workflow step in its own container. This lets you build efficient CI/CD pipelines and quickly launch large numbers of containers for compute-intensive jobs such as machine learning and data processing.
Workflow clusters are ready to use out of the box and require zero operational overhead, letting you focus on workflow development instead of engine maintenance and version upgrades.
Workflow clusters offer extreme elasticity and auto scaling. Resources are released as soon as they are no longer needed, minimizing compute costs.
Workflow clusters provide highly reliable scheduling and multi-zone load balancing.
Workflow clusters run on control planes that are optimized for performance, efficiency, stability, and observability.
Workflow clusters provide enhanced Object Storage Service (OSS) management capabilities, such as large object uploads, artifact garbage collection (GC), and data streaming.
Technical support from the community is available to help your teams optimize workflows, improving performance and reducing costs.
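As one example of the artifact management capabilities above, open-source Argo Workflows (v3.4 and later) lets you enable artifact GC declaratively in the workflow spec, and workflow clusters follow the same spec. The sketch below omits the artifact repository configuration that points at your OSS bucket; the image and names are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-gc-example-
spec:
  entrypoint: main
  # Delete output artifacts from object storage when the workflow is deleted.
  artifactGC:
    strategy: OnWorkflowDeletion
  templates:
    - name: main
      container:
        image: alpine:3.18
        command: [sh, -c, "echo hello > /tmp/out.txt"]
      outputs:
        artifacts:
          - name: result
            path: /tmp/out.txt
```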
Architecture
Workflow clusters are serverless workflow engines built on Kubernetes clusters that host the open-source Argo Workflows.
Network design
Workflow clusters are available in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Shenzhen), China (Zhangjiakou), China (Heyuan), China (Guangzhou), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Japan (Tokyo), Germany (Frankfurt), UK (London), and Thailand (Bangkok). To use workflow clusters in other regions, join the DingTalk group 35688562 for technical support from product technical experts.
Create a virtual private cloud (VPC) or select an existing VPC.
Create vSwitches or select existing vSwitches.
Ensure that the CIDR blocks of the vSwitches you use can provide sufficient IP addresses for Argo workflows. Argo workflows may create many pods, each requiring an IP address from your vSwitches.
Create a vSwitch in each zone of your selected region. When creating a workflow cluster, specify multiple vSwitch IDs in the input parameters. After creation, the workflow cluster automatically creates ACS pods or elastic container instances in zones that have sufficient inventory to run workflows. If ACS pods or elastic container instances are unavailable in every zone of your selected region, workflows cannot run.
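To estimate whether your vSwitch CIDR blocks can accommodate a planned level of pod parallelism, you can sum the usable addresses across CIDRs. The sketch below uses Python's standard `ipaddress` module; the number of addresses reserved per vSwitch is an assumption for illustration, so check the VPC documentation for the exact value:

```python
import ipaddress

# Addresses assumed reserved by the platform per vSwitch CIDR
# (illustrative value; consult the VPC documentation).
RESERVED_PER_VSWITCH = 4

def usable_pod_ips(cidrs, reserved=RESERVED_PER_VSWITCH):
    """Return the total number of IP addresses available for pods
    across the given vSwitch CIDR blocks."""
    total = 0
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr)
        total += max(net.num_addresses - reserved, 0)
    return total

# Two /24 vSwitches in different zones: 2 * (256 - 4) = 504 pod IPs.
capacity = usable_pod_ips(["192.168.0.0/24", "192.168.1.0/24"])
print(capacity)                 # 504
print(capacity >= 1000)         # not enough for 1,000 parallel pods: False
```

If the estimate falls short of your peak parallelism, use larger CIDR blocks or add vSwitches in more zones before creating the cluster.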