Auto scaling is a service that can dynamically scale computing resources to meet your business requirements. Auto scaling provides a more cost-effective method to manage your resources. This topic introduces auto scaling and the related components.

Background

Auto scaling is widely used in Container Service for Kubernetes (ACK) clusters. Typical scenarios include online workload scaling, large-scale computing and training, GPU-accelerated deep learning, inference and training based on shared GPU resources, and periodic workload scheduling. Auto scaling provides elasticity in the following two ways:
  • Workload scaling. Auto scaling can adjust workloads, such as pods. For example, Horizontal Pod Autoscaler (HPA) is a typical workload scaling component that can change the number of replicated pods to scale the workload.
  • Resource scaling. If the resources of a cluster cannot meet the scaling requirements of workloads, Elastic Compute Service (ECS) instances or elastic container instances are added to the cluster.
The components for workload scaling and resource scaling can be used separately or in combination. If you use workload scaling without resource scaling, workloads can be scaled only within the resource limit of the cluster.
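
The interplay between the two kinds of scaling can be sketched in a few lines of Python. The replica calculation follows the formula published in the Kubernetes HPA documentation (ceil of current replicas times the ratio of current to target metric); it leaves out the tolerance band and the min/max replica bounds that a real HPA applies. The `nodes_to_add` helper and its fixed `pods_per_node` capacity are hypothetical simplifications made for this illustration, not how any ACK component actually plans capacity.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Workload scaling: replica count per the documented HPA formula."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


def nodes_to_add(replicas: int, pods_per_node: int, current_nodes: int) -> int:
    """Resource scaling (hypothetical model): extra nodes needed to fit all pods.

    Assumes every node can host exactly `pods_per_node` pods, which is a
    simplification made only for this sketch.
    """
    needed_nodes = math.ceil(replicas / pods_per_node)
    return max(0, needed_nodes - current_nodes)


# 4 replicas running at 90% average CPU with a 60% target.
replicas = desired_replicas(current_replicas=4, current_metric=90.0, target_metric=60.0)
extra = nodes_to_add(replicas, pods_per_node=2, current_nodes=2)
print(replicas, extra)  # prints "6 1"
```

In this example the workload needs 6 replicas, but the 2 existing nodes hold only 4 pods. Without a resource scaling component, the 2 additional pods would stay pending.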

Scaling components for ACK clusters

Components for workload scaling

HPA
  • Description: A built-in component of Kubernetes. HPA is used for online applications.
  • Scenario: Online business
  • Limits: HPA uses Deployments and StatefulSets to scale workloads.
  • References: HPA

VPA (alpha)
  • Description: An open source component. Vertical Pod Autoscaler (VPA) is used for monolithic applications.
  • Scenario: Monolithic applications
  • Limits: VPA is used for applications that cannot be horizontally scaled. Typically, VPA is used when pods are recovered from anomalies.
  • References: Vertical pod autoscaling

CronHPA
  • Description: An open source component provided by ACK. CronHPA is used for applications whose resource usage periodically changes.
  • Scenario: Periodically changing workloads
  • Limits: CronHPA uses Deployments and StatefulSets to scale workloads. CronHPA is compatible with HPA. You can use CronHPA and HPA in combination to scale workloads.
  • References: CronHPA

Elastic-Workload
  • Description: A component provided by ACK. Elastic-Workload is used in scenarios where fine-grained scaling is required. For example, you can use Elastic-Workload if you want to distribute a workload across different zones.
  • Scenario: Scenarios where fine-grained scaling is required
  • Limits: Elastic-Workload is applicable to online workloads that require fine-grained scaling. For example, some pods of a Deployment are scheduled to an ECS instance, and the rest of the pods are scheduled to elastic container instances.
  • References: Install ack-kubernetes-elastic-workload

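
The fine-grained split described for Elastic-Workload, where some pods of a Deployment run on ECS instances and the rest run on elastic container instances, can be illustrated with a small Python sketch. This is only a model of the idea: the function `split_replicas` and the `ecs_max` cap are hypothetical names chosen for this example and do not correspond to the component's actual configuration fields.

```python
from typing import Tuple


def split_replicas(total: int, ecs_max: int) -> Tuple[int, int]:
    """Return (ecs_pods, eci_pods) for a Deployment with `total` replicas.

    Hypothetical policy: place pods on ECS instances first, up to `ecs_max`,
    and send any overflow to elastic container instances.
    """
    if total < 0 or ecs_max < 0:
        raise ValueError("total and ecs_max must be non-negative")
    ecs_pods = min(total, ecs_max)
    return ecs_pods, total - ecs_pods


print(split_replicas(total=10, ecs_max=6))  # prints "(6, 4)"
print(split_replicas(total=4, ecs_max=6))   # prints "(4, 0)"
```

Under this assumed policy, the overflow pods are the first candidates for removal when the workload scales in, so the elastic container instances are used only during peaks.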
Components for resource scaling

cluster-autoscaler
  • Description: cluster-autoscaler is an open source component provided by Kubernetes that can scale nodes in a cluster. cluster-autoscaler is integrated with auto scaling to provide more elastic and cost-effective scaling services.
  • Scenario: Especially suited to online workloads, deep learning tasks, and large-scale computing tasks.
  • Time cost for delivery: The time required to add 100 nodes to a cluster:
      • Standard mode: 120 seconds.
      • Fast mode: 60 seconds.
      • Standard mode with images that support quick boot (Qboot): 90 seconds.
      • Fast mode with images that support Qboot: 45 seconds.
    For more information about images that support Qboot, see Alibaba Cloud Linux 2 (Quick Start).
  • References: Auto scaling of nodes

virtual-node
  • Description: virtual-node is an open source component provided by ACK. virtual-node provides the runtime for serverless applications. Developers do not need to manage node resources. They only create, manage, and pay for pods based on actual usage.
  • Scenario: Traffic spikes, continuous integration and continuous delivery (CI/CD), and big data computing.
  • Time cost for delivery: The time required to create 1,000 pods in a cluster:
      • When image caching is disabled: 30 seconds.
      • When image caching is enabled: 15 seconds.
  • References: Deploy the virtual node controller and use it to create Elastic Container Instance-based pods

virtual-kubelet-autoscaler
  • Description: virtual-kubelet-autoscaler is a component provided by ACK. virtual-kubelet-autoscaler is used to scale serverless applications.
  • Scenario: Traffic spikes, CI/CD, and big data computing.
  • Time cost for delivery: The time required to create 1,000 pods in a cluster:
      • When image caching is disabled: 30 seconds.
      • When image caching is enabled: 15 seconds.
  • References: Install virtual-kubelet-autoscaler
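
To compare the delivery times listed above, the figures can be placed in a small lookup table. The following Python sketch only restates the numbers from this topic (time to add 100 nodes, time to create 1,000 pods). The dictionary layout and the helper names are assumptions made for illustration, and the figures should not be extrapolated linearly to other node or pod counts.

```python
# Seconds to add 100 nodes, keyed by (mode, image supports Qboot).
NODE_DELIVERY_SECONDS = {
    ("standard", False): 120,
    ("fast", False): 60,
    ("standard", True): 90,
    ("fast", True): 45,
}

# Seconds to create 1,000 pods, keyed by whether image caching is enabled.
POD_DELIVERY_SECONDS = {False: 30, True: 15}


def fastest_node_option():
    """Return the ((mode, qboot), seconds) entry with the shortest delivery time."""
    return min(NODE_DELIVERY_SECONDS.items(), key=lambda item: item[1])


def seconds_saved(baseline, candidate):
    """Seconds saved on a 100-node scale-out by switching from baseline to candidate."""
    return NODE_DELIVERY_SECONDS[baseline] - NODE_DELIVERY_SECONDS[candidate]


print(fastest_node_option())                               # prints "(('fast', True), 45)"
print(seconds_saved(("standard", False), ("fast", True)))  # prints "75"
print(POD_DELIVERY_SECONDS[False] - POD_DELIVERY_SECONDS[True])  # prints "15"
```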

Logs of auto scaling activities

For more information about how to collect the logs of auto scaling activities, see Collect log files of system components.