Auto scaling is a service that dynamically scales computing resources to meet your business requirements and provides a more cost-effective way to manage those resources. This topic introduces auto scaling and its related components.
Background information
Auto scaling is widely used in Container Service for Kubernetes (ACK) clusters. Typically, auto scaling is used in scenarios such as online workload scaling, large-scale computing and training, GPU-accelerated deep learning, inference and training based on shared GPU resources, and periodic workload scheduling. Auto scaling enables elasticity from the following aspects:
- Workload scaling. Auto scaling can scale workloads, such as pods. For example, Horizontal Pod Autoscaler (HPA) is a typical workload scaling component that changes the number of pod replicas to scale a workload.
- Resource scaling. If the resources of a cluster cannot meet the scaling requirements of workloads, Elastic Compute Service (ECS) instances or elastic container instances are added to the cluster.
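As a concrete illustration of workload scaling, the following is a minimal HorizontalPodAutoscaler manifest. The Deployment name `web-app` and the thresholds are hypothetical placeholders, not values from this topic:

```yaml
# Scales the hypothetical Deployment "web-app" between 2 and 10 replicas,
# targeting 50% average CPU utilization across its pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```

When average CPU utilization across the pods rises above 50%, HPA adds replicas (up to 10); when it falls, HPA removes replicas (down to 2).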
Introduction to auto scaling
Components for workload scaling

Component | Description | Scenario | Limits | References |
---|---|---|---|---|
HPA | A built-in component of Kubernetes. HPA is used for online applications. | Online applications | HPA uses Deployments and StatefulSets to scale workloads. | Horizontal Pod Autoscaling |
Vertical Pod Autoscaler (VPA) | An open source component. VPA is used for monolithic applications. | Monolithic applications | VPA is used for applications that cannot be horizontally scaled. Typically, VPA is used when pods are recovered from anomalies. | Vertical pod auto scaling |
CronHPA | An open source component provided by ACK. CronHPA is used for applications whose resource usage periodically changes. | Periodically fluctuating workloads | CronHPA uses Deployments and StatefulSets to scale workloads. CronHPA is compatible with HPA. You can use CronHPA and HPA in combination to scale workloads. | CronHPA |
Elastic-Workload | A component provided by ACK. Elastic-Workload is used in scenarios where fine-grained scaling is required. For example, you can use Elastic-Workload if you want to distribute a workload across different zones. | Scenarios where fine-grained scaling is required | Elastic-Workload is applicable to online workloads that require fine-grained scaling. For example, some pods of a Deployment are scheduled to an ECS instance, and the rest of the pods are scheduled to elastic container instances. | Install ack-kubernetes-elastic-workload |
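Because CronHPA is compatible with HPA, a common pattern is to pre-scale a workload on a fixed schedule for predictable peaks. The sketch below uses the CronHorizontalPodAutoscaler CRD installed by ACK's cronhpa controller; the workload name, schedules, and sizes are hypothetical, and the exact field names should be verified against the installed CRD version:

```yaml
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: web-app-cronhpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app          # hypothetical workload
  jobs:
    # Scale up before the expected morning peak.
    # The cron expression includes a leading seconds field.
    - name: scale-up-morning
      schedule: "0 0 8 * * *"
      targetSize: 10
    # Scale back down after traffic subsides at night.
    - name: scale-down-night
      schedule: "0 0 22 * * *"
      targetSize: 2
```

Each job sets the workload to `targetSize` replicas when its schedule fires, while HPA can continue to adjust replicas between scheduled jobs.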
Components for resource scaling

Component | Description | Scenario | Time cost for delivery | References |
---|---|---|---|---|
cluster-autoscaler | cluster-autoscaler is an open source component provided by Kubernetes that scales the nodes in a cluster. In ACK, cluster-autoscaler is integrated with auto scaling to provide more elastic and cost-effective scaling services. | cluster-autoscaler is applicable to all scenarios, especially online workloads, deep learning, and large-scale computing. | The amount of time that is required to add 1,000 nodes to a cluster: | Auto scaling of nodes |