Auto scaling dynamically scales computing resources to match your business requirements, which makes resource management more cost-effective. This topic introduces auto scaling and the related components.
Background information
Auto scaling is widely used in Kubernetes, typically for scenarios such as online workload scaling and periodic workload scheduling. Container Compute Service (ACS) supports Horizontal Pod Autoscaler (HPA), Cron Horizontal Pod Autoscaler (CronHPA), Advanced Horizontal Pod Autoscaler (AHPA), Kubernetes Event-driven Autoscaling (KEDA), and Automatic Vertical Pod Autoscaler (AVPA). These components dynamically adjust the number of application replicas or the resource specifications, either by monitoring workloads or by following a schedule, to ensure efficient resource usage and service stability.
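To make "adjusts the number of application replicas by monitoring workloads" concrete, the following is a minimal sketch of the replica calculation described in the upstream Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance (0.1 by default upstream) and clamped to the configured minimum and maximum. The function name, parameter names, and default bounds are illustrative assumptions, and the sketch assumes that ACS follows the upstream rule; it does not cover the other components.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Sketch of the upstream HPA rule (hypothetical helper, not an ACS API)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    # Ratio of the observed metric to the target, for example CPU utilization.
    ratio = current_metric / target_metric

    # Upstream HPA skips scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Round up so that the target is not exceeded after scaling.
    desired = math.ceil(current_replicas * ratio)

    # Keep the result within the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas at 90% CPU with a 60% target.
    print(desired_replicas(4, 90, 60))
```

With 4 replicas, an observed utilization of 90, and a target of 60, the ratio is 1.5, so the sketch suggests 6 replicas. A schedule-based component such as CronHPA would instead set the replica count at fixed times rather than derive it from a metric.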
Auto scaling components
Component | Description | Use scenario | Limit
HPA | A built-in Kubernetes component for online applications. | Online businesses | HPA can scale workloads that support the scale operation, such as Deployments and StatefulSets.
CronHPA | An open source component for applications whose resource usage changes periodically. | Periodically changing workloads | CronHPA can scale workloads that support the scale operation, such as Deployments and StatefulSets. CronHPA is compatible with HPA, so you can use the two in combination.
AHPA | An open source component for workloads that fluctuate periodically, such as livestreaming, online education, and gaming services. | Periodically fluctuating workloads | AHPA can scale workloads that support the scale operation, such as Deployments and StatefulSets. AHPA requires at least seven days of historical data to perform predictive scaling.
KEDA | An open source component for scenarios such as offline audio and video transcoding, event-driven jobs, and stream data processing. | Event-driven workloads | KEDA can scale workloads that support the scale operation, such as Deployments and StatefulSets.
AVPA | A marketplace component mainly for stateful workloads or jobs that are not suitable for horizontal scaling, such as gaming services and offline jobs, as well as workloads whose resource specifications were set improperly and need dynamic adjustment. | Application startup acceleration; short-term load fluctuation workloads | AVPA applies to all workloads. It suits short-term load changes that last 10 minutes or longer and currently supports only CPU resource adjustment. It is not suitable for second-level load bursts or memory resource adjustment.
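The scenarios and limits in the table can be restated as a small lookup and pre-check. This is only an illustrative sketch: the scenario keys, constant names, and the `check_limits` helper are hypothetical and are not part of any ACS or Kubernetes API. The thresholds (seven days of history for AHPA, changes of 10 minutes or longer and CPU only for AVPA) come from the table.

```python
from typing import List

# Scenario-to-component mapping taken from the table (keys are hypothetical).
COMPONENT_BY_SCENARIO = {
    "online": "HPA",
    "periodic": "CronHPA",
    "periodic_fluctuating": "AHPA",
    "event_driven": "KEDA",
    "short_term_fluctuation": "AVPA",
}

AHPA_MIN_HISTORY_DAYS = 7        # AHPA needs at least seven days of history.
AVPA_MIN_CHANGE_MINUTES = 10.0   # AVPA suits changes lasting 10 minutes or longer.
AVPA_RESOURCES = {"cpu"}         # AVPA currently adjusts CPU only.


def check_limits(component: str,
                 history_days: int = 0,
                 resource: str = "cpu",
                 change_minutes: float = 0.0) -> List[str]:
    """Return the table limits that a planned setup would violate."""
    problems: List[str] = []

    if component == "AHPA" and history_days < AHPA_MIN_HISTORY_DAYS:
        problems.append("AHPA needs at least seven days of historical data")

    if component == "AVPA":
        if resource not in AVPA_RESOURCES:
            problems.append("AVPA supports only CPU resource adjustment")
        if change_minutes < AVPA_MIN_CHANGE_MINUTES:
            problems.append("AVPA is not suitable for changes shorter than 10 minutes")

    return problems


if __name__ == "__main__":
    component = COMPONENT_BY_SCENARIO["periodic_fluctuating"]
    print(component, check_limits(component, history_days=3))
```

An empty list means that none of the limits encoded here is violated; it does not confirm that the component is the best fit for the workload.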