You can create an application that has Horizontal Pod Autoscaling (HPA) enabled in
the Container Service for Kubernetes (ACK) console. HPA can automatically scale container
resources for your application. You can also use a YAML file to configure HPA settings.
Background information
In Kubernetes 1.18 and later versions, the v2beta2 API version allows you to use the behavior
parameter of HPA to configure scaling settings. You can set the scaleUp and scaleDown fields
in the behavior parameter to configure scale-out and scale-in settings. If you want HPA to perform
only scale-out operations or only scale-in operations, you can enable HPA in the ACK
console and then click Disable Scale-in or Disable Scale-out.
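The following is a minimal sketch of an HPA that uses the behavior parameter to allow scale-out but turn off scale-in. The source does not show what the console options generate, so treat this as the equivalent setting in the upstream Kubernetes API rather than the exact object that ACK creates. The HPA name and the target Deployment name are hypothetical.

```yaml
# Sketch only: an HPA that may scale out but never scales in.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa-scale-out-only   # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx                    # assumes a Deployment named nginx exists
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      selectPolicy: Disabled       # turns off scale-in
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent              # at most double the pod count
        value: 100
        periodSeconds: 15          # within any 15-second period
```

To turn off scale-out instead, set selectPolicy: Disabled under scaleUp and leave scaleDown at its defaults.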
Create an application that has HPA enabled in the ACK console
ACK is integrated with HPA. You can create an application that has HPA enabled in
the ACK console. You can enable HPA when you create an application or after the application
is created.
Method 1: Enable HPA when you create an application
- Log on to the ACK console.
- In the left-side navigation pane of the ACK console, click Clusters.
- On the Clusters page, find the cluster that you want to manage and click the name of the cluster
or click Details in the Actions column. The details page of the cluster appears.
- In the left-side navigation pane of the details page, go to the Deployments page.
- On the Deployments page, click Create from Image.
- On the Basic Information wizard page, enter a name for your application, set the parameters, and then click
Next.
Parameter | Description
Namespace | Select the namespace to which the application belongs. The default namespace is automatically selected.
Name | Enter a name for the application.
Replicas | The number of pods that you want to provision for the application. Default value: 2.
Type | The type of the application. You can select Deployment, StatefulSet, Job, CronJob, or DaemonSet.
Label | Add labels to the application. The labels are used to identify the application.
Annotations | Add annotations to the application.
Synchronize Timezone | Specify whether to synchronize the time zone between nodes and containers. This parameter is supported only by ACK clusters. Serverless Kubernetes (ASK) clusters do not support this parameter.
- On the Container wizard page, set the container parameters, select an image, and then configure the
required computing resources. Click Next. For more information, see Configure containers.
Note: You must configure the computing resources required by the Deployment. Otherwise,
you cannot enable HPA.
- On the Advanced wizard page, click Create to the right of Services in the Access Control section, and then set the parameters. For more information, see Configure advanced settings.
- On the Advanced wizard page, select Enable for HPA and configure the scaling threshold and related settings.
Parameter | Description
Metric | Select CPU Usage or Memory Usage. The selected resource type must be the same as that specified in the Required Resources field.
Condition | Specify the resource usage threshold. HPA triggers scale-out events when the threshold is exceeded. For more information about the algorithms that are used to perform horizontal pod autoscaling, see Algorithm details.
Max. Replicas | Specify the maximum number of pods to which the Deployment can be scaled.
Min. Replicas | Specify the minimum number of pods that must run for the Deployment.
Disable Scale-in, Disable Scale-out | If you want HPA to perform only scale-out operations, click Disable Scale-in. If you want HPA to perform only scale-in operations, click Disable Scale-out. For more information, see Background information.
- In the lower-right corner of the Advanced wizard page, click Create to create the application that has HPA enabled.
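For orientation, the HPA settings above correspond roughly to fields of a HorizontalPodAutoscaler object. The mapping in the comments below is an assumption based on the field names, not something the source confirms, and the application name and values are hypothetical. This sketch shows a memory-based HPA:

```yaml
# Illustrative only: a memory-based HPA that mirrors the console fields.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-app-hpa             # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app               # hypothetical Deployment
  minReplicas: 2                 # Min. Replicas
  maxReplicas: 8                 # Max. Replicas
  metrics:
  - type: Resource
    resource:
      name: memory               # Metric: Memory Usage
      target:
        type: Utilization
        averageUtilization: 70   # Condition: threshold as a percentage of the memory request
```

Because the threshold is a percentage of the requested resource, the container must declare a memory request for a memory-based HPA to work, just as it must declare a CPU request for a CPU-based one.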
Verify the result
- Click View Details or return to the Deployments page. On the page that appears, click the name of the created application or click Details in the Actions column. Then, click the Pod Scaling tab to view information about the scaling group of the application.
- After the application starts running, container resources are automatically scaled
based on the CPU utilization of the application. To confirm that HPA works, you can
run a CPU stress test on the pods of the application in a staging environment and
verify that the pods are automatically scaled within 30 seconds.
Method 2: Enable HPA for an existing application
The following example describes how to enable HPA for a Deployment.
- Log on to the ACK console.
- In the left-side navigation pane of the ACK console, click Clusters.
- On the Clusters page, find the cluster that you want to manage and click the name of the cluster
or click Details in the Actions column. The details page of the cluster appears.
- In the left-side navigation pane of the details page, go to the Deployments page.
- On the Deployments page, click the name of the application that you want to manage.
- Click the Pod Scaling tab and click Create.
- In the Create dialog box, configure the HPA settings. For more information about how to set the
parameters, see the HPA settings in Step 9 of Method 1, in which you select Enable for HPA on the Advanced wizard page.
- Click OK.
Create an application that has HPA enabled by using kubectl
You can also create an HPA by using an orchestration template and associate the HPA
with the Deployment for which you want to enable HPA. Then, you can run kubectl commands to enable HPA.
In the following example, HPA is enabled for an NGINX application.
- Create a file named nginx.yml and copy the following content to the file.
Example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9 # Replace with your own <image_name:tag>.
        ports:
        - containerPort: 80
        resources:
          requests: # This parameter is required to run the HPA.
            cpu: 500m
- Run the following command to create an NGINX application:
kubectl create -f nginx.yml
- Create an HPA.
Use the scaleTargetRef parameter to associate the HPA with the nginx Deployment. Save the
following content to a file and create the HPA by running kubectl create -f on that file,
in the same way as in the previous step.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Note: You must configure resource requests for the pods of the application. Otherwise, the
HPA cannot be started. If the requests are missing, the kubectl describe hpa <name> command
returns warnings similar to the following:
Warning FailedGetResourceMetric 2m (x6 over 4m) horizontal-pod-autoscaler missing request for cpu on container nginx in pod default/nginx-deployment-basic-75675f5897-mqzs7
Warning FailedComputeMetricsReplicas 2m (x6 over 4m) horizontal-pod-autoscaler failed to get cpu utilization: missing request for cpu on container nginx in pod default/nginx-deployment-basic-75675f5
- After the HPA is created, run the kubectl describe hpa <name> command. If the following output is returned, the HPA is running as expected:
Normal SuccessfulRescale 39s horizontal-pod-autoscaler New size: 1; reason: All metrics below target
If the average CPU utilization of the pods of the NGINX application exceeds the 50% target
specified in the HPA settings, the HPA automatically creates new pods, up to the value of
maxReplicas. If the average CPU utilization drops below 50%, the HPA automatically removes
pods, down to the value of minReplicas.
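How many pods the HPA adds or removes follows the formula given in the upstream Kubernetes HPA documentation. The worked example below applies it to the 50% target of nginx-hpa. The observed utilization values (100% and 20%) are hypothetical.

```latex
% Replica calculation used by the Kubernetes HPA controller
\[
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times
  \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]

% Scale-out: 2 pods averaging 100% CPU utilization against a 50% target
\[
\left\lceil 2 \times \frac{100}{50} \right\rceil = 4
\]

% Scale-in: 4 pods averaging 20% CPU utilization against a 50% target
\[
\left\lceil 4 \times \frac{20}{50} \right\rceil = \lceil 1.6 \rceil = 2
\]
```

The result is always kept between minReplicas and maxReplicas (1 and 10 in this example). With upstream Kubernetes defaults, the HPA also skips scaling when the ratio of the current value to the target is within 10% of 1.0, and it delays scale-in by a five-minute stabilization window, so pods are not removed the moment utilization dips below the target.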