This topic provides answers to some frequently asked questions about auto scaling in Container Service for Kubernetes (ACK).
What do I do if the current field shows unknown in the HPA metrics?
Name: kubernetes-tutorial-deployment
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 10 Jun 2019 11:46:48 0530
Reference: Deployment/kubernetes-tutorial-deployment
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 2%
Min replicas: 1
Max replicas: 4
Deployment pods: 1 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 3m3s (x1009 over 4h18m) horizontal-pod-autoscaler unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
- Cause 1: The data sources from which resource metrics are fetched are unavailable.
Run the
kubectl top pod
command to check whether metric data of monitored pods is returned. If no metric data is returned, run thekubectl get apiservice
command to check whether the metrics-server component is available. The following output shows an example of the returned data:NAME SERVICE AVAILABLE AGE v1. Local True 29h v1.admissionregistration.k8s.io Local True 29h v1.apiextensions.k8s.io Local True 29h v1.apps Local True 29h v1.authentication.k8s.io Local True 29h v1.authorization.k8s.io Local True 29h v1.autoscaling Local True 29h v1.batch Local True 29h v1.coordination.k8s.io Local True 29h v1.monitoring.coreos.com Local True 29h v1.networking.k8s.io Local True 29h v1.rbac.authorization.k8s.io Local True 29h v1.scheduling.k8s.io Local True 29h v1.storage.k8s.io Local True 29h v1alpha1.argoproj.io Local True 29h v1alpha1.fedlearner.k8s.io Local True 5h11m v1beta1.admissionregistration.k8s.io Local True 29h v1beta1.alicloud.com Local True 29h v1beta1.apiextensions.k8s.io Local True 29h v1beta1.apps Local True 29h v1beta1.authentication.k8s.io Local True 29h v1beta1.authorization.k8s.io Local True 29h v1beta1.batch Local True 29h v1beta1.certificates.k8s.io Local True 29h v1beta1.coordination.k8s.io Local True 29h v1beta1.events.k8s.io Local True 29h v1beta1.extensions Local True 29h ... [v1beta1.metrics.k8s.io kube-system/metrics-server True 29h] ... v1beta1.networking.k8s.io Local True 29h v1beta1.node.k8s.io Local True 29h v1beta1.policy Local True 29h v1beta1.rbac.authorization.k8s.io Local True 29h v1beta1.scheduling.k8s.io Local True 29h v1beta1.storage.k8s.io Local True 29h v1beta2.apps Local True 29h v2beta1.autoscaling Local True 29h v2beta2.autoscaling Local True 29h
If the apiservice for v1beta1.metrics.k8s.io is not kube-system/metrics-server, check whether metrics-server is overwritten by Prometheus Operator. If metrics-server is overwritten by Prometheus Operator, use the following YAML template to redeploy metrics-server:apiVersion: apiregistration.k8s.io/v1beta1 kind: APIService metadata: name: v1beta1.metrics.k8s.io spec: service: name: metrics-server namespace: kube-system group: metrics.k8s.io version: v1beta1 insecureSkipTLSVerify: true groupPriorityMinimum: 100 versionPriority: 100
If the issue still exists, see Troubleshooting.
- Cause 2: Metrics cannot be fetched during a rolling update or scale-out activity.
By default, metrics-server fetches metrics at intervals of 1 second. However, metrics-server must wait a few seconds before it can collect metrics after a rolling update or scale-out activity. We recommend that you query metrics 2 seconds after a rolling update or scale-out activity.
- Cause 3: The request field is not specified for the pod.
By default, HPA obtains the CPU or memory usage of the pod by calculating the value of used resources/requested resources. If the requested resources are not specified in the pod configurations, HPA cannot calculate the resource usage. Therefore, you must ensure that the request field is specified in the resource parameter of the pod configurations.
What do I do if excess pods are added by HPA during a rolling update?
## Add the following parameter to the startup settings of metrics-server.
--enable-hpa-rolling-update-skipped=true
What do I do if HPA does not scale pods when the scaling threshold is reached?
HPA may not scale pods even if the CPU or memory usage drops below the scale-in threshold or exceeds the scale-out threshold. HPA also takes other factors into consideration when it scales pods. For example, HPA checks whether the current scale-out activity triggers a scale-in activity or the current scale-in activity triggers a scale-out activity. This avoids repetitive scaling and prevents unnecessary resource consumption.
How do I configure the metric collection interval of HPA?
For metrics-server versions later than 0.2.1-b46d98c-aliyun, specify the --metric_resolution
parameter in the startup settings. Example: --metric_resolution=15s
.
Can CronHPA and HPA interact without conflicts?
CronHPA and HPA can interact without conflicts. ACK modifies the CronHPA configurations by setting scaleTargetRef to the scaling object of HPA. This way, only HPA scales the application that is specified by scaleTargetRef. This also enables CronHPA to detect the state of HPA. CronHPA does not directly change the number of pods for the Deployment. CronHPA triggers HPA to scale the pods. This way, conflicts between CronHPA and HPA are resolved. For more information about how to enable CronHPA and HPA to interact without conflicts, see CronHPA.
How do I fix the issue that excess pods are added by HPA when CPU or memory usage rapidly increases?
When the pods of Java applications or applications powered by Java frameworks start to run, the CPU and memory usage may be high for a few minutes during the warm-up period. This may trigger HPA to scale out the pods. To fix this issue, upgrade metrics-server to the latest version and add annotations to the pod configurations to prevent HPA from triggering scaling activities. For more information about how to upgrade metrics-server, see Install the metrics-server component.
## A Deployment is used in this example.
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment-basic
labels:
app: nginx
spec:
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
annotations:
HPAScaleUpDelay: 3m # This setting indicates that HPA takes effect 3 minutes after the pods are created. Valid units: s and m. s indicates seconds and m indicates minutes.
spec:
containers:
- name: nginx
image: nginx:1.7.9 # Replace it with your exactly <image_name:tags>.
ports:
- containerPort: 80
FAQ about node auto scaling
- Why does the auto scaling component fail to add nodes after a scale-out event is triggered?
Check whether the following situations exist:
- The instance types in the scaling group cannot fulfill the resource request from pods. By default, system components are installed for each node. Therefore, the requested pod resources must be less than the resource capacity of the instance type.
- The worker RAM role does not have the required permissions. You must configure RAM roles for each Kubernetes cluster that is involved in the scale-out activity.
- Why does the auto scaling component fail to remove nodes after a scale-in event is
triggered?
Check whether the following situations exist:
- The requested resource threshold of each pod is higher than the configured scale-in threshold.
- Pods that belong to the kube-system namespace are running on the node.
- A hard scheduling policy is configured to force the pods to run on the current node. Therefore, the pods cannot be scheduled to other nodes.
- PodDisruptionBudget is set for the pods on the node and the minimum value of PodDisruptionBudget is reached.
For more information about FAQ, see open source component.
- How does the system choose a scaling group for a scaling event?
When pods cannot be scheduled to nodes, the auto scaling component simulates the scheduling of the pods based on the configurations of scaling groups. The configurations include labels, taints, and instance specifications. If a scaling group can simulate the scheduling of the pods, this scaling group is selected for the scale-out activity. If more than one scaling groups meet the requirements, the system selects the scaling group that has the fewest idle resources after simulation.
FAQ about HPA based on Alibaba Cloud metrics
-
What do I do if the TARGETS column shows <unknow> after I run the
kubectl get hpa
command?Perform the following operations to troubleshoot the issue:- Run the
kubectl describe hpa <hpa_name>
command to check why HPA does not function as normal.- If the value of AbleToScale is False in the Conditions field, check whether the Deployment is created as normal.
- If the value of ScalingActive is False in the Conditions field, proceed to the next step.
- Run the
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/"
command. If Error from server (NotFound): the server could not find the requested resource is returned, verify the status of alibaba-cloud-metrics-adapter.If the status of alibaba-cloud-metrics-adapter is normal, check whether the HPA metrics are relevant to the Ingress. If the metrics are relevant to the Ingress, make sure that you deploy the Log Service component before ack-alibaba-cloud-metrics-adapter is deployed. For more information, see Monitor nginx-ingress and analyze the access log of nginx-ingress.
- Make sure that the values of the HPA metrics are valid. The value of sls.ingress.route must be in the
<namespace>-<svc>-<port>
format.<namespace>
specifies the namespace to which the Ingress belongs.<svc>
specifies the name of the Service that you selected when you created the Ingress.<port>
specifies the port of the Service.
- Run the
-
Where can I find the metrics that are supported by HPA?
For more information about the metrics that are supported by HPA, see Alibaba Cloud metrics adapter. The following table describes the commonly used metrics.Metric Description Additional parameter sls_ingress_qps The number of requests that the Ingress can process per second based on a specific routing rule. sls.ingress.route sls_ingress_latency_avg The average latency of all requests. sls.ingress.route sls_ingress_latency_p50 The maximum latency for the fastest 50% of all requests. sls.ingress.route sls_ingress_latency_p95 The maximum latency for the fastest 95% of all requests. sls.ingress.route sls_ingress_latency_p99 The maximum latency for the fastest 99% of all requests. sls.ingress.route sls_ingress_latency_p9999 The maximum latency for the fastest 99.99% of all requests. sls.ingress.route sls_ingress_inflow The inbound bandwidth of the Ingress. sls.ingress.route