Horizontal Pod Autoscaler (HPA) is a component that can automatically scale the number of pods in Kubernetes clusters. This topic provides answers to some frequently asked questions about HPA.
If a FailedGetResourceMetric warning is returned in the current status of the HPA, as shown in the following output, it indicates that kube-controller-manager cannot collect monitoring metrics from the data source.
Name:                                                  kubernetes-tutorial-deployment
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 10 Jun 2019 11:46:48 +0530
Reference:                                             Deployment/kubernetes-tutorial-deployment
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 2%
Min replicas:                                          1
Max replicas:                                          4
Deployment pods:                                       1 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
  Type     Reason                   Age                      From                       Message
  ----     ------                   ----                     ----                       -------
  Warning  FailedGetResourceMetric  3m3s (x1009 over 4h18m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
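The status above is typically returned by a command such as the following; the HPA name and namespace come from the sample output and must be replaced with your own:
kubectl describe hpa kubernetes-tutorial-deployment -n default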
- Cause 1: The data source from which metrics are collected is unavailable.
Run the kubectl top pod command to check whether the metrics of pods are returned. If no metric data is returned, run the kubectl get apiservice command to check whether the data source is available.
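For example (the namespace is illustrative):
kubectl top pod -n default
kubectl get apiservice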
NAME                                   SERVICE                      AVAILABLE   AGE
v1.                                    Local                        True        29h
v1.admissionregistration.k8s.io        Local                        True        29h
v1.apiextensions.k8s.io                Local                        True        29h
v1.apps                                Local                        True        29h
v1.authentication.k8s.io               Local                        True        29h
v1.authorization.k8s.io                Local                        True        29h
v1.autoscaling                         Local                        True        29h
v1.batch                               Local                        True        29h
v1.coordination.k8s.io                 Local                        True        29h
v1.monitoring.coreos.com               Local                        True        29h
v1.networking.k8s.io                   Local                        True        29h
v1.rbac.authorization.k8s.io           Local                        True        29h
v1.scheduling.k8s.io                   Local                        True        29h
v1.storage.k8s.io                      Local                        True        29h
v1alpha1.argoproj.io                   Local                        True        29h
v1alpha1.fedlearner.k8s.io             Local                        True        5h11m
v1beta1.admissionregistration.k8s.io   Local                        True        29h
v1beta1.alicloud.com                   Local                        True        29h
v1beta1.apiextensions.k8s.io           Local                        True        29h
v1beta1.apps                           Local                        True        29h
v1beta1.authentication.k8s.io          Local                        True        29h
v1beta1.authorization.k8s.io           Local                        True        29h
v1beta1.batch                          Local                        True        29h
v1beta1.certificates.k8s.io            Local                        True        29h
v1beta1.coordination.k8s.io            Local                        True        29h
v1beta1.events.k8s.io                  Local                        True        29h
v1beta1.extensions                     Local                        True        29h
...
v1beta1.metrics.k8s.io                 kube-system/metrics-server   True        29h
...
v1beta1.networking.k8s.io              Local                        True        29h
v1beta1.node.k8s.io                    Local                        True        29h
v1beta1.policy                         Local                        True        29h
v1beta1.rbac.authorization.k8s.io      Local                        True        29h
v1beta1.scheduling.k8s.io              Local                        True        29h
v1beta1.storage.k8s.io                 Local                        True        29h
v1beta2.apps                           Local                        True        29h
v2beta1.autoscaling                    Local                        True        29h
v2beta2.autoscaling                    Local                        True        29h
If the APIService for v1beta1.metrics.k8s.io does not point to metrics-server in the kube-system namespace, check whether metrics-server has been overwritten by prometheus-operator. If it has, use the following YAML template to redeploy metrics-server:
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
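After you save the template to a file, you can apply it with a command such as the following; the file name is illustrative:
kubectl apply -f metrics-server-apiservice.yaml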
If no error is found after you have performed the preceding checks, refer to the troubleshooting content of metrics-server in the References section.
- Cause 2: Metrics cannot be collected during a rolling update or scale-out activity.
By default, metrics-server collects metrics at intervals of 1 second. However, metrics-server must wait a few seconds before it can collect metrics after a rolling update or scale-out activity. We recommend that you query metrics 2 seconds after a rolling update or scale-out activity.
- Cause 3: The request field is not specified in the pod configurations. By default, HPA obtains the CPU or memory usage of a pod by calculating used resources/requested resources. If the requested resources are not specified in the pod configurations, HPA cannot calculate the resource usage. Therefore, you must ensure that the request field is specified in the pod configurations, as shown in the example below.
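For example, the request field can be set on each container in the pod template; the values below are illustrative:
containers:
- name: nginx
  image: nginx:1.7.9
  resources:
    requests:          # HPA computes usage as used resources / requested resources.
      cpu: 250m        # illustrative value
      memory: 256Mi    # illustrative value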
If HPA adds excess pods during a rolling update, you can configure metrics-server to skip the metrics that are collected during rolling updates:
## Add the following configuration to the startup settings of metrics-server.
--enable-hpa-rolling-update-skipped=true
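One way to add this flag, assuming metrics-server runs as a Deployment named metrics-server in the kube-system namespace (as shown in the APIService output above), is to edit its container arguments:
kubectl -n kube-system edit deployment metrics-server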
To change the interval at which metrics-server collects metrics, set the --metric_resolution parameter in the startup settings. Example:
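The following value is illustrative; choose an interval that suits your workload:
--metric_resolution=15s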
To use CronHPA together with HPA, set the scaleTargetRef of CronHPA to the scaling object of HPA. This way, only HPA scales the application that is specified by scaleTargetRef, and CronHPA can detect the state of HPA. CronHPA does not directly change the number of pods of the Deployment. Instead, it triggers HPA to scale the pods. This resolves the conflict between CronHPA and HPA. For more information about how to enable CronHPA and HPA to interact without conflicts, see CronHPA in the References section.
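The following is a minimal sketch of such a configuration, modeled on the open-source kubernetes-cronhpa-controller; the apiVersion, resource names, and schedule are assumptions and may differ in your environment:
apiVersion: autoscaling.alibabacloud.com/v1beta1   # assumed CRD group/version of kubernetes-cronhpa-controller
kind: CronHorizontalPodAutoscaler
metadata:
  name: cronhpa-sample                              # illustrative name
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: autoscaling/v2beta2                 # point CronHPA at the HPA object instead of the Deployment
    kind: HorizontalPodAutoscaler
    name: nginx-deployment-basic-hpa                # illustrative HPA name
  jobs:
  - name: scale-up
    schedule: "0 0 8 * * *"                         # assumed 6-field cron expression (seconds included)
    targetSize: 4                                   # illustrative target replica count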
The following YAML template provides sample pod configurations that use the HPAScaleUpDelay annotation to prevent HPA from triggering scaling activities immediately after pods are created.
## A Deployment is used in this example.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        HPAScaleUpDelay: 3m # This setting indicates that HPA takes effect 3 minutes after the pods are created. Valid units: s and m. s indicates seconds and m indicates minutes.
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9 # Replace this with your own <image_name:tag>.
        ports:
        - containerPort: 80