Horizontal Pod Autoscaler (HPA) is a component that can automatically scale the number of pods in Kubernetes clusters. This topic provides answers to some frequently asked questions about HPA.

If the current status of HPA shows a FailedGetResourceMetric warning, as in the following output, kube-controller-manager cannot collect monitoring metrics from the data source.

Name:                                                  kubernetes-tutorial-deployment
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 10 Jun 2019 11:46:48 +0530
Reference:                                             Deployment/kubernetes-tutorial-deployment
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 2%
Min replicas:                                          1
Max replicas:                                          4
Deployment pods:                                       1 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
  Type     Reason                   Age                      From                       Message
  ----     ------                   ----                     ----                       -------
  Warning  FailedGetResourceMetric  3m3s (x1009 over 4h18m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
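
For orientation, the status above is consistent with an HPA defined roughly as follows. This is only a sketch reconstructed from the fields shown in the output (name, namespace, scale target, replica bounds, and the 2% CPU target). The actual manifest is not part of this topic, and the choice of the autoscaling/v1 API version is an assumption.

```yaml
# Sketch reconstructed from the describe output above; not the original manifest.
apiVersion: autoscaling/v1          # assumed API version
kind: HorizontalPodAutoscaler
metadata:
  name: kubernetes-tutorial-deployment
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kubernetes-tutorial-deployment
  minReplicas: 1
  maxReplicas: 4
  targetCPUUtilizationPercentage: 2 # shown as "<unknown> / 2%" while metrics are unavailable
```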

Possible causes:

  • Cause 1: The data source from which metrics are collected is unavailable.

    Run the kubectl top pod command to check whether metrics of pods are returned. If no metric data is returned, run the kubectl get apiservice command to check whether the data source is available.

    Sample output:

    NAME                                   SERVICE                      AVAILABLE   AGE
    v1.                                    Local                        True        29h
    v1.admissionregistration.k8s.io        Local                        True        29h
    v1.apiextensions.k8s.io                Local                        True        29h
    v1.apps                                Local                        True        29h
    v1.authentication.k8s.io               Local                        True        29h
    v1.authorization.k8s.io                Local                        True        29h
    v1.autoscaling                         Local                        True        29h
    v1.batch                               Local                        True        29h
    v1.coordination.k8s.io                 Local                        True        29h
    v1.monitoring.coreos.com               Local                        True        29h
    v1.networking.k8s.io                   Local                        True        29h
    v1.rbac.authorization.k8s.io           Local                        True        29h
    v1.scheduling.k8s.io                   Local                        True        29h
    v1.storage.k8s.io                      Local                        True        29h
    v1alpha1.argoproj.io                   Local                        True        29h
    v1alpha1.fedlearner.k8s.io             Local                        True        5h11m
    v1beta1.admissionregistration.k8s.io   Local                        True        29h
    v1beta1.alicloud.com                   Local                        True        29h
    v1beta1.apiextensions.k8s.io           Local                        True        29h
    v1beta1.apps                           Local                        True        29h
    v1beta1.authentication.k8s.io          Local                        True        29h
    v1beta1.authorization.k8s.io           Local                        True        29h
    v1beta1.batch                          Local                        True        29h
    v1beta1.certificates.k8s.io            Local                        True        29h
    v1beta1.coordination.k8s.io            Local                        True        29h
    v1beta1.events.k8s.io                  Local                        True        29h
    v1beta1.extensions                     Local                        True        29h
    v1beta1.metrics.k8s.io                 kube-system/metrics-server   True        29h
    v1beta1.networking.k8s.io              Local                        True        29h
    v1beta1.node.k8s.io                    Local                        True        29h
    v1beta1.policy                         Local                        True        29h
    v1beta1.rbac.authorization.k8s.io      Local                        True        29h
    v1beta1.scheduling.k8s.io              Local                        True        29h
    v1beta1.storage.k8s.io                 Local                        True        29h
    v1beta2.apps                           Local                        True        29h
    v2beta1.autoscaling                    Local                        True        29h
    v2beta2.autoscaling                    Local                        True        29h

    If the SERVICE column for v1beta1.metrics.k8s.io does not show kube-system/metrics-server, check whether metrics-server has been overwritten by prometheus-operator. If it has, use the following YAML template to redeploy metrics-server:

    apiVersion: apiregistration.k8s.io/v1beta1
    kind: APIService
    metadata:
      name: v1beta1.metrics.k8s.io
    spec:
      service:
        name: metrics-server
        namespace: kube-system
      group: metrics.k8s.io
      version: v1beta1
      insecureSkipTLSVerify: true
      groupPriorityMinimum: 100
      versionPriority: 100

    If no error is found after you have performed the preceding checks, refer to the troubleshooting content of metrics-server in the References section.

  • Cause 2: Metrics cannot be collected during a rolling update or scale-out activity.

    By default, metrics-server collects metrics at intervals of 1 second. However, metrics-server must wait a few seconds before it can collect metrics after a rolling update or scale-out activity. We recommend that you query metrics 2 seconds after a rolling update or scale-out activity.

  • Cause 3: The request field is not specified in the pod configurations.

    By default, HPA obtains the CPU or memory usage of the pod by calculating the value of used resources/requested resources. If the requested resources are not specified in the pod configurations, HPA cannot calculate the resource usage. Therefore, you must ensure that the request field is specified in the pod configurations.
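
As a minimal sketch of the fix for Cause 3, the following fragment sets resource requests on a container. The container name, image, and request values are illustrative assumptions, not values from this topic. Choose requests that match your workload.

```yaml
# Fragment of a pod template (for example, under spec.template in a Deployment).
spec:
  containers:
  - name: app                 # hypothetical container name
    image: nginx:1.7.9        # placeholder image
    resources:
      requests:
        cpu: 250m             # illustrative; CPU utilization = CPU used / this request
        memory: 256Mi         # illustrative; needed when HPA targets memory utilization
```

With these illustrative values, a pod that uses 125m of CPU reports a CPU utilization of 50%.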

During a rolling update, kube-controller-manager performs zero filling on pods whose monitoring data cannot be collected, that is, it treats their missing metrics as zero. This may add excess pods. To fix this issue, we recommend that you upgrade metrics-server to the latest version and configure the following startup settings of metrics-server:

## Add the following configuration to the startup settings of metrics-server.

HPA may not scale pods even if the CPU or memory usage drops below the scale-in threshold or exceeds the scale-out threshold. HPA also takes other factors into consideration when it scales pods. For example, it checks whether a scale-out activity would in turn trigger a scale-in activity, or whether a scale-in activity would in turn trigger a scale-out activity. This avoids repetitive scaling and prevents unnecessary resource consumption.

To set the interval at which metrics-server 0.2.1-b46d98c-aliyun collects metrics, specify the --metric_resolution parameter in the startup settings. Example: --metric_resolution=15s.

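The following fragment sketches where such a flag might go in the metrics-server Deployment. The container name and the use of args (rather than command) are assumptions about how metrics-server is deployed in your cluster. Only the flag and its example value come from the text above.

```yaml
# Fragment of the metrics-server Deployment (layout assumed).
spec:
  template:
    spec:
      containers:
      - name: metrics-server          # assumed container name
        args:                         # your Deployment may use "command" instead
        - --metric_resolution=15s     # flag as given for 0.2.1-b46d98c-aliyun
```
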
CronHPA and HPA can interact without conflicts. ACK modifies the CronHPA configurations by setting scaleTargetRef to the scaling object of HPA. This way, only HPA scales the application that is specified by scaleTargetRef. This also enables CronHPA to detect the state of HPA. CronHPA does not directly change the number of pods for the Deployment. It triggers HPA to scale the pods. This resolves the conflict between CronHPA and HPA. For more information about how to enable CronHPA and HPA to interact without conflicts, see CronHPA in the References section.

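The arrangement described above can be sketched as a CronHPA object whose scaleTargetRef points at an HPA rather than at the Deployment. The apiVersion, kind, and jobs fields are assumed to follow the CRD of the open source kubernetes-cronhpa-controller. All names, the schedule, and targetSize are hypothetical. Check them against the CRD installed in your cluster.

```yaml
# Sketch only; field names assumed from the kubernetes-cronhpa-controller CRD.
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: cronhpa-sample                 # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler      # the target is the HPA, not the Deployment
    name: nginx-deployment-basic-hpa   # hypothetical HPA name
  jobs:
  - name: "scale-up"                   # hypothetical job
    schedule: "0 0 8 * * *"            # assumed 6-field cron format (seconds first)
    targetSize: 3
```
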
When the pods of Java applications or applications powered by Java frameworks start running, the CPU and memory usage may be high for a few minutes during the warm-up period. This may trigger HPA to scale out the pods. To fix this issue, we recommend that you upgrade metrics-server to the latest version and add annotations to the pod configurations to prevent HPA from triggering scaling activities in this case. For more information about how to upgrade metrics-server, see Install the metrics-server component in the References section.

The following YAML template provides the sample pod configurations that prevent HPA from triggering scaling activities in this case.

## A Deployment is used in this example.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        HPAScaleUpDelay: 3m # This setting indicates that HPA takes effect 3 minutes after the pods are created. Valid units: s and m. s indicates seconds and m indicates minutes.
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9 # Replace it with your actual <image_name:tag>.
        ports:
        - containerPort: 80