You can deploy the vertical-pod-autoscaler component in a Container Service for Kubernetes (ACK) cluster to enable the Vertical Pod Autoscaler (VPA), which provides vertical auto scaling of pods. The VPA automatically sets the resource requests of pods based on their actual resource usage. This way, ACK can schedule each pod to a node that has sufficient resources. The VPA also maintains the ratio of the resource request to the resource limit that you specify in the initial container configurations. This topic describes how to use YAML files to enable vertical pod auto scaling.
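
As a rough illustration of the request-to-limit ratio, consider the following container fragment. The values and the adjusted figures in the comments are hypothetical. They sketch the proportional behavior described above and are not output from a real cluster.

```yaml
# Hypothetical container resources as initially configured.
# The limit is twice the request for both CPU and memory (ratio 1:2).
resources:
  requests:
    cpu: 100m
    memory: 200Mi
  limits:
    cpu: 200m
    memory: 400Mi
# If the VPA raised the requests to cpu: 150m and memory: 300Mi,
# keeping the same 1:2 ratio would give limits of
# cpu: 300m and memory: 600Mi.
```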

Prerequisites

Make sure that the following operations are completed:
  • An ACK cluster is created and its Kubernetes version is later than 1.12. For more information, see Create an ACK managed cluster.
  • You are connected to the cluster by using a command-line tool. For more information, see Connect to ACK clusters by using kubectl.
  • Any VPA that was previously installed in the cluster is uninstalled. This avoids conflicts when you deploy a new version of the VPA.

Background information

Notice Vertical pod auto scaling is in testing. Use this feature with caution.
  • Updating the resource configurations of running pods is an experimental capability of the VPA. Each update recreates the pod, and the recreated pod may be scheduled to a different node.
  • The VPA does not evict pods that are not managed by a controller, such as a replication controller. For these pods, the Auto mode is equivalent to the Initial mode.
  • Do not use the VPA together with the Horizontal Pod Autoscaler (HPA) when the HPA scales on CPU or memory metrics. If the HPA scales only on custom or external metrics, you can use the VPA in conjunction with the HPA.
  • The VPA uses an admission webhook as its admission controller. If other admission webhooks exist in the cluster, make sure that the admission webhooks do not conflict with the admission webhook of the VPA. The execution sequence of admission controllers is defined in the parameters of the API server.
  • The VPA can handle most out of memory (OOM) events, but may fail to handle OOM events in some scenarios.
  • The VPA performance is not tested in large-scale clusters.
  • The pod resource request that is modified by the VPA may exceed the upper limit of the actual resources, including node resources, idle resources, and resource quotas. In this case, a pod may enter the Pending state and fail to be scheduled. You can use the cluster autoscaler to mitigate the impact of this issue.
  • If multiple VPAs monitor the resource usage of a pod at the same time, some undefined behavior may occur.
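
The HPA and resource-limit notes above can be sketched as follows. This is an illustration under stated assumptions, not a tested configuration: the Deployment name my-app, the container name, the metric http_requests_per_second (which would have to be served by a custom metrics adapter), and all numeric values are hypothetical. The HPA manifest also assumes that the cluster serves the autoscaling/v2 API. The HPA scales on a custom metric only, and the VPA uses maxAllowed to keep its requests below what the nodes can offer.

```yaml
# Sketch only: my-app and http_requests_per_second are hypothetical names.
# The HPA scales on a custom per-pod metric, not on CPU or memory.
apiVersion: autoscaling/v2   # assumes the cluster serves autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
---
# The VPA manages the CPU and memory requests of the same Deployment.
# minAllowed and maxAllowed bound the values that the VPA may set,
# which reduces the risk of pods that are too large to schedule.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: my-app      # hypothetical container name
        controlledResources: ["cpu", "memory"]
        minAllowed:
          cpu: 50m
          memory: 100Mi
        maxAllowed:
          cpu: "1"
          memory: 1Gi
```

Only one VPA targets the Deployment in this sketch, in line with the note about multiple VPAs.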

Install vertical-pod-autoscaler

  1. Run the following command to create a CustomResourceDefinition (CRD) for vertical-pod-autoscaler.

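    The CRDs extend the Kubernetes API with the VerticalPodAutoscaler and VerticalPodAutoscalerCheckpoint resource types that the VPA relies on. For more information, see Extend the Kubernetes API with CustomResourceDefinitions.

    kubectl apply -f crd.yaml

    Save the manifest that matches your cluster version as crd.yaml before you run the command. The manifest that immediately follows uses the apiextensions.k8s.io/v1beta1 API, which Kubernetes 1.22 and later no longer serve:
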
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: verticalpodautoscalers.autoscaling.k8s.io
      annotations:
        "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
    spec:
      group: autoscaling.k8s.io
      scope: Namespaced
      names:
        plural: verticalpodautoscalers
        singular: verticalpodautoscaler
        kind: VerticalPodAutoscaler
        shortNames:
          - vpa
      version: v1beta1
      versions:
        - name: v1beta1
          served: false
          storage: false
        - name: v1beta2
          served: true
          storage: true
        - name: v1
          served: true
          storage: false
      validation:
        # openAPIV3Schema is the schema for validating custom objects.
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              required: []
              properties:
                targetRef:
                  type: object
                updatePolicy:
                  type: object
                  properties:
                    updateMode:
                      type: string
                resourcePolicy:
                  type: object
                  properties:
                    containerPolicies:
                      type: array
                      items:
                        type: object
                        properties:
                          containerName:
                            type: string
                          controlledValues:
                            type: string
                            enum: ["RequestsAndLimits", "RequestsOnly"]
                          mode:
                            type: string
                            enum: ["Auto", "Off"]
                          minAllowed:
                            type: object
                          maxAllowed:
                            type: object
                          controlledResources:
                            type: array
                            items:
                              type: string
                              enum: ["cpu", "memory"]
    ---
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
      annotations:
        "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
    spec:
      group: autoscaling.k8s.io
      scope: Namespaced
      names:
        plural: verticalpodautoscalercheckpoints
        singular: verticalpodautoscalercheckpoint
        kind: VerticalPodAutoscalerCheckpoint
        shortNames:
          - vpacheckpoint
      version: v1beta1
      versions:
        - name: v1beta1
          served: false
          storage: false
        - name: v1beta2
          served: true
          storage: true
        - name: v1
          served: true
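          storage: false

    The following manifest defines the same two CRDs by using the apiextensions.k8s.io/v1 API, which is available in Kubernetes 1.16 and later:
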
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: verticalpodautoscalers.autoscaling.k8s.io
      annotations:
        "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
    spec:
      group: autoscaling.k8s.io
      scope: Namespaced
      names:
        plural: verticalpodautoscalers
        singular: verticalpodautoscaler
        kind: VerticalPodAutoscaler
        shortNames:
          - vpa
      versions:
        - name: v1beta1
          served: false
          storage: false
          schema:
            # openAPIV3Schema is the schema for validating custom objects.
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  required: []
                  properties:
                    targetRef:
                      type: object
                    updatePolicy:
                      type: object
                      properties:
                        updateMode:
                          type: string
                    resourcePolicy:
                      type: object
                      properties:
                        containerPolicies:
                          type: array
                          items:
                            type: object
                            properties:
                              containerName:
                                type: string
                              controlledValues:
                                type: string
                                enum: ["RequestsAndLimits", "RequestsOnly"]
                              mode:
                                type: string
                                enum: ["Auto", "Off"]
                              minAllowed:
                                type: object
                              maxAllowed:
                                type: object
                              controlledResources:
                                type: array
                                items:
                                  type: string
                                  enum: ["cpu", "memory"]
        - name: v1beta2
          served: true
          storage: true
          schema:
            # openAPIV3Schema is the schema for validating custom objects.
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  required: []
                  properties:
                    targetRef:
                      type: object
                    updatePolicy:
                      type: object
                      properties:
                        updateMode:
                          type: string
                    resourcePolicy:
                      type: object
                      properties:
                        containerPolicies:
                          type: array
                          items:
                            type: object
                            properties:
                              containerName:
                                type: string
                              controlledValues:
                                type: string
                                enum: ["RequestsAndLimits", "RequestsOnly"]
                              mode:
                                type: string
                                enum: ["Auto", "Off"]
                              minAllowed:
                                type: object
                              maxAllowed:
                                type: object
                              controlledResources:
                                type: array
                                items:
                                  type: string
                                  enum: ["cpu", "memory"]
        - name: v1
          served: true
          storage: false
          schema:
            # openAPIV3Schema is the schema for validating custom objects.
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  required: []
                  properties:
                    targetRef:
                      type: object
                    updatePolicy:
                      type: object
                      properties:
                        updateMode:
                          type: string
                    resourcePolicy:
                      type: object
                      properties:
                        containerPolicies:
                          type: array
                          items:
                            type: object
                            properties:
                              containerName:
                                type: string
                              controlledValues:
                                type: string
                                enum: ["RequestsAndLimits", "RequestsOnly"]
                              mode:
                                type: string
                                enum: ["Auto", "Off"]
                              minAllowed:
                                type: object
                              maxAllowed:
                                type: object
                              controlledResources:
                                type: array
                                items:
                                  type: string
                                  enum: ["cpu", "memory"]
    
    ---
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
      annotations:
        "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
    spec:
      group: autoscaling.k8s.io
      scope: Namespaced
      names:
        plural: verticalpodautoscalercheckpoints
        singular: verticalpodautoscalercheckpoint
        kind: VerticalPodAutoscalerCheckpoint
        shortNames:
          - vpacheckpoint
      versions:
        - name: v1beta1
          served: false
          storage: false
          schema:
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  properties:
                    cronSpec:
                      type: string
                    image:
                      type: string
                    replicas:
                      type: integer
        - name: v1beta2
          served: true
          storage: true
          schema:
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  properties:
                    cronSpec:
                      type: string
                    image:
                      type: string
                    replicas:
                      type: integer
        - name: v1
          served: true
          storage: false
          schema:
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  properties:
                    cronSpec:
                      type: string
                    image:
                      type: string
                    replicas:
                      type: integer
    
  2. Install the components of vertical-pod-autoscaler.
    vertical-pod-autoscaler contains the following components: admission-controller, recommender, and updater.
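    Note Before you install the admission-controller component, use the certificate generation script to create a TLS certificate for the webhook. The admission-controller Deployment reads the certificate from the Secret named vpa-tls-certs in the kube-system namespace.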

    Use the following YAML template to install admission-controller:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: vpa-admission-controller
      namespace: kube-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: vpa-admission-controller
      template:
        metadata:
          labels:
            app: vpa-admission-controller
        spec:
          serviceAccountName: admin
          containers:
            - name: admission-controller
              image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.7.0
              imagePullPolicy: Always
              env:
                - name: NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
              volumeMounts:
                - name: tls-certs
                  mountPath: "/etc/tls-certs"
                  readOnly: true
              resources:
                limits:
                  cpu: 200m
                  memory: 500Mi
                requests:
                  cpu: 50m
                  memory: 200Mi
              ports:
                - containerPort: 8000
          volumes:
            - name: tls-certs
              secret:
                secretName: vpa-tls-certs
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: vpa-webhook
      namespace: kube-system
    spec:
      ports:
        - port: 443
          targetPort: 8000
      selector:
        app: vpa-admission-controller
    

    Use the following YAML template to install recommender:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: vpa-recommender
      namespace: kube-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: vpa-recommender
      template:
        metadata:
          labels:
            app: vpa-recommender
        spec:
          serviceAccountName: admin
          containers:
          - name: recommender
            image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.7.0
            imagePullPolicy: Always
            resources:
              limits:
                cpu: 200m
                memory: 1000Mi
              requests:
                cpu: 50m
                memory: 500Mi
            ports:
            - containerPort: 8080
    

    Use the following YAML template to install updater:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: vpa-updater
      namespace: kube-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: vpa-updater
      template:
        metadata:
          labels:
            app: vpa-updater
        spec:
          serviceAccountName: admin
          containers:
            - name: updater
              image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.7.0
              imagePullPolicy: Always
              resources:
                limits:
                  cpu: 200m
                  memory: 1000Mi
                requests:
                  cpu: 50m
                  memory: 500Mi
              ports:
                - containerPort: 8080
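
The admission-controller Deployment in this step mounts a Secret named vpa-tls-certs at /etc/tls-certs. As a sketch of what the certificate script is expected to produce, the Secret might look like the following. The key names are an assumption based on the defaults of the upstream Kubernetes autoscaler project, and the values are placeholders. Check both against the script that you actually use.

```yaml
# Sketch only: key names are assumed from the upstream VPA defaults,
# and the <...> values are placeholders for base64-encoded PEM data.
apiVersion: v1
kind: Secret
metadata:
  name: vpa-tls-certs        # must match secretName in the Deployment
  namespace: kube-system     # same namespace as vpa-admission-controller
type: Opaque
data:
  caCert.pem: <base64-encoded CA certificate>
  caKey.pem: <base64-encoded CA private key>
  serverCert.pem: <base64-encoded webhook server certificate>
  serverKey.pem: <base64-encoded webhook server private key>
```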
    

Verify that the VPA is installed

  1. Use the following YAML template to create a Deployment named nginx-deployment-basic and a VPA resource named nginx-deployment-basic-vpa.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-basic
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
            - containerPort: 80
    ---
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: nginx-deployment-basic-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind:       Deployment
        name:       nginx-deployment-basic
      updatePolicy:
        updateMode: "Off"
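    Note Set updateMode to Off, and leave the requests and limits fields in the Deployment empty. In the Off mode, the VPA only provides recommendations and does not change the resources of running pods.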
  2. Run the following command to query the CPU request and memory request that the VPA recommends for the Deployment:
    Note The recommendation may take about 2 minutes to appear. If the output does not contain a recommendation, wait and run the command again.
    kubectl describe vpa nginx-deployment-basic-vpa

    The following output shows the recommended resource requests:

    recommendation:
        containerRecommendations:
        - containerName: nginx
          lowerBound:
            cpu: 50m
            memory: 300144k
          target:
            cpu: 50m
            memory: 300144k
          upperBound:
            cpu: 8031m
            memory: 800000k

    You can set resource requests for the Deployment based on the recommendation. The VPA continuously monitors the resource usage of the Deployment and provides suggestions on how to improve resource utilization.
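
For example, the target values in the sample output could be copied into the Deployment as resource requests. This is a minimal sketch that reuses the names from the template in step 1 and the figures from the sample output. The recommendation in your own cluster will differ.

```yaml
# Sketch: nginx-deployment-basic with requests taken from the
# "target" values in the sample recommendation.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 50m          # target.cpu from the sample output
              memory: 300144k   # target.memory from the sample output
```

Because updateMode is Off in the sample VPA, applying this manifest is the only step that changes the requests of the pods.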