Vertical pod autoscaling - Container Service for Kubernetes

The Vertical Pod Autoscaler (VPA) automatically adjusts CPU and memory requests for Pods based on observed resource usage. After you deploy the vertical-pod-autoscaler component in a Container Service for Kubernetes (ACK) cluster, VPA analyzes workload resource consumption, generates resource recommendations, helps Pods get scheduled onto nodes with sufficient resources, and maintains the ratio between the requests and limits defined in your original container configuration.
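
To illustrate how the ratio between requests and limits can be preserved, the following sketch rescales a limit in proportion to a newly recommended request. The function name scaled_limit and the sample values are assumptions made for illustration only. The VPA components may round or cap values differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute a new limit that keeps the original
# limit/request ratio. All values are plain integers in the same unit
# (for example, millicores). Integer division truncates the result.
scaled_limit() {
  local request="$1" limit="$2" new_request="$3"
  echo $(( new_request * limit / request ))
}

# Original container: request 100m, limit 200m (ratio 1:2).
# Recommended request: 250m.
scaled_limit 100 200 250   # prints 500
```

In this example, a container that started with a 1:2 ratio keeps that ratio after the request changes from 100m to 250m.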

Prerequisites

An ACK cluster that runs a Kubernetes version later than 1.12. For more information, see Create an ACK managed cluster.
kubectl is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Any existing vertical-pod-autoscaler installation is removed from the cluster to avoid version conflicts.
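
The prerequisites above can be checked with a short script. The following is a minimal sketch, not an official tool: the helper name minor_later_than and the sample version string are assumptions, and the kubectl commands in the comments are only suggestions for inspecting a live cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the major.minor part of version $1
# is later than the major.minor part of version $2.
minor_later_than() {
  local a="${1#v}" b="${2#v}"
  local a_major="${a%%.*}" b_major="${b%%.*}"
  local a_rest="${a#*.}" b_rest="${b#*.}"
  # Keep only the leading digits of the minor component.
  local a_minor="${a_rest%%[!0-9]*}" b_minor="${b_rest%%[!0-9]*}"
  (( a_major > b_major || (a_major == b_major && a_minor > b_minor) ))
}

# Example with a made-up version string:
if minor_later_than "v1.22.3" "1.12"; then
  echo "version requirement met"
fi

# Against a live cluster, the server version and leftover VPA objects
# could be inspected with, for example:
#   kubectl version
#   kubectl get deployments -n kube-system | grep vpa
#   kubectl get crd | grep verticalpodautoscaler
```

The comparison ignores the patch level and any vendor suffix, which is an assumption about how "later than 1.12" should be read.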

Install the VPA

Step 1: Create RBAC resources

Save the rbac.yaml manifest shown below to a file, and then run the following command to apply the role-based access control (RBAC) configuration:

kubectl apply -f rbac.yaml

rbac.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-reader
rules:
  - apiGroups:
      - "metrics.k8s.io"
    resources:
      - pods
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-actor
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - get
      - list
      - watch
      - create
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
      - patch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-checkpoint-actor
rules:
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalercheckpoints
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - delete
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalercheckpoints
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - delete
  - apiGroups:
      - ""
    resources:
      - namespaces
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:evictioner
rules:
  - apiGroups:
      - "apps"
      - "extensions"
    resources:
      - replicasets
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - pods/eviction
    verbs:
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-reader
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-actor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-actor
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-checkpoint-actor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-checkpoint-actor
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-target-reader
rules:
  - apiGroups:
      - '*'
    resources:
      - '*/scale'
    verbs:
      - get
      - watch
  - apiGroups:
      - ""
    resources:
      - replicationcontrollers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - daemonsets
      - deployments
      - replicasets
      - statefulsets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - batch
    resources:
      - jobs
      - cronjobs
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-target-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-target-reader
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-admission-controller
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-evictioner-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:evictioner
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-admission-controller
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-recommender
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-updater
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-admission-controller
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - configmaps
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "admissionregistration.k8s.io"
    resources:
      - mutatingwebhookconfigurations
    verbs:
      - create
      - delete
      - get
      - list
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "coordination.k8s.io"
    resources:
      - leases
    verbs:
      - create
      - update
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-admission-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-admission-controller
subjects:
  - kind: ServiceAccount
    name: vpa-admission-controller
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-status-reader
rules:
  - apiGroups:
      - "coordination.k8s.io"
    resources:
      - leases
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-status-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-status-reader
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
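
After rbac.yaml is applied, the bindings can be spot-checked by impersonating each service account. The following sketch only prints the check commands so that they can be reviewed first. The helper names sa_user and print_rbac_checks are assumptions, and the three permissions were picked from the manifest above as examples.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the user name that Kubernetes assigns to a
# service account, in the form system:serviceaccount:<namespace>:<name>.
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Hypothetical helper: print one permission check per VPA service account.
# The commands are printed, not run, so review them before use.
print_rbac_checks() {
  local ns="kube-system"
  echo "kubectl auth can-i list limitranges --as=$(sa_user "$ns" vpa-recommender)"
  echo "kubectl auth can-i create pods --subresource=eviction --as=$(sa_user "$ns" vpa-updater)"
  echo "kubectl auth can-i create mutatingwebhookconfigurations --as=$(sa_user "$ns" vpa-admission-controller)"
}

print_rbac_checks
```

Each printed command is expected to answer yes when the corresponding ClusterRoleBinding is in place. The service accounts themselves can be listed with kubectl get serviceaccounts -n kube-system.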

Step 2: Create the CustomResourceDefinitions

A CustomResourceDefinition (CRD) extends the Kubernetes API with VPA resource types. For more information, see Extend the Kubernetes API with CustomResourceDefinitions.
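
As an illustration of the resource type that the CRDs add, the following sketch prints a minimal VerticalPodAutoscaler manifest. The helper name print_example_vpa and the names example-vpa and example-app are assumptions. The fields used (targetRef, updatePolicy.updateMode) are taken from the CRD schema in this topic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal VerticalPodAutoscaler manifest.
# $1 = VPA name, $2 = name of the target Deployment (both made up here).
print_example_vpa() {
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: $1
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $2
  updatePolicy:
    updateMode: "Off"   # only compute recommendations; do not evict Pods
EOF
}

print_example_vpa example-vpa example-app
```

After the CRDs and the VPA components are installed, output like this could be piped to kubectl apply -f - to create the object.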

Run the following command:

kubectl apply -f crd.yaml

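Select the CRD template that matches your cluster's Kubernetes version. The apiextensions.k8s.io/v1beta1 API used by the first template is no longer served in Kubernetes 1.22 and later, so those clusters require the apiextensions.k8s.io/v1 template.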
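
The version rule for choosing a template can be sketched as a small function. The helper name crd_template_for and the sample version strings are assumptions. The 1.22 threshold comes from the template headings in this topic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which crd.yaml template applies to a
# Kubernetes version string such as "v1.20.11" or "1.24".
crd_template_for() {
  local v="${1#v}"
  local major="${v%%.*}"
  local rest="${v#*.}"
  # Keep only the leading digits of the minor component.
  local minor="${rest%%[!0-9]*}"
  if (( major > 1 || (major == 1 && minor >= 22) )); then
    echo "apiextensions.k8s.io/v1"        # template for 1.22 and later
  else
    echo "apiextensions.k8s.io/v1beta1"   # template for earlier than 1.22
  fi
}

crd_template_for "v1.20.11"   # prints apiextensions.k8s.io/v1beta1
crd_template_for "1.24"       # prints apiextensions.k8s.io/v1
```

After the selected template is applied, kubectl get crd verticalpodautoscalers.autoscaling.k8s.io verticalpodautoscalercheckpoints.autoscaling.k8s.io can be used to confirm that both definitions exist.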

crd.yaml for Kubernetes versions earlier than 1.22

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: verticalpodautoscalers.autoscaling.k8s.io
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
spec:
  group: autoscaling.k8s.io
  scope: Namespaced
  names:
    plural: verticalpodautoscalers
    singular: verticalpodautoscaler
    kind: VerticalPodAutoscaler
    shortNames:
      - vpa
  version: v1beta1
  versions:
    - name: v1beta1
      served: false
      storage: false
    - name: v1beta2
      served: true
      storage: true
    - name: v1
      served: true
      storage: false
  validation:
    # openAPIV3Schema is the schema for validating custom objects.
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          required: []
          properties:
            targetRef:
              type: object
            updatePolicy:
              type: object
              properties:
                updateMode:
                  type: string
            resourcePolicy:
              type: object
              properties:
                containerPolicies:
                  type: array
                  items:
                    type: object
                    properties:
                      containerName:
                        type: string
                      controlledValues:
                        type: string
                        enum: ["RequestsAndLimits", "RequestsOnly"]
                      mode:
                        type: string
                        enum: ["Auto", "Off"]
                      minAllowed:
                        type: object
                      maxAllowed:
                        type: object
                      controlledResources:
                        type: array
                        items:
                          type: string
                          enum: ["cpu", "memory"]
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
spec:
  group: autoscaling.k8s.io
  scope: Namespaced
  names:
    plural: verticalpodautoscalercheckpoints
    singular: verticalpodautoscalercheckpoint
    kind: VerticalPodAutoscalerCheckpoint
    shortNames:
      - vpacheckpoint
  version: v1beta1
  versions:
    - name: v1beta1
      served: false
      storage: false
    - name: v1beta2
      served: true
      storage: true
    - name: v1
      served: true
      storage: false

crd.yaml for Kubernetes 1.22 and later

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797
    controller-gen.kubebuilder.io/version: v0.9.2
  creationTimestamp: null
  name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
spec:
  group: autoscaling.k8s.io
  names:
    kind: VerticalPodAutoscalerCheckpoint
    listKind: VerticalPodAutoscalerCheckpointList
    plural: verticalpodautoscalercheckpoints
    shortNames:
    - vpacheckpoint
    singular: verticalpodautoscalercheckpoint
  scope: Namespaced
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal
          state of VPA that is used for recovery after recommender's restart.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              containerName:
                description: Name of the checkpointed container.
                type: string
              vpaObjectName:
                description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint
                  object.
                type: string
            type: object
          status:
            description: Data of the checkpoint.
            properties:
              cpuHistogram:
                description: Checkpoint of histogram for consumption of CPU.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              firstSampleStart:
                description: Timestamp of the first sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastSampleStart:
                description: Timestamp of the last sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastUpdateTime:
                description: The time when the status was last refreshed.
                format: date-time
                nullable: true
                type: string
              memoryHistogram:
                description: Checkpoint of histogram for consumption of memory.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              totalSamplesCount:
                description: Total number of samples in the histograms.
                type: integer
              version:
                description: Version of the format of the stored data.
                type: string
            type: object
        type: object
    served: true
    storage: true
  - name: v1beta2
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal
          state of VPA that is used for recovery after recommender's restart.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              containerName:
                description: Name of the checkpointed container.
                type: string
              vpaObjectName:
                description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint
                  object.
                type: string
            type: object
          status:
            description: Data of the checkpoint.
            properties:
              cpuHistogram:
                description: Checkpoint of histogram for consumption of CPU.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              firstSampleStart:
                description: Timestamp of the first sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastSampleStart:
                description: Timestamp of the last sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastUpdateTime:
                description: The time when the status was last refreshed.
                format: date-time
                nullable: true
                type: string
              memoryHistogram:
                description: Checkpoint of histogram for consumption of memory.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              totalSamplesCount:
                description: Total number of samples in the histograms.
                type: integer
              version:
                description: Version of the format of the stored data.
                type: string
            type: object
        type: object
    served: true
    storage: false
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797
    controller-gen.kubebuilder.io/version: v0.9.2
  creationTimestamp: null
  name: verticalpodautoscalers.autoscaling.k8s.io
spec:
  group: autoscaling.k8s.io
  names:
    kind: VerticalPodAutoscaler
    listKind: VerticalPodAutoscalerList
    plural: verticalpodautoscalers
    shortNames:
    - vpa
    singular: verticalpodautoscaler
  scope: Namespaced
  versions:
  - additionalPrinterColumns:
    - jsonPath: .spec.updatePolicy.updateMode
      name: Mode
      type: string
    - jsonPath: .status.recommendation.containerRecommendations[0].target.cpu
      name: CPU
      type: string
    - jsonPath: .status.recommendation.containerRecommendations[0].target.memory
      name: Mem
      type: string
    - jsonPath: .status.conditions[?(@.type=='RecommendationProvided')].status
      name: Provided
      type: string
    - jsonPath: .metadata.creationTimestamp
      name: Age
      type: date
    name: v1
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscaler is the configuration for a vertical pod
          autoscaler, which automatically manages pod resources based on historical
          and real time resource utilization.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the behavior of the autoscaler. More info:
              https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              recommenders:
                description: Recommender responsible for generating recommendation
                  for this object. List should be empty (then the default recommender
                  will generate the recommendation) or contain exactly one recommender.
                items:
                  description: VerticalPodAutoscalerRecommenderSelector points to
                    a specific Vertical Pod Autoscaler recommender. In the future
                    it might pass parameters to the recommender.
                  properties:
                    name:
                      description: Name of the recommender responsible for generating
                        recommendation for this object.
                      type: string
                  required:
                  - name
                  type: object
                type: array
              resourcePolicy:
                description: Controls how the autoscaler computes recommended resources.
                  The resource policy may be used to set constraints on the recommendations
                  for individual containers. If not specified, the autoscaler computes
                  recommended resources for all containers in the pod, without additional
                  constraints.
                properties:
                  containerPolicies:
                    description: Per-container resource policies.
                    items:
                      description: ContainerResourcePolicy controls how autoscaler
                        computes the recommended resources for a specific container.
                      properties:
                        containerName:
                          description: Name of the container or DefaultContainerResourcePolicy,
                            in which case the policy is used by the containers that
                            don't have their own policy specified.
                          type: string
                        controlledResources:
                          description: Specifies the type of recommendations that
                            will be computed (and possibly applied) by VPA. If not
                            specified, the default of [ResourceCPU, ResourceMemory]
                            will be used.
                          items:
                            description: ResourceName is the name identifying various
                              resources in a ResourceList.
                            type: string
                          type: array
                        controlledValues:
                          description: Specifies which resource values should be controlled.
                            The default is "RequestsAndLimits".
                          enum:
                          - RequestsAndLimits
                          - RequestsOnly
                          type: string
                        maxAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the maximum amount of resources that
                            will be recommended for the container. The default is
                            no maximum.
                          type: object
                        minAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the minimal amount of resources that
                            will be recommended for the container. The default is
                            no minimum.
                          type: object
                        mode:
                          description: Whether autoscaler is enabled for the container.
                            The default is "Auto".
                          enum:
                          - Auto
                          - "Off"
                          type: string
                      type: object
                    type: array
                type: object
              targetRef:
                description: TargetRef points to the controller managing the set of
                  pods for the autoscaler to control - e.g. Deployment, StatefulSet.
                  VerticalPodAutoscaler can be targeted at controller implementing
                  scale subresource (the pod set is retrieved from the controller's
                  ScaleStatus) or some well known controllers (e.g. for DaemonSet
                  the pod set is read from the controller's spec). If VerticalPodAutoscaler
                  cannot use specified target it will report ConfigUnsupported condition.
                  Note that VerticalPodAutoscaler does not require full implementation
                  of scale subresource - it will not use it to modify the replica
                  count. The only thing retrieved is a label selector matching pods
                  grouped by the target resource.
                properties:
                  apiVersion:
                    description: API version of the referent
                    type: string
                  kind:
                    description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
                    type: string
                  name:
                    description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names'
                    type: string
                required:
                - kind
                - name
                type: object
                x-kubernetes-map-type: atomic
              updatePolicy:
                description: Describes the rules on how changes are applied to the
                  pods. If not specified, all fields in the `PodUpdatePolicy` are
                  set to their default values.
                properties:
                  minReplicas:
                    description: Minimal number of replicas which need to be alive
                      for Updater to attempt pod eviction (pending other checks like
                      PDB). Only positive values are allowed. Overrides global '--min-replicas'
                      flag.
                    format: int32
                    type: integer
                  updateMode:
                    description: Controls when autoscaler applies changes to the pod
                      resources. The default is 'Auto'.
                    enum:
                    - "Off"
                    - Initial
                    - Recreate
                    - Auto
                    type: string
                type: object
            required:
            - targetRef
            type: object
          status:
            description: Current information about the autoscaler.
            properties:
              conditions:
                description: Conditions is the set of conditions required for this
                  autoscaler to scale its target, and indicates whether or not those
                  conditions are met.
                items:
                  description: VerticalPodAutoscalerCondition describes the state
                    of a VerticalPodAutoscaler at a certain point.
                  properties:
                    lastTransitionTime:
                      description: lastTransitionTime is the last time the condition
                        transitioned from one status to another
                      format: date-time
                      type: string
                    message:
                      description: message is a human-readable explanation containing
                        details about the transition
                      type: string
                    reason:
                      description: reason is the reason for the condition's last transition.
                      type: string
                    status:
                      description: status is the status of the condition (True, False,
                        Unknown)
                      type: string
                    type:
                      description: type describes the current condition
                      type: string
                  required:
                  - status
                  - type
                  type: object
                type: array
              recommendation:
                description: The most recently computed amount of resources recommended
                  by the autoscaler for the controlled pods.
                properties:
                  containerRecommendations:
                    description: Resources recommended by the autoscaler for each
                      container.
                    items:
                      description: RecommendedContainerResources is the recommendation
                        of resources computed by autoscaler for a specific container.
                        Respects the container resource policy if present in the spec.
                        In particular the recommendation is not produced for containers
                        with `ContainerScalingMode` set to 'Off'.
                      properties:
                        containerName:
                          description: Name of the container.
                          type: string
                        lowerBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Minimum recommended amount of resources. Observes
                            ContainerResourcePolicy. This amount is not guaranteed
                            to be sufficient for the application to operate in a stable
                            way, however running with less resources is likely to
                            have significant impact on performance/availability.
                          type: object
                        target:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Recommended amount of resources. Observes ContainerResourcePolicy.
                          type: object
                        uncappedTarget:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: The most recent recommended resources target
                            computed by the autoscaler for the controlled pods, based
                            only on actual resource usage, not taking into account
                            the ContainerResourcePolicy. May differ from the Recommendation
                            if the actual resource usage causes the target to violate
                            the ContainerResourcePolicy (lower than MinAllowed or
                            higher than MaxAllowed). Used only as status indication,
                            will not affect actual resource assignment.
                          type: object
                        upperBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Maximum recommended amount of resources. Observes
                            ContainerResourcePolicy. Any resources allocated beyond
                            this value are likely wasted. This value may be larger
                            than the maximum amount the application is actually capable
                            of consuming.
                          type: object
                      required:
                      - target
                      type: object
                    type: array
                type: object
            type: object
        required:
        - spec
        type: object
    served: true
    storage: true
    subresources: {}
  - deprecated: true
    deprecationWarning: autoscaling.k8s.io/v1beta2 API is deprecated
    name: v1beta2
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscaler is the configuration for a vertical pod
          autoscaler, which automatically manages pod resources based on historical
          and real time resource utilization.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the behavior of the autoscaler. More info:
              https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              resourcePolicy:
                description: Controls how the autoscaler computes recommended resources.
                  The resource policy may be used to set constraints on the recommendations
                  for individual containers. If not specified, the autoscaler computes
                  recommended resources for all containers in the pod, without additional
                  constraints.
                properties:
                  containerPolicies:
                    description: Per-container resource policies.
                    items:
                      description: ContainerResourcePolicy controls how autoscaler
                        computes the recommended resources for a specific container.
                      properties:
                        containerName:
                          description: Name of the container or DefaultContainerResourcePolicy,
                            in which case the policy is used by the containers that
                            don't have their own policy specified.
                          type: string
                        maxAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the maximum amount of resources that
                            will be recommended for the container. The default is
                            no maximum.
                          type: object
                        minAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the minimal amount of resources that
                            will be recommended for the container. The default is
                            no minimum.
                          type: object
                        mode:
                          description: Whether autoscaler is enabled for the container.
                            The default is "Auto".
                          enum:
                          - Auto
                          - "Off"
                          type: string
                      type: object
                    type: array
                type: object
              targetRef:
                description: TargetRef points to the controller managing the set of
                  pods for the autoscaler to control - e.g. Deployment, StatefulSet.
                  VerticalPodAutoscaler can be targeted at controller implementing
                  scale subresource (the pod set is retrieved from the controller's
                  ScaleStatus) or some well known controllers (e.g. for DaemonSet
                  the pod set is read from the controller's spec). If VerticalPodAutoscaler
                  cannot use specified target it will report ConfigUnsupported condition.
                  Note that VerticalPodAutoscaler does not require full implementation
                  of scale subresource - it will not use it to modify the replica
                  count. The only thing retrieved is a label selector matching pods
                  grouped by the target resource.
                properties:
                  apiVersion:
                    description: API version of the referent
                    type: string
                  kind:
                    description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
                    type: string
                  name:
                    description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names'
                    type: string
                required:
                - kind
                - name
                type: object
                x-kubernetes-map-type: atomic
              updatePolicy:
                description: Describes the rules on how changes are applied to the
                  pods. If not specified, all fields in the `PodUpdatePolicy` are
                  set to their default values.
                properties:
                  updateMode:
                    description: Controls when autoscaler applies changes to the pod
                      resources. The default is 'Auto'.
                    enum:
                    - "Off"
                    - Initial
                    - Recreate
                    - Auto
                    type: string
                type: object
            required:
            - targetRef
            type: object
          status:
            description: Current information about the autoscaler.
            properties:
              conditions:
                description: Conditions is the set of conditions required for this
                  autoscaler to scale its target, and indicates whether or not those
                  conditions are met.
                items:
                  description: VerticalPodAutoscalerCondition describes the state
                    of a VerticalPodAutoscaler at a certain point.
                  properties:
                    lastTransitionTime:
                      description: lastTransitionTime is the last time the condition
                        transitioned from one status to another
                      format: date-time
                      type: string
                    message:
                      description: message is a human-readable explanation containing
                        details about the transition
                      type: string
                    reason:
                      description: reason is the reason for the condition's last transition.
                      type: string
                    status:
                      description: status is the status of the condition (True, False,
                        Unknown)
                      type: string
                    type:
                      description: type describes the current condition
                      type: string
                  required:
                  - status
                  - type
                  type: object
                type: array
              recommendation:
                description: The most recently computed amount of resources recommended
                  by the autoscaler for the controlled pods.
                properties:
                  containerRecommendations:
                    description: Resources recommended by the autoscaler for each
                      container.
                    items:
                      description: RecommendedContainerResources is the recommendation
                        of resources computed by autoscaler for a specific container.
                        Respects the container resource policy if present in the spec.
                        In particular the recommendation is not produced for containers
                        with `ContainerScalingMode` set to 'Off'.
                      properties:
                        containerName:
                          description: Name of the container.
                          type: string
                        lowerBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Minimum recommended amount of resources. Observes
                            ContainerResourcePolicy. This amount is not guaranteed
                            to be sufficient for the application to operate in a stable
                            way, however running with less resources is likely to
                            have significant impact on performance/availability.
                          type: object
                        target:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Recommended amount of resources. Observes ContainerResourcePolicy.
                          type: object
                        uncappedTarget:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: The most recent recommended resources target
                            computed by the autoscaler for the controlled pods, based
                            only on actual resource usage, not taking into account
                            the ContainerResourcePolicy. May differ from the Recommendation
                            if the actual resource usage causes the target to violate
                            the ContainerResourcePolicy (lower than MinAllowed or
                            higher than MaxAllowed). Used only as status indication,
                            will not affect actual resource assignment.
                          type: object
                        upperBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Maximum recommended amount of resources. Observes
                            ContainerResourcePolicy. Any resources allocated beyond
                            this value are likely wasted. This value may be larger
                            than the maximum amount the application is actually capable
                            of consuming.
                          type: object
                      required:
                      - target
                      type: object
                    type: array
                type: object
            type: object
        required:
        - spec
        type: object
    served: true
    storage: false

Step 3: Deploy VPA components

The vertical-pod-autoscaler consists of three components:

admission-controller -- Injects recommended resource values into new Pods through a mutating webhook.
recommender -- Monitors resource usage and generates resource recommendations.
updater -- Evicts Pods that need resource updates so they are recreated with new values.

Note

Before deploying the admission-controller, generate a TLS certificate for its webhook by using the gencerts.sh script.

Select the YAML templates that match your cluster's Kubernetes version.

Kubernetes versions earlier than 1.22

admission-controller

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      serviceAccountName: admin
      containers:
        - name: admission-controller
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.7.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: tls-certs
              mountPath: "/etc/tls-certs"
              readOnly: true
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          ports:
            - containerPort: 8000
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller

recommender

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: admin
      containers:
      - name: recommender
        image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.7.0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 200m
            memory: 1000Mi
          requests:
            cpu: 50m
            memory: 500Mi
        ports:
        - containerPort: 8080

updater

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      serviceAccountName: admin
      containers:
        - name: updater
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.7.0
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - containerPort: 8080

Kubernetes 1.22 and later

admission-controller

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      serviceAccountName: vpa-admission-controller
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: admission-controller
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.13.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: tls-certs
              mountPath: "/etc/tls-certs"
              readOnly: true
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          ports:
            - containerPort: 8000
            - name: prometheus
              containerPort: 8944
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller

recommender

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: vpa-recommender
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
      - name: recommender
        image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.13.0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 200m
            memory: 1000Mi
          requests:
            cpu: 50m
            memory: 500Mi
        ports:
        - name: prometheus
          containerPort: 8942

updater

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      serviceAccountName: vpa-updater
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: updater
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.13.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - name: prometheus
              containerPort: 8943

Verify the installation

Create a test Deployment and a VPA resource by applying the following YAML:

vpa-test.yaml

Note

Set updateMode to "Off" so the VPA only generates recommendations without modifying Pods. Leave the requests and limits fields empty in the Deployment.

   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: nginx-deployment-basic
     labels:
       app: nginx
   spec:
     replicas: 2
     selector:
       matchLabels:
         app: nginx
     template:
       metadata:
         labels:
           app: nginx
       spec:
         containers:
         - name: nginx
           image: nginx:1.7.9
           ports:
           - containerPort: 80
   ---
   apiVersion: autoscaling.k8s.io/v1
   kind: VerticalPodAutoscaler
   metadata:
     name: nginx-deployment-basic-vpa
   spec:
     targetRef:
       apiVersion: "apps/v1"
       kind:       Deployment
       name:       nginx-deployment-basic
     updatePolicy:
       updateMode: "Off"

Wait approximately 2 minutes, then query the VPA recommendations:

   kubectl describe vpa nginx-deployment-basic-vpa

Expected output (values may vary):

   Recommendation:
       Container Recommendations:
         Container Name:  nginx
         Lower Bound:
           Cpu:     25m
           Memory:  262144k
         Target:
           Cpu:     25m
           Memory:  262144k
         Uncapped Target:
           Cpu:     25m
           Memory:  262144k
         Upper Bound:
           Cpu:     11601m
           Memory:  12128573170

Use the Target values to set appropriate resource requests for the Deployment. VPA continuously monitors resource usage and updates its recommendations over time.
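As an illustration of applying a recommendation by hand, the Target values from the sample output could be copied into the container spec of the nginx-deployment-basic test Deployment. This is only a sketch: the numbers come from the sample output above and will differ in your cluster. Note that 262144k (262,144,000 bytes) is the same quantity as 250Mi.

```yaml
# Fragment of the pod template (spec.template.spec) of nginx-deployment-basic.
# The request values are taken from the sample Target recommendation;
# substitute the values reported for your own workload.
containers:
- name: nginx
  image: nginx:1.7.9
  ports:
  - containerPort: 80
  resources:
    requests:
      cpu: 25m          # Target CPU from the sample output
      memory: 262144k   # Target memory from the sample output (equal to 250Mi)
```

Because updateMode is "Off" in the test VPA, the VPA does not write these values itself; they take effect only after you update the Deployment.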

Limitations

Important

Vertical pod autoscaling is in testing. Exercise caution when using this feature.

Updating the resource configuration of a running Pod causes the Pod to restart and be recreated. The Pod may be rescheduled to a different node.
VPA does not evict Pods that are not managed by a controller. For unmanaged Pods, the "Auto" mode is equivalent to "Initial" mode.
VPA and the Horizontal Pod Autoscaler (HPA) cannot run at the same time when the HPA monitors CPU or memory metrics. If the HPA monitors only custom or external metrics other than CPU and memory, VPA and the HPA can coexist.
VPA uses an admission webhook as its admission controller. Make sure that the VPA webhook does not conflict with other admission webhooks in the cluster. The execution sequence of admission controllers is defined in the API server parameters.
VPA can handle most out-of-memory (OOM) events but may fail in specific scenarios.
VPA performance has not been tested in large-scale clusters.
VPA-modified Pod resource requests may exceed actual available resources, including node resources, idle resources, and resource quotas. In this case, a Pod may enter the Pending state and fail to be scheduled. Use the Cluster Autoscaler to mitigate this issue.
If multiple VPA objects monitor the same Pod, undefined behavior may occur.
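One way to reduce the risk of recommendations growing beyond what the cluster can schedule is to bound them with the resourcePolicy field described in the CRD schema. The following sketch reuses the nginx-deployment-basic test Deployment. The minAllowed and maxAllowed values are illustrative assumptions, not sizing advice, and should be chosen to fit your node sizes and resource quotas.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-deployment-basic-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment-basic
  updatePolicy:
    updateMode: "Auto"       # the updater may evict Pods to apply new values
  resourcePolicy:
    containerPolicies:
    - containerName: nginx
      minAllowed:            # assumed lower cap for recommendations
        cpu: 25m
        memory: 64Mi
      maxAllowed:            # assumed upper cap for recommendations
        cpu: "1"
        memory: 1Gi
```

With such a policy, the Target recommendation stays within the configured bounds, while Uncapped Target still reports the value computed from actual usage alone. Capping recommendations does not replace capacity planning: Pods can still remain Pending if the cluster lacks free resources.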