The Vertical Pod Autoscaler (VPA) automatically adjusts CPU and memory requests for Pods based on observed resource usage. Deploy the vertical-pod-autoscaler component in a Container Service for Kubernetes (ACK) cluster to analyze workload resource consumption, generate resource recommendations, allow proper scheduling of Pods onto nodes with sufficient resources, and maintain the ratio between requests and limits defined in your original container configuration.
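The ratio-preserving behavior can be illustrated with a small arithmetic sketch. The helper name scale_limit is hypothetical and the integer rounding is an assumption; the actual rounding applied by the VPA components may differ.

```shell
#!/bin/sh
# Hypothetical helper: scale a limit so that limit/request keeps its original ratio.
# All three arguments are plain integers in the same unit (for example, millicores).
scale_limit() {
  orig_request=$1
  orig_limit=$2
  new_request=$3
  if [ "$orig_request" -le 0 ]; then
    echo "original request must be positive" >&2
    return 1
  fi
  # Multiply before dividing to limit integer rounding loss.
  echo $(( new_request * orig_limit / orig_request ))
}

# Original container: requests 100m CPU, limits 200m CPU (ratio 1:2).
# With a recommended request of 250m, the proportional limit is 500m.
scale_limit 100 200 250
```

A container that originally sets no limit has no ratio to preserve, so this calculation applies only when both a request and a limit are defined.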
Prerequisites
An ACK cluster running Kubernetes later than 1.12. For more information, see Create an ACK managed cluster.
kubectl connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Any existing vertical-pod-autoscaler installation removed from the cluster to avoid version conflicts.
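One way to look for a leftover installation is to filter the resource names in kube-system for the VPA component names used in this topic. The following is a minimal sketch; the helper name filter_vpa_resources is hypothetical, and the name patterns assume that an earlier installation used the same component names.

```shell
#!/bin/sh
# Hypothetical helper: keep only resource names that look like VPA components.
# Reads one resource name per line on stdin and prints the matching lines.
filter_vpa_resources() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E 'vpa-(admission-controller|recommender|updater)|vertical-pod-autoscaler' || true
}

# Intended use against a live cluster (not run here):
#   kubectl get deployments,serviceaccounts -n kube-system -o name | filter_vpa_resources
```

Empty output suggests that no resources with these names remain in the namespace.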
Install the VPA
Step 1: Create RBAC resources
Run the following command to apply the role-based access control (RBAC) configuration:
kubectl apply -f rbac.yaml
rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-reader
rules:
  - apiGroups:
      - "metrics.k8s.io"
    resources:
      - pods
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-actor
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - get
      - list
      - watch
      - create
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
      - patch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-checkpoint-actor
rules:
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalercheckpoints
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - delete
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalercheckpoints
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - delete
  - apiGroups:
      - ""
    resources:
      - namespaces
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:evictioner
rules:
  - apiGroups:
      - "apps"
      - "extensions"
    resources:
      - replicasets
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - pods/eviction
    verbs:
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-reader
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-actor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-actor
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-checkpoint-actor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-checkpoint-actor
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-target-reader
rules:
  - apiGroups:
      - '*'
    resources:
      - '*/scale'
    verbs:
      - get
      - watch
  - apiGroups:
      - ""
    resources:
      - replicationcontrollers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - daemonsets
      - deployments
      - replicasets
      - statefulsets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - batch
    resources:
      - jobs
      - cronjobs
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-target-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-target-reader
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-admission-controller
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-evictioner-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:evictioner
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-admission-controller
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-recommender
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-updater
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-admission-controller
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - configmaps
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "admissionregistration.k8s.io"
    resources:
      - mutatingwebhookconfigurations
    verbs:
      - create
      - delete
      - get
      - list
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "coordination.k8s.io"
    resources:
      - leases
    verbs:
      - create
      - update
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-admission-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-admission-controller
subjects:
  - kind: ServiceAccount
    name: vpa-admission-controller
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-status-reader
rules:
  - apiGroups:
      - "coordination.k8s.io"
    resources:
      - leases
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-status-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-status-reader
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
Step 2: Create the CustomResourceDefinitions
A CustomResourceDefinition (CRD) extends the Kubernetes API with VPA resource types. For more information, see Extend the Kubernetes API with CustomResourceDefinitions.
Run the following command:
kubectl apply -f crd.yaml
Select the CRD template that matches your cluster's Kubernetes version.
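The version check can be sketched as a small shell function that maps a Kubernetes version string to the CRD API version used by the matching template. The function name select_crd_template is hypothetical, and the parsing assumes a version string of the form v1.21.5 or 1.22.

```shell
#!/bin/sh
# Hypothetical helper: map a Kubernetes version string to the apiVersion of the
# matching crd.yaml template. Accepts forms such as "v1.21.5", "1.22",
# or "v1.24.6-aliyun.1".
select_crd_template() {
  version=${1#v}            # drop a leading "v" if present
  major=${version%%.*}      # text before the first dot
  rest=${version#*.}        # text after the first dot
  minor=${rest%%[!0-9]*}    # leading digits of the minor version
  if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 22 ]; }; then
    echo "apiextensions.k8s.io/v1"        # template for Kubernetes 1.22 and later
  else
    echo "apiextensions.k8s.io/v1beta1"   # template for versions earlier than 1.22
  fi
}

# Intended use against a live cluster (not run here; assumes jq is installed):
#   select_crd_template "$(kubectl version -o json | jq -r .serverVersion.gitVersion)"
```

The printed value corresponds to the apiVersion field on the first line of each template.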
crd.yaml for Kubernetes versions earlier than 1.22
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: verticalpodautoscalers.autoscaling.k8s.io
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
spec:
  group: autoscaling.k8s.io
  scope: Namespaced
  names:
    plural: verticalpodautoscalers
    singular: verticalpodautoscaler
    kind: VerticalPodAutoscaler
    shortNames:
      - vpa
  version: v1beta1
  versions:
    - name: v1beta1
      served: false
      storage: false
    - name: v1beta2
      served: true
      storage: true
    - name: v1
      served: true
      storage: false
  validation:
    # openAPIV3Schema is the schema for validating custom objects.
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          required: []
          properties:
            targetRef:
              type: object
            updatePolicy:
              type: object
              properties:
                updateMode:
                  type: string
            resourcePolicy:
              type: object
              properties:
                containerPolicies:
                  type: array
                  items:
                    type: object
                    properties:
                      containerName:
                        type: string
                      controlledValues:
                        type: string
                        enum: ["RequestsAndLimits", "RequestsOnly"]
                      mode:
                        type: string
                        enum: ["Auto", "Off"]
                      minAllowed:
                        type: object
                      maxAllowed:
                        type: object
                      controlledResources:
                        type: array
                        items:
                          type: string
                          enum: ["cpu", "memory"]
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
spec:
  group: autoscaling.k8s.io
  scope: Namespaced
  names:
    plural: verticalpodautoscalercheckpoints
    singular: verticalpodautoscalercheckpoint
    kind: VerticalPodAutoscalerCheckpoint
    shortNames:
      - vpacheckpoint
  version: v1beta1
  versions:
    - name: v1beta1
      served: false
      storage: false
    - name: v1beta2
      served: true
      storage: true
    - name: v1
      served: true
      storage: false
crd.yaml for Kubernetes 1.22 and later
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797
    controller-gen.kubebuilder.io/version: v0.9.2
  creationTimestamp: null
  name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
spec:
  group: autoscaling.k8s.io
  names:
    kind: VerticalPodAutoscalerCheckpoint
    listKind: VerticalPodAutoscalerCheckpointList
    plural: verticalpodautoscalercheckpoints
    shortNames:
    - vpacheckpoint
    singular: verticalpodautoscalercheckpoint
  scope: Namespaced
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal
          state of VPA that is used for recovery after recommender's restart.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              containerName:
                description: Name of the checkpointed container.
                type: string
              vpaObjectName:
                description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint
                  object.
                type: string
            type: object
          status:
            description: Data of the checkpoint.
            properties:
              cpuHistogram:
                description: Checkpoint of histogram for consumption of CPU.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              firstSampleStart:
                description: Timestamp of the first sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastSampleStart:
                description: Timestamp of the last sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastUpdateTime:
                description: The time when the status was last refreshed.
                format: date-time
                nullable: true
                type: string
              memoryHistogram:
                description: Checkpoint of histogram for consumption of memory.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              totalSamplesCount:
                description: Total number of samples in the histograms.
                type: integer
              version:
                description: Version of the format of the stored data.
                type: string
            type: object
        type: object
    served: true
    storage: true
  - name: v1beta2
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal
          state of VPA that is used for recovery after recommender's restart.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              containerName:
                description: Name of the checkpointed container.
                type: string
              vpaObjectName:
                description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint
                  object.
                type: string
            type: object
          status:
            description: Data of the checkpoint.
            properties:
              cpuHistogram:
                description: Checkpoint of histogram for consumption of CPU.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              firstSampleStart:
                description: Timestamp of the first sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastSampleStart:
                description: Timestamp of the last sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastUpdateTime:
                description: The time when the status was last refreshed.
                format: date-time
                nullable: true
                type: string
              memoryHistogram:
                description: Checkpoint of histogram for consumption of memory.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              totalSamplesCount:
                description: Total number of samples in the histograms.
                type: integer
              version:
                description: Version of the format of the stored data.
                type: string
            type: object
        type: object
    served: true
    storage: false
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797
    controller-gen.kubebuilder.io/version: v0.9.2
  creationTimestamp: null
  name: verticalpodautoscalers.autoscaling.k8s.io
spec:
  group: autoscaling.k8s.io
  names:
    kind: VerticalPodAutoscaler
    listKind: VerticalPodAutoscalerList
    plural: verticalpodautoscalers
    shortNames:
    - vpa
    singular: verticalpodautoscaler
  scope: Namespaced
  versions:
  - additionalPrinterColumns:
    - jsonPath: .spec.updatePolicy.updateMode
      name: Mode
      type: string
    - jsonPath: .status.recommendation.containerRecommendations[0].target.cpu
      name: CPU
      type: string
    - jsonPath: .status.recommendation.containerRecommendations[0].target.memory
      name: Mem
      type: string
    - jsonPath: .status.conditions[?(@.type=='RecommendationProvided')].status
      name: Provided
      type: string
    - jsonPath: .metadata.creationTimestamp
      name: Age
      type: date
    name: v1
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscaler is the configuration for a vertical pod
          autoscaler, which automatically manages pod resources based on historical
          and real time resource utilization.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the behavior of the autoscaler. More info:
              https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              recommenders:
                description: Recommender responsible for generating recommendation
                  for this object. List should be empty (then the default recommender
                  will generate the recommendation) or contain exactly one recommender.
                items:
                  description: VerticalPodAutoscalerRecommenderSelector points to
                    a specific Vertical Pod Autoscaler recommender. In the future
                    it might pass parameters to the recommender.
                  properties:
                    name:
                      description: Name of the recommender responsible for generating
                        recommendation for this object.
                      type: string
                  required:
                  - name
                  type: object
                type: array
              resourcePolicy:
                description: Controls how the autoscaler computes recommended resources.
                  The resource policy may be used to set constraints on the recommendations
                  for individual containers. If not specified, the autoscaler computes
                  recommended resources for all containers in the pod, without additional
                  constraints.
                properties:
                  containerPolicies:
                    description: Per-container resource policies.
                    items:
                      description: ContainerResourcePolicy controls how autoscaler
                        computes the recommended resources for a specific container.
                      properties:
                        containerName:
                          description: Name of the container or DefaultContainerResourcePolicy,
                            in which case the policy is used by the containers that
                            don't have their own policy specified.
                          type: string
                        controlledResources:
                          description: Specifies the type of recommendations that
                            will be computed (and possibly applied) by VPA. If not
                            specified, the default of [ResourceCPU, ResourceMemory]
                            will be used.
                          items:
                            description: ResourceName is the name identifying various
                              resources in a ResourceList.
                            type: string
                          type: array
                        controlledValues:
                          description: Specifies which resource values should be controlled.
                            The default is "RequestsAndLimits".
                          enum:
                          - RequestsAndLimits
                          - RequestsOnly
                          type: string
                        maxAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the maximum amount of resources that
                            will be recommended for the container. The default is
                            no maximum.
                          type: object
                        minAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the minimal amount of resources that
                            will be recommended for the container. The default is
                            no minimum.
                          type: object
                        mode:
                          description: Whether autoscaler is enabled for the container.
                            The default is "Auto".
                          enum:
                          - Auto
                          - "Off"
                          type: string
                      type: object
                    type: array
                type: object
              targetRef:
                description: TargetRef points to the controller managing the set of
                  pods for the autoscaler to control - e.g. Deployment, StatefulSet.
                  VerticalPodAutoscaler can be targeted at controller implementing
                  scale subresource (the pod set is retrieved from the controller's
                  ScaleStatus) or some well known controllers (e.g. for DaemonSet
                  the pod set is read from the controller's spec). If VerticalPodAutoscaler
                  cannot use specified target it will report ConfigUnsupported condition.
                  Note that VerticalPodAutoscaler does not require full implementation
                  of scale subresource - it will not use it to modify the replica
                  count. The only thing retrieved is a label selector matching pods
                  grouped by the target resource.
                properties:
                  apiVersion:
                    description: API version of the referent
                    type: string
                  kind:
                    description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds"'
                    type: string
                  name:
                    description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names'
                    type: string
                required:
                - kind
                - name
                type: object
                x-kubernetes-map-type: atomic
              updatePolicy:
                description: Describes the rules on how changes are applied to the
                  pods. If not specified, all fields in the `PodUpdatePolicy` are
                  set to their default values.
                properties:
                  minReplicas:
                    description: Minimal number of replicas which need to be alive
                      for Updater to attempt pod eviction (pending other checks like
                      PDB). Only positive values are allowed. Overrides global '--min-replicas'
                      flag.
                    format: int32
                    type: integer
                  updateMode:
                    description: Controls when autoscaler applies changes to the pod
                      resources. The default is 'Auto'.
                    enum:
                    - "Off"
                    - Initial
                    - Recreate
                    - Auto
                    type: string
                type: object
            required:
            - targetRef
            type: object
          status:
            description: Current information about the autoscaler.
            properties:
              conditions:
                description: Conditions is the set of conditions required for this
                  autoscaler to scale its target, and indicates whether or not those
                  conditions are met.
                items:
                  description: VerticalPodAutoscalerCondition describes the state
                    of a VerticalPodAutoscaler at a certain point.
                  properties:
                    lastTransitionTime:
                      description: lastTransitionTime is the last time the condition
                        transitioned from one status to another
                      format: date-time
                      type: string
                    message:
                      description: message is a human-readable explanation containing
                        details about the transition
                      type: string
                    reason:
                      description: reason is the reason for the condition's last transition.
                      type: string
                    status:
                      description: status is the status of the condition (True, False,
                        Unknown)
                      type: string
                    type:
                      description: type describes the current condition
                      type: string
                  required:
                  - status
                  - type
                  type: object
                type: array
              recommendation:
                description: The most recently computed amount of resources recommended
                  by the autoscaler for the controlled pods.
                properties:
                  containerRecommendations:
                    description: Resources recommended by the autoscaler for each
                      container.
                    items:
                      description: RecommendedContainerResources is the recommendation
                        of resources computed by autoscaler for a specific container.
                        Respects the container resource policy if present in the spec.
                        In particular the recommendation is not produced for containers
                        with `ContainerScalingMode` set to 'Off'.
                      properties:
                        containerName:
                          description: Name of the container.
                          type: string
                        lowerBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Minimum recommended amount of resources. Observes
                            ContainerResourcePolicy. This amount is not guaranteed
                            to be sufficient for the application to operate in a stable
                            way, however running with less resources is likely to
                            have significant impact on performance/availability.
                          type: object
                        target:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Recommended amount of resources. Observes ContainerResourcePolicy.
                          type: object
                        uncappedTarget:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: The most recent recommended resources target
                            computed by the autoscaler for the controlled pods, based
                            only on actual resource usage, not taking into account
                            the ContainerResourcePolicy. May differ from the Recommendation
                            if the actual resource usage causes the target to violate
                            the ContainerResourcePolicy (lower than MinAllowed or
                            higher than MaxAllowed). Used only as status indication,
                            will not affect actual resource assignment.
                          type: object
                        upperBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Maximum recommended amount of resources. Observes
                            ContainerResourcePolicy. Any resources allocated beyond
                            this value are likely wasted. This value may be larger
                            than the maximum amount of application is actually capable
                            of consuming.
                          type: object
                      required:
                      - target
                      type: object
                    type: array
                type: object
            type: object
        required:
        - spec
        type: object
    served: true
    storage: true
    subresources: {}
- deprecated: true
deprecationWarning: autoscaling.k8s.io/v1beta2 API is deprecated
name: v1beta2
schema:
openAPIV3Schema:
description: VerticalPodAutoscaler is the configuration for a vertical pod
autoscaler, which automatically manages pod resources based on historical
and real time resource utilization.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: 'Specification of the behavior of the autoscaler. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
properties:
resourcePolicy:
description: Controls how the autoscaler computes recommended resources.
The resource policy may be used to set constraints on the recommendations
for individual containers. If not specified, the autoscaler computes
recommended resources for all containers in the pod, without additional
constraints.
properties:
containerPolicies:
description: Per-container resource policies.
items:
description: ContainerResourcePolicy controls how autoscaler
computes the recommended resources for a specific container.
properties:
containerName:
description: Name of the container or DefaultContainerResourcePolicy,
in which case the policy is used by the containers that
don't have their own policy specified.
type: string
maxAllowed:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: Specifies the maximum amount of resources that
will be recommended for the container. The default is
no maximum.
type: object
minAllowed:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: Specifies the minimal amount of resources that
will be recommended for the container. The default is
no minimum.
type: object
mode:
description: Whether autoscaler is enabled for the container.
The default is "Auto".
enum:
- Auto
- "Off"
type: string
type: object
type: array
type: object
targetRef:
description: TargetRef points to the controller managing the set of
pods for the autoscaler to control - e.g. Deployment, StatefulSet.
VerticalPodAutoscaler can be targeted at controller implementing
scale subresource (the pod set is retrieved from the controller's
ScaleStatus) or some well known controllers (e.g. for DaemonSet
the pod set is read from the controller's spec). If VerticalPodAutoscaler
cannot use specified target it will report ConfigUnsupported condition.
Note that VerticalPodAutoscaler does not require full implementation
of scale subresource - it will not use it to modify the replica
count. The only thing retrieved is a label selector matching pods
grouped by the target resource.
properties:
apiVersion:
description: API version of the referent
type: string
kind:
description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds"'
type: string
name:
description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names'
type: string
required:
- kind
- name
type: object
x-kubernetes-map-type: atomic
updatePolicy:
description: Describes the rules on how changes are applied to the
pods. If not specified, all fields in the `PodUpdatePolicy` are
set to their default values.
properties:
updateMode:
description: Controls when autoscaler applies changes to the pod
resources. The default is 'Auto'.
enum:
- "Off"
- Initial
- Recreate
- Auto
type: string
type: object
required:
- targetRef
type: object
status:
description: Current information about the autoscaler.
properties:
conditions:
description: Conditions is the set of conditions required for this
autoscaler to scale its target, and indicates whether or not those
conditions are met.
items:
description: VerticalPodAutoscalerCondition describes the state
of a VerticalPodAutoscaler at a certain point.
properties:
lastTransitionTime:
description: lastTransitionTime is the last time the condition
transitioned from one status to another
format: date-time
type: string
message:
description: message is a human-readable explanation containing
details about the transition
type: string
reason:
description: reason is the reason for the condition's last transition.
type: string
status:
description: status is the status of the condition (True, False,
Unknown)
type: string
type:
description: type describes the current condition
type: string
required:
- status
- type
type: object
type: array
recommendation:
description: The most recently computed amount of resources recommended
by the autoscaler for the controlled pods.
properties:
containerRecommendations:
description: Resources recommended by the autoscaler for each
container.
items:
description: RecommendedContainerResources is the recommendation
of resources computed by autoscaler for a specific container.
Respects the container resource policy if present in the spec.
In particular the recommendation is not produced for containers
with `ContainerScalingMode` set to 'Off'.
properties:
containerName:
description: Name of the container.
type: string
lowerBound:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: Minimum recommended amount of resources. Observes
ContainerResourcePolicy. This amount is not guaranteed
to be sufficient for the application to operate in a stable
way, however running with less resources is likely to
have significant impact on performance/availability.
type: object
target:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: Recommended amount of resources. Observes ContainerResourcePolicy.
type: object
uncappedTarget:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: The most recent recommended resources target
computed by the autoscaler for the controlled pods, based
only on actual resource usage, not taking into account
the ContainerResourcePolicy. May differ from the Recommendation
if the actual resource usage causes the target to violate
the ContainerResourcePolicy (lower than MinAllowed or
higher than MaxAllowed). Used only as status indication,
will not affect actual resource assignment.
type: object
upperBound:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: Maximum recommended amount of resources. Observes
ContainerResourcePolicy. Any resources allocated beyond
this value are likely wasted. This value may be larger
than the maximum amount the application is actually capable
of consuming.
type: object
required:
- target
type: object
type: array
type: object
type: object
required:
- spec
type: object
served: true
storage: false
Step 3: Deploy VPA components
The vertical-pod-autoscaler consists of three components:
admission-controller: Injects recommended resource values into new Pods through a mutating webhook.
recommender: Monitors resource usage and generates resource recommendations.
updater: Evicts Pods that need resource updates so they are recreated with new values.
Before deploying the admission-controller, generate a TLS certificate for its webhook by using the gencerts.sh script.
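The admission-controller Deployments in this step mount a Secret named vpa-tls-certs at /etc/tls-certs. As a rough sketch of what the certificate step is expected to leave in the cluster, the Secret might look like the following. The key names (caCert.pem, serverCert.pem, serverKey.pem) are an assumption based on the upstream gencerts.sh script, and the values are placeholders rather than real certificate data. Check the script that you actually run.

```yaml
# Sketch only: the shape of the Secret mounted by the admission-controller.
# Key names are assumed from the upstream gencerts.sh script; values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: vpa-tls-certs        # matches secretName in the Deployment
  namespace: kube-system     # same namespace as the vpa-webhook Service
type: Opaque
stringData:
  caCert.pem: |
    <PEM-encoded CA certificate>
  serverCert.pem: |
    <PEM-encoded server certificate, valid for vpa-webhook.kube-system.svc>
  serverKey.pem: |
    <PEM-encoded server private key>
```

Because the API server reaches the webhook through the vpa-webhook Service in kube-system, the server certificate needs to be valid for the DNS name vpa-webhook.kube-system.svc.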
Select the YAML templates that match your cluster's Kubernetes version.
Kubernetes versions earlier than 1.22
admission-controller
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      serviceAccountName: admin
      containers:
        - name: admission-controller
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.7.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: tls-certs
              mountPath: "/etc/tls-certs"
              readOnly: true
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          ports:
            - containerPort: 8000
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller
recommender
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: admin
      containers:
        - name: recommender
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.7.0
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - containerPort: 8080
updater
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      serviceAccountName: admin
      containers:
        - name: updater
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.7.0
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - containerPort: 8080
Kubernetes 1.22 and later
admission-controller
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      serviceAccountName: vpa-admission-controller
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: admission-controller
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.13.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: tls-certs
              mountPath: "/etc/tls-certs"
              readOnly: true
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          ports:
            - containerPort: 8000
            - name: prometheus
              containerPort: 8944
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller
recommender
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: vpa-recommender
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: recommender
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.13.0
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - name: prometheus
              containerPort: 8942
updater
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      serviceAccountName: vpa-updater
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: updater
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.13.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - name: prometheus
              containerPort: 8943
Verify the installation
Create a test Deployment and a VPA resource by applying the following YAML:
vpa-test.yaml
Note: Set updateMode to "Off" so the VPA only generates recommendations without modifying Pods. Leave the requests and limits fields empty in the Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-deployment-basic-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment-basic
  updatePolicy:
    updateMode: "Off"
Wait approximately 2 minutes, then query the VPA recommendations:
kubectl describe vpa nginx-deployment-basic-vpa
Expected output (values may vary):
Recommendation:
  Container Recommendations:
    Container Name:  nginx
    Lower Bound:
      Cpu:     25m
      Memory:  262144k
    Target:
      Cpu:     25m
      Memory:  262144k
    Uncapped Target:
      Cpu:     25m
      Memory:  262144k
    Upper Bound:
      Cpu:     11601m
      Memory:  12128573170
Use the Target values to set appropriate resource requests for the Deployment. VPA continuously monitors resource usage and updates its recommendations over time.
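As an illustration of that last step, the test Deployment could carry the sample Target values as its requests. The numbers below are copied from the sample output above (25m CPU, 262144k memory); substitute the values reported for your own workload. Limits are omitted, which is a choice made for this sketch rather than a recommendation.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 25m          # Target CPU from the sample output
              memory: 262144k   # Target memory from the sample output
```

Because recommendations change as usage data accumulates, it may be worth re-reading the Target values before applying them.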
Limitations
Vertical pod autoscaling is in testing. Exercise caution when using this feature.
Updating the resource configuration of a running Pod causes the Pod to restart and be recreated. The Pod may be rescheduled to a different node.
VPA does not evict Pods that are not managed by a controller. For unmanaged Pods, the "Auto" mode is equivalent to "Initial" mode.
VPA and the Horizontal Pod Autoscaler (HPA) cannot run at the same time when the HPA monitors CPU or memory metrics. If the HPA monitors only custom or external metrics other than CPU and memory, VPA and the HPA can coexist.
VPA uses an admission webhook as its admission controller. Make sure that the VPA webhook does not conflict with other admission webhooks in the cluster. The execution sequence of admission controllers is defined in the API server parameters.
VPA can handle most out-of-memory (OOM) events but may fail in specific scenarios.
VPA performance is not tested in large-scale clusters.
VPA-modified Pod resource requests may exceed actual available resources, including node resources, idle resources, and resource quotas. In this case, a Pod may enter the Pending state and fail to be scheduled. Use the Cluster Autoscaler to mitigate this issue.
If multiple VPA objects monitor the same Pod, undefined behavior may occur.
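Some of these limitations can be softened through configuration. The manifests below are a minimal sketch under stated assumptions, not a recommended setup. The first is a VPA whose resourcePolicy caps recommendations with maxAllowed so that requests stay within what a node can offer. The second is a PodDisruptionBudget that limits how many replicas can be evicted at once; upstream VPA documents that the updater evicts Pods through the eviction API, which honors PodDisruptionBudgets. The third is an HPA driven only by a custom metric, so it does not compete with VPA over CPU or memory. The cap values, the metric name http_requests_per_second, and the replica counts are hypothetical. The HPA assumes that a custom metrics adapter is installed, and policy/v1 and autoscaling/v2 assume Kubernetes 1.21 and 1.23 or later, respectively.

```yaml
# 1. Cap VPA recommendations for the nginx container (values are illustrative).
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-deployment-basic-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment-basic
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: nginx
        minAllowed:
          cpu: 25m
          memory: 64Mi
        maxAllowed:
          cpu: "1"
          memory: 1Gi
---
# 2. Keep at least one replica running while Pods are evicted.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-deployment-basic-pdb   # hypothetical name
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: nginx
---
# 3. Scale horizontally on a custom metric only (no CPU or memory metrics).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment-basic-hpa   # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment-basic
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second   # hypothetical custom metric
        target:
          type: AverageValue
          averageValue: "100"
```

A maxAllowed cap does not replace the Cluster Autoscaler: it only bounds what VPA may request, so Pods can still stay Pending if the cluster has no node with that much free capacity.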