Configure auto scaling

You can call the CreateAutoscalingConfig operation to configure auto scaling.

Debugging

We recommend that you call this operation in OpenAPI Explorer, which automatically calculates the signature value and dynamically generates sample code of the operation for different SDKs.

Request syntax

POST /cluster/{ClusterId}/autoscale/config HTTP/1.1
Content-Type:application/json

{
  "cool_down_duration" : "String",
  "unneeded_duration" : "String",
  "utilization_threshold" : "String",
  "gpu_utilization_threshold" : "String",
  "scan_interval" : "String",
  "scale_down_enabled" : Boolean,
  "expander" : "String",
  "skip_nodes_with_system_pods" : Boolean,
  "skip_nodes_with_local_storage" : Boolean,
  "daemonset_eviction_for_nodes" : Boolean,
  "max_graceful_termination_sec" : Integer,
  "min_replica_count" : Integer,
  "recycle_node_deletion_enabled" : Boolean,
  "scale_up_from_zero" : Boolean
}

Request parameters

Table 1. Request path parameters
Parameter	Type	Required	Example	Description
ClusterId	String	Yes	cdde1f21ae22e483ebcb068a6eb7f****	The ID of the cluster that you want to manage.

Table 2. Request body parameters
Parameter	Type	Required	Example	Description
cool_down_duration	String	No	10 m	The waiting time before the auto scaling feature performs a scale-in activity. A node is removed only if its resource usage remains below the scale-in threshold for the entire waiting time. Unit: minutes.
unneeded_duration	String	No	10 m	The cooldown period. Newly added nodes can be removed in scale-in activities only after the cooldown period ends. Unit: minutes.
utilization_threshold	String	No	0.5	The scale-in threshold. This threshold specifies the ratio of the resources that are requested by pods to the total resources on the node.
gpu_utilization_threshold	String	No	0.5	The scale-in threshold of GPU utilization. This threshold specifies the ratio of the GPU resources that are requested by pods to the total GPU resources on the node.
scan_interval	String	No	30s	The interval at which the cluster is scanned and evaluated for scaling. Unit: seconds.
scale_down_enabled	Boolean	No	true	Specifies whether to allow node scale-in activities. Valid values: `true`: allows node scale-in activities. `false`: does not allow node scale-in activities.
expander	String	No	least-waste	The node pool scale-out policy. Valid values: `least-waste`: the default policy. If multiple node pools meet the requirement, this policy selects the node pool that will have the least idle resources after the scale-out activity is completed. `random`: the random policy. If multiple node pools meet the requirement, this policy selects a random node pool for the scale-out activity. `priority`: the priority-based policy. If multiple node pools meet the requirement, this policy selects the node pool with the highest priority for the scale-out activity. The priorities of node pools are configured in the `cluster-autoscaler-priority-expander` ConfigMap in the kube-system namespace. When a scale-out activity is triggered, the policy obtains the node pool priorities from the ConfigMap based on the node pool IDs and then selects the node pool with the highest priority for the scale-out activity.
skip_nodes_with_system_pods	Boolean	No	true	Specifies whether to allow the cluster autoscaler to scale in nodes that host pods in the kube-system namespace, excluding DaemonSet pods and mirror pods. Valid values: `true`: does not allow the cluster autoscaler to scale in these nodes. `false`: allows the cluster autoscaler to scale in these nodes.
skip_nodes_with_local_storage	Boolean	No	false	Specifies whether to allow the cluster autoscaler to scale in nodes that host pods mounted with local storage (such as EmptyDir volumes or HostPath volumes). Valid values: `true`: does not allow the cluster autoscaler to scale in these nodes. `false`: allows the cluster autoscaler to scale in these nodes.
daemonset_eviction_for_nodes	Boolean	No	false	Specifies whether to evict DaemonSet pods during scale-in activities. Valid values: `true`: evicts DaemonSet pods. `false`: does not evict DaemonSet pods.
max_graceful_termination_sec	Integer	No	14400	The maximum amount of time that the cluster autoscaler waits for pods on the nodes to terminate during scale-in activities. Unit: seconds.
min_replica_count	Integer	No	0	The minimum number of pods that must be guaranteed during scale-in activities.
recycle_node_deletion_enabled	Boolean	No	false	Specifies whether to delete the corresponding Kubernetes node objects after nodes are removed in swift mode.
scale_up_from_zero	Boolean	No	true	Specifies whether the cluster autoscaler performs scale-out activities when the number of ready nodes in the cluster is zero.
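
The types and valid values in Table 2 can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API or any SDK: the names `PARAM_TYPES`, `TYPE_NAMES`, `VALID_EXPANDERS`, and `validate_config` are hypothetical, and the check covers only the JSON types and the `expander` values listed in the table, not the format of the String values.

```python
# Expected JSON type of each body parameter, transcribed from Table 2.
PARAM_TYPES = {
    "cool_down_duration": str,
    "unneeded_duration": str,
    "utilization_threshold": str,
    "gpu_utilization_threshold": str,
    "scan_interval": str,
    "scale_down_enabled": bool,
    "expander": str,
    "skip_nodes_with_system_pods": bool,
    "skip_nodes_with_local_storage": bool,
    "daemonset_eviction_for_nodes": bool,
    "max_graceful_termination_sec": int,
    "min_replica_count": int,
    "recycle_node_deletion_enabled": bool,
    "scale_up_from_zero": bool,
}
TYPE_NAMES = {str: "String", bool: "Boolean", int: "Integer"}
VALID_EXPANDERS = {"least-waste", "random", "priority"}


def validate_config(body):
    """Return a list of problems found in a request body (hypothetical helper)."""
    problems = []
    for name, value in body.items():
        expected = PARAM_TYPES.get(name)
        if expected is None:
            problems.append(f"{name}: not listed in Table 2")
            continue
        # bool is a subclass of int in Python, so reject it for Integer fields.
        is_bool_for_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_for_int:
            problems.append(f"{name}: expected {TYPE_NAMES[expected]}")
    expander = body.get("expander")
    if isinstance(expander, str) and expander not in VALID_EXPANDERS:
        problems.append(f"expander: must be one of {sorted(VALID_EXPANDERS)}")
    return problems
```

All body parameters are optional, so an empty body passes this check. A value such as `"14400s"` for `max_graceful_termination_sec` is reported because the table declares that parameter as an Integer.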

Response syntax

HTTP/1.1 200 OK

Response parameters

This operation returns no response parameters.

Examples

POST /cluster/cdde1f21ae22e483ebcb068a6eb7f****/autoscale/config HTTP/1.1
Host:cs.aliyuncs.com
Content-Type:application/json

{
  "cool_down_duration" : "10",
  "unneeded_duration" : "10",
  "utilization_threshold" : "0.5",
  "gpu_utilization_threshold" : "0.5",
  "scan_interval" : "30",
  "scale_down_enabled" : true,
  "expander" : "least-waste",
  "skip_nodes_with_system_pods" : true,
  "skip_nodes_with_local_storage" : false,
  "daemonset_eviction_for_nodes" : false,
  "max_graceful_termination_sec" : 14400,
  "min_replica_count" : 0,
  "recycle_node_deletion_enabled" : false,
  "scale_up_from_zero" : true
}
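
The sample request above can be assembled programmatically as follows. This is a sketch under stated assumptions: `build_request` is a hypothetical helper, it only builds the method, path, headers, and JSON body shown in the sample, and it neither signs nor sends the request. A real call also needs the signature that OpenAPI Explorer or an SDK calculates.

```python
import json


def build_request(cluster_id, body, host="cs.aliyuncs.com"):
    """Assemble the unsigned parts of the request (hypothetical helper).

    Signing and sending are deliberately left out of this sketch.
    """
    path = f"/cluster/{cluster_id}/autoscale/config"
    headers = {"Host": host, "Content-Type": "application/json"}
    # json.dumps renders Python True/False as JSON true/false.
    payload = json.dumps(body)
    return "POST", path, headers, payload


# Hypothetical cluster ID used for illustration only.
method, path, headers, payload = build_request(
    "my-cluster-id",
    {"scale_down_enabled": True, "expander": "least-waste", "min_replica_count": 0},
)
```

Only the parameters that you want to set need to appear in the body, because every body parameter is optional.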

Sample success responses

JSON format

HTTP/1.1 200 OK

Error codes

For a list of error codes, see Service error codes.