When a traffic spike hits and your cluster needs new nodes, the default scaling path introduces a significant delay: Cluster Autoscaler (CA) must detect the unschedulable pod and then provision a new node before the workload can start. For latency-sensitive services, this delay is unacceptable.
ack-autoscaling-placeholder solves this by keeping pre-warmed capacity in the cluster at all times. Low-priority placeholder pods reserve node resources. When a real workload arrives, it preempts the placeholder and starts immediately on the already-provisioned node. The now-Pending placeholder then triggers CA to provision a new node in the background, automatically replenishing the buffer.
Prerequisites
Before you begin, ensure that you have:
- Node autoscaling enabled for the ACK cluster with an elastic node pool configured. See Enable Node Autoscaling.
- A node label set on the elastic node pool using the Node Labels configuration item, so workloads are scheduled to specific node pools and results are easy to verify. See Create and Manage Node Pools. The examples in this guide use demo=yes as the label.
How it works
The mechanism relies on Kubernetes priority-based preemption and works in three steps:
- The placeholder workload runs with a low PriorityClass (value -1), reserving node resources without doing real work.
- When an actual workload is deployed with a high PriorityClass (value 1000000), the scheduler preempts the placeholder and places the real workload on the freed resources immediately.
- The now-Pending placeholder triggers CA to provision a new node. Once the node is ready, the placeholder is rescheduled and the buffer is restored.
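Note: CA does not scale out for pods whose priority falls below its expendable-pod cutoff. The value -1 used in this guide is deliberately above that threshold.
Step 1: Deploy ack-autoscaling-placeholder and create a placeholder workload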
- Log on to the Container Service Management Console. In the left navigation pane, click Marketplace > Marketplace.
- On the App Catalog tab, search for ack-autoscaling-placeholder and click ack-autoscaling-placeholder.
- On the ack-autoscaling-placeholder page, click Deploy.
- On the creation panel, go to the Parameter tab and replace the content of Parameters with the following YAML, then click OK.

  Set resources.requests to match the allocatable resources on the target node, not the total node capacity. Nodes reserve capacity for kubelet, the operating system, and kube-proxy. If your node has 4 vCPU and 16 GiB total, the allocatable capacity is typically lower. Check with kubectl describe node <node-name> under the Allocatable field.

    nameOverride: ""
    fullnameOverride: ""
    priorityClassDefault:
      enabled: true
      name: default-priority-class   # Low-priority class for placeholder pods.
      value: -1                      # Must be above the CA expendable-pod cutoff and below real workload priority.
    deployments:
      - name: ack-place-holder
        replicaCount: 1
        containers:
          - name: placeholder
            image: registry-vpc.cn-shenzhen.aliyuncs.com/acs/pause:3.1
            pullPolicy: IfNotPresent
            resources:
              requests:
                cpu: 4        # Size these requests to match allocatable node resources,
                memory: 8Gi   # not raw node capacity (deduct kubelet, OS, and kube-proxy overhead).
        imagePullSecrets: {}
        annotations: {}
        nodeSelector:         # Must match the labels on the elastic node pool.
          demo: "yes"
        tolerations: []
        affinity: {}
        labels: {}

- Go to Applications > Helm and verify that the application status is Deployed.
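When sizing the placeholder's resources.requests, it can help to read the allocatable CPU and memory of every node in the elastic node pool in one query. The following is a minimal sketch, not part of the original procedure: the function name show_allocatable is hypothetical, and the default selector demo=yes is the label assumed throughout this guide.

```shell
# Hypothetical helper: print allocatable CPU and memory for nodes matching a label.
# The label selector defaults to demo=yes (the label assumed in this guide).
show_allocatable() {
  kubectl get nodes -l "${1:-demo=yes}" \
    -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}
```

Run show_allocatable with no argument to use demo=yes, or pass a different label selector as the first argument. The placeholder's requests should not exceed the values reported here.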
Step 2: Create a PriorityClass for the actual workload
- Create a file named priorityClass.yaml with the following content.

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000        # Must be higher than the placeholder PriorityClass value (-1).
    globalDefault: false
    description: "High-priority class for production workloads."

- Apply the PriorityClass.

    kubectl apply -f priorityClass.yaml

  Expected output:

    priorityclass.scheduling.k8s.io/high-priority created
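Preemption depends on the relative order of the two priority values, so it may be worth confirming that both PriorityClass objects exist and that the placeholder class ranks lower. The sketch below assumes the class names used in this guide (default-priority-class and high-priority); the function name list_priority_classes is hypothetical.

```shell
# Hypothetical helper: list PriorityClasses ordered by their value,
# so the placeholder class (-1) can be compared with high-priority (1000000).
list_priority_classes() {
  kubectl get priorityclass --sort-by=.value \
    -o custom-columns='NAME:.metadata.name,VALUE:.value,GLOBAL_DEFAULT:.globalDefault'
}
```

The listing also includes the built-in system classes. What matters is that default-priority-class shows a lower value than high-priority.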
Step 3: Deploy the actual workload
- Create a file named workload.yaml with the following content.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: placeholder-test
      labels:
        app: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          nodeSelector:                     # Must match the labels on the elastic node pool.
            demo: "yes"
          priorityClassName: high-priority  # References the PriorityClass created in Step 2.
          containers:
            - name: nginx
              image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
              ports:
                - containerPort: 80
              resources:
                requests:
                  cpu: 3        # Must be less than or equal to the placeholder's cpu request (4)
                  memory: 5Gi   # so the real workload fits within the reserved space.

- Apply the Deployment.

    kubectl apply -f workload.yaml

  Expected output:

    deployment.apps/placeholder-test created
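Because the point of the placeholder is a fast start, one way to observe it is to wait on the rollout right after applying the Deployment. This is a sketch under assumptions: the function name wait_for_workload and the 120s default timeout are illustrative choices, not values from this guide.

```shell
# Hypothetical helper: block until the Deployment's rollout completes or the timeout expires.
# $1 = Deployment name (defaults to placeholder-test), $2 = timeout (defaults to 120s, an assumed value).
wait_for_workload() {
  kubectl rollout status "deployment/${1:-placeholder-test}" --timeout="${2:-120s}"
}
```

If the buffer is in place, the rollout should complete without waiting for a new node to be provisioned.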
Verify the result
- After the placeholder workload ack-place-holder is created, its pod status is Running.
- When the actual workload is deployed, it preempts the placeholder and starts immediately on the same node:
  - The actual workload placeholder-test is Running on the node previously occupied by the placeholder.
  - The placeholder pod is evicted and enters the Pending state because the remaining resources are insufficient.
- CA detects the Pending placeholder and provisions a new node. Once the node is ready, the placeholder is scheduled there and returns to Running, restoring the buffer for the next scale event.
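The states described above can also be checked from the command line. The following sketch assumes the workload names used in this guide (ack-place-holder and placeholder-test) and the demo=yes node label; the function name show_buffer_state is hypothetical.

```shell
# Hypothetical helper: show phase, priority, and node for the placeholder and
# workload pods, then list the nodes in the labeled elastic node pool.
show_buffer_state() {
  # -A avoids assuming which namespace the placeholder was installed into.
  kubectl get pods -A \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,STATUS:.status.phase,PRIORITY:.spec.priority,NODE:.spec.nodeName' \
    | grep -E 'NAMESPACE|ack-place-holder|placeholder-test'
  # Node label selector defaults to demo=yes (the label assumed in this guide).
  kubectl get nodes -l "${1:-demo=yes}"
}
```

Running show_buffer_state before and after deploying the workload shows the placeholder moving from Running to Pending and back, and the node list growing by one once CA has finished provisioning.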

What's next
For over-provisioning across multiple availability zones simultaneously, see Achieve Fast Elastic Scale-out in Multiple Zones Simultaneously.