Container Service for Kubernetes: Achieve Container Scaling in Seconds Using ack-autoscaling-placeholder

Last Updated: Mar 26, 2026

When a traffic spike hits and your cluster needs new nodes, the default scaling path introduces a significant delay: Cluster Autoscaler (CA) must detect the unschedulable pod and then provision a new node before the workload can start. For latency-sensitive services, this delay is unacceptable.

ack-autoscaling-placeholder solves this by keeping pre-warmed capacity in the cluster at all times. Low-priority placeholder pods reserve node resources. When a real workload arrives, it preempts the placeholder and starts immediately on the already-provisioned node. The now-Pending placeholder then triggers CA to provision a new node in the background, automatically replenishing the buffer.

Prerequisites

Before you begin, ensure that you have:

  • Node autoscaling enabled for the ACK cluster with an elastic node pool configured. See Enable Node Autoscaling.

  • A node label set on the elastic node pool using the Node Labels configuration item, so workloads are scheduled to specific node pools and results are easy to verify. See Create and Manage Node Pools. The examples in this guide use demo=yes as the label.

How it works

The mechanism relies on Kubernetes pod priority and preemption, in three stages:

  1. The placeholder workload runs with a low PriorityClass (value -1), reserving node resources without doing real work.

  2. When an actual workload is deployed with a high PriorityClass (value 1000000) and no node has room for it, the scheduler preempts the placeholder and places the real workload on the freed resources immediately.

  3. The now-Pending placeholder triggers CA to provision a new node. Once the node is ready, the placeholder is rescheduled and the buffer is restored.

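The placeholder priority must not fall below the CA expendable-pod cutoff. CA treats pods below that cutoff as expendable: they do not trigger scale-out when Pending, so the buffer would never be replenished. In upstream Cluster Autoscaler the cutoff is set by the expendable-pods-priority-cutoff option and defaults to -10; the value configured in your cluster may differ. The value -1 used in this guide is deliberately above that threshold.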
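
The cutoff rule can be sketched as a small shell check. This is an illustration only, not part of ack-autoscaling-placeholder: the function name triggers_scale_out is hypothetical, and the default cutoff of -10 is an assumption taken from upstream Cluster Autoscaler.

```shell
# Hypothetical helper: reports whether a Pending pod with the given priority
# would be considered for scale-out. Pods with a priority below the cutoff
# are treated as expendable and ignored.
# Assumption: the cutoff defaults to -10 (upstream Cluster Autoscaler default).
triggers_scale_out() {
  priority="$1"
  cutoff="${2:--10}"
  if [ "$priority" -lt "$cutoff" ]; then
    echo "no"
  else
    echo "yes"
  fi
}
```

For example, triggers_scale_out -1 reports yes for the placeholder priority used in this guide, while triggers_scale_out -11 reports no.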

Step 1: Deploy ack-autoscaling-placeholder and create a placeholder workload

  1. Log on to the Container Service Management Console. In the left navigation pane, click Marketplace > Marketplace.

  2. On the App Catalog tab, search for ack-autoscaling-placeholder and click ack-autoscaling-placeholder.

  3. On the ack-autoscaling-placeholder page, click Deploy.

  4. On the creation panel, go to the Parameter tab and replace the content of Parameters with the following YAML, then click OK.

    Set resources.requests to match the allocatable resources on the target node, not the total node capacity. Nodes reserve capacity for kubelet, the operating system, and kube-proxy, so a node with 4 vCPU and 16 GiB in total exposes less than that as allocatable. To check, run kubectl describe node <node-name> and read the Allocatable field. The cpu and memory values in the sample below are examples; adjust them so that the placeholder fits on your node type.
    nameOverride: ""
    fullnameOverride: ""
    
    priorityClassDefault:
      enabled: true
      name: default-priority-class   # Low-priority class for placeholder pods.
      value: -1                      # Must be above the CA expendable-pod cutoff and below real workload priority.
    
    deployments:
       - name: ack-place-holder
         replicaCount: 1
         containers:
           - name: placeholder
             image: registry-vpc.cn-shenzhen.aliyuncs.com/acs/pause:3.1
             pullPolicy: IfNotPresent
             resources:
               requests:
                 cpu: 4             # Size these requests to match allocatable node resources,
                 memory: 8Gi        # not raw node capacity (deduct kubelet, OS, and kube-proxy overhead).
         imagePullSecrets: {}
         annotations: {}
         nodeSelector:              # Must match the labels on the elastic node pool.
           demo: "yes"
         tolerations: []
         affinity: {}
         labels: {}
  5. Go to Applications > Helm and verify that the application status is Deployed.
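
To size resources.requests in the parameters above, it can help to list the allocatable resources of the nodes in the elastic node pool. The following is a minimal sketch: the function name show_allocatable is hypothetical, and it assumes the demo=yes node label used in this guide.

```shell
# Hypothetical helper: prints allocatable CPU and memory for every node that
# carries the given label (demo=yes in this guide).
show_allocatable() {
  label="${1:-demo=yes}"
  kubectl get nodes -l "$label" \
    -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}
```

Running show_allocatable demo=yes prints one row per matching node. The placeholder requests should not exceed the values shown, otherwise the placeholder pod cannot be scheduled.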

Step 2: Create a PriorityClass for the actual workload

  1. Create a file named priorityClass.yaml with the following content.

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000       # Must be higher than the placeholder PriorityClass value (-1).
    globalDefault: false
    description: "High-priority class for production workloads."
  2. Apply the PriorityClass.

    kubectl apply -f priorityClass.yaml

    Expected output:

    priorityclass.scheduling.k8s.io/high-priority created
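
As an optional sanity check, the two PriorityClass values can be compared directly. This is a sketch under the assumption that the class names are the ones used in this guide (high-priority and default-priority-class); the function name check_priority_order is hypothetical.

```shell
# Hypothetical helper: checks that the workload PriorityClass outranks the
# placeholder PriorityClass. Names default to the ones used in this guide.
check_priority_order() {
  high_name="${1:-high-priority}"
  low_name="${2:-default-priority-class}"
  # Read the numeric value of each PriorityClass from the cluster.
  high="$(kubectl get priorityclass "$high_name" -o jsonpath='{.value}')" || return 1
  low="$(kubectl get priorityclass "$low_name" -o jsonpath='{.value}')" || return 1
  if [ "$high" -gt "$low" ]; then
    echo "ok: $high_name ($high) > $low_name ($low)"
  else
    echo "error: $high_name ($high) must be greater than $low_name ($low)"
    return 1
  fi
}
```

Calling check_priority_order with no arguments compares the two classes from this guide and returns a non-zero status if the order is wrong or either class is missing.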

Step 3: Deploy the actual workload

  1. Create a file named workload.yaml with the following content.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: placeholder-test
      labels:
        app: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          nodeSelector:              # Must match the labels on the elastic node pool.
            demo: "yes"
          priorityClassName: high-priority   # References the PriorityClass created in Step 2.
          containers:
          - name: nginx
            image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
            ports:
            - containerPort: 80
            resources:
              requests:
                cpu: 3             # Must be less than or equal to the placeholder's cpu request (4)
                memory: 5Gi        # so the real workload fits within the reserved space.
  2. Apply the Deployment.

    kubectl apply -f workload.yaml

    Expected output:

    deployment.apps/placeholder-test created
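
To see how quickly the workload becomes available on the reserved capacity, the wait can be timed. The helper below is a sketch, not part of the product: the function name time_until_available is hypothetical, and the measured time also includes image pull and container start, so it varies by cluster.

```shell
# Hypothetical helper: waits for a Deployment to become Available and prints
# the elapsed wall-clock seconds. Run it right after kubectl apply.
time_until_available() {
  name="${1:-placeholder-test}"
  timeout="${2:-120s}"
  start="$(date +%s)"
  # kubectl wait blocks until the condition is met or the timeout expires.
  kubectl wait --for=condition=Available "deployment/$name" --timeout="$timeout" >/dev/null || return 1
  end="$(date +%s)"
  echo "$name available after $((end - start))s"
}
```

For example, time_until_available placeholder-test 60s waits up to 60 seconds and returns a non-zero status if the Deployment does not become Available in that time.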

Verify the result

  1. After the placeholder workload ack-place-holder is created, its pod status is Running.

  2. When the actual workload is deployed, it preempts the placeholder and starts immediately on the same node.

    • The actual workload placeholder-test is Running on the node previously occupied by the placeholder.

    • The placeholder pod is evicted, and its replacement pod enters the Pending state because the remaining resources on the node are insufficient.

  3. CA detects the Pending placeholder and provisions a new node. Once the node is ready, the placeholder is scheduled there and returns to Running, restoring the buffer for the next scale event.
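
The states described in the steps above can also be observed from the command line. The helpers below are a sketch with hypothetical function names. They assume that pod names contain the Deployment names used in this guide (ack-place-holder and placeholder-test) and that the scheduler records an event with reason Preempted on the evicted pod; adjust the pattern if your Helm release names the pods differently.

```shell
# Hypothetical helper: lists the placeholder and workload pods with their
# status and node, across all namespaces.
show_buffer_pods() {
  kubectl get pods --all-namespaces -o wide \
    | grep -E 'NAME|ack-place-holder|placeholder-test'
}

# Hypothetical helper: lists preemption events recorded by the scheduler.
show_preemptions() {
  kubectl get events --all-namespaces --field-selector reason=Preempted
}
```

Run show_buffer_pods before and after deploying the workload to follow the placeholder from Running to Pending and back to Running on the new node.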

What's next

For over-provisioning across multiple availability zones simultaneously, see Achieve Fast Elastic Scale-out in Multiple Zones Simultaneously.