ack-autoscaling-placeholder provides a resource buffer for auto scaling in a Container Service for Kubernetes (ACK) cluster. It is suitable for workloads whose pods must start quickly, regardless of whether the existing nodes have enough free resources. This topic describes how to use ack-autoscaling-placeholder to scale pods within seconds.

Prerequisites

Auto Scaling is enabled for your ACK cluster. For more information, see Auto scaling of nodes.

Procedure

  1. Log on to the ACK console.
  2. In the left-side navigation pane of the ACK console, choose Marketplace > App Catalog.
  3. On the App Catalog tab, find and click ack-autoscaling-placeholder.
  4. On the ack-autoscaling-placeholder page, click Deploy.
  5. In the Deploy wizard, select a cluster and a namespace, and then click Next. Select a chart version, configure the parameters, and then click OK.
    After ack-autoscaling-placeholder is deployed, go to the cluster details page. In the left-side navigation pane, choose Applications > Helm and verify that the application is in the Deployed state.
  6. In the left-side navigation pane of the details page, choose Applications > Helm.
  7. On the Helm page, find ack-autoscaling-placeholder and click Update in the Actions column. In the Update Release panel, modify the YAML template based on your requirements, and then click OK.
    nameOverride: ""
    fullnameOverride: ""
    ##
    priorityClassDefault:
      enabled: true
      name: default-priority-class
      value: -1
    
    ##
    deployments:
       - name: ack-place-holder
         replicaCount: 1
         containers:
           - name: placeholder
             image: registry-vpc.cn-shenzhen.aliyuncs.com/acs/pause:3.1
             pullPolicy: IfNotPresent
             resources:
               requests:
                 cpu: 4                  # Reserve 4 vCPUs and 8 GiB of memory.
                 memory: 8Gi
         imagePullSecrets: {}
         annotations: {}
         nodeSelector:                   # Specify rules that are used to select nodes.
           demo: "yes"  
         tolerations: []
         affinity: {}
         labels: {}
  8. Create a PriorityClass for a workload.

    In this example, a PriorityClass that grants a high priority is created. Save the following content as priorityClass.yaml, and then run the kubectl apply -f priorityClass.yaml command.

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000              # Specify the priority. 
    globalDefault: false
    description: "This priority class should be used for XYZ service pods only."
  9. Deploy a workload.
    Save the following content as workload.yaml, and then run the kubectl apply -f workload.yaml command.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: placeholder-test
      labels:
        app: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          nodeSelector:                        # Specify rules that are used to select nodes. 
            demo: "yes"
          priorityClassName: high-priority     # Specify the name of the PriorityClass that you created in Step 8. 
          containers:
          - name: nginx
            image: nginx:1.7.9 
            ports:
            - containerPort: 80
            resources:       
              requests:      
                cpu: 3                         # Specify the resource request of the workload. 
                memory: 5Gi
    The pod of the workload is assigned a PriorityClass that grants a higher priority than the placeholder pod, as shown in the following figure. When node resources are insufficient, the placeholder pod (created by the ack-place-holder deployment) is evicted and changes to the Pending state, and the workload pod starts within seconds on the resources that are released. Because Auto Scaling is enabled for the cluster, the Pending placeholder pod then triggers a scale-out activity in the cluster.

    [Figure: pendingrun]
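
The resource requests in the procedure determine how many workload pods can start immediately. The following Python sketch works through that arithmetic. It is only an illustration: the function names are hypothetical, the workload memory request of 5 is read as 5 GiB, and the resources that one placeholder pod releases are assumed not to be pooled with the resources released by another.

```python
import math


def workload_pods_per_placeholder(placeholder, workload):
    """Number of workload pods that fit into the resources one placeholder reserves.

    Both arguments map a resource name to a requested amount, for example
    {"cpu": 4, "memory_gib": 8}. The scarcest resource sets the limit.
    """
    return min(math.floor(placeholder[r] / workload[r]) for r in workload)


def instant_capacity(placeholder, workload, placeholder_replicas):
    """Workload pods that can start without waiting for a new node.

    Assumption: each evicted placeholder pod frees resources independently.
    """
    return placeholder_replicas * workload_pods_per_placeholder(placeholder, workload)


# Requests from the procedure: the placeholder reserves 4 vCPUs and 8 GiB,
# the workload asks for 3 vCPUs and (assumed) 5 GiB.
placeholder = {"cpu": 4, "memory_gib": 8}
workload = {"cpu": 3, "memory_gib": 5}

print(instant_capacity(placeholder, workload, placeholder_replicas=1))  # 1
```

With these values, one placeholder replica buffers exactly one workload pod. To absorb a larger burst, you would increase replicaCount or the placeholder requests in the YAML template.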

How it works

A placeholder pod with an extremely low priority (a negative value) is created to reserve computing resources for pods with higher priorities. When computing resources are insufficient, the placeholder pod is evicted and the resources that it occupied are released to the workload. This way, workload pods can be launched in seconds. The evicted placeholder pod changes to the Pending state, and cluster-autoscaler scales out nodes in the cluster in response.
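
The mechanism can be illustrated with a toy model in Python. This is a simplified sketch, not the Kubernetes scheduler or cluster-autoscaler: the Pod and Node classes, the schedule function, and the node size of 4 vCPUs and 16 GiB are all hypothetical, and a single node is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    priority: int
    cpu: float      # requested vCPUs
    mem_gib: float  # requested memory in GiB


@dataclass
class Node:
    cpu: float
    mem_gib: float
    pods: list = field(default_factory=list)

    def fits(self, pod):
        """True if the pod's requests fit into the node's unreserved resources."""
        free_cpu = self.cpu - sum(p.cpu for p in self.pods)
        free_mem = self.mem_gib - sum(p.mem_gib for p in self.pods)
        return pod.cpu <= free_cpu and pod.mem_gib <= free_mem


def schedule(node, pod):
    """Place pod on node, evicting lower-priority pods if that is necessary.

    Returns the evicted pods (they would become Pending), or None if the pod
    cannot be placed even after evicting every lower-priority pod.
    """
    if node.fits(pod):
        node.pods.append(pod)
        return []
    original = list(node.pods)
    victims = sorted(
        (p for p in node.pods if p.priority < pod.priority),
        key=lambda p: p.priority,  # evict the lowest priority first
    )
    evicted = []
    for victim in victims:
        node.pods.remove(victim)
        evicted.append(victim)
        if node.fits(pod):
            node.pods.append(pod)
            return evicted
    node.pods = original  # roll back: eviction would not have helped
    return None


node = Node(cpu=4, mem_gib=16)  # hypothetical node size
schedule(node, Pod("ack-place-holder", priority=-1, cpu=4, mem_gib=8))
pending = schedule(node, Pod("placeholder-test", priority=1000000, cpu=3, mem_gib=5))

print([p.name for p in pending])    # ['ack-place-holder']
print([p.name for p in node.pods])  # ['placeholder-test']
```

In a real cluster, the pods in the pending list are what cluster-autoscaler reacts to: because they cannot be scheduled, a scale-out activity is triggered.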