
Container Service for Kubernetes: Use ack-autoscaling-placeholder to scale pods within seconds

Last Updated: Apr 22, 2024

To launch pods within seconds without waiting for new nodes to be provisioned, you can use the ack-autoscaling-placeholder component. ack-autoscaling-placeholder provides a resource buffer for auto scaling in a Container Service for Kubernetes (ACK) cluster. It creates placeholder pods with an extremely low priority (a negative value) that reserve a specified amount of computing resources. When a pod with a higher priority cannot be scheduled because computing resources are insufficient, the placeholder pods are evicted and the released resources are used by the workload, so the workload pods can be launched within seconds. The evicted placeholder pods change to the Pending state, which causes cluster-autoscaler to scale out nodes in the cluster. This topic describes how to use ack-autoscaling-placeholder to scale pods within seconds.
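The buffering behavior described above can be pictured with a toy model. The following Python sketch is not the Kubernetes scheduler and only illustrates the idea. It assumes a hypothetical single node with 4 allocatable vCPUs, and the names Pod, Node, and schedule are invented for this illustration. The priorities (-1 and 1000000) and CPU requests (4 and 3) mirror the example values used in the procedure of this topic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pod:
    name: str
    priority: int
    cpu: float  # requested vCPUs


@dataclass
class Node:
    cpu_capacity: float  # assumption: allocatable vCPUs on one node
    pods: List[Pod] = field(default_factory=list)

    def free_cpu(self) -> float:
        return self.cpu_capacity - sum(p.cpu for p in self.pods)


def schedule(node: Node, pod: Pod) -> List[Pod]:
    """Place pod on node, evicting lower-priority pods only if needed.

    Returns the evicted pods. In a real cluster these would become
    Pending and could trigger a node scale-out.
    """
    evicted: List[Pod] = []
    # Only pods with a strictly lower priority may be evicted, lowest first.
    victims = sorted(
        (p for p in node.pods if p.priority < pod.priority),
        key=lambda p: p.priority,
    )
    for victim in victims:
        if node.free_cpu() >= pod.cpu:
            break
        node.pods.remove(victim)
        evicted.append(victim)
    if node.free_cpu() < pod.cpu:
        node.pods.extend(evicted)  # roll back: eviction would not help
        raise RuntimeError(f"{pod.name} does not fit on the node")
    node.pods.append(pod)
    return evicted


placeholder = Pod("placeholder", priority=-1, cpu=4)
workload = Pod("placeholder-test", priority=1_000_000, cpu=3)
node = Node(cpu_capacity=4, pods=[placeholder])

evicted = schedule(node, workload)
print("evicted:", [p.name for p in evicted])
print("running:", [p.name for p in node.pods])
```

In this model the workload starts immediately because the placeholder gives up its reserved vCPUs. The evicted placeholder is what would wait for a new node, not the workload.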

Prerequisites

Auto Scaling is enabled for your ACK cluster. For more information about how to enable auto scaling, see Enable node auto scaling.

Procedure

  1. Log on to the ACK console. In the left-side navigation pane, choose Marketplace > Marketplace.

  2. On the App Catalog tab, find and click ack-autoscaling-placeholder.

  3. On the ack-autoscaling-placeholder page, click Deploy.

  4. On the Basic Information wizard page, select a cluster from the Cluster drop-down list and a namespace from the Namespace drop-down list, and then click Next. Select a chart version from the Chart Version drop-down list, configure the parameters, and then click OK.

    After ack-autoscaling-placeholder is deployed, go to the cluster details page. In the left-side navigation pane, choose Applications > Helm. The application is displayed in the Deployed state.

  5. In the left-side navigation pane of the details page, choose Applications > Helm.

  6. On the Helm page, find ack-autoscaling-placeholder and click Update in the Actions column. In the Update Release panel, modify the YAML template based on your requirements, and then click OK.

    nameOverride: ""
    fullnameOverride: ""
    ##
    priorityClassDefault:
      enabled: true
      name: default-priority-class
      value: -1
    
    ##
    deployments:
       - name: ack-place-holder
         replicaCount: 1
         containers:
           - name: placeholder
             image: registry-vpc.cn-shenzhen.aliyuncs.com/acs/pause:3.1
             pullPolicy: IfNotPresent
             resources:
               requests:
                 cpu: 4                  # Occupy 4 vCPUs and 8 GiB of memory. 
                 memory: 8Gi             # A value without a unit suffix is read as bytes. 
         imagePullSecrets: {}
         annotations: {}
         nodeSelector:                   # Specify rules that are used to select nodes. 
           demo: "yes"  
         tolerations: []
         affinity: {}
         labels: {}
  7. Create a PriorityClass for the workload.

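    In this example, a PriorityClass that grants a high priority is created. Create a file named priorityClass.yaml with the following content, and then run the kubectl apply -f priorityClass.yaml command to apply it.
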
    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000              # Specify the priority. 
    globalDefault: false
    description: "This priority class should be used for XYZ service pods only."
  8. Deploy a workload.

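    Create a file named workload.yaml with the following content, and then run the kubectl apply -f workload.yaml command to apply it.
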
    apiVersion: apps/v1 
    kind: Deployment
    metadata:
      name: placeholder-test
      labels:
        app: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          nodeSelector:                        # Specify rules that are used to select nodes. 
            demo: "yes"
          priorityClassName: high-priority     # Specify the name of the PriorityClass that you created in Step 7. 
          containers:
          - name: nginx
            image: nginx:1.7.9 
            ports:
            - containerPort: 80
            resources:       
              requests:      
                cpu: 3                         # Specify the resource request of the workload. 
                memory: 5Gi

    The pod of the workload uses a PriorityClass that grants a higher priority than that of the placeholder pod, as shown in the following figure. When node resources are insufficient, the placeholder pod of the ack-place-holder deployment configured in Step 6 is evicted and changes to the Pending state, and the pod of the workload is launched within seconds on the released resources. Because Auto Scaling is enabled for the cluster, the Pending placeholder pod then triggers a scale-out activity in the cluster.

    [Figure: pendingrun]
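The resource requests in the preceding manifests are Kubernetes quantities, and a memory value without a unit suffix is read as a number of bytes. The following Python sketch is an illustrative helper, not part of ACK or kubectl. The function names memory_to_bytes and fits are hypothetical, and only a subset of the suffixes that Kubernetes accepts is handled. It converts memory quantities to bytes and checks whether a workload request fits into the memory that a placeholder pod releases.

```python
# Subset of the suffixes that Kubernetes accepts for memory quantities.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,  # binary
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,     # decimal
}


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "8Gi" or "512Mi" to bytes.

    A bare integer such as "8" is interpreted as 8 bytes.
    Fractional values and exponent notation are not handled here.
    """
    q = quantity.strip()
    # Check two-character suffixes before one-character suffixes.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * _SUFFIXES[suffix]
    return int(q)


def fits(request: str, released: str) -> bool:
    """Return True if the requested memory fits into the released memory."""
    return memory_to_bytes(request) <= memory_to_bytes(released)


print(memory_to_bytes("8Gi"))   # 8 GiB expressed in bytes
print(memory_to_bytes("8"))     # a bare number stays a byte count
print(fits("5Gi", "8Gi"))       # workload request vs. placeholder request
```

A quick check of this kind before you apply the manifests helps confirm that the placeholder reserves at least as much memory as the workload requests.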

References

Multi-zone load balancing is a deployment solution commonly used in high availability scenarios for data services. If an application that is deployed across zones does not have sufficient resources to handle heavy workloads, you may want ACK to create a specific number of nodes in each zone of the application. For more information, see Configure auto scaling for cross-zone deployment.