Alibaba Cloud Container Service for Kubernetes (ACK) provides an auto scaling component that automatically scales nodes in and out. Regular instances, GPU-accelerated instances, and preemptible instances can be automatically added to or removed from an ACK cluster based on your requirements. The feature supports instances deployed across multiple zones, diverse instance types, and different scaling modes.

How it works

The auto scaling component scales nodes based on how resources are allocated in an ACK cluster, not on actual resource usage. The CPU and memory resources of a node are allocated based on the requests of pods. When the resource requests of pods cannot be satisfied by the existing nodes, the pods become pending. The auto scaling component then calculates the number of nodes to add based on the predefined resource specifications and constraints of the scaling group. If the scaling requirements are met, the nodes in the scaling group are added to the ACK cluster. If the ratio of requested resources to the resource capacity of a node in the scaling group drops below the scale-in threshold, the auto scaling component automatically removes the node from the ACK cluster. Appropriate resource request settings are therefore a prerequisite for auto scaling.
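
Because scaling decisions are driven by pod requests rather than actual usage, workloads should declare explicit requests. The following is a minimal sketch of a Deployment that does so; the names and values are placeholders, not part of any ACK default:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app            # hypothetical workload name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
      - name: demo
        image: nginx:1.25
        resources:
          requests:          # the autoscaler sums these requests,
            cpu: "500m"      # not actual usage, when it decides
            memory: 512Mi    # whether to add or remove nodes
```

If these pods cannot all be placed on existing nodes, they become pending and trigger a scale-out; if they are deleted, the request ratio on their nodes drops and may trigger a scale-in.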

Notes

  • For each account, the default CPU quota of pay-as-you-go instances in each region is 50 vCPUs. You can create up to 48 custom route entries in each route table in a Virtual Private Cloud (VPC) network. To increase the quota, submit a ticket.
  • The stock of a specific Elastic Compute Service (ECS) instance type greatly fluctuates. We recommend that you specify multiple instance types for a scaling group. This improves the success rate of node scaling.
  • In swift mode, when a node is shut down and reclaimed, it stops running and remains in the NotReady state. When a scale-out event is triggered, the state of the node is changed to Ready.
  • When a node is shut down and reclaimed in swift mode, you are charged only for the storage of its disks. An exception is nodes that use local disks, such as the ecs.d1ne.2xlarge instance type, for which computing costs are also charged. If the stock of node resources is sufficient, reclaimed nodes can be launched again immediately.
  • If you have bound elastic IP addresses (EIPs) to the pods, we recommend that you do not delete the auto scaling group or the ECS nodes scaled out by the scaling group from the ECS console. Otherwise, the EIPs cannot be automatically released.

Perform auto scaling

  1. Log on to the ACK console.
  2. In the left-side navigation pane, click Clusters.
  3. Find the cluster that you want to scale and select Auto Scaling from the More drop-down list in the Actions column.

Authorization

Activate Auto Scaling (ESS)
  1. In the dialog box that appears, click Auto Scaling (ESS) to log on to the ESS console.
  2. Click Activate Auto Scaling to go to the Enable Service page.
  3. Select I agree with Auto Scaling Service Agreement of Service and click Enable Now.
  4. On the Activated page, click Console to return to the ESS console.
  5. Click Authorize to go to the Cloud Resource Access Authorization page, and then authorize ESS to access other cloud resources.
  6. Click Confirm Authorization Policy.

Expected results

If the authorization is successful, you are redirected to the ESS console. Go to the page in Step 1 and continue to Grant permissions to RAM roles.

Grant permissions to RAM roles

  1. Click the second hyperlink in the dialog box to go to the RAM Roles page.
    Note You must log on to the console with an Alibaba Cloud account.
  2. Click the Permissions tab. Click the policy you want to modify to go to the details page.
  3. Click Modify Policy Document. The Modify Policy Document page appears on the right side of the page.
  4. Add the following content to the Action field of the Policy Document:
    
    "ess:Describe*", 
    "ess:CreateScalingRule", 
    "ess:ModifyScalingGroup", 
    "ess:RemoveInstances", 
    "ess:ExecuteScalingRule", 
    "ess:ModifyScalingRule", 
    "ess:DeleteScalingRule", 
    "ecs:DescribeInstanceTypes",
    "ess:DetachInstances",
    "vpc:DescribeVSwitches"
    Note Add a comma (,) to the end of the last existing line in the Action field before you append the preceding content.
    1. If you need to bind EIPs to the scaling group, add the following policies:
      "ecs:AllocateEipAddress",
      "ecs:AssociateEipAddress",
      "ecs:DescribeEipAddresses",
      "ecs:DescribeInstanceTypes",
      "ecs:DescribeInvocationResults",
      "ecs:DescribeInvocations",
      "ecs:ReleaseEipAddress",
      "ecs:RunCommand",
      "ecs:UnassociateEipAddress",
      "ess:CompleteLifecycleAction",
      "ess:CreateScalingRule",
      "ess:DeleteScalingRule",
      "ess:Describe*",
      "ess:DetachInstances",
      "ess:ExecuteScalingRule",
      "ess:ModifyScalingGroup",
      "ess:ModifyScalingRule",
      "ess:RemoveInstances",
      "vpc:AllocateEipAddress",
      "vpc:AssociateEipAddress",
      "vpc:DescribeEipAddresses",
      "vpc:DescribeVSwitches",
      "vpc:ReleaseEipAddress",
      "vpc:UnassociateEipAddress",
      "vpc:TagResources"
    2. On the RAM Roles page, click Trust Policy Management. On the Edit Trust Policy tab, add oos.aliyuncs.com to the Service field of the trust policy.
  5. Click OK.
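
For reference, after the actions in step 4 are added, the policy document might look like the following sketch. The Version, Effect, and Resource values shown here follow the common RAM policy layout and are illustrative; keep whatever your existing policy document already defines:

```json
{
  "Version": "1",
  "Statement": [
    {
      "Action": [
        "ess:Describe*",
        "ess:CreateScalingRule",
        "ess:ModifyScalingGroup",
        "ess:RemoveInstances",
        "ess:ExecuteScalingRule",
        "ess:ModifyScalingRule",
        "ess:DeleteScalingRule",
        "ecs:DescribeInstanceTypes",
        "ess:DetachInstances",
        "vpc:DescribeVSwitches"
      ],
      "Resource": ["*"],
      "Effect": "Allow"
    }
  ]
}
```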

Configure auto scaling

  1. On the Auto Scaling page, set the following parameters and click Submit.
    Clusters: The name of the ACK cluster.
    Scale-in Threshold: In a scaling group managed by cluster-autoscaler, the system automatically calculates the ratio of the requested resources to the resource capacity of each node. If the ratio is lower than the threshold, the node is removed from the ACK cluster.
    Note Scale-out events are automatically triggered based on the scheduling of pods. Therefore, you only need to configure scale-in rules.
    Defer Scale-in For: The amount of time that the cluster waits before it scales in. Default value: 10 minutes.
    Cooldown: The cooldown period after a scale-in event is triggered. No scale-in event can be triggered again during this period. Default value: 10 minutes.
  2. Select an instance type based on your workload requirements, such as a regular instance, a GPU-accelerated instance, or a preemptible instance. Then, click Create.
  3. On the Auto Scaling Group Configuration page, set the following parameters to create a scaling group.
    Region: The region where the scaling group is deployed. This is the same as the region of the cluster and cannot be changed.
    VPC: The VPC where the cluster is deployed is automatically selected.
    VSwitch: Select vSwitches based on the zones where they are deployed and the CIDR blocks of the pods.
  4. Configure worker nodes.
    NodeType: The type of nodes in the scaling group. The node type must be the same as that of the cluster.
    Instance Type: The instance types available to the scaling group.
    Selected Types: The instance types that you have selected. You can select up to 10 instance types.
    System Disk: The system disk of the scaling group.
    Mount Data Disk: Specifies whether to mount data disks to the nodes in the scaling group. By default, no data disk is mounted.
    Instances: The number of instances in the scaling group.
    Note
    • Existing instances in the cluster are not counted.
    • By default, the minimum number of instances is 0. If you specify one or more instances, the system adds them to the scaling group. When a scale-out event is triggered, instances in the scaling group are added to the cluster to which the scaling group is bound.
    Key Pair: The key pair used to log on to the nodes after they are added to the ACK cluster. You can create key pairs in the ECS console.
    Note Only key pair logon is supported.
    Scaling Mode: Select Standard or Swift mode.
    RDS Whitelist: The Relational Database Service (RDS) instances that the nodes in the scaling group can access after a scale-out event is triggered.
    Node Label: Labels that are automatically attached to the nodes added to the ACK cluster after a scale-out event is triggered.
    Taints: After you add taints to a node, ACK does not schedule pods to the node.
  5. Click OK to create the scaling group.
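
The scale-in decision driven by the Scale-in Threshold parameter can be sketched as follows. This is a simplified model rather than the actual cluster-autoscaler implementation: it assumes a node becomes a scale-in candidate when the larger of its CPU and memory request ratios falls below the threshold.

```python
def is_scale_in_candidate(requested_cpu, allocatable_cpu,
                          requested_mem, allocatable_mem,
                          threshold=0.5):
    """Return True if the node's requested-to-capacity ratio is
    below the scale-in threshold for both CPU and memory."""
    cpu_ratio = requested_cpu / allocatable_cpu
    mem_ratio = requested_mem / allocatable_mem
    # The node qualifies only if every resource dimension is
    # underutilized, i.e. the highest ratio is below the threshold.
    return max(cpu_ratio, mem_ratio) < threshold

# A node with 1 of 4 vCPUs and 2 of 8 GiB requested is a candidate:
print(is_scale_in_candidate(1, 4, 2, 8))   # True at the default 0.5
```

Even when a node qualifies, removal is deferred by the Defer Scale-in For window and suppressed during the Cooldown period described above.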

Expected results

  1. On the Auto Scaling page, you can find the newly created scaling group in the regular instance section.
  2. On the Clusters page, click the name of the target cluster or click Manage in the Actions column.
  3. In the left-side navigation pane, click Workload.
  4. On the Deployments tab, select the kube-system namespace. If the cluster-autoscaler component is displayed, the scaling group is created.

FAQ

  • Why does the auto scaling component fail to add nodes after a scale-out event is triggered?
    Check for the following issues:
    • Whether the instance types configured for the scaling group can satisfy the resource requests of the pods, and whether the requested resources are within the specifications of those instance types.
    • Whether you have authorized the Resource Access Management (RAM) roles to manage the cluster. Whether the authorization process is performed on each cluster involved in the scale-out event.
    • Whether the cluster can access the Internet. Nodes in a scaling group require Internet access. This is because the auto scaling component needs to call Alibaba Cloud APIs over the Internet.
  • Why does the auto scaling component fail to remove nodes after a scale-in event is triggered?
    Check for the following issues:
    • Whether the ratio of the requested resources to the resource capacity of the node is higher than the scale-in threshold.
    • Whether the pods in the kube-system namespace are running on the nodes.
    • Whether the pods are configured with a scheduling policy that prevents them from being scheduled to other nodes.
    • Whether the pods on the nodes are configured with a PodDisruptionBudget and the number of pods has reached the specified minimum requirement.

    For more frequently asked questions about the auto scaling component, visit the open source community.

  • How does the system select from multiple scaling groups for a scale-out event?

    When pods cannot be scheduled to existing nodes, the auto scaling component simulates the scheduling of the pods based on the configurations of each scaling group, including the labels, taints, and instance specifications. If a scaling group passes the simulation, it is selected for the scale-out event. If more than one scaling group passes, the system chooses the one that has the least idle resources remaining after the simulation.
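
The selection among multiple feasible scaling groups can be sketched like this. The score is a simplified stand-in for "least idle resources remaining after the simulation"; the real component simulates full pod scheduling rather than placing a single pod, and the group names and fields below are illustrative:

```python
def pick_scaling_group(groups, pod_cpu, pod_mem):
    """Pick the scaling group whose node type leaves the least idle
    capacity after the pending pod is placed on a new node.

    groups: list of dicts such as {"name": "g1", "cpu": 8, "mem": 16},
    where cpu/mem are the per-node capacities of the group's
    configured instance type.
    """
    # Keep only groups whose simulated node could host the pod at all.
    feasible = [g for g in groups
                if g["cpu"] >= pod_cpu and g["mem"] >= pod_mem]
    if not feasible:
        return None
    # Least-waste: minimize leftover CPU + memory after placement.
    return min(feasible,
               key=lambda g: (g["cpu"] - pod_cpu) + (g["mem"] - pod_mem))

groups = [{"name": "large", "cpu": 8, "mem": 16},
          {"name": "small", "cpu": 4, "mem": 8}]
print(pick_scaling_group(groups, 2, 4)["name"])  # small
```

In this example both node types can host the pending pod, but the smaller type leaves less idle capacity, so its scaling group is chosen.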