
Driving Business Agility and Efficient Cloud Resource Management through Elastic Scheduling

This article presents two scenarios to illustrate how the elastic scheduling feature helps enterprises optimize resource allocation, reduce costs, and enhance efficiency.

By Kun Wu (Yueming)

What is Elastic Scheduling?

In the era of cloud computing, enterprises can acquire a large amount of computing resources from the cloud platform and adjust the types and amounts of resources used flexibly according to the development of business and the real-time changes in traffic requirements. Alibaba Cloud offers various elastic resources, such as Elastic Compute Service (ECS) and Elastic Container Instance (ECI). It also supports different billing methods, such as subscription, pay-as-you-go, and preemptible instances. While different types of instances and payment methods provide flexibility to customers, they also raise higher demands for customers' resource management capability.

Alibaba Cloud Container Service for Kubernetes (ACK) simplifies the management and O&M of cluster nodes using the node pool feature and supports automatic scaling by adjusting the number of nodes according to workloads and preset policies. It allows scaling of ECS instances in different zones, with different specifications, and utilizing different billing methods. Additionally, it enables the creation of ECIs for virtual nodes as needed to meet diverse requirements and optimize costs. However, customers still face the challenge of how to efficiently utilize and manage these computing resources.

The main challenges include:

1. Differentiated Control of Business Resource Usage

Clusters are set up with both subscription and preemptible instances, and service pods run on ECIs when resources are scarce. To ensure high-priority business operations remain stable on subscription instances, it's necessary to limit resource usage across different business types and instance types.

2. Failure to Release All the Application Pods during Scale-in

The default scale-in policy does not guarantee that service pods scaled out during peak hours are scaled in first. Consequently, some pods in the autoscaling node pool or on ECI remain undeleted post-peak hours. As a result, instances cannot be scaled in and continue to incur charges, necessitating manual migration of service pods.

Elastic scheduling aims to help customers tackle these challenges when utilizing elastic resources on the cloud. It supports resource scheduling based on multiple levels of priority and scaling in according to defined priorities.

Configure Priority-based Resource Scheduling

To address the difficulties customers encounter in multi-level resource management, Alibaba Cloud Container Service for Kubernetes (ACK) enhances the standard Kubernetes scheduling framework with an elastic scheduling feature. This feature allows for configuring custom priority-based resource scheduling.

This feature provides the capability to differentially schedule ECS and ECI resources, including:

  • Custom priority-based resource scheduling policies:

During application release or scaling out, customers can set the order in which application instance pods are scheduled to different types of node resources based on custom resource policies.

  • Scale-in in reverse order:

When customers use HPA to scale in service pods, the pods are removed in the reverse order of the resource priorities defined in the policy, so that pods on elastic resources are released first and billing is reduced.

  • Flexible policy modification:

When the policy changes, the priority of the scheduled service pods is adjusted synchronously. This feature is implemented through custom resource policies, eliminating the need to change business deployments.

  • Control over business resource usage:

Ensure that high-quality resources are preferentially provided for high-quality businesses by limiting the resource usage of different types of instances for different businesses.

  • Multiple resource usage statistical policies:

Multiple statistical policies are supported for limiting business resource usage, such as ignoring pods that are in the Terminating state or ignoring pods that were scheduled before the ResourcePolicy was submitted (see the sketch after this list).

  • Optimized deployment rolling update:

Pods created during a rolling update are automatically considered as a new group, eliminating the need to update ResourcePolicy each time a deployment is updated.
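
As an illustration of the statistical policies mentioned in the list above, the following sketch shows how such switches are typically expressed in a ResourcePolicy. The ignoreTerminatingPod and ignorePreviousPod field names follow the ACK documentation for this feature, but treat them as assumptions and verify them against the CRD version in your cluster.

apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: $example-name
  namespace: $example-namespace
spec:
  selector:
    $example-label-key: $example-label-value
  strategy: prefer
  # Assumed statistical-policy switches; verify the exact field names in your cluster
  ignoreTerminatingPod: true   # do not count pods that are already terminating
  ignorePreviousPod: true      # do not count pods scheduled before this policy was created
  units:
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-ecs-nodepool-id
    resource: ecs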

The following two scenarios illustrate how the elastic scheduling feature helps enterprises optimize resource allocation, reduce costs, and increase efficiency.

Scenario 1: Resident ECS Instances and Auto Scaling Preemptible Instances for Scale-in in Reverse Order

Preemptible instances, previously known as spot instances, are on-demand instances with similar performance to regular ECS instances. Their prices fluctuate in real-time based on market supply and demand. Compared to pay-as-you-go instances, preemptible instances can reduce instance costs by up to 90%. By using preemptible instances effectively, you can significantly lower the cost of cloud resources.

Preemptible instances may be reclaimed when they are preempted by users with higher bids, so they alone cannot guarantee that a minimum number of instances keeps running. Therefore, users typically combine preemptible instances with long-term subscription instances.

During peak hours, horizontal pod autoscaling (HPA) automatically increases the number of pods. Subsequently, the autoscaling node pool scales out preemptible instances based on the number of pending pods.

[Figure 1]
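
For reference, the peak-hour scale-out described above can be driven by a standard Kubernetes HorizontalPodAutoscaler. The following is a minimal sketch; the deployment name, replica bounds, and CPU target are assumptions for illustration.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa          # hypothetical name
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx            # hypothetical deployment name
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out when average CPU usage exceeds 70%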

During off-peak hours, the pods on preemptible instances should be reclaimed first so that the autoscaling node pool can release the preemptible instances and reduce resource overhead. However, the default scale-in policy may instead remove pods running on resident ECS instances, leaving the preemptible instances occupied and incurring additional costs.

As shown in the figure below, cn-hongkong.192.168.7.147 and cn-hongkong.192.168.7.148 are labeled with unit=first and unit=second respectively; in this demonstration, unit=first marks the node to which pods should be preferentially scheduled.

[Figure 2]
[Figure 3]
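
For reference, node labels like these can be applied with kubectl; the node names are the ones used in this demonstration.

# Label the two demo nodes so that scheduling policies can select them
kubectl label node cn-hongkong.192.168.7.147 unit=first
kubectl label node cn-hongkong.192.168.7.148 unit=second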

After scaling in to three pods:

[Figure 4]

As shown in the preceding figure, pods remain on cn-hongkong.192.168.7.148.

To preferentially reclaim preemptible instances, you can submit the following ResourcePolicy to the cluster. In this example, service pods are scheduled to nodes with alibabacloud.com/nodepool-id: example-spot-instance-nodepool-id only when the higher-priority nodes with alibabacloud.com/nodepool-id: example-ecs-nodepool-id cannot accommodate any more pods.

In the policy, it is assumed that ECS instances and preemptible instances are distinguished by node pool IDs. You can also distinguish between them by type or other custom labels. In practice, you can replace variables to suit your needs.

apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: $example-name
  namespace: $example-namespace
spec:
  selector:
    $example-label-key: $example-label-value   # matches the service pods to be managed
  strategy: prefer
  units:                                       # listed from highest to lowest priority
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-ecs-nodepool-id           # resident ECS node pool (preferred)
    resource: ecs
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-spot-instance-nodepool-id # preemptible instance node pool (fallback)
    resource: ecs
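
After replacing the placeholders, you can apply the policy and watch where newly created pods land; the file name below is arbitrary.

kubectl apply -f resourcepolicy.yaml
# Check which nodes the service pods are placed on
kubectl get pods -o wide -l $example-label-key=$example-label-value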

The following results are returned after configuration:

[Figure 5]
[Figure 6]

In this example, we submit a policy that preferentially schedules pods to nodes with the unit=first label and then to nodes with the unit=second label.

[Figure 7]

During scale-out, after pods fully use the resources on node cn-hongkong.192.168.7.147, subsequent pods are scheduled to node cn-hongkong.192.168.7.148.

[Figure 8]

Finally, we scale the business deployment in to three replicas. In this case, pods on node cn-hongkong.192.168.7.148 are removed first, achieving scale-in in reverse order. In practice, after the pods on cn-hongkong.192.168.7.148 are removed, the autoscaling node pool automatically reclaims the corresponding nodes to save costs.
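
The final scale-in step can be reproduced with a single command; the deployment name nginx is assumed from the app=nginx selector used in this demonstration.

# Scale the demo deployment down to three replicas; pods on the lower-priority
# unit are removed first because of the ResourcePolicy
kubectl scale deployment nginx --replicas=3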

Scenario 2: Control over Business Resource Usage by Using the Max Option

A rolling update is a vital part of releasing a service. To keep the service available during the update, a "delete after creation" policy is usually adopted: earlier replicas are cleaned up gradually only after the new replicas are running normally. As a result, the business consumes more resources during a rolling update than during normal operation, and this excess consumption may affect the running or scaling of other services in the cluster.

To limit the resource usage of a service on certain types of resources, you can use the max option, which limits the number of service pods on each resource type. To enable this feature, you only need to add a max field to a unit of the ResourcePolicy:

apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: $example-name
  namespace: $example-namespace
spec:
  selector:
    $example-label-key: $example-label-value   # matches the service pods to be managed
  strategy: prefer
  units:
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-ecs-nodepool-id
    resource: ecs
    max: $example-max                           # at most this many pods run on the first unit
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-spot-instance-nodepool-id
    resource: ecs                               # overflow pods are scheduled here

After the Max value is set, if the number of service pods running on a unit reaches the Max value, the next pod is scheduled to the next unit. When the last unit is full, the pod fails to be scheduled.

The following is an example of using Max to limit resource usage. We still label cn-hongkong.192.168.7.147 and cn-hongkong.192.168.7.148 with unit=first and unit=second respectively.

[Figure 9]
[Figure 10]

If resource usage is not limited, the service will use up the high-quality resources before falling back to the next-level resources, leaving no high-quality resources for other high-priority applications.

[Figure 11]
[Figure 12]

In this example, we submit a policy that first schedules pods to nodes with the unit=first label and then to nodes with the unit=second label. We also limit the level-1 unit to at most one service pod, which restricts the app=nginx service to a single pod on level-1 resources (a sketch of such a policy follows the figure below).

[Figure 13]
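
Because the screenshot is not reproduced here, the following is a sketch of such a policy, assuming the service pods carry the app=nginx label and the nodes carry the unit labels shown above; the name and namespace are placeholders.

apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: nginx-resourcepolicy
  namespace: default
spec:
  selector:
    app: nginx              # matches the demo service pods
  strategy: prefer
  units:
  - nodeSelector:
      unit: first           # level-1 resources
    resource: ecs
    max: 1                  # at most one nginx pod on unit=first nodes
  - nodeSelector:
      unit: second          # level-2 resources
    resource: ecs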

After configuring the ResourcePolicy, you will notice that the resource usage of the service on the machine labeled unit=first is limited. Any excess beyond the limit is then scheduled to the machine labeled unit=second, achieving a reasonable allocation of service resources.

What's Next?

The elastic scheduling feature helps enterprises efficiently utilize cloud resources by using multi-level priority scheduling, multi-level resource limiting, and scale-in in reverse order. In addition to the two basic features described in this article, the priority-based resource scheduling of ACK also supports advanced features such as resource statistical policies and intelligent grouping based on labels. By flexibly setting elastic scheduling, enterprises can implement efficient resource allocation and cost management, and better cope with resource management challenges brought by business growth.

Learn more about cloud-native AI suite:
https://www.alibabacloud.com/help/en/ack/cloud-native-ai-suite/product-overview/overview-8
