How to Choose an Elasticity Strategy for K8S

The emergence of the microservice architecture has split large monolithic applications into smaller components, making development and collaboration across business teams more flexible. When business traffic increases, it is often necessary to expand the capacity of some application components. K8S provides HPA (Horizontal Pod Autoscaler) at the application level, and the open source community has extended elastic components such as KEDA around HPA, making it possible for microservice applications to implement elastic policies based on business indicators. However, a major prerequisite for HPA to work properly is that the cluster has sufficient resources. For this reason, users must either expand cluster capacity in advance or keep some cluster resources in reserve.
For cluster resource elasticity, the K8S community offers two solutions: Cluster Autoscaler (CA) and Virtual Kubelet (VK). This article starts from the form and characteristics of microservice applications, analyzes the scenarios where CA and VK apply, and summarizes how to choose a cluster resource elasticity strategy for applications under the microservice architecture.

Microservice application form and characteristics
The microservice architecture splits a large application system into discrete application components. These components are connected through RPC to provide complete services externally. Each component is discrete, and most components can be scaled horizontally to adjust service capacity. Components on non-core links can tolerate delayed expansion or no expansion, and can even be scaled in to give up resources.
There are five major characteristics in elastic scenarios under the microservice architecture:
• Horizontal scaling can adjust system capacity: When external resources are sufficient, horizontal scaling of microservice application components can increase the capacity of the business system.
• There are dependencies between applications: A single microservice application cannot provide complete services, and expanding a single microservice component increases system capacity only to a very limited extent. It is often necessary to scale out a component together with the services it depends on to effectively increase system capacity.
• The application itself is stateless: A stateful microservice application is hard to scale horizontally. For example, if it depends heavily on local disk, replicas on the same node compete for disk IO, and state data must be handled when scaling in. Applications should therefore be made stateless wherever possible.
• Fast startup and lossless traffic when instances go online or offline: Bringing instances online and offline without dropping traffic is very important for automatic scaling, especially under high traffic and high concurrency. Otherwise, newly added pods that start slowly or are overwhelmed by traffic can fail their health probes and be restarted repeatedly by K8S, so the scale-out ends up having no effect.
• Traffic is cyclical: Most microservice applications serve online traffic, which can be described by the 80/20 rule: 80% of the traffic is handled in 20% of the time. The most notable feature of business traffic is that it changes periodically, and these changes are often rapid, so how quickly microservice applications can scale out and in plays an important role in the stability of the business system.
When configuring application elasticity under the microservice architecture, we need to choose an appropriate indicator to measure system capacity. When configuring cluster resource elasticity, we need to consider whether the newly added computing resources can meet the needs of the application.
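As an illustration of how a chosen indicator drives application elasticity, the replica calculation documented for HPA can be sketched in Python. The function name and the idea of using requests per second per pod as the indicator are assumptions for illustration; the ratio formula and the default tolerance of 0.1 follow the Kubernetes HPA documentation.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the documented HPA ratio formula.

    current_metric and target_metric are per-pod values of the chosen
    indicator (for example, requests per second per pod).
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up.
    return math.ceil(current_replicas * ratio)
```

For example, with 4 replicas each handling 90 requests per second against a target of 60, the sketch suggests 6 replicas. Whether the cluster can actually place the 2 extra pods is exactly the cluster resource elasticity question.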
K8S Cluster Resource Elasticity Technical Solution
As mentioned in the introduction, the K8S community offers two "standard answer" frameworks. How resource elasticity is actually realized also depends on the technical form and product capabilities of each cloud vendor.
Virtual node: VK
Virtual Kubelet introduces the concept of a "virtual node", modeled on the definition of Kubelet. It allows cloud vendors to package cloud services as a "virtual node" and add it to a Kubernetes cluster. Behind a virtual node there is usually a large cloud vendor resource pool, so in theory the resources of a virtual node can be treated as unlimited. In practice, the limit depends on the scale and product capabilities of the cloud vendor.
Node scaling: CA
Cluster Autoscaler is the cluster node scaling solution provided by the K8S community. CA watches all Pod events in the cluster. When a Pod cannot be scheduled because of insufficient resources, CA simulates expansion and scheduling based on the scaling group information, and then performs the real node expansion according to the preset node expansion strategy. At the same time, CA monitors the resource utilization of the cluster. When utilization falls below the preset scale-down threshold, CA simulates scheduling after scale-down. After ruling out the various blocking factors, CA taints, drains, and deletes the nodes that can be removed.
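The scale-up simulation and the scale-down check described above can be sketched roughly as follows. This is a simplified model, not CA's actual implementation: the `Pod` and `NodeGroup` classes, the first-fit packing, and the "fewest new nodes" selection rule are assumptions made for illustration. The real CA simulates the scheduler and chooses among scaling groups with configurable expanders. The 0.5 threshold mirrors CA's documented default for scale-down utilization.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Pod:
    name: str
    cpu: float     # requested CPU cores
    memory: float  # requested memory in GiB


@dataclass
class NodeGroup:
    name: str
    cpu: float     # allocatable CPU cores per node
    memory: float  # allocatable memory per node in GiB
    size: int      # current number of nodes
    max_size: int  # upper bound of the scaling group


def nodes_needed(group: NodeGroup, pending: List[Pod]) -> Optional[int]:
    """Pack pending pods first-fit onto simulated new nodes of one group.

    Returns None if a pod does not fit an empty node of this group or if
    the group would exceed max_size.
    """
    free: List[List[float]] = []  # remaining [cpu, memory] per new node
    for pod in pending:
        if pod.cpu > group.cpu or pod.memory > group.memory:
            return None
        for slot in free:
            if slot[0] >= pod.cpu and slot[1] >= pod.memory:
                slot[0] -= pod.cpu
                slot[1] -= pod.memory
                break
        else:
            # No simulated node has room, so add another one.
            free.append([group.cpu - pod.cpu, group.memory - pod.memory])
    if group.size + len(free) > group.max_size:
        return None
    return len(free)


def plan_scale_up(groups: List[NodeGroup],
                  pending: List[Pod]) -> Optional[Tuple[str, int]]:
    """Pick the group that fits all pending pods with the fewest new nodes."""
    best: Optional[Tuple[str, int]] = None
    for group in groups:
        count = nodes_needed(group, pending)
        if count and (best is None or count < best[1]):
            best = (group.name, count)
    return best


def is_scale_down_candidate(requested_cpu: float, allocatable_cpu: float,
                            requested_memory: float, allocatable_memory: float,
                            threshold: float = 0.5) -> bool:
    """A node is a candidate when its highest request ratio is below the threshold."""
    utilization = max(requested_cpu / allocatable_cpu,
                      requested_memory / allocatable_memory)
    return utilization < threshold
```

A candidate node is only the starting point: as the text notes, CA still has to rule out the blocking factors before it taints, drains, and deletes the node.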
Comparison of the features of each program
Real node scaling based on CA and virtual nodes based on VK each have their own characteristics. The main differences are as follows:
In short, CA scales real nodes and provides complete K8S capabilities, but it responds slowly. VK is backed by cloud vendor resource pools and provides second-level, theoretically unlimited resource elasticity, but because there are no real nodes, some K8S features are lost.
Cloud Vendor Solutions
Along both main directions of resource elasticity, VK and CA, cloud vendors have launched corresponding products.
The serverless direction mainly consists of Serverless Instance and Serverless Cluster products. Serverless Instance products include Alibaba Cloud ECI, AWS Fargate, and Azure ACI, which are characterized by fast startup and a large resource pool. Serverless Cluster products include Alibaba Cloud's ASK and Google's GKE Autopilot. With these products, the cloud vendor maintains all cluster resources, and users can use the cluster out of the box without O&M.
In the direction of node scaling, AWS has also launched the open source component Karpenter, which bypasses the scaling group concept used by CA and so allows more flexible resource selection when scaling.
Resource Elasticity Strategy Selection and Considerations
When it comes to resource elasticity, our primary consideration is capability, that is, whether the new computing resources can meet the business's requirements. Elastic solutions based on VK are constrained by architecture design, security, and performance, and inherently lack capabilities such as node-level features and container privileges. Business applications that need these capabilities should be modified where possible to remove the dependencies.
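One way to find such dependencies before moving a workload to virtual nodes is to scan its pod spec. The sketch below works on a pod spec loaded as a plain Python dict. The field names (`hostNetwork`, `hostPID`, `hostIPC`, `hostPath`, `securityContext.privileged`) are standard Kubernetes pod spec fields, but treating all of them as blockers is an assumption: what a virtual node actually supports varies by cloud vendor, so the list should be checked against the vendor's documentation. The function name is hypothetical.

```python
from typing import Any, Dict, List


def virtual_node_blockers(pod_spec: Dict[str, Any]) -> List[str]:
    """List node-level dependencies found in a pod spec given as a dict."""
    blockers: List[str] = []
    # Assumption: host namespaces are not available without a real node.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if pod_spec.get(field):
            blockers.append(field)
    # Assumption: hostPath volumes need a real node file system.
    for volume in pod_spec.get("volumes") or []:
        if "hostPath" in volume:
            blockers.append(f"hostPath volume '{volume.get('name', '?')}'")
    # Container privileges are named in the text as a missing capability.
    for container in pod_spec.get("containers") or []:
        security = container.get("securityContext") or {}
        if security.get("privileged"):
            blockers.append(f"privileged container '{container.get('name', '?')}'")
    return blockers
```

An empty result does not prove that a workload is compatible. It only means that none of the dependencies assumed here were found.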
The second thing we consider is cost and efficiency. For enterprise applications, cost budgeting is an unavoidable topic. Different pricing rules and billing models ultimately result in different resource costs, which inevitably affects our preference for a given technology. In current serverless scenarios, computing resources are generally billed pay-as-you-go. For long-running applications, whether prepaid computing resources would save more requires further research and experimentation. Cost includes not only resource costs, but also operation and maintenance costs, the team's learning costs, and the migration costs implied by relying on a specific cloud vendor. These cannot be ignored. At the same time, when a technical team chooses a solution to save operation and maintenance costs and reduce learning costs, it should weigh those savings rationally against the benefits that team growth would bring.
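The comparison between pay-as-you-go and prepaid billing for a long-running application comes down to a break-even calculation, sketched below. The function names are hypothetical, the prices are inputs rather than real vendor prices, and the model ignores discounts, reserved capacity rules, and minimum billing units.

```python
def break_even_hours(prepaid_monthly_price: float, hourly_price: float) -> float:
    """Running hours per month at which both billing modes cost the same."""
    if hourly_price <= 0:
        raise ValueError("hourly_price must be positive")
    return prepaid_monthly_price / hourly_price


def cheaper_mode(running_hours: float, prepaid_monthly_price: float,
                 hourly_price: float) -> str:
    """Return the cheaper billing mode for one instance over one month."""
    pay_as_you_go_cost = running_hours * hourly_price
    if pay_as_you_go_cost > prepaid_monthly_price:
        return "prepaid"
    return "pay-as-you-go"
```

With a hypothetical prepaid price of 90 per month and a pay-as-you-go price of 0.25 per hour, the break-even point is 360 hours. An instance that runs all month (about 720 hours) is cheaper prepaid, while one that runs only during traffic peaks is cheaper pay-as-you-go.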
Efficiency is one of the important factors affecting business cost and revenue. The path runs from incoming traffic, to HPA responding to its indicators, to resource elasticity taking action, and finally to application startup and the service going online. Each of these links has a time cost. Usually, the smaller the time cost, the better, although some businesses are not sensitive to it. Technical solutions have emerged for each link of capacity expansion. For example, to address the reactive nature of HPA, Alibaba Cloud has launched AHPA, which expands capacity early based on indicator prediction. To address the slow startup of Java applications, GraalVM and Alibaba Dragonwell have worked on cold start. For applications with a clear business cycle, configuring scheduled scaling in advance naturally avoids these problems.
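To see which link is worth optimizing first, the time cost of each link can be added up and the slowest link identified. The sketch below is a minimal model in which the links simply run one after another. The stage names are labels chosen here for the four links in the text, and any numbers passed in are measurements you would have to collect yourself.

```python
from typing import Dict, Tuple

# Labels chosen for the four links described in the text.
STAGES = ("hpa_response", "resource_provisioning", "app_startup", "service_launch")


def scale_out_latency(stage_seconds: Dict[str, float]) -> Tuple[float, str]:
    """Return the total scale-out time in seconds and the slowest stage."""
    missing = [stage for stage in STAGES if stage not in stage_seconds]
    if missing:
        raise ValueError(f"missing stages: {missing}")
    total = sum(stage_seconds[stage] for stage in STAGES)
    slowest = max(STAGES, key=lambda stage: stage_seconds[stage])
    return total, slowest
```

If resource provisioning dominates, the choice between CA and VK matters most. If application startup dominates, faster resources help little until startup itself is improved.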
Finally, there are still some scenario issues to consider. When upgrading, migrating, or rebuilding the current application architecture, we need to take high elasticity into account and select appropriate technologies.
To sum up, we have drawn a resource elasticity selection strategy diagram and listed the factors to consider when selecting a cluster elasticity type in general scenarios.

In the microservice architecture, we need to sort out and divide application components from a business perspective. For components on the core link, we should make them as robust as possible and move them to a highly available, highly elastic architecture, so that the core business can run stably over the long term. For components on peripheral links, we need to weigh cost against the benefits of high availability and high elasticity.
On the question of K8S resource elasticity, given the technical means available today, we need to consider compatibility, efficiency, and cost in order to choose a cluster elasticity strategy that suits our own business.
