By Yanxun, Core Development Engineer of Alibaba Cloud EDAS and Andy Shi, Technical Evangelist of Alibaba Cloud
Cloud-native technology stacks are increasingly widespread, but how can we implement Kubernetes more efficiently and accessibly to realize the real value of cloud-native? This is a new challenge and a hot topic. The focus of cloud-native technology has gradually shifted from "usage" to "better usage." In response, CNCF SIG App Delivery teamed up with Alibaba Cloud's Cloud-Native Application Platform Team to launch a series of articles called "From Zero to One: Building a Cloud-Native Application Management Platform." These articles aim to help readers better implement and practice core cloud-native technologies and build their own application-centered Kubernetes platform.
Alibaba Cloud Enterprise Distributed Application Service (EDAS) is an all-in-one PaaS platform for application lifecycle management and monitoring. It is also the first Internet-level commercial platform to implement the Open Application Model (OAM) on the public cloud. Today, the kernel of the EDAS application management layer runs on native Kubernetes clusters and is built on the KubeVela open-source project. The platform has served thousands of cloud application developers in an efficient, stable, intelligent, and scalable manner. In this article, we use the underlying technology of EDAS as a concrete example. We describe the problems Alibaba Cloud encountered, and the solutions it adopted, while designing and implementing intelligent application scaling strategies in a production environment. We also share best practices for building a cloud-native application platform.
As a core product for application management and delivery, EDAS completed its overall architecture migration from dedicated virtual machines to Kubernetes container clusters early on. Like most Kubernetes-based PaaS platforms, EDAS at that stage implemented application auto-scaling based on CPU and memory metrics, using the native Horizontal Pod Autoscaler (HPA) of Kubernetes. However, as the number of users grew and their demands diversified, the native HPA-based scaling policy gradually exposed several shortcomings.
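For reference, the core of the native HPA decision is a simple ratio. The sketch below restates the formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), in Python. The function name, the replica bounds, and the sample numbers are illustrative assumptions and are not taken from EDAS.

```python
import math


def hpa_desired_replicas(current_replicas, current_value, target_value,
                         min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric.

    All parameter names are illustrative; tolerance=0.1 mirrors the
    documented Kubernetes default for the HPA tolerance band.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    # Within the tolerance band, the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

With 4 replicas at 90% average CPU against a 60% target, this yields 6 replicas. With a metric value of 0, the result is still clamped to `min_replicas`, so an idle application keeps at least one pod under this configuration.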
Kubernetes is a "Platform for Platform" project, so its built-in capabilities mainly target container-level management and orchestration. For products that focus on applications and users, scaling metrics such as CPU and memory are too coarse-grained. Although HPA supports custom metrics to a degree, its overall extensibility is limited, and custom metrics are not easily pluggable. When we tried to refine metrics to the level of applications or source code, we had to modify the HPA code, which is part of the Kubernetes code base. Therefore, we needed an external framework with strong extensibility to implement fine-grained application scaling policies.
Scale-to-zero is a typical automatic scaling scenario in Serverless and FaaS platforms. It helps users release idle resources and reduce platform usage costs. In modern microservices applications, many microservices hosted on the cloud also share some characteristics of Serverless applications, such as being stateless and responding to traffic. Thus, scale-to-zero is also an important requirement for them. However, the built-in HPA in Kubernetes is not suited to this scenario and does not provide this capability. EDAS is a full-featured PaaS product and wants each capability to be atomic: independent and free from platform binding. As a result, introducing a complete Serverless solution, such as OpenFaaS or Knative, cannot solve the problem for all user scenarios.
Besides scale-to-zero, scheduled scaling is an indispensable feature for EDAS users. Likewise, this application O&M capability must be an independent, atomic capability. We cannot introduce a complete set of solutions from another platform just to meet one requirement.
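As a rough illustration of what scheduled scaling means at the policy level, the sketch below maps time-of-day windows to replica counts. `ScheduleWindow` and `scheduled_replicas` are hypothetical names introduced here; they are not EDAS or KEDA APIs, and a production policy would also need to handle time zones and calendar dates.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ScheduleWindow:
    """Hypothetical policy entry: run `replicas` pods between start and end."""
    start: time
    end: time
    replicas: int


def scheduled_replicas(now, windows, default_replicas=0):
    """Return the replica count of the first window that contains `now`."""
    for w in windows:
        if w.start <= w.end:
            inside = w.start <= now < w.end
        else:
            # The window wraps past midnight, for example 22:00 to 02:00.
            inside = now >= w.start or now < w.end
        if inside:
            return w.replicas
    # Outside every window, fall back to the default (possibly zero).
    return default_replicas
```

For example, a policy with a daytime window of 10 replicas and a default of 0 combines scheduled scaling with scale-to-zero outside business hours.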
To solve the preceding problems, Alibaba Cloud planned a new version of EDAS with a new automatic scaling capability. At the same time, the underlying architecture of EDAS has been evolving on the basis of the Open Application Model (OAM) since the beginning of 2020. The team aims to introduce a standardized and pluggable application definition model to replace the original Application CRD of EDAS. The team can then offer users an application-centered, higher-level abstraction rather than forcing them to learn the underlying Kubernetes concepts. The team can also use the extensibility of the model to plug capabilities from the cloud-native ecosystem into the product with one click. Therefore, the design and implementation of the new automatic elastic scaling component are integrated with the OAM-based architecture of EDAS.
In this new architecture, the automatic elastic scaling policy of an application is a Trait of that application. The concept of "application" here is a higher-level, Kubernetes-based abstraction that EDAS exposes to users through OAM and describes with user-facing primitives. This raises a question: "How should the user-defined, application-oriented elastic scaling policy be implemented at the Kubernetes layer, and which implementation should be selected?"
Combining the three specific challenges mentioned earlier and the OAM-based Kubernetes-native design of the new EDAS, the team decided to introduce a horizontal scaling component from the open-source community to solve the preceding problems. The team summarizes three main selection requirements for EDAS scenarios:
After evaluating the options in the community, the team chose the open-source KEDA project, which originated at Microsoft and is hosted by CNCF. KEDA natively supports scale-to-zero. More importantly, for application-level horizontal scaling, it decouples the scaled object from the scaling metrics and defines an abstract interface for each through its Scaler + Metrics Adapter mechanism. This provides a powerful plug-in mechanism and a unified way to define all scaling policies. In addition, the design and architecture of KEDA are relatively simple, without complex "black magic." Many built-in scalers can be used directly, which meets the overall demands of EDAS.
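The decoupling described above can be pictured as a small interface: each metric source implements the same scaler contract, and the scaling decision never needs to know where the numbers come from. The sketch below is a Python analogue only. KEDA itself is written in Go, and the names used here (`Scaler`, `is_active`, `metric_value`, `QueueLengthScaler`, `decide_replicas`) are hypothetical, chosen for illustration rather than copied from KEDA's API.

```python
import math
from typing import Protocol


class Scaler(Protocol):
    """Hypothetical contract: one implementation per metric source."""

    def is_active(self) -> bool: ...

    def metric_value(self) -> float: ...


class QueueLengthScaler:
    """Example scaler backed by a queue length (stand-in for any source)."""

    def __init__(self, queue_length: int):
        self.queue_length = queue_length

    def is_active(self) -> bool:
        return self.queue_length > 0

    def metric_value(self) -> float:
        return float(self.queue_length)


def decide_replicas(scaler: Scaler, target_per_replica: float,
                    min_replicas: int = 0, max_replicas: int = 10) -> int:
    """Scale to min_replicas when idle, otherwise size by metric/target."""
    if not scaler.is_active():
        # With min_replicas=0 this is the scale-to-zero case.
        return min_replicas
    wanted = math.ceil(scaler.metric_value() / target_per_replica)
    # An active workload always gets at least one replica.
    return max(1, min_replicas, min(max_replicas, wanted))
```

Adding a new metric source then means writing another class with the same two methods. In KEDA itself, the 1-to-N step is delegated to an HPA fed by KEDA's metrics adapter, while KEDA handles activation between 0 and 1; the single function here collapses both steps for brevity.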
In terms of technical architecture, the kernel of Alibaba Cloud EDAS is built on the KubeVela open-source project from the OAM community. Thanks to the Kubernetes-native extension mechanism provided by OAM, the EDAS R&D team, unlike a traditional PaaS team, does not have to perform massive secondary development or modify the user-side API when launching capabilities from the cloud-native open-source community, such as KEDA. The team only needs to register the KEDA CRD as an Autoscale Trait of EDAS according to the OAM specification. Users can then use the newly added horizontal scaling capability once the monitoring data connection is complete. The overall architecture is shown in the chart below:
In its implementation, EDAS drives KEDA to scale workloads horizontally and rapidly, based on the fine-grained, application-level monitoring data provided by Alibaba Cloud ARMS. To do so, the team added an ARMS Scaler to KEDA. EDAS also fixed many problems in KEDA v1 and enhanced some aspects of it, including:
The EDAS Team has submitted (or is submitting) these fixes to the KEDA upstream, and some of them have already been included in KEDA v2.
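The registration step mentioned earlier, exposing the KEDA CRD as an OAM trait, might look roughly like the manifest built below. This is a sketch that assumes the OAM v1alpha2 `TraitDefinition` schema and the KEDA v2 CRD name `scaledobjects.keda.sh`. The trait name `autoscale`, the workload type, and the helper function names are hypothetical; the actual EDAS definitions are not shown in this article.

```python
import json


def keda_trait_definition(
        trait_name="autoscale",  # hypothetical trait name
        crd_name="scaledobjects.keda.sh",  # assumed KEDA v2 CRD name
        workloads=("core.oam.dev/v1alpha2.ContainerizedWorkload",)):
    """Build an OAM TraitDefinition that points at the KEDA ScaledObject CRD."""
    return {
        "apiVersion": "core.oam.dev/v1alpha2",
        "kind": "TraitDefinition",
        "metadata": {"name": trait_name},
        "spec": {
            # Workload types this trait may be attached to.
            "appliesToWorkloads": list(workloads),
            # The CRD that actually implements the trait.
            "definitionRef": {"name": crd_name},
        },
    }


def render_manifest(manifest):
    """Serialize the manifest as JSON, which the Kubernetes API also accepts."""
    return json.dumps(manifest, indent=2)
```

Calling `render_manifest(keda_trait_definition())` produces a document that could be applied to a cluster where both the OAM runtime and the KEDA CRDs are installed. No change to the user-side API is involved, only a new definition object.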
Kubernetes has a long-standing problem: automatic scaling and gray release often conflict, because both try to control the replicas of the same workload. To address this problem, EDAS uses the semantics of the OAM model layer to make these two capabilities mutually exclusive.
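One possible reading of mutual exclusion at the model layer is a validation pass over the traits attached to an application, run before anything reaches the cluster. The sketch below illustrates that idea only. The trait names `autoscale` and `gray-release`, the conflict table, and both functions are hypothetical and are not taken from EDAS or the OAM specification.

```python
# Hypothetical table of trait pairs that must not be enabled together.
CONFLICTING_TRAITS = {frozenset({"autoscale", "gray-release"})}


def find_trait_conflicts(trait_names):
    """Return the sorted list of conflicting trait pairs that are enabled."""
    enabled = set(trait_names)
    return sorted(
        tuple(sorted(pair))
        for pair in CONFLICTING_TRAITS
        if pair <= enabled  # both traits of the pair are present
    )


def validate_traits(trait_names):
    """Reject an application whose traits include a conflicting pair."""
    conflicts = find_trait_conflicts(trait_names)
    if conflicts:
        raise ValueError(
            f"mutually exclusive traits enabled together: {conflicts}")
```

Because the check operates on trait names rather than on Kubernetes objects, it can run when the application definition is submitted, before either controller starts changing replica counts.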
EDAS is currently working with open-source communities to add many new capabilities to the KEDA-based Autoscaler Trait, including:
In the future, the EDAS Team will mainly focus on integrating the current architecture with the AIOps capabilities of EDAS. Thus, a more intelligent and elastic experience for the entire platform can be achieved, including:
In the next version, these KEDA-based innovations and enhancements will bring more powerful, intelligent, and stable application auto-scaling capabilities and a friendlier user experience.
This article uses the automatic elastic scaling of EDAS as an example to introduce the challenges that Alibaba Cloud's enterprise application platform encountered, and the solutions it adopted, while integrating KEDA as its horizontal scaling component. This work is based on the OAM and KubeVela projects in a classic PaaS scenario. In the future, this KEDA-based platform will integrate with a wider range of scaling metrics and more intelligent decision-making mechanisms.
As the cloud-native ecosystem evolves, Alibaba Cloud EDAS is being applied on a large scale in the field of cloud-native application management. EDAS brings application versioning, dependency management, O&M feature interaction, batch delivery, and other enhancements, and it has accumulated a wide range of best practices and experience. With a standardized and open product architecture, Alibaba Cloud EDAS can integrate with the "new forces" of the cloud-native community, such as KEDA, and deliver powerful application management capabilities from open-source communities to users in a standardized and extensible manner. In this way, it pursues user-centered technological innovation and evolution and moves towards the next era of cloud-native application PaaS.