KubeVela + KEDA: Inherent Auto Scaling for Applications

Written Jointly by:
Yan Xun, Senior Engineer of Alibaba Cloud EDAS Team
Andy Shi, Alibaba Cloud Developer
Tom Kerkhove, Head of Codit Containerized Business, Azure Architect, KEDA Maintainer, and CNCF ambassador

A few areas come to mind when you scale Kubernetes. However, if you are new to Kubernetes, they can be overwhelming.

This article briefly explains how KEDA makes application auto scaling simple and why Alibaba Cloud Enterprise Distributed Application Service (EDAS) has fully standardized on KEDA.

Kubernetes Scaling

When managing Kubernetes clusters and applications, you need to carefully monitor various things, such as:

Cluster Capacity: Do we have enough resources available to run our workloads?
Application Workloads: Does the application have enough resources available? Can it keep up with the work to be done, such as queue depth?

To automate this, you usually set up alerts to get notified, or you use auto scaling. Kubernetes is a great platform that helps you with this out of the box.

You can scale a cluster easily using the Cluster Autoscaler component. It monitors the cluster for pods that cannot be scheduled due to a shortage of resources and adds or removes nodes accordingly.

Since Cluster Autoscaler only kicks in when pods fail to schedule, there may be a time gap during which your workloads are not up and running.

Virtual Kubelet, a CNCF sandbox project, allows you to add a virtual node to a Kubernetes cluster where a pod can be scheduled.

With it, platform providers, including Alibaba, Azure, and HashiCorp, let pending pods overflow outside the cluster until the cluster provides the required capacity, which alleviates the problem.

In addition to scaling clusters, Kubernetes allows you to scale applications easily:

Horizontal Pod Autoscaler (HPA) allows you to add or remove pods in your workloads to scale out or scale in (add or remove replicas).
Vertical Pod Autoscaler (VPA) allows you to add or remove resources for your pods to scale up or scale down (add or remove CPU or memory).

Together, these provide a good starting point for scaling applications.
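To make the HPA behavior concrete, here is a minimal sketch of the ratio rule described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to the target, rounded up. The function name and the sample numbers are illustrative assumptions. The sketch ignores min/max bounds, stabilization windows, and pod readiness handling.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the HPA ratio rule (simplified)."""
    ratio = current_metric / target_metric
    # Inside the tolerance band (0.1 is the Kubernetes default),
    # the replica count is left unchanged to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# 4 pods averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))  # 6
# 4 pods averaging 30% CPU against a 60% target: scale in.
print(desired_replicas(4, 30, 60))  # 2
```

The same rule applies whether the metric is CPU, memory, or an external value, which is why the choice of metric source matters so much for application-level scaling.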

HPA Limitations

Although HPA is a good start, it mainly focuses on the metrics of the pod itself, allowing you to scale based on CPU and memory. That said, you can fully configure the way it auto-scales, which makes it powerful.

While this is ideal for some workloads, you usually want to scale based on metrics that live elsewhere, such as Prometheus, Kafka, cloud providers, or other events.

Thanks to external metrics support, you can install metrics adapters that serve metrics from external services and make them available for auto scaling through a metrics server.

Note: You can only run one metrics server in a cluster, which means you must choose where your custom metrics come from.

You can use Prometheus and tools such as Promitor to bring in metrics from other providers and use Prometheus as the single source to scale on. However, this requires a lot of plumbing and work to scale.

There is an easier way: Kubernetes Event-Driven Autoscaling (KEDA)!

What Is KEDA?

Kubernetes Event-Driven Autoscaling (KEDA) is a single-purpose event-driven autoscaler for Kubernetes that can be added to Kubernetes clusters easily to scale applications.

It aims to make application auto scaling extremely simple and optimize costs by supporting scale-to-zero.

KEDA takes away all the scaling infrastructure and manages everything for you, allowing you to scale on more than 30 systems or extend it with your own scaler.

You only need to create a ScaledObject or ScaledJob that defines what you want to scale and which triggers to use. KEDA handles everything else!
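As a rough illustration of what such a definition contains, the sketch below builds a ScaledObject manifest as a Python dictionary and prints it as JSON, which kubectl accepts as well as YAML. It assumes the KEDA 2.x API (`keda.sh/v1alpha1` with `scaleTargetRef`) and the Prometheus trigger. The resource names, the Prometheus address, and the query are hypothetical placeholders, not values from EDAS.

```python
import json


def scaled_object(name: str, deployment: str, triggers: list,
                  min_replicas: int = 0, max_replicas: int = 10) -> dict:
    """Build a KEDA 2.x ScaledObject manifest for a Deployment."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name},
        "spec": {
            # The workload to scale; KEDA targets a Deployment by default.
            "scaleTargetRef": {"name": deployment},
            # A minimum of 0 enables scale-to-zero when all triggers are idle.
            "minReplicaCount": min_replicas,
            "maxReplicaCount": max_replicas,
            "triggers": triggers,
        },
    }


# Hypothetical Prometheus trigger: scale on request rate, 100 req/s per replica.
prometheus_trigger = {
    "type": "prometheus",
    "metadata": {
        "serverAddress": "http://prometheus.monitoring.svc:9090",
        "query": 'sum(rate(http_requests_total{app="orders"}[2m]))',
        "threshold": "100",
    },
}

manifest = scaled_object("orders-scaler", "orders", [prometheus_trigger])
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with `kubectl apply -f` would be one way to try it against a cluster that has KEDA installed.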

You can scale anything, even a CRD from another tool you are using, as long as it implements the /scale subresource.

Does KEDA reinvent the wheel? No! Instead, it extends Kubernetes by using HPAs under the hood. These HPAs consume external metrics served by KEDA's own metrics adapter, which replaces all other adapters.

Last year, KEDA joined CNCF as a sandbox project, and it plans to propose moving to the incubation stage later this year.

Alibaba's Practices Based on OAM/KubeVela and KEDA

As a leading enterprise PaaS product on Alibaba Cloud, EDAS has been serving countless developers on the public cloud on a large scale for many years. From the perspective of architecture, EDAS was built in conjunction with the KubeVela project. The following figure shows the overall architecture:

In production, EDAS integrates the ARMS monitoring service on Alibaba Cloud to provide fine-grained monitoring and application metrics. The EDAS Team has added an ARMS Scaler to the KEDA project to perform auto scaling. The team has also added some features and fixed some bugs in KEDA version 1.0, including:

When there are multiple triggers, these values will be summed instead of being left as separate values.
When creating KEDA HPA, the length of the name will be limited to 63 characters to avoid triggering DNS complaints.
Triggers cannot be disabled, which may cause trouble in production.

The EDAS Team is actively contributing these fixes to upstream KEDA, and some of them have already been included in version 2.0.

Why Does Alibaba Cloud Standardize KEDA as an Autoscaler for Applications?

When it comes to auto scaling, EDAS initially used the upstream Kubernetes HPA with CPU and memory as its two metrics. However, as the user base grew and demands diversified, the EDAS Team discovered the following limitations of the upstream HPA:

Support for custom metrics is limited, especially for fine-grained application-level metrics. The upstream HPA focuses on container-level metrics, such as CPU and memory, which are too coarse for applications. Metrics that reflect application load, such as response time (RT) and queries per second (QPS), are not supported out of the box. HPA can be extended, but this capability is limited when it comes to application-level metrics. The EDAS Team was often forced to fork code when trying to introduce fine-grained application-level metrics.
Scale-to-zero is not supported. Many users have a scale-to-zero need when their microservices are not being used. This need is not limited to FaaS or serverless workloads. It saves costs and resources for all users. Currently, the upstream HPA does not support this feature.
Scheduled scaling is not supported. EDAS users also need scheduled scaling capabilities, and the upstream HPA does not support this feature either. So the EDAS Team needed to find alternatives without vendor lock-in.

Based on these needs, the EDAS Team began to plan a new version of EDAS auto scaling. Meanwhile, EDAS introduced OAM in early 2020 and thoroughly overhauled its underlying core components. OAM provides EDAS with standardized and pluggable application definitions that replace its in-house Kubernetes application CRD. The extensibility of this model enables EDAS to integrate new capabilities from the Kubernetes community easily. Therefore, the EDAS Team tried to align the requirements for the new EDAS auto scaling features with a standard implementation of auto scaling as an OAM feature.

Based on these use cases, the EDAS Team defined three criteria:

The auto scaling feature should present itself as a simple atomic capability that is not tied to any complex solution.
Metrics should be pluggable, so the EDAS Team can customize them and build on them to support various requirements.
Scale-to-zero should be supported out of the box.
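To illustrate how pluggable metric sources and scale-to-zero fit together, here is a small sketch. It is not KEDA's actual API: the `Scaler` contract, the `StaticScaler` class, and `replicas_for` are hypothetical names invented for this example. It assumes that each source reports a value and a per-replica target, that zero replicas are wanted when every source is idle, and that otherwise the most demanding source wins, as HPA does with multiple metrics.

```python
import math
from abc import ABC, abstractmethod
from typing import Sequence


class Scaler(ABC):
    """Hypothetical plug-in contract for one metric source."""

    @abstractmethod
    def metric_value(self) -> float:
        """Current value of the watched metric, e.g. queue depth or QPS."""

    @abstractmethod
    def target_per_replica(self) -> float:
        """Load that a single replica is expected to handle."""


class StaticScaler(Scaler):
    """Toy scaler with fixed numbers; a real one would query a backend."""

    def __init__(self, value: float, target: float) -> None:
        self._value = value
        self._target = target

    def metric_value(self) -> float:
        return self._value

    def target_per_replica(self) -> float:
        return self._target


def replicas_for(scalers: Sequence[Scaler], max_replicas: int) -> int:
    """Zero when every source is idle, else the largest per-source demand."""
    demands = [
        math.ceil(s.metric_value() / s.target_per_replica())
        for s in scalers
        if s.metric_value() > 0
    ]
    if not demands:
        return 0  # scale to zero: nothing is producing load
    return min(max(demands), max_replicas)


queue = StaticScaler(value=42, target=5)    # 42 messages, 5 per replica
qps = StaticScaler(value=120, target=100)   # 120 QPS, 100 per replica
print(replicas_for([queue, qps], max_replicas=10))              # 9
print(replicas_for([StaticScaler(0, 5)], max_replicas=10))      # 0
```

Adding a new metric source in this sketch means writing one more `Scaler` subclass, which is the kind of extensibility the second criterion asks for.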

After a detailed evaluation, the EDAS Team chose the KEDA project. The project was open-sourced by Microsoft and Red Hat and has been donated to CNCF. KEDA provides several useful scalers and supports scale-to-zero out of the box. It provides fine-grained auto scaling for applications. Moreover, it has the concepts of scalers and a metrics adapter, which support a powerful plug-in architecture and provide a unified API layer. Most importantly, KEDA focuses only on auto scaling, which allows it to be integrated easily as an OAM feature. In general, KEDA is very suitable for EDAS.
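As a hedged example of how the scheduled scaling and scale-to-zero needs listed earlier might be expressed with KEDA, the sketch below builds a ScaledObject that uses KEDA's cron trigger. It assumes the KEDA 2.x API. The workload name, time zone, and schedule are hypothetical and do not describe how EDAS configures its applications.

```python
import json

# Hypothetical schedule: 5 replicas on weekdays from 08:00 to 20:00.
business_hours = {
    "type": "cron",
    "metadata": {
        "timezone": "Asia/Shanghai",
        "start": "0 8 * * 1-5",    # cron expression: scale up at 08:00
        "end": "0 20 * * 1-5",     # cron expression: scale down at 20:00
        "desiredReplicas": "5",
    },
}

scheduled = {
    "apiVersion": "keda.sh/v1alpha1",
    "kind": "ScaledObject",
    "metadata": {"name": "report-service-schedule"},
    "spec": {
        "scaleTargetRef": {"name": "report-service"},
        # Outside the schedule the workload falls back to the minimum,
        # which is zero here.
        "minReplicaCount": 0,
        "maxReplicaCount": 5,
        "triggers": [business_hours],
    },
}

print(json.dumps(scheduled, indent=2))
```

A metric-based trigger could be listed next to the cron trigger in the same `triggers` array if the workload should also react to load during the scheduled window.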

Prospects

Alibaba is actively promoting an AIOps-driven KEDA feature to bring intelligent decision-making to its auto scaling behavior. This will build on the newly implemented application QoS triggers and database metric triggers in Alibaba's KEDA component, so that auto scaling decisions can be based on expert systems and historical data analysis. Therefore, a more powerful, intelligent, and stable KEDA-based auto scaling feature is expected to be released in KEDA soon.
