From Kubernetes to Serverless

1. What does K8S bring to Serverless

Kubernetes is a container-based scheduling engine. Early container technologies included LXC and cgroups. In 2013, the open-sourcing of the Docker project marked the maturity of container technology and the birth of the container image standard.

Before the container image standard existed, the industry did not really split the system by concern; the whole system was treated as infrastructure oriented. The birth of the container image standard means the system can be split along two concerns: one application oriented, the other infrastructure oriented. The application-oriented part is the container image, and the infrastructure-oriented part is the kernel. With container images in place, the container-based deployment ecosystem could keep improving.

In 2014, the Kubernetes project was born, and AWS Lambda was released in the same year, marking the further commercialization of Serverless. In 2016, the establishment of the Cloud Native Computing Foundation (CNCF) marked the gradual maturity and growth of the Kubernetes ecosystem. In 2018, Knative and Alibaba Cloud Serverless App Engine (SAE) were released, marking the commercialization of Kubernetes-based Serverless.

Under the CNCF umbrella, many Kubernetes ecosystem projects have been born and are developing rapidly.

The uniqueness of Kubernetes lies in its concepts. It follows ideas that are also central to Serverless, such as immutable infrastructure, desired-state orientation, and a declarative API, and these ideas have created today's Kubernetes ecosystem.

Early Linux was oriented to a single server, and it decoupled the application runtime from the underlying devices: applications can run on different hardware through the kernel API. Its design philosophy is "everything is a file", which also brought about the prosperity of the Linux ecosystem. Kubernetes' declarative API can likewise be seen as a CRUD interface over resource files, which decouples applications from the distributed infrastructure.

Linux is for a single machine, and Kubernetes is for a distributed system. All nodes in the cluster are flattened into one pool and can be scheduled directly. Because of the immutable infrastructure concept, a process that crashes can be rebuilt and scheduled to another node.
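The desired-state idea behind this can be illustrated with a minimal sketch. It is not Kubernetes code: `Cluster`, `reconcile`, and the node names are hypothetical, and the placement rule (least-loaded healthy node) is an assumption made only for illustration. The sketch only rebuilds missing replicas; it does not scale down.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a cluster (hypothetical, for illustration only)."""
    healthy_nodes: set[str]
    # Maps replica name -> node name.
    placements: dict[str, str] = field(default_factory=dict)


def reconcile(cluster: Cluster, app: str, desired_replicas: int) -> None:
    """Run one pass that drives the actual state toward the declared state."""
    # Replicas on failed nodes are dropped and rebuilt, never repaired.
    cluster.placements = {
        name: node
        for name, node in cluster.placements.items()
        if node in cluster.healthy_nodes
    }
    if not cluster.healthy_nodes:
        return  # Nothing to schedule onto; try again on the next pass.

    index = 0
    while len(cluster.placements) < desired_replicas:
        name = f"{app}-{index}"
        index += 1
        if name in cluster.placements:
            continue
        # Assumed placement rule: least-loaded healthy node, ties by name.
        loads = {node: 0 for node in cluster.healthy_nodes}
        for node in cluster.placements.values():
            loads[node] += 1
        cluster.placements[name] = min(sorted(loads), key=loads.get)
```

The caller only declares how many replicas it wants. If a node disappears from `healthy_nodes`, the next `reconcile` pass recreates the lost replicas on the remaining nodes.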

Kubernetes is a virtual distributed operating system. Problems encountered on a Linux system will also be encountered on Kubernetes.

2. Problems encountered in distributed application delivery

As Kubernetes has developed, its ecosystem has kept improving, and the container and Kubernetes ecosystems have established a series of application delivery forms. They also expose too much complexity: the difficulty of technology selection, maintenance, and management has increased significantly.

Meanwhile, users' focus has been shifting left. In the early days, more attention was paid to infrastructure, network hardware, and so on. Now it is no longer necessary to pay attention to the infrastructure layer; users can focus on their own business.

Self-built Kubernetes clusters mainly have the following three problems:

First, resource consumption. For small and medium-sized users, maintaining a cluster brings significant cost. For example, a highly available operator needs three replicas, and monitoring and logging must be set up around those replicas. In addition, the cluster itself takes up resources, and when business traffic fluctuates, nodes are easily underutilized and resources are wasted.

Second, complex operation and maintenance. It is necessary not only to understand the architecture of the Kubernetes cluster, but also to select technologies for add-ons (such as logging and monitoring) so that problems in the production environment can be located and resolved quickly. Troubleshooting also adds workload for the operations engineers.

Third, the cognitive burden of use. The Kubernetes API has two sides. On the one hand, it is implementation oriented and is the basis on which operators run; on the other hand, it is usage oriented, through tools such as kubectl, kustomize, and helm. A bare Kubernetes has about 100 resource types, but users typically work with no more than 10 of them.

3. SAE's Solution

The Serverless architecture has become a trend; it is increasingly accepted and genuinely improves the user experience. Serverless refers not only to function computing, but to any operating model in which the provisioning, scaling, monitoring, and configuration of computing infrastructure are all entrusted to the platform.

Building Kubernetes yourself is like assembling a custom PC: you have to choose the graphics card, CPU, memory, and so on, often over several attempts, and the cost of trial and error is high. Using a one-stop publishing platform is like choosing an all-in-one machine: you do not need to care about every detail, and it works out of the box. Such a platform, however, has to be built on experience accumulated at large scale. In addition, Alibaba has modified and adapted some open source components to provide a Serverless experience.

SAE is the "all-in-one machine" mentioned above - a one-stop application publishing platform.

The SAE platform is implemented on Kubernetes. To handle compatibility with the infrastructure, as well as the adaptation of the add-ons that applications need, it is delivered as an integrated version that gives users a better experience.

SAE provides a Serverless experience. Users do not need to pay attention to the infrastructure; they hand it over to the cloud platform and gain greater elasticity and room to scale. In addition, SAE provides fast elasticity and flexible scaling policies. Thanks to its deep integration with the cloud, it supports Serverless features such as second-level elastic scaling and pay-as-you-go billing in traffic burst scenarios, reliably safeguards user SLAs, and supports a rich set of elastic policies.

4. SAE Technical Principles

First, Kubernetes is integrated with secure containers. An ordinary container shares the host kernel, whereas with a secure container each container effectively has its own kernel. This ensures that kernel vulnerabilities or privilege escalation in one application cannot lead to cross-container intrusion into another.

Community secure container solutions include Kata, Firecracker, and gVisor. A secure container provides not only security isolation but also performance isolation and fault isolation: a kernel problem caused by one application does not bring down every application on the machine.

SAE provides lossless offline (graceful shutdown) for microservices.

All Kubernetes interfaces are asynchronous. By default, a service instance is removed from the registry only after its pod has been terminated, so traffic still routed to it is cut off. SAE instead implements active deregistration on top of Kubernetes' own preStop capability. Once the pod termination signal arrives, preStop is called first to remove the instance from the registry so that it stops receiving traffic, and the application is not interrupted while going offline.
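The ordering that makes the shutdown lossless (deregister first, drain in-flight requests, then stop) can be sketched as follows. This is a toy model under stated assumptions, not SAE's implementation: `Registry`, `Instance`, and the `events` log are hypothetical names, and draining is simulated by counting down a request counter rather than waiting on real connections.

```python
class Registry:
    """Hypothetical in-memory service registry."""

    def __init__(self) -> None:
        self.instances: set[str] = set()

    def register(self, address: str) -> None:
        self.instances.add(address)

    def deregister(self, address: str) -> None:
        self.instances.discard(address)


class Instance:
    """Hypothetical service instance that shuts down gracefully."""

    def __init__(self, address: str, registry: Registry) -> None:
        self.address = address
        self.registry = registry
        self.in_flight = 0          # Requests currently being processed.
        self.running = True
        self.events: list[str] = []  # Records the order of shutdown steps.
        registry.register(address)

    def pre_stop(self) -> None:
        """Leave the registry first so no new traffic is routed here."""
        self.registry.deregister(self.address)
        self.events.append("deregistered")

    def terminate(self) -> None:
        """Deregister, drain in-flight requests, and only then stop."""
        self.pre_stop()
        while self.in_flight > 0:
            # Stand-in for waiting until a real request completes.
            self.in_flight -= 1
            self.events.append("request finished")
        self.running = False
        self.events.append("stopped")
```

If the order were reversed, with the process stopped before it left the registry, callers would keep sending requests to an address that no longer answers.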

For monitoring and diagnosis, Alibaba Cloud products such as ARMS are integrated. After an application is published, monitoring is enabled automatically, and service calls and function call paths can be viewed with one click in the console.

SAE provides local-cloud joint debugging.

Users can use a local IDE with the Alibaba Cloud plug-ins, combined with a jump server, to debug local code jointly with services in the cloud, which improves the efficiency of local development.

In the SAE scenario, the user is not aware of the cluster and only needs to create a namespace and set its VPC and vSwitch. After an application is deployed to a namespace, the pods of the application are also deployed into that VPC. There is no need to build a Kubernetes cluster, and management can still be done at namespace granularity.

The operations-free capability of Serverless is application oriented: applications are deployed with one click, with no need to pay attention to the underlying infrastructure resources. Billing is by instance, and operation and maintenance tasks are performed per application. The logging and monitoring systems that would have to be set up on ECS are already integrated.

The extreme elasticity of Serverless brings a better user experience and lower costs. In the face of dramatic traffic changes, it can scale within 10-15s.

The figure above shows the native K8S Deployment upgrade strategy, which rebuilds pods. SAE implements an in-place upgrade strategy. When only the image is updated, the pod does not need to be rescheduled; the old container is destroyed in place and a new one is started from the new image. This enables rapid startup and improves deployment efficiency by 42%.
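The decision "only the image changed" can be sketched as a comparison of two specs. The function name, the flat dictionary layout, and the `"image"` key are assumptions for illustration; the article does not describe SAE's actual criteria beyond the image-only condition.

```python
def upgrade_strategy(old_spec: dict, new_spec: dict) -> str:
    """Pick an upgrade path by comparing two (hypothetical) flat specs.

    Returns "none" if nothing changed, "in-place" if only the image
    differs, and "recreate" if any other field differs.
    """
    if old_spec == new_spec:
        return "none"
    # Compare everything except the image field.
    old_rest = {k: v for k, v in old_spec.items() if k != "image"}
    new_rest = {k: v for k, v in new_spec.items() if k != "image"}
    return "in-place" if old_rest == new_rest else "recreate"
```

For example, changing only the image tag yields `"in-place"`, while also changing the CPU request yields `"recreate"`, because the existing placement may no longer fit.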

In addition, SAE implements an image warm-up strategy. When a pod starts, a large part of the time is spent pulling the image. Pulling the user's image can, however, overlap with part of the scheduling process, so it does not have to wait until the network container has been created. Instead, the image can be loaded onto a node in advance, which shortens the pull and helps the container start faster. Elastic scaling efficiency is improved by 30%.
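Why overlapping the pull with scheduling helps can be shown with a simple timing model. This is a sketch under assumptions: the four phases and the rule that a prefetched pull runs fully in parallel with scheduling and network container setup are simplifications, and any durations passed in are hypothetical inputs, not SAE measurements.

```python
def startup_ms(schedule: int, sandbox: int, pull: int, start: int,
               prefetch: bool) -> int:
    """Estimate pod startup time in milliseconds (simplified model).

    schedule: time to pick a node
    sandbox:  time to create the network container
    pull:     time to pull the user's image
    start:    time to start the application container
    prefetch: whether the pull overlaps with schedule + sandbox
    """
    if prefetch:
        # The pull runs in parallel; only the longer branch is on the
        # critical path before the container can start.
        return max(schedule + sandbox, pull) + start
    # Without warm-up, every phase runs one after another.
    return schedule + sandbox + pull + start
```

In this model the saving is bounded by the shorter of the two overlapped branches, so warm-up helps most when the image pull dominates startup.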

SAE accelerates startup for different languages.

Taking Java as an example, we have optimized Dragonwell and integrated it into SAE to enhance AppCDS startup acceleration. With multiple instances, for example, the first instance may take the normal time to start, but when further instances start or scale out, AppCDS can greatly reduce their cold start time.

At the same time, Java runtime acceleration is achieved with Dragonwell's Wisp capability, which converts ordinary Java threads into coroutines and improves running efficiency by 20%.


Q: Why does Function Compute choose lightweight containers?

A: In its earliest form, Function Compute had only one model, single instance single request, which means an instance processes only one request at a time; once it finishes, it can process the next request that lands on it. At present, the underlying isolation of Function Compute is a Micro VM rather than traditional Docker. Many customers like the elasticity of Function Compute but cannot accept cold starts, so they want to avoid them through single instance multi request. With multiple requests per instance, elastic scaling is driven by the load level: as long as the load has not reached the scale-out threshold, concurrent requests are served by the existing instance, so multiple requests run in the same Micro VM without per-request isolation.

Python and Node.js support single instance multi concurrency, but this only reduces cold starts. Request processing itself is still a one-at-a-time model, so there is no true multi request processing. Others, such as the Java and Go ecosystems, can use coroutine scheduling or multithreading to process multiple requests at the same time. However, in that runtime model, adding another layer of isolation at the request level would incur very high overhead.

In addition, in the single instance multithreading case, a panic in one thread may bring down the entire instance. At present, there is no good solution to this problem.
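The effect of per-instance concurrency on cold starts, discussed in the answer above, can be made concrete with a small calculation. The function names and the rule that scale-out happens exactly when in-flight requests exceed total capacity are assumptions for illustration, not the platform's actual scaling policy.

```python
def instances_needed(in_flight: int, concurrency: int = 1) -> int:
    """Instances required to serve in_flight concurrent requests.

    concurrency is the number of requests one instance may process at
    once; 1 corresponds to the single instance single request model.
    """
    if in_flight < 0 or concurrency < 1:
        raise ValueError("in_flight must be >= 0 and concurrency >= 1")
    # Integer ceiling division.
    return (in_flight + concurrency - 1) // concurrency


def cold_starts(current_instances: int, in_flight: int,
                concurrency: int = 1) -> int:
    """New instances (and therefore cold starts) a burst would trigger."""
    return max(0, instances_needed(in_flight, concurrency) - current_instances)
```

With one warm instance and five concurrent requests, the single request model triggers four cold starts, while an instance allowed ten concurrent requests triggers none. The price is that those requests share one Micro VM.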
