
Use LifseaOS to Experience the Minute-Level Scale-Out of Thousands of ACK Nodes

This article explains how LifseaOS speeds up node scale-out in ACK clusters.

By Alibaba Cloud ACK and Operating System Team

LifseaOS was released in 2021. It is a vertically optimized operating system distribution for cloud-native scenarios (known as ContainerOS in the industry). When it was initially released, it provided the following outstanding features: Lightweight, Fast, Secure, and Atomic.

LifseaOS is now widely used in the managed node pools of Alibaba Cloud Container Service for Kubernetes Pro. After more than a year of refinement and optimization, LifseaOS is showing more unique advantages in Kubernetes-based cloud-native clusters. This article focuses on one of them: fast node scale-out.

Start the OS in Two Seconds and Scale Out Thousands of Nodes in Minutes

Elastic scaling is a basic, critical, and heavily relied-upon capability in ACK clusters, especially when the cluster resource pool is tight or insufficient. Scaling out nodes quickly to restore available resources is important to your business.

In node auto scaling scenarios, ACK uses component polling to check whether the resources in the cluster are sufficient. If they are not, ACK automatically scales out nodes. Slow node scale-out seriously weakens auto scaling, and a prolonged resource shortage affects user workloads. Currently, node scale-out accounts for more than 90% of the end-to-end auto scaling time of ACK nodes, so optimizing node scale-out speed largely determines the auto scaling experience.
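To see why that share matters, here is a minimal sketch of the arithmetic. The 0.9 share is the lower bound stated above; the 100s total and the 3x speedup are hypothetical values chosen only for illustration, and `end_to_end_time` is not an ACK function.

```python
def end_to_end_time(total_s, node_share, node_speedup):
    """Estimate the new end-to-end time when only node scale-out gets faster.

    total_s:      current end-to-end auto scaling time in seconds
    node_share:   fraction of total_s spent on node scale-out (0..1)
    node_speedup: factor by which node scale-out is accelerated (> 0)
    """
    node_part = total_s * node_share
    other_part = total_s - node_part   # scheduling, polling, and so on
    return other_part + node_part / node_speedup


# Hypothetical example: 100s end to end, 90% of it node scale-out, made 3x faster.
new_total = end_to_end_time(100.0, 0.9, 3.0)
print(f"{new_total:.1f}s")  # roughly 40s, down from 100s
```

Because the remaining 10% is untouched, the total can never drop below that part, which is why the node scale-out portion is the one worth optimizing.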

Let’s take the scale-out history of a quantitative company as an example. Over a long period of service, it has run thousands of scale-out activities, each at the scale of hundreds of nodes. The average P90 node-ready time of these activities (the time from the start of a single scale-out activity until 90% of its nodes are in the ready state) is 162s, which is a long time. If the duration of a scale-out activity can be reduced to less than one minute, the node scale-out experience improves significantly.
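The P90 node-ready metric described above can be computed as in the following sketch. The function names and the sample timings are hypothetical; they are not taken from ACK or from the customer's data.

```python
from statistics import mean


def p90_ready_time(ready_seconds):
    """Seconds from the start of a scale-out until 90% of its nodes are ready.

    ready_seconds: for each node, the time from the start of the
    scale-out activity until that node reached the ready state.
    """
    if not ready_seconds:
        raise ValueError("a scale-out activity needs at least one node")
    ordered = sorted(ready_seconds)
    needed = (9 * len(ordered) + 9) // 10  # ceil(0.9 * n) in integer math
    return ordered[needed - 1]


def average_p90(activities):
    """Average the P90 node-ready time over several scale-out activities."""
    return mean(p90_ready_time(a) for a in activities)


# Hypothetical measurements for two small scale-out activities (seconds).
activities = [
    [40, 45, 50, 55, 60, 70, 80, 90, 150, 300],
    [35, 42, 48, 52, 58, 66, 75, 88, 170, 280],
]
print(average_p90(activities))
```

Note that the slowest 10% of nodes do not affect the metric, so a few stragglers will not distort the comparison between operating systems.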

LifseaOS allows you to scale out node pools in ACK clusters at high speed.

On the one hand, LifseaOS significantly improves OS startup speed by simplifying the startup process. It cuts out a large number of hardware drivers that are not needed in cloud scenarios, builds the necessary kernel driver modules into the kernel, removes initramfs, and simplifies the udev rules. As a result, the first startup time of the OS drops from more than one minute on a traditional OS to about two seconds.

On the other hand, LifseaOS is customized and optimized for ACK scenarios. Container images required for cluster management and control are preset, which reduces the time spent pulling images during node startup. The management and control path of Container Service for Kubernetes is also optimized, for example by adjusting the detection frequency of key logic and the throttling values at system bottlenecks under high load. This significantly improves the speed of node scale-out.

Compared with traditional OS solutions, LifseaOS has a clear advantage in node elastic scaling scenarios, and the advantage grows as more nodes start concurrently. As shown in the following figure, when 1,000 nodes are started concurrently, the P90 node-ready time on CentOS is 380s, while LifseaOS needs only 53s, which makes it possible to scale out thousands of nodes in minutes.

[Figure: P90 node-ready time when 1,000 nodes start concurrently, CentOS vs. LifseaOS]

Reduced OS O&M Burden and Deep Optimization for Container Scenarios

ContainerOS (based on LifseaOS) can be used in ACK node pools. Compared with a traditional Linux OS, ContainerOS is deeply optimized for container scenarios and is more secure, more lightweight, faster to start, and based on immutable images.

  • Lightweight

LifseaOS integrates containerd and Kubernetes by default. It retains only the system services and packages that Kubernetes pods require. Compared with traditional operating systems (Alibaba Cloud Linux 2/3, CentOS), the number of packages is reduced by 60% and the image size is reduced by 70%.

  • Fast

LifseaOS cuts out a large number of hardware drivers that are not needed in cloud scenarios. The necessary kernel driver modules are built into the kernel, initramfs is removed, and the udev rules are simplified. Together, these changes reduce the OS startup time from about 1min on a traditional OS to about 2s.

  • Secure

The root file system of LifseaOS is read-only. Only the /etc and /var directories are writable, to meet basic system configuration requirements. The sshd service and Python support are removed, which reduces exposure to threats such as sshd CVE vulnerabilities. Routine OS operation and maintenance is performed through APIs, which reduces the stability and security risks of users logging on to the system and running command-line operations that may not be traceable. For urgent O&M needs, LifseaOS still provides a dedicated O&M container for logging on to the system. This container is not enabled by default and must be started on demand through an API.

  • Atomic Management

LifseaOS does not support installing, upgrading, or uninstalling rpm packages. Instead, it uses OSTree to manage OS image versions. When packages or persisted configuration on the system are updated, the update (or rollback) is applied at the granularity of the entire image, which keeps package versions and system configuration as consistent as possible across nodes.
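The image-granularity update and rollback described under Atomic Management can be pictured with the toy model below. It is only a conceptual sketch: `OSImage` and `Node` are hypothetical names, the package names and versions are made up, and nothing here calls OSTree or reflects how LifseaOS is implemented.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OSImage:
    """An immutable OS image: a version plus the exact packages it ships."""
    version: str
    packages: Tuple[Tuple[str, str], ...]  # (name, version) pairs


@dataclass
class Node:
    name: str
    current: OSImage
    previous: Optional[OSImage] = None

    def upgrade(self, image: OSImage) -> None:
        """Switch to a new image as a whole and keep the old one for rollback."""
        self.previous, self.current = self.current, image

    def rollback(self) -> None:
        """Return to the previous image as a whole."""
        if self.previous is None:
            raise RuntimeError("no previous image to roll back to")
        self.current, self.previous = self.previous, None


# Hypothetical images; there is no per-package install or upgrade path.
v1 = OSImage("v1", (("pkg-a", "1.0"), ("pkg-b", "2.0")))
v2 = OSImage("v2", (("pkg-a", "1.1"), ("pkg-b", "2.0")))

nodes = [Node(f"node-{i}", v1) for i in range(3)]
for node in nodes:
    node.upgrade(v2)

# Every node runs exactly the same package set after the upgrade.
print(len({node.current for node in nodes}) == 1)
```

Because a node can only move between whole images, two nodes on the same image version cannot drift apart in package versions the way nodes managed with per-package installs can.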


Combined with the automatic management capabilities of ACK managed node pools, including quick node CVE repair, node self-healing, and automatic image upgrade, ContainerOS further reduces the OS O&M burden on users and lets them pay more attention to upper-layer applications.

Get Started with LifseaOS

LifseaOS is a Linux-based operating system optimized for container scenarios. It is deeply integrated with ACK and allows you to quickly scale out clusters.

LifseaOS is available in managed node pools of ACK Pro 1.24.6 and later. It is listed as ContainerOS, as shown in the following figure:

[Figure: ContainerOS option for a managed node pool]

You can go to the ACK console to create a managed node pool based on LifseaOS and start using LifseaOS.

The OpenAnolis community has set up a special interest group for ContainerOS, and LifseaOS-related code will be contributed to the community. Please visit the link for more information: https://openanolis.cn/sig/container-os
