By Zhao Mingshan (Liheng)
OpenKruise is an open source management suite developed by Alibaba Cloud for cloud native application automation. It is currently a Sandbox project hosted under the Cloud Native Computing Foundation (CNCF). Based on years of Alibaba's experience in container and cloud native technologies, OpenKruise is a Kubernetes-based standard extension component that has been widely used in the Alibaba internal production environment, together with technical concepts and best practices for large-scale Internet scenarios.
OpenKruise released v0.8.0 (changelog) on March 4, 2021, with enhanced SidecarSet capabilities, especially for log management of Sidecar.
Sidecar is a very important cloud native container design mode. It can create an independent Sidecar container by separating the auxiliary capabilities from the main container. In microservice architectures, the Sidecar mode is also used to separate general capabilities such as configuration management, service discovery, routing, and circuit breaking from main programs, thus making the microservice architectures less complicated. Since the popularity of Service Mesh has led to the prevalence of the Sidecar mode, the Sidecar mode has also been widely used within Alibaba Group to implement common capabilities such as O&M, security, and message-oriented middleware.
In Kubernetes clusters, pods can not only support the construction of main containers and Sidecar containers, but also many powerful workloads, such as deployment and statefulset to manage and upgrade the main containers and Sidecar containers. However, with the ever-growing businesses in Kubernetes clusters day by day, there have also been various Sidecar containers with a larger scale. Therefore, management and upgrades of online Sidecar containers are more complex:
Alibaba Group has millions of containers with thousands of businesses. Therefore, the management and upgrades of Sidecar containers have become a major target for improvement. To this end, many internal requirements for the Sidecar containers have been summarized and integrated into OpenKruise. Finally, these requirements were abstracted as SidecarSet, a powerful tool to manage and upgrade a wide range of Sidecar containers.
SidecarSet is an abstracted concept for Sidecar from OpenKruise. As one of the core workloads of OpenKruise, it is used to inject and upgrade the Sidecar containers in Kubernetes clusters. SidecarSet provides a variety of features so that users can easily manage Sidecar containers. The main features are as follows:
Note: For a Pod that contains multiple container modes, the container that provides the main business logic to the external is the main container. Other containers provide auxiliary capabilities such as log collection, security, and proxy are Sidecar containers. For example, if a pod provides web capabilities outward, the nginx container that provides major web server capabilities is the main container. The logtail container is the Sidecar container that is responsible for collecting and reporting nginx logs. The SidecarSet resource abstraction in this article also solves some problems of the Sidecar containers.
Application logs allow you to see the internal running status of your application. Logs are useful for debugging problems and monitoring cluster activities. After the application is containerized, the simplest and most widely used logging is to write standard output and errors.
However, in the current distributed systems and large-scale clusters, the above solution is not enough to meet the production environment standards. First, for distributed systems, logs are scattered in every single container, without a unified place for congregation. Logs may be lost in scenarios such as container crashes and Pod eviction. Therefore, there is a need for a more reliable log solution that is independent of the container lifecycle.
Sidecar logging architectures places the logging agent in an independent Sidecar container to collect container logs by sharing the log directory. Then, the logs are stored in the back-end storage of the log platform.
This architecture is also used by Alibaba and Ant Group to realize the log collection of containers. Next, this article will explain how OpenKruise SidecarSet helps a large-scale implementation of the Sidecar log architecture in Kubernetes clusters.
OpenKruise SidecarSet has implemented automatic Sidecar container injection based on Kubernetes AdmissionWebhook mechanism. Therefore, as long as the Sidecar is configured in SidecarSet, the defined Sidecar container will be injected into the scaled pods with any deployment patterns, such as CloneSet, Deployment, or StatefulSet.
The owner of Sidecar containers only needs to configure SidecarSet to inject the Sidecar containers without affecting the business. This greatly reduces the threshold for using the Sidecar containers, and facilitates the management of Sidecar owners. In addition to containers, SidecarSet also extends the following fields to meet various scenarios of Sidecar injection:
# sidecarset.yaml
apiVersion : apps.kruise.io/v1alpha1
Kind : SidecarSet
metadata:
name: test-sidecarset
spec:
# Select Pods through the selector
selector:
matchLabels:
app: web-server
# Specify a namespace to take effect
namespace: ns-1
# container definition
containers:
- name: logtail
image: logtail:1.0.0
# Share the specified volume
volumeMounts:
- name: web-log
mountPath: /var/log/web
# Share all volumes
shareVolumePolicy: disabled
# Share environment variables
transferEnv:
- sourceContainerName: web-server
# TZ indicates the time zone, for example, the environment variable TZ = Asia/Shanghai in the web-server container
envName: TZ
volumes:
- name: web-log
emptyDir: {}
Use transferEnv to obtain environment variables from other containers, which copies the environment variable named envName in the sourceContainerName container to the current Sidecar container. In the above example, the Sidecar container of logs shares the time zone of the main container, which is especially common in overseas environments.
Note: The number of containers for the created Pods cannot be changed in the Kubernetes community. Therefore, the injection capability described above can only occur during the Pod creation phase. For the created Pods, Pod reconstruction is required for injection.
SidecarSet not only allows to inject the Sidecar containers, but also reuses the in-place update feature of OpenKruise. This realizes the upgrade of the Sidecar containers without restarting the Pod and the main container. Since this upgrade method does not affect the business, upgrading the Sidecar containers is no longer a pain point. Thus, it brings a lot of conveniences to Sidecar owners and speeds up the Sidecar version iteration.
Note: Only the modification on the container.image fields for the created Pods is allowed by the Kubernetes community. Therefore, the modification on other fields of Sidecar containers requires the reconstruction of Pod, and the in-place upgrade is not supported.
To meet the requirements in some complex Sidecar upgrade scenarios, SidecarSet provides the in-place upgrade and a wide range of gray release strategies.
Gray release is a common method that allows a Sidecar container to be released smoothly. It is highly recommended that this method is used in large-scale clusters. Here is an example of Pod's rolling release based on the maximum unavailability after the first batch of Pod release is suspended. Suppose that there are 1,000 Pods to be released:
apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
name: sidecarset
spec:
# ...
updateStrategy:
type: RollingUpdate
partition: 980
maxUnavailable: 10%
The configuration above is suspended after the former release of 20 pods (1000 – 980 = 20). After the Sidecar container has been normal for a period of time in business, adjust the update SidecarSet configuration:
apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
name: sidecarset
spec:
# ...
updateStrategy:
type: RollingUpdate
maxUnavailable: 10%
As such, the remaining 980 Pods will be released in the order of the maximum unavailable numbers (10% * 1000=100) until all Pods are released.
Partition indicates that the number or percentage of Pods of the old version is retained, with the default value of 0. Here, the partition does not represent any order number. If the partition is set up during the release process:
MaxUnavailable indicates the maximum unavailable number of pods at the same time during the release, with the default value of 1. Users can set the MaxUnavailable value as absolute value or percentage. The percentage is used by the controller to calculate the absolute value based on the number of selected pods.
Note: The values of maxUnavailable and partition are not necessarily associated. For example:
For businesses that require canary release, strategy.selector can be considered as a choice. The solution is to mark the labels[canary.release] = true into the Pods that require canary release, and then use strategy.selector.matchLabels to select the pods.
apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
name: sidecarset
spec:
# ...
updateStrategy:
type: RollingUpdate
selector:
matchLabels:
- canary.release: true
maxUnavailable: 10%
The above configuration only releases the containers marked with canary labels. After the canary verification is completed, rolling release is performed based on the maximum unavailability by removing the configuration of updateStrategy.selector.
The upgrade sequence of pods in SidecarSet is subject to the following rules by default:
In addition to the above default release order, the scatter release policy allows users to scatter the pods that match certain labels to the entire release process. For example, for a global sidecar container like logtail, dozens of business pods may be injected into a cluster. Thus, the logtail can be released after being scattered based on the application name, realizing scattered and gray release among applications. And it can be performed together with the maximum unavailability.
apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
name: sidecarset
spec:
# ...
updateStrategy:
type: RollingUpdate
# Configure Pod labels and assume all pods contain the labels[app_name]
scatterStrategy:
- key: app_name
value: nginx
- key: app_name
value: web-server
- key: app_name
value: api-gateway
maxUnavailable: 10%
Note: In the current version, all application names must be listed. In the next version, an intelligent scattered release will be supported with only the label key configured.
SidecarSet has been widely used by Alibaba Group and Ant Group to manage Sidecar containers. The following is an example of log collection by the Logtail Sidecar.
1. Create SidecarSet resources based on the SidecarSet.yaml configuration file:
# sidecarset.yaml
apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
name: logtail-sidecarset
spec:
selector:
matchLabels:
app: nginx
updateStrategy:
type: RollingUpdate
maxUnavailable: 10%
containers:
- name: logtail
image: log-service/logtail:0.16.16
# when recevie sigterm, logtail will delay 10 seconds and then stop
command:
- sh
- -c
- /usr/local/ilogtail/run_logtail.sh 10
livenessProbe:
exec:
command:
- /etc/init.d/ilogtaild
- status
resources:
limits:
memory: 512Mi
requests:
cpu: 10m
memory: 30Mi
##### share this volume
volumeMounts:
- name: nginx-log
mountPath: /var/log/nginx
transferEnv:
- sourceContainerName: nginx
envName: TZ
volumes:
- name: nginx-log
emptyDir: {}
2. Create a Pod based onpod.yaml:
apiVersion: v1
kind: Pod
metadata:
labels:
# matches the SidecarSet's selector
app: nginx
name: test-pod
spec:
containers:
- name: nginx
image: log-service/docker-log-test:latest
command: ["/bin/mock_log"]
args: ["--log-type=nginx", "--stdout=false", "--stderr=true", "--path=/var/log/nginx/access.log", "--total-count=1000000000", "--logs-per-sec=100"]
volumeMounts:
- name: nginx-log
mountPath: /var/log/nginx
envs:
- name: TZ
value: Asia/Shanghai
volumes:
- name: nginx-log
emptyDir: {}
3. It can be found that this Pod creation process is injected with the logtail container:
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
test-pod 2/2 Running 0 118s
$ kubectl get pods test-pod -o yaml |grep 'logtail:0.16.16'
image: log-service/logtail:0.16.16
4. The SidecarSet status is updated as follows:
$ kubectl get sidecarset logtail-sidecarset -o yaml | grep -A4 status
status:
matchedPods: 1
observedGeneration: 1
readyPods: 1
updatedPods: 1
5. Update the image logtail: 0.16.18 of the Sidecar container in the SidecarSet:
$ kubectl edit sidecarsets logtail-sidecarset
# sidecarset.yaml
apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
name: logtail-sidecarset
spec:
containers:
- name: logtail
image: log-service/logtail:0.16.18
6. The logtail container in the Pod is updated to logtail: 0.16.18, and the Pod and other containers are not restarted:
$ kubectl get pods |grep test-pod
test-pod 2/2 Running 1 7m34s
$ kubectl get pods test-pod -o yaml |grep 'image: logtail:0.16.18'
image: log-service/logtail:0.16.18
$ kubectl describe pods test-pod
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Killing 5m47s kubelet Container logtail definition changed, will be restarted
Normal Pulling 5m17s kubelet Pulling image "log-service/logtail:0.16.18"
Normal Created 5m5s (x2 over 12m) kubelet Created container logtail
Normal Started 5m5s (x2 over 12m) kubelet Started container logtail
Normal Pulled 5m5s kubelet Successfully pulled image "log-service/logtail:0.16.18"
In the OpenKruise v0.8.0, the SidecarSet has been improved in terms of log management in Sidecar scenarios. In the later exploration of the stability and performance of SidecarSet, more scenarios will be covered at the same time. For example, Service Mesh scenario will be supported in the next version. Moreover, more people are welcomed to participate OpenKruise community to improve the application management and delivery extensibility of Kubernetes for scenarios featuring large scale, complexity and extreme performance.
ChaosBlade x SkyWalking: High Availability Microservices Practices
507 posts | 48 followers
FollowAlibaba Cloud Community - February 20, 2024
Alibaba Cloud Native Community - November 19, 2024
Alibaba Cloud Native Community - December 29, 2023
Alibaba Clouder - December 3, 2020
Alibaba Developer - October 13, 2020
Alibaba Cloud Native Community - May 4, 2023
507 posts | 48 followers
FollowAccelerate and secure the development, deployment, and management of containerized applications cost-effectively.
Learn MoreAlibaba Cloud Container Service for Kubernetes is a fully managed cloud container management service that supports native Kubernetes and integrates with other Alibaba Cloud products.
Learn MoreProvides a control plane to allow users to manage Kubernetes clusters that run based on different infrastructure resources
Learn MoreAlibaba Cloud Function Compute is a fully-managed event-driven compute service. It allows you to focus on writing and uploading code without the need to manage infrastructure such as servers.
Learn MoreMore Posts by Alibaba Cloud Native Community