Comprehensive Analysis of Kubernetes Log Collection Principles

Introduction

As the de facto standard for container orchestration, Kubernetes (K8s) is used in more and more scenarios. Logs are an important part of observability: they record detailed access requests and error information, which is very helpful for troubleshooting. Applications on Kubernetes, the Kubernetes components themselves, the hosts, and more all generate various types of log data, and SLS fully supports collecting and analyzing this data. This article introduces the basic principles of Kubernetes log collection with SLS so that you can better plan its use in practice.

Kubernetes log collection methods

In K8s, log collection generally falls into two modes: DaemonSet and Sidecar. DaemonSet is generally recommended for small and medium-sized clusters; Sidecar is recommended for very large clusters (clusters that serve multiple business parties, each with its own customized log collection requirements, where the number of collection configurations exceeds 500).

• In DaemonSet mode, only one log agent runs on each node and collects all logs on that node. DaemonSet uses far fewer resources, but its scalability and tenant isolation are limited; it is better suited to clusters with a single purpose or few businesses.

• In Sidecar mode, a separate log agent is deployed for each Pod, responsible only for that one application's logs. Sidecar consumes more resources but is flexible and provides multi-tenant isolation; it is recommended for large K8s clusters, or clusters that serve multiple business parties as a PaaS platform.

SLS log collection principle

The log collection process usually consists of three parts: deploying the agent, configuring it, and letting the agent work according to the configuration. SLS follows essentially the same process, while offering more deployment, configuration, and collection options than open-source collectors. In K8s, the deployment methods are mainly DaemonSet and Sidecar; the configuration methods include CRD, environment-variable, console, and API configuration; and collection supports container files, container standard output, and standard host files. The whole process includes:

1. Deploy Logtail: DaemonSet and Sidecar deployment methods are supported (the first part introduces the collection principles of these two methods).

2. Create a collection configuration: CRD, environment-variable, console, and API configuration are supported (the second part introduces the advantages, disadvantages, and applicable scenarios of each).

3. Logtail collects data according to the configuration: Logtail obtains the collection configuration created in step 2 and works according to its contents.

• Note: Both the default DaemonSet and Sidecar modes use custom-identifier machine groups, which suit node/container auto-scaling scenarios. A collection configuration takes effect only after it is attached to a machine group.

• Note: Logtail obtains its collection configuration from the server. Once Logtail connects to the server, it automatically syncs the configurations associated with its machine group to the local node and starts working.

DaemonSet Log Collection Principle

DaemonSet (see the DaemonSet documentation), Deployment, and StatefulSet are all higher-level orchestration mechanisms (Controllers) for Pods in Kubernetes. Deployment and StatefulSet define a number of replicas, and K8s schedules Pods according to that count. A DaemonSet, however, does not specify a replica count: by default it starts one Pod on each node, which makes it well suited to O&M work such as log collection, monitoring, and disk cleanup. For this reason, DaemonSet is the default recommended deployment method for Logtail.

In DaemonSet mode, Logtail is installed in the kube-system namespace by default, as a DaemonSet named logtail-ds. The Logtail Pod on each node is responsible for collecting the data (standard output and files) of all running Pods on that node.

• You can check the status of the logtail-ds Pods with: kubectl get pod -n kube-system | grep logtail-ds

Prerequisites

Before Logtail can collect from other Pods/containers, it must be able to access the container runtime on the host and the data of the other containers:

• Access to the container runtime: the Logtail container mounts the socket of the container runtime (Docker Engine/containerd) from the host into its own file system, so it can communicate with the Docker Engine/containerd on its node.

• Access to other containers' data: the Logtail container mounts the host's root directory ('/') at the /logtail_host mount point inside the container, so it can reach other containers' data through /logtail_host (provided the container file system is stored on the host as a regular file system, typically overlayfs, or the container's log directory is mounted from the host, e.g. via hostPath or emptyDir).
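The two mounts above can be sketched as a DaemonSet excerpt. This is only an illustration: the image name and the Docker socket path are assumptions, not the exact manifest shipped by SLS.

```yaml
# Illustrative excerpt of a logtail DaemonSet Pod spec (image and paths
# are assumptions; the actual SLS-provided manifest may differ).
containers:
  - name: logtail
    image: registry.example.com/logtail:latest   # hypothetical image name
    volumeMounts:
      - name: runtime-sock
        mountPath: /var/run/docker.sock          # access to the container runtime
      - name: host-root
        mountPath: /logtail_host                 # host '/' visible inside the container
        readOnly: true
volumes:
  - name: runtime-sock
    hostPath:
      path: /var/run/docker.sock
  - name: host-root
    hostPath:
      path: /
```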

Workflow

After these two prerequisites are met, Logtail loads its collection configuration from the SLS server and starts working. For container log collection, the workflow after Logtail starts consists of two main parts:

1. Determine which containers to collect, which mainly includes:

A. Obtaining all containers and their configuration information from the container runtime (Docker Engine/containerd), such as container name, ID, mount points, environment variables, labels, etc.

B. Locating the containers to be collected according to the IncludeEnv, ExcludeEnv, IncludeLabel, and ExcludeLabel fields in the collection configuration. These matching rules let each configuration target exactly the containers it should collect; collecting all containers indiscriminately would waste resources and split data. For example, configuration 1 collects containers whose Env is Pre, configuration 2 collects containers whose APP is APP1, and configuration 3 collects containers whose Env is not Pre; the three configurations send their data to different Logstores.
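In a collection configuration these matching rules are expressed as key/value maps. A minimal sketch, matching the Pre/APP1 example above, is shown below; the field names follow the Logtail container-input schema as commonly documented, but treat the exact schema as an assumption and the values as illustrative.

```yaml
# Illustrative container-matching section of a Logtail collection configuration.
inputDetail:
  IncludeEnv:
    Env: Pre          # only containers whose Env variable equals "Pre"
  ExcludeLabel:
    app: debug        # skip containers carrying this label
```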

2. Collect the container data, which mainly includes:

A. Determining the data addresses of each container to be collected, i.e. the locations of its standard output and its files. This information comes mainly from the container's own configuration, for example the LogPath of its standard output and the storage paths of files inside it. Note:

ⅰ. Standard output: a container's standard output must be saved to a file before it can be collected. For Docker Engine and containerd this is controlled by the LogDriver, which can be set to json-file or local (the default configuration already saves standard output to a file, so in most cases no action is needed).

ⅱ. Container files: when the container file system is overlay, files inside every container can be located automatically through the UpperDir; however, when the storage driver is devicemapper, the UpperDir does not exist, and you must mount the log directory via hostPath or emptyDir so the corresponding host path can be found.

B. Collecting data from the corresponding addresses. Standard output is special: the standard-output file must be parsed to recover the user's actual output.

C. Parsing the raw logs according to the configured rules, including multiline merging (line-start regex), field extraction (regex, delimiter, JSON, anchor, etc.), filtering/discarding, and data masking.

D. Uploading the data to SLS.
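To illustrate why step B needs a parsing pass: with the json-file log driver, each line a container writes to stdout/stderr is wrapped in a JSON object on disk, so Logtail must unwrap the "log" field to recover the original line. The record below follows Docker's json-file format; the payload is an illustrative example.

```json
{"log": "POST /api/login 200\n", "stream": "stdout", "time": "2024-05-01T12:00:00.000000000Z"}
```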

Sidecar log collection principle

In K8s, a Pod can run multiple containers that share namespaces. We usually call the container doing the core work the main container and the others Sidecar containers. A Sidecar container generally plays an auxiliary role, using shared volumes to implement functions such as file synchronization, monitoring/log collection, and file cleanup. Logtail's Sidecar collection follows the same principle: besides the main business container, a Logtail Sidecar container runs in the Pod, and the two share the volume that holds the logs. The collection process is as follows:

1. The business container writes its logs to the shared volume (files only; standard output cannot be written to a shared volume)

2. Logtail monitors the log files on the shared volume, detects changes, and collects the new data to SLS
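A minimal sketch of such a Pod is shown below. The image names and log path are illustrative assumptions; the point is the emptyDir volume mounted into both containers.

```yaml
# Illustrative Sidecar Pod: the app and Logtail share one emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-logtail
spec:
  containers:
    - name: app                              # main business container
      image: registry.example.com/app:latest # hypothetical image
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app            # the app writes log files here
    - name: logtail                          # sidecar collector
      image: registry.example.com/logtail:latest
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app            # the same files, visible to Logtail
  volumes:
    - name: app-logs
      emptyDir: {}                           # shared volume holding the log files
```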

For best practices with Sidecar mode, see: Collecting Logs with a Sidecar and Kubernetes File Collection Practice: Sidecar + HostPath Volume.

Collection configuration principles

Logtail supports CRD, environment-variable, console, and API configuration. Our suggestions for the different scenarios are:

1. For users with strong CI/CD automated deployment and O&M requirements, use the CRD configuration method

2. For relatively static environments (infrequent releases, log collection policies that rarely change), the console configuration method is recommended

3. Advanced users can consider custom configuration via the API

4. The environment-variable method is the most limited and is generally not recommended

CRD (Operator) configuration mode

The Log Service adds a CustomResourceDefinition (CRD) named AliyunLogConfig to K8s and provides alibaba-log-controller, which listens for AliyunLogConfig events and automatically creates the corresponding Logtail collection configurations. When a user creates/deletes/modifies an AliyunLogConfig resource, alibaba-log-controller observes the change and creates/deletes/modifies the corresponding collection configuration in the Log Service, thereby keeping the AliyunLogConfig resources in K8s associated with the collection configurations in the Log Service.
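A minimal AliyunLogConfig resource might look like the sketch below. The apiVersion and field names follow the Alibaba Cloud documentation as commonly published, but treat the exact schema as an assumption and check the current CRD before use.

```yaml
# Illustrative AliyunLogConfig collecting container stdout into a Logstore.
apiVersion: log.alibabacloud.com/v1alpha1
kind: AliyunLogConfig
metadata:
  name: stdout-example
spec:
  logstore: app-stdout              # target Logstore (created if absent)
  logtailConfig:
    inputType: plugin               # stdout collection uses the plugin input
    configName: stdout-example
    inputDetail:
      plugin:
        inputs:
          - type: service_docker_stdout
            detail:
              Stdout: true
              Stderr: true
```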


Internal implementation of alibaba-log-controller

The alibaba-log-controller is composed of six modules. The functions and dependencies of each module are as follows:

• EventListener: responsible for monitoring AliyunLogConfig CRD resources. It is a listener in the broad sense; its main functions include:

• Listing all AliyunLogConfig resources during initialization

• Registering watchers for AliyunLogConfig change events

• Periodically rescanning the full set of AliyunLogConfig resources to guard against missed events or failed processing

• Packaging each event and submitting it to the EventHandler for processing

• EventHandler: responsible for handling Create/Update/Delete events. As the core module of the controller, its main functions are:

• First checking the corresponding checkpoint in ConfigMapManager; if the event has already been processed (same version number and status 200), it is skipped

• Pulling the latest resource state from the server to prevent stale events from interfering with the result; if the versions differ, the server version replaces the local one

• Pre-processing events so that they meet the basic format requirements of the Log SDK

• Calling LogSDKWrapper to create/update/delete the corresponding Log Service resources, such as the Logstore and the collection configuration

• Updating the status of the AliyunLogConfig resource according to the result of the above processing

• ConfigMapManager: relies on the K8s ConfigMap mechanism to implement checkpoint management for the controller, including:

• Maintaining the mapping between checkpoints and ConfigMaps

• Providing basic checkpoint create, delete, update, and query interfaces

• LogSDKWrapper: a wrapper around the Alibaba Cloud LOG Golang SDK, responsible for:

• Initializing and creating Log Service resources, including the Project, machine group, operation Logstore, etc.

• Converting CRD resources into the corresponding Log Service resource operations (a one-to-many relationship)

• Wrapping the SDK interfaces to automatically handle network exceptions, server exceptions, and permission exceptions

• Managing credentials, including automatically obtaining roles and refreshing STS tokens

• ScheduledSyner: a background periodic synchronization module that prevents events from being missed when configurations change during process/node failures, ensuring eventual consistency of configuration management:

• Periodically refreshing all checkpoints and AliyunLogConfig resources

• Checking the mapping between checkpoints and AliyunLogConfig resources; if a checkpoint refers to a configuration that no longer exists, the corresponding resources are deleted

• Monitor: in addition to writing its local run log to stdout, alibaba-log-controller ships its logs directly to the Log Service for remote troubleshooting. The collected log types are:

• K8s API internal exception logs

• alibaba-log-controller run logs

• alibaba-log-controller internal exception data (automatically aggregated)

Environment variable configuration method

Environment-variable configuration is the simplest method: when configuring a Pod, the user only needs to add environment variables starting with the special prefix aliyun_logs_ to define the collection configuration and have the data collected. Logtail implements this as follows:

1. Logtail obtains the list of all containers from the container runtime (Docker Engine/containerd)

2. For each running container, it checks whether the environment contains variables starting with aliyun_logs_

3. Each environment variable starting with aliyun_logs_ is mapped to an SLS Logtail collection configuration, and the SLS API is called to create that configuration

4. Logtail fetches the collection configuration from the server and starts working
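A minimal sketch of this convention is shown below. The aliyun_logs_{name} pattern follows the SLS documentation; the Logstore names, image, and paths are illustrative assumptions.

```yaml
# Illustrative Pod snippet: the suffix after "aliyun_logs_" becomes the
# Logstore name; the value is either "stdout" or a file path to collect.
containers:
  - name: app
    image: registry.example.com/app:latest   # hypothetical image
    env:
      - name: aliyun_logs_app-stdout
        value: stdout                        # collect this container's standard output
      - name: aliyun_logs_app-files
        value: /var/log/app/*.log            # collect matching files inside the container
```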

Recommended usage

There are many ways to implement K8s log collection, with different levels of complexity and effect. In general you need to choose both a collection method and a configuration method. We recommend the following:

• Collection method

• DaemonSet: better suited to clusters with a single purpose or few businesses. Keep the total number of collection configurations in the cluster under 500, otherwise Logtail will consume more resources.

• Sidecar: recommended for large K8s clusters or clusters serving multiple business parties as a PaaS platform. The typical threshold is more than 500 collection configurations.

• Mixed mode: use DaemonSet for container standard output, system logs, and part of the business logs; use Sidecar for Pods that require highly reliable log collection.

• Configuration method

• For users with strong CI/CD automated deployment and O&M requirements, use the CRD configuration method

• For relatively static environments (infrequent releases, rarely-changing log collection policies), the console configuration method is recommended

• Advanced users can consider custom configuration via the API
