A Detailed Explanation of Job Log Collection Schemes in the K8s Environment

Background

The rich set of controllers in K8s provides great convenience for container orchestration. For one-off and scheduled tasks, K8s provides the Job and CronJob controllers to meet the needs of non-resident container orchestration. Because of this non-resident nature, a task container's lifetime may be very short (for example, a task that periodically cleans up data), and some tasks even fail as soon as they start, which poses great challenges for collecting Job logs.

This article discusses various Job log collection schemes based on iLogtail, a high-performance, lightweight observability collector, and analyzes how these schemes ensure the stability of log collection in different scenarios and how they can be optimized.

Job Container Features

For convenience, this article refers to the business containers controlled by the Job controller as Job containers. Compared with other types of containers, Job containers have the following characteristics:

1. High frequency of creation and deletion: Job containers are often scheduled periodically or on demand and end when execution is completed, so their creation and deletion frequency is significantly higher than that of other types of containers.

2. Short life cycle: A Job container is expected to exit after its task is executed rather than run as a resident service, so its life cycle is relatively short. Some Jobs simply delete historical data, and their life cycle is only a few seconds.

3. Large bursts of concurrency: Job containers are often used to orchestrate batch tasks or test scenarios. Such scenarios often trigger the creation of a large number of Job container instances in an instant, accompanied by a large volume of logs.

Key Considerations for Job Log Collection Scheme Selection

The following three considerations are therefore critical when selecting a Job log collection scheme.

Container discovery speed: Job containers are frequently created and deleted. If container discovery is too slow, a container may already be destroyed before it is discovered, let alone before its data is collected.

Start-of-collection delay: The life cycle of a Job container may be very short. When a pod in K8s is destroyed, all container data under it is deleted. If collection does not start in time, the file handle cannot be held and the deleted file data can no longer be collected.

Elasticity support: The bursty, high-concurrency nature of Job containers makes them well suited to elastic resources for cost savings. The collection scheme is therefore expected to support elastic scaling.

At the same time, some general requirements for container log collection should also be taken into consideration when selecting solutions.

Resource overhead: Lower resource overhead means lower cost and also reduces the impact of log collection on the resources available to the business.

Meta information tagging: Meta information identifies the source of logs, and rich meta information makes logs easier to find and use.

Invasiveness: Invasiveness determines the development cost of log collection. Stronger invasiveness also increases the coupling between log collection and the business, bringing potential costs for future scheme changes and upgrades.

Comparison of iLogtail Container Collection Schemes

DaemonSet collection mode

The DaemonSet collection mode uses the K8s DaemonSet controller to deploy one iLogtail container on each node to collect logs from all containers on that node. In this deployment mode, iLogtail discovers all containers on the node by communicating with the container runtime through docker.sock or containerd.sock, obtains each container's standard output path and storage path from the returned container information, and collects data through host path mounts.

The advantages of this collection mode are obvious: only one collection container needs to be deployed on each node, independent of the number of application containers, which saves resources; complete container meta information can be obtained; and the application is unaware of the collection container, so there is no intrusion.

On the other hand, the DaemonSet mode is only average on the three key points closely related to Job log collection. When iLogtail is deployed through a DaemonSet, its container discovery mechanism relies on communicating with docker.sock or containerd.sock. Docker provides an EventListener mechanism to obtain container creation and destruction events in real time; containerd does not, so container creation and destruction can only be learned through polling. In the latest version, the polling interval is 1 second, so the delay for iLogtail to discover a container can be considered to be 1 second. From discovering the container to starting data collection, there is a further delay of 3-6 seconds. For stdout collection, the delay comes from the polling interval inside the stdout collection plugin; for container file collection, it comes from the polling interval of the docker_file plugin and the frequency limit on loading the latest container configuration in the C++ core. Therefore, apart from some time-consuming processing, the expected container log collection delay in DaemonSet mode is about 5-8 seconds. In terms of elasticity, iLogtail deployed by DaemonSet can support dynamic node scaling, but it cannot directly support elastic containers that have no physical node.
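For reference, a minimal sketch of what such a DaemonSet deployment could look like is shown below. The namespace, image name, and host paths are illustrative assumptions and would need to match the actual iLogtail distribution and container runtime in use.

```yaml
# Hypothetical DaemonSet sketch: one iLogtail collector per node, mounting the
# runtime socket and host directories so container logs can be read from the node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ilogtail-ds
  namespace: logging                                 # assumed namespace
spec:
  selector:
    matchLabels:
      app: ilogtail
  template:
    metadata:
      labels:
        app: ilogtail
    spec:
      containers:
        - name: ilogtail
          image: example-registry/ilogtail:latest    # placeholder image
          volumeMounts:
            - name: run                              # runtime socket (docker.sock / containerd.sock)
              mountPath: /var/run
            - name: host-root                        # container stdout logs and data directories on the host
              mountPath: /logtail_host
              readOnly: true
      volumes:
        - name: run
          hostPath:
            path: /var/run
        - name: host-root
          hostPath:
            path: /
```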

Sidecar collection mode

The Sidecar collection mode takes advantage of the fact that containers in the same K8s pod can share storage volumes: the business container and the collection container are deployed in the same pod in order to collect the business data. This mode requires the business container to mount the directory to be collected onto a volume shared with the collection container, and the collection container then collects the business container's logs as local files.

In essence, this is not very different from host-based collection, and container discovery is not a concern. At the same time, as long as the collection container has not exited, the pod remains in the Running state and the files on the shared storage volume are not deleted, so there is no need to worry about data loss caused by collection delay. Since it is deployed along with the business pod, the Sidecar mode also naturally supports various elastic scaling schemes, so it performs well on the three key points of Job container collection.

However, compared with DaemonSet, Sidecar is not as popular. Besides being unable to directly collect the business container's standard output, it has shortcomings that limit its scope of use. First, resource consumption is high: each pod needs its own Sidecar collection container, so the resource cost is proportional to the number of business pods. Second, since the collection principle is the same as on a host, container meta information cannot be collected automatically and must be exposed to the collection container through environment variables. Finally, each business pod needs to configure shared storage for the target data and implement a mechanism to notify the collection container to exit, which is invasive.
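A minimal sketch of the Sidecar layout, assuming an emptyDir volume is used as the shared log directory and meta information is passed to the collector via the downward API; the images and mount paths are placeholders:

```yaml
# Hypothetical Sidecar sketch: business and collector containers share an emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: job-with-sidecar
spec:
  restartPolicy: Never
  containers:
    - name: business
      image: example-registry/batch-task:latest      # placeholder business image
      volumeMounts:
        - name: shared-logs
          mountPath: /app/logs                        # business writes its logs here
    - name: log-collector
      image: example-registry/ilogtail:latest        # placeholder collector image
      env:
        - name: POD_NAME                              # meta information exposed manually
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumeMounts:
        - name: shared-logs
          mountPath: /app/logs                        # collector reads the same directory
          readOnly: true
  volumes:
    - name: shared-logs
      emptyDir: {}
```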

ECI elastic container collection mode

ECI is short for Elastic Container Instance (see Alibaba Cloud's official introduction for details). It is equivalent to a small virtual machine that is destroyed when it is no longer needed and has no physical node, offering cost and elasticity advantages for bursty, high-concurrency scenarios, which are exactly the characteristics of some Job container scenarios. The principle of the ECI collection mode is similar to DaemonSet collection. The differences are that the iLogtail container in ECI is controlled by ECI rather than by kube-scheduler and is invisible to users, and that iLogtail does not discover containers by communicating with docker.sock or containerd.sock, but finds the containers to be collected through static container information. To support collecting container data, ECI also mounts paths on the ECI instance into the iLogtail container, the same principle as DaemonSet mounting host paths.

The ECI collection mode naturally provides good support for elastic scaling. Because the collection principle is similar to DaemonSet, it also inherits some of the properties of DaemonSet collection, such as the ability to obtain complete meta information, and delays in container discovery and start of collection. However, since the ECI collection mode discovers containers through static files, a container can be found immediately after startup, so the actual delay is much smaller than with DaemonSet.

In terms of cost, although each business pod starts an ECI instance with an attached iLogtail container, the actual cost in highly concurrent Job scenarios may be lower than that of self-built nodes because elastic container resources are billed on demand. Since ECI only creates iLogtail containers for containers that need data collection, it needs some way to determine the business containers' log collection requirements. Currently, CRDs (via a K8s Operator) and environment variables are supported. The CRD mode is the more recommended access method at present: it does not intrude on the business container and supports richer collection configurations. The environment variable mode is somewhat invasive.
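As an illustration of the environment variable mode, the sketch below assumes the aliyun_logs_&lt;logstore&gt; environment variable convention used by Alibaba Cloud's log collection integration; the logstore names, log path, and the label used to schedule the pod onto ECI are assumptions that depend on the cluster setup, and the CRD mode would declare the same intent in a separate resource instead:

```yaml
# Hypothetical ECI pod sketch: collection requirements declared through environment variables.
apiVersion: v1
kind: Pod
metadata:
  name: job-on-eci
  labels:
    alibabacloud.com/eci: "true"                     # assumed label; the exact ECI scheduling method depends on the cluster
spec:
  restartPolicy: Never
  containers:
    - name: business
      image: example-registry/batch-task:latest      # placeholder image
      env:
        - name: aliyun_logs_joblogs                  # collect standard output into a logstore named "joblogs"
          value: stdout
        - name: aliyun_logs_jobfiles                 # collect files into a logstore named "jobfiles"
          value: /app/logs/*.log
```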

Same-container collection mode

The same-container collection mode deploys the collection process and the business process in the same container, which is equivalent to treating the container as a virtual machine. The collection principle is therefore exactly the same as on a host.

Although this mode looks cumbersome and intrusive, it is very common when containerizing legacy businesses, where the same-container deployment is simply carried over. To ensure that Job data is not lost, the container's exit mechanism needs to be carefully designed so that it exits only after data collection is complete; a sketch of such an exit sequence follows.
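This is a minimal illustration only, assuming a hypothetical collector binary started alongside the business process inside the same container; the image, commands, and grace period are placeholders:

```yaml
# Hypothetical same-container sketch: the entrypoint starts the collector, runs the
# business task, then waits a grace period so collection can finish before exiting.
apiVersion: v1
kind: Pod
metadata:
  name: job-same-container
spec:
  restartPolicy: Never
  containers:
    - name: task-with-collector
      image: example-registry/task-with-collector:latest   # placeholder image bundling both processes
      command: ["sh", "-c"]
      args:
        - |
          /usr/local/bin/collector &        # placeholder collector start command
          /app/run-task                     # placeholder business task
          sleep 10                          # grace period so the collector can flush remaining data
      env:
        - name: POD_NAME                    # meta information must be exposed manually
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
```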

Since the collection process and the business process run in the same container, this mode has no container discovery or start-of-collection delay, and it fully supports various elastic schemes.

In terms of resources, each business container carries the extra cost of a collection process, so resource consumption is high. To collect container meta information, it must be exposed inside the business container through environment variables or similar means and cannot be tagged automatically.

Independent storage collection mode

Independent storage means that containers print the data to be collected to a path mounted from a shared PV or hostPath, and the collection container only needs to collect the data on that PV or hostPath. Some Job schedulers can specify the log path for each Job, so logs can be printed to the same shared volume. When a shared PV is used, only one collection container is needed to collect all data; when hostPath is used, the collection container must be deployed as a DaemonSet so that there is exactly one collection container on each node.

With independent storage, the life cycle of the data is decoupled from that of the container. The collection container only needs to collect data from the storage by path, so there is no container discovery or start-of-collection delay problem. Other advantages of this mode are that the number of collection containers does not grow with the number of business containers, resource consumption is very low, and the business containers are not intruded upon.

However, the independent storage mode performs poorly in terms of elasticity. If a PV is used with a single corresponding collection container, the throughput of that collection container becomes a bottleneck for collection performance. If hostPath is used with a DaemonSet deployment, elastic containers cannot be supported. This mode also has poor support for meta information: some meta information can only be exposed by embedding it in the data storage path, for example by setting subPathExpr to logs/$(POD_NAMESPACE)_$(POD_NAME)_$(POD_IP)_$(NODE_IP)_$(NODE_NAME) when mounting the volume.
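A minimal pod spec fragment sketching this path embedding; it assumes the referenced environment variables are populated from the downward API, and the volume name and mount path are placeholders:

```yaml
# Hypothetical business container snippet: each pod writes logs under a per-pod
# subdirectory on the shared volume so the path itself carries meta information.
containers:
  - name: business
    image: example-registry/batch-task:latest        # placeholder image
    env:
      - name: POD_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace
      - name: POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
      - name: NODE_IP
        valueFrom:
          fieldRef:
            fieldPath: status.hostIP
      - name: NODE_NAME
        valueFrom:
          fieldRef:
            fieldPath: spec.nodeName
    volumeMounts:
      - name: shared-logs                             # backed by a shared PV or hostPath
        mountPath: /app/logs
        subPathExpr: logs/$(POD_NAMESPACE)_$(POD_NAME)_$(POD_IP)_$(NODE_IP)_$(NODE_NAME)
```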

Summary

The five collection schemes described above can be summarized as follows:

DaemonSet: low resource overhead (one collector per node), complete meta information, no intrusion; container discovery and start-of-collection delays (about 5-8 seconds end to end) and no support for node-less elastic containers.

Sidecar: no container discovery problem, no start-of-collection delay, good elasticity; high resource cost (one collector per pod), meta information only via environment variables, invasive (shared storage and exit notification required), no direct standard output collection.

ECI: good elasticity, fast discovery via static container information, delay lower than DaemonSet, complete meta information; on-demand cost, low intrusion with CRD access (some intrusion with environment variables).

Same container: no discovery or collection delay, full elasticity support; high resource consumption, meta information via environment variables, strongly invasive.

Independent storage: no discovery or collection delay, very low resource consumption, no intrusion; poor elasticity, and meta information only via path embedding.

As you can see, each scheme has its own advantages and disadvantages and is suited to different Job container collection scenarios and data integrity requirements. Below are some typical Job container collection scenarios and the recommended collection schemes.

Typical Scenarios

Short-life-cycle Jobs

If a task takes a long time to run, for example more than 1 minute, the collector has a larger time window in which to collect the Job container's data. In that case, since the scenario is not sensitive to the container discovery and start-of-collection delays of the collection scheme, using the DaemonSet collection mode is not a problem. However, when the life cycle of a Job is under 1 minute, the default collection parameters may cause log loss. When the life cycle of the Job container is more than 10 seconds, the DaemonSet collection scheme can still be used; some parameters just need to be adjusted to ensure complete data collection. Because the collection principle for standard output differs from that for container files, the parameters to adjust are not exactly the same.

1. Adjust the startup parameter docker_config_update_interval (effective globally) to reduce the delay before the configuration takes effect after a container is discovered. For example, before version 1.0.34, adjust it from the default 10 s to 3 s (from that version on, the default is 3 s). This reduces the chance that file handles cannot be held after container discovery.

2. Adjust the polling intervals to reduce the start-of-collection delay, so that file handles are held in advance and data can still be collected after the pod is deleted. For standard output, adjust the collection configuration (takes effect per collection configuration): the parameter is FlushIntervalMs, for example from the default 3000 ms to 1000 ms. For file collection, adjust the startup parameter (effective globally) max_docker_config_update_times, for example from the default of 10 (within the 3-minute frequency-control window) to 60.

3. If a large number of logs are printed when the Job starts, adjust the position from which files are collected after they are discovered, to prevent logs from being lost because collection does not start from the beginning of the file. For standard output, adjust the collection configuration (takes effect per collection configuration): the parameter is StartLogMaxOffset, for example from the default 131072 B to 13107200 B. For file collection, adjust the collection configuration (takes effect per collection configuration): the parameter is tail_size_kb, for example from the default 1024 KB to 10240 KB.

4. Check that the Job is not cleaned up immediately after it completes, that is, the standard output logs of a container that just exited remain readable. If the built-in CronJob scheduling is used, confirm that the CronJob's .spec.successfulJobsHistoryLimit and .spec.failedJobsHistoryLimit are either not configured or greater than 0. If a custom scheduler is used, confirm that the Job's .spec.ttlSecondsAfterFinished is either not configured or greater than 0, and that the scheduler's own logic does not immediately clean up completed Jobs.
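For point 4, a minimal CronJob sketch that keeps finished Jobs around long enough for stdout collection might look like the following; the schedule, image, and retention values are placeholders:

```yaml
# Hypothetical CronJob sketch: keep finished Jobs (and their pods) around so that
# standard output logs of exited containers remain collectible for a while.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cleanup-task
spec:
  schedule: "*/5 * * * *"                    # placeholder schedule
  successfulJobsHistoryLimit: 3              # keep the last 3 successful Jobs (> 0)
  failedJobsHistoryLimit: 1                  # keep the last failed Job (> 0)
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 300           # if configured, keep it > 0 to leave a collection window
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: cleanup
              image: example-registry/cleanup:latest   # placeholder image
              args: ["--purge-history"]                # placeholder args
```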

Jobs That Exit Within Seconds

If the life cycle of a container is very short, under 10 seconds or even less, the DaemonSet mode can easily lose data for such Job containers because of its unavoidable container discovery delay. In this case, we suggest one of the following two approaches:

1. Use container standard output. Change the container's log output to standard output and leave log rotation to the kubelet. The K8s garbage collection mechanism usually ensures that each pod keeps the meta information and standard output logs of its most recent container. This way, even though the container has exited, as long as the pod has not been deleted, the container can still be discovered and its standard output logs collected. With this approach, it is still necessary to pay attention to point 4 above on the JobsHistoryLimit settings, to avoid the pod being destroyed before logs are collected.

2. Use Sidecar or ECI collection to ensure the integrity of data collection. Unlike ordinary container log collection, when Sidecar collection is used for Job containers, attention must be paid to the exit mechanism. Ordinary containers usually exit because the controller asks the pod to exit, so all containers in the pod receive a SIGTERM signal. A Job container, however, usually exits on its own after the task finishes; the collection container receives no SIGTERM, so the business container must notify it to exit. A simple implementation is to signal through a file on the shared volume, as sketched after this list.
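The sketch below illustrates this pattern under stated assumptions: a hypothetical collector command, a marker file on a shared emptyDir volume as the completion signal, and placeholder images, paths, and grace period:

```yaml
# Hypothetical Job sketch: the business container writes a "done" marker file when it
# finishes, and the sidecar waits for it (plus a grace period) before exiting.
apiVersion: batch/v1
kind: Job
metadata:
  name: job-with-sidecar-exit
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: business
          image: example-registry/batch-task:latest    # placeholder business image
          command: ["sh", "-c"]
          args:
            - |
              /app/run-task > /shared/task.log 2>&1    # placeholder task command
              touch /shared/done                       # signal the sidecar to exit
          volumeMounts:
            - name: shared
              mountPath: /shared
        - name: log-collector
          image: example-registry/ilogtail:latest      # placeholder collector image
          command: ["sh", "-c"]
          args:
            - |
              /usr/local/bin/collector &                    # placeholder collector start command
              until [ -f /shared/done ]; do sleep 1; done   # wait for the business container to finish
              sleep 10                                      # grace period to flush remaining data
          volumeMounts:
            - name: shared
              mountPath: /shared
      volumes:
        - name: shared
          emptyDir: {}
```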

Summary and outlook

Because Job containers are created and deleted frequently, have short life cycles, and exhibit large bursts of concurrency, they place special requirements on the log collection scheme. Based on these characteristics, this article analyzed the advantages and disadvantages of five collection schemes from three primary and three secondary considerations, and gave concrete collection solutions for typical Job scenarios. From this discussion, it is often necessary to adjust the collection container's parameters to ensure that data is not lost in various Job scenarios. Running Jobs and collecting their logs in ECI mode, with its elasticity and simple collection configuration, may become the best practice for running cloud-native Jobs in the future.

For iLogtail, the next optimization is to further reduce the container discovery and start-of-collection delays so that users no longer need to adjust parameters manually. For the SLS products, better handling of sudden traffic bursts and lower user intervention costs are also worth optimizing.
