This topic describes how to configure Logtail in the Log Service console to collect Kubernetes stdout and stderr logs in DaemonSet mode.

Prerequisites

The Helm package alibaba-log-controller is installed. For more information, see Install Logtail.

Features

Logtail can collect and upload container stdout and stderr logs together with container metadata to Log Service. The collection of Kubernetes stdout and stderr logs has the following features:

  • Collects stdout and stderr logs in real time.
  • Uses labels to specify containers for log collection.
  • Uses labels to exclude containers from log collection.
  • Uses environment variables to specify containers for log collection.
  • Uses environment variables to exclude containers from log collection.
  • Supports multi-line logs such as Java stack traces.
  • Supports automatic labeling for Docker container logs.
  • Supports automatic labeling for Kubernetes container logs.
Note
  • The preceding labels are retrieved by using the docker inspect command. These labels are different from the labels that are specified in a Kubernetes cluster.
  • The preceding environment variables are the same as the environment variables that are specified to start containers.

Implementation

A Logtail container uses a UNIX domain socket to communicate with the Docker daemon. The Logtail container queries all Docker containers and locates the specified Docker containers based on specified labels and environment variables. Logtail collects the logs of the specified Docker containers by using the docker logs command.

When Logtail collects the stdout and stderr logs of a Docker container, Logtail periodically stores checkpoints to a checkpoint file. If Logtail is restarted, Logtail collects logs from the last checkpoint.
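
The checkpoint behavior described above can be illustrated with a minimal sketch. It is not Logtail's implementation: the JSON file layout, the function names, and the use of a line index as the checkpoint position are all assumptions made for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, checkpoints):
    """Persist checkpoints; write a temp file first so a crash cannot leave a partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(checkpoints, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved checkpoints, or an empty dict on first start."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def read_new_lines(lines, container_id, checkpoints):
    """Return only the lines after the container's checkpoint, then advance it."""
    start = checkpoints.get(container_id, 0)
    checkpoints[container_id] = len(lines)
    return lines[start:]


# Hypothetical container output: two lines before a restart, one more after it.
log = ["line 1", "line 2", "line 3"]
with tempfile.TemporaryDirectory() as workdir:
    checkpoint_file = os.path.join(workdir, "checkpoint.json")

    state = load_checkpoint(checkpoint_file)
    first = read_new_lines(log[:2], "c1", state)
    save_checkpoint(checkpoint_file, state)

    # Simulated restart: in-memory state is lost and reloaded from the file.
    state = load_checkpoint(checkpoint_file)
    second = read_new_lines(log, "c1", state)
```

After the simulated restart, only the line written after the last saved checkpoint is read again, which is the resume behavior that the paragraph describes.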

Limits

  • Logtail version: Only Logtail 0.16.0 or later that runs on Linux can be used to collect stdout and stderr logs. For more information about Logtail versions and version updates, see Install Logtail in Linux.
  • Logtail allows you to collect data from Docker and containerd engines.
    • Docker engines: Logtail accesses Docker engines by using the /run/docker.sock socket file. You must make sure that the file exists and Logtail has the required permissions to access it.
    • Containerd engines: Logtail accesses containerd engines by using the /run/containerd/containerd.sock socket file. You must make sure that the file exists and Logtail has the required permissions to access it.
  • Multi-line log entries: To ensure that a multi-line log entry is not split into multiple log entries due to output latency, the last collected multi-line log entry is cached for 3 seconds by default. You can set the cache time by specifying the BeginLineTimeoutMs parameter. The parameter value cannot be less than 1,000 ms. Otherwise, an error may occur.
  • Stop policy: If a container is stopped and Logtail detects the die event on the container, Logtail stops collecting stdout and stderr logs of the container. In this case, if a collection delay occurs, some stdout or stderr logs that are generated before the stop action may be lost.
  • Docker logging driver: Logtail can collect stdout and stderr logs only from containers that use the json-file logging driver.
  • Context: By default, logs that are collected from different containers by using a Logtail configuration file are in the same context. If you want the logs of each container to be in different contexts, create a Logtail configuration file for each container.
  • Data processing: The collected data is contained in the content field. The data can be processed by using a common processing method. For more information, see Overview.
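
The socket requirement in the preceding limits can be checked before you deploy a configuration. The following is a minimal sketch, assuming that it runs on the node or in a container in which the two paths from the limits are mounted. The helper names are hypothetical, and the permission check only tests read and write access for the current user.

```python
import os
import stat


def is_unix_socket(path):
    """Return True if the path exists and is a UNIX domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return stat.S_ISSOCK(mode)


def socket_accessible(path):
    """Return True if the path is a socket that the current user can read and write."""
    return is_unix_socket(path) and os.access(path, os.R_OK | os.W_OK)


for sock in ("/run/docker.sock", "/run/containerd/containerd.sock"):
    print(sock, "accessible" if socket_accessible(sock) else "missing or not accessible")
```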

Procedure

  1. Log on to the Log Service console.
  2. In the Import Data section, click Kubernetes - Standard Output.
  3. Select the created project and Logstore and click Next.
  4. In the Create Machine Group step, create a machine group as prompted, and then click Complete Installation.
    If a machine group is available, click Use Existing Machine Groups.
  5. Select and move the destination machine group from Source Server Groups to Applied Server Groups, and then click Next.
    Notice If you apply a machine group immediately after it is created, the heartbeat status of the machine group may be FAIL. This issue occurs because the machine group has not yet connected to Log Service. In this case, you can click Automatic Retry. If the issue persists, see What can I do if the Logtail client has no heartbeat?
  6. In the Specify Data Source step, specify the data source, and then click Next.

    Enter log collection configurations in the Plug-in Config field. The following example shows the required parameters:

    {
     "inputs": [
         {
             "type": "service_docker_stdout",
             "detail": {
                 "Stdout": true,
                 "Stderr": true,
                 "IncludeLabel": {
                     "io.kubernetes.container.name": "nginx"
                 },
                 "ExcludeLabel": {
                     "io.kubernetes.container.name": "nginx-ingress-controller"
                 },
                 "IncludeEnv": {
                     "NGINX_SERVICE_PORT": "80"
                 },
                 "ExcludeEnv": {
                     "POD_NAMESPACE": "kube-system"
                 }
             }
         }
     ]
    }

    The input type of the data source is service_docker_stdout.

    Parameter Type Required Description
    IncludeLabel map Yes The value of the IncludeLabel parameter is a map. The keys and values of the map are strings. The default value of this parameter is an empty map. This default value indicates that logs from all containers are collected. If the keys are not empty and the values are empty, logs are collected from the containers whose label keys match the specified keys.
    Note
    • Key-value pairs are associated with the OR operator. If a label key-value pair of a container matches one of the specified key-value pairs, the logs of the container are collected.
    • By default, the values in the map are strings, and logs are collected from the containers whose label values are the same as the specified values. If you specify a value that starts with a caret (^) and ends with a dollar sign ($), the value is treated as a regular expression, and logs are collected from the containers whose label values match the regular expression. For example, if you set a value to ^(kube-system|istio-system)$, logs are collected from the containers whose label value is kube-system or istio-system.
    ExcludeLabel map No The value of the ExcludeLabel parameter is a map. The keys and values of the map are strings. The default value of this parameter is an empty map, which indicates that no containers are excluded. If the keys are not empty and the values are empty, logs are not collected from the containers whose label keys match the specified keys.
    Note
    • Key-value pairs are associated with the OR operator. If a label key-value pair of a container matches one of the specified key-value pairs, the logs of the container are not collected.
    • By default, the values in the map are strings, and logs are not collected from the containers whose label values are the same as the specified values. If you specify a value that starts with a caret (^) and ends with a dollar sign ($), the value is treated as a regular expression, and logs are not collected from the containers whose label values match the regular expression. For example, if you set a value to ^(kube-system|istio-system)$, logs are not collected from the containers whose label value is kube-system or istio-system.
    IncludeEnv map No The value of the IncludeEnv parameter is a map. The keys and values of the map are strings. The default value of this parameter is an empty map and indicates that logs from all containers are collected. If the keys are not empty and the values are empty, logs are collected from the containers whose environment variable keys match the specified keys.
    Note
    • Key-value pairs are associated with the OR operator. If an environment variable key-value pair of a container matches one of the specified key-value pairs, the logs of the container are collected.
    • By default, the values in the map are strings, and logs are collected from the containers whose environment variable values are the same as the specified values. If you specify a value that starts with a caret (^) and ends with a dollar sign ($), the value is treated as a regular expression, and logs are collected from the containers whose environment variable values match the regular expression. For example, if you set a value to ^(kube-system|istio-system)$, logs are collected from the containers whose environment variable value is kube-system or istio-system.
    ExcludeEnv map No The value of the ExcludeEnv parameter is a map. The keys and values of the map are strings. The default value of this parameter is an empty map, which indicates that no containers are excluded. If the keys are not empty and the values are empty, logs are not collected from the containers whose environment variable keys match the specified keys.
    Note
    • Key-value pairs are associated with the OR operator. If an environment variable key-value pair of a container matches one of the specified key-value pairs, the logs of the container are not collected.
    • By default, the values in the map are strings, and logs are not collected from the containers whose environment variable values are the same as the specified values. If you specify a value that starts with a caret (^) and ends with a dollar sign ($), the value is treated as a regular expression, and logs are not collected from the containers whose environment variable values match the regular expression. For example, if you set a value to ^(kube-system|istio-system)$, logs are not collected from the containers whose environment variable value is kube-system or istio-system.
    Stdout bool No Default value: true. If you set the value of this parameter to false, stdout logs are not collected.
    Stderr bool No Default value: true. If you set the value of this parameter to false, stderr logs are not collected.
    BeginLineRegex string No The regular expression that is used to match the first line of a log entry. The default value of this parameter is an empty string. If a line matches the specified regular expression, the line is considered the first line of a new log entry. Otherwise, the line is considered a part of the previous log entry.
    BeginLineTimeoutMs int No The timeout period for the specified regular expression to match a line. Default value: 3000. Unit: ms. If no new log entry is generated within 3 seconds, the last log is uploaded.
    BeginLineCheckLength int No The length of the line prefix that is checked against the regular expression. Default value: 10 × 1,024. Unit: bytes. If the start of a line is enough to determine whether the line matches the regular expression, you can specify a smaller value to improve match efficiency.
    MaxLogSize int No The maximum size of a log entry. Default value: 512 × 1,024. Unit: bytes. If the size of a log entry exceeds the specified value, the log entry is uploaded.
    Note
    • The labels that are specified in the IncludeLabel and ExcludeLabel parameters refer to the labels in the container information that is retrieved by using the docker inspect command.
    • A namespace and a container name of a Kubernetes cluster can be mapped to a Docker label. The value of the LabelKey parameter for a namespace is io.kubernetes.pod.namespace. The value of the LabelKey parameter for a container name is io.kubernetes.container.name. For example, the namespace of the pod that you have created is backend-prod and the container name is worker-server. In this case, if you set the key-value pair of a whitelist label to io.kubernetes.pod.namespace : backend-prod, the logs of all containers in the pod are collected. If you set the key-value pair of a whitelist label to io.kubernetes.container.name : worker-server, the logs of the container are collected.
    • In a Kubernetes cluster, we recommend that you specify only the io.kubernetes.pod.namespace and io.kubernetes.container.name labels. You can also specify the IncludeEnv or ExcludeEnv parameter based on your business requirements.
  7. In the Configure Query and Analysis step, specify the indexes, and then click Next.
    Indexes are created by default. You can modify the indexes as needed.
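
The include and exclude rules described for the Plug-in Config parameters in step 6 can be sketched as follows. This is an illustration of the documented matching rules, not Logtail's code: the function names are hypothetical, and the sketch covers a single attribute map (labels or environment variables) because how label filters and environment variable filters are combined is not stated in this topic.

```python
import re


def pair_matches(attrs, key, value):
    """Check one specified key-value pair against a container's labels or env variables."""
    if key not in attrs:
        return False
    if value == "":
        # An empty value matches on the key alone.
        return True
    if value.startswith("^") and value.endswith("$"):
        # Values wrapped in ^...$ are treated as regular expressions.
        return re.search(value, attrs[key]) is not None
    return attrs[key] == value


def is_selected(attrs, include, exclude):
    """Apply an include map and an exclude map; pairs within a map are ORed."""
    if include and not any(pair_matches(attrs, k, v) for k, v in include.items()):
        return False
    if any(pair_matches(attrs, k, v) for k, v in exclude.items()):
        return False
    return True


include_label = {"io.kubernetes.container.name": "nginx"}
exclude_label = {"io.kubernetes.container.name": "nginx-ingress-controller"}

nginx = {"io.kubernetes.container.name": "nginx"}
ingress = {"io.kubernetes.container.name": "nginx-ingress-controller"}

print(is_selected(nginx, include_label, exclude_label))
print(is_selected(ingress, include_label, exclude_label))
```

With the label maps from the sample configuration, the nginx container is selected and the nginx-ingress-controller container is not. An empty include map selects every container that is not excluded.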

Default fields

The following table lists the fields that are uploaded by default for each Kubernetes log entry.
Field name Description
_time_ The data upload time, for example, 2018-02-02T02:18:41.979147844Z.
_source_ The type of data source. Valid values: stdout and stderr.
_image_name_ The name of an image.
_container_name_ The name of a container.
_pod_name_ The name of a pod.
_namespace_ The namespace where the pod resides.
_pod_uid_ The unique identifier of the pod.
_container_ip_ The IP address of the pod.

Configuration examples of single-line log collection

  • Configure environment variables

    Collect the stdout and stderr logs of the containers that meet the following conditions: Environment variables include NGINX_PORT_80_TCP_PORT=80 and exclude POD_NAMESPACE=kube-system.

    The following script shows the log collection configurations:
    {
        "inputs": [
            {
                "type": "service_docker_stdout",
                "detail": {
                    "Stdout": true,
                    "Stderr": true,
                    "IncludeEnv": {
                        "NGINX_PORT_80_TCP_PORT": "80"
                    },
                    "ExcludeEnv": {
                        "POD_NAMESPACE": "kube-system"
                    }
                }
            }
        ]
    }
  • Configure labels

    Collect the stdout and stderr logs of the containers that meet the following conditions: The container labels include io.kubernetes.container.name=nginx and exclude type=pre.

    The following script shows the label configurations:
    {
        "inputs": [
            {
                "type": "service_docker_stdout",
                "detail": {
                    "Stdout": true,
                    "Stderr": true,
                    "IncludeLabel": {
                        "io.kubernetes.container.name": "nginx"
                    },
                    "ExcludeLabel": {
                        "type": "pre"
                    }
                }
            }
        ]
    }

Configuration examples of multi-line log collection

Before you can collect Java exception stack logs, you must configure multi-line log collection. The following section describes how to collect stdout and stderr logs of standard Java applications.
  • Sample log entry
    2018-02-03 14:18:41.968  INFO [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : service start
    2018-02-03 14:18:41.969 ERROR [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : java.lang.NullPointerException
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
    ...
    2018-02-03 14:18:41.968  INFO [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : service start done
  • Log collection configuration

    Collect the logs of the containers that meet the following conditions: The container labels include app=monitor, and each log entry starts with a date in a fixed format. To improve match efficiency, only the first 10 bytes of each line are checked.

    {
    "inputs": [
      {
        "detail": {
          "BeginLineCheckLength": 10,
          "BeginLineRegex": "\\d+-\\d+-\\d+.*",
          "IncludeLabel": {
            "app": "monitor"
          }
        },
        "type": "service_docker_stdout"
      }
    ]
    }
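
The effect of BeginLineRegex and BeginLineCheckLength in the preceding configuration can be sketched as follows. This is a simplified illustration, not Logtail's implementation: the function name is hypothetical, the sketch counts characters instead of bytes, and it ignores the BeginLineTimeoutMs and MaxLogSize parameters.

```python
import re


def group_entries(lines, begin_line_regex, check_length):
    """Merge raw output lines into log entries based on a first-line pattern."""
    pattern = re.compile(begin_line_regex)
    entries = []
    for line in lines:
        # Only the line prefix is tested against the pattern.
        is_first_line = pattern.match(line[:check_length]) is not None
        if is_first_line or not entries:
            entries.append(line)
        else:
            # A non-matching line is appended to the previous entry.
            entries[-1] += "\n" + line
    return entries


sample = [
    "2018-02-03 14:18:41.968  INFO [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : service start",
    "2018-02-03 14:18:41.969 ERROR [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : java.lang.NullPointerException",
    "at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)",
    "at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)",
    "2018-02-03 14:18:41.968  INFO [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : service start done",
]

entries = group_entries(sample, r"\d+-\d+-\d+.*", 10)
print(len(entries))
```

The five sample lines are grouped into three log entries, and the two stack trace lines become part of the ERROR entry.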

Data processing examples

Logtail can process the collected Docker standard output. For more information, see Overview.
  • Collect the logs of the containers that meet the following conditions: The container labels include app=monitor, and each log entry starts with a date in a fixed format. To improve match efficiency, only the first 10 bytes of each line are checked. A regular expression is used to extract fields such as the time, level, module, thread, and message from each log entry. The following script shows the configurations of log collection and data processing:
    {
    "inputs": [
      {
        "detail": {
          "BeginLineCheckLength": 10,
          "BeginLineRegex": "\\d+-\\d+-\\d+.*",
          "IncludeLabel": {
            "app": "monitor"
          }
        },
        "type": "service_docker_stdout"
      }
    ],
    "processors": [
        {
            "type": "processor_regex",
            "detail": {
                "SourceKey": "content",
                "Regex": "(\\d+-\\d+-\\d+ \\d+:\\d+:\\d+\\.\\d+)\\s+(\\w+)\\s+\\[([^]]+)]\\s+\\[([^]]+)]\\s+(\\S+)\\s+:\\s+([\\s\\S]*)",
                "Keys": [
                    "time",
                    "level",
                    "module",
                    "thread",
                    "class",
                    "message"
                ],
                "NoKeyError": true,
                "NoMatchError": true,
                "KeepSource": false
            }
        }
    ]
    }
    The collected log entry 2018-02-03 14:18:41.968 INFO [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : service start done is processed into the following fields:
    __tag__:__hostname__:logtail-dfgef
    _container_name_:monitor
    _image_name_:registry.cn-hangzhou.aliyuncs.xxxxxxxxxxxxxxx
    _namespace_:default
    _pod_name_:monitor-6f54bd5d74-rtzc7
    _pod_uid_:7f012b72-04c7-11e8-84aa-00163f00c369
    _source_:stdout
    _time_:2018-02-02T14:18:41.979147844Z
    time:2018-02-03 14:18:41.968
    level:INFO
    module:spring-cloud-monitor
    thread:nio-8080-exec-4
    class:c.g.s.web.controller.DemoController
    message:service start done
  • Collect the JSON logs of the containers that meet the following conditions: The container labels include app=monitor. The following script shows the configurations of log collection and data processing:
    {
    "inputs": [
      {
        "detail": {
          "IncludeLabel": {
            "app": "monitor"
          }
        },
        "type": "service_docker_stdout"
      }
    ],
    "processors": [
        {
            "type": "processor_json",
            "detail": {
                "SourceKey": "content",
                "NoKeyError":true,
                "KeepSource": false
            }
        }
    ]
    }
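
The two processing examples above can be approximated offline to check a pattern or a JSON payload before you save a configuration. The following sketch is not the processor_regex or processor_json implementation. The function names are hypothetical, the pattern uses six capture groups so that the class field shown in the sample output is extracted, and converting non-string JSON values to strings is an assumption.

```python
import json
import re

# Six groups: time, level, module, thread, class, message.
LOG_PATTERN = re.compile(
    r"(\d+-\d+-\d+ \d+:\d+:\d+\.\d+)\s+(\w+)\s+\[([^\]]+)\]\s+\[([^\]]+)\]"
    r"\s+(\S+)\s+:\s+([\s\S]*)"
)
KEYS = ["time", "level", "module", "thread", "class", "message"]


def apply_regex(entry, source_key="content", keep_source=False):
    """Split the source field into named fields; leave the entry unchanged on no match."""
    result = dict(entry)
    match = LOG_PATTERN.match(result.get(source_key, ""))
    if match is None:
        return result
    if not keep_source:
        del result[source_key]
    result.update(zip(KEYS, match.groups()))
    return result


def expand_json(entry, source_key="content", keep_source=False):
    """Expand a JSON object in the source field into top-level fields."""
    result = dict(entry)
    try:
        parsed = json.loads(result.get(source_key, ""))
    except json.JSONDecodeError:
        return result
    if not isinstance(parsed, dict):
        return result
    if not keep_source:
        del result[source_key]
    for key, value in parsed.items():
        # Assumption: non-string values are stored as their JSON text.
        result[key] = value if isinstance(value, str) else json.dumps(value)
    return result


line = (
    "2018-02-03 14:18:41.968  INFO [spring-cloud-monitor] [nio-8080-exec-4] "
    "c.g.s.web.controller.DemoController : service start done"
)
parsed_line = apply_regex({"content": line, "_source_": "stdout"})
parsed_json = expand_json({"content": '{"level": "INFO", "code": 200}', "_source_": "stdout"})
print(parsed_line)
print(parsed_json)
```

In both functions, the content field is dropped after a successful parse, which mirrors the KeepSource setting of false in the sample configurations.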