This topic describes how to create a Logtail configuration in the Log Service console and use the Logtail configuration to collect container stdout and stderr in DaemonSet mode.

Prerequisites

  • The Logtail component is installed. For more information, see Install Logtail components.
  • A Logstore is created in the project that you use to install the Logtail component. For more information, see Create a Logstore.

Features

Logtail can collect container stdout and stderr, and then upload the stdout and stderr together with container metadata to Log Service. Logtail supports the following features:

  • Collects stdout and stderr.
  • Uses the container label whitelist to specify containers from which stdout and stderr are collected.
  • Uses the container label blacklist to specify containers from which stdout and stderr are not collected.
  • Uses the environment variable whitelist to specify containers from which stdout and stderr are collected.
  • Uses the environment variable blacklist to specify containers from which stdout and stderr are not collected.
  • Collects multi-line logs. For example, Logtail can collect Java stack logs.
  • Automatically associates container metadata that needs to be uploaded together with the collected container stdout and stderr. The metadata includes container names, image names, pod names, namespaces, and environment variables.
  • If a container runs in a Kubernetes cluster, Logtail also supports the following features:
    • Uses Kubernetes namespaces, pod names, and container names to specify containers from which stdout and stderr are collected.
    • Uses the Kubernetes label whitelist to specify containers from which stdout and stderr are collected.
    • Uses the Kubernetes label blacklist to specify containers from which stdout and stderr are not collected.
    • Automatically associates Kubernetes labels that need to be uploaded together with the collected container stdout and stderr.

Implementation

Logtail communicates with the domain socket of Docker. Logtail queries all Docker containers and identifies the containers from which stdout and stderr are collected by using the specified labels and environment variables. Logtail runs the docker logs command to collect logs from the specified containers.

When Logtail collects stdout and stderr from a container, Logtail periodically stores checkpoints to a checkpoint file. If Logtail is stopped and is then restarted, Logtail collects logs from the last checkpoint.

Limits

  • You can use the Log Service console to collect stdout and stderr in DaemonSet mode only if Logtail is V0.16.0 or later and runs on Linux. For more information about Logtail versions and version updates, see Install Logtail on a Linux server.
  • Logtail collects data from containers that use the Docker engine or containerd engine.
    • Docker: Logtail accesses the Docker engine through the /run/docker.sock socket. Make sure that the socket exists and that Logtail has the permissions to access it.
    • containerd: Logtail accesses the containerd engine through the /run/containerd/containerd.sock socket. Make sure that the socket exists and that Logtail has the permissions to access it.
  • By default, Logtail caches the last multi-line log that it collects for 3 seconds. This prevents the log from being split into multiple logs because of output latency. You can change the cache time by setting the BeginLineTimeoutMs parameter, which is specified in milliseconds. We recommend that you do not specify a value less than 1000. A smaller value may cause errors.
  • If Logtail detects the die event on a container that is stopped, Logtail no longer collects stdout and stderr from the container. If collection latency occurs, some stdout and stderr that are collected before the container is stopped may be lost.
  • For containers that use the Docker engine, Logtail can collect stdout and stderr only if the logging driver writes logs in the JSON format (the json-file driver).
  • By default, stdout and stderr that are collected from different containers by using the same Logtail configuration have the same context. If you want to specify a different context for the stdout and stderr of each container, you must create a Logtail configuration for each container.
  • By default, the collected data is stored in the content field. Logtail can process the collected data. For more information, see Use Logtail plug-ins to process data.
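The socket requirement in the limits above can be checked on a node before you troubleshoot collection. The following Python sketch is only an illustration and is not part of Logtail: the check_socket helper is a hypothetical name, and the script assumes that it runs as the same user and with the same mounts as Logtail.

```python
import os
import stat

# Socket paths taken from the limits above.
RUNTIME_SOCKETS = {
    "docker": "/run/docker.sock",
    "containerd": "/run/containerd/containerd.sock",
}


def check_socket(path):
    """Return a short status string for a container runtime socket path."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "permission denied"
    if not stat.S_ISSOCK(mode):
        return "not a socket"
    # Connecting to a Unix domain socket requires write permission on it.
    if not os.access(path, os.R_OK | os.W_OK):
        return "no read/write access"
    return "ok"


if __name__ == "__main__":
    for runtime, path in RUNTIME_SOCKETS.items():
        print(f"{runtime}: {path}: {check_socket(path)}")
```

A status other than ok for the runtime that your cluster uses suggests that the path or the permissions need attention.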

Create a Logtail configuration

  1. Log on to the Log Service console.
  2. In the Import Data section, click Kubernetes - Standard Output.
  3. Select a project and a Logstore. Then, click Next.
    Select the project that you use to install the Logtail component and the Logstore that you create.
  4. Click Use Existing Machine Groups.
    After you install the Logtail component, Log Service automatically creates a machine group named k8s-group-${your_k8s_cluster_id}. You can select this machine group.
  5. Select the k8s-group-${your_k8s_cluster_id} machine group from Source Server Groups and move the machine group to Applied Server Groups. Then, click Next.
    Notice If the heartbeat status of the machine group is FAIL, you can click Automatic Retry. If the issue persists, see What do I do if no heartbeat connections are detected on Logtail?
  6. In the Specify Data Source step, specify the data source and click Next.

    Configure the parameters that are used to collect stdout and stderr in the Plug-in Config field. Example:

    {
        "inputs":[
            {
                "type":"service_docker_stdout",
                "detail":{
                    "Stdout":true,
                    "Stderr":true,
                    "IncludeContainerLabel":{
                        "LabelKey":"LabelValue"
                    },
                    "ExcludeContainerLabel":{
                        "LabelKey":"LabelValue"
                    },
                    "IncludeK8sLabel":{
                        "LabelKey":"LabelValue"
                    },
                    "ExcludeK8sLabel":{
                        "LabelKey":"LabelValue"
                    },
                    "IncludeEnv":{
                        "EnvKey":"EnvValue"
                    },
                    "ExcludeEnv":{
                        "EnvKey":"EnvValue"
                    },
                    "ExternalK8sLabelTag":{
                        "LabelKey":"LabelValue"
                    },
                    "ExternalEnvTag":{
                        "EnvKey":"EnvValue"
                    },
                    "K8sNamespaceRegex":"^(default|kube-system)$",
                    "K8sPodRegex":"^(deploy.*)$",
                    "K8sContainerRegex":"^(container1|container2)$"
                }
            }
        ]
    }

    Configure the following parameters:

    • Data source type

      The type of the data source is fixed as service_docker_stdout.

    • Parameters related to container filtering
      • For versions earlier than Logtail V1.0.29, containers can be filtered only by using environment variables and container labels.

        The namespace of a Kubernetes cluster and the name of a container in the Kubernetes cluster can be mapped to container labels. The value of the LabelKey parameter for the namespace is io.kubernetes.pod.namespace. The value of the LabelKey parameter for the container name is io.kubernetes.container.name. We recommend that you use the two container labels to filter containers. If the container labels do not meet your business requirements, you can use the environment variable whitelist or the environment variable blacklist to filter containers. For example, the namespace of a pod is backend-prod, and the name of a container in the pod is worker-server. If you want the logs of the worker-server container to be collected, you can specify "io.kubernetes.pod.namespace" : "backend-prod" or "io.kubernetes.container.name" : "worker-server" in the container label whitelist.

        Notice
        • Container labels are retrieved by running the docker inspect command. Container labels are different from Kubernetes labels. For more information, see Obtain container labels.
        • Environment variables are the same as the environment variables that are configured to start containers. For more information, see Obtain environment variables.
        • Do not specify duplicate values for the LabelKey parameter. If you specify duplicate values for the LabelKey parameter, only one of the values takes effect.
        Parameter Data type Required Description
        IncludeLabel Map (The values of the LabelKey and LabelValue parameters are strings.) No The container label whitelist. The whitelist specifies the containers from which stdout and stderr are collected. By default, this parameter is empty, which indicates that stdout and stderr are collected from all containers. When you configure the container label whitelist, the LabelKey parameter is required, and the LabelValue parameter is optional.
        • If you leave the LabelValue parameter empty, containers whose container labels contain the keys specified by LabelKey are matched.
        • If you specify a value for the LabelValue parameter, containers whose container labels consist of the key-value pairs specified by LabelKey and LabelValue are matched.

          By default, the value of the LabelValue parameter is a string. In this case, string matching is performed. Containers are matched only if the values of container labels are the same as the value of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the LabelValue parameter, regular expression matching is performed. For example, if you set the LabelKey parameter to io.kubernetes.container.name and set the LabelValue parameter to ^(nginx|cube)$, a container named nginx and a container named cube are matched.

        Key-value pairs are connected by using the OR operator. If a container label consists of one of the specified key-value pairs, the container is matched.

        ExcludeLabel Map (The values of the LabelKey and LabelValue parameters are strings.) No The container label blacklist. The blacklist specifies the containers from which stdout and stderr are not collected. By default, this parameter is empty, which indicates that stdout and stderr are collected from all containers. When you configure the container label blacklist, the LabelKey parameter is required, and the LabelValue parameter is optional.
        • If you leave the LabelValue parameter empty, containers whose container labels contain the keys specified by LabelKey are filtered out.
        • If you specify a value for the LabelValue parameter, containers whose container labels consist of the key-value pairs specified by LabelKey and LabelValue are filtered out.

          By default, the value of the LabelValue parameter is a string. In this case, string matching is performed. Containers are matched only if the values of container labels are the same as the value of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the LabelValue parameter, regular expression matching is performed. For example, if you set the LabelKey parameter to io.kubernetes.container.name and set the LabelValue parameter to ^(nginx|cube)$, a container named nginx and a container named cube are matched.

        Key-value pairs are connected by using the OR operator. If a container label consists of one of the specified key-value pairs, the container is filtered out.

        IncludeEnv Map (The values of the EnvKey and EnvValue parameters are strings.) No The environment variable whitelist. The whitelist specifies the containers from which stdout and stderr are collected. By default, this parameter is empty, which indicates that stdout and stderr are collected from all containers. When you configure the environment variable whitelist, the EnvKey parameter is required, and the EnvValue parameter is optional.
        • If you leave the EnvValue parameter empty, containers whose environment variables contain the keys specified by EnvKey are matched.
        • If you specify a value for the EnvValue parameter, containers whose environment variables consist of the key-value pairs specified by EnvKey and EnvValue are matched.

          By default, the value of the EnvValue parameter is a string. In this case, string matching is performed. Containers are matched only if the values of environment variables are the same as the value of the EnvValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the EnvValue parameter, regular expression matching is performed. For example, if you set the EnvKey parameter to NGINX_SERVICE_PORT and set the EnvValue parameter to ^(80|6379)$, containers whose port number is 80 and containers whose port number is 6379 are matched.

        Key-value pairs are connected by using the OR operator. If the environment variables of a container include one of the specified key-value pairs, the container is matched.

        ExcludeEnv Map (The values of the EnvKey and EnvValue parameters are strings.) No The environment variable blacklist. The blacklist specifies the containers from which stdout and stderr are not collected. By default, this parameter is empty, which indicates that stdout and stderr are collected from all containers. When you configure the environment variable blacklist, the EnvKey parameter is required, and the EnvValue parameter is optional.
        • If you leave the EnvValue parameter empty, containers whose environment variables contain the keys specified by EnvKey are filtered out.
        • If you specify a value for the EnvValue parameter, containers whose environment variables consist of the key-value pairs specified by EnvKey and EnvValue are filtered out.

          By default, the value of the EnvValue parameter is a string. In this case, string matching is performed. Containers are matched only if the values of environment variables are the same as the value of the EnvValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the EnvValue parameter, regular expression matching is performed. For example, if you set the EnvKey parameter to NGINX_SERVICE_PORT and set the EnvValue parameter to ^(80|6379)$, containers whose port number is 80 and containers whose port number is 6379 are matched.

        Key-value pairs are connected by using the OR operator. If the environment variables of a container include one of the specified key-value pairs, the container is filtered out.
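The matching rules described for the whitelists and blacklists can be summarized in a short Python sketch. This is an illustration of the documented semantics, not Logtail's implementation. The function names are hypothetical. The sketch assumes that a blacklist match takes precedence over a whitelist match, and it handles one kind of attribute (container labels or environment variables) at a time.

```python
import re


def value_matches(expected, actual):
    """Regex match if expected looks like ^...$, otherwise exact string match."""
    if expected.startswith("^") and expected.endswith("$"):
        return re.search(expected, actual) is not None
    return expected == actual


def any_pair_matches(rules, attrs):
    """True if at least one key-value rule matches (OR semantics).

    An empty expected value matches any container that has the key.
    """
    for key, expected in rules.items():
        if key not in attrs:
            continue
        if expected == "" or value_matches(expected, attrs[key]):
            return True
    return False


def is_collected(include, exclude, attrs):
    """Apply a whitelist and a blacklist to the labels or env of one container."""
    # An empty whitelist places no restriction on the container.
    if include and not any_pair_matches(include, attrs):
        return False
    # Assumption: a blacklist hit always drops the container.
    if exclude and any_pair_matches(exclude, attrs):
        return False
    return True


env = {"NGINX_SERVICE_PORT": "80", "POD_NAMESPACE": "default"}
print(is_collected({"NGINX_SERVICE_PORT": "^(80|6379)$"},
                   {"POD_NAMESPACE": "kube-system"}, env))  # True
```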

      • For Logtail V1.0.29 or later, we recommend that you use different levels of Kubernetes information, such as pod names, namespaces, container names, and labels to filter containers.
        Note If you change Kubernetes labels when Kubernetes control resources, such as Deployment, are running, the operational pod is not restarted. Therefore, the pod cannot detect the change. This may cause a matching rule to become invalid. When you specify the Kubernetes label whitelist and the Kubernetes label blacklist, we recommend that you use the Kubernetes labels of pods. For more information about Kubernetes labels, see Labels and Selectors.
        Parameter Data type Required Description
        IncludeK8sLabel Map (The values of the LabelKey and LabelValue parameters are strings.) No The Kubernetes label whitelist. The whitelist specifies the containers from which stdout and stderr are collected. When you configure the Kubernetes label whitelist, the LabelKey parameter is required, and the LabelValue parameter is optional.
        • If you leave the LabelValue parameter empty, containers whose Kubernetes labels contain the keys specified by LabelKey are matched.
        • If you specify a value for the LabelValue parameter, containers whose Kubernetes labels consist of the key-value pairs specified by LabelKey and LabelValue are matched.

          By default, the value of the LabelValue parameter is a string. In this case, string matching is performed. Containers are matched only if the values of Kubernetes labels are the same as the value of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($), regular expression matching is performed. For example, if you set the LabelKey parameter to app and set the LabelValue parameter to ^(test1|test2)$, containers whose Kubernetes labels consist of app:test1 and app:test2 are matched.

        Key-value pairs are connected by using the OR operator. If a Kubernetes label consists of one of the specified key-value pairs, the container is matched.

        ExcludeK8sLabel Map (The values of the LabelKey and LabelValue parameters are strings.) No The Kubernetes label blacklist. The blacklist specifies the containers from which stdout and stderr are not collected. When you configure the Kubernetes label blacklist, the LabelKey parameter is required, and the LabelValue parameter is optional.
        • If you leave the LabelValue parameter empty, containers whose Kubernetes labels contain the keys specified by LabelKey are filtered out.
        • If you specify a value for the LabelValue parameter, containers whose Kubernetes labels consist of the key-value pairs specified by LabelKey and LabelValue are filtered out.

          By default, the value of the LabelValue parameter is a string. In this case, string matching is performed. Containers are matched only if the values of Kubernetes labels are the same as the value of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($), regular expression matching is performed. For example, if you set the LabelKey parameter to app and set the LabelValue parameter to ^(test1|test2)$, containers whose Kubernetes labels consist of app:test1 and app:test2 are matched.

        Key-value pairs are connected by using the OR operator. If a Kubernetes label consists of one of the specified key-value pairs, the container is filtered out.

        K8sNamespaceRegex string No The namespace. The namespace specifies the containers from which stdout and stderr are collected. Regular expression matching is supported. For example, if you specify "K8sNamespaceRegex":"^(default|nginx)$", all containers in the nginx and default namespaces are matched.
        K8sPodRegex string No The pod name. The pod name specifies the containers from which stdout and stderr are collected. Regular expression matching is supported. For example, if you specify "K8sPodRegex":"^(nginx-log-demo.*)$", all containers in the pod whose name starts with nginx-log-demo are matched.
        K8sContainerRegex string No The container name. The container name specifies the containers from which stdout and stderr are collected. Regular expression matching is supported. Kubernetes container names are defined in spec.containers. For example, if you specify "K8sContainerRegex":"^(container-test)$", all containers whose name is container-test are matched.
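To see how the three regular expression parameters narrow down the set of containers, consider the following Python sketch. It is an illustration only. The k8s_selected function is a hypothetical name, and the sketch assumes that an unset parameter matches everything and that all configured parameters must match at the same time.

```python
import re


def k8s_selected(detail, namespace, pod, container):
    """Check one container against the K8s*Regex parameters of a config.

    Assumption: a missing or empty parameter matches everything, and all
    configured parameters must match (AND).
    """
    checks = (
        ("K8sNamespaceRegex", namespace),
        ("K8sPodRegex", pod),
        ("K8sContainerRegex", container),
    )
    for key, value in checks:
        pattern = detail.get(key, "")
        if pattern and re.search(pattern, value) is None:
            return False
    return True


# Patterns taken from the parameter descriptions above.
detail = {
    "K8sNamespaceRegex": "^(default|nginx)$",
    "K8sPodRegex": "^(nginx-log-demo.*)$",
    "K8sContainerRegex": "^(container-test)$",
}
print(k8s_selected(detail, "default", "nginx-log-demo-abc", "container-test"))
print(k8s_selected(detail, "kube-system", "nginx-log-demo-abc", "container-test"))
```

The first call selects the container. The second call does not, because the kube-system namespace is not covered by the namespace pattern.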
    • Parameters related to log labels

      For Logtail V1.0.29 or later, we recommend that you specify environment variables and Kubernetes labels for logs as log labels.

      Parameter Data type Required Description
      ExternalEnvTag Map (The values of the EnvKey and EnvValue parameters are strings.) No After you specify environment variables as log labels, Log Service adds environment variable-related fields to logs. For example, if you set the EnvKey parameter to VERSION and set the EnvValue parameter to env_version, Log Service adds the __tag__:__env_version__: v1.0.0 field to logs if the environment variable configurations of a container include VERSION=v1.0.0.
      ExternalK8sLabelTag Map (The values of the LabelKey and LabelValue parameters are strings.) No After you specify Kubernetes labels as log labels, Log Service adds Kubernetes label-related fields to logs. For example, if you set the LabelKey parameter to app and set the LabelValue parameter to k8s_label_app, Log Service adds the __tag__:__k8s_label_app__: serviceA field to logs if the label configurations of a Kubernetes cluster include app=serviceA.
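The mapping from environment variables or Kubernetes labels to log labels can be pictured with a few lines of Python. The external_tags function is a hypothetical helper written for this illustration. It only reproduces the field naming shown in the examples above and assumes that a key that is absent from the container adds no field.

```python
def external_tags(mapping, source):
    """Build tag fields from the env vars or Kubernetes labels of a container.

    mapping: {"VERSION": "env_version"}  (source key -> tag name)
    source:  {"VERSION": "v1.0.0"}       (the actual values of the container)
    """
    tags = {}
    for key, tag_name in mapping.items():
        # Assumption: keys that the container does not define are skipped.
        if key in source:
            tags[f"__tag__:__{tag_name}__"] = source[key]
    return tags


print(external_tags({"VERSION": "env_version"},
                    {"VERSION": "v1.0.0", "PATH": "/usr/bin"}))
# {'__tag__:__env_version__': 'v1.0.0'}
```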
    • Other parameters
      Parameter Data type Required Description
      Stdout boolean No Specifies whether to collect stdout.

      By default, this parameter is empty, which indicates that stdout is collected.

      Stderr boolean No Specifies whether to collect stderr.

      By default, this parameter is empty, which indicates that stderr is collected.

      BeginLineRegex string No The regular expression that is used to match the beginning of the first line of a log.

      By default, this parameter is empty, which indicates that each line is regarded as a log.

      If the beginning of a line matches the specified regular expression, the line is regarded as the first line of a new log. If the beginning of a line does not match the specified regular expression, the line is regarded as a part of the last log.

      BeginLineTimeoutMs int No The timeout period for matching the beginning of the first line of a log based on the specified regular expression.

      By default, this parameter is empty, which indicates that the timeout period is 3,000 milliseconds.

      If no new log is generated within the timeout period (3,000 milliseconds by default), Logtail stops waiting for a line that matches the regular expression and uploads the last log to Log Service.

      BeginLineCheckLength int No The number of bytes at the beginning of a line that Logtail checks against the specified regular expression.

      By default, this parameter is empty, which indicates that the first 10,240 bytes of a line are checked.

      If the beginning of a log can be identified from the first few bytes of a line, we recommend that you configure this parameter to improve the match efficiency.

      MaxLogSize int No The maximum size of a log.

      By default, this parameter is empty, which indicates that the maximum size of a log is 524,288 bytes.

      If the size of a log exceeds the value of this parameter, Logtail stops matching the beginning of the first line of a log and uploads the log to Log Service.

      StartLogMaxOffset int No The maximum size of historical data that can be traced the first time Logtail collects logs from a log file. Valid values: [131072,1048576]. Unit: bytes.

      By default, this parameter is empty. In this case, the maximum size of historical data that can be traced is 131,072 bytes, which is equivalent to 128 KB.
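The interaction between BeginLineRegex and BeginLineCheckLength can be sketched in Python as follows. This is a simplified model, not Logtail's code: the group_lines function is a hypothetical name, the sketch counts characters instead of bytes, and it leaves out BeginLineTimeoutMs and MaxLogSize.

```python
import re


def group_lines(lines, begin_line_regex, check_length=10240):
    """Merge raw lines into logs; a line that matches the regex starts a new log."""
    pattern = re.compile(begin_line_regex)
    logs, current = [], []
    for line in lines:
        # Only the head of the line is tested (characters stand in for bytes).
        head = line[:check_length]
        if pattern.match(head) and current:
            logs.append("\n".join(current))
            current = []
        # Lines that do not match are appended to the log in progress.
        current.append(line)
    if current:
        logs.append("\n".join(current))
    return logs


lines = [
    "2021-02-03 14:18:41.968  INFO [app] [main] Demo : service start",
    "2021-02-03 14:18:41.969 ERROR [app] [main] Demo : java.lang.NullPointerException",
    "    at org.example.A.run(A.java:1)",
    "    at org.example.B.run(B.java:2)",
    "2021-02-03 14:18:41.970  INFO [app] [main] Demo : service start done",
]
logs = group_lines(lines, r"\d+-\d+-\d+.*", check_length=10)
print(len(logs))  # 3
```

The two stack lines do not start with a date, so they are merged into the ERROR log.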

  7. Preview data, configure indexes, and then click Next.
    By default, full-text indexing is enabled for Log Service. You can also configure field indexes based on collected logs in manual or automatic mode. For more information, see Configure indexes.

Examples of Logtail configurations for single-line logs

Example 1: Filter containers based on the environment variable whitelist and the environment variable blacklist

Collect stdout and stderr from the containers whose environment variable configurations include NGINX_SERVICE_PORT=80 but exclude POD_NAMESPACE=kube-system.

  1. Obtain environment variables.

    To view the environment variables of a container, you can log on to the host on which the container resides. For more information, see Obtain environment variables.

  2. Create a Logtail configuration.

    Example:

    {
        "inputs": [
            {
                "type": "service_docker_stdout",
                "detail": {
                    "Stdout": true,
                    "Stderr": true,
                    "IncludeEnv": {
                        "NGINX_SERVICE_PORT": "80"
                    },
                    "ExcludeEnv": {
                        "POD_NAMESPACE": "kube-system"
                    }
                }
            }
        ]
    }

Example 2: Filter containers based on the container label whitelist and the container label blacklist

Collect stdout and stderr from the containers whose container label is io.kubernetes.container.name=nginx.

  1. Obtain container labels.

    To view the labels of a container, you can log on to the host on which the container resides. For more information, see Obtain container labels.

  2. Create a Logtail configuration.

    Example:

    {
        "inputs": [
            {
                "type": "service_docker_stdout",
                "detail": {
                    "Stdout": true,
                    "Stderr": true,
                    "IncludeLabel": {
                        "io.kubernetes.container.name": "nginx"
                    }
                }
            }
        ]
    }

Example 3: Filter containers by using Kubernetes namespaces, pod names, and container names

Collect stdout and stderr from the nginx-log-demo-0 container in pods whose name starts with nginx-log-demo in the default namespace.

  1. Obtain different levels of Kubernetes information.
    1. Obtain information about pods.
    2. Obtain information about namespaces.
  2. Create a Logtail configuration.
    Example:
    {
        "inputs": [
            {
                "type": "service_docker_stdout",
                "detail": {
                    "Stdout": true,
                    "Stderr": true,
                    "K8sNamespaceRegex":"^(default)$",
                    "K8sPodRegex":"^(nginx-log-demo.*)$",
                    "K8sContainerRegex":"^(nginx-log-demo-0)$"
                }
            }
        ]
    }

Example 4: Filter containers by using Kubernetes labels

Collect stdout and stderr from containers whose Kubernetes labels contain the job-name key and a specific value. The value starts with nginx-log-demo.

  1. Obtain Kubernetes labels.
  2. Create a Logtail configuration.
    Example:
    {
        "inputs": [
            {
                "type": "service_docker_stdout",
                "detail": {
                    "Stdout": true,
                    "Stderr": true,
                    "IncludeK8sLabel":{
                        "job-name":"^(nginx-log-demo.*)$"
                    }
                }
            }
        ]
    }

Examples of Logtail configurations for multi-line logs

Java exception stack logs are multi-line logs. You can create a Logtail configuration to collect the Java exception stack logs based on the following descriptions:
  • Sample logs
    2021-02-03 14:18:41.968  INFO [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : service start
    2021-02-03 14:18:41.969 ERROR [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : java.lang.NullPointerException
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
    ...
    2021-02-03 14:18:41.968  INFO [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : service start done
  • Logtail configuration

    Collect the Java exception stack logs of the containers whose container label is app=monitor. The Java exception stack logs start with a date that is in a fixed format. Logtail matches only the first 10 bytes of each line to improve match efficiency. After the logs are collected to Log Service, Log Service uses regular expressions to parse the logs into fields such as time, level, module, thread, and message.

    • inputs is required and is used to configure the log collection settings for the Logtail configuration. You must configure inputs based on your data source.
      Note You can configure only one type of data source in inputs.
    • processors is optional and is used to configure the log processing settings for the Logtail configuration. You can specify one or more processing methods. For more information, see Use Logtail plug-ins to process data.
    {
    "inputs": [
      {
        "detail": {
          "BeginLineCheckLength": 10,
          "BeginLineRegex": "\\d+-\\d+-\\d+.*",
          "IncludeLabel": {
            "app": "monitor"
          }
        },
        "type": "service_docker_stdout"
      }
    ],
    "processors": [
        {
            "type": "processor_regex",
            "detail": {
                "SourceKey": "content",
                "Regex": "(\\d+-\\d+-\\d+ \\d+:\\d+:\\d+\\.\\d+)\\s+(\\w+)\\s+\\[([^]]+)]\\s+\\[([^]]+)]\\s+([\\s\\S]*)",
                "Keys": [
                    "time",
                    "level",
                    "module",
                    "thread",
                    "message"
                ],
                "NoKeyError": true,
                "NoMatchError": true,
                "KeepSource": false
            }
        }
    ]
    }
  • Parsed logs

    For example, if the collected log is 2018-02-03 14:18:41.968 INFO [spring-cloud-monitor] [nio-8080-exec-4] c.g.s.web.controller.DemoController : service start done, the log is parsed into the following fields:

    __tag__:__hostname__:logtail-dfgef
    _container_name_:monitor
    _image_name_:example.com-hangzhou.aliyuncs.xxxxxxxxxxxxxxx
    _namespace_:default
    _pod_name_:monitor-6f54bd5d74-rtzc7
    _pod_uid_:7f012b72-04c7-11e8-84aa-00163f00c369
    _source_:stdout
    _time_:2018-02-02T14:18:41.979147844Z
    time:2018-02-03 14:18:41.968
    level:INFO
    module:spring-cloud-monitor
    thread:nio-8080-exec-4
    message:c.g.s.web.controller.DemoController : service start done
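To check what the regular expression in the processor_regex settings extracts, you can run it outside Logtail. The following Python sketch copies the expression and the keys from the configuration above, with the JSON string escaping removed. It assumes that Python's re module treats this particular expression in the same way as the plug-in, which should hold for the constructs used here. The parse function is a hypothetical helper.

```python
import re

# Regex and keys copied from the processor_regex settings above.
REGEX = (r"(\d+-\d+-\d+ \d+:\d+:\d+\.\d+)\s+(\w+)\s+"
         r"\[([^]]+)]\s+\[([^]]+)]\s+([\s\S]*)")
KEYS = ["time", "level", "module", "thread", "message"]


def parse(content):
    """Return a dict of fields, or None if the log does not match."""
    m = re.fullmatch(REGEX, content)
    if m is None:
        return None
    return dict(zip(KEYS, m.groups()))


line = ("2018-02-03 14:18:41.968 INFO [spring-cloud-monitor] "
        "[nio-8080-exec-4] c.g.s.web.controller.DemoController : "
        "service start done")
for key, value in parse(line).items():
    print(f"{key}:{value}")
```

The expression has five capture groups, one for each key. The class name is therefore not extracted as a separate field and stays at the beginning of the message value.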

Log fields

The following table describes the fields that are uploaded by default for each log in a Kubernetes cluster.
Log field Description
_time_ The time at which the data is uploaded. Example: 2021-02-02T02:18:41.979147844Z.
_source_ The type of the data source. Valid values: stdout and stderr.
_image_name_ The name of the image.
_container_name_ The name of the container.
_pod_name_ The name of the pod.
_namespace_ The namespace of the pod.
_pod_uid_ The unique identifier of the pod.
_container_ip_ The IP address of the pod.