Simple Log Service: Collect container logs (standard output and files) from a cluster using a Kubernetes CRD

Last Updated: Nov 05, 2025

Defining log collection settings as a Kubernetes CustomResourceDefinition (CRD) unifies management across all clusters, including Container Service for Kubernetes (ACK) and self-managed ones. This approach replaces inconsistent, error-prone manual processes with versioned automation through kubectl or CI/CD pipelines. When combined with LoongCollector's hot reloading capability, configuration changes take effect immediately without restarting collection components. This improves Operations and Maintenance (O&M) efficiency and system maintainability.

The legacy AliyunLogConfig CRD is no longer maintained. Use the new AliyunPipelineConfig CRD instead. For a comparison of the new and legacy versions, see CRD types.
Important

Collection configurations created using a CRD can be modified only by updating the corresponding CRD. Changes made in the Simple Log Service console are not synchronized to the CRD and do not take effect.
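
For example, you can update an existing configuration by editing the CRD resource in place or by re-applying its YAML file. The resource name test-config and the file name below are placeholders:

# Edit the CRD resource directly (the lowercase resource name is derived from the kind ClusterAliyunPipelineConfig).
kubectl edit clusteraliyunpipelineconfig test-config

# Or update the YAML file and re-apply it.
kubectl apply -f test-config.yaml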

Applicability

  • Operating environment:

    • Supports ACK (managed and dedicated editions) and self-managed Kubernetes clusters.

    • Kubernetes 1.16.0 or later with support for the HostToContainer mount propagation mode.

    • Container runtime (Docker and Containerd only)

      • Docker:

        • Requires access permissions for docker.sock.

        • Standard output collection supports only the JSON log driver.

        • Supports only the overlay and overlay2 storage drivers. For other types, you must manually mount the log directories.

      • Containerd: Requires access permissions for containerd.sock.

  • Resource requirements: LoongCollector (Logtail) runs with system-cluster-critical high priority. Do not deploy it if cluster resources are insufficient, because it may evict existing pods on the node.

    • CPU: Reserve at least 0.1 Core.

    • Memory: At least 150 MB for the collection component and at least 100 MB for the controller component.

    • Actual usage depends on the collection rate, the number of monitored directories and files, and any sending blockages. Ensure that actual usage remains below 80% of the configured limit.

  • Permission requirements: The Alibaba Cloud account or RAM user used for deployment must have the AliyunLogFullAccess permission.

    To configure fine-grained permissions, create a custom policy based on the AliyunCSManagedLogRolePolicy system policy: copy the permissions from that policy and grant them to the target RAM user or role.

Collection configuration creation flow

  1. Install LoongCollector: Deploy LoongCollector as a DaemonSet to ensure that a collection container runs on each node in the cluster. This enables unified collection of logs from all containers on that node.

  2. Create a Logstore: A Logstore is a storage unit for log data. You can create multiple Logstores in a project.

  3. Create a collection configuration YAML file: Connect to the cluster using kubectl. Create the collection configuration file in one of the following ways:

    • Method 1: Use the collection configuration generator

      Use the collection configuration generator in the Simple Log Service console to enter parameters in a graphical user interface and automatically generate a standard YAML file.

    • Method 2: Manually write the YAML file

      Write a YAML file based on the examples and workflows in this topic. Start with a minimal configuration and progressively add processing logic and advanced features.

      For more information about complex use cases not covered in this topic or fields that require deep customization, see AliyunPipelineConfig parameters for a complete list of fields, value rules, and plugin capabilities.

    A complete collection configuration usually includes the following parts:

    • Minimal configuration (Required): Builds the data tunnel from the cluster to Simple Log Service. It includes two parts:

      • Inputs (inputs): Defines the source of the logs. Container logs include the following two log sources. To collect other types of logs, such as MySQL query results, see Input plugins.

        • Container standard output (stdout and stderr): Log content that the container program prints to the console.

        • Text log files: Log files written to a specified path inside the container.

      • Outputs (flushers): Defines the log destination. Sends collected logs to the specified Logstore.

        If the destination project or Logstore does not exist, the system automatically creates it. You can also manually create a project and a Logstore in advance.
    • Common processing configurations (Optional): Defines the processors field to perform structured parsing (such as regular expression or delimiter parsing), masking, or filtering on raw logs.

      This topic describes only native processing plugins that cover common log processing use cases. For more features, see Extended processing plugins.
    • Other advanced configurations (Optional): Implements features such as multi-line log collection and log tag enrichment to meet more fine-grained collection requirements.

    Structure example:

    apiVersion: telemetry.alibabacloud.com/v1alpha1 # Use the default value. Do not modify.
    kind: ClusterAliyunPipelineConfig               # Use the default value. Do not modify.
    metadata:
      name: test-config                             # Set the resource name. It must be unique within the Kubernetes cluster.
    spec:
      project:                                      # Set the name of the destination project.
        name: k8s-your-project                      
      config:                                       # Set the Logtail collection configuration.
        inputs:                                     # Set the input plugins for the Logtail collection configuration.
          ...
        processors:                                 # Set the processing plugins for the Logtail collection configuration.
          ...
        flushers:                                   # Set the output plugins for the Logtail collection configuration.
          ...
  4. Apply the configuration

    kubectl apply -f <your_yaml>
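
    After the configuration is applied, you can verify that the CRD resource exists. A quick check (the lowercase resource name is derived from the kind ClusterAliyunPipelineConfig; the available status fields may vary by component version):

    # List the applied collection configurations and inspect a specific one.
    kubectl get clusteraliyunpipelineconfig
    kubectl describe clusteraliyunpipelineconfig <your_config_name>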

Install LoongCollector (Logtail)

LoongCollector is a next-generation log collection agent from Simple Log Service (SLS) and is an upgraded version of Logtail. LoongCollector and Logtail cannot be installed at the same time. To install Logtail, see Install and configure Logtail.

This topic describes only the basic installation steps for LoongCollector. For detailed parameters, see Install LoongCollector (Kubernetes). If you have already installed LoongCollector or Logtail, skip this step and proceed to create a Logstore to store the collected logs.

ACK cluster

Install LoongCollector from the Container Service for Kubernetes (ACK) console. By default, logs are sent to a Simple Log Service (SLS) project that belongs to the current Alibaba Cloud account.

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster to open its details page.

  3. In the navigation pane on the left, click Add-ons.

  4. On the Logs And Monitoring tab, find loongcollector and click Install.

    Note

    For a new cluster, on the Component Configurations page, select Enable Log Service. Then, select Create New Project or Use Existing Project.

    After the installation is complete, SLS automatically creates related resources in the region where the ACK cluster is located. You can log on to the Simple Log Service console to view them.

    • Project: k8s-log-${cluster_id}

      A resource management unit that isolates logs for different services. To create a project for more flexible log resource management, see Create a project.

    • Machine group: k8s-group-${cluster_id}

      A collection of log collection nodes.

    • Logstore: config-operation-log

      Stores logs for the loongcollector-operator component. Its billing method is the same as that of a normal Logstore. For more information, see Billable items for the pay-by-ingested-data mode. Do not create collection configurations in this Logstore.

      Important: Do not delete this Logstore.

Self-managed cluster

  1. Connect to the Kubernetes cluster and run the command for your region to download LoongCollector and its dependent components:

    Regions in China:

    wget https://aliyun-observability-release-cn-shanghai.oss-cn-shanghai.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz; tar xvf loongcollector-custom-k8s-package.tgz; chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh

    Regions outside China:

    wget https://aliyun-observability-release-ap-southeast-1.oss-ap-southeast-1.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz; tar xvf loongcollector-custom-k8s-package.tgz; chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh
  2. Go to the loongcollector-custom-k8s-package directory and modify the ./loongcollector/values.yaml configuration file.

    # ===================== Required parameters =====================
    # The name of the project that manages collected logs. Example: k8s-log-custom-sd89ehdq.
    projectName: ""
    # The region where the project is located. Example for Shanghai: cn-shanghai
    region: ""
    # The ID of the Alibaba Cloud account that owns the project. Enclose the ID in quotation marks. Example: "123456789"
    aliUid: ""
    # The network type. Valid values: Internet and Intranet. Default value: Internet.
    net: Internet
    # The AccessKey ID and AccessKey secret of the Alibaba Cloud account or RAM user. The account or user must have the AliyunLogFullAccess system policy.
    accessKeyID: ""
    accessKeySecret: ""
    # The custom cluster ID. The ID can contain only uppercase letters, lowercase letters, digits, and hyphens (-).
    clusterID: ""
  3. In the loongcollector-custom-k8s-package directory, run the following command to install LoongCollector and other dependent components:

    bash k8s-custom-install.sh install
  4. After the installation is complete, check the running status of the components.

    If a pod fails to start, check whether the values.yaml configuration is correct and whether the relevant images were pulled successfully.
    # Check the pod status.
    kubectl get po -n kube-system | grep loongcollector-ds

    SLS also automatically creates the following resources. You can log on to the Simple Log Service console to view them.

    • Project: the value of projectName that you specified in the values.yaml file

      A resource management unit that isolates logs for different services.

    • Machine group: k8s-group-${cluster_id}

      A collection of log collection nodes.

    • Logstore: config-operation-log

      Stores logs for the loongcollector-operator component. Its billing method is the same as that of a normal Logstore. For more information, see Billable items for the pay-by-ingested-data mode. Do not create collection configurations in this Logstore.

      Important: Do not delete this Logstore.

Create a Logstore

If you have already created a Logstore, skip this step and proceed to configure collection.

  1. Log on to the Simple Log Service console and click the name of the target project.

  2. In the navigation pane on the left, choose Log Storage and click the + icon.

  3. On the Create Logstore page, complete the following core configurations:

    • Logstore Name: Set a name that is unique within the project. This name cannot be changed after creation.

    • Logstore Type: Choose Standard or Query based on a comparison of their specifications.

    • Billing Method:

      • Pay-By-Feature: Billed independently for each resource, such as storage, indexing, and read/write operations. Suitable for small-scale use cases or when feature usage is uncertain.

      • Pay-By-Ingested-Data: Billed only by the amount of raw data ingested. Provides a 30-day free storage period and free features such as data transformation and delivery. The cost model is simple and suitable for use cases where the storage period is close to 30 days or the data processing pipeline is complex.

    • Data Retention Period: Set the number of days to retain logs. The value ranges from 1 to 3650 days. A value of 3650 indicates permanent storage. The default is 30 days.

  4. Keep the default settings for other configurations and click OK. For more information about other configurations, see Manage Logstores.
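
Alternatively, you can declare the Logstore in the CRD itself so that it is created automatically when the configuration is applied, as shown in the examples in this topic. A minimal sketch (app-logstore is a placeholder name):

spec:
  project:
    name: k8s-your-project          # Destination project.
  logstores:
    - name: app-logstore            # Created automatically if it does not exist.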

Minimal configuration

In spec.config, you configure the input (inputs) and output (flushers) plugins. These plugins define the core log collection path, which includes the log source and destination.

Container standard output - new version

Purpose: Collects container standard output logs (stdout/stderr) that are printed directly to the console.

inputs plugin

The starting point of the collection configuration. Defines the log source. Currently, only one input plugin can be configured.

  • Type String (Required)

    The plugin type. Set to input_container_stdio.

  • IgnoringStderr boolean (Optional)

    Specifies whether to ignore the standard error stream (stderr). Default value: false.

    • true: Does not collect stderr.

    • false: Collects stderr.

  • IgnoringStdout boolean (Optional)

    Specifies whether to ignore the standard output stream (stdout). Default value: false.

    • true: Does not collect stdout.

    • false: Collects stdout.

Example

apiVersion: telemetry.alibabacloud.com/v1alpha1
kind: ClusterAliyunPipelineConfig
metadata:
  # Set the resource name. It must be unique within the Kubernetes cluster and is also the name of the created Logtail collection configuration.
  name: new-stdio-config
spec:
  project:
    name: test-not-exist
  logstores:
    - name: new-stdio-logstore

  # Define the LoongCollector (Logtail) collection and processing configuration.
  config:
    # --- Input plugin: Defines where to collect logs from ---
    inputs:
      # Use the input_container_stdio plugin to collect container standard output.
      - Type: input_container_stdio
        IgnoringStderr: false
        IgnoringStdout: false

    # --- Processing plugins (Optional): Define how to parse and process logs ---
    processors: []

    # --- Output plugin: Defines where to send logs ---
    flushers:
      - Type: flusher_sls    # Specify the SLS output plugin.
        Logstore: new-stdio-logstore   

flushers output plugin

Configure the flusher_sls plugin to send collected logs to a specified Logstore in a project. Currently, only one output plugin can be configured.

  • Type String (Required)

    The plugin type. Set to flusher_sls.

  • Logstore String (Required)

    The name of the destination Logstore. This determines the actual storage location of the logs.

    Note

    The specified Logstore must exist or be declared in spec.logstores.

Collect container text files

Purpose: Collects logs written to a specific file path within a container, such as traditional access.log or app.log files.

inputs plugin

The starting point of the collection configuration. Defines the log source. Currently, only one input plugin can be configured.

  • Type String (Required)

    The plugin type. Set to input_file.

  • FilePaths String (Required)

    A list of paths to the log files that you want to collect.

    • Currently, only one path can be configured.

    • Supports wildcard characters (see the sketch after the example below):

      • *: Matches file names in a single-level directory.

      • **: Recursively matches multi-level subdirectories. Can appear only once and must be before the file name.

  • MaxDirSearchDepth integer (Optional)

    When the path contains **, specifies the maximum directory depth. Default value: 0. Value range: 0 to 1000.

  • FileEncoding String (Optional)

    The file encoding format. Default value: utf8. The supported values are:

    • utf8

    • gbk

  • EnableContainerDiscovery boolean (Optional)

    Specifies whether to enable the container discovery feature. Default value: true.

    Note

    This parameter takes effect only when LoongCollector (Logtail) runs in DaemonSet mode and the collection file path is a path within the container.

Example

apiVersion: telemetry.alibabacloud.com/v1alpha1
kind: ClusterAliyunPipelineConfig
metadata:
  name: easy-row-config
spec:
  # Specify the destination project to which logs are sent.
  project:
    name: test-not-exist
  logstores:
    - name: easy-row-logstore

  # Define the LoongCollector (Logtail) collection and processing configuration.
  config:
    # Log sample (optional)
    sample: ''
    # --- Input plugin: Defines where to collect logs from ---
    inputs:
      # Use the input_file plugin to collect container text files.
      - Type: input_file         
        # ... Specific configuration for the input plugin ...
        # File path within the container
        FilePaths:
          - /var/log/text1.log
        # Maximum directory monitoring depth  
        MaxDirSearchDepth: 0
        FileEncoding: utf8  
        # Enable the container discovery feature.
        EnableContainerDiscovery: true
        

    # --- Processing plugins (Optional): Define how to parse and process logs ---
    processors: []    

    # --- Output plugin: Defines where to send logs ---
    flushers:
      - Type: flusher_sls        # Specify the SLS output plugin.
        Logstore: easy-row-logstore
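
If your logs are spread across nested directories, you can combine the wildcard rules described above with MaxDirSearchDepth. A minimal sketch, assuming a hypothetical in-container path /data/app-logs:

# ...under spec.config...
inputs:
  - Type: input_file
    FilePaths:
      # ** recursively matches subdirectories and appears once, before the file name; * matches file names.
      - /data/app-logs/**/*.log
    # Limit how deep the ** wildcard may descend.
    MaxDirSearchDepth: 3
    EnableContainerDiscovery: true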

flushers output plugin

Configure the flusher_sls plugin to send collected logs to a specified Logstore in a project. Currently, only one output plugin can be configured.

  • Type String (Required)

    The plugin type. Set to flusher_sls.

  • Logstore String (Required)

    The name of the destination Logstore. This determines the actual storage location of the logs.

    Note

    The specified Logstore must exist or be declared in spec.logstores.

Common processing configurations

After you complete the minimal configuration, you can add processors plugins to perform structured parsing, masking, or filtering on raw logs.

Core configuration: Add processors to spec.config to configure processing plugins. You can enable multiple plugins at the same time.

This topic describes only native processing plugins that cover common log processing use cases. For information about additional features, see Extended processing plugins.
Important

For Logtail 2.0 and later versions and the LoongCollector component, follow these plugin combination rules:

  • Use native plugins first.

  • If native plugins cannot meet your needs, configure extension plugins after the native plugins (see the example after this list).

  • Native plugins can be used only before extension plugins.
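
For example, the following ordering is valid under these rules: a native parsing plugin runs first, and an extension plugin such as processor_json (described later in this topic) runs after it. The payload field is a hypothetical field produced by the first plugin:

# ...under spec.config...
processors:
  # Native plugin first: parse the raw JSON line into key-value pairs.
  - Type: processor_parse_json_native
    SourceKey: content
  # Extension plugin second: expand the nested JSON in the payload field.
  - Type: processor_json
    SourceKey: payload
    ExpandDepth: 1
    KeepSource: true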

Structured configuration

Regular expression parsing

Use a regular expression to extract log fields and parse the log into key-value pairs.

Key fields

Example

Type String (Required)

The plugin type. Set this to processor_parse_regex_native.

 # ...under spec.config...
 processors:
  # Use the regular expression parsing plugin to parse log content.
  - Type: processor_parse_regex_native
    # The source of the raw log field, typically content.
    SourceKey: content

    # The regular expression used to match and extract log fields.
    Regex: >-
      (\S+)\s-\s(\S+)\s\[([^]]+)\]\s"(\w+)\s(\S+)\s([^"]+)"\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+).*

    # The list of extracted fields, which correspond to the regex groups in order.
    Keys:
      - remote_addr
      - remote_user
      - time_local
      - request_method
      - request_uri
      - request_protocol
      - status
      - body_bytes_sent
      - http_referer
      - http_user_agent

    # Specifies whether to keep the source field if parsing fails.
    KeepingSourceWhenParseFail: true

    # Specifies whether to keep the source field if parsing succeeds.
    KeepingSourceWhenParseSucceed: true

    # If the source field is kept, you can specify a new name for it.
    RenamedSourceKey: fail

SourceKey String (Required)

The name of the source field.

Regex String (Required)

The regular expression that matches the log.

Keys String (Required)

A list of the extracted fields.

KeepingSourceWhenParseFail boolean (Optional)

Specifies whether to keep the source field when parsing fails. The default value is false.

KeepingSourceWhenParseSucceed boolean (Optional)

Specifies whether to keep the source field when parsing succeeds. The default value is false.

RenamedSourceKey String (Optional)

When the source field is kept, this parameter specifies the new name for the field. By default, the field is not renamed.

Delimiter parsing

Structures log content using a delimiter and parses the content into key-value pairs. This method supports single-character and multi-character delimiters.

Key field details

Example

Type String (Required)

The plugin type. Set this to processor_parse_delimiter_native.

# ...under spec.config...
processors:
  # Delimiter parsing plugin configuration
  - Type: processor_parse_delimiter_native
    # The source of the raw field, typically content
    SourceKey: content

    Separator: ','

    Quote: '"'

    # Define the names for the extracted fields in order.
    Keys:
      - time
      - ip
      - request
      - status
      - size
      - user_agent

SourceKey String (Required)

The name of the source field.

Separator String (Required)

The field separator. For example, CSV files use a comma (,).

Keys [String] (Required)

A list of the extracted fields.

Quote String (Optional)

The quote character. Use this to wrap field content that contains special characters, such as a comma.

AllowingShortenedFields boolean (Optional)

Specifies whether the number of extracted fields can be less than the number of keys. The default value is true. If set to false, this scenario is considered a parsing failure.

OverflowedFieldsTreatment String (Optional)

Specifies the action to take when the number of extracted fields is greater than the number of keys. The default value is extend. Valid values include the following:

  • extend: Keeps the extra fields. Each extra field is added to the log as a separate field. The name for an extra field is _column$i_, where $i is the ordinal number of the extra field, starting from 0.

  • keep: Keeps the extra fields, but adds the extra content as a single field to the log. The field name is _column0_.

  • discard: Discards the extra fields.

KeepingSourceWhenParseFail boolean (Optional)

Specifies whether to keep the source field when parsing fails. The default value is false.

KeepingSourceWhenParseSucceed boolean (Optional)

Specifies whether to keep the source field when parsing succeeds. The default value is false.

RenamedSourceKey String (Optional)

When the source field is kept, this parameter specifies the new name for the field. By default, the field is not renamed.

Standard JSON parsing

Structures object-type JSON logs and parses them into key-value pairs.

Key field details

Example

Type String (Required)

The plugin type. Set this to processor_parse_json_native.

# ...under spec.config...
processors:
  # JSON parsing plugin configuration
  - Type: processor_parse_json_native
    # The source of the raw log field
    SourceKey: content

SourceKey String (Required)

The name of the source field.

KeepingSourceWhenParseFail boolean (Optional)

Specifies whether to keep the source field when parsing fails. The default value is false.

KeepingSourceWhenParseSucceed boolean (Optional)

Specifies whether to keep the source field when parsing succeeds. The default value is false.

RenamedSourceKey String (Optional)

When the source field is kept, this parameter specifies the new name for the field. By default, the field is not renamed.

Nested JSON parsing

Parses nested JSON logs into key-value pairs by specifying an expansion depth.

Key field details

Example

Type String (Required)

The plugin type. Set this to processor_json.

# ...under spec.config...
processors:
  # Configure the JSON field expansion plugin.
  - Type: processor_json
    # Specify the name of the source field to parse.
    SourceKey: content
    
    ExpandDepth: 0

    ExpandConnector: '_'

    Prefix: expand

    IgnoreFirstConnector: false

    # Specifies whether to expand array elements into separate fields.
    ExpandArray: false

    # Specifies whether to keep the source field content.
    KeepSource: true

    # Specifies whether to report an error if the source field is missing.
    NoKeyError: true

    # Specifies whether to use the source field name as a prefix for the expanded field names.
    UseSourceKeyAsPrefix: false

    # Specifies whether to keep the raw log data if JSON parsing fails.
    KeepSourceIfParseError: true

SourceKey String (Required)

The name of the source field.

ExpandDepth integer (Optional)

The JSON expansion depth. The default value is 0.

  • 0: Expands to the deepest parsable level.

  • 1: Expands only the current level, and so on.

ExpandConnector String (Optional)

The connector used for field names during JSON expansion. The default value is an underscore (_).

Prefix String (Optional)

A prefix for the names of the expanded JSON fields.

IgnoreFirstConnector boolean (Optional)

Specifies whether to ignore the first connector. This determines if a connector is added before the top-level field name. The default value is false.

ExpandArray boolean (Optional)

Specifies whether to expand array types. The default value is false.

  • false (default): The array is not expanded.

  • true: The array is expanded. For example, {"k":["1","2"]} is expanded to {"k[0]":"1","k[1]":"2"}.

Note

This parameter is supported in Logtail 1.8.0 and later.

KeepSource boolean (Optional)

Specifies whether to keep the source field in the parsed log. The default value is true.

  • true: Keep

  • false: Discard

NoKeyError boolean (Optional)

Specifies whether to report an error if the specified source field is not found in the raw log. The default value is true.

  • true: Report an error.

  • false: Do not report an error.

UseSourceKeyAsPrefix boolean (Optional)

Specifies whether to use the source field name as a prefix for all expanded JSON field names.

KeepSourceIfParseError boolean (Optional)

Specifies whether to keep the raw log data if parsing fails. The default value is true.

  • true: Keep

  • false: Discard

JSON array parsing

Use the json_extract function to extract JSON objects from a JSON array. For more information, see JSON functions.

Key field details

Example

Type String (Required)

The plugin type. For the SLS Processing Language (SPL) plugin, set this to processor_spl.

# ...under spec.config...
processors:
  # Use an SPL script to process log fields.
  - Type: processor_spl
    # Script timeout in milliseconds.
    TimeoutMilliSeconds: 1000

    # The SPL script used to extract elements from the JSON array in the content field.
    Script: >-
      * | extend
        json1 = json_extract(content, '$[0]'),
        json2 = json_extract(content, '$[1]')

Script String (Required)

The content of the SPL script. This script is used to extract elements from the JSON array in the content field.

TimeoutMilliSeconds integer (Optional)

The script timeout period in milliseconds. The value must be in the range of 0 to 10,000. The default value is 1,000.

NGINX log parsing

Structures log content into key-value pairs based on the definition in log_format. If the default format does not meet your requirements, you can use a custom format.

Key field details

Example

Type String (Required)

The plugin type. For NGINX log parsing, set this to processor_parse_regex_native.

# ...under spec.config...
processors:
  # NGINX log parsing plugin configuration
  - Type: processor_parse_regex_native
    # The source of the raw log field
    SourceKey: content
    
    # Regular expression parsing rule
    Regex: >-
      (\S*)\s*-\s*(\S*)\s*\[(\d+/\S+/\d+:\d+:\d+:\d+)\s+\S+\]\s*"(\S+)\s+(\S+)\s+\S+"\s*(\S*)\s*(\S*)\s*(\S*)\s*(\S*)\s*"([^"]*)"\s*"([^"]*)".*
    
    # Extracted field mapping
    Keys:
      - remote_addr
      - remote_user
      - time_local
      - request_method
      - request_uri
      - request_time
      - request_length
      - status
      - body_bytes_sent
      - http_referer
      - http_user_agent
    
    # NGINX-specific configuration
    Extra:
      Format: >-
        log_format main  '$remote_addr - $remote_user [$time_local]
        "$request" ''$request_time $request_length ''$status
        $body_bytes_sent "$http_referer" ''"$http_user_agent"';
      LogType: NGINX

SourceKey String (Required)

The name of the source field.

Regex String (Required)

The regular expression.

Keys String (Required)

A list of the extracted fields.

Extra

  • Format String (Required)

    The log configuration section from the NGINX configuration file. This section must start with log_format.

    In a production environment, the log_format definition here must match the definition in the NGINX configuration file, which is typically /etc/nginx/nginx.conf.
  • LogType String (Required)

    The type of log to parse. Set this to NGINX.

KeepingSourceWhenParseFail boolean (Optional)

Specifies whether to keep the source field when parsing fails. The default value is false.

KeepingSourceWhenParseSucceed boolean (Optional)

Specifies whether to keep the source field when parsing succeeds. The default value is false.

RenamedSourceKey String (Optional)

When the source field is kept, this parameter specifies the new name for the field. By default, the field is not renamed.

Apache log parsing

Structures log content into key-value pairs based on the definition in the Apache log configuration file.

Key field details

Example

Type String (Required)

The plugin type. Set this to processor_parse_regex_native.

# ...under spec.config...
processors:
  # Configure the Apache Combined log parsing plugin (based on regular expressions).
  - Type: processor_parse_regex_native
    # The source of the raw log field, typically content.
    SourceKey: content

    # The regular expression used to match and extract Apache combined format logs.
    Regex: >-
      ([0-9.-]+)\s([\w.-]+)\s([\w.-]+)\s(\[[^\[\]]+\]|-)\s"(\S+)\s(\S+)\s([^"]+)"\s(\d{3}|-)\s(\d+|-)\s"([^"]*)"\s"([^"]*)".*

    # The list of extracted fields, which correspond to the regex groups in order.
    Keys:
      - remote_addr
      - remote_ident
      - remote_user
      - time_local
      - request_method
      - request_uri
      - request_protocol
      - status
      - response_size_bytes
      - http_referer
      - http_user_agent

    # Additional plugin information (optional, used to describe the log format)
    Extra:
      Format: >-
        LogFormat "%h %l %u %t \"%r\" %>s %b
        \"%{Referer}i\" \"%{User-Agent}i\"" combined
      LogType: Apache
      SubType: combined

SourceKey String (Required)

The name of the source field.

Regex String (Required)

The regular expression.

Keys String (Required)

A list of the extracted fields.

Extra

  • Format String (Required)

    The log configuration section from the Apache configuration file. This section usually starts with LogFormat.

  • LogType String (Required)

    The type of log to parse. Set this to Apache.

  • SubType String (Required)

    The log format.

    • common

    • combined

    • custom

KeepingSourceWhenParseFail boolean (Optional)

Specifies whether to keep the source field when parsing fails. The default value is false.

KeepingSourceWhenParseSucceed boolean (Optional)

Specifies whether to keep the source field when parsing succeeds. The default value is false.

RenamedSourceKey String (Optional)

When the source field is kept, this parameter specifies the new name for the field. By default, the field is not renamed.

Data masking

Use the processor_desensitize_native plugin to mask sensitive data in logs.

Key fields

Example

Type String (Required)

The plugin type. Set the value to processor_desensitize_native.

# ...Under spec.config...
processors:
  # Configure the native log masking plugin.
  - Type: processor_desensitize_native

    # The source field name.
    SourceKey: content

    # The masking method. const replaces sensitive data with a constant string.
    Method: const

    # The constant string used for replacement.
    ReplacingString: '********'

    # The regular expression for the content that precedes the sensitive data.
    ContentPatternBeforeReplacedString: 'password'':'''

    # The regular expression for the sensitive data to be replaced.
    ReplacedContentPattern: '[^'']*'

    # Specifies whether to replace all matches. Default: true.
    ReplacingAll: true

SourceKey String (Required)

The source field name.

Method String (Required)

The masking method. Valid values:

  • const: Replaces sensitive content with a constant string.

  • md5: Replaces sensitive content with its MD5 hash.

ReplacingString String (Optional)

The constant string used to replace sensitive content. This parameter is required when Method is set to const.

ContentPatternBeforeReplacedString String (Required)

The regular expression for the prefix of the sensitive content.

ReplacedContentPattern String (Required)

The regular expression for the sensitive content.

ReplacingAll boolean (Optional)

Specifies whether to replace all matched content. The default value is true.

Content filtering

Configure the processor_filter_regex_native plugin to match log field values based on a regular expression and keep only the logs that meet the conditions.

Key fields

Example

Type String (Required)

The plugin type. The value is processor_filter_regex_native.

# ...Under spec.config...
processors:
  # Configure the regular expression filtering plugin (can be used for log masking or sensitive word filtering).
  - Type: processor_filter_regex_native

    # Define a list of regular expressions to match the content of log fields.
    FilterRegex:
      # Example: Match content that contains "WARNING" or "ERROR" in the log field value.
      - WARNING|ERROR

    # Specify the log field name to match. The example filters the level field.
    FilterKey:
      - level

FilterRegex String (Required)

The regular expression to match the log field.

FilterKey String (Required)

The name of the log field to match.

Time parsing

Configure the processor_parse_timestamp_native plugin to parse the time field in a log and set the parsing result as the log's __time__ field.

Key fields

Example

Type String (Required)

The plugin type. Set to processor_parse_timestamp_native.

# ...Under spec.config...
processors:
  # Configure the native time parsing plugin.
  - Type: processor_parse_timestamp_native
    # Source of the raw log field, usually content
    SourceKey: content

    # Time format definition, must exactly match the format of the time field in the log.
    SourceFormat: '%Y-%m-%d %H:%M:%S'
    
    SourceTimezone: 'GMT+00:00'

SourceKey String (Required)

The source field name.

SourceFormat String (Required)

Time format. This format must exactly match the format of the time field in the log.

SourceTimezone String (Optional)

The time zone of the log time. By default, the machine's time zone is used, which is the time zone of the environment where the LoongCollector process is located.

Format:

  • GMT+HH:MM: A time zone east of UTC (for example, GMT+08:00).

  • GMT-HH:MM: A time zone west of UTC.

Other advanced configurations

After you complete the minimal configuration, you can perform the following operations to collect multi-line logs, configure log topic types, and configure more fine-grained log collection. The following are common advanced configurations and their functions:

  • Configure multi-line log collection: When a single log entry, such as an exception stack trace, spans multiple lines, you need to enable multi-line mode and configure a regular expression for the start of a line to match the beginning of a log. This ensures that the multi-line entry is collected and stored as a single log in an SLS Logstore.

  • Configure log topic types: Set different topics for different log streams to organize and categorize log data. This helps you better manage and retrieve relevant logs.

  • Specify containers for collection (filtering and blacklists): Specify specific containers and paths for collection, including whitelist and blacklist configurations.

  • Enrich log tags: Add metadata related to environment variables and pod labels to logs as extended fields.

Configure multi-line log collection

By default, Simple Log Service uses single-line mode, which splits and stores logs line by line. This causes multi-line logs that contain stack trace information to be split, with each line stored and displayed as an independent log, which is not conducive to analysis.

To address this issue, you can enable multi-line mode to change how Simple Log Service splits logs. By configuring a regular expression to match the start of a log entry, you can ensure that raw logs are split and stored according to the start-of-line rule.

Core configuration: In the spec.config.inputs configuration, add the Multiline parameter.

Key fields

Example

Multiline

Enables multi-line log collection.

  • Mode

    The mode selection. Default value: custom.

    • custom: Indicates a custom regular expression to match the start of a line.

    • JSON: Multi-line JSON.

  • StartPattern

    The regular expression for the start of a line. This is required when Mode is set to custom.

# ...Under spec.config...
inputs:
  - Type: input_file
    # Enable multi-line log collection.
    Multiline:
      # Mode selection: custom indicates a custom regular expression to match the start of a line.
      Mode: custom
      # The regular expression matches the start of each log entry (the marker for a new log).
      StartPattern: '\d+-\d+-\d+\s\d+:\d+:\d+'

Configure log topic types

Core configuration: In spec.config, add the global parameter to set the topic.

Key fields

Example

TopicType

The topic type. Optional values:

  • machine_group_topic: Machine group topic, used to distinguish logs from different machine groups.

  • filepath: File path extraction, used to distinguish log data generated by different users or applications.

  • custom: Custom, uses a custom static log topic.

Machine group topic

spec:
  config:
    global:
      # Use the machine group topic to which this configuration is applied as the topic.
      TopicType: machine_group_topic

File path extraction

spec:
  config:
    global:
      TopicType: filepath
      # Topic format. Required when TopicType is set to filepath or custom.
      # The extraction results are __topic__: userA, __topic__: userB, and __topic__: userC.
      TopicFormat: \/data\/logs\/(.*)\/serviceA\/.*

Custom

spec:
  config:
    global:
      TopicType: custom
      # Topic format. Required when TopicType is set to filepath or custom.
      # Replace <custom_topic_name> with your static topic name.
      TopicFormat: customized://<custom_topic_name>

TopicFormat

The topic format. This is required when TopicType is set to filepath or custom.

Specify containers for collection (filtering and blacklists)

Filtering

Collects logs only from containers that meet the specified conditions. Multiple conditions are combined with a logical AND. An empty condition is ignored. Conditions support regular expressions.

Core configuration: In spec.config.inputs, configure the ContainerFilters parameters for container filtering.

Key field details

Example

ContainerFilters

Container filtering

  • Pod label blacklists and whitelists

    • IncludeK8sLabel

      K8s pod label whitelist. Specifies the containers from which to collect logs.

    • ExcludeK8sLabel

      K8s pod label blacklist. Excludes log collection from containers that meet specific conditions.

  • Environment variable blacklists and whitelists

    • IncludeEnv

      Environment variable whitelist

    • ExcludeEnv

      Environment variable blacklist

  • Pod, namespace, and container name matching with regular expressions

    • K8sNamespaceRegex

      Namespace regular expression matching

    • K8sPodRegex

      Pod name regular expression matching

    • K8sContainerRegex

      Container name regular expression matching

All regular expression matching uses Go's RE2 engine. This engine has fewer features than engines such as PCRE. Write regular expressions according to the limits described in Appendix: Regular expression limits (container filtering).
# ...Under spec.config...
inputs:
  - Type: input_file # or input_container_stdio
    # If the input plugin type is input_file, set EnableContainerDiscovery to true.
    EnableContainerDiscovery: true
    # Container filtering
    ContainerFilters:
      # K8s pod label whitelist: Specifies the containers from which to collect logs.
      IncludeK8sLabel:
        # Example: Match all pods that have the app label with a value of nginx or redis.
        app: ^(nginx|redis)$

      # K8s pod label blacklist: Excludes log collection from containers that meet specific conditions.
      ExcludeK8sLabel:
        # Example: Exclude all pods with the app:test label.
        app: test
      
      # Environment variable whitelist
      IncludeEnv:
        # Match all containers with NGINX_SERVICE_PORT=80 or NGINX_SERVICE_PORT=6379.
        NGINX_SERVICE_PORT: ^(80|6379)$

      # Environment variable blacklist
      ExcludeEnv:
        # Exclude all containers with ENVIRONMENT=test.
        ENVIRONMENT: test
      
      # Namespace regex matching. Example: Match all containers in the default and nginx namespaces.
      K8sNamespaceRegex: ^(default|nginx)$
      # Pod name regex matching. Example: Match containers in all pods whose names start with nginx-log-demo.
      K8sPodRegex: ^(nginx-log-demo.*)$
      # Container name regex matching. Example: Match all containers named container-test.
      K8sContainerRegex: ^(container-test)$

Blacklist

To exclude files that meet specified conditions, use the following parameters under config.inputs in the YAML file as needed:

Key field details

Example

# ...Under spec.config...
inputs:
  - Type: input_file
    # Blacklist for file paths. Excludes files that meet specified conditions. The path must be an absolute path and supports the asterisk (*) wildcard character.
    ExcludeFilePaths:
      - /var/log/*.log

    # Blacklist for file names. Excludes files that meet specified conditions. Supports the asterisk (*) wildcard character.
    ExcludeFiles:
      - test

    # Blacklist for directories. Excludes files that meet specified conditions. The path must be an absolute path and supports the asterisk (*) wildcard character.
    ExcludeDirs:
      - /var/log/backup*               

ExcludeFilePaths

Blacklist for file paths. Excludes files that meet specified conditions. The path must be an absolute path. The asterisk (*) wildcard character is supported.

ExcludeFiles

Blacklist for file names. Excludes files that meet specified conditions. The asterisk (*) wildcard character is supported.

ExcludeDirs

Blacklist for directories. Excludes files that meet specified conditions. The path must be an absolute path. The asterisk (*) wildcard character is supported.

Enrich log tags

Core configuration: By configuring ExternalEnvTag and ExternalK8sLabelTag in spec.config.inputs, you can add tags related to container environment variables and Pod labels to logs.

Key fields

Example

ExternalEnvTag

Maps the value of a specified environment variable to a tag field. Format: <environment_variable_name>: <tag_name>.

# ...Under spec.config...
inputs:
  - Type: input_file # or input_container_stdio
    ExternalEnvTag:
      <environment_variable_name>: <tag_name>
    
    ExternalK8sLabelTag:
      <pod_label_name>: <tag_name>          

ExternalK8sLabelTag

Maps the value of a Kubernetes Pod label to a tag field. Format: <pod_label_name>: <tag_name>.
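
A concrete sketch, assuming a container that sets an APP_VERSION environment variable and a pod that carries an app label (both names are illustrative):

# ...Under spec.config...
inputs:
  - Type: input_file # or input_container_stdio
    # Add the value of the APP_VERSION environment variable to each log as the tag app_version.
    ExternalEnvTag:
      APP_VERSION: app_version
    # Add the value of the pod label app to each log as the tag app_name.
    ExternalK8sLabelTag:
      app: app_name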

Configuration examples

Scenario 1: Collect and parse NGINX access logs into structured fields

Parses NGINX logs and structures the log content into multiple key-value pairs based on the definition in log_format.

Complete YAML example

apiVersion: telemetry.alibabacloud.com/v1alpha1
kind: ClusterAliyunPipelineConfig
metadata:
  name: nginx-config
spec:
  config:
    aggregators: []
    global: {}
    inputs:
      - Type: input_file
        FilePaths:
          - /root/log/text1.log
        MaxDirSearchDepth: 0
        FileEncoding: utf8
        EnableContainerDiscovery: true
    processors:
      - Type: processor_parse_regex_native
        SourceKey: content
        Regex: >-
          (\S*)\s*-\s*(\S*)\s*\[(\d+/\S+/\d+:\d+:\d+:\d+)\s+\S+\]\s*"(\S+)\s+(\S+)\s+\S+"\s*(\S*)\s*(\S*)\s*(\S*)\s*(\S*)\s*"([^"]*)"\s*"([^"]*)".*
        Keys:
          - remote_addr
          - remote_user
          - time_local
          - request_method
          - request_uri
          - request_time
          - request_length
          - status
          - body_bytes_sent
          - http_referer
          - http_user_agent
        Extra:
          Format: >-
            log_format main  '$remote_addr - $remote_user [$time_local]
            "$request" ''$request_time $request_length ''$status
            $body_bytes_sent "$http_referer" ''"$http_user_agent"';
          LogType: NGINX
    flushers:
      - Type: flusher_sls
        Logstore: my-log-logstore
    sample: >-
      192.168.*.* - - [15/Apr/2025:16:40:00 +0800] "GET /nginx-logo.png
      HTTP/1.1" 0.000 514 200 368 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)
      AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.*.* Safari/537.36"
  project:
    name: my-log-project
  logstores:
    - name: my-log-logstore
    

Scenario 2: Collect and process multi-line logs

By default, Simple Log Service uses single-line mode, which splits and stores logs line by line. This causes multi-line logs that contain stack trace information to be split, with each line stored and displayed as an independent log, which is not conducive to analysis.

To address this issue, you can enable multi-line mode to change how Simple Log Service splits logs. By configuring a regular expression to match the start of a log entry, you can ensure that raw logs are split and stored according to the start-of-line rule. The following is an example:

Complete YAML example

apiVersion: telemetry.alibabacloud.com/v1alpha1
kind: ClusterAliyunPipelineConfig
metadata:
  name: multiline-config
spec:
  config:
    aggregators: []
    global: {}
    inputs:
      - Type: input_file
        FilePaths:
          - /root/log/text1.log
        MaxDirSearchDepth: 0
        FileEncoding: utf8
        Multiline:
          StartPattern: '\[\d+-\d+-\w+:\d+:\d+,\d+]\s\[\w+]\s.*'
          Mode: custom
          UnmatchedContentTreatment: single_line
        EnableContainerDiscovery: true
    processors: []
    flushers:
      - Type: flusher_sls
        Logstore: my-log-logstore
    sample: |-
      [2023-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened
          at TestPrintStackTrace.f(TestPrintStackTrace.java:3)
          at TestPrintStackTrace.g(TestPrintStackTrace.java:7)
          at TestPrintStackTrace.main(TestPrintStackTrace.java:16)
  project:
    name: my-log-project
  logstores:
    - name: my-log-logstore

FAQ

How do I send logs from an ACK cluster to a project in another Alibaba Cloud account?

Manually install the Simple Log Service LoongCollector (Logtail) component in the ACK cluster and configure it with the destination account's Alibaba Cloud account ID or access credential (AccessKey). This lets you send container logs to a Simple Log Service project in another Alibaba Cloud account.

Scenario: For reasons such as organizational structure, permission isolation, or unified monitoring, you need to collect log data from an ACK cluster to a Simple Log Service project in a separate Alibaba Cloud account. To do this, manually install LoongCollector (Logtail) for cross-account configuration.

Procedure: This section uses the manual installation of LoongCollector as an example. For information about how to install Logtail, see Install and configure Logtail.

  1. Connect to the Kubernetes cluster and run the command for your region to download LoongCollector and its dependent components:

    Regions in China:

    wget https://aliyun-observability-release-cn-shanghai.oss-cn-shanghai.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz; tar xvf loongcollector-custom-k8s-package.tgz; chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh

    Regions outside China:

    wget https://aliyun-observability-release-ap-southeast-1.oss-ap-southeast-1.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz; tar xvf loongcollector-custom-k8s-package.tgz; chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh
  2. Go to the loongcollector-custom-k8s-package directory and modify the ./loongcollector/values.yaml configuration file.

    # ===================== Required parameters =====================
    # The name of the project that manages collected logs. Example: k8s-log-custom-sd89ehdq.
    projectName: ""
    # The region where the project is located. Example for Shanghai: cn-shanghai
    region: ""
    # The ID of the Alibaba Cloud account that owns the project. Enclose the ID in quotation marks. Example: "123456789"
    aliUid: ""
    # The network type. Valid values: Internet and Intranet. Default value: Internet.
    net: Internet
    # The AccessKey ID and AccessKey secret of the Alibaba Cloud account or RAM user. The account or user must have the AliyunLogFullAccess system policy.
    accessKeyID: ""
    accessKeySecret: ""
    # The custom cluster ID. The ID can contain only uppercase letters, lowercase letters, digits, and hyphens (-).
    clusterID: ""
  3. In the loongcollector-custom-k8s-package directory, run the following command to install LoongCollector and other dependent components:

    bash k8s-custom-install.sh install
  4. After the installation is complete, check the running status of the components.

    If a pod fails to start, check whether the values.yaml configuration is correct and whether the relevant images were pulled successfully.
    # Check the pod status.
    kubectl get po -n kube-system | grep loongcollector-ds

    SLS also automatically creates the following resources. You can log on to the Simple Log Service console to view them.

    • Project: the value of projectName that you specified in the values.yaml file

      A resource management unit that isolates logs for different services.

    • Machine group: k8s-group-${cluster_id}

      A collection of log collection nodes.

    • Logstore: config-operation-log

      Stores logs for the loongcollector-operator component. Its billing method is the same as that of a normal Logstore. For more information, see Billable items for the pay-by-ingested-data mode. Do not create collection configurations in this Logstore.

      Important: Do not delete this Logstore.

How do I collect logs from a single log file or container standard output stream with multiple collection configurations?

By default, to prevent data duplication, Simple Log Service restricts each log source to a single collection configuration:

  • A text log file can match only one Logtail collection configuration.

  • The standard output (stdout) of a container can be collected by only one standard output collection configuration.

  1. Log on to the Simple Log Service console and go to the target project.

  2. In the navigation pane, choose Log Storage and find the target Logstore.

  3. Click the icon next to its name to expand the Logstore.

  4. Click Logtail Configurations. In the configuration list, find the target Logtail configuration and click Manage Logtail Configuration in the Actions column.

  5. On the Logtail configuration page, click Edit and scroll down to the Input Configurations section:

    • To collect logs from text files: Enable Allow File To Be Collected Multiple Times.

    • To collect container standard output: Enable Allow Standard Output To Be Collected Multiple Times.

Appendix: Regular expression usage limits (container filtering)

The regular expressions used for container filtering are based on Go's RE2 engine, which has syntax limitations compared to other engines such as PCRE. Keep the following points in mind when you write regular expressions:

1. Differences in named group syntax

Go uses the (?P<name>...) syntax to define named groups. It does not support the (?<name>...) syntax from PCRE.

  • Correct example: (?P<year>\d{4})

  • Incorrect example: (?<year>\d{4})

2. Unsupported regular expression features

The following common but complex regular expression features are not available in RE2. Avoid using them:

  • Assertions: (?=...), (?!...), (?<=...), and (?<!...)

  • Conditional expressions: (?(condition)true|false)

  • Recursive matching: (?R) and (?0)

  • Subroutine references: (?&name) and (?P>name)

  • Atomic groups: (?>...)

3. Recommendations

When you debug regular expressions with a tool such as Regex101, select the Golang (RE2) mode for validation to ensure compatibility. If you use any unsupported syntax, the plugin cannot parse or match the expression correctly.
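
For example, instead of using a lookahead such as ^(?=.*nginx).* to match pods whose names contain nginx (unsupported by RE2), match the substring directly. The pattern below is illustrative:

# ...Under spec.config.inputs...
ContainerFilters:
  # RE2-compatible equivalent of a "contains nginx" lookahead:
  K8sPodRegex: .*nginx.*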