
Container Service for Kubernetes: Collect container logs (stdout/stderr or files) from Kubernetes clusters using the console

Last Updated: Mar 12, 2026

In Kubernetes environments, container logs are scattered and difficult to manage centrally. This leads to low troubleshooting efficiency and high operational costs. Deploy LoongCollector as a DaemonSet and create a collection configuration in the Simple Log Service console to unify log collection and structured processing. This improves the efficiency of log retrieval, issue diagnosis, and observability analysis.

Applicable scenarios

  • Runtime environment:

    • Supports Alibaba Cloud Container Service for Kubernetes (ACK), including both managed and dedicated clusters, along with self-managed Kubernetes clusters.

    • Kubernetes version must be 1.10.0 or later and support Mount propagation: HostToContainer.

    • Container runtime: Supports Docker and Containerd only.

      • Docker:

        • Must have permission to access docker.sock.

        • Stdout collection supports only JSON-type log drivers.

        • Storage drivers support only overlay and overlay2. For other types, manually mount the log directory.

      • Containerd: Must have permission to access containerd.sock.

  • Resource requirements: LoongCollector (Logtail) runs with system-cluster-critical priority. Do not deploy it if cluster resources are insufficient. Otherwise, existing pods on the node might be evicted.

    • CPU: Reserve at least 0.1 core.

    • Memory: The collection component requires at least 150 MB, and the controller component requires at least 100 MB.

    • Actual usage depends on collection rate, number of monitored directories and files, and send-blocking level. Ensure actual usage stays below 80% of the limit.

  • Permission requirements: The Alibaba Cloud account or RAM user used for deployment must have the AliyunLogFullAccess permission.

    To create a custom policy, refer to the AliyunCSManagedLogRolePolicy system policy. Copy its permissions and assign them to the target RAM user or role for fine-grained permission control.

Collection configuration workflow

  1. Install LoongCollector: Deploy LoongCollector in DaemonSet mode to run one collector container on each cluster node, collecting logs from all containers on that node.

    For Sidecar mode, see Collect Kubernetes pod text logs (Sidecar mode).
  2. Create a Logstore: Use it to store collected logs.

  3. Create and configure log collection rules

    1. Global and input configuration: Define the collection configuration name and specify log sources and scope.

    2. Log processing and structuring: Configure processing based on log format.

      • Multiline logs: Applies when a single log entry spans multiple lines (such as Java exception stacks or Python tracebacks). Use a line-start regular expression to identify the start of each log entry.

      • Structured parsing: Use parsing plugins (such as regular expressions, delimiters, or NGINX mode) to extract raw strings into structured key-value pairs for easier querying and analysis.

    3. Log filtering: Configure collection blacklists and content filters to select relevant logs, reducing redundant data transmission and storage.

    4. Log classification: Use topics and tagging to distinguish logs from different applications, containers, or paths.

  4. Query and analysis configuration: Full-text indexing is enabled by default for keyword searches. Enable field indexing for precise queries and analysis on structured fields to improve retrieval efficiency.

  5. Verification and troubleshooting: After configuration, verify successful log collection. If you encounter issues like no data collected, heartbeat failures, or parsing errors, see Troubleshooting common issues.

Step 1: Install LoongCollector

LoongCollector is the next-generation log collection agent from Alibaba Cloud Simple Log Service and an upgraded version of Logtail. The two agents cannot coexist. To install Logtail instead, see Install, run, upgrade, and uninstall Logtail.

This topic covers only the basic installation process for LoongCollector. For detailed parameters, see Installation and configuration. If LoongCollector or Logtail is already installed, skip this step and go directly to Step 2: Create a Logstore.

Note

If the host machine time changes while LoongCollector (Logtail) is running, logs might be duplicated or lost.

ACK clusters

Install LoongCollector through the Container Service console. By default, logs are sent to the Simple Log Service project under your current Alibaba Cloud account.

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. Click the target cluster name to go to its details page.

  3. In the navigation pane on the left, click Add-ons.

  4. Switch to the Logs and Monitoring tab. Find loongcollector and click Install.

    Note

    When creating a cluster, you can select Component Configurations and check Enable Log Service. You can then choose to Create Project or Select Project.

    After installation, Simple Log Service automatically creates the following resources under your current account. You can view them in the Simple Log Service console.

    • Project (k8s-log-${cluster_id}): The resource management unit, which isolates logs from different businesses. To create your own project for more flexible log resource management, see Create a project.

    • Machine group (k8s-group-${cluster_id}): The collection of log collection nodes.

    • Logstore (config-operation-log): Stores logs from the loongcollector-operator component. It uses the same billing method as a standard Logstore. For details, see Billing items for the pay-by-ingested-data metering method. We recommend that you do not create collection configurations in this Logstore.

      Important: Do not delete this Logstore.

Self-managed clusters

  1. Connect to your Kubernetes cluster and run the corresponding command based on the region:

    Regions in China

    wget https://aliyun-observability-release-cn-shanghai.oss-cn-shanghai.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz; tar xvf loongcollector-custom-k8s-package.tgz; chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh

    Regions outside China

    wget https://aliyun-observability-release-ap-southeast-1.oss-ap-southeast-1.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz; tar xvf loongcollector-custom-k8s-package.tgz; chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh
  2. Go to the loongcollector-custom-k8s-package directory and modify the configuration file ./loongcollector/values.yaml.

    # ===================== Required fields =====================
    # Project name for this cluster, for example, k8s-log-custom-sd89ehdq
    projectName: ""
    # Region of the project, for example, cn-shanghai for Shanghai
    region: ""
    # Alibaba Cloud account ID of the project owner. Enclose it in quotes, for example, "123456789"
    aliUid: ""
    # Network type. Valid values: Internet (public network) and Intranet (private network). Default value: Internet.
    net: Internet
    # AccessKey ID and AccessKey secret of the Alibaba Cloud account or RAM user. The account must have the AliyunLogFullAccess system policy.
    accessKeyID: ""
    accessKeySecret: ""
    # Custom cluster ID. Only letters, digits, and hyphens (-) are allowed.
    clusterID: ""
  3. In the loongcollector-custom-k8s-package directory, run the following command to install LoongCollector and its dependencies:

    bash k8s-custom-install.sh install
  4. After installation, check the component status.

    If the pod fails to start, verify that the values.yaml configuration is correct and that the related images are pulled successfully.
    # Check pod status
    kubectl get po -n kube-system | grep loongcollector-ds

    Simple Log Service also automatically creates the following resources. You can view them in the Simple Log Service console.

    • Project (the value of projectName in the values.yaml file): The resource management unit, which isolates logs from different businesses. To create your own project for more flexible log resource management, see Create a project.

    • Machine group (k8s-group-${cluster_id}): The collection of log collection nodes.

    • Logstore (config-operation-log): Stores logs from the loongcollector-operator component. It uses the same billing method as a standard Logstore. For details, see Billing items for the pay-by-ingested-data metering method. We recommend that you do not create collection configurations in this Logstore.

      Important: Do not delete this Logstore.

Step 2: Create a Logstore

A Logstore is the storage unit in Simple Log Service for storing collected logs.

  1. Log on to the Simple Log Service console and click the target project name.

  2. In the navigation pane on the left, select Logstores and click + to create a Logstore:

    • Logstore Name: Enter a unique name within the project. This name cannot be changed after creation.

    • Logstore Type: Choose Standard or Query based on your needs.

    • Billing Mode:

      • Pay-by-feature (Cannot Be Changed): Charges are based on individual resources such as storage, indexing, and read/write operations. Suitable for small-scale scenarios or when feature usage is uncertain.

      • Pay-by-ingested-data: Charges are based only on the raw data ingested. Includes 30 days of free storage and free data transformation and delivery features. Suitable for business scenarios with a storage period close to 30 days or complex data processing pipelines.

    • Data Retention Period: Set the number of days to retain logs (1–3650 days; 3650 means permanent retention). The default is 30 days.

    • Keep other settings at their defaults and click OK. For more information about other settings, see Manage Logstores.

Step 3: Create and configure log collection rules

Define which logs LoongCollector collects, how to parse their structure, how to filter content, and bind the configuration to the registered machine group.

  1. On the Logstores page, click the expand icon before the target Logstore name.

  2. Click Import Data. In the Quick Data Import dialog box, select an ingestion template based on your log source and click Integrate Now.

  3. Configure Machine Group Configurations and click Next:

    • Scenario: Select Docker Containers.

    • Deployment mode: Select ACK Daemonset or Self-managed Cluster in DaemonSet Mode.

    • In Source Machine Group, add the default machine group k8s-group-${cluster_id} to the right-side Applied Machine Group.

  4. On the Logtail Configuration page, complete the following settings and click Next.

1. Global and input configuration

Before configuring, ensure you have selected a data ingestion template and bound a machine group. This step defines the collection configuration name, log sources, and collection scope.

Collect container stdout

Global Configurations

  • Configuration Name: Enter a custom collection configuration name that must be unique within its project. After creation, it cannot be modified. Naming rules:

    • Only lowercase letters, digits, hyphens (-), and underscores (_) are allowed.

    • Must start and end with a lowercase letter or digit.

Input configuration

  • Choose whether to enable Standard Output (stdout) and Standard Error (stderr). Both are enabled by default.

    Important

    We recommend not enabling both standard output and standard error simultaneously, as this may cause log confusion.

Collect cluster text logs

Global Configurations:

  • Configuration Name: Enter a custom collection configuration name that must be unique within its project. After creation, it cannot be modified. Naming rules:

    • Only lowercase letters, digits, hyphens (-), and underscores (_) are allowed.

    • Must start and end with a lowercase letter or digit.

Input Configurations:

  • File Path Type:

    • Path in Container: Collect log files inside containers.

    • Host Path: Collect local service logs from the host machine.

  • File Path: Absolute path for log collection.

    • Linux: Starts with a forward slash (/), for example, /data/mylogs/**/*.log, which matches all files with the .log extension in the /data/mylogs directory and its subdirectories.

    • Windows: Starts with a drive letter, for example, C:\Program Files\Intel\**\*.Log.

  • Maximum Directory Monitoring Depth: The maximum directory depth that the ** wildcard character matches in the File Path. Default value: 0 (current level only). Valid values: 0 to 1000.

    We recommend that you set this value to 0 and configure the path to the directory that contains the file.
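To make the depth setting concrete, here is a small, purely illustrative Python sketch (the helper and sample paths are hypothetical, not LoongCollector's actual implementation) of how Maximum Directory Monitoring Depth limits which files /data/mylogs/**/*.log matches:

```python
from pathlib import PurePosixPath

def matches(path, base="/data/mylogs", max_depth=0, suffix=".log"):
    """Mimic how '**' expands up to Maximum Directory Monitoring Depth:
    depth 0 matches only files directly under the base directory."""
    p = PurePosixPath(path)
    if not path.startswith(base + "/") or p.suffix != suffix:
        return False
    # Number of directory levels between the base directory and the file.
    depth = len(p.parts) - len(PurePosixPath(base).parts) - 1
    return depth <= max_depth

# Hypothetical log files, used only for illustration.
paths = [
    "/data/mylogs/app.log",           # depth 0
    "/data/mylogs/svc/app.log",       # depth 1
    "/data/mylogs/svc/2025/app.log",  # depth 2
    "/data/mylogs/readme.txt",        # wrong extension, never matched
]

print([p for p in paths if matches(p, max_depth=0)])  # only /data/mylogs/app.log
print([p for p in paths if matches(p, max_depth=1)])
```

With max_depth=0, only files directly inside /data/mylogs match, which is why pointing the path at the exact directory is the recommended configuration.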

2. Log processing and structuring

Configure log processing rules to convert raw unstructured logs into structured, searchable data. This improves log query and analysis efficiency. Add a log sample before configuring:

In the Logtail Configuration page, go to the Processor Configurations section. Click Add Sample Log and enter the log content you want to collect. The system uses this sample to detect the log format and help generate regular expressions and parsing rules, reducing configuration complexity.

Scenario 1: Multi-line log processing (such as Java stack trace logs)

Logs like Java exception stack traces or JSON often span multiple lines. In default collection mode, they are split into multiple incomplete records, losing contextual information. Enable multi-line collection mode and configure a start-of-line regular expression to merge consecutive lines of the same log entry into one complete log.

Example:

  • Raw log without processing: Default collection mode treats each line as a separate log. Stack trace information is fragmented, losing context.

  • Multi-line mode enabled: A start-of-line regular expression identifies complete log entries, preserving the full semantic structure.

Configuration steps: In the Logtail Configuration page, go to the Processor Configurations section and enable Multi-line Mode:

  • Type: Choose Custom or Multi-line JSON.

    • Custom: Use this when raw log formats vary. Configure a Regex to Match First Line to identify the first line of each log entry.

      • Regex to Match First Line: You can auto-generate or manually enter a regular expression that matches an entire line. For example, in the preceding scenario, the matching regular expression is \[\d+-\d+-\w+:\d+:\d+,\d+]\s\[\w+]\s.*.

        • Auto-generate: Click Auto-generate regular expression. In the Log Sample text box, select the log content to extract and click Generate Regex.

        • Manual input: Click Manually enter regular expression, enter your expression, then click Validate.

    • Multi-line JSON: Select this when all raw logs use standard JSON format. Simple Log Service automatically handles line breaks within individual JSON log entries.

  • Processing Method If Splitting Fails:

    • Discard: Discard any text that fails to match the start-of-line rule.

    • Retain Single Line: Split and retain unmatched text using the original single-line mode.
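As a rough sketch of how a start-of-line regular expression reassembles multi-line entries (the sample log lines are invented for illustration; the real merging is done inside LoongCollector), consider:

```python
import re

# Start-of-line regex in the style of the example above: a timestamped
# "[...]" header followed by a "[LEVEL]" tag.
first_line = re.compile(r"\[\d+-\d+-\w+:\d+:\d+,\d+\]\s\[\w+\]\s.*")

raw_lines = [
    "[2025-04-15T16:40:00,123] [ERROR] Request failed",
    "java.lang.RuntimeException: boom",
    "    at com.example.Main.run(Main.java:42)",
    "[2025-04-15T16:40:01,456] [INFO] Recovered",
]

def merge_multiline(lines):
    """Group continuation lines under the most recent matching first line,
    the way multi-line mode reassembles one complete log entry."""
    entries, current = [], []
    for line in lines:
        if first_line.fullmatch(line) and current:
            entries.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        entries.append("\n".join(current))
    return entries

for entry in merge_multiline(raw_lines):
    print(entry, end="\n---\n")
```

The three-line stack trace becomes one entry, and the following INFO line starts a new one; lines that never match the first-line regex are what the "Processing Method If Splitting Fails" options apply to.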

Scenario 2: Structured logs

When raw logs are unstructured or semi-structured text—such as NGINX access logs or application output logs—querying and analyzing them directly is inefficient. Simple Log Service provides multiple parsing plugins that automatically convert raw logs of various formats into structured data, creating a solid foundation for subsequent analysis, monitoring, and alerting.

Example:

Raw log without processing

Structured parsed log

192.168.*.* - - [15/Apr/2025:16:40:00 +0800] "GET /nginx-logo.png HTTP/1.1" 0.000 514 200 368 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.*.* Safari/537.36"
body_bytes_sent: 368
http_referer: -
http_user_agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.*.* Safari/537.36
remote_addr: 192.168.*.*
remote_user: -
request_length: 514
request_method: GET
request_time: 0.000
request_uri: /nginx-logo.png
status: 200
time_local: 15/Apr/2025:16:40:00

Configuration steps: In the Logtail Configuration page, go to the Processor Configurations section.

  1. Add a parsing plugin: Click Add Processor and configure a plugin such as Regex Parse, Delimiter Parse, or JSON Parse based on your log format. To collect NGINX logs, select Native Processor > Data Parsing (NGINX Mode).

  2. NGINX Log Configuration: Copy the log_format definition from your NGINX server configuration file (nginx.conf) and paste it exactly as-is into this text box.

    Example:

    log_format main  '$remote_addr - $remote_user [$time_local] "$request" '
                     '$request_time $request_length '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent"';
    Important

    The format definition here must exactly match the format used by the server to generate logs. Otherwise, log parsing will fail.

  3. General configuration parameter descriptions: The following parameters appear in many parsing plugins and work consistently across them.

    • Source field: Specifies the name of the field to parse. Default is content, which represents the entire collected log entry.

    • Keep source field on parse failure: Recommended. If a log cannot be parsed successfully (for example, due to format mismatch), this option ensures the original log content is preserved intact in the specified source field.

    • Keep source field on parse success: When selected, the original log content remains even after successful parsing.
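To illustrate the kind of key-value extraction that NGINX-mode parsing performs, here is a hand-written Python regex approximating the fields derived from the log_format above (the regex and sample line are illustrative, not the plugin's actual internals):

```python
import re

# Approximation of the field extraction that NGINX-mode parsing derives
# from the log_format; built by hand purely for illustration.
nginx_re = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request_method>\S+) (?P<request_uri>\S+) \S+" '
    r'(?P<request_time>\S+) (?P<request_length>\d+) '
    r'(?P<status>\d+) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)

line = ('192.168.0.1 - - [15/Apr/2025:16:40:00 +0800] '
        '"GET /nginx-logo.png HTTP/1.1" 0.000 514 200 368 "-" "Mozilla/5.0"')

m = nginx_re.match(line)
fields = m.groupdict()  # structured key-value pairs, as in the example table
print(fields["status"], fields["request_uri"])
```

If such a regex fails to match a line, the "Keep source field on parse failure" option is what preserves the raw content instead of dropping it.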

3. Log filtering

Collecting large volumes of low-value or irrelevant logs—such as DEBUG or INFO level logs—without discrimination wastes storage resources, increases costs, reduces query efficiency, and introduces data breach risks. Use fine-grained filtering policies to achieve efficient and secure log collection.

Reduce costs with content-based filtering

Filter logs based on field values—for example, collect only logs where the level field is WARNING or ERROR.

Example:

Raw log without processing

Collect only WARNING or ERROR logs

{"level":"WARNING","timestamp":"2025-09-23T19:11:40+0800","cluster":"yilu-cluster-0728","message":"Disk space is running low","freeSpace":"15%"}
{"level":"ERROR","timestamp":"2025-09-23T19:11:42+0800","cluster":"yilu-cluster-0728","message":"Failed to connect to database","errorCode":5003}
{"level":"INFO","timestamp":"2025-09-23T19:11:47+0800","cluster":"yilu-cluster-0728","message":"User logged in successfully","userId":"user-123"}
{"level":"WARNING","timestamp":"2025-09-23T19:11:40+0800","cluster":"yilu-cluster-0728","message":"Disk space is running low","freeSpace":"15%"}
{"level":"ERROR","timestamp":"2025-09-23T19:11:42+0800","cluster":"yilu-cluster-0728","message":"Failed to connect to database","errorCode":5003}
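The full-match semantics can be sketched in a few lines of Python (a simplified model of the Data Filtering plugin, using the sample logs above in trimmed form):

```python
import json
import re

# Full-text match: the regex must cover the entire field value,
# mirroring the plugin's "full match, not partial keyword" rule.
level_filter = re.compile(r"WARNING|ERROR")

logs = [
    '{"level":"WARNING","message":"Disk space is running low"}',
    '{"level":"ERROR","message":"Failed to connect to database"}',
    '{"level":"INFO","message":"User logged in successfully"}',
]

def keep(raw):
    level = json.loads(raw).get("level", "")
    return level_filter.fullmatch(level) is not None

kept = [entry for entry in logs if keep(entry)]
print(len(kept))  # the INFO log is dropped
```

Note that a pattern such as WARN would not keep the WARNING log under full-match semantics, because it matches only part of the field value.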

Configuration steps: In the Logtail Configuration page, go to the Processor Configurations section.

Click Add Processor and select Native Processor > Data Filtering:

  • Field Name: The log field to filter on.

  • Field Value: A regular expression for filtering. Only full-text matching is supported—not partial keyword matching.

Control collection scope with a blacklist

Use a blacklist to exclude specific directories or files, preventing irrelevant or sensitive logs from being uploaded.

Configuration steps: In the Logtail Configuration page, go to Input Configurations > Other Input Configurations. Enable Collection Blacklist and click Add.

Supports exact and wildcard matching for directory and file names. Wildcards support only the asterisk (*) and question mark (?).
  • File Path Blacklist: File paths to ignore during collection. Examples:

    • /home/admin/private*.log: Ignores all files under /home/admin/ that start with "private" and end with ".log".

    • /home/admin/private*/*_inner.log: Ignores files ending with "_inner.log" inside any subdirectory under /home/admin/ that starts with "private".

  • File Blacklist: File names to ignore during collection. Example:

    • app_inner.log: Ignores all files named app_inner.log.

  • Directory Blacklist: Directory paths must not end with a forward slash (/). Examples:

    • /home/admin/dir1/: This directory blacklist entry will not take effect.

    • /home/admin/dir*: Ignores all files under subdirectories in /home/admin/ that start with "dir".

    • /home/admin/*/dir: Ignores all files under subdirectories named "dir" at the second level under /home/admin/. For example, files under /home/admin/a/dir are ignored, but files under /home/admin/a/b/dir are collected.
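To check a blacklist pattern against concrete paths, the wildcard rules can be modeled in Python (an illustrative sketch; it assumes * and ? do not cross a / separator, matching the directory-level examples above):

```python
import re

def wildcard_to_regex(pattern):
    """Translate a blacklist pattern where only '*' and '?' are special.
    Assumption for this sketch: neither wildcard crosses a '/'."""
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("[^/]*")
        elif ch == "?":
            out.append("[^/]")
        else:
            out.append(re.escape(ch))
    return re.compile("".join(out) + r"\Z")

# The file path blacklist example from above.
blacklist = wildcard_to_regex("/home/admin/private*/*_inner.log")

print(bool(blacklist.match("/home/admin/private_svc/app_inner.log")))  # ignored
print(bool(blacklist.match("/home/admin/public/app_inner.log")))       # collected
```

This makes it easy to dry-run a candidate pattern before applying it, so you do not accidentally exclude logs you still need.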

Container filtering

Set collection conditions based on container metadata—such as environment variables, pod labels, namespaces, and container names—to precisely control which containers' logs are collected.

Configuration steps: In the Logtail Configuration page, go to the Input Configurations section. Enable Container Filtering and click Add.

Multiple conditions use an AND relationship. All regular expression matching uses Go's RE2 engine, which has fewer features than engines like PCRE. Follow the guidelines in Appendix: Regular expression usage limits (container filtering) when writing regular expressions.
  • Environment variable blacklist/whitelist: Specify environment variable conditions for containers to collect.

  • Kubernetes pod label blacklist/whitelist: Specify label conditions for pods containing containers to collect.

  • Kubernetes pod name regex match: Specify containers to collect by pod name.

  • Kubernetes namespace regex match: Specify containers to collect by namespace name.

  • Kubernetes container name regex match: Specify containers to collect by container name.

  • Container label blacklist/whitelist: Collect containers whose labels meet specified conditions. Use this for Docker scenarios. Do not use it for Kubernetes scenarios.

4. Log categorization

When multiple applications or instances share the same log format, it becomes hard to distinguish log sources. This leads to missing context during queries and inefficient analysis. Configure log topics and tagging to automatically associate context and logically categorize logs.

Configure log topic

When multiple applications or instances produce logs with identical formats but different paths—such as /apps/app-A/run.log and /apps/app-B/run.log—collected logs become indistinguishable. In this case, generate a topic based on machine group, custom name, or file path extraction to flexibly differentiate logs by business or path source.

Configuration steps: Go to Global Configurations > Other Global Configurations > Log Topic Type and choose a topic generation method. Three types are supported:

  • Machine group topic: When applying a collection configuration to multiple machine groups, LoongCollector automatically uses the machine group name as the __topic__ field. Use this for scenarios where logs are divided by host cluster.

  • Custom: Format is customized://<custom topic name>, for example customized://app-login. Use this for static topic scenarios with fixed business identifiers.

  • File path extraction: Extract key information from the full path of log files to dynamically tag log sources. Use this when multiple users or applications share the same log filename but use different paths.

    When multiple users or services write logs to different top-level directories but use identical subpaths and filenames, the filename alone cannot distinguish sources. For example:

    /data/logs
    ├── userA
    │   └── serviceA
    │       └── service.log
    ├── userB
    │   └── serviceA
    │       └── service.log
    └── userC
        └── serviceA
            └── service.log

    In this case, configure File Path Extraction and use a regular expression to extract key information from the full path. The match result becomes the log topic uploaded to the Logstore.

    Extraction rule: Capturing groups in regular expressions

    When you configure a regular expression, the system determines the output field format based on the number and naming of capturing groups. In file path regular expressions, escape forward slashes (/).

    • Single capturing group (only one (.*?)): Use when a single dimension, such as username or environment, is enough to distinguish sources. Generates the __topic__ field.

      Regex example: \/logs\/(.*?)\/app\.log
      Matching path: /logs/userA/app.log
      Result: __topic__: userA

    • Multiple unnamed capturing groups (multiple (.*?)): Use when you need multiple dimensions without semantic labels. Generates tag fields __tag__:__topic_{i}__, where {i} is the capturing group number.

      Regex example: \/logs\/(.*?)\/(.*?)\/app\.log
      Matching path: /logs/userA/svcA/app.log
      Result: __tag__:__topic_1__: userA and __tag__:__topic_2__: svcA

    • Multiple named capturing groups (using (?P<name>.*?)): Use when you need multiple dimensions with clear field meanings for easier querying and analysis. Generates tag fields __tag__:{name}.

      Regex example: \/logs\/(?P<user>.*?)\/(?P<service>.*?)\/app\.log
      Matching path: /logs/userA/svcA/app.log
      Result: __tag__:user: userA and __tag__:service: svcA
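The named-group case can be tried out directly in Python, since the (?P<name>...) syntax is the same (the path is the example path from above; the tag construction is an illustration of the resulting fields, not Simple Log Service code):

```python
import re

# Named-group regex from the example above; slashes escaped as the
# configuration requires.
topic_re = re.compile(r"\/logs\/(?P<user>.*?)\/(?P<service>.*?)\/app\.log")

m = topic_re.match("/logs/userA/svcA/app.log")
# Each named group becomes a __tag__:{name} field on the uploaded log.
tags = {f"__tag__:{name}": value for name, value in m.groupdict().items()}
print(tags)
```

Testing the expression against a few real paths this way helps confirm the lazy groups capture the intended path segments before you save the configuration.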

Log tagging

Enable log tag enrichment to extract key information from container environment variables or Kubernetes pod labels and attach it as tags for fine-grained log grouping.

Configuration steps: In the Logtail Configuration page, go to the Input Configurations section. Enable Log Tag Enrichment and click Add.

  • Environment Variables: Configure an environment variable name and a tag name. The environment variable value will be stored under the tag name.

    • Environment variable name: The name of the environment variable to extract.

    • Tag name: The name of the environment variable tag.

  • Pod Labels: Configure a pod label name and a tag name. The pod label value will be stored under the tag name.

    • Pod label name: The name of the Kubernetes pod label to extract.

    • Tag name: The name of the tag.

5. Output configuration

By default, all logs are sent to the current Logstore using lz4 compression. To distribute logs from the same source to different Logstores, follow these steps:

Multi-destination dynamic distribution

Important
  • Multi-destination sending is supported only in LoongCollector version 3.0.0 or later. Logtail does not support it.

  • You can configure up to five output destinations.

  • After configuring multiple output destinations, this collection configuration no longer appears in the current Logstore’s collection configuration list. To view, modify, or delete multi-destination distribution configurations, see How do I manage multi-destination distribution configurations?.

Configuration steps: In the Logtail Configuration page, go to the Output Configurations section.

  1. Click the expand icon to show the output configuration.

  2. Click Add Output Targets and complete the following settings:

    • Logstores: Select the target Logstore.

    • Compression Method: Supports lz4 and zstd.

    • Route Settings: Route logs based on tag fields. Logs matching the routing configuration are uploaded to the target Logstore. An empty routing configuration means all collected logs are uploaded to the target Logstore.

      • Tag Name: The name of the tag field used for routing. Enter the field name directly (for example, __path__) without the __tag__: prefix. Tag fields fall into two categories:

        • Agent-related: Related to the collection agent itself and independent of plugins. Examples include __hostname__ and __user_defined_id__.

        • Input plugin-related: Provided by input plugins and enriched into logs. Examples include __path__ for file collection, and _pod_name_ and _container_name_ for Kubernetes collection.

        For more information about tags, see Manage LoongCollector collection tags.

      • Tag Value: Logs whose tag field value matches this value are sent to the target Logstore.

      • Discard this tag?: When enabled, the uploaded logs do not include this tag field.
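The routing rules above amount to a simple first-match dispatch, which can be sketched as follows (Logstore names, tag values, and the catch-all behavior here are illustrative assumptions, not the exact evaluation order of LoongCollector):

```python
# Each rule: logs whose tag value matches go to that Logstore; an empty
# rule (no tag) sends everything that reaches it. Names are hypothetical.
routes = [
    {"logstore": "nginx-logs", "tag": "__path__", "value": "/var/log/nginx/access.log"},
    {"logstore": "default-logs", "tag": None, "value": None},  # empty rule: catch-all
]

def route(log_tags):
    """Return the target Logstore for a log, given its tag fields."""
    for rule in routes:
        if rule["tag"] is None or log_tags.get(rule["tag"]) == rule["value"]:
            return rule["logstore"]

print(route({"__path__": "/var/log/nginx/access.log"}))  # nginx-logs
print(route({"__path__": "/var/log/app/app.log"}))       # default-logs
```

An empty routing configuration behaving as a catch-all is why a single misordered broad rule can absorb logs intended for more specific destinations.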

Step 4: Configure query and analysis

After configuring log processing and plugins, click Next to enter the Query and Analysis Configurations page:

  • The system enables full-text index by default. This supports keyword searches on raw log content.

  • To perform a term query by field, after Preview Data loads on the page, click Automatic Index Generation. Simple Log Service generates a field index based on the first entry in the preview data.

After completing the configuration, click Next to complete the setup for the entire collection process.

Step 5: Validation and troubleshooting

After you create a collection configuration and apply it to a machine group, the system automatically deploys the configuration and starts to collect incremental logs.

View reported logs

  1. Confirm that new content is added to the log file: LoongCollector collects only incremental logs. Run the tail -f /path/to/your/log/file command and trigger a business operation to ensure that new logs are being written.

  2. Query logs: Go to the query and analysis page of the destination Logstore. Click Search & Analyze. The default time range is the last 15 minutes. Check whether new logs are ingested. By default, each collected container text log contains the following fields:

    • __tag__:__hostname__: The name of the container's host.

    • __tag__:__path__: The path of the log file in the container.

    • __tag__:_container_ip_: The IP address of the container.

    • __tag__:_image_name_: The name of the image that the container uses.

    • __tag__:_pod_name_: The name of the pod.

    • __tag__:_namespace_: The namespace to which the pod belongs.

    • __tag__:_pod_uid_: The unique identifier (UID) of the pod.

Troubleshoot common issues

The heartbeat of a machine group is abnormal

  1. Check the user identity: If your server type is not ECS, or if the ECS instance and the project belong to different Alibaba Cloud accounts, check whether the correct user identity exists in the specified directory.

    • Linux: Run the cd /etc/ilogtail/users/ && touch <uid> command to create a user identity file.

    • Windows: Go to the C:\LogtailData\users\ directory and create an empty file named <uid>.

    If a file named with the Alibaba Cloud account ID of the project exists in the specified path, the user identity is configured correctly.

  2. Check the machine group identity: If you use a custom identifier-based machine group, check whether a file named user_defined_id exists in the specified directory. If the file exists, check whether the content of the file is the same as the custom ID configured for the machine group.

    • Linux:

      # Configure a custom ID. If the directory does not exist, create it manually.
      echo "user-defined-1" > /etc/ilogtail/user_defined_id
    • Windows: In the C:\LogtailData directory, create a file named user_defined_id and write the custom ID to the file. If the directory does not exist, create it manually.

  3. If the user identity and machine group identity are correctly configured, see Troubleshoot LoongCollector (Logtail) machine group issues for more information.


Log collection errors or format errors occur

Troubleshooting: This issue indicates that the network connectivity and basic configurations are normal. The issue is caused by a mismatch between the log content and the parsing rule. View the specific error message to identify the cause of the issue:

  1. On the Logtail Configuration page, click the name of the LoongCollector (Logtail) configuration that is abnormal. On the Log Collection Error tab, click Select Time Range to set a time range for the query.

  2. In the error list, view the alarm metric of the error log and find a solution based on the information in Common errors during data collection.

What to do next

  1. Log query and analysis: Query and analyze the collected logs.

  2. Data visualization: Monitor key metric trends using visualization dashboards.

  3. Automatic anomaly alerts: Set alert policies to detect system anomalies in real time.

Troubleshoot missing container log data

1. Check for new log entries

After you configure Logtail to collect logs, Logtail does not collect a log file unless new log entries are added to it.

2. Check the Logtail operational log

Check Logtail’s own operational log for detailed error messages.

  1. Log on to the Logtail container:

    1. Find the Logtail pod.

      kubectl get po -n kube-system | grep logtail

      The system returns output similar to the following:

      logtail-ds-****d                                             1/1       Running    0          8d
      logtail-ds-****8                                             1/1       Running    0          8d
    2. Log on to the pod.

      kubectl exec -it -n kube-system logtail-ds-****d -- bash

      In this command, logtail-ds-****d is the pod name. Replace it with your actual pod name.

  2. View the Logtail operational log:

    Logtail stores its logs in the /usr/local/ilogtail/ folder in the Logtail container. The log files are named ilogtail.LOG and logtail_plugin.LOG. After you log on to the Logtail container, run the following commands to view the log files:

    # Open the /usr/local/ilogtail/ folder.
    cd /usr/local/ilogtail

    # View the ilogtail.LOG and logtail_plugin.LOG files.
    cat ilogtail.LOG
    cat logtail_plugin.LOG

    Purpose: Identify the alarm metric in the error log. Then, see common error types for Simple Log Service data collection for solutions.

3. Check the heartbeat of the machine group

Check the heartbeat status of the machine group. Go to the Resource Group > Machine Groups page. Click the name of your target machine group. In the Machine Group Configurations > Machine Group Status section, check the Heartbeat status and record the number of nodes with an OK status.

  1. Check the number of worker nodes in your container cluster.

    1. Get the cluster KubeConfig and connect to the cluster using kubectl.

    2. Check the number of worker nodes in the cluster.

      kubectl get node | grep -v master

      The system returns output similar to the following:

      NAME                                 STATUS    ROLES     AGE       VERSION
      cn-hangzhou.i-bp17enxc2us3624wexh2   Ready     <none>    238d      v1.10.4
      cn-hangzhou.i-bp1ad2b02jtqd1shi2ut   Ready     <none>    220d      v1.10.4
  2. Check if the number of nodes with a heartbeat status of OK matches the number of worker nodes in the container cluster. Select a troubleshooting method based on the result.

    • All nodes in the machine group show a Failed heartbeat status:

      • If you use a self-managed cluster, verify that the following parameters are correctly configured: {regionId}, {aliuid}, {access-key-id}, and {access-key-secret}.

        If any parameter is incorrect, run helm del --purge alibaba-log-controller to delete the installation package. Then reinstall it.

    • The number of nodes with an OK heartbeat status is less than the number of worker nodes in the cluster.

      • Determine whether you manually deployed the DaemonSet using a YAML file.

        1. Run the following command. If the command returns output, you previously deployed the DaemonSet manually using a YAML file.

          kubectl get po -n kube-system -l k8s-app=logtail
        2. Download the latest DaemonSet template.

        3. Replace the placeholder values for ${your_region_name}, ${your_aliyun_user_id}, ${your_machine_group_name}, and other parameters.

        4. Update the resources.

          kubectl apply -f ./logtail-daemonset.yaml

4. Check the filter conditions in the collection configuration

In the Simple Log Service console, check your Logtail collection configuration. Focus on these settings: IncludeLabel, ExcludeLabel, IncludeEnv, and ExcludeEnv. Make sure they match your collection requirements.

  • The labels here refer to container labels (from docker inspect), not Kubernetes labels.

  • Temporarily remove the IncludeLabel, ExcludeLabel, IncludeEnv, and ExcludeEnv settings. Then check whether logs are collected normally. If logs appear after removal, one or more of these settings is misconfigured.

FAQ

Manage Multi-Target Distribution Configurations

Because multi-target distribution configurations are associated with multiple Logstores, manage these configurations on the Project-level management page:

  1. Log on to the Simple Log Service console. Then, click the name of the target Project.

  2. On the target Project page, click Resource Group > Configurations in the navigation pane on the left.

    Note

    This page centrally manages all collection configurations under the Project, including those that remain because Logstores were accidentally deleted.

Transfer ACK Cluster Logs to a Project in Another Alibaba Cloud Account

Manually install the Simple Log Service LoongCollector (Logtail) component in the ACK cluster. Then, configure the root account ID or access credential (AccessKey) of the destination account. This sends container logs to a Simple Log Service Project in another Alibaba Cloud account.

Scenario: When you need to collect log data from an ACK cluster into an independent Simple Log Service Project in another Alibaba Cloud account due to organizational structure, permission isolation, or unified monitoring requirements, manually install LoongCollector (Logtail) for cross-account configuration.

Procedure: This procedure uses manual installation of LoongCollector as an example. For information about how to install Logtail, see Logtail Installation and Configuration.

  1. Connect to your Kubernetes cluster and run the corresponding command based on the region:

    Regions in China

    wget https://aliyun-observability-release-cn-shanghai.oss-cn-shanghai.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz; tar xvf loongcollector-custom-k8s-package.tgz; chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh

    Regions outside China

    wget https://aliyun-observability-release-ap-southeast-1.oss-ap-southeast-1.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz; tar xvf loongcollector-custom-k8s-package.tgz; chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh
  2. Go to the loongcollector-custom-k8s-package directory and modify the configuration file ./loongcollector/values.yaml.

    # ===================== Required fields =====================
    # Project name for this cluster, for example, k8s-log-custom-sd89ehdq
    projectName: ""
    # Region of the project, for example, cn-shanghai for Shanghai
    region: ""
    # Alibaba Cloud account ID of the project owner. Enclose it in quotes, for example, "123456789"
    aliUid: ""
    # Network type. Valid values: Internet (public network) and Intranet (private network). Default value: Internet.
    net: Internet
    # AccessKey ID and AccessKey secret of the Alibaba Cloud account or RAM user. The account must have the AliyunLogFullAccess system policy.
    accessKeyID: ""
    accessKeySecret: ""
    # Custom cluster ID. Only letters, digits, and hyphens (-) are allowed.
    clusterID: ""
  3. In the loongcollector-custom-k8s-package directory, run the following command to install LoongCollector and its dependencies:

    bash k8s-custom-install.sh install
  4. After installation, check the component status.

    If the pod fails to start, verify that the values.yaml configuration is correct and that the related images are pulled successfully.
    # Check pod status
    kubectl get po -n kube-system | grep loongcollector-ds

    Simple Log Service also automatically creates the following resources. You can view them in the Simple Log Service console.

    Resource type

    Resource name

    Purpose

    Project

    Value of projectName in the values.yaml file

    Resource management unit that isolates logs from different businesses.

    To create your own project for more flexible log resource management, see Create a project.

    Machine group

    k8s-group-${cluster_id}

    Collection of log collection nodes.

    Logstore

    config-operation-log

    Important

    Do not delete this Logstore.

    Stores logs from the loongcollector-operator component. It uses the same billing method as a standard Logstore. For details, see Billing items for the pay-by-ingested-data metering method. We recommend that you do not create collection configurations in this Logstore.

Allow Multiple Collection Configurations to Collect the Same Log File or Container Standard Output

By default, to prevent data duplication, Simple Log Service restricts each log source to be collected by only one collection configuration:

  • A text log file can match only one Logtail collection configuration.

  • A container's standard output (stdout):

    • If you use the new standard output template, only one standard output collection configuration can collect it by default.

    • If you use the old standard output template, no additional configuration is required. It supports collecting multiple copies by default.

  1. Log on to the Simple Log Service console. Go to the target Project.

  2. In the navigation pane on the left, choose Logstores. Find the target Logstore.

  3. Click the expand icon before its name to expand the Logstore.

  4. Click Logtail Configuration. In the configuration list, find the target Logtail configuration. Then, click Manage Logtail Configuration in the Actions column.

  5. On the Logtail Configuration page, click Edit. Scroll down to the Input Configurations section:

    • To collect text file logs, enable Allow File to Be Collected for Multiple Times.

    • To collect container standard output, enable Allow Collection by Different Logtail Configurations.

Dependency Error When Uninstalling the loongcollector (logtail-ds) Component in ACK

Problem Description: When you try to delete the loongcollector (logtail-ds) log collection component in Container Service for Kubernetes (ACK), the system reports an error: The dependencies of this component are not met.

Dependencies of addons are not met: terway-eniip depends on logtail-ds(>0.0) whose version is v3.x.x.x-aliyun or will be v3.x.x.x-aliyun.

Cause: The terway-eniip network plug-in enables the log collection feature. It depends on the loongcollector (logtail-ds) component. Therefore, ACK does not allow you to directly uninstall loongcollector (logtail-ds) before you remove this dependency.

Solution: Follow these steps to remove the dependency and then uninstall the component:

  1. Log on to the Container Service for Kubernetes console.

  2. In the cluster list, click the name of the target cluster. Go to the cluster details page.

  3. In the navigation pane on the left, click Add-ons.

  4. In the component list, search for and find the terway-eniip component. Click Disable Logging.

  5. In the dialog box that appears, click OK.

  6. After the configuration takes effect, try to uninstall the loongcollector (logtail-ds) component again.

Why is the Last Log Segment Reported with a Long Delay and Sometimes Truncated?

Cause Analysis: Log truncation usually occurs when a log file lacks a line feed at the end or when multi-line logs (such as exception stacks) are not fully written. Because the data collector cannot determine if the log has ended, the last segment of content may be chunked prematurely or reported with a delay. Different versions of LoongCollector (Logtail) have different processing mechanisms:

  • Versions earlier than 1.8:
    If the last log line does not have a line feed (carriage return) or a multi-line log segment is not complete, the data collector waits for the next write to trigger output. This may cause the last log entry to remain unsent for a long time until new logs are written.

  • Versions 1.8 and later:
    A timeout refresh mechanism is introduced to prevent logs from getting stuck. When an incomplete log line is detected, the system starts a timer. After the timeout, it automatically submits the current content to ensure that logs are eventually collected.

    • Default timeout: 60 seconds (ensures completeness in most scenarios)

    • Adjust this value as needed. However, do not set it to 0. Otherwise, logs may be truncated or some content may be lost.

Solution:

You can extend the wait time to ensure that a complete log is written before it is collected:

  1. Log on to the Simple Log Service console. Go to the target Project.

  2. In the navigation pane on the left, choose Logstores. Find the target Logstore.

  3. Click the expand icon before its name to expand the Logstore.

  4. Click Logtail Configuration. In the configuration list, find the target Logtail configuration. Then, click Manage Logtail Configuration in the Actions column.

  5. On the Logtail Configuration page, click Edit:

    • Choose Input Configurations > Other Input Configurations > Advanced Parameters. Add the following JSON configuration to customize the timeout:

      {
        "FlushTimeoutSecs": 1
      }
      • Default value: Determined by the startup parameter default_reader_flush_timeout (usually a few seconds).

      • Unit: seconds.

      • Recommended value: ≥1 second. Do not set it to 0. Otherwise, logs may be truncated or some content may be lost.

  6. After the configuration is complete, click OK.

Why Does LoongCollector (Logtail) Switch from a Private Network Domain Name to a Public Network During Runtime? Can It Automatically Switch Back?

During LoongCollector (Logtail) runtime, if it detects abnormal private network domain name communication (such as network unavailability or connection timeouts), the system automatically switches to the public network domain name to send data. This ensures the continuity and reliability of log collection and prevents log accumulation or loss.

  • LoongCollector: After the private network recovers, it automatically switches back to the private network.

  • Logtail: It does not automatically switch back. You must manually restart it to restore private network communication.

Appendix: Native Plugin Details

In the Processor Configurations section of the Logtail Configuration page, add processors to structure raw logs. To add a processing plugin to an existing collection configuration, follow these steps:

  1. In the navigation pane on the left, choose Logstores and find the target logstore.

  2. Click the expand icon before its name to expand the logstore.

  3. Click Logtail Configuration. In the configuration list, find the target Logtail configuration and click Manage Logtail Configuration in the Actions column.

  4. On the Logtail configuration page, click Edit.

This section introduces only commonly used processing plugins that cover common log processing use cases. For more features, see Extended processors.
Important

Rules for combining plugins (for LoongCollector / Logtail 2.0 and later):

  • Native and extended processors can be used independently or combined as needed.

  • Prioritize native processors because they offer better performance and higher stability.

  • When native features cannot meet your business needs, add extended processors after the configured native ones for supplementary processing.

Order constraint:

All plugins are executed sequentially in the order they are configured, which forms a processing chain. Note: All native processors must precede any extended processors. After you add an extended processor, you cannot add more native processors.

Regular Expression Parsing

Extract log fields using regular expressions and parse logs into key-value pairs. Each field can be queried and analyzed independently.

Example:

Raw logs without processing

Using the Regular Expression Parsing plugin

127.0.0.1 - - [16/Aug/2024:14:37:52 +0800] "GET /wp-admin/admin-ajax.php?action=rest-nonce HTTP/1.1" 200 41 "http://www.example.com/wp-admin/post-new.php?post_type=page" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36 Edg/127.0.0.0"
body_bytes_sent: 41
http_referer: http://www.example.com/wp-admin/post-new.php?post_type=page
http_user_agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36 Edg/127.0.0.0
remote_addr: 127.0.0.1
remote_user: -
request_method: GET
request_protocol: HTTP/1.1
request_uri: /wp-admin/admin-ajax.php?action=rest-nonce
status: 200
time_local: 16/Aug/2024:14:37:52 +0800

Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select Native Processor > Data Parsing (Regex Mode):

  • Regular Expression: Matches logs. Supports automatic generation or manual input:

    • Automatic generation:

      • Click Generate Regular Expression automatically.

      • Select the log content to extract in the Log Sample.

      • Click Generate Regular Expression.


    • Manual input: Manually enter a regular expression based on the log format.

    After configuration, click Validate to test if the regular expression correctly parses the log content.

  • Extracted Field: Set the corresponding field name (Key) for the extracted log content (Value).

  • For other parameters, see the general configuration parameter description in Scenario 2: Structured Logs.
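The field extraction above can be reproduced with a named-group regular expression. The pattern below is an illustrative hand-written one (not the console's auto-generated output), and the sample user agent is trimmed for brevity:

```python
import re

# Illustrative pattern for the nginx-style sample log above.
pattern = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request_method>\S+) (?P<request_uri>\S+) (?P<request_protocol>[^"]+)" '
    r'(?P<status>\d+) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)

line = ('127.0.0.1 - - [16/Aug/2024:14:37:52 +0800] '
        '"GET /wp-admin/admin-ajax.php?action=rest-nonce HTTP/1.1" 200 41 '
        '"http://www.example.com/wp-admin/post-new.php?post_type=page" '
        '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"')

# Each named group becomes one key-value pair, as in the parsed output above.
fields = pattern.match(line).groupdict()
print(fields["status"], fields["request_uri"])  # → 200 /wp-admin/admin-ajax.php?action=rest-nonce
```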


Delimiter Parsing

Structure log content using delimiters and parse it into multiple key-value pairs. Supports single-character and multi-character delimiters.

Example:

Raw logs without processing

Split fields by the specified character ,

05/May/2025:13:30:28,10.10.*.*,"POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1",200,18204,aliyun-sdk-java
ip:10.10.*.*
request:POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1
size:18204
status:200
time:05/May/2025:13:30:28
user_agent:aliyun-sdk-java

Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select Native Processor > Data Parsing (Delimiter Mode):

  • Delimiter: Specify the character used to split log content.

    Example: For CSV files, select Custom and enter a comma (,).

  • Quote: If a field value contains a delimiter, specify a quote character to enclose the field and prevent incorrect splitting.

  • Extracted Field: Set the corresponding field name (Key) for each column in the order of separation. Rules are as follows:

    • Field names can only contain letters, numbers, and underscores (_).

    • Must start with a letter or an underscore (_).

    • Maximum length: 128 bytes.

  • For other parameters, see the general configuration parameter description in Scenario 2: Structured Logs.
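The delimiter-plus-quote behavior above can be sketched with the csv module (a shortened version of the sample request string is used here):

```python
import csv

line = ('05/May/2025:13:30:28,10.10.*.*,"POST /PutData?Category=YunOsAccountOpLog '
        'HTTP/1.1",200,18204,aliyun-sdk-java')

# Comma delimiter with a double quote as the quote character: the quoted
# request field is kept whole even though the line is split on commas.
row = next(csv.reader([line], delimiter=",", quotechar='"'))

# Field names assigned in split order, following the doc's naming rules.
keys = ["time", "ip", "request", "status", "size", "user_agent"]
record = dict(zip(keys, row))
print(record["status"], record["size"])  # → 200 18204
```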


Standard JSON Parsing

Structure Object-type JSON logs and parse them into key-value pairs.

Example:

Raw logs without processing

Standard JSON key-value automatic extraction

{"url": "POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=U0Ujpek********&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=pD12XYLmGxKQ%2Bmkd6x7hAgQ7b1c%3D HTTP/1.1", "ip": "10.200.98.220", "user-agent": "aliyun-sdk-java", "request": {"status": "200", "latency": "18204"}, "time": "05/Jan/2025:13:30:28"}
ip: 10.200.98.220
request: {"status": "200", "latency" : "18204" }
time: 05/Jan/2025:13:30:28
url: POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=U0Ujpek******&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=pD12XYLmGxKQ%2Bmkd6x7hAgQ7b1c%3D HTTP/1.1
user-agent:aliyun-sdk-java

Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select Native Processor > Data Parsing (JSON Mode):

  • Original Field: The default value is content (this field stores the raw log content to be parsed).

  • For other parameters, see the general configuration parameter description in Scenario 2: Structured Logs.
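A sketch of the behavior above: top-level keys become fields, and nested objects stay serialized as strings rather than being expanded (sample shortened; this is an approximation, not the plugin's implementation):

```python
import json

raw = ('{"url": "POST /PutData HTTP/1.1", "ip": "10.200.98.220", '
       '"user-agent": "aliyun-sdk-java", '
       '"request": {"status": "200", "latency": "18204"}, '
       '"time": "05/Jan/2025:13:30:28"}')

# Only the first level is expanded; the nested "request" object is kept
# as a JSON string, matching the parsed output shown above.
fields = {k: v if isinstance(v, str) else json.dumps(v)
          for k, v in json.loads(raw).items()}
print(fields["ip"], fields["request"])
```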


Nested JSON Parsing

Parse nested JSON logs into key-value pairs by specifying the expansion depth.

Example:

Raw logs without processing

Expansion depth: 0, and use expansion depth as prefix

Expansion depth: 1, and use expansion depth as prefix

{"s_key":{"k1":{"k2":{"k3":{"k4":{"k51":"51","k52":"52"},"k41":"41"}}}}}
0_s_key_k1_k2_k3_k41:41
0_s_key_k1_k2_k3_k4_k51:51
0_s_key_k1_k2_k3_k4_k52:52
1_s_key:{"k1":{"k2":{"k3":{"k4":{"k51":"51","k52":"52"},"k41":"41"}}}}

Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select Extended Processor > Expand JSON Field:

  • Original Field: The source field name to expand, such as content.

  • JSON Expansion Depth: The expansion level of JSON objects. 0 means full expansion (default), 1 means the current level, and so on.

  • Character to Concatenate Expanded Keys: The connector for field names during JSON expansion. The default is an underscore (_).

  • Name Prefix of Expanded Keys: Specify the prefix for field names after JSON expansion.

  • Expand Array: Enable this option to expand arrays into indexed key-value pairs.

    Example: {"k":["a","b"]} expands to {"k[0]":"a","k[1]":"b"}.

    To rename expanded fields (for example, change prefix_s_key_k1 to new_field_name), add a Rename Fields plugin later to complete the mapping.
  • For other parameters, see the general configuration parameter description in Scenario 2: Structured Logs.
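The depth and prefix semantics in the example above can be sketched as follows (an approximation of the plugin's behavior, not its implementation):

```python
import json

def expand_json(obj, max_depth=0, sep="_", use_depth_prefix=True):
    """Flatten nested JSON: max_depth=0 means full expansion (default),
    1 means only the current level, and so on."""
    out = {}
    def walk(node, path, level):
        if isinstance(node, dict) and (max_depth == 0 or level <= max_depth):
            for k, v in node.items():
                walk(v, path + [k], level + 1)
        else:
            key = sep.join(path)
            if use_depth_prefix:                  # prefix keys with the depth value
                key = f"{max_depth}{sep}{key}"
            out[key] = node if isinstance(node, str) else json.dumps(node, separators=(",", ":"))
    walk(obj, [], 1)
    return out

data = json.loads('{"s_key":{"k1":{"k2":{"k3":{"k4":{"k51":"51","k52":"52"},"k41":"41"}}}}}')
print(expand_json(data, max_depth=0))  # fully flattened keys such as 0_s_key_k1_k2_k3_k4_k51
print(expand_json(data, max_depth=1))  # single key 1_s_key with the rest serialized
```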


JSON Array Parsing

Use the json_extract function to extract JSON objects from JSON arrays.

Example:

Raw logs without processing

Extract JSON array structure

[{"key1":"value1"},{"key2":"value2"}]
json1:{"key1":"value1"}
json2:{"key2":"value2"}

Configuration steps: In the Processor Configurations section of the Logtail Configuration page, switch the Processing Method to SPL, configure the SPL statement, and use the json_extract function to extract JSON objects from JSON arrays.

Example: Extract elements from the JSON array in the log field content and store the results in new fields json1 and json2, respectively.

* | extend json1 = json_extract(content, '$[0]'), json2 = json_extract(content, '$[1]')
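The SPL statement above can be mirrored in plain code: `$[0]` and `$[1]` are array-index JSONPath expressions, so the equivalent is indexing the parsed array (a sketch, not the SPL engine):

```python
import json

content = '[{"key1":"value1"},{"key2":"value2"}]'
arr = json.loads(content)

# Equivalent of json_extract(content, '$[0]') and json_extract(content, '$[1]').
json1 = json.dumps(arr[0], separators=(",", ":"))
json2 = json.dumps(arr[1], separators=(",", ":"))
print(json1, json2)  # → {"key1":"value1"} {"key2":"value2"}
```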

Apache Log Parsing

Structure log content based on definitions in the Apache log configuration file and parse it into multiple key-value pairs.

Example:

Raw logs without processing

Apache Common Log Format combined parsing

192.168.1.10 - - [08/May/2024:15:30:28 +0800] "GET /index.html HTTP/1.1" 200 1234 "https://www.example.com/referrer" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.X.X Safari/537.36"
http_referer:https://www.example.com/referrer
http_user_agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.X.X Safari/537.36
remote_addr:192.168.1.10
remote_ident:-
remote_user:-
request_method:GET
request_protocol:HTTP/1.1
request_uri:/index.html
response_size_bytes:1234
status:200
time_local:[08/May/2024:15:30:28 +0800]

Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select Native Processor > Data Parsing (Apache Mode):

  • Log Format: combined

  • APACHE LogFormat Configuration: The system automatically populates the configuration based on the Log Format.

    Important

    Verify the auto-filled content to ensure it exactly matches the LogFormat defined in the Apache configuration file on the server (typically located at /etc/apache2/apache2.conf).

  • For other parameters, see the general configuration parameter description in Scenario 2: Structured Logs.


Data Masking

Mask sensitive data in logs.

Example:

Raw logs without processing

Masking result

[{'account':'1812213231432969','password':'04a23f38'}, {'account':'1812213685634','password':'123a'}]
[{'account':'1812213231432969','password':'********'}, {'account':'1812213685634','password':'********'}]

Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select Native Processor > Data Masking:

  • Original Field: The source field that stores log content before parsing.

  • Data Masking Method:

    • const: Replaces sensitive content with a specified string.

    • md5: Replaces sensitive content with its MD5 hash.

  • Replacement String: When you select Data Masking Method as const, enter the string to replace sensitive content.

  • Content Expression that Precedes Replaced Content: The expression that matches the content immediately before the sensitive content, used to locate it. Configure using RE2 syntax.

  • Content Expression to Match Replaced Content: The expression for sensitive content. Configure using RE2 syntax.
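How the two expressions cooperate can be sketched with a const-style replacement: the first group matches the content preceding the sensitive value, and the value itself is swapped for a fixed string (an illustration of the idea, not the plugin's code):

```python
import re

line = ("[{'account':'1812213231432969','password':'04a23f38'}, "
        "{'account':'1812213685634','password':'123a'}]")

# Group 1 is the preceding expression (password':'); [^']* is the sensitive
# content, replaced by the const string ********.
masked = re.sub(r"(password':')[^']*", r"\g<1>********", line)
print(masked)
```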


Time Parsing

Parse the time field in logs and set the parsed result as the __time__ field of the log.

Example:

Raw logs without processing

Time parsing

{"level":"INFO","timestamp":"2025-09-23T19:11:47+0800","cluster":"yilu-cluster-0728","message":"User logged in successfully","userId":"user-123"}

The timestamp field is parsed and written to the log's __time__ field (2025-09-23 19:11:47 +08:00).

Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select Native Processor > Time Parsing:

  • Original Field: The source field that stores log content before parsing.

  • Time Format: Set the time format based on the time content in the log. For more information, see Time Formats.

  • Time Zone: Select the time zone of the log time field. By default, the machine's time zone is used, which is the time zone of the environment where the LoongCollector (Logtail) process runs.
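The parsing step above can be sketched with strptime; the format string plays the role of the plugin's Time Format setting, and the epoch value is what would populate __time__ (a sketch, not LoongCollector's parser):

```python
from datetime import datetime

raw = "2025-09-23T19:11:47+0800"  # timestamp field from the sample log above

# %z consumes the +0800 offset, so no separate time-zone setting is needed here.
ts = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S%z")
epoch = int(ts.timestamp())       # value that would be written to __time__
print(ts.isoformat(), epoch)
```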

Appendix: Regular Expression Limits (Container Filtering)

The regular expressions used for container filtering are based on Go's RE2 engine. This engine has some syntax limits compared to other engines, such as PCRE. Note the following when writing regular expressions:

1. Named group syntax differences

Go uses the (?P<name>...) syntax to define named groups. It does not support the (?<name>...) syntax used in PCRE.

  • Correct example: (?P<year>\d{4})

  • Incorrect example: (?<year>\d{4})
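Python's re module happens to accept the same (?P<name>...) form as Go's RE2, so the group syntax can be checked locally (full RE2 compatibility should still be validated with Go or Regex101's Golang mode):

```python
import re

# The RE2-compatible named-group form; (?<year>...) would be a syntax error here too.
m = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-08-16")
print(m.group("year"), m.group("month"))  # → 2024 08
```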

2. Unsupported regular expression features

The following common but complex regular expression features are not available in RE2. Avoid using them:

  • Assertions: (?=...), (?!...), (?<=...), (?<!...)

  • Conditional expressions: (?(condition)true|false)

  • Recursive matching: (?R), (?0)

  • Subroutine references: (?&name), (?P>name)

  • Atomic groups: (?>...)

3. Usage recommendations

When debugging regular expressions using tools such as Regex101, select the Golang (RE2) mode for validation to ensure compatibility. If the unsupported syntax mentioned above is used, the plugin cannot correctly parse or match.

Appendix: Comparison of old and new versions of container standard output

To improve log storage efficiency and collection consistency, the log metadata format for container standard output has been upgraded. The new format consolidates metadata into the __tag__ field, which achieves storage optimization and format standardization.

  1. Core advantages of the new standard output version

    • Significant performance improvement

      • Refactored in C++, the new version improves performance by 180% to 300% compared with the old Go implementation.

      • Supports native plugins for data processing and multi-threading parallel processing, which fully utilizes system resources.

      • Supports flexible combination of native and Go plugins to meet complex scenario requirements.

    • Greater reliability

      • Supports a standard output log rotation queue. The log collection mechanism is unified with the file collection mechanism, which provides high reliability in scenarios with rapid standard output log rotation.

    • Lower resource consumption

      • CPU usage is reduced by 20% to 25%.

      • Memory usage is reduced by 20% to 25%.

    • Enhanced O&M consistency

      • Unified parameter configuration: The configuration parameters of the new standard output collection plugin are consistent with the file collection plugin.

      • Unified metadata management: The naming of container metadata fields and the storage location of tag logs are unified with the file collection scenario. The consumer side needs to maintain only one set of processing logic.

  2. Comparison of new and old version features

    Feature dimension

    Old version features

    New version features

    Storage method

    Metadata is directly embedded in the log content as a normal field.

    Metadata is centrally stored in the __tag__ tag.

    Storage efficiency

    Each log carries the full metadata repeatedly, which consumes more storage space.

    Multiple logs in the same context can reuse metadata, which saves storage costs.

    Format consistency

    Inconsistent with the container file collection format.

    Field naming and storage structure are fully aligned with container file collection, which provides a unified experience.

    Query access method

    Can be queried directly by field name, such as _container_name_.

    Requires accessing the corresponding key-value through __tag__, such as __tag__: _container_name_.

  3. Container metadata field mapping table

    Old version field name

    New version field name

    _container_ip_

    __tag__:_container_ip_

    _container_name_

    __tag__:_container_name_

    _image_name_

    __tag__:_image_name_

    _namespace_

    __tag__:_namespace_

    _pod_name_

    __tag__:_pod_name_

    _pod_uid_

    __tag__:_pod_uid_

    In the new version, all metadata fields are stored in the tag area of the log in the format __tag__:<key>, rather than being embedded in the log content.

  4. Impact of new version changes on users

    • Consumer-side adaptation: Because the storage location has changed from "content" to "tag", the user's log consumption logic needs to be adjusted accordingly. For example, you must access fields through __tag__ during queries.

    • SQL compatibility: Query SQL has been automatically adapted for compatibility, so users do not need to modify their query statements to process both new and old version logs simultaneously.
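A minimal sketch of such consumer-side adaptation, using the field mapping table above (log records shown as plain dicts, not an SLS SDK type):

```python
# Old format: metadata embedded as normal fields in the log content.
old_log = {"_container_name_": "app", "content": "hello"}
# New format: metadata consolidated under __tag__-prefixed keys.
new_log = {"__tag__:_container_name_": "app", "content": "hello"}

def container_name(log: dict):
    """Prefer the new tag-style key; fall back to the old embedded field."""
    return log.get("__tag__:_container_name_", log.get("_container_name_"))

print(container_name(old_log), container_name(new_log))  # → app app
```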

More Information

Global Configuration Parameters

Configuration Item

Description

Configuration Name

The Logtail configuration name must be unique within its Project. You cannot modify the name after creating the Logtail configuration.

Log Topic Type

Select how to generate topics. Options include machine group topic, file path extraction, and custom.

Advanced Parameters

For other optional advanced features related to global configuration, see CreateLogtailPipelineConfig.

Input Configuration Parameters

Configuration Item

Description

Logtail Deployment Mode

DaemonSet: Deploy one LoongCollector on each node in the cluster. It collects logs from all containers on that node.

Sidecar: Each pod runs one LoongCollector container. It collects logs from all containers within that pod. Log collection for different pods is isolated.

File Path Type

Supports Path in Container and Host Path.

  • Path in Container: Select this to collect text log files from within containers.

  • Host Path: Select this to collect service logs from cluster nodes.

File Path

Set the log directory and file name based on the log's location on the host (such as ECS).

  • If the target host is Linux, the log path must start with a forward slash (/), such as /apsara/nuwa/**/app.Log.

  • If the target host is Windows, the log path must start with a drive letter, such as C:\Program Files\Intel\**\*.Log.

Directory names and file names support full mode and wildcard mode. For file name rules, see Wildcard matching. The log path wildcards only support the asterisk (*) and question mark (?).

Log file search uses multi-layer directory matching. All matching files in the specified directory (including all subdirectories) are found. For example:

  • /apsara/nuwa/**/*.log indicates files with the .log suffix in the /apsara/nuwa directory (including its recursive subdirectories).

  • /var/logs/app_*/**/*.log indicates files with the .log suffix in all directories matching the app_* format within the /var/logs directory (including their recursive subdirectories).

  • /var/log/nginx/**/access* indicates files starting with access in the /var/log/nginx directory (including its recursive subdirectories).
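The multi-layer matching above can be sketched in a few lines of Python. The translation rules below are an illustrative assumption, not the actual Logtail matcher: `**/` spans zero or more directory levels, while `*` and `?` stay within a single path segment.

```python
import re

def sls_path_to_regex(pattern: str) -> re.Pattern:
    """Translate an SLS-style file path pattern into a regex (sketch).

    Assumption for illustration: '**' spans zero or more directory
    levels; '*' and '?' match within a single path segment.
    """
    parts, i = [], 0
    while i < len(pattern):
        if pattern[i:i + 3] == '**/':
            parts.append(r'(?:[^/]+/)*')   # zero or more directory levels
            i += 3
        elif pattern[i:i + 2] == '**':
            parts.append(r'.*')            # trailing '**': rest of the path
            i += 2
        elif pattern[i] == '*':
            parts.append(r'[^/]*')         # stays within one segment
            i += 1
        elif pattern[i] == '?':
            parts.append(r'[^/]')
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile('^' + ''.join(parts) + '$')

rx = sls_path_to_regex('/apsara/nuwa/**/*.log')
print(bool(rx.match('/apsara/nuwa/app.log')))          # True  (zero levels deep)
print(bool(rx.match('/apsara/nuwa/sub/dir/app.log')))  # True  (recursive match)
print(bool(rx.match('/apsara/nuwa/app.txt')))          # False (wrong suffix)
```

Note that under this reading, `**` also matches zero directory levels, so files directly in `/apsara/nuwa` are included.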

Maximum Directory Monitoring Depth

Set the maximum depth for monitoring log directories. This is the maximum directory depth matched by the wildcard ** in File Path. A value of 0 means only monitor the current directory.

Standard Output

Enable Standard Output to collect the container's standard output (stdout).

Standard Error

Enable Standard Error to collect container standard error.

Allow Multiple Standard Output Collection

By default, a container's standard output can match only one new-version standard output collection configuration. If the standard output needs to be collected by multiple such configurations, enable the Allow File to Be Collected for Multiple Times switch.

Enable Container Metadata Preview

Enable Enable Container Metadata Preview to view container metadata after creating the Logtail configuration. This includes matched container information and full container information.

Container Filtering

  • Filter Condition Description

Important
  • Container labels are from Docker inspect, not Kubernetes. For information about how to obtain container labels, see Get Container Label.

  • Environment variables are configured during container startup. For information about how to obtain container environment variables, see Get Container Environment Variables.

  • In Kubernetes scenarios, use Kubernetes-level information (such as K8s Pod Name Regular Matching, K8s Namespace Regular Matching, K8s Container Name Regular Matching, and Kubernetes Pod Label Whitelist) for container filtering.

  1. Kubernetes Namespace and container name map to io.kubernetes.pod.namespace and io.kubernetes.container.name in container labels. Use these container labels for container filtering. For example, if a pod belongs to the backend-prod namespace and its container name is worker-server, to collect logs from this container, set the container label whitelist to io.kubernetes.pod.namespace : backend-prod or io.kubernetes.container.name : worker-server.

  2. If the two container labels above do not meet your filtering requirements, use environment variable blacklists and whitelists for container filtering.

K8s Pod Name Regular Matching

Specify containers to collect by Pod name. This supports regular expression matching. For example, set it to ^(nginx-log-demo.*)$ to match all containers in pods whose names start with nginx-log-demo.

K8s Namespace Regular Matching

Specify containers to collect by Namespace name. This supports regular expression matching. For example, set it to ^(default|nginx)$ to match all containers in the nginx and default namespaces.

K8s Container Name Regular Matching

Specify containers to collect by container name (defined in `spec.containers`). This supports regular expression matching. For example, set it to ^(container-test)$ to match all containers named container-test.
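The three regular-expression filters above use standard regex syntax. A quick Python check of the example patterns (illustrative; Python's `re` is assumed to match the console's semantics here):

```python
import re

# Example patterns from the descriptions above.
pod_name_re = re.compile(r'^(nginx-log-demo.*)$')
namespace_re = re.compile(r'^(default|nginx)$')
container_re = re.compile(r'^(container-test)$')

print(bool(pod_name_re.match('nginx-log-demo-7c9b8')))  # True
print(bool(namespace_re.match('nginx')))                # True
print(bool(namespace_re.match('kube-system')))          # False
print(bool(container_re.match('container-test')))       # True
```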

Container Label Whitelist (We recommend that you configure this parameter in a Docker environment and do not configure this parameter in a Kubernetes environment.)

This is a whitelist for container labels, used to specify containers to collect. It is empty by default, meaning all container standard output is collected. To set a container label whitelist, `LabelKey` is required, and `LabelValue` is optional.

  • If `LabelValue` is empty, all containers whose labels contain `LabelKey` are matched.

  • If `LabelValue` is not empty, only containers whose labels contain `LabelKey=LabelValue` are matched.

    `LabelValue` uses string match by default. This means `LabelValue` must exactly match the container label's value. If the value starts with ^ and ends with $, it uses regular expression matching. For example, set LabelKey to io.kubernetes.container.name and LabelValue to ^(nginx|cube)$. This matches containers named nginx or cube.

Multiple whitelists use an OR relationship. If a container label satisfies any whitelist entry, it is matched.
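The whitelist semantics described above can be summarized in a short Python sketch. The function name and data shapes are invented for illustration; this is not the actual Logtail implementation.

```python
import re

def label_whitelist_match(container_labels: dict, whitelist: dict) -> bool:
    """Sketch of the whitelist rules above: entries are OR-ed; an empty
    LabelValue matches on key presence alone; a value wrapped in ^...$
    is treated as a regex, otherwise an exact string match applies."""
    if not whitelist:
        return True  # empty whitelist: collect everything
    for label_key, label_value in whitelist.items():
        if label_key not in container_labels:
            continue
        actual = container_labels[label_key]
        if label_value == '':
            return True
        if label_value.startswith('^') and label_value.endswith('$'):
            if re.fullmatch(label_value, actual):
                return True
        elif actual == label_value:
            return True
    return False

labels = {'io.kubernetes.container.name': 'nginx'}
print(label_whitelist_match(labels, {'io.kubernetes.container.name': '^(nginx|cube)$'}))  # True
print(label_whitelist_match(labels, {'io.kubernetes.container.name': 'cube'}))            # False
```

The same key/value semantics apply to the container label blacklist and the environment variable whitelist and blacklist below, with the match result excluding rather than selecting containers for blacklists.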

Container Label Blacklist (We recommend that you configure this parameter in a Docker environment and do not configure this parameter in a Kubernetes environment.)

This is a blacklist for container labels, used to exclude containers from collection. It is empty by default, meaning no containers are excluded. To set a container label blacklist, `LabelKey` is required, and `LabelValue` is optional.

  • If `LabelValue` is empty, all containers whose labels contain `LabelKey` are excluded.

  • If `LabelValue` is not empty, only containers whose labels contain `LabelKey=LabelValue` are excluded.

    `LabelValue` uses string match by default. This means `LabelValue` must exactly match the container label's value. If the value starts with ^ and ends with $, it uses regular expression matching. For example, set LabelKey to io.kubernetes.container.name and LabelValue to ^(nginx|cube)$. This matches containers named nginx or cube.

Multiple blacklists use an OR relationship. If a container label satisfies any blacklist entry, it is excluded.

Environment Variable Whitelist

This is a whitelist for environment variables, used to specify containers to collect. It is empty by default, meaning all container standard output is collected. To set an environment variable whitelist, `EnvKey` is required, and `EnvValue` is optional.

  • If `EnvValue` is empty, all containers whose environment variables contain `EnvKey` are matched.

  • If `EnvValue` is not empty, only containers whose environment variables contain `EnvKey=EnvValue` are matched.

    `EnvValue` uses string match by default. This means `EnvValue` must exactly match the environment variable's value. If the value starts with ^ and ends with $, it uses regular expression matching. For example, set EnvKey to NGINX_SERVICE_PORT and EnvValue to ^(80|6379)$. This matches containers with service ports 80 or 6379.

Multiple whitelists use an OR relationship. If a container's environment variables satisfy any key-value pair, it is matched.

Environment Variable Blacklist

This is a blacklist for environment variables, used to exclude containers from collection. It is empty by default, meaning no containers are excluded. To set an environment variable blacklist, `EnvKey` is required, and `EnvValue` is optional.

  • If `EnvValue` is empty, logs from all containers whose environment variables contain `EnvKey` are excluded.

  • If `EnvValue` is not empty, only containers whose environment variables contain `EnvKey=EnvValue` are excluded.

    `EnvValue` uses string match by default. This means `EnvValue` must exactly match the environment variable's value. If the value starts with ^ and ends with $, it uses regular expression matching. For example, set EnvKey to NGINX_SERVICE_PORT and EnvValue to ^(80|6379)$. This matches containers with service ports 80 or 6379.

Multiple blacklists use an OR relationship. If a container's environment variables satisfy any key-value pair, it is excluded.

Kubernetes Pod Label Whitelist

Specify containers to collect using the Kubernetes label whitelist. To set a Kubernetes label whitelist, `LabelKey` is required, and `LabelValue` is optional.

  • If `LabelValue` is empty, all containers whose Kubernetes labels contain `LabelKey` are matched.

  • If `LabelValue` is not empty, only containers whose Kubernetes labels contain `LabelKey=LabelValue` are matched.

    `LabelValue` uses string match by default. This means `LabelValue` must exactly match the Kubernetes label's value. If the value starts with ^ and ends with $, it uses regular expression matching. For example, set LabelKey to app and LabelValue to ^(test1|test2)$. This matches containers whose Kubernetes labels contain app:test1 or app:test2.

Multiple whitelists use an OR relationship. If a Kubernetes label satisfies any whitelist entry, it is matched.

Note
  • Modifying labels on Kubernetes workload resources (such as a Deployment) does not restart the existing worker Pods, so running Pods do not pick up the label change, which may cause matching rules to fail. When setting Kubernetes label blacklists and whitelists, use the labels defined on the Pod itself. For more information about Kubernetes labels, see Labels and Selectors.

Kubernetes Pod Label Blacklist

Exclude containers from collection using the Kubernetes label blacklist. To set a Kubernetes label blacklist, `LabelKey` is required, and `LabelValue` is optional.

  • If `LabelValue` is empty, all containers whose Kubernetes labels contain `LabelKey` are excluded.

  • If `LabelValue` is not empty, only containers whose Kubernetes labels contain `LabelKey=LabelValue` are excluded.

    `LabelValue` uses string match by default. This means `LabelValue` must exactly match the Kubernetes label's value. If the value starts with ^ and ends with $, it uses regular expression matching. For example, set LabelKey to app and LabelValue to ^(test1|test2)$. This matches containers whose Kubernetes labels contain app:test1 or app:test2.

Multiple blacklists use an OR relationship. If a Kubernetes label satisfies any blacklist entry, it is excluded.

Note
  • Modifying labels on Kubernetes workload resources (such as a Deployment) does not restart the existing worker Pods, so running Pods do not pick up the label change, which may cause matching rules to fail. When setting Kubernetes label blacklists and whitelists, use the labels defined on the Pod itself. For more information about Kubernetes labels, see Labels and Selectors.

Log Tag Enrichment

Add environment variables and Kubernetes labels to logs as log tags.

Environment Variables

Set environment variable extension fields. Log Service adds environment variable-related fields to logs. For example, set Environment Variable Name to VERSION and Tag Name to env_version. If the container contains the environment variable VERSION=v1.0.0, this information is added to the log as the field __tag__:__env_version__: v1.0.0.

Pod Labels

Set Kubernetes Pod extension fields. Log Service adds Kubernetes Pod-related fields to logs. For example, set Pod Label Name to app and Tag Name to k8s_pod_app. If Kubernetes contains the label app=serviceA, this information is added to the log as the field __tag__:__k8s_pod_app__: serviceA.
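Both enrichment rules map a source key to a tag name. A minimal Python sketch of the resulting tag fields, based on the two examples above (an illustrative assumption, not the actual Logtail code):

```python
def enrich_tags(env: dict, pod_labels: dict,
                env_rules: dict, label_rules: dict) -> dict:
    """Build __tag__ fields from environment variables and Pod labels,
    following the examples above. Rules map source key -> tag name."""
    tags = {}
    for env_key, tag_name in env_rules.items():
        if env_key in env:
            tags[f'__tag__:__{tag_name}__'] = env[env_key]
    for label_key, tag_name in label_rules.items():
        if label_key in pod_labels:
            tags[f'__tag__:__{tag_name}__'] = pod_labels[label_key]
    return tags

tags = enrich_tags({'VERSION': 'v1.0.0'}, {'app': 'serviceA'},
                   {'VERSION': 'env_version'}, {'app': 'k8s_pod_app'})
print(tags)
# {'__tag__:__env_version__': 'v1.0.0', '__tag__:__k8s_pod_app__': 'serviceA'}
```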

File Encoding

Select the encoding format for log files.

Initial Collection Size

When the configuration first takes effect, this is the amount of data, measured back from the end of the file, at which collection starts. The initial collection size defaults to 1024 KB.

  • If the file is smaller than 1024 KB at initial collection, collection starts from the beginning of the file.

  • If the file is larger than 1024 KB at initial collection, collection starts from the position 1024 KB before the end of the file.

You can modify the initial collection size here. Valid values: 0 to 10485760 KB.
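The starting offset implied by the rule above reduces to simple arithmetic; a one-function sketch (the function name is invented for illustration):

```python
def initial_start_offset_kb(file_size_kb: int, initial_kb: int = 1024) -> int:
    """Offset (KB from the start of the file) where the first collection
    begins, per the initial-collection-size rule above."""
    return max(0, file_size_kb - initial_kb)

print(initial_start_offset_kb(500))   # 0    -> the whole file is collected
print(initial_start_offset_kb(5000))  # 3976 -> only the last 1024 KB is collected
```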

Collection Blacklist

Enable the Collection Blacklist switch to configure blacklists. This ignores specified directories or files during collection. It supports full and wildcard matching for directories and file names. Wildcards only support the asterisk (*) and question mark (?).

Important
  • If you use wildcards in File Path but need to filter out some paths, specify the corresponding full path in the Collection Blacklist to ensure the blacklist configuration takes effect.

    For example, if File Path is /home/admin/app*/log/*.log, but you want to filter all subdirectories under /home/admin/app1*, select Directory Blacklist and set the directory to /home/admin/app1*/**. If you set it to /home/admin/app1*, the blacklist will not take effect.

  • Blacklist matching incurs computational overhead. Keep the number of blacklist entries under 10.

  • Directory paths cannot end with a forward slash (/). For example, if you set the path to /home/admin/dir1/, the directory blacklist will not take effect.

Supports setting file path blacklists, file blacklists, and directory blacklists. Details are as follows:

File Path Blacklist

  • Select File Path Blacklist and set the path to /home/admin/private*.log. This ignores all files in /home/admin/ that start with "private" and end with ".log" during collection.

  • Select File Path Blacklist and set the path to /home/admin/private*/*_inner.log. This ignores files ending with "_inner.log" within directories starting with "private" in /home/admin/. For example, the file /home/admin/private/app_inner.log is ignored, and /home/admin/private/app.log is collected.

File Blacklist

Select File Blacklist and set the file name to app_inner.log. This ignores all files named app_inner.log during collection.

Directory Blacklist

  • Select Directory Blacklist and set the directory to /home/admin/dir1. This ignores all files in the /home/admin/dir1 directory during collection.

  • Select Directory Blacklist and set the directory to /home/admin/dir*. This ignores all files in subdirectories starting with "dir" within /home/admin/ during collection.

  • Select Directory Blacklist and set the directory to /home/admin/*/dir. This ignores all files in second-level subdirectories named "dir" within /home/admin/ during collection. For example, files in /home/admin/a/dir are ignored, but files in /home/admin/a/b/dir are collected.
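The gotcha from the Important note above (a blacklist pattern without `/**` does not cover files below the matched directory) can be checked with a wildcard-to-regex translation. The translation rules are an assumption for illustration, not the actual Logtail matcher:

```python
import re

def to_regex(pattern: str) -> re.Pattern:
    """Illustrative wildcard translation (assumption): '**' spans zero or
    more directory levels; '*' and '?' stay within one path segment."""
    out, i = [], 0
    while i < len(pattern):
        if pattern[i:i + 3] == '**/':
            out.append(r'(?:[^/]+/)*'); i += 3
        elif pattern[i:i + 2] == '**':
            out.append(r'.*'); i += 2     # trailing '**': rest of the path
        elif pattern[i] == '*':
            out.append(r'[^/]*'); i += 1
        elif pattern[i] == '?':
            out.append(r'[^/]'); i += 1
        else:
            out.append(re.escape(pattern[i])); i += 1
    return re.compile('^' + ''.join(out) + '$')

f = '/home/admin/app1-svc/log/error.log'
# '/home/admin/app1*' alone does not reach files below the directory:
print(bool(to_regex('/home/admin/app1*').match(f)))      # False
# '/home/admin/app1*/**' covers the whole subtree:
print(bool(to_regex('/home/admin/app1*/**').match(f)))   # True
```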

Allow Multiple File Collection

By default, a log file can only match one Logtail configuration. If logs in a file need to be collected multiple times, enable the Allow File to Be Collected for Multiple Times switch.

Advanced Parameters

For other optional advanced features related to the file input plugin, see CreateLogtailPipelineConfig.

Processing Configuration Parameters

Configuration Item

Description

Log Sample

Provide a sample of the logs to collect. Use logs from your actual scenario. Log samples help configure log processing parameters and simplify configuration. You can add multiple samples, with a total length not exceeding 1500 characters.

[2023-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened
    at TestPrintStackTrace.f(TestPrintStackTrace.java:3)
    at TestPrintStackTrace.g(TestPrintStackTrace.java:7)
    at TestPrintStackTrace.main(TestPrintStackTrace.java:16)

Multiline Mode

  • Type of multiline logs: A multiline log spans multiple consecutive lines. The start of each log must be identified from the log content.

    • Custom: Use the Regex to Match First Line to distinguish each log.

    • Multi-line JSON: Each JSON object expands to multiple lines, such as:

      {
        "name": "John Doe",
        "age": 30,
        "address": {
          "city": "New York",
          "country": "USA"
        }
      }
  • Handling of splitting failures:

    Exception in thread "main" java.lang.NullPointerException
        at com.example.MyClass.methodA(MyClass.java:12)
        at com.example.MyClass.methodB(MyClass.java:34)
        at com.example.MyClass.main(MyClass.java:50)

    For the log content above, if Log Service fails to split it into a single multiline log:

    • Discard: Discard this log directly.

    • Retain Single Line: Retain each log text line as a separate log, resulting in four logs.
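The first-line-regex splitting used in Custom mode can be sketched as follows. This is a simplified illustration (the function and its edge-case behavior are assumptions, not the actual Logtail implementation), using the log sample from above:

```python
import re

def split_multiline(lines, first_line_regex):
    """Group raw text lines into logs: a line matching the first-line
    regex starts a new log; other lines attach to the current log."""
    pattern = re.compile(first_line_regex)
    logs, current = [], []
    for line in lines:
        if pattern.match(line) and current:
            logs.append('\n'.join(current))
            current = []
        current.append(line)
    if current:
        logs.append('\n'.join(current))
    return logs

raw = [
    '[2023-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened',
    '    at TestPrintStackTrace.f(TestPrintStackTrace.java:3)',
    '[2023-10-01T10:30:02,000] [INFO] another entry',
]
logs = split_multiline(raw, r'\[\d{4}-\d{2}-\d{2}T')
print(len(logs))  # 2: the stack trace line stays attached to the first log
```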

Processing Mode

Processors includes Native Processor and Extended Processor. For more information about processing plugins, see Native and Extension Processing Plugin Usage.

Important

The allowed processing plugin combinations depend on the Logtail version. Follow the prompts in the console.

  • Logtail version 2.0:

    • Native processing plugins can be combined freely.

    • Native and extension processing plugins can be used together. Extension processing plugins must appear after all native processing plugins.

  • Logtail versions earlier than 2.0:

    • Native plugins and extension plugins cannot be added at the same time.

    • Native plugins are only for collecting text logs. When using native plugins, meet the following requirements:

      • The first processing plugin must be a regular expression parsing plugin, delimiter parsing plugin, JSON parsing plugin, NGINX parsing plugin, Apache parsing plugin, or IIS parsing plugin.

      • From the second processing plugin to the last, include at most one time parsing plugin, one filter plugin, and multiple desensitization plugins.

    • For the Retain Original Field if Parsing Fails and Retain Original Field if Parsing Succeeds parameters, only the following combinations are valid. Other combinations are invalid.

      • Upload only successfully parsed logs: turn off both Retain Original Field if Parsing Fails and Retain Original Field if Parsing Succeeds.

      • Upload parsed logs on success, and original logs on failure: turn on Retain Original Field if Parsing Fails and turn off Retain Original Field if Parsing Succeeds.

      • On success, upload parsed logs with the original log field appended; on failure, upload original logs: turn on both Retain Original Field if Parsing Fails and Retain Original Field if Parsing Succeeds.

        For example, if the original field content with the value {"request_method":"GET", "request_time":"200"} is parsed successfully, appending the original field adds an extra field to the parsed log. The field name is the Renamed Original Field (which defaults to the original field name if left empty), and the field value is the raw log {"request_method":"GET", "request_time":"200"}.

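The three valid combinations can be summarized in a small sketch. The function below is an invented illustration of the behavior described above, not the actual Logtail logic; `None` stands for a discarded log:

```python
from typing import Optional

def apply_retain_rules(raw: str, parsed: Optional[dict],
                       retain_on_fail: bool, retain_on_success: bool,
                       renamed_field: str = 'content') -> Optional[dict]:
    """Sketch of the Retain Original Field switches described above.

    parsed is None when parsing failed; the return value is the log
    that would be uploaded (None = discarded)."""
    if parsed is None:  # parsing failed
        return {renamed_field: raw} if retain_on_fail else None
    out = dict(parsed)
    if retain_on_success:  # append the raw log to the parsed fields
        out[renamed_field] = raw
    return out

raw = '{"request_method":"GET", "request_time":"200"}'
fields = {'request_method': 'GET', 'request_time': '200'}
print(apply_retain_rules(raw, fields, False, False))  # parsed fields only
print(apply_retain_rules(raw, None, False, False))    # None: log discarded
print(apply_retain_rules(raw, None, True, False))     # original log only
```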