In a Kubernetes environment, the Sidecar mode is an ideal log collection solution for fine-grained management of application logs, multi-tenant data isolation, or ensuring log collection is strictly bound to the application lifecycle. This mode works by injecting a separate LoongCollector (Logtail) container into your application pod. This setup enables dedicated log collection for that pod and offers powerful flexibility and isolation.
How it works
In Sidecar mode, an application container and a LoongCollector (Logtail) log collection container run side-by-side within your application pod. They work together using shared volumes and lifecycle synchronization mechanisms.
Log sharing: The application container writes its log files to a shared volume, typically an emptyDir volume. The LoongCollector (Logtail) container mounts the same shared volume, which allows it to read and collect these log files in real time.
Configuration association: Each LoongCollector (Logtail) Sidecar container declares its identity by setting a unique custom identifier. In the Simple Log Service console, you can create a machine group that uses the same identifier. This way, all Sidecar instances with the same identifier automatically apply the collection configurations from that machine group.
Lifecycle synchronization: To prevent log loss when a pod terminates, the application container and the LoongCollector (Logtail) container communicate using signal files (cornerstone and tombstone) in a shared volume. This mechanism works in conjunction with the pod's graceful termination period (terminationGracePeriodSeconds) to ensure a graceful shutdown. The application container stops writing first, LoongCollector finishes sending all remaining logs, and then both containers exit together.
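The signal-file handshake described above can be tried locally without a cluster. The following is a minimal sketch, not the actual LoongCollector logic: a temporary directory stands in for the shared signal volume, and the function names (sidecar, app, run_handshake) are hypothetical.

```shell
#!/usr/bin/env bash
# Local simulation of the cornerstone/tombstone handshake.
# A temporary directory stands in for the shared signal volume;
# no real LoongCollector process is involved.

sidecar() {
  local dir=$1
  touch "$dir/cornerstone"               # signal: collector is ready
  until [[ -f "$dir/tombstone" ]]; do    # wait for the application to finish
    sleep 0.1
  done
  echo "sidecar: tombstone seen, draining"
}

app() {
  local dir=$1
  until [[ -f "$dir/cornerstone" ]]; do  # wait for the collector to be ready
    sleep 0.1
  done
  echo "app: collector ready, doing work"
  touch "$dir/tombstone"                 # signal: application is done
}

run_handshake() {
  local dir
  dir=$(mktemp -d)
  sidecar "$dir" &
  local sidecar_pid=$!
  app "$dir"
  wait "$sidecar_pid"
  rm -rf "$dir"
}

run_handshake
```

The application never starts work before the cornerstone file exists, and the sidecar never begins its final drain before the tombstone file exists, which is the ordering the pod relies on.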
Preparations
Before you collect logs, you must create a project and a Logstore to manage and store the logs. If you already have these resources, skip this step and proceed to Step 1: Inject the LoongCollector Sidecar container.
Project: A resource management unit in Simple Log Service that is used to isolate and manage logs for different projects or services.
Logstore: A log storage unit that is used to store logs.
Create a project
Create a Logstore
Step 1: Inject the LoongCollector Sidecar container
Inject a LoongCollector Sidecar container into the application pod and configure shared volumes to enable log collection. If you have not deployed the application or are just testing, you can use the Appendix: YAML example to quickly validate the process.
1. Modify the application pod YAML configuration
Define shared volumes
In spec.template.spec.volumes, add three shared volumes at the same level as containers:
volumes:
  # Shared log directory (written by the application container, read by the Sidecar)
  - name: ${shared_volume_name} # <-- The name must match the name in volumeMounts
    emptyDir: {}
  # Signal directory for inter-container communication (for graceful shutdown)
  - name: tasksite
    emptyDir:
      medium: Memory # Use memory as the medium for better performance
      sizeLimit: "50Mi"
  # Shared host timezone configuration: Synchronizes the timezone for all containers in the pod
  - name: tz-config # <-- The name must match the name in volumeMounts
    hostPath:
      path: /usr/share/zoneinfo/Asia/Shanghai # Modify the timezone as needed
Configure application container mounts
In the volumeMounts section of your application container, such as your-business-app-container, add the following mount items:
Ensure that the application container writes logs to the ${shared_volume_path} directory to enable log collection by LoongCollector.
volumeMounts:
  # Mount the shared log volume to the application log output directory
  - name: ${shared_volume_name}
    mountPath: ${shared_volume_path} # Example: /var/log/app
  # Mount the communication directory
  - name: tasksite
    mountPath: /tasksite # Shared directory for communication with the LoongCollector container
  # Mount the timezone file
  - name: tz-config
    mountPath: /etc/localtime
    readOnly: true
Inject the LoongCollector Sidecar container
In the spec.template.spec.containers array, append the following Sidecar container definition:
- name: loongcollector
  image: aliyun-observability-release-registry.cn-shenzhen.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
  command: ["/bin/bash", "-c"]
  args:
    - |
      echo "[$(date)] LoongCollector: Starting initialization"
      # Start the LoongCollector service
      /etc/init.d/loongcollectord start
      # Wait for the configuration to download and the service to be ready
      sleep 15
      # Verify the service status
      if /etc/init.d/loongcollectord status; then
        echo "[$(date)] LoongCollector: Service started successfully"
        touch /tasksite/cornerstone
      else
        echo "[$(date)] LoongCollector: Failed to start service"
        exit 1
      fi
      # Wait for the application container to complete (via the tombstone file signal)
      echo "[$(date)] LoongCollector: Waiting for business container to complete"
      until [[ -f /tasksite/tombstone ]]; do
        sleep 2
      done
      # Allow time to upload remaining logs
      echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
      sleep 30
      # Stop the service
      echo "[$(date)] LoongCollector: Stopping service"
      /etc/init.d/loongcollectord stop
      echo "[$(date)] LoongCollector: Shutdown complete"
  # Health check
  livenessProbe:
    exec:
      command: ["/etc/init.d/loongcollectord", "status"]
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 3
  # Resource configuration
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "2000m"
      memory: "2048Mi"
  # Environment variable configuration
  env:
    - name: ALIYUN_LOGTAIL_USER_ID
      value: "${your_aliyun_user_id}"
    - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
      value: "${your_machine_group_user_defined_id}"
    - name: ALIYUN_LOGTAIL_CONFIG
      value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
    # Enable full drain mode to ensure all logs are sent before the pod terminates
    - name: enable_full_drain_mode
      value: "true"
    # Append pod environment context as log tags
    - name: ALIYUN_LOG_ENV_TAGS
      value: "_pod_name_|_pod_ip_|_namespace_|_node_name_|_node_ip_"
    # Automatically inject pod and node metadata as log tags
    - name: "_pod_name_"
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: "_pod_ip_"
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: "_namespace_"
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: "_node_name_"
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: "_node_ip_"
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
  # Volume mounts (shared with the application container)
  volumeMounts:
    # Read-only mount for the application log directory
    - name: ${shared_volume_name} # <-- Shared log directory name
      mountPath: ${dir_containing_your_files} # <-- Path to the shared directory in the sidecar
      readOnly: true
    # Mount the communication directory
    - name: tasksite
      mountPath: /tasksite
    # Mount the timezone
    - name: tz-config
      mountPath: /etc/localtime
      readOnly: true
2. Modify the application container's lifecycle logic
Depending on the workload type, modify the application container to support a coordinated exit with the Sidecar:
Short-lived tasks (Job/CronJob)
# 1. Wait for LoongCollector to be ready
echo "[$(date)] Business: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
  sleep 1
done
echo "[$(date)] Business: LoongCollector is ready, starting business logic"
# 2. Execute core business logic (ensure logs are written to the shared directory)
echo "Hello, World!" >> /app/logs/business.log
# 3. Save the exit code
retcode=$?
echo "[$(date)] Business: Task completed with exit code: $retcode"
# 4. Notify LoongCollector that the business task is complete
touch /tasksite/tombstone
echo "[$(date)] Business: Tombstone created, exiting"
exit $retcode
Long-lived services (Deployment / StatefulSet)
# Define the signal handler function
_term_handler() {
  echo "[$(date)] [nginx-demo] Caught SIGTERM, starting graceful shutdown..."
  # Send a QUIT signal to Nginx for a graceful stop
  if [ -n "$NGINX_PID" ]; then
    kill -QUIT "$NGINX_PID" 2>/dev/null || true
    echo "[$(date)] [nginx-demo] Sent SIGQUIT to Nginx PID: $NGINX_PID"
    # Wait for Nginx to stop gracefully
    wait "$NGINX_PID"
    EXIT_CODE=$?
    echo "[$(date)] [nginx-demo] Nginx stopped with exit code: $EXIT_CODE"
  fi
  # Notify LoongCollector that the application container has stopped
  echo "[$(date)] [nginx-demo] Writing tombstone file"
  touch /tasksite/tombstone
  exit $EXIT_CODE
}
# Register the signal handler
trap _term_handler SIGTERM SIGINT SIGQUIT
# Wait for LoongCollector to be ready
echo "[$(date)] [nginx-demo]: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
  sleep 1
done
echo "[$(date)] [nginx-demo]: LoongCollector is ready, starting business logic"
# Start Nginx
echo "[$(date)] [nginx-demo] Starting Nginx..."
nginx -g 'daemon off;' &
NGINX_PID=$!
echo "[$(date)] [nginx-demo] Nginx started with PID: $NGINX_PID"
# Wait for the Nginx process
wait $NGINX_PID
EXIT_CODE=$?
# Also notify LoongCollector if the exit was not caused by a signal
if [ ! -f /tasksite/tombstone ]; then
  echo "[$(date)] [nginx-demo] Unexpected exit, writing tombstone"
  touch /tasksite/tombstone
fi
exit $EXIT_CODE
3. Set the graceful termination period
In spec.template.spec, set a sufficient termination grace period to ensure LoongCollector has enough time to upload the remaining logs.
spec:
  # ... Your other existing spec configurations ...
  template:
    spec:
      terminationGracePeriodSeconds: 600 # 10-minute graceful shutdown period
4. Variable descriptions
Variable | Description |
| Set this to the ID of your Alibaba Cloud account. For more information, see Configure user identifiers. |
| Set a custom ID for the machine group. This ID is used to create a custom machine group. Example: Important Ensure that this ID is unique within the region of your project. |
| Specify the configuration based on the region of your Simple Log Service project and the network type used for access. For information about regions, see Service regions. Example: If your project is in the China (Hangzhou) region, use |
| Set a custom name for the volume. Important The |
| Set the mount path. This is the directory in the container where the text logs to be collected are located. |
5. Apply the configuration and verify the result
Run the following command to deploy the changes:
kubectl apply -f <YOUR-YAML>
Check the pod status to confirm that the LoongCollector container was injected successfully:
kubectl describe pod <YOUR-POD-NAME>
If the status of the two containers (the application container and loongcollector) is Normal, the injection is successful.
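As a scripted alternative to reading the kubectl describe output by eye, the per-container ready flags can be checked. This is a sketch under assumptions: the helper name all_containers_ready is hypothetical, and it expects the space-separated true/false list that kubectl prints for the jsonpath expression shown in the usage comment.

```shell
#!/usr/bin/env bash
# Succeeds only if every container reports ready=true.
# Input: a space-separated list such as "true true".
all_containers_ready() {
  local statuses=$1 s
  [[ -n $statuses ]] || return 1   # no statuses yet: not ready
  for s in $statuses; do
    [[ $s == true ]] || return 1
  done
}

# Usage (requires cluster access):
#   all_containers_ready "$(kubectl get pod <YOUR-POD-NAME> \
#     -o jsonpath='{.status.containerStatuses[*].ready}')" && echo "both containers ready"
```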
Step 2: Create a machine group with a custom ID
This step registers the LoongCollector Sidecar instances with Simple Log Service. This lets you centrally manage and deliver collection configurations.
Procedure
Create a machine group
In the target project, click
in the navigation pane on the left.
On the Machine Groups page, click .
Configure the machine group
Configure the following parameters and click OK:
Name: The name of the machine group. This cannot be changed after creation. The name must meet the following requirements:
The name can contain only lowercase letters, digits, hyphens (-), and underscores (_).
The name must start and end with a lowercase letter or a digit.
The length must be 2 to 128 characters.
Machine Group Identifier: Select Custom Identifier.
Custom Identifier: Enter the value of the ALIYUN_LOGTAIL_USER_DEFINED_ID environment variable that you set for the LoongCollector container in the YAML file in Step 1. The value must be an exact match. Otherwise, the association fails.
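The naming rules for the machine group Name listed above can be checked before you open the console. The following is a minimal sketch; the function name is hypothetical, and the console remains the authority on what it accepts.

```shell
#!/usr/bin/env bash
# Checks a candidate machine group name against the documented rules:
# 2 to 128 characters; lowercase letters, digits, hyphens, underscores;
# must start and end with a lowercase letter or a digit.
is_valid_machine_group_name() {
  local LC_ALL=C          # keep [a-z] strictly ASCII lowercase
  local name=$1
  (( ${#name} >= 2 && ${#name} <= 128 )) || return 1
  [[ $name =~ ^[a-z0-9][a-z0-9_-]*[a-z0-9]$ ]]
}

is_valid_machine_group_name "sidecar-app_01" && echo "sidecar-app_01: ok"
is_valid_machine_group_name "-sidecar" || echo "-sidecar: rejected"
```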
Check the machine group heartbeat status
After the machine group is created, click its name and check the heartbeat status in the machine group status area.
OK: Indicates that LoongCollector has successfully connected to Simple Log Service and the machine group is registered.
FAIL:
The configuration may not have taken effect. It takes about 2 minutes for the configuration to become effective. You can refresh the page and try again later.
If the status is still FAIL after 2 minutes, see Troubleshoot Logtail machine group issues to diagnose the problem.
Each pod corresponds to a separate LoongCollector instance. We recommend that you use different custom IDs for different services or environments to facilitate fine-grained management.
Step 3: Create a collection configuration
In this step, you will define which log files LoongCollector collects, how it parses the log structure, and how it filters content. Then, you will apply the configuration to the machine group.
Procedure
On the Logstores page, click the icon before the name of the target Logstore to expand it.
Click the icon next to Data Ingestion. In the Quick Data Ingestion dialog box, find the Kubernetes - File card and click Ingest Now.
Configure the machine group, and then click Next:
Scenario: Select Kubernetes Clusters.
Deployment Method: Select Sidecar.
Select machine group: In the Source Machine Group list, select the custom identifier-based machine group that you created in Step 2, and click the icon to add it to the Applied Machine Group list.
On the Logtail Configuration page, configure the Logtail collection rule.
1. Global and input configurations
Define the name of the collection configuration, the log source, and the collection scope.
Global Configurations:
Configuration Name: A custom name for the collection configuration. This name must be unique within the project and cannot be changed after it is created. Naming conventions:
Can contain only lowercase letters, digits, hyphens (-), and underscores (_).
Must start and end with a lowercase letter or a digit.
Input Configuration:
Type: Text Log Collection.
Logtail Deployment Mode: Select Sidecar.
File Path Type:
Path In Container: Collects log files from within the container.
Host Path: Collects logs from local services on the host.
File Path: The path from which logs are collected.
Linux: The path must start with a forward slash (/). For example, /data/mylogs/**/*.log specifies all files with the .log extension in the /data/mylogs directory.
Windows: The path must start with a drive letter. For example, C:\Program Files\Intel\**\*.Log.
Maximum Directory Monitoring Depth: The maximum directory depth that the wildcard character ** in the File Path can match. The default value is 0, which means that only the current directory is monitored.
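To get a feel for how the monitoring depth limits which files a path such as /data/mylogs/**/*.log can reach, the effect can be approximated locally with find. This is only an illustration under an assumption: depth 0 is treated as "files directly in the base directory" and each extra level admits one more layer of subdirectories. The function name is hypothetical and the collector's own matching is not reproduced here.

```shell
#!/usr/bin/env bash
# Lists *.log files under a base directory, descending at most `depth`
# levels of subdirectories (depth 0 = the base directory only).
files_within_depth() {
  local base=$1 depth=$2
  find "$base" -maxdepth "$((depth + 1))" -type f -name '*.log' | sort
}

# Usage (hypothetical path):
#   files_within_depth /data/mylogs 0   # only /data/mylogs/*.log
#   files_within_depth /data/mylogs 2   # also two levels of subdirectories
```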
2. Log processing and structuring
Configure log processing rules to transform raw, unstructured logs into structured, searchable data. This improves the efficiency of log queries and analysis. We recommend that you first add a log sample:
In the Processor Configurations section of the Logtail Configuration page, click Add Sample Log and enter the log content to be collected. The system identifies the log format based on the sample and helps generate regular expressions and parsing rules, which simplifies the configuration.
Use case 1: Process multiline logs (such as Java stack logs)
Because logs such as Java exception stacks and JSON objects often span multiple lines, the default collection mode splits them into multiple incomplete records, which causes a loss of context. To prevent this, enable multiline mode and configure a Regex to Match First Line to merge consecutive lines of the same log into a single, complete log.
Example:
Raw log without any processing | In default collection mode, each line is a separate log, breaking the stack trace and losing context | With multiline mode enabled, a Regex to Match First Line identifies the complete log, preserving its full semantic structure. |
Procedure: In the Processor Configurations section of the Logtail Configuration page, enable Multi-line Mode:
For Type, select Custom or Multi-line JSON.
Custom: For raw logs with a variable format, configure a Regex to Match First Line to identify the starting line of each log.
Regex to Match First Line: Automatically generate or manually enter a regular expression that matches a complete line of data. For example, the regular expression for the preceding example is \[\d+-\d+-\w+:\d+:\d+,\d+]\s\[\w+]\s.*.
Automatic generation: Click Generate. Then, in the Log Sample text box, select the log content that you want to extract and click Automatically Generate.
Manual entry: Click Manually Enter Regular Expression. After you enter the expression, click Validate.
Multi-line JSON: SLS automatically handles line breaks within a single raw log if the log is in standard JSON format.
Processing Method If Splitting Fails:
Discard: Discards a text segment if it does not match the start-of-line rule.
Retain Single Line: Retains unmatched text on separate lines.
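The effect of a first-line regex can be previewed on a sample file before configuring the console. The sketch below uses awk rather than the collector itself, so it is an approximation: the regex is a POSIX translation of the example pattern above (awk has no \d or \w), the sample log lines in the usage comment are invented for illustration, and lines that appear before the first matching line are dropped, which mirrors the Discard option.

```shell
#!/usr/bin/env bash
# Merges continuation lines into the preceding record. A record starts at a
# line matching the first-line pattern, for example:
#   [2023-10-01T10:30:01,000] [INFO] message
# Continuation lines are joined with a literal "\n" so that each merged
# record prints on one output line.
merge_multiline() {
  awk '
    /^\[[0-9]+-[0-9]+-[0-9A-Za-z_]+:[0-9]+:[0-9]+,[0-9]+\] \[[A-Za-z0-9_]+\] / {
      if (have) print rec      # flush the previous record
      rec = $0; have = 1; next
    }
    have { rec = rec "\\n" $0 }  # append a continuation line
    END  { if (have) print rec }
  '
}

# Usage (hypothetical file name):
#   merge_multiline < app.log
```

A Java stack trace whose first line carries the timestamp and level then comes out as one record, with each "at ..." frame attached to it.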
Use case 2: Parse logs into structured data
When raw logs are unstructured or semi-structured text, such as NGINX access logs or application output logs, direct querying and analysis are often inefficient. SLS provides various data parsing plugins that can automatically convert raw logs of different formats into structured data. This provides a solid data foundation for subsequent analysis, monitoring, and alerting.
Example:
Raw log | Structured log |
Configuration steps: In the Processor Configurations section of the Logtail Configuration page:
Add a parsing plugin: To add a parsing plugin, click Add Processor and configure a plugin, such as a regular expression, separator, or JSON parsing plugin, based on the log format. For example, to collect NGINX logs, select .
NGINX Log Configuration: Copy the log_format definition from your Nginx server configuration file (nginx.conf) in its entirety and paste it into this text box.
Example:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$request_time $request_length '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent"';
Important: The format definition here must be exactly the same as the format that generates the logs on the server. Otherwise, log parsing fails.
General configuration parameter descriptions: The following parameters appear in multiple data parsing plugins, and their functions and usage are consistent.
Original Field: Specifies the source field to be parsed. The default is content, which is the entire collected log entry.
Retain Original Field if Parsing Fails: We recommend that you enable this option. If a log cannot be parsed by the plugin, for example, due to a format mismatch, this option ensures that the raw log content is retained in the specified raw field.
Retain Original Field if Parsing Succeeds: If selected, the raw log content is retained even if the log is parsed successfully.
3. Log filtering
Collecting a large volume of low-value or irrelevant logs, such as DEBUG or INFO level logs, wastes storage resources, increases costs, affects query efficiency, and poses data breach risks. You can implement fine-grained filtering policies for efficient and secure log collection.
Reduce costs by filtering content
Filter fields based on log content, such as collecting only logs where the level is WARNING or ERROR.
Example:
Raw log without any processing | Collect only |
Procedure: In the Processor Configurations section of the Logtail Configuration page
Click Add Processor and select :
Field Name: The log field to use for filtering.
Field Value: The regular expression used for filtering. Only full matches are supported, not partial keyword matches.
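Because the Field Value regex must match the whole field value, a pattern such as WARNING|ERROR keeps a level of ERROR but not ERROR_RETRY. The following sketch illustrates that full-match behavior with awk on tab-separated "level, message" lines. The input format and the function name are assumptions made for the example and do not describe how the filter processor is implemented.

```shell
#!/usr/bin/env bash
# Keeps only lines whose first tab-separated field fully matches the regex.
# The regex is wrapped in ^( ... )$ so that partial matches are rejected.
keep_levels() {
  local regex=$1
  awk -F '\t' -v re="^(${regex})\$" '$1 ~ re'
}

printf 'WARNING\tdisk almost full\nERROR\twrite failed\nINFO\tstarted\nERROR_RETRY\tretrying\n' |
  keep_levels 'WARNING|ERROR'
```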
Control the collection scope with a blacklist
Use a blacklist to exclude specified directories or files, which prevents irrelevant or sensitive logs from being uploaded.
Procedure: In the section of the Logtail Configuration page, enable Collection Blacklist and click Add.
Supports full and wildcard matching for directories and filenames. The only supported wildcard characters are the asterisk (*) and the question mark (?).
File Path Blacklist: Specifies the file paths to exclude. Examples:
/home/admin/private*.log: Ignores all files in the /home/admin/ directory that start with private and end with .log.
/home/admin/private*/*_inner.log: Ignores files that end with _inner.log within directories that start with private under the /home/admin/ directory.
File Blacklist: A list of filenames to ignore during collection. Example:
app_inner.log: Ignores all files named app_inner.log during collection.
Directory Blacklist: Directory paths cannot end with a forward slash (/). Examples:
/home/admin/dir1/: The directory blacklist will not take effect.
/home/admin/dir*: Ignores files in all subdirectories that start with dir under the /home/admin/ directory during collection.
/home/admin/*/dir: Ignores all files in subdirectories named dir at the second level of the /home/admin/ directory. For example, files in the /home/admin/a/dir directory are ignored, but files in the /home/admin/a/b/dir directory are collected.
Container filtering
You can set collection conditions based on container metadata, such as environment variables, pod labels, namespaces, and container names, to precisely control which containers' logs are collected.
Configuration steps: On the Logtail Configuration page, in the Input Configurations area, enable Container Filtering, and click Add.
Multiple conditions have an AND relationship. All regular expression matching is based on Go's RE2 regular expression engine, which has some limitations compared to engines such as PCRE. When you write regular expressions, follow the guidelines in Appendix: Regular expression limits (Container filtering).
Environment Variable Blacklist/Whitelist: Specify conditions for the environment variables of the containers to be collected.
K8s Pod Label Blacklist/Whitelist: Specify conditions for the labels of the pods where the containers to be collected are located.
K8s Pod Name Regex Match: Specify the containers to be collected by pod name.
K8s Namespace Regex Match: Specify the containers to be collected by namespace name.
K8s Container Name Regex Match: Specify the containers to be collected by container name.
Container Label Blacklist/Whitelist: Collects containers whose labels meet the specified conditions. This is used for Docker scenarios and is not recommended for Kubernetes scenarios.
4. Log classification
When multiple applications or instances share the same log format, it can be difficult to distinguish the log source. This lack of context reduces query and analysis efficiency. To solve this issue, you can configure topics and log tags to automate context association and logical classification.
Configure topics
When logs from multiple applications or instances have the same format but different paths, such as /apps/app-A/run.log and /apps/app-B/run.log, it is difficult to distinguish the log source. You can generate topics based on machine groups, custom names, or file path extraction to flexibly distinguish logs from different services or paths.
Procedure: : Select a method for generating topics. The following three types are supported:
Machine Group Topic: When a collection configuration is applied to multiple machine groups, LoongCollector automatically uses the name of the server's machine group as the __topic__ field for upload. This is suitable for use cases where logs are divided by host.
Custom: Uses the format customized://<custom_topic_name>, such as customized://app-login. This format is suitable for static topic use cases with fixed business identifiers.
File Path Extraction: Extract key information from the full path of the log file to dynamically mark the log source. This is suitable for situations where multiple users or applications share the same log filename but have different paths. For example, when multiple users or services write logs to different top-level directories but the sub-paths and filenames are identical, the source cannot be distinguished by filename alone:
/data/logs
├── userA
│   └── serviceA
│       └── service.log
├── userB
│   └── serviceA
│       └── service.log
└── userC
    └── serviceA
        └── service.log
Configure File Path Extraction and use a regular expression to extract key information from the full path. The matched result is then uploaded to the Logstore as the topic.
File path extraction rule: Based on regular expression capturing groups
When you configure a regular expression, the system automatically determines the output field format based on the number and naming of capturing groups. The rules are as follows:
In the regular expression for a file path, you must escape the forward slash (/).
Capturing group type | Use case | Generated field | Regex example | Matching path example | Generated field example |
Single capturing group (only one (.*?)) | Only one dimension is needed to distinguish the source (such as username or environment) | Generates the __topic__ field | \/logs\/(.*?)\/app\.log | /logs/userA/app.log | __topic__: userA |
Multiple capturing groups - unnamed (multiple (.*?)) | Multiple dimensions are needed to distinguish the source, but no semantic tags are required | Generates a tag field __tag__:__topic_{i}__, where {i} is the ordinal number of the capturing group | \/logs\/(.*?)\/(.*?)\/app\.log | /logs/userA/svcA/app.log | __tag__:__topic_1__:userA; __tag__:__topic_2__:svcA |
Multiple capturing groups - named (using (?P<name>.*?)) | Multiple dimensions are needed to distinguish the source, and the field meanings should be clear for easy querying and analysis | Generates a tag field __tag__:{name} | \/logs\/(?P<user>.*?)\/(?P<service>.*?)\/app\.log | /logs/userA/svcA/app.log | __tag__:user:userA; __tag__:service:svcA |
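The unnamed capturing-group rules above can be imitated in bash to check a path regex against sample paths. This sketch makes two substitutions that should be read as assumptions: bash uses POSIX extended regular expressions, so the lazy group (.*?) is replaced with ([^/]+) and the forward slashes are left unescaped, and named groups are not covered because POSIX ERE does not support them. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Prints the fields that the unnamed capturing-group rules would generate:
# one group  -> __topic__
# many groups -> __tag__:__topic_1__, __tag__:__topic_2__, ...
extract_topic() {
  local path=$1 re=$2 n i
  [[ $path =~ $re ]] || return 1
  n=$(( ${#BASH_REMATCH[@]} - 1 ))   # number of capturing groups
  if (( n == 1 )); then
    echo "__topic__: ${BASH_REMATCH[1]}"
  else
    for (( i = 1; i <= n; i++ )); do
      echo "__tag__:__topic_${i}__: ${BASH_REMATCH[i]}"
    done
  fi
}

extract_topic /logs/userA/app.log '^/logs/([^/]+)/app\.log$'
extract_topic /logs/userA/svcA/app.log '^/logs/([^/]+)/([^/]+)/app\.log$'
```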
Log tagging
You can enable the log tag enrichment feature to extract key information from container environment variables or Kubernetes pod labels and append it as tags. This allows for fine-grained grouping of logs.
Configuration steps: On the Logtail Configuration page, in the Input Configurations section, enable Log Tag Enrichment and click Add.
Environment Variables: Configure the environment variable name and tag name. The environment variable value is stored in the tag name.
Environment Variable Name: Specify the name of the environment variable to extract.
Tag Name: The name of the environment variable tag.
Pod Labels: Configure the pod label name and tag name. The pod label value is stored in the tag.
Pod Label Name: The name of the Kubernetes pod label to extract.
Tag Name: The name of the tag.
5. Output configuration
By default, all logs are collected and sent to the current Logstore using lz4 compression. To distribute logs from the same source to different Logstores, perform the following steps:
Dynamic multi-destination distribution
Multi-destination sending is applicable only to LoongCollector 3.0.0 and later. Logtail does not support this feature.
You can configure up to five output destinations.
After you configure multiple output destinations, this collection configuration is no longer displayed in the collection configuration list of the current Logstore. For more information about how to view, modify, or delete the multi-destination distribution configuration, see How do I manage multi-destination distribution configurations?.
Configuration steps: On the Logtail Configuration page, in the Output Configuration area.
Click the icon to expand the output configuration.
Click Add Output Target and complete the following configuration:
Logstore: Select the target Logstore.
Compression Method: Supports lz4 and zstd.
Routing Configuration: Routes logs based on their tag field. Logs that meet the routing configuration are uploaded to the target Logstore. If the routing configuration is empty, all collected logs are uploaded to the target Logstore.
Tag Name: The name of the tag field used for routing. Enter the field name directly, such as __path__. The __tag__: prefix is not required. Tag fields are divided into the following two types:
Agent-related: Related to the collection agent itself and independent of any plugins. Examples include __hostname__ and __user_defined_id__.
Input plugin-related: Dependent on the input plugin, which provides and enriches the log with relevant information. Examples include __path__ for file collection, and _pod_name_ and _container_name_ for Kubernetes collection.
For more information about tags, see Manage LoongCollector collection tags.
Tag Value: If the value of a log's tag field matches this value, the log is sent to the target Logstore.
Discard Tag Field: If you enable this option, the uploaded logs do not contain the tag field.
Step 4: Configure query and analysis settings
After you configure log processing and plugins, click Next to go to the Query and Analysis Configurations page:
Full-text index is enabled by default, which supports keyword searches on raw log content.
For precise queries by field, wait for the Preview Data to load, and then click Automatic Index Generation. SLS generates a field index based on the first entry in the preview data.
After the configuration is complete, click Next to finish setting up the entire collection process.
Step 5: View uploaded logs
After you create a collection configuration and apply it to a machine group, the system automatically delivers the configuration and starts collecting incremental logs.
Confirm that new content is added to the log file: LoongCollector collects only incremental logs. You can run tail -f /path/to/your/log/file and trigger a business operation to ensure that new logs are written.
Query logs: Go to the query and analysis page of the target Logstore and click Search & Analyze to check if new logs are being ingested. The default time range is the last 15 minutes. The default fields for container text logs are as follows:
Field Name | Description |
__tag__:__hostname__ | The name of the container's host. |
__tag__:__path__ | The path of the log file in the container. |
__tag__:_container_ip_ | The IP address of the container. |
__tag__:_image_name_ | The name of the image used by the container. Note: If there are multiple images with the same hash but different names or tags, the collection configuration will select one of the names based on the hash for collection. It cannot be guaranteed that the selected name will be consistent with the one defined in the YAML file. |
__tag__:_pod_name_ | The name of the pod. |
__tag__:_namespace_ | The namespace to which the pod belongs. |
__tag__:_pod_uid_ | The unique identifier (UID) of the pod. |
Key configuration notes for log collection integrity
Ensuring the integrity of log collection is a core goal of a LoongCollector Sidecar deployment. The following configuration parameters directly affect the integrity and reliability of log data.
LoongCollector resource configuration
In high-data-volume scenarios, a reasonable resource configuration is fundamental to ensuring collection performance on the client side. The key configuration parameters are as follows:
# Configure CPU and memory resources based on the log generation rate
resources:
limits:
cpu: "2000m"
memory: "2Gi"
# Parameters that affect collection performance
env:
- name: cpu_usage_limit
value: "2"
- name: mem_usage_limit
value: "2048"
- name: max_bytes_per_sec
value: "209715200"
- name: process_thread_count
value: "8"
- name: send_request_concurrency
value: "20"
For more information about the relationship between specific data volumes and corresponding configurations, see Logtail network types, startup parameters, and configuration files.
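For reference, the max_bytes_per_sec value in the example above, 209715200, corresponds to 200 MiB per second if the unit is bytes per second. That unit is inferred from the parameter name and is an assumption here. A small helper (hypothetical name) makes it easier to derive such values:

```shell
#!/usr/bin/env bash
# Converts MiB to bytes, for rate values that are expressed in bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 200   # prints 209715200
```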
Server-side quota configuration
Server-side quota limits or network anomalies can obstruct data sending on the client side. This creates backpressure on the file collection side and affects log integrity. We recommend that you use CloudLens for SLS to monitor project resource quotas.
Initial collection configuration optimization
The initial file collection policy at pod startup directly affects data integrity, especially in high-speed data writing scenarios.
By configuring the initial collection size, you can specify the starting position for the first collection from a new file. The default initial collection size is 1024 KB.
During the first collection, if the file is smaller than 1024 KB, collection starts from the beginning of the file.
During the first collection, if the file is larger than 1024 KB, collection starts from the position 1024 KB from the end of the file.
The initial collection size can range from 0 to 10,485,760 KB.
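The rule above can be sketched as a small calculation. This is only an illustration of the documented behavior, not LoongCollector code; the function name `first_read_offset` and its parameters are hypothetical.

```python
def first_read_offset(file_size_bytes: int, initial_size_kb: int = 1024) -> int:
    """Return the byte offset at which the first read of a new file would start."""
    if not 0 <= initial_size_kb <= 10_485_760:
        raise ValueError("initial collection size must be between 0 and 10,485,760 KB")
    window = initial_size_kb * 1024
    # A file that fits inside the window is read from the beginning.
    # A larger file is read starting `window` bytes before its end.
    return max(0, file_size_bytes - window)


print(first_read_offset(300 * 1024))       # 0
print(first_read_offset(5 * 1024 * 1024))  # 4194304
```

With the default of 1024 KB, a 5 MB file that already exists at pod startup loses its first 4 MB on the first collection, which is why a larger initial collection size can matter for applications that write quickly at startup.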
enable_full_drain_mode
This is a key parameter for ensuring data integrity. It guarantees that LoongCollector completes all data collection and sending when it receives a SIGTERM signal.
# Parameter that affects collection integrity
env:
  - name: enable_full_drain_mode
    value: "true"  # Enable full drain mode
FAQ
How do I manage multi-destination distribution configurations?
Because multi-destination distribution configurations are associated with multiple Logstores, they must be maintained on the project-level management page:
Log on to the Simple Log Service console and click the name of the target project.
On the target project page, choose
in the navigation pane on the left.
Note: This page centrally manages all collection configurations under the project, including those that remain after a Logstore is accidentally deleted.
What to do next
Data visualization: Use visualization dashboards to monitor key metric trends.
Automated alerting for data anomalies: Set up alert policies to detect system anomalies in real time.
Simple Log Service collects only incremental logs. To collect historical logs, see Import historical log files.
Appendix: YAML example
This example shows a complete Kubernetes Deployment configuration that includes an application container (Nginx) and a LoongCollector Sidecar container. It is suitable for collecting container logs using the Sidecar mode.
Before you begin, make the following three key replacements:
Replace ${your_aliyun_user_id} with the UID of your Alibaba Cloud account.
Replace ${your_machine_group_user_defined_id} with the custom ID of the machine group that you created in Step 3. The custom ID must be an exact match.
Replace ${your_region_config} with the configuration name that matches the region and network type of your Simple Log Service project.
Example: If your project is in China (Hangzhou) and you use internal network access, set the value to cn-hangzhou. If you use public network access, set the value to cn-hangzhou-internet.
Short-lived tasks (Job/CronJob)
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-job
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 3600
  completions: 1
  parallelism: 1
  template:
    spec:
      restartPolicy: Never
      terminationGracePeriodSeconds: 300
      containers:
        # Application container
        - name: demo-job
          image: debian:bookworm-slim
          command: ["/bin/bash", "-c"]
          args:
            - |
              # Wait for LoongCollector to be ready
              echo "[$(date)] Business: Waiting for LoongCollector to be ready..."
              until [[ -f /tasksite/cornerstone ]]; do
                sleep 1
              done
              echo "[$(date)] Business: LoongCollector is ready, starting business logic"
              # Execute business logic
              echo "Hello, World!" >> /app/logs/business.log
              # Save the exit code
              retcode=$?
              echo "[$(date)] Business: Task completed with exit code: $retcode"
              # Notify LoongCollector that the business task is complete
              touch /tasksite/tombstone
              echo "[$(date)] Business: Tombstone created, exiting"
              exit $retcode
          # Resource limits
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          # Volume mounts
          volumeMounts:
            - name: app-logs
              mountPath: /app/logs
            - name: tasksite
              mountPath: /tasksite
        # LoongCollector Sidecar container
        - name: loongcollector
          image: aliyun-observability-release-registry.cn-hongkong.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
          command: ["/bin/bash", "-c"]
          args:
            - |
              echo "[$(date)] LoongCollector: Starting initialization"
              # Start the LoongCollector service
              /etc/init.d/loongcollectord start
              # Wait for the configuration to download and the service to be ready
              sleep 15
              # Verify the service status
              if /etc/init.d/loongcollectord status; then
                echo "[$(date)] LoongCollector: Service started successfully"
                touch /tasksite/cornerstone
              else
                echo "[$(date)] LoongCollector: Failed to start service"
                exit 1
              fi
              # Wait for the application container to complete
              echo "[$(date)] LoongCollector: Waiting for business container to complete"
              until [[ -f /tasksite/tombstone ]]; do
                sleep 2
              done
              echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
              # Allow enough time to transmit remaining logs
              sleep 30
              echo "[$(date)] LoongCollector: Stopping service"
              /etc/init.d/loongcollectord stop
              echo "[$(date)] LoongCollector: Shutdown complete"
          # Health check
          livenessProbe:
            exec:
              command: ["/etc/init.d/loongcollectord", "status"]
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          # Resource configuration
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          # Environment variable configuration
          env:
            - name: ALIYUN_LOGTAIL_USER_ID
              value: "your-user-id"
            - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
              value: "your-user-defined-id"
            - name: ALIYUN_LOGTAIL_CONFIG
              value: "/etc/ilogtail/conf/cn-hongkong/ilogtail_config.json"
            - name: ALIYUN_LOG_ENV_TAGS
              value: "_pod_name_|_pod_ip_|_namespace_|_node_name_"
            # Pod information injection
            - name: "_pod_name_"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: "_pod_ip_"
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: "_namespace_"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: "_node_name_"
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          # Volume mounts
          volumeMounts:
            - name: app-logs
              mountPath: /app/logs
              readOnly: true
            - name: tasksite
              mountPath: /tasksite
            - name: tz-config
              mountPath: /etc/localtime
              readOnly: true
      # Volume definitions
      volumes:
        - name: app-logs
          emptyDir: {}
        - name: tasksite
          emptyDir:
            medium: Memory
            sizeLimit: "10Mi"
        - name: tz-config
          hostPath:
            path: /usr/share/zoneinfo/Asia/Shanghai
Long-lived services (Deployment / StatefulSet)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-demo
  namespace: production
  labels:
    app: nginx-demo
    version: v1.0.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: nginx-demo
  template:
    metadata:
      labels:
        app: nginx-demo
        version: v1.0.0
    spec:
      terminationGracePeriodSeconds: 600  # 10-minute graceful shutdown period
      containers:
        # Application container - Web application
        - name: nginx-demo
          image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
          # Startup command and signal handling
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Define the signal handler function
              _term_handler() {
                echo "[$(date)] [nginx-demo] Caught SIGTERM, starting graceful shutdown..."
                # Send a QUIT signal to Nginx for a graceful stop
                if [ -n "$NGINX_PID" ]; then
                  kill -QUIT "$NGINX_PID" 2>/dev/null || true
                  echo "[$(date)] [nginx-demo] Sent SIGQUIT to Nginx PID: $NGINX_PID"
                  # Wait for Nginx to stop gracefully
                  wait "$NGINX_PID"
                  EXIT_CODE=$?
                  echo "[$(date)] [nginx-demo] Nginx stopped with exit code: $EXIT_CODE"
                fi
                # Notify LoongCollector that the application container has stopped
                echo "[$(date)] [nginx-demo] Writing tombstone file"
                touch /tasksite/tombstone
                exit $EXIT_CODE
              }
              # Register the signal handler (signal names without the SIG prefix work in any POSIX sh)
              trap _term_handler TERM INT QUIT
              # Wait for LoongCollector to be ready ([ ] instead of [[ ]] because the script runs under /bin/sh)
              echo "[$(date)] [nginx-demo]: Waiting for LoongCollector to be ready..."
              until [ -f /tasksite/cornerstone ]; do
                sleep 1
              done
              echo "[$(date)] [nginx-demo]: LoongCollector is ready, starting business logic"
              # Start Nginx
              echo "[$(date)] [nginx-demo] Starting Nginx..."
              nginx -g 'daemon off;' &
              NGINX_PID=$!
              echo "[$(date)] [nginx-demo] Nginx started with PID: $NGINX_PID"
              # Wait for the Nginx process
              wait $NGINX_PID
              EXIT_CODE=$?
              # Also notify LoongCollector if the exit was not caused by a signal
              if [ ! -f /tasksite/tombstone ]; then
                echo "[$(date)] [nginx-demo] Unexpected exit, writing tombstone"
                touch /tasksite/tombstone
              fi
              exit $EXIT_CODE
          # Resource configuration
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "1Gi"
          # Volume mounts
          volumeMounts:
            - name: nginx-logs
              mountPath: /var/log/nginx
            - name: tasksite
              mountPath: /tasksite
            - name: tz-config
              mountPath: /etc/localtime
              readOnly: true
        # LoongCollector Sidecar container
        - name: loongcollector
          image: aliyun-observability-release-registry.cn-shenzhen.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
          command: ["/bin/bash", "-c"]
          args:
            - |
              echo "[$(date)] LoongCollector: Starting initialization"
              # Start the LoongCollector service
              /etc/init.d/loongcollectord start
              # Wait for the configuration to download and the service to be ready
              sleep 15
              # Verify the service status
              if /etc/init.d/loongcollectord status; then
                echo "[$(date)] LoongCollector: Service started successfully"
                touch /tasksite/cornerstone
              else
                echo "[$(date)] LoongCollector: Failed to start service"
                exit 1
              fi
              # Wait for the application container to complete
              echo "[$(date)] LoongCollector: Waiting for business container to complete"
              until [[ -f /tasksite/tombstone ]]; do
                sleep 2
              done
              echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
              # Allow enough time to transmit remaining logs
              sleep 30
              echo "[$(date)] LoongCollector: Stopping service"
              /etc/init.d/loongcollectord stop
              echo "[$(date)] LoongCollector: Shutdown complete"
          # Health check
          livenessProbe:
            exec:
              command: ["/etc/init.d/loongcollectord", "status"]
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          # Resource configuration
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "2000m"
              memory: "2048Mi"
          # Environment variable configuration
          env:
            - name: ALIYUN_LOGTAIL_USER_ID
              value: "${your_aliyun_user_id}"
            - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
              value: "${your_machine_group_user_defined_id}"
            - name: ALIYUN_LOGTAIL_CONFIG
              value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
            # Enable full drain mode to ensure all logs are sent when the pod stops
            - name: enable_full_drain_mode
              value: "true"
            # Append pod environment context as log tags
            - name: "ALIYUN_LOG_ENV_TAGS"
              value: "_pod_name_|_pod_ip_|_namespace_|_node_name_|_node_ip_"
            # Get pod and node information
            - name: "_pod_name_"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: "_pod_ip_"
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: "_namespace_"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: "_node_name_"
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: "_node_ip_"
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
          # Volume mounts
          volumeMounts:
            - name: nginx-logs
              mountPath: /var/log/nginx
              readOnly: true
            - name: tasksite
              mountPath: /tasksite
            - name: tz-config
              mountPath: /etc/localtime
              readOnly: true
      # Volume definitions
      volumes:
        - name: nginx-logs
          emptyDir: {}
        - name: tasksite
          emptyDir:
            medium: Memory
            sizeLimit: "50Mi"
        - name: tz-config
          hostPath:
            path: /usr/share/zoneinfo/Asia/Shanghai
Appendix: Native parsing plugin details
In the Processor Configurations section of the Logtail Configuration page, add processors to structure raw logs. To add a processing plugin to an existing collection configuration, follow these steps:
In the navigation pane on the left, choose Logstores and find the target logstore.
Click the icon before its name to expand the logstore.
Click Logtail Configuration. In the configuration list, find the target Logtail configuration and click Manage Logtail Configuration in the Actions column.
On the Logtail configuration page, click Edit.
This section introduces only commonly used processing plugins that cover common log processing use cases. For more features, see Extended processors.
Rules for combining plugins (for LoongCollector / Logtail 2.0 and later):
Native and extended processors can be used independently or combined as needed.
Prioritize native processors because they offer better performance and higher stability.
When native features cannot meet your business needs, add extended processors after the configured native ones for supplementary processing.
Order constraint:
All plugins are executed sequentially in the order they are configured, which forms a processing chain. Note: All native processors must precede any extended processors. After you add an extended processor, you cannot add more native processors.
Regular expression parsing
You can extract log fields using a regular expression and parse the log into key-value pairs. Each field can be independently queried and analyzed.
Example:
Raw log without any processing | Using the regular expression parsing plugin |
| |
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select :
Regular Expression: Used to match logs. You can generate it automatically or enter it manually:
Automatic generation:
Click Auto-generate Regular Expression.
In the Log Sample, select the log content that you want to extract.
Click Generate Regular Expression.

Manual input: Manually Enter A Regular Expression based on the log format.
After you configure the expression, click Validate to test whether the regular expression can correctly parse the log content.
Extracted Field: Set the corresponding field name (Key) for the extracted log content (Value).
For more information about the other parameters, see the general configuration parameter descriptions in Scenario 2: Structured logs.
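To make the mapping between capture groups and extracted fields concrete, here is a minimal sketch in Python. The sample log line, the pattern, and the key names are hypothetical, and Python's `re` module stands in for the plugin's own regular expression engine, so treat it as an approximation of the behavior rather than the plugin's implementation.

```python
import re

# Hypothetical sample line and pattern, chosen only for illustration.
LINE = '127.0.0.1 - - [16/Aug/2024:14:37:52 +0800] "GET /index.html HTTP/1.1" 200 612'
PATTERN = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d+) (\d+)')
# One key per capture group, in order (the "Extracted Field" setting).
KEYS = ["remote_addr", "time_local", "method", "url", "status", "body_bytes_sent"]


def parse_with_regex(line):
    """Return a dict of extracted fields, or None if the line does not match."""
    match = PATTERN.fullmatch(line)
    if match is None:
        return None
    return dict(zip(KEYS, match.groups()))


print(parse_with_regex(LINE))
```

Each capture group becomes one value, and the keys are assigned in the same order, which is why the number of extracted field names should equal the number of groups in the expression.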
Delimiter-based parsing
You can structure log content using a delimiter to parse it into multiple key-value pairs. Both single-character and multi-character delimiters are supported.
Example:
Raw log without any processing | Fields split by the specified character |
| |
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select :
Delimiter: Specify the character used to split the log content.
Example: For a CSV file, select Custom and enter a comma (,).
Quote: If a field value contains the delimiter, you must specify a quote character to wrap the field to avoid incorrect splitting.
Extracted Field: Set a field name (Key) for each column in the order that they are separated. The following rules apply:
The field name can contain only letters, digits, and underscores (_).
The name must start with a letter or an underscore (_).
The maximum length is 128 bytes.
For more information about the other parameters, see the general configuration parameter descriptions in Scenario 2: Structured logs.
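The interaction between the delimiter, the quote character, and the field-name rules can be illustrated with Python's standard `csv` module. This is a sketch under stated assumptions: the sample line and key names are hypothetical, the helper names are invented for this example, and `csv.reader` handles only single-character delimiters, whereas the plugin also supports multi-character ones.

```python
import csv
import re

# Letters, digits, and underscores only; must start with a letter or underscore.
FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_field_name(name):
    """Check a key against the naming rules listed above (128-byte limit included)."""
    return FIELD_NAME.fullmatch(name) is not None and len(name.encode("utf-8")) <= 128


def split_fields(line, keys, delimiter=",", quote='"'):
    """Split one log line into key-value pairs, honoring the quote character."""
    bad = [k for k in keys if not is_valid_field_name(k)]
    if bad:
        raise ValueError(f"invalid field names: {bad}")
    row = next(csv.reader([line], delimiter=delimiter, quotechar=quote))
    return dict(zip(keys, row))


line = '05/May/2025:13:30:28,10.10.1.1,"POST /PutData?Category=a,b HTTP/1.1",200'
print(split_fields(line, ["time", "ip", "request", "status"]))
```

Without the quote character, the comma inside the request field would produce five columns instead of four, and every later key would be assigned the wrong value.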
Standard JSON parsing
You can structure an Object-type JSON log by parsing it into key-value pairs.
Example:
Raw log without any processing | Automatic extraction of standard JSON key-value pairs |
| |
Configuration steps: On the Logtail Configuration page, in the Processor Configurations section, click Add Processor, and select :
Original Field: The default value is `content`. This field is used to store the raw log content to be parsed.
For more information about the other parameters, see the general configuration parameter descriptions in Scenario 2: Structured logs.
Nested JSON parsing
You can parse a nested JSON log into key-value pairs by specifying the expansion depth.
Example:
Raw log without any processing | Expansion depth: 0, using expansion depth as a prefix | Expansion depth: 1, using expansion depth as a prefix |
| | |
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select :
Original Field: The name of the raw field to be expanded, for example, content.
JSON Expansion Depth: The expansion level of the JSON object. A value of 0 indicates that the object is fully expanded, which is the default. A value of 1 indicates that only the current level is expanded.
Character To Concatenate Expanded Keys: The character used to concatenate field names during JSON expansion. The default is an underscore (_).
JSON Expansion Prefix: Specify a prefix for the field names after JSON expansion.
Expand Array: Enable this option to expand arrays into key-value pairs with indexes.
Example: {"k":["a","b"]} is expanded to {"k[0]":"a","k[1]":"b"}.
To rename the expanded fields, for example, from prefix_s_key_k1 to new_field_name, you can add a Rename Fields plugin to complete the mapping.
For more information about the other parameters, see the general configuration parameter descriptions in Scenario 2: Structured logs.
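The effect of the expansion depth, the concatenation character, and the array option can be approximated with a short recursive function. This is a sketch of the described behavior, not the plugin's code: the function `expand_json` and its parameter names are hypothetical, the prefix is assumed to be prepended as-is to each generated key, and values that are not expanded are assumed to be kept as JSON text.

```python
import json


def expand_json(raw, depth=0, connector="_", prefix="", expand_array=False):
    """Flatten a JSON object into key-value pairs (depth 0 means expand fully)."""
    out = {}

    def walk(value, key, level):
        can_descend = depth == 0 or level < depth
        if isinstance(value, dict) and can_descend:
            for k, v in value.items():
                walk(v, f"{key}{connector}{k}" if key else k, level + 1)
        elif isinstance(value, list) and expand_array and can_descend:
            for i, v in enumerate(value):
                walk(v, f"{key}[{i}]", level + 1)
        else:
            # Leaf value, or a nested value below the allowed depth.
            out[prefix + key] = value if isinstance(value, str) else json.dumps(value)

    walk(json.loads(raw), "", 0)
    return out


raw = '{"s_key": {"k1": {"k2": "v"}}, "k": ["a", "b"]}'
print(expand_json(raw))           # fully expanded
print(expand_json(raw, depth=1))  # only the top level is expanded
print(expand_json('{"k": ["a", "b"]}', expand_array=True))
```

With depth 1, `s_key` keeps its nested object as a single JSON string, while depth 0 produces the concatenated key `s_key_k1_k2`.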
JSON array parsing
Use the json_extract function to extract JSON objects from a JSON array.
Example:
Raw log without any processing | Extract JSON array structure |
| |
Procedure: In the Processor Configurations section of the Logtail Configuration page, switch the Processing Mode to SPL, configure the SPL Statement, and use the json_extract function to extract JSON objects from the JSON array.
Example: Extract elements from the JSON array in the log field content and store the results in new fields json1 and json2.
* | extend json1 = json_extract(content, '$[0]'), json2 = json_extract(content, '$[1]')
Apache log parsing
You can structure log content based on the definitions in the Apache log configuration file by parsing it into multiple key-value pairs.
Example:
Raw log without any processing | Apache Common Log Format |
| |
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor, and select :
Log Format: combined
Apache Configuration Fields: The system automatically fills in the configuration based on the Log Format.
ImportantMake sure that the auto-filled content is exactly the same as the LogFormat defined in the Apache configuration file on the server. The file is usually located at /etc/apache2/apache2.conf.
For more information about the other parameters, see the general configuration parameter descriptions in Scenario 2: Structured logs.
Data masking
Mask sensitive data in logs.
Example:
Raw log without any processing | Masking result |
| |
Procedure: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select :
Original Field: The field that contains the log content before parsing.
Data Masking Method:
const: Replaces sensitive content with a constant string.
md5: Replaces sensitive content with its MD5 hash.
Replacement String: If Data Masking Method is set to const, enter a string to replace the sensitive content.
Content Expression that Precedes Replaced Content: The regular expression that matches the text immediately before the sensitive content and is used to locate it. The expression must be written in RE2 syntax.
Content Expression to Match Replaced Content: The regular expression used to match sensitive content. The expression must be written in RE2 syntax.
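How the two expressions and the masking method work together can be sketched as follows. The function `mask` and its parameters are hypothetical, the sample log text is invented, and Python's `re` module is used in place of RE2, so this only approximates the plugin's behavior for simple patterns.

```python
import hashlib
import re


def mask(text, prefix_expr, content_expr, method="const", replacement="********"):
    """Replace whatever content_expr matches right after prefix_expr."""
    pattern = re.compile(f"(?P<pre>{prefix_expr})(?P<val>{content_expr})")

    def substitute(match):
        if method == "const":
            masked = replacement
        elif method == "md5":
            masked = hashlib.md5(match.group("val").encode("utf-8")).hexdigest()
        else:
            raise ValueError(f"unsupported masking method: {method}")
        # The preceding text is kept; only the sensitive part is replaced.
        return match.group("pre") + masked

    return pattern.sub(substitute, text)


log = "user=alice password='abc123' action=login"
print(mask(log, r"password='", r"[^']*"))
print(mask(log, r"password='", r"[^']*", method="md5"))
```

The preceding expression anchors the match so that only the value after `password='` is replaced, and the rest of the log line is left untouched.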
Time parsing
Parse the time field in the log and set the parsing result as the log's __time__ field.
Example:
Raw log without any processing | Time parsing |
|
|
Procedure: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select :
Original Field: The field that contains the log content before parsing.
Time Format: Set the time format that corresponds to the timestamps in the log.
Time Zone: Select the time zone for the log time field. By default, this is the time zone of the environment where the LoongCollector (Logtail) process is running.
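The relationship between the time format, the time zone, and the resulting timestamp can be shown with Python's `datetime`. This sketch assumes that the time format uses strptime-style directives such as %Y-%m-%d %H:%M:%S and that the result is a Unix timestamp in seconds; the function name, the sample value, and the default offset are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_log_time(value, time_format, utc_offset_hours=8):
    """Parse a log time string and return a Unix timestamp in seconds."""
    naive = datetime.strptime(value, time_format)
    # The log string carries no zone, so the configured time zone is applied.
    aware = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return int(aware.timestamp())


print(parse_log_time("2024-08-16 14:37:52", "%Y-%m-%d %H:%M:%S"))
print(parse_log_time("2024-08-16 14:37:52", "%Y-%m-%d %H:%M:%S", utc_offset_hours=0))
```

The same string yields timestamps that differ by eight hours depending on the configured zone, which is why the Time Zone setting should match the zone in which the application wrote the log.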
Appendix: Regular expression limits (Container filtering)
The regular expressions used for Container filtering are based on Go's RE2 engine, which has some syntax limitations compared to other engines such as PCRE. Note the following when you write regular expressions:
1. Named group syntax differences
Go uses the (?P<name>...) syntax to define named groups and does not support the (?<name>...) syntax from PCRE.
Correct example: (?P<year>\d{4})
Incorrect syntax: (?<year>\d{4})
2. Unsupported regular expression features
The following common but complex regular expression features are not available in RE2. You should avoid using them:
Assertions: (?=...), (?!...), (?<=...), or (?<!...)
Conditional expressions: (?(condition)true|false)
Recursive matching: (?R) or (?0)
Subprogram references: (?&name) or (?P>name)
Atomic groups: (?>...)
3. Recommendations
We recommend that you use a tool such as Regex101 to debug regular expressions. You can select the Golang (RE2) mode for validation to ensure compatibility. If you use any of the unsupported syntax, the plugin does not parse or match correctly.
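As a complement to an online debugger, a small pre-check can flag the constructs listed above before a pattern is saved. The helper below is a hypothetical sketch, not part of LoongCollector: it uses plain substring matching, so an escaped literal such as \(?= would be reported as a false positive, and it does not validate the rest of the expression.

```python
import re

# Constructs from the list above that RE2 rejects.
UNSUPPORTED = [
    ("(?=", "lookahead assertion"),
    ("(?!", "negative lookahead assertion"),
    ("(?<=", "lookbehind assertion"),
    ("(?<!", "negative lookbehind assertion"),
    ("(?(", "conditional expression"),
    ("(?R)", "recursive matching"),
    ("(?0)", "recursive matching"),
    ("(?&", "subprogram reference"),
    ("(?P>", "subprogram reference"),
    ("(?>", "atomic group"),
]
# (?<name>...) but not the lookbehind forms (?<= and (?<!
PCRE_NAMED_GROUP = re.compile(r"\(\?<[A-Za-z_]")


def find_unsupported(pattern):
    """Return descriptions of constructs in `pattern` that RE2 does not accept."""
    problems = [label for token, label in UNSUPPORTED if token in pattern]
    if PCRE_NAMED_GROUP.search(pattern):
        problems.append("PCRE-style named group (?<name>...); use (?P<name>...)")
    return problems


print(find_unsupported(r"(?P<year>\d{4})"))  # []
print(find_unsupported(r"(?<year>\d{4})"))
print(find_unsupported(r"foo(?=bar)"))
```

An empty list means that none of the listed constructs were found; it does not guarantee that the expression compiles, so validating in Golang (RE2) mode is still worthwhile.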