In a Kubernetes environment, the Sidecar mode is an ideal log collection solution for fine-grained management of application logs, multi-tenant data isolation, or ensuring log collection is strictly bound to the application lifecycle. This mode works by injecting a separate LoongCollector (Logtail) container into your application pod. This setup enables dedicated log collection for that pod and offers powerful flexibility and isolation.
How it works
In Sidecar mode, an application container and a LoongCollector (Logtail) log collection container run side-by-side within your application pod. They work together using shared volumes and lifecycle synchronization mechanisms.
Log sharing: The application container writes its log files to a shared volume, typically an emptyDir. The LoongCollector (Logtail) container mounts the same shared volume, which allows it to read and collect these log files in real time.
Configuration association: Each LoongCollector (Logtail) Sidecar container declares its identity by setting a unique custom identifier. In the Simple Log Service (SLS) console, you must create a machine group that uses the same identifier. All Sidecar instances with that identifier then automatically apply the collection configurations of that machine group.
Lifecycle synchronization: To prevent log loss when a pod terminates, the application container and the LoongCollector (Logtail) container communicate using signal files (cornerstone and tombstone) in a shared volume. This mechanism works in conjunction with the pod's graceful termination period (terminationGracePeriodSeconds) to ensure a graceful shutdown: the application container stops writing first, LoongCollector finishes sending all remaining logs, and then both containers exit together.
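The cornerstone/tombstone handshake can be simulated locally with two shell processes that share a directory. This is a minimal sketch for illustration only; the signal file names match the ones used later in this topic, everything else is hypothetical:

```shell
#!/bin/bash
# Simulate the Sidecar handshake: two processes coordinate through signal files.
tasksite=$(mktemp -d)

# "Sidecar" process: announce readiness, then wait for the app to finish.
(
  touch "$tasksite/cornerstone"                      # collector is ready
  until [[ -f "$tasksite/tombstone" ]]; do sleep 0.1; done
  echo "collector: app finished, draining remaining logs before exit"
) &

# "Application" process: wait for readiness, do work, announce completion.
until [[ -f "$tasksite/cornerstone" ]]; do sleep 0.1; done
echo "app: collector ready, running business logic"
touch "$tasksite/tombstone"                          # app is done
wait                                                 # both sides exit together
```

The same ordering guarantees apply in the pod: the application never starts before the collector is ready, and the collector never stops before the application has finished writing.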
Preparations
Before you collect logs, you must plan and create a project and a logstore to manage and store the logs. If you already have these resources, skip this step and proceed to Step 1: Inject the LoongCollector Sidecar container.
Project: A resource management unit in SLS that is used to isolate and manage logs for different projects or services.
Logstore: A log storage unit that is used to store logs.
Create a project
Create a logstore
Step 1: Inject the LoongCollector Sidecar container
Inject a LoongCollector Sidecar container into the application pod and configure shared volumes to enable log collection. If you have not deployed the application or are just testing, use the Appendix: YAML example to quickly validate the process.
1. Modify the application pod YAML configuration
Define shared volumes
In spec.template.spec.volumes, add three shared volumes at the same level as containers:

volumes:
  # Shared log directory (written by the application container, read by the Sidecar)
  - name: ${shared_volume_name}   # <-- The name must match the name in volumeMounts
    emptyDir: {}
  # Signal directory for inter-container communication (for graceful shutdown)
  - name: tasksite
    emptyDir:
      medium: Memory      # Use memory as the medium for better performance
      sizeLimit: "50Mi"
  # Shared host timezone configuration: Synchronizes the timezone for all containers in the pod
  - name: tz-config       # <-- The name must match the name in volumeMounts
    hostPath:
      path: /usr/share/zoneinfo/Asia/Shanghai   # Modify the timezone as needed

Configure application container mounts
In the volumeMounts section of your application container, such as your-business-app-container, add the following mount items. Ensure that the application container writes logs to the ${shared_volume_path} directory to enable log collection by LoongCollector.

volumeMounts:
  # Mount the shared log volume to the application log output directory
  - name: ${shared_volume_name}
    mountPath: ${shared_volume_path}   # Example: /var/log/app
  # Mount the communication directory
  - name: tasksite
    mountPath: /tasksite               # Shared directory for communication with the LoongCollector container
  # Mount the timezone file
  - name: tz-config
    mountPath: /etc/localtime
    readOnly: true

Inject the LoongCollector Sidecar container
In the spec.template.spec.containers array, append the following Sidecar container definition:

- name: loongcollector
  image: aliyun-observability-release-registry.cn-shenzhen.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
  command: ["/bin/bash", "-c"]
  args:
    - |
      echo "[$(date)] LoongCollector: Starting initialization"
      # Start the LoongCollector service
      /etc/init.d/loongcollectord start
      # Wait for the configuration to download and the service to be ready
      sleep 15
      # Verify the service status
      if /etc/init.d/loongcollectord status; then
        echo "[$(date)] LoongCollector: Service started successfully"
        touch /tasksite/cornerstone
      else
        echo "[$(date)] LoongCollector: Failed to start service"
        exit 1
      fi
      # Wait for the application container to complete (via the tombstone file signal)
      echo "[$(date)] LoongCollector: Waiting for business container to complete"
      until [[ -f /tasksite/tombstone ]]; do
        sleep 2
      done
      # Allow time to upload remaining logs
      echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
      sleep 30
      # Stop the service
      echo "[$(date)] LoongCollector: Stopping service"
      /etc/init.d/loongcollectord stop
      echo "[$(date)] LoongCollector: Shutdown complete"
  # Health check
  livenessProbe:
    exec:
      command: ["/etc/init.d/loongcollectord", "status"]
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 3
  # Resource configuration
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "2000m"
      memory: "2048Mi"
  # Environment variable configuration
  env:
    - name: ALIYUN_LOGTAIL_USER_ID
      value: "${your_aliyun_user_id}"
    - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
      value: "${your_machine_group_user_defined_id}"
    - name: ALIYUN_LOGTAIL_CONFIG
      value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
    # Enable full drain mode to ensure all logs are sent before the pod terminates
    - name: enable_full_drain_mode
      value: "true"
    # Append pod environment information as log tags
    - name: ALIYUN_LOG_ENV_TAGS
      value: "_pod_name_|_pod_ip_|_namespace_|_node_name_|_node_ip_"
    # Automatically inject pod and node metadata as log tags
    - name: "_pod_name_"
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: "_pod_ip_"
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: "_namespace_"
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: "_node_name_"
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: "_node_ip_"
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
  # Volume mounts (shared with the application container)
  volumeMounts:
    # Read-only mount for the application log directory
    - name: ${shared_volume_name}            # <-- Shared log directory name
      mountPath: ${dir_containing_your_files} # <-- Path to the shared directory in the sidecar
      readOnly: true
    # Mount the communication directory
    - name: tasksite
      mountPath: /tasksite
    # Mount the timezone
    - name: tz-config
      mountPath: /etc/localtime
      readOnly: true
2. Modify the application container's lifecycle logic
Depending on the workload type, you must modify the application container to support a coordinated exit with the Sidecar.
Short-lived tasks (Job/CronJob)
# 1. Wait for LoongCollector to be ready
echo "[$(date)] Business: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
  sleep 1
done
echo "[$(date)] Business: LoongCollector is ready, starting business logic"
# 2. Execute core business logic (ensure logs are written to the shared directory)
echo "Hello, World!" >> /app/logs/business.log
# 3. Save the exit code
retcode=$?
echo "[$(date)] Business: Task completed with exit code: $retcode"
# 4. Notify LoongCollector that the business task is complete
touch /tasksite/tombstone
echo "[$(date)] Business: Tombstone created, exiting"
exit $retcode

Long-lived services (Deployment/StatefulSet)
# Define the signal handler function
_term_handler() {
  echo "[$(date)] [nginx-demo] Caught SIGTERM, starting graceful shutdown..."
  # Send a QUIT signal to Nginx for a graceful stop
  if [ -n "$NGINX_PID" ]; then
    kill -QUIT "$NGINX_PID" 2>/dev/null || true
    echo "[$(date)] [nginx-demo] Sent SIGQUIT to Nginx PID: $NGINX_PID"
    # Wait for Nginx to stop gracefully
    wait "$NGINX_PID"
    EXIT_CODE=$?
    echo "[$(date)] [nginx-demo] Nginx stopped with exit code: $EXIT_CODE"
  fi
  # Notify LoongCollector that the application container has stopped
  echo "[$(date)] [nginx-demo] Writing tombstone file"
  touch /tasksite/tombstone
  exit $EXIT_CODE
}
# Register the signal handler
trap _term_handler SIGTERM SIGINT SIGQUIT
# Wait for LoongCollector to be ready
echo "[$(date)] [nginx-demo] Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
  sleep 1
done
echo "[$(date)] [nginx-demo] LoongCollector is ready, starting business logic"
# Start Nginx
echo "[$(date)] [nginx-demo] Starting Nginx..."
nginx -g 'daemon off;' &
NGINX_PID=$!
echo "[$(date)] [nginx-demo] Nginx started with PID: $NGINX_PID"
# Wait for the Nginx process
wait $NGINX_PID
EXIT_CODE=$?
# Also notify LoongCollector if the exit was not caused by a signal
if [ ! -f /tasksite/tombstone ]; then
  echo "[$(date)] [nginx-demo] Unexpected exit, writing tombstone"
  touch /tasksite/tombstone
fi
exit $EXIT_CODE

3. Set the graceful termination period
In spec.template.spec, set a sufficient termination grace period to ensure LoongCollector has enough time to upload the remaining logs.
spec:
  # ... Your other existing spec configurations ...
  template:
    spec:
      terminationGracePeriodSeconds: 600   # 10-minute graceful shutdown period

4. Variable descriptions
Variable | Description
${your_aliyun_user_id} | Set this to the ID of your Alibaba Cloud account. For more information, see Configure user identifiers.
${your_machine_group_user_defined_id} | Set a custom ID for the machine group. This ID is used to create a custom machine group. Important: Ensure that this ID is unique within the region of your project.
${your_region_config} | Specify the configuration based on the region of your SLS project and the network type used for access. For information about regions, see Service regions. Example: if your project is in the China (Hangzhou) region, use cn-hangzhou for internal network access or cn-hangzhou-internet for public network access.
${shared_volume_name} | Set a custom name for the volume. Important: the name must match the volume name referenced in the volumeMounts sections of both containers.
${shared_volume_path} | Set the mount path. This is the directory in the container where the text logs to be collected are located.
5. Apply the configuration and verify the result
Run the following command to deploy the changes:
kubectl apply -f <YOUR-YAML>

Check the pod status to confirm that the LoongCollector container was injected successfully:

kubectl describe pod <YOUR-POD-NAME>

If both containers (the application container and loongcollector) are in the Running state, the injection is successful.
Step 2: Create a machine group with a custom ID
This step registers the LoongCollector Sidecar instances with SLS. This lets you centrally manage and deliver collection configurations.
Procedure
Create a machine group
In the target project, choose Resources > Machine Groups in the left-side navigation pane. On the Machine Groups page, click Create Machine Group.
Configure the machine group
Configure the following parameters and click OK:
Name: The name of the machine group. This cannot be changed after creation. The naming conventions are as follows:
Can contain only lowercase letters, digits, hyphens (-), and underscores (_).
Must start and end with a lowercase letter or a digit.
Must be 2 to 128 characters in length.
Machine Group Identifier: Select Custom Identifier.
Custom Identifier: Enter the value of the ALIYUN_LOGTAIL_USER_DEFINED_ID environment variable that you set for the LoongCollector container in the YAML file in Step 1. The value must be an exact match; otherwise, the association fails.
Check the machine group heartbeat status
After the machine group is created, click its name and check the heartbeat status in the machine group status area.
OK: Indicates that LoongCollector has successfully connected to SLS and the machine group is registered.
FAIL:
The configuration may not have taken effect. It takes about 2 minutes for the configuration to become effective. Refresh the page and try again later.
If the status is still FAIL after 2 minutes, see Troubleshoot Logtail machine group issues to diagnose the problem.
Each pod corresponds to a separate LoongCollector instance. We recommend that you use different custom IDs for different services or environments to facilitate fine-grained management.
Step 3: Create a collection configuration
Define which log files LoongCollector collects, how it parses the log structure, and how it filters content. Then, bind the configuration to the machine group.
Procedure
On the Logstores page, click the > icon before the name of the target logstore to expand it. Click the + icon next to Data Collection. In the Quick Data Import dialog box, find the Kubernetes - File card and click Integrate Now.

Configure the machine group, and then click Next:
Scenario: Select Kubernetes Clusters.
Deployment Method: Select Sidecar.
Select Machine Group: In the Source Machine Group list, select the machine group with the custom ID that you created in Step 2, and click > to add it to the Applied Machine Groups list.
On the Logtail Configuration page, configure the Logtail collection rule as follows.
1. Global and input configurations
Define the name of the collection configuration, the log source, and the collection scope.
Global Configurations:
Configuration Name: A custom name for the collection configuration. This name must be unique within the project and cannot be changed after it is created. Naming conventions:
Can contain only lowercase letters, digits, hyphens (-), and underscores (_).
Must start and end with a lowercase letter or a digit.
Input Configurations:
Type: Text Log Collection.
Logtail Deployment Mode: Select Sidecar.
File Path Type:
Path in Container: Collect log files from within the container.
Host Path: Collect logs from local services on the host.
File Path: The path from which to collect logs.
Linux: The path must start with a forward slash (/). For example, /data/mylogs/**/*.log specifies all files with the .log extension in the /data/mylogs directory.
Windows: The path must start with a drive letter. For example, C:\Program Files\Intel\**\*.Log.
Maximum Directory Monitoring Depth: The maximum directory depth that the wildcard character ** in the File Path can match. The default is 0, which means only the current directory is monitored.
2. Log processing and structuring
Configure log processing rules to transform raw, unstructured logs into structured, searchable data. This improves the efficiency of log queries and analysis. We recommend that you first add a log sample:
In the Processor Configurations section of the Logtail Configuration page, click Add Sample Log and enter the log content to be collected. The system identifies the log format based on the sample and helps generate regular expressions and parsing rules, which simplifies the configuration.
Use case 1: Process multiline logs (such as Java stack logs)
Because logs such as Java exception stacks and JSON objects often span multiple lines, the default collection mode splits them into multiple incomplete records, which causes a loss of context. To prevent this, enable multiline mode and configure a Regex to Match First Line to merge consecutive lines of the same log into a single, complete log.
Example: in the default collection mode, each line of a raw log is treated as a separate log entry, which breaks a stack trace into incomplete records and loses context. With multiline mode enabled, a Regex to Match First Line identifies the start of each record, so the complete log is preserved with its full semantic structure.
Procedure: In the Processor Configurations section of the Logtail Configuration page, enable Multi-line Mode:
For Type, select Custom or Multi-line JSON.
Custom: For raw logs with a variable format, configure a Regex to Match First Line to identify the starting line of each log.
Regex to Match First Line: Automatically generate or manually enter a regular expression that matches the first line of a complete log. For example, the regular expression for the preceding example is \[\d+-\d+-\w+:\d+:\d+,\d+]\s\[\w+]\s.*.
Automatic generation: Click Generate. Then, in the Log Sample text box, select the log content that you want to extract and click Automatically Generate.
Manual entry: Click Manually Enter Regular Expression. After you enter the expression, click Validate.
Multi-line JSON: SLS automatically handles line breaks within a single raw log if the log is in standard JSON format.
Processing Method If Splitting Fails:
Discard: Discards a text segment if it does not match the start-of-line rule.
Retain Single Line: Retains unmatched text on separate lines.
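Before saving the configuration, you can sanity-check a first-line regex locally: each match should mark the start of exactly one logical record. This sketch uses grep -cE with a simplified, POSIX-compatible variant of the pattern; the sample log and the pattern are illustrative assumptions, not the exact regex from the console:

```shell
# Two logical records; the second spans three physical lines (a stack trace).
cat > /tmp/sample.log <<'EOF'
[2024-01-01 10:00:00,000] [INFO] service started
[2024-01-01 10:00:01,000] [ERROR] request failed
java.lang.RuntimeException: boom
    at com.example.Handler.run(Handler.java:42)
EOF

# Count lines that match the first-line pattern: expect 2 (records), not 4 (lines).
grep -cE '^\[[0-9-]+ [0-9:,]+\] \[[A-Z]+\]' /tmp/sample.log
```

If the count equals the number of physical lines rather than the number of records, the regex is too loose and continuation lines will be split off.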
Use case 2: Structured logs
When raw logs are unstructured or semi-structured text, such as NGINX access logs or application output logs, direct querying and analysis are often inefficient. SLS provides various data parsing plugins that can automatically convert raw logs of different formats into structured data. This provides a solid data foundation for subsequent analysis, monitoring, and alerting.
Example: a raw NGINX access log line is a single unstructured string. After parsing, it becomes discrete, queryable fields such as remote_addr, request, status, and body_bytes_sent.
Configuration steps: In the Processor Configurations section of the Logtail Configuration page:
Add a parsing plugin: Click Add Processor and configure regular expression parsing, separator parsing, and JSON parsing plugins according to the log format. For example, to collect NGINX logs, select Data Parsing (NGINX Mode).
NGINX Log Configuration: Copy the entire log_format definition from your NGINX server's configuration file (nginx.conf) and paste it into this text box. Example:

log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$request_time $request_length '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent"';

Important: The format definition here must be exactly the same as the format that generates the logs on the server. Otherwise, log parsing will fail.
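To see what structuring does, here is a rough local approximation of the field mapping using awk and a whitespace split. The sample line follows the log_format above; this is only an illustration, not the actual NGINX parsing plugin, which handles quoted and bracketed fields properly:

```shell
line='192.168.1.1 - - [01/Jan/2024:10:00:00 +0800] "GET /index.html HTTP/1.1" 0.003 512 200 612 "-" "curl/8.0"'

# With the log_format above, a naive whitespace split puts status in $11
# and body_bytes_sent in $12.
echo "$line" | awk '{print "remote_addr=" $1, "status=" $11, "body_bytes_sent=" $12}'
```

The parsing plugin produces the same kind of field mapping, but keyed by the variable names in log_format rather than by positional splitting.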
General configuration parameter descriptions: The following parameters appear in multiple data parsing plugins, and their functions and usage are consistent.
Original Field: Specifies the source field to be parsed. The default is content, which is the entire collected log entry.
Retain Original Field if Parsing Fails: We recommend that you enable this option. If a log cannot be parsed by the plugin (for example, due to a format mismatch), this option ensures that the raw log content is not lost and is fully retained in the specified raw field.
Retain Original Field if Parsing Succeeds: If selected, the raw log content is retained even if the log is parsed successfully.
3. Log filtering
During log collection, indiscriminately collecting a large volume of low-value or irrelevant logs (such as DEBUG/INFO level logs) not only wastes storage resources and increases costs but also affects query efficiency and poses data breach risks. Implement fine-grained filtering policies for efficient and secure log collection.
Reduce costs by filtering content
Filter fields based on log content, such as collecting only logs where the level is WARNING or ERROR.
Example: from a log stream that contains DEBUG, INFO, WARNING, and ERROR entries, collect only the entries whose level is WARNING or ERROR.
Procedure: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select Data Filtering:
Field Name: The log field to use for filtering.
Field Value: The regular expression used for filtering. Only full matches are supported, not partial keyword matches.
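The full-match behavior can be reproduced locally with grep -x, which also matches whole lines only (a local illustration, not the plugin itself; the level values are example data):

```shell
printf '%s\n' DEBUG INFO WARNING ERROR > /tmp/levels.txt

# Full matches only: the pattern WARNING|ERROR keeps exactly those two levels.
grep -xE 'WARNING|ERROR' /tmp/levels.txt

# A partial keyword such as WARN matches nothing in full-match mode.
grep -xE 'WARN' /tmp/levels.txt || echo "no match: WARN is only a partial keyword"
```

To match entries that merely contain a keyword, the regular expression must cover the whole value, for example .*WARN.*.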
Control the collection scope with a blacklist
Use a blacklist to exclude specified directories or files, which prevents irrelevant or sensitive logs from being uploaded.
Procedure: In the Input Configurations section of the Logtail Configuration page, enable Collection Blacklist and click Add.
Full and wildcard matching are supported for directories and filenames. The only supported wildcard characters are the asterisk (*) and the question mark (?).
File Path Blacklist: Specifies the file paths to exclude. Examples:
/home/admin/private*.log: Ignores all files in the /home/admin/ directory whose names start with private and end with .log.
/home/admin/private*/*_inner.log: Ignores files that end with _inner.log within directories that start with private under the /home/admin/ directory.
File Blacklist: A list of filenames to ignore during collection. Example:
app_inner.log: Ignores all files named app_inner.log during collection.
Directory Blacklist: Directory paths cannot end with a forward slash (/). Examples:
/home/admin/dir1/: The directory blacklist does not take effect because the path ends with a forward slash.
/home/admin/dir*: Ignores files in all subdirectories whose names start with dir under the /home/admin/ directory.
/home/admin/*/dir: Ignores all files in subdirectories named dir at the second level of the /home/admin/ directory. For example, files in /home/admin/a/dir are ignored, but files in /home/admin/a/b/dir are collected.
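Wildcard semantics of this kind can be tried out with a bash case pattern, where * matches any run of characters and ? matches exactly one. This is a local illustration of the pattern style, not SLS code, and shell glob details (such as * matching across /) may differ from the blacklist's exact behavior:

```shell
matches() {  # exit 0 if path $1 matches wildcard pattern $2
  case "$1" in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

pattern='/home/admin/private*.log'
matches '/home/admin/private_debug.log' "$pattern" && echo "excluded by blacklist"
matches '/home/admin/app.log' "$pattern" || echo "collected normally"
```

When in doubt, test a blacklist entry on a small set of representative paths before applying it to production collection.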
Container filtering
Set collection conditions based on container metadata (such as environment variables, pod labels, namespaces, and container names) to precisely control which containers' logs are collected.
Configuration steps: On the Logtail Configuration page, in the Processor Configurations area, enable Container Filtering and click Add.
Multiple conditions have an "AND" relationship. All regular expression matching is based on Go's RE2 regular expression engine, which has some limitations compared to engines such as PCRE. Follow the guidelines in Appendix: Regular expression limits (Container filtering) when you write regular expressions.
Environment Variable Blacklist/Whitelist: Specify conditions for the environment variables of the containers to be collected.
K8s Pod Label Blacklist/Whitelist: Specify conditions for the labels of the pods where the containers to be collected are located.
K8s Pod Name Regex Match: Specify the containers to be collected by pod name.
K8s Namespace Regex Match: Specify the containers to be collected by namespace name.
K8s Container Name Regex Match: Specify the containers to be collected by container name.
Container Label Blacklist/Whitelist: Collect containers whose container labels meet the conditions. Used for Docker use cases. Not recommended for Kubernetes use cases.
4. Log classification
In use cases where multiple applications or instances share the same log format, it is difficult to distinguish the log source. This leads to a lack of context during queries and reduces analysis efficiency. To solve this, configure topics and log tags to achieve automated context association and logical classification.
Configure topics
When the logs of multiple applications or instances have the same format but different paths (such as /apps/app-A/run.log and /apps/app-B/run.log), it is difficult to distinguish the source of the collected logs. Generate topics based on machine groups, custom names, or file path extraction to flexibly distinguish logs from different services or paths.
Procedure: Select a method for generating topics. The following three types are supported:
Machine Group Topic: When a collection configuration is applied to multiple machine groups, LoongCollector automatically uses the name of the server's machine group as the __topic__ field for upload. This is suitable for use cases where logs are divided by host.
Custom: Uses the format customized://<custom_topic_name>, such as customized://app-login. This format is suitable for static topic use cases with fixed business identifiers.
File Path Extraction: Extract key information from the full path of the log file to dynamically mark the log source. This is suitable for situations where multiple users or applications share the same log filename but have different paths. For example, when multiple users or services write logs to different top-level directories but the sub-paths and filenames are identical, the source cannot be distinguished by filename alone:

/data/logs
├── userA
│   └── serviceA
│       └── service.log
├── userB
│   └── serviceA
│       └── service.log
└── userC
    └── serviceA
        └── service.log

Configure File Path Extraction and use a regular expression to extract key information from the full path. The matched result is then uploaded to the logstore as the topic.
File path extraction rule: Based on regular expression capturing groups
When you configure a regular expression, the system automatically determines the output field format based on the number and naming of capturing groups. The rules are as follows:
In the regular expression for a file path, you must escape the forward slash (/).
Capturing group type | Use case | Generated field | Regex example | Matching path example | Generated field example
Single capturing group (only one (.*?)) | Only one dimension is needed to distinguish the source (such as username or environment) | Generates the __topic__ field | \/logs\/(.*?)\/app\.log | /logs/userA/app.log | __topic__: userA
Multiple unnamed capturing groups (multiple (.*?)) | Multiple dimensions are needed to distinguish the source, but no semantic tags are required | Generates tag fields __tag__:__topic_{i}__, where {i} is the ordinal number of the capturing group | \/logs\/(.*?)\/(.*?)\/app\.log | /logs/userA/svcA/app.log | __tag__:__topic_1__: userA; __tag__:__topic_2__: svcA
Multiple named capturing groups (using (?P<name>.*?)) | Multiple dimensions are needed to distinguish the source, and the field meanings should be clear for easy querying and analysis | Generates tag fields __tag__:{name} | \/logs\/(?P<user>.*?)\/(?P<service>.*?)\/app\.log | /logs/userA/svcA/app.log | __tag__:user: userA; __tag__:service: svcA
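A path-extraction regex can be spot-checked locally with any compatible regex tool before you save it. This sed sketch mirrors the two-group example from the table; note that sed -E uses numbered groups and needs no escaped forward slashes, so the pattern differs slightly from the console form:

```shell
path='/logs/userA/svcA/app.log'

# Extract the two path segments that would become the user and service tags.
echo "$path" | sed -E 's|^/logs/([^/]+)/([^/]+)/app\.log$|user=\1 service=\2|'
```

If the substitution leaves the path unchanged, the pattern does not match the full path and the topic would not be generated.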
Log tagging
Enable the log tag enrichment feature to extract key information from container environment variables or Kubernetes pod labels and append it as tags. This allows for fine-grained grouping of logs.
Configuration steps: In the Input Configurations section of the Logtail Configuration page, enable Log Tag Enrichment and click Add.
Environment Variables: Configure the environment variable name and tag name. The environment variable value will be stored in the tag name.
Environment Variable Name: Specify the name of the environment variable to extract.
Tag Name: The name of the environment variable tag.
Pod Labels: Configure the pod label name and tag name. The pod label value will be stored in the tag name.
Pod Label Name: The name of the Kubernetes pod label to extract.
Tag Name: The name of the tag.
Step 4: Configure query and analysis settings
After you configure log processing and plugins, click Next to go to the Query and Analysis Configurations page:
Full-text index is enabled by default, which supports keyword searches on raw log content.
For precise queries by field, wait for the Preview Data to load, and then click Automatic Index Generation. SLS generates a field index based on the first entry in the preview data.
After the configuration is complete, click Next to finish setting up the entire collection process.
Step 5: View uploaded logs
After you create a collection configuration and apply it to a machine group, the system automatically delivers the configuration and starts collecting incremental logs.
Confirm that new content is added to the log file: LoongCollector only collects incremental logs. Run tail -f /path/to/your/log/file and trigger a business operation to ensure that new logs are being written.
Query logs: Go to the query and analysis page of the target logstore and click Search & Analyze (the default time range is the last 15 minutes) to check whether new logs are flowing in. The default fields for container text logs are as follows:
Field | Description
__tag__:__hostname__ | The name of the container's host.
__tag__:__path__ | The path of the log file in the container.
__tag__:_container_ip_ | The IP address of the container.
__tag__:_image_name_ | The name of the image used by the container. Note: if multiple images have the same hash but different names or tags, the collection configuration selects one of the names based on the hash. The selected name is not guaranteed to match the one defined in the YAML file.
__tag__:_pod_name_ | The name of the pod.
__tag__:_namespace_ | The namespace to which the pod belongs.
__tag__:_pod_uid_ | The unique identifier (UID) of the pod.
Key configuration notes for log collection integrity
Ensuring the integrity of log collection is a core goal of a LoongCollector Sidecar deployment. The following configuration parameters directly affect the integrity and reliability of log data.
LoongCollector resource configuration
In high-data-volume use cases, a reasonable resource configuration is fundamental to ensuring collection performance on the client side. The key configuration parameters are as follows:
# Configure CPU and memory resources based on the log generation rate
resources:
  limits:
    cpu: "2000m"
    memory: "2Gi"
# Parameters that affect collection performance
env:
  - name: cpu_usage_limit
    value: "2"
  - name: mem_usage_limit
    value: "2048"
  - name: max_bytes_per_sec
    value: "209715200"
  - name: process_thread_count
    value: "8"
  - name: send_request_concurrency
    value: "20"
For more information about the relationship between specific data volumes and corresponding configurations, see Logtail network types, startup parameters, and configuration files.
Server-side quota configuration
Server-side quota limits or network anomalies can obstruct data sending on the client side, which creates backpressure on the file collection side and affects log integrity. We recommend that you use CloudLens for SLS to monitor project resource quotas.
Initial collection configuration optimization
The initial file collection policy at pod startup directly affects data integrity, especially in high-speed data writing use cases.
By configuring the initial collection size, specify the starting position for the first collection from a new file. The default initial collection size is 1,024 KB.
During the first collection, if the file is smaller than 1,024 KB, collection starts from the beginning of the file content.
During the first collection, if the file is larger than 1,024 KB, collection starts from the position 1,024 KB from the end of the file.
The initial collection size can range from 0 to 10,485,760 KB.
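The starting position described above can be sketched as a small calculation. This is illustrative only; sizes are in bytes, and initial_kb stands for the configured initial collection size:

```shell
initial_kb=1024   # configured initial collection size, in KB (default 1,024)

start_offset() {  # print the byte offset where the first collection starts
  local size_bytes=$1
  local limit=$(( initial_kb * 1024 ))
  if [ "$size_bytes" -le "$limit" ]; then
    echo 0                           # small file: read from the beginning
  else
    echo $(( size_bytes - limit ))   # large file: start 1,024 KB from the end
  fi
}

start_offset 500000    # 500 KB file: prints 0
start_offset 2097152   # 2 MiB file: prints 1048576
```

In high-throughput scenarios, increase the initial collection size so that less of a fast-growing file is skipped on first collection.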
enable_full_drain_mode
This is a key parameter for ensuring data integrity. It guarantees that LoongCollector completes all data collection and sending when it receives a SIGTERM signal.
# Parameter that affects collection integrity
env:
  - name: enable_full_drain_mode
    value: "true"   # Enable full drain mode
What to do next
Data visualization: Use visualization dashboards to monitor key metric trends.
Automated alerting for data anomalies: Set up alert policies to detect system anomalies in real time.
SLS only collects incremental logs. To collect historical logs, see Import historical log files.
Appendix: YAML example
This example shows a complete Kubernetes Deployment configuration that includes an application container (Nginx) and a LoongCollector Sidecar container. It is suitable for collecting container logs using the Sidecar mode.
Before you use it, make the following three key replacements:
Replace ${your_aliyun_user_id} with the UID of your Alibaba Cloud account.
Replace ${your_machine_group_user_defined_id} with the custom ID of the machine group that you created in Step 2. The value must be an exact match.
Replace ${your_region_config} with the configuration name that matches the region and network type of your SLS project. Example: if your project is in China (Hangzhou) and you use internal network access, set the value to cn-hangzhou. If you use public network access, set the value to cn-hangzhou-internet.
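The naming convention for ${your_region_config} can be expressed as a small helper (a sketch based on the examples above; region IDs other than cn-hangzhou are assumed to follow the same pattern):

```shell
region_config() {  # print the ${your_region_config} value for a region and network type
  local region=$1 network=$2
  if [ "$network" = "internet" ]; then
    echo "${region}-internet"   # public network access
  else
    echo "$region"              # internal network access
  fi
}

region_config cn-hangzhou intranet   # prints cn-hangzhou
region_config cn-hangzhou internet   # prints cn-hangzhou-internet
```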
Short-lived tasks (Job/CronJob)
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-job
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 3600
  completions: 1
  parallelism: 1
  template:
    spec:
      restartPolicy: Never
      terminationGracePeriodSeconds: 300
      containers:
        # Application container
        - name: demo-job
          image: debian:bookworm-slim
          command: ["/bin/bash", "-c"]
          args:
            - |
              # Wait for LoongCollector to be ready
              echo "[$(date)] Business: Waiting for LoongCollector to be ready..."
              until [[ -f /tasksite/cornerstone ]]; do
                sleep 1
              done
              echo "[$(date)] Business: LoongCollector is ready, starting business logic"
              # Execute business logic
              echo "Hello, World!" >> /app/logs/business.log
              # Save the exit code
              retcode=$?
              echo "[$(date)] Business: Task completed with exit code: $retcode"
              # Notify LoongCollector that the business task is complete
              touch /tasksite/tombstone
              echo "[$(date)] Business: Tombstone created, exiting"
              exit $retcode
          # Resource limits
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          # Volume mounts
          volumeMounts:
            - name: app-logs
              mountPath: /app/logs
            - name: tasksite
              mountPath: /tasksite
        # LoongCollector Sidecar container
        - name: loongcollector
          image: aliyun-observability-release-registry.cn-hongkong.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
          command: ["/bin/bash", "-c"]
          args:
            - |
              echo "[$(date)] LoongCollector: Starting initialization"
              # Start the LoongCollector service
              /etc/init.d/loongcollectord start
              # Wait for the configuration to download and the service to be ready
              sleep 15
              # Verify the service status
              if /etc/init.d/loongcollectord status; then
                echo "[$(date)] LoongCollector: Service started successfully"
                touch /tasksite/cornerstone
              else
                echo "[$(date)] LoongCollector: Failed to start service"
                exit 1
              fi
              # Wait for the application container to complete
              echo "[$(date)] LoongCollector: Waiting for business container to complete"
              until [[ -f /tasksite/tombstone ]]; do
                sleep 2
              done
              echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
              # Allow enough time to transmit remaining logs
              sleep 30
              echo "[$(date)] LoongCollector: Stopping service"
              /etc/init.d/loongcollectord stop
              echo "[$(date)] LoongCollector: Shutdown complete"
          # Health check
          livenessProbe:
            exec:
              command: ["/etc/init.d/loongcollectord", "status"]
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          # Resource configuration
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          # Environment variable configuration
          env:
            - name: ALIYUN_LOGTAIL_USER_ID
              value: "${your_aliyun_user_id}"
            - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
              value: "${your_machine_group_user_defined_id}"
            - name: ALIYUN_LOGTAIL_CONFIG
              value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
            - name: ALIYUN_LOG_ENV_TAGS
              value: "_pod_name_|_pod_ip_|_namespace_|_node_name_"
            # Pod information injection
            - name: "_pod_name_"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: "_pod_ip_"
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: "_namespace_"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: "_node_name_"
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          # Volume mounts
          volumeMounts:
            - name: app-logs
              mountPath: /app/logs
              readOnly: true
            - name: tasksite
              mountPath: /tasksite
            - name: tz-config
              mountPath: /etc/localtime
              readOnly: true
      # Volume definitions
      volumes:
        - name: app-logs
          emptyDir: {}
        - name: tasksite
          emptyDir:
            medium: Memory
            sizeLimit: "10Mi"
        - name: tz-config
          hostPath:
            path: /usr/share/zoneinfo/Asia/Shanghai
```
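The cornerstone/tombstone handshake that both containers perform above can be sketched language-agnostically. The following Python sketch mirrors the shell loops in the YAML; the signal-file names match the example, but the helper functions are hypothetical:

```python
import time
from pathlib import Path

def wait_for(path: Path, poll_seconds: float = 1.0, timeout: float = 300.0) -> bool:
    """Block until a signal file appears, mirroring `until [[ -f ... ]]; do sleep ...; done`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if path.exists():
            return True
        time.sleep(poll_seconds)
    return False

def app_container(tasksite: Path = Path("/tasksite")) -> None:
    """What the application container does around its business logic."""
    # 1. Wait for the collector's readiness signal before producing logs.
    wait_for(tasksite / "cornerstone")
    # ... business logic runs here and writes logs to the shared volume ...
    # 2. Signal completion so the collector can drain remaining logs and stop.
    (tasksite / "tombstone").touch()
```

The LoongCollector side is symmetric: it touches `cornerstone` once its service is up, then waits for `tombstone` before draining and stopping.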
Long-lived services (Deployment/StatefulSet)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-demo
  namespace: production
  labels:
    app: nginx-demo
    version: v1.0.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: nginx-demo
  template:
    metadata:
      labels:
        app: nginx-demo
        version: v1.0.0
    spec:
      terminationGracePeriodSeconds: 600 # 10-minute graceful shutdown period
      containers:
        # Application container - Web application
        - name: nginx-demo
          image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
          # Startup command and signal handling (POSIX sh syntax)
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Define the signal handler function
              _term_handler() {
                echo "[$(date)] [nginx-demo] Caught SIGTERM, starting graceful shutdown..."
                # Send a QUIT signal to Nginx for a graceful stop
                if [ -n "$NGINX_PID" ]; then
                  kill -QUIT "$NGINX_PID" 2>/dev/null || true
                  echo "[$(date)] [nginx-demo] Sent SIGQUIT to Nginx PID: $NGINX_PID"
                  # Wait for Nginx to stop gracefully
                  wait "$NGINX_PID"
                  EXIT_CODE=$?
                  echo "[$(date)] [nginx-demo] Nginx stopped with exit code: $EXIT_CODE"
                fi
                # Notify LoongCollector that the application container has stopped
                echo "[$(date)] [nginx-demo] Writing tombstone file"
                touch /tasksite/tombstone
                exit $EXIT_CODE
              }
              # Register the signal handler
              trap _term_handler TERM INT QUIT
              # Wait for LoongCollector to be ready
              echo "[$(date)] [nginx-demo] Waiting for LoongCollector to be ready..."
              until [ -f /tasksite/cornerstone ]; do
                sleep 1
              done
              echo "[$(date)] [nginx-demo] LoongCollector is ready, starting business logic"
              # Start Nginx
              echo "[$(date)] [nginx-demo] Starting Nginx..."
              nginx -g 'daemon off;' &
              NGINX_PID=$!
              echo "[$(date)] [nginx-demo] Nginx started with PID: $NGINX_PID"
              # Wait for the Nginx process
              wait $NGINX_PID
              EXIT_CODE=$?
              # Also notify LoongCollector if the exit was not caused by a signal
              if [ ! -f /tasksite/tombstone ]; then
                echo "[$(date)] [nginx-demo] Unexpected exit, writing tombstone"
                touch /tasksite/tombstone
              fi
              exit $EXIT_CODE
          # Resource configuration
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "1Gi"
          # Volume mounts
          volumeMounts:
            - name: nginx-logs
              mountPath: /var/log/nginx
            - name: tasksite
              mountPath: /tasksite
            - name: tz-config
              mountPath: /etc/localtime
              readOnly: true
        # LoongCollector Sidecar container
        - name: loongcollector
          image: aliyun-observability-release-registry.cn-shenzhen.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
          command: ["/bin/bash", "-c"]
          args:
            - |
              echo "[$(date)] LoongCollector: Starting initialization"
              # Start the LoongCollector service
              /etc/init.d/loongcollectord start
              # Wait for the configuration to download and the service to be ready
              sleep 15
              # Verify the service status
              if /etc/init.d/loongcollectord status; then
                echo "[$(date)] LoongCollector: Service started successfully"
                touch /tasksite/cornerstone
              else
                echo "[$(date)] LoongCollector: Failed to start service"
                exit 1
              fi
              # Wait for the application container to complete
              echo "[$(date)] LoongCollector: Waiting for business container to complete"
              until [[ -f /tasksite/tombstone ]]; do
                sleep 2
              done
              echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
              # Allow enough time to transmit remaining logs
              sleep 30
              echo "[$(date)] LoongCollector: Stopping service"
              /etc/init.d/loongcollectord stop
              echo "[$(date)] LoongCollector: Shutdown complete"
          # Health check
          livenessProbe:
            exec:
              command: ["/etc/init.d/loongcollectord", "status"]
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          # Resource configuration
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "2000m"
              memory: "2048Mi"
          # Environment variable configuration
          env:
            - name: ALIYUN_LOGTAIL_USER_ID
              value: "${your_aliyun_user_id}"
            - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
              value: "${your_machine_group_user_defined_id}"
            - name: ALIYUN_LOGTAIL_CONFIG
              value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
            # Enable full drain mode to ensure all logs are sent when the pod stops
            - name: enable_full_drain_mode
              value: "true"
            # Append pod environment information as log tags
            - name: ALIYUN_LOG_ENV_TAGS
              value: "_pod_name_|_pod_ip_|_namespace_|_node_name_|_node_ip_"
            # Get pod and node information
            - name: "_pod_name_"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: "_pod_ip_"
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: "_namespace_"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: "_node_name_"
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: "_node_ip_"
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
          # Volume mounts
          volumeMounts:
            - name: nginx-logs
              mountPath: /var/log/nginx
              readOnly: true
            - name: tasksite
              mountPath: /tasksite
            - name: tz-config
              mountPath: /etc/localtime
              readOnly: true
      # Volume definitions
      volumes:
        - name: nginx-logs
          emptyDir: {}
        - name: tasksite
          emptyDir:
            medium: Memory
            sizeLimit: "50Mi"
        - name: tz-config
          hostPath:
            path: /usr/share/zoneinfo/Asia/Shanghai
```
Appendix: Native parsing plugin details
In the Processor Configurations section of the Logtail Configuration page, add processors to structure raw logs. To add a processing plugin to an existing collection configuration, follow these steps:
1. In the navigation pane on the left, choose Logstores and find the target logstore. Click the expand icon before its name to expand the logstore.
2. Click Logtail Configuration. In the configuration list, find the target Logtail configuration and click Manage Logtail Configuration in the Actions column.
3. On the Logtail configuration page, click Edit.
This section introduces only commonly used processing plugins that cover common log processing use cases. For more features, see Extended processors.
Rules for combining plugins (for LoongCollector / Logtail 2.0 and later):
Native and extended processors can be used independently or combined as needed.
Prioritize native processors because they offer better performance and higher stability.
When native features cannot meet your business needs, add extended processors after the configured native ones for supplementary processing.
Order constraint:
All plugins are executed sequentially in the order they are configured, which forms a processing chain. Note: All native processors must precede any extended processors. After you add an extended processor, you cannot add more native processors.
Regular expression parsing
Extract log fields using a regular expression and parse the log into key-value pairs. Each field can be independently queried and analyzed.
Example:
Example: a raw, unprocessed log line is parsed into independently queryable key-value fields by the regular expression parsing plugin.
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select the regular expression parsing plugin:
Regular Expression: Used to match logs. Generate it automatically or enter it manually:
- Automatic generation: Click Generate, select the log content to be extracted in the Log Sample, and then click Generate Regular Expression.
- Manual entry: Enter the regular expression manually based on the log format.
After configuration, click Validate to test whether the regular expression can correctly parse the log content.
Extracted Field: Set the corresponding field name (Key) for the extracted log content (Value).
For more information about the other parameters, see the general configuration parameter descriptions in Use case 2: Structured logs.
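As a rough sketch of what the plugin produces, the following Python snippet extracts fields from a hypothetical Nginx-style access log line using a named-group regular expression. Python's `re` engine is not RE2, but this particular pattern is valid in both:

```python
import re

# Hypothetical Nginx-style access log line
line = '127.0.0.1 - - [16/Aug/2024:14:37:52 +0800] "GET /index.html HTTP/1.1" 200 612'

# Named groups use the (?P<name>...) syntax, which RE2 also accepts
pattern = re.compile(
    r'(?P<remote_addr>\S+) \S+ \S+ \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]+)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+)'
)

match = pattern.match(line)
fields = match.groupdict()  # each field becomes an independent key-value pair
print(fields["remote_addr"], fields["status"])  # 127.0.0.1 200
```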
Delimiter-based parsing
Structure log content using a delimiter, parsing it into multiple key-value pairs. Both single-character and multi-character delimiters are supported.
Example:
Example: a raw, unprocessed log line is split into fields by the specified delimiter.
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select the delimiter-based parsing plugin:
Delimiter: Specify the character used to split the log content.
Example: For a CSV file, select Custom and enter a comma (,).
Quote: When a field value contains the delimiter, you must specify a quote character to wrap the field to avoid incorrect splitting.
Extracted Field: Set a field name (Key) for each column in the order that they are separated. The following rules apply:
The field name can contain only letters, digits, and underscores (_).
Must start with a letter or an underscore (_).
Maximum length: 128 bytes.
For more information about the other parameters, see the general configuration parameter descriptions in Use case 2: Structured logs.
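The splitting behavior, including the quote character, can be approximated with Python's `csv` module. The log line and field names below are invented for illustration:

```python
import csv
import io

# A CSV-style log line where one field value contains the delimiter,
# so it is wrapped in the quote character (")
line = '2024-08-16,200,"GET /index.html?a=1,b=2",192.168.1.10'

# Split on the delimiter while honoring the quote character
row = next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))

# Assign a field name (key) to each column in order
keys = ["time", "status", "request", "client_ip"]
record = dict(zip(keys, row))
print(record["request"])  # GET /index.html?a=1,b=2
```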
Standard JSON parsing
Structure an Object-type JSON log by parsing it into key-value pairs.
Example:
Example: standard JSON key-value pairs are extracted automatically from a raw, unprocessed log.
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select the JSON parsing plugin:
Original Field: The default value is `content`. This field is used to store the raw log content to be parsed.
For more information about the other parameters, see the general configuration parameter descriptions in Use case 2: Structured logs.
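The effect on an Object-type log is equivalent to a plain JSON parse. For example, in Python (sample log invented):

```python
import json

# Hypothetical Object-type JSON log stored in the original field
raw = '{"url": "POST /PutData", "status": 200, "latency": "18204"}'

# Each top-level member becomes a key-value pair
record = json.loads(raw)
print(record["status"])  # 200
```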
Nested JSON parsing
Parse a nested JSON log into key-value pairs by specifying the expansion depth.
Example:
Example: a raw, unprocessed log compared with the results at expansion depth 0 and expansion depth 1, each using the expansion depth as a field-name prefix.
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select the nested JSON parsing plugin.
- Original Field: The name of the raw field to be expanded, for example, `content`.
- JSON Expansion Depth: The expansion level of the JSON object. 0 means fully expanded (default), 1 means the current level, and so on.
- Character to Concatenate Expanded Keys: The connector for field names during JSON expansion. The default is an underscore (_).
- Name Prefix of Expanded Keys: Specify a prefix for the field names after JSON expansion.
- Expand Array: Enable this to expand arrays into key-value pairs with indexes. Example: `{"k":["a","b"]}` is expanded to `{"k[0]":"a","k[1]":"b"}`.

To rename the expanded fields (for example, from prefix_s_key_k1 to new_field_name), add a Rename Fields plugin to complete the mapping.
For more information about the other parameters, see the general configuration parameter descriptions in Use case 2: Structured logs.
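A rough Python sketch of the expansion-depth semantics described above; the plugin's exact output (including the prefix option) may differ, and the function is hypothetical:

```python
def expand_json(obj, depth=0, connector="_", prefix="", level=1):
    """Flatten a nested dict into key-value pairs.

    depth=0 expands fully; depth=N stops expanding below level N.
    Key names are joined with `connector`.
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{connector}{key}" if prefix else key
        stop = depth != 0 and level >= depth  # reached the configured depth
        if isinstance(value, dict) and not stop:
            flat.update(expand_json(value, depth, connector, name, level + 1))
        else:
            flat[name] = value
    return flat

log = {"s_key": {"k1": {"k2": "v"}}}
print(expand_json(log, depth=0))  # {'s_key_k1_k2': 'v'}
print(expand_json(log, depth=1))  # {'s_key': {'k1': {'k2': 'v'}}}
```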
JSON array parsing
Use the json_extract function to extract JSON objects from a JSON array.
Example:
Example: JSON objects are extracted from a JSON array in a raw, unprocessed log.
Procedure: In the Processor Configurations section of the Logtail Configuration page, switch the Processing Mode to SPL, configure the SPL Statement, and use the json_extract function to extract JSON objects from the JSON array.
Example: Extract elements from the JSON array in the log field content and store the results in new fields json1 and json2.

```
* | extend json1 = json_extract(content, '$[0]'), json2 = json_extract(content, '$[1]')
```

Apache log parsing
Structure log content based on the definitions in the Apache log configuration file, parsing it into multiple key-value pairs.
Example:
Example: a raw, unprocessed Apache access log is parsed into key-value pairs.
Configuration steps: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select the Apache log parsing plugin.
Log Format: combined
APACHE LogFormat Configuration: The system automatically fills in the configuration based on the Log Format.
Important: Be sure to check the auto-filled content to ensure it is exactly the same as the LogFormat defined in the Apache configuration file on the server (usually located at /etc/apache2/apache2.conf).
For more information about the other parameters, see the general configuration parameter descriptions in Use case 2: Structured logs.
Data masking
Mask sensitive data in logs.
Example:
Example: sensitive content in a raw, unprocessed log is masked.
Procedure: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select the data masking plugin:
Original Field: The field that contains the log content before parsing.
Data Masking Method:
const: Replaces sensitive content with a constant string.
md5: Replaces sensitive content with its MD5 hash.
Replacement String: If Data Masking Method is set to const, enter a string to replace the sensitive content.
Content Expression that Precedes Replaced Content: The expression used to find sensitive content, which is configured using RE2 syntax.
Content Expression to Match Replaced Content: The regular expression used to match sensitive content. The expression must be written in RE2 syntax.
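The const masking method behaves roughly like a regex substitution that keeps the preceding context and replaces only the matched sensitive value. The sample log and patterns below are invented for illustration:

```python
import re

log = "account: 1234, password: qwerty123, ip: 10.0.0.1"

# Expression that precedes the sensitive content, and the expression
# that matches the sensitive content itself (RE2-compatible syntax)
preceded_by = r"password: "
sensitive = r"[^,]+"

# const method: keep the preceding context, replace the value with a constant
masked = re.sub(f"({preceded_by})(?:{sensitive})", r"\1********", log)
print(masked)  # account: 1234, password: ********, ip: 10.0.0.1
```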
Time parsing
Parse the time field in the log and set the parsing result as the log's __time__ field.
Example:
Example: the time field in a raw, unprocessed log is parsed and set as the log time.
Procedure: In the Processor Configurations section of the Logtail Configuration page, click Add Processor and select the time parsing plugin:
Original Field: The field that contains the log content before parsing.
Time Format: Set the time format that corresponds to the timestamps in the log.
Time Zone: Select the time zone for the log time field. By default, this is the time zone of the environment where the LoongCollector (Logtail) process is running.
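The parsing result can be reproduced in Python: parse the raw value with the configured time format, attach the configured time zone, and convert it to the Unix timestamp that `__time__` stores. The sample value is invented and UTC+8 is assumed:

```python
from datetime import datetime, timezone, timedelta

raw_time = "2024-08-16 14:37:52"  # hypothetical time field from a log

# Parse with the configured time format, then attach the configured
# time zone (UTC+8 here) and convert to a Unix timestamp
tz = timezone(timedelta(hours=8))
parsed = datetime.strptime(raw_time, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
log_time = int(parsed.timestamp())
print(log_time)
```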
Appendix: Regular expression limits (Container filtering)
The regular expressions used for container filtering are based on Go's RE2 engine, which has some syntax limitations compared to other engines such as PCRE. Note the following when you write regular expressions:
1. Named group syntax differences
Go uses the (?P<name>...) syntax to define named groups and does not support the (?<name>...) syntax from PCRE.
- Correct example: `(?P<year>\d{4})`
- Incorrect syntax: `(?<year>\d{4})`
2. Unsupported regular expression features
The following common but complex regular expression features are not available in RE2. Avoid using them:
- Lookaround assertions: `(?=...)`, `(?!...)`, `(?<=...)`, or `(?<!...)`
- Conditional expressions: `(?(condition)true|false)`
- Recursive matching: `(?R)` or `(?0)`
- Subroutine references: `(?&name)` or `(?P>name)`
- Atomic groups: `(?>...)`
3. Recommendations
We recommend that you use tools such as Regex101 to debug regular expressions. Select the Golang (RE2) mode for validation to ensure compatibility. If you use any of the unsupported syntax mentioned above, the plugin will not parse or match correctly.
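Python's `re` engine is not RE2, but it accepts the same `(?P<name>...)` named-group syntax and likewise rejects the PCRE-style `(?<name>...)` form, so it can sanity-check that particular rule (full RE2 validation is best done in Golang/RE2 mode on a tool such as Regex101):

```python
import re

# RE2-compatible named group: (?P<name>...)
m = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-08-16")
print(m.group("year"))  # 2024

# The PCRE-style (?<name>...) form is rejected by Python's re as well
try:
    re.compile(r"(?<year>\d{4})")
except re.error as e:
    print("invalid syntax:", e)
```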