LoongCollector is the next-generation log collection agent for Alibaba Cloud Simple Log Service (SLS) and is an upgraded version of Logtail. This topic describes how to install LoongCollector in a Kubernetes cluster. You can install it in either DaemonSet mode or Sidecar mode.
Preparations
Before you install LoongCollector, verify the network connection between your cluster nodes and the SLS endpoint. This ensures that LoongCollector can report data correctly.
Obtain the service endpoint:
Log on to the Simple Log Service console. In the project list, click the destination project.
Click the icon next to the project name to go to the project overview page.
In the Endpoint section, locate the public and private endpoints for the project's region.
Test the connection: Log on to the cluster node where you plan to install the LoongCollector component and run the following curl command. Replace ${ProjectName} and ${SLS_ENDPOINT} with your actual information.

curl https://${ProjectName}.${SLS_ENDPOINT}

Check the result:

If the command returns {"Error":{"Code":"OLSInvalidMethod",...}}, the network connection between your node and SLS is working.

Note: This test verifies only network-layer connectivity. SLS returns an error because the request is missing required API parameters. This behavior is expected.

If the command times out or returns a network-layer error, such as Connection refused, the node cannot reach SLS. Check the node's network configuration, security group rules, and DNS resolution.
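For repeated checks across many nodes, the test above can be scripted. The following bash sketch is our own illustration, not part of any SLS tooling; the helper names `classify_sls_response` and `check_sls_connectivity` are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: interpret the body returned by the connectivity curl.
# An OLSInvalidMethod error body means the node reached SLS; anything else
# (empty body from a timeout, a refused connection, a DNS failure) means
# the node could not reach the endpoint.
classify_sls_response() {
  local body="$1"
  if [[ "$body" == *'"Code":"OLSInvalidMethod"'* ]]; then
    echo "connected"
  else
    echo "unreachable"
  fi
}

check_sls_connectivity() {
  local project="$1" endpoint="$2" body
  # --max-time bounds the test; a hung connection is treated as a failure.
  body=$(curl -s --max-time 10 "https://${project}.${endpoint}") || body=""
  classify_sls_response "$body"
}
```

Usage would look like `check_sls_connectivity my-project cn-hangzhou.log.aliyuncs.com`, with `connected` indicating a working network path.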
Choose an installation method
Choose an installation method from the following table based on your cluster type and requirements.
| Installation method | Use cases |
| --- | --- |
| Install on an ACK cluster (DaemonSet mode) | Collect logs from ACK managed and dedicated clusters within the same Alibaba Cloud account and region. |
| Install on a self-managed cluster (DaemonSet mode) | Collect logs from Kubernetes clusters in self-managed data centers, clusters on other cloud providers, or ACK clusters across different accounts or regions. |
| Install using the sidecar pattern | Collect logs from specific applications that require fine-grained log management, multi-tenant data isolation, or collection bound to the application lifecycle. |
Install on an ACK cluster (DaemonSet mode)
If you are using logtail-ds and want to upgrade to LoongCollector, you must uninstall logtail-ds before you install LoongCollector.
Install LoongCollector with a single click in the Alibaba Cloud Container Service for Kubernetes (ACK) console. By default, container logs from the cluster are collected into an SLS project in the same account and region. To collect logs across accounts or regions, see Install on a self-managed cluster (DaemonSet mode).
Install on an existing ACK managed cluster
Log on to the ACK console. In the navigation pane on the left, choose Clusters.
On the Clusters page, click the name of the target cluster. In the navigation pane on the left, click Add-ons.
On the Logs and Monitoring tab, find loongcollector, and click Install.
After the installation is complete, SLS automatically creates the following resources in the region where the ACK cluster is located. Log on to the Simple Log Service console to view them.
| Resource type | Resource name | Purpose |
| --- | --- | --- |
| Project | k8s-log-${cluster_id} | A resource management unit that isolates logs from different services. |
| Machine group | k8s-group-${cluster_id} | The machine group for loongcollector-ds, mainly used for log collection. |
| Machine group | k8s-group-${cluster_id}-cluster | The machine group for loongcollector-cluster, mainly used for metric collection. |
| Machine group | k8s-group-${cluster_id}-singleton | A single-instance machine group, mainly used for some single-instance collection configurations. |
| Logstore | config-operation-log | Stores logs from the alibaba-log-controller component in LoongCollector. It is billed in the same way as a standard logstore; for more information, see Billing items for the pay-by-data-written mode. Do not create collection configurations in this logstore. Important: Do not delete this logstore. |
Install when creating a new ACK managed cluster
Log on to the ACK console. In the navigation pane on the left, choose Clusters.
Click Create Kubernetes Cluster. In the Advanced Settings section, SLS is enabled by default. Click Modify Default Configuration to create a project or use an existing project.
This topic describes only the configurations related to SLS. For more information about other configuration items, see Create an ACK managed cluster.
When you select Create Project, SLS creates the following resources by default. Log on to the Simple Log Service console to view them.
| Resource type | Resource name | Purpose |
| --- | --- | --- |
| Project | k8s-log-${cluster_id} | A resource management unit that isolates logs from different services. |
| Machine group | k8s-group-${cluster_id} | The machine group for loongcollector-ds, mainly used for log collection. |
| Machine group | k8s-group-${cluster_id}-cluster | The machine group for loongcollector-cluster, mainly used for metric collection. |
| Machine group | k8s-group-${cluster_id}-singleton | A single-instance machine group, mainly used for some single-instance collection configurations. |
| Logstore | config-operation-log | Stores logs from the alibaba-log-controller component in LoongCollector. It is billed in the same way as a standard logstore; for more information, see Billing items for the pay-by-data-written mode. Do not create collection configurations in this logstore. Important: Do not delete this logstore. |
Install on a self-managed cluster (DaemonSet mode)
Use cases
Kubernetes clusters in self-managed data centers
Kubernetes clusters deployed on other cloud providers
Collecting container logs from Alibaba Cloud ACK clusters across different accounts or regions
Ensure your self-managed cluster runs Kubernetes 1.6 or later.
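A quick pre-flight check of the server version can be scripted. This bash sketch is our own illustration (the helper names are not part of the installation package); it compares a kubectl-reported version string against the 1.6 minimum:

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: confirm the cluster is at least v1.6.
# version_ge succeeds when $1 >= $2, using sort -V for numeric dot comparison.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

check_cluster_version() {
  # $1 example source: kubectl version -o json | jq -r .serverVersion.gitVersion
  local ver="$1"
  ver="${ver#v}"  # strip the leading "v"
  if version_ge "$ver" "1.6"; then
    echo "ok"
  else
    echo "too-old"
  fi
}
```

For example, `check_cluster_version v1.28.3` prints `ok`, while a v1.5.x cluster would print `too-old`.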
User guide
Download and decompress the installation package: On a machine where kubectl is installed and configured, run the command for your cluster's region to download LoongCollector and its dependent components.
#China regions
wget https://aliyun-observability-release-cn-shanghai.oss-cn-shanghai.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz
tar xvf loongcollector-custom-k8s-package.tgz
chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh

#Regions outside China
wget https://aliyun-observability-release-ap-southeast-1.oss-ap-southeast-1.aliyuncs.com/loongcollector/k8s-custom-pkg/3.1.6/loongcollector-custom-k8s-package.tgz
tar xvf loongcollector-custom-k8s-package.tgz
chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh

Modify the values.yaml configuration file: In the loongcollector-custom-k8s-package folder, modify the ./loongcollector/values.yaml configuration file.

# ===================== Required parameters =====================
# The name of the project for log collection in this cluster. Example: k8s-log-custom-sd89ehdq
projectName: ""
# The region of the project. Example for Shanghai: cn-shanghai
region: ""
# The ID of the Alibaba Cloud account that owns the project. Enclose the ID in quotation marks. Example: "123456789"
aliUid: ""
# The network to use. Options: Internet or Intranet. Default: Internet
net: Internet
# The AccessKey ID and AccessKey secret of the Alibaba Cloud account or RAM user.
accessKeyID: ""
accessKeySecret: ""
# A custom cluster ID. The ID can contain uppercase letters, lowercase letters, digits, and hyphens (-).
clusterID: ""
# ... Other optional parameters are omitted ...

projectName (String, Required)
The name of the project to which LoongCollector uploads logs. The naming conventions are as follows:

The project name can contain only lowercase letters, digits, and hyphens (-).

It must start with a lowercase letter and end with a lowercase letter or a digit.

The name must be 3 to 63 characters in length.

region (String, Required)
The ID of the region where the project is located. For more information, see Regions.

aliUid (String, Required)
The ID of the Alibaba Cloud account that owns the project.

net (String, Required)
The network type used to transmit log data.

Internet (default): the public network.

Intranet: the internal network.

accessKeyID (String, Required)
The AccessKey ID used to access the project. Use the AccessKey pair of a Resource Access Management (RAM) user and grant the AliyunLogFullAccess system policy to the RAM user. For more information about RAM, see Overview of RAM users.

accessKeySecret (String, Required)
The AccessKey secret that corresponds to the specified AccessKey ID.

clusterID (String, Required)
A custom ID for the cluster. The ID can contain only uppercase letters, lowercase letters, digits, and hyphens (-).

Important: Do not use the same cluster ID for different Kubernetes clusters.
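Before running the installer, the projectName value can be checked locally against the naming rules above. This bash sketch is our own illustration (the helper name `validate_project_name` is hypothetical; the SLS server remains the authoritative validator):

```shell
#!/usr/bin/env bash
# Hypothetical client-side check mirroring the three projectName rules:
# lowercase letters, digits, and hyphens only; starts with a lowercase
# letter; ends with a lowercase letter or digit; 3 to 63 characters total.
validate_project_name() {
  local name="$1"
  if [[ "$name" =~ ^[a-z][a-z0-9-]{1,61}[a-z0-9]$ ]]; then
    echo "valid"
  else
    echo "invalid"
  fi
}
```

For example, `validate_project_name k8s-log-custom-sd89ehdq` prints `valid`, while a name with uppercase letters or fewer than 3 characters prints `invalid`.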
Execute the installation script: In the loongcollector-custom-k8s-package folder, run the following command to install LoongCollector and its dependent components.

bash k8s-custom-install.sh install

Verify the installation: After the installation is complete, run the following command to check the component status:

# Check the pod status
kubectl get po -n kube-system | grep loongcollector-ds

Sample result:

loongcollector-ds-gnmnh    1/1    Running    0    63s

If a component fails to start (its status is not Running):

Check the configuration: Verify that the configuration items in values.yaml are correct.

Check the image: Run the following command and check the Events section of the output to confirm that the container image was pulled successfully.

kubectl describe pod loongcollector-ds -n kube-system
After the components are installed, SLS automatically creates the following resources. Log on to the Simple Log Service console to view them.
| Resource type | Resource name | Purpose |
| --- | --- | --- |
| Project | The value of projectName that you specified in the values.yaml file | A resource management unit that isolates logs of different services. |
| Machine group | k8s-group-${cluster_id} | A collection of log collection nodes. |
| Machine group | k8s-group-${cluster_id}-cluster | The machine group for loongcollector-cluster, mainly used for metric collection. |
| Machine group | k8s-group-${cluster_id}-singleton | A single-instance machine group, mainly used for single-instance collection configurations. |
| Logstore | config-operation-log | Stores logs of the alibaba-log-controller component in LoongCollector. The billing is the same as that for a standard logstore; for more information, see Billing items for the pay-by-data-written mode. Do not create collection configurations in this logstore. Important: Do not delete this logstore. |
Install using the sidecar pattern
Use the sidecar pattern for fine-grained log management, multi-tenant data isolation, or to bind log collection to the application lifecycle. This pattern injects a separate LoongCollector (Logtail) container into the application pod to enable dedicated log collection within that pod. If you have not deployed an application or want to test the process, use the Appendix: YAML example to quickly verify the flow.
1. Modify the application pod YAML configuration
Define shared volumes
In spec.template.spec.volumes, add three shared volumes at the same level as containers:

volumes:
  # Shared log directory (written by the application container, read by the sidecar)
  - name: ${shared_volume_name}  # <-- The name must match the name in volumeMounts
    emptyDir: {}
  # Signal directory for inter-container communication (for graceful shutdown)
  - name: tasksite
    emptyDir:
      medium: Memory  # Use memory as the medium for better performance
      sizeLimit: "50Mi"
  # Shared host timezone configuration: synchronizes the timezone for all containers in the pod
  - name: tz-config  # <-- The name must match the name in volumeMounts
    hostPath:
      path: /usr/share/zoneinfo/Asia/Shanghai  # Modify the timezone as needed

Configure application container mounts
In the volumeMounts section of your application container, such as your-business-app-container, add the following mount items. Ensure that the application container writes logs to the ${shared_volume_path} directory so that LoongCollector can collect them.

volumeMounts:
  # Mount the shared log volume to the application log output directory
  - name: ${shared_volume_name}
    mountPath: ${shared_volume_path}  # Example: /var/log/app
  # Mount the communication directory
  - name: tasksite
    mountPath: /tasksite  # Shared directory for communication with the LoongCollector container
  # Mount the timezone file
  - name: tz-config
    mountPath: /etc/localtime
    readOnly: true

Inject the LoongCollector Sidecar container
In the spec.template.spec.containers array, append the following sidecar container definition:

- name: loongcollector
  image: aliyun-observability-release-registry.cn-shenzhen.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
  command: ["/bin/bash", "-c"]
  args:
    - |
      echo "[$(date)] LoongCollector: Starting initialization"
      # Start the LoongCollector service
      /etc/init.d/loongcollectord start
      # Wait for the configuration to download and the service to be ready
      sleep 15
      # Verify the service status
      if /etc/init.d/loongcollectord status; then
        echo "[$(date)] LoongCollector: Service started successfully"
        touch /tasksite/cornerstone
      else
        echo "[$(date)] LoongCollector: Failed to start service"
        exit 1
      fi
      # Wait for the application container to complete (via the tombstone file signal)
      echo "[$(date)] LoongCollector: Waiting for business container to complete"
      until [[ -f /tasksite/tombstone ]]; do
        sleep 2
      done
      # Allow time to upload remaining logs
      echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
      sleep 30
      # Stop the service
      echo "[$(date)] LoongCollector: Stopping service"
      /etc/init.d/loongcollectord stop
      echo "[$(date)] LoongCollector: Shutdown complete"
  # Health check
  livenessProbe:
    exec:
      command: ["/etc/init.d/loongcollectord", "status"]
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 3
  # Resource configuration
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "2000m"
      memory: "2048Mi"
  # Environment variable configuration
  env:
    - name: ALIYUN_LOGTAIL_USER_ID
      value: "${your_aliyun_user_id}"
    - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
      value: "${your_machine_group_user_defined_id}"
    - name: ALIYUN_LOGTAIL_CONFIG
      value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
    # Enable full drain mode to ensure all logs are sent before the pod terminates
    - name: enable_full_drain_mode
      value: "true"
    # Append pod environment information as log tags
    - name: ALIYUN_LOG_ENV_TAGS
      value: "_pod_name_|_pod_ip_|_namespace_|_node_name_|_node_ip_"
    # Automatically inject pod and node metadata as log tags
    - name: "_pod_name_"
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: "_pod_ip_"
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: "_namespace_"
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: "_node_name_"
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: "_node_ip_"
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
  # Volume mounts (shared with the application container)
  volumeMounts:
    # Read-only mount for the application log directory
    - name: ${shared_volume_name}  # <-- Shared log directory name
      mountPath: ${dir_containing_your_files}  # <-- Path to the shared directory in the sidecar
      readOnly: true
    # Mount the communication directory
    - name: tasksite
      mountPath: /tasksite
    # Mount the timezone
    - name: tz-config
      mountPath: /etc/localtime
      readOnly: true
2. Modify the application container's lifecycle logic
Depending on the workload type, you must modify the application container to support a coordinated exit with the Sidecar.
Short-lived tasks (Job/CronJob)
# 1. Wait for LoongCollector to be ready
echo "[$(date)] Business: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
sleep 1
done
echo "[$(date)] Business: LoongCollector is ready, starting business logic"
# 2. Execute core business logic (ensure logs are written to the shared directory)
echo "Hello, World!" >> /app/logs/business.log
# 3. Save the exit code
retcode=$?
echo "[$(date)] Business: Task completed with exit code: $retcode"
# 4. Notify LoongCollector that the business task is complete
touch /tasksite/tombstone
echo "[$(date)] Business: Tombstone created, exiting"
exit $retcode

Long-lived services (Deployment/StatefulSet)
# Define the signal handler function
_term_handler() {
echo "[$(date)] [nginx-demo] Caught SIGTERM, starting graceful shutdown..."
# Send a QUIT signal to Nginx for a graceful stop
if [ -n "$NGINX_PID" ]; then
kill -QUIT "$NGINX_PID" 2>/dev/null || true
echo "[$(date)] [nginx-demo] Sent SIGQUIT to Nginx PID: $NGINX_PID"
# Wait for Nginx to stop gracefully
wait "$NGINX_PID"
EXIT_CODE=$?
echo "[$(date)] [nginx-demo] Nginx stopped with exit code: $EXIT_CODE"
fi
# Notify LoongCollector that the application container has stopped
echo "[$(date)] [nginx-demo] Writing tombstone file"
touch /tasksite/tombstone
exit $EXIT_CODE
}
# Register the signal handler
trap _term_handler SIGTERM SIGINT SIGQUIT
# Wait for LoongCollector to be ready
echo "[$(date)] [nginx-demo]: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
sleep 1
done
echo "[$(date)] [nginx-demo]: LoongCollector is ready, starting business logic"
# Start Nginx
echo "[$(date)] [nginx-demo] Starting Nginx..."
nginx -g 'daemon off;' &
NGINX_PID=$!
echo "[$(date)] [nginx-demo] Nginx started with PID: $NGINX_PID"
# Wait for the Nginx process
wait $NGINX_PID
EXIT_CODE=$?
# Also notify LoongCollector if the exit was not caused by a signal
if [ ! -f /tasksite/tombstone ]; then
echo "[$(date)] [nginx-demo] Unexpected exit, writing tombstone"
touch /tasksite/tombstone
fi
exit $EXIT_CODE

3. Set the graceful termination period
In spec.template.spec, set a sufficient termination grace period to ensure LoongCollector has enough time to upload the remaining logs.
spec:
# ... Your other existing spec configurations ...
template:
spec:
terminationGracePeriodSeconds: 600 # 10-minute graceful shutdown period

4. Variable descriptions
| Variable | Description |
| --- | --- |
| ${your_aliyun_user_id} | Set this to the ID of your Alibaba Cloud account. For more information, see Configure user identifiers. |
| ${your_machine_group_user_defined_id} | Set a custom ID for the machine group. This ID is used to create a custom machine group. Important: Ensure that this ID is unique within the region of your project. |
| ${your_region_config} | Specify the configuration based on the region of your SLS project and the network type used for access. For information about regions, see Service regions. For example, if your project is in the China (Hangzhou) region, use cn-hangzhou for internal network access or cn-hangzhou-internet for public network access. |
| ${shared_volume_name} | Set a custom name for the volume. Important: The volume name must match the name used in the volumeMounts sections of both the application container and the sidecar container. |
| ${shared_volume_path} | Set the mount path. This is the directory in the container where the text logs to be collected are located. |
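The placeholders above must be replaced with real values before the YAML is applied. A minimal bash sketch using sed (the function name and all substituted values are hypothetical examples, not defaults):

```shell
#!/usr/bin/env bash
# Hypothetical substitution step: fill the sidecar template's placeholders.
# Replace the example values with your own account ID, machine group
# identifier, and region configuration.
fill_sidecar_template() {
  local template="$1" output="$2"
  sed -e "s|\${your_aliyun_user_id}|1234567890|g" \
      -e "s|\${your_machine_group_user_defined_id}|nginx-log-sidecar|g" \
      -e "s|\${your_region_config}|cn-hangzhou-internet|g" \
      "$template" > "$output"
}
```

Usage: `fill_sidecar_template sidecar-template.yaml sidecar.yaml`, then apply the output file with kubectl.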
5. Apply the configuration and verify the result
Run the following command to deploy the changes:

kubectl apply -f <YOUR-YAML>

Check the pod status to confirm that the LoongCollector container was injected successfully:

kubectl describe pod <YOUR-POD-NAME>

If both containers (the application container and loongcollector) are in the Running state, the injection is successful.
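The check can also be done non-interactively. This bash sketch is our own illustration: the helper inspects a pod's container names, which can be obtained with the standard kubectl jsonpath expression shown in the comment:

```shell
#!/usr/bin/env bash
# Hypothetical verification helper: given a pod's container names
# (e.g. from: kubectl get pod <YOUR-POD-NAME> -o jsonpath='{.spec.containers[*].name}')
# confirm that the loongcollector sidecar exists alongside the application.
sidecar_injected() {
  local names="$1" n
  for n in $names; do
    if [ "$n" = "loongcollector" ]; then
      echo "injected"
      return 0
    fi
  done
  echo "missing"
}
```

Usage: `sidecar_injected "$(kubectl get pod <YOUR-POD-NAME> -o jsonpath='{.spec.containers[*].name}')"` prints `injected` when the sidecar is present.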
6. Create a machine group with a custom identifier
Log on to the Simple Log Service console and click the target project.
In the navigation pane on the left, choose Resources > Machine Groups. Next to Machine Groups, click Create Machine Group.
In the Create Machine Group dialog box, configure the following parameters and click OK.
Name: The name of the machine group. It cannot be modified after creation. The naming conventions are as follows:
Can contain only lowercase letters, digits, hyphens (-), and underscores (_).
Must start and end with a lowercase letter or a digit.
Must be 2 to 128 characters in length.
Machine Group Identifier: Select Custom Identifier.
Custom Identifier: Enter the value of the ALIYUN_LOGTAIL_USER_DEFINED_ID environment variable that you set for the LoongCollector container in 1. Modify the application pod YAML configuration. The value must be an exact match. Otherwise, the association fails.
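The machine group naming rules above can be checked locally before opening the dialog box. This bash sketch is our own illustration (the helper name is hypothetical; the console performs the authoritative validation):

```shell
#!/usr/bin/env bash
# Hypothetical client-side check mirroring the machine group naming rules:
# lowercase letters, digits, hyphens, and underscores only; starts and ends
# with a lowercase letter or digit; 2 to 128 characters total.
validate_machine_group_name() {
  local name="$1"
  if [[ "$name" =~ ^[a-z0-9][a-z0-9_-]{0,126}[a-z0-9]$ ]]; then
    echo "valid"
  else
    echo "invalid"
  fi
}
```

For example, `validate_machine_group_name k8s-group-abc123` prints `valid`, while a name starting with an underscore prints `invalid`.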
Check the heartbeat status of the machine group: After the machine group is created, click its name and view the heartbeat status in the status area.
OK: Indicates that LoongCollector has successfully connected to SLS and the machine group is registered.
FAIL:
The configuration may not have taken effect yet. It takes about 2 minutes for the configuration to become effective. Refresh the page and try again later.
If the status is still FAIL after 2 minutes, see Troubleshoot Logtail machine group issues to diagnose the issue.
Each pod corresponds to a separate LoongCollector instance. Use different custom identifiers for different applications or environments to facilitate fine-grained management.
FAQ
How do I modify the LoongCollector configuration for an ACK managed cluster to collect logs across accounts or regions?
How do I collect container logs from Alibaba Cloud ACK Edge, ACK One, ACS, and ACK Serverless clusters?
What to do next
After you install LoongCollector, see Collect container logs from a Kubernetes cluster, which describes core principles, key processes, selection recommendations, and best practices. Then, create a collection configuration.
Appendix: YAML examples
This example provides a complete Kubernetes deployment configuration that includes an NGINX application container and a LoongCollector sidecar container. This configuration is suitable for collecting container logs using the sidecar pattern.
Before you start, make the following three replacements:
Replace ${your_aliyun_user_id} with your Alibaba Cloud account UID.

Replace ${your_machine_group_user_defined_id} with the custom ID of the machine group that you created in 6. Create a machine group with a custom identifier. The ID must be an exact match.

Replace ${your_region_config} with the configuration name that matches the region and network type of your SLS project. For example, for a project in the China (Hangzhou) region, use cn-hangzhou for internal network access or cn-hangzhou-internet for public network access.
Short-lived tasks (Job/CronJob)
apiVersion: batch/v1
kind: Job
metadata:
name: demo-job
spec:
backoffLimit: 3
activeDeadlineSeconds: 3600
completions: 1
parallelism: 1
template:
spec:
restartPolicy: Never
terminationGracePeriodSeconds: 300
containers:
# Application container
- name: demo-job
image: debian:bookworm-slim
command: ["/bin/bash", "-c"]
args:
- |
# Wait for LoongCollector to be ready
echo "[$(date)] Business: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
sleep 1
done
echo "[$(date)] Business: LoongCollector is ready, starting business logic"
# Execute business logic
echo "Hello, World!" >> /app/logs/business.log
# Save the exit code
retcode=$?
echo "[$(date)] Business: Task completed with exit code: $retcode"
# Notify LoongCollector that the business task is complete
touch /tasksite/tombstone
echo "[$(date)] Business: Tombstone created, exiting"
exit $retcode
# Resource limits
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
            cpu: "500m"
memory: "512Mi"
# Volume mounts
volumeMounts:
- name: app-logs
mountPath: /app/logs
- name: tasksite
mountPath: /tasksite
# LoongCollector Sidecar container
- name: loongcollector
image: aliyun-observability-release-registry.cn-hongkong.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
command: ["/bin/bash", "-c"]
args:
- |
echo "[$(date)] LoongCollector: Starting initialization"
# Start the LoongCollector service
/etc/init.d/loongcollectord start
# Wait for the configuration to download and the service to be ready
sleep 15
# Verify the service status
if /etc/init.d/loongcollectord status; then
echo "[$(date)] LoongCollector: Service started successfully"
touch /tasksite/cornerstone
else
echo "[$(date)] LoongCollector: Failed to start service"
exit 1
fi
# Wait for the application container to complete
echo "[$(date)] LoongCollector: Waiting for business container to complete"
until [[ -f /tasksite/tombstone ]]; do
sleep 2
done
echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
# Allow enough time to transmit remaining logs
sleep 30
echo "[$(date)] LoongCollector: Stopping service"
/etc/init.d/loongcollectord stop
echo "[$(date)] LoongCollector: Shutdown complete"
# Health check
livenessProbe:
exec:
command: ["/etc/init.d/loongcollectord", "status"]
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
# Resource configuration
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "512Mi"
# Environment variable configuration
env:
- name: ALIYUN_LOGTAIL_USER_ID
value: "your-user-id"
- name: ALIYUN_LOGTAIL_USER_DEFINED_ID
value: "your-user-defined-id"
- name: ALIYUN_LOGTAIL_CONFIG
value: "/etc/ilogtail/conf/cn-hongkong/ilogtail_config.json"
- name: ALIYUN_LOG_ENV_TAGS
value: "_pod_name_|_pod_ip_|_namespace_|_node_name_"
# Pod information injection
- name: "_pod_name_"
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: "_pod_ip_"
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: "_namespace_"
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: "_node_name_"
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# Volume mounts
volumeMounts:
- name: app-logs
mountPath: /app/logs
readOnly: true
- name: tasksite
mountPath: /tasksite
- name: tz-config
mountPath: /etc/localtime
readOnly: true
# Volume definitions
volumes:
- name: app-logs
emptyDir: {}
- name: tasksite
emptyDir:
medium: Memory
sizeLimit: "10Mi"
- name: tz-config
hostPath:
path: /usr/share/zoneinfo/Asia/Shanghai
Long-lived services (Deployment/StatefulSet)
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-demo
namespace: production
labels:
app: nginx-demo
version: v1.0.0
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
selector:
matchLabels:
app: nginx-demo
template:
metadata:
labels:
app: nginx-demo
version: v1.0.0
spec:
terminationGracePeriodSeconds: 600 # 10-minute graceful shutdown period
containers:
# Application container - Web application
- name: nginx-demo
image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
# Startup command and signal handling
command: ["/bin/sh", "-c"]
args:
- |
# Define the signal handler function
_term_handler() {
echo "[$(date)] [nginx-demo] Caught SIGTERM, starting graceful shutdown..."
# Send a QUIT signal to Nginx for a graceful stop
if [ -n "$NGINX_PID" ]; then
kill -QUIT "$NGINX_PID" 2>/dev/null || true
echo "[$(date)] [nginx-demo] Sent SIGQUIT to Nginx PID: $NGINX_PID"
# Wait for Nginx to stop gracefully
wait "$NGINX_PID"
EXIT_CODE=$?
echo "[$(date)] [nginx-demo] Nginx stopped with exit code: $EXIT_CODE"
fi
# Notify LoongCollector that the application container has stopped
echo "[$(date)] [nginx-demo] Writing tombstone file"
touch /tasksite/tombstone
exit $EXIT_CODE
}
# Register the signal handler
trap _term_handler SIGTERM SIGINT SIGQUIT
# Wait for LoongCollector to be ready
echo "[$(date)] [nginx-demo]: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
sleep 1
done
echo "[$(date)] [nginx-demo]: LoongCollector is ready, starting business logic"
# Start Nginx
echo "[$(date)] [nginx-demo] Starting Nginx..."
nginx -g 'daemon off;' &
NGINX_PID=$!
echo "[$(date)] [nginx-demo] Nginx started with PID: $NGINX_PID"
# Wait for the Nginx process
wait $NGINX_PID
EXIT_CODE=$?
# Also notify LoongCollector if the exit was not caused by a signal
if [ ! -f /tasksite/tombstone ]; then
echo "[$(date)] [nginx-demo] Unexpected exit, writing tombstone"
touch /tasksite/tombstone
fi
exit $EXIT_CODE
# Resource configuration
resources:
requests:
cpu: "200m"
memory: "256Mi"
limits:
cpu: "1000m"
memory: "1Gi"
# Volume mounts
volumeMounts:
- name: nginx-logs
mountPath: /var/log/nginx
- name: tasksite
mountPath: /tasksite
- name: tz-config
mountPath: /etc/localtime
readOnly: true
# LoongCollector Sidecar container
- name: loongcollector
image: aliyun-observability-release-registry.cn-shenzhen.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
command: ["/bin/bash", "-c"]
args:
- |
echo "[$(date)] LoongCollector: Starting initialization"
# Start the LoongCollector service
/etc/init.d/loongcollectord start
# Wait for the configuration to download and the service to be ready
sleep 15
# Verify the service status
if /etc/init.d/loongcollectord status; then
echo "[$(date)] LoongCollector: Service started successfully"
touch /tasksite/cornerstone
else
echo "[$(date)] LoongCollector: Failed to start service"
exit 1
fi
# Wait for the application container to complete
echo "[$(date)] LoongCollector: Waiting for business container to complete"
until [[ -f /tasksite/tombstone ]]; do
sleep 2
done
echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
# Allow enough time to transmit remaining logs
sleep 30
echo "[$(date)] LoongCollector: Stopping service"
/etc/init.d/loongcollectord stop
echo "[$(date)] LoongCollector: Shutdown complete"
# Health check
livenessProbe:
exec:
command: ["/etc/init.d/loongcollectord", "status"]
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
# Resource configuration
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "2000m"
memory: "2048Mi"
# Environment variable configuration
env:
- name: ALIYUN_LOGTAIL_USER_ID
value: "${your_aliyun_user_id}"
- name: ALIYUN_LOGTAIL_USER_DEFINED_ID
value: "${your_machine_group_user_defined_id}"
- name: ALIYUN_LOGTAIL_CONFIG
value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
# Enable full drain mode to ensure all logs are sent when the pod stops
- name: enable_full_drain_mode
value: "true"
# Append pod environment information as log tags
- name: "ALIYUN_LOG_ENV_TAGS"
value: "_pod_name_|_pod_ip_|_namespace_|_node_name_|_node_ip_"
# Get pod and node information
- name: "_pod_name_"
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: "_pod_ip_"
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: "_namespace_"
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: "_node_name_"
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: "_node_ip_"
valueFrom:
fieldRef:
fieldPath: status.hostIP
# Volume mounts
volumeMounts:
- name: nginx-logs
mountPath: /var/log/nginx
readOnly: true
- name: tasksite
mountPath: /tasksite
- name: tz-config
mountPath: /etc/localtime
readOnly: true
# Volume definitions
volumes:
- name: nginx-logs
emptyDir: {}
- name: tasksite
emptyDir:
medium: Memory
sizeLimit: "50Mi"
- name: tz-config
hostPath:
path: /usr/share/zoneinfo/Asia/Shanghai