LoongCollector is the next-generation log collection agent for Alibaba Cloud Simple Log Service (SLS) and is an upgraded version of Logtail. This topic describes how to install LoongCollector in a Kubernetes cluster. You can install it in either DaemonSet mode or Sidecar mode.
Preparations
Before you install LoongCollector, verify the network connection between your cluster nodes and the SLS endpoint. This ensures that LoongCollector can report data correctly.
Obtain the service endpoint:
Log on to the Simple Log Service console. In the project list, click the destination project.
Click the icon next to the project name to go to the project overview page. In the Endpoint section, locate the public and private endpoints for the project's region.
Test the connection: Log on to the cluster node where you plan to install the LoongCollector component and run the following curl command. Replace ${ProjectName} and ${SLS_ENDPOINT} with your actual information.

curl https://${ProjectName}.${SLS_ENDPOINT}

Check the result:
If the command returns {"Error":{"Code":"OLSInvalidMethod",...}}, the network connection between your node and SLS is working.
Note: This test verifies only network-layer connectivity. SLS returns an error because the request is missing required API parameters. This behavior is expected.
If the command times out or returns another network-layer error, such as Connection refused, the network connection failed. Check the node's network configuration, security group rules, or DNS resolution.
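The check above can also be scripted. The following is a minimal sketch, assuming placeholder values for the project name and endpoint; it treats any SLS JSON error body as proof that the network path works.

```shell
# Classify a curl response body: an SLS JSON error (e.g. OLSInvalidMethod)
# still proves the network path works.
check_sls_response() {
  case "$1" in
    *'"Error"'*) echo "OK" ;;
    *)           echo "FAIL" ;;
  esac
}

# Placeholders: replace with your actual project name and endpoint.
PROJECT_NAME="my-project"
SLS_ENDPOINT="cn-hangzhou.log.aliyuncs.com"
response=$(curl -s --max-time 10 "https://${PROJECT_NAME}.${SLS_ENDPOINT}" || true)
echo "Connectivity: $(check_sls_response "$response")"
```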
Choose an installation method
Choose an installation method from the following table based on your cluster type and requirements.
Installation method | Use cases |
Install on an ACK cluster (DaemonSet mode) | Collect logs from ACK managed and dedicated clusters within the same Alibaba Cloud account and region. |
Install on a self-managed cluster (DaemonSet mode) | Collect logs from self-managed clusters, clusters on other cloud providers, or ACK clusters across accounts or regions. |
Install using the sidecar pattern | Collect logs from specific applications that require fine-grained log management, multi-tenant data isolation, or collection bound to the application lifecycle. |
Install on an ACK cluster (DaemonSet mode)
If you are using logtail-ds and want to upgrade to LoongCollector, you must uninstall logtail-ds before you install LoongCollector.
Install LoongCollector with a single click in the Alibaba Cloud Container Service for Kubernetes (ACK) console. By default, container logs from the cluster are collected into an SLS project in the same account and region. To collect logs across accounts or regions, see Install on a self-managed cluster (DaemonSet mode).
Install on an existing ACK managed cluster
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, find the one you want to manage and click its name. In the left navigation pane, click Add-ons.
On the Logs and Monitoring tab, find loongcollector, and click Install.
After the installation is complete, SLS automatically creates the following resources in the region where the ACK cluster is located. Log on to the Simple Log Service console to view them.
Resource type | Resource name | Purpose |
Project | k8s-log-${cluster_id} | A resource management unit that isolates logs from different services. |
Machine group | k8s-group-${cluster_id} | The machine group for loongcollector-ds. It is mainly used for log collection. |
Machine group | k8s-group-${cluster_id}-cluster | The machine group for loongcollector-cluster. It is mainly used for metric collection. |
Machine group | k8s-group-${cluster_id}-singleton | A single-instance machine group. It is mainly used for some single-instance collection configurations. |
Logstore | config-operation-log (Important: Do not delete this logstore.) | Stores logs from the alibaba-log-controller in the LoongCollector component. It is billed in the same way as a standard logstore. For more information, see Billing items for the pay-by-data-written mode. Do not create collection configurations in this logstore. |
Install when creating a new ACK managed cluster
Log on to the ACK console. In the left navigation pane, click Clusters.
Click Create Kubernetes Cluster. In the Advanced Settings section, SLS is enabled by default. Click Modify Default Configuration to create a project or use an existing project.
This topic describes only the configurations related to SLS. For more information about other configuration items, see Create an ACK managed cluster.
When you select Create Project, SLS creates the following resources by default. Log on to the Simple Log Service console to view them.
Resource type | Resource name | Purpose |
Project | k8s-log-${cluster_id} | A resource management unit that isolates logs from different services. |
Machine group | k8s-group-${cluster_id} | The machine group for loongcollector-ds. It is mainly used for log collection. |
Machine group | k8s-group-${cluster_id}-cluster | The machine group for loongcollector-cluster. It is mainly used for metric collection. |
Machine group | k8s-group-${cluster_id}-singleton | A single-instance machine group. It is mainly used for some single-instance collection configurations. |
Logstore | config-operation-log (Important: Do not delete this logstore.) | Stores logs from the alibaba-log-controller in the LoongCollector component. It is billed in the same way as a standard logstore. For more information, see Billing items for the pay-by-data-written mode. Do not create collection configurations in this logstore. |
Install on a self-managed cluster (DaemonSet mode)
Use cases
Kubernetes clusters in self-managed data centers
Kubernetes clusters deployed on other cloud providers
Collecting container logs from Alibaba Cloud ACK clusters across different accounts or regions
Ensure your self-managed cluster runs Kubernetes 1.6 or later.
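As a quick pre-flight check, the version floor can be tested in shell. This is an illustrative sketch; the version_ok helper is our own, and the commented-out line assumes kubectl and jq are available and that `kubectl version -o json` reports clean numeric major/minor fields.

```shell
# Minimal check that a "major.minor" version string meets the 1.6 floor.
version_ok() {
  local major="${1%%.*}" minor="${1#*.}"
  minor="${minor%%.*}"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 6 ]; }
}

# Against a live cluster (requires kubectl and jq):
# v=$(kubectl version -o json | jq -r '.serverVersion | "\(.major).\(.minor)"')
# version_ok "$v" && echo "Kubernetes version OK" || echo "Upgrade required"
```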
User guide
Download and decompress the installation package: On a machine where kubectl is installed and configured, run the command for your cluster's region to download LoongCollector and its dependent components.

# China regions
wget https://aliyun-observability-release-cn-shanghai.oss-cn-shanghai.aliyuncs.com/loongcollector/k8s-custom-pkg/3.0.12/loongcollector-custom-k8s-package.tgz
tar xvf loongcollector-custom-k8s-package.tgz
chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh

# Regions outside China
wget https://aliyun-observability-release-ap-southeast-1.oss-ap-southeast-1.aliyuncs.com/loongcollector/k8s-custom-pkg/3.1.6/loongcollector-custom-k8s-package.tgz
tar xvf loongcollector-custom-k8s-package.tgz
chmod 744 ./loongcollector-custom-k8s-package/k8s-custom-install.sh

Modify the values.yaml configuration file: In the loongcollector-custom-k8s-package folder, modify the ./loongcollector/values.yaml configuration file.

values.yaml:

# ===================== Required parameters =====================
# The name of the project for log collection in this cluster. Example: k8s-log-custom-sd89ehdq
projectName: ""
# The region of the project. Example for Shanghai: cn-shanghai
region: ""
# The ID of the Alibaba Cloud account that owns the project. Enclose the ID in quotation marks. Example: "123456789"
aliUid: ""
# The network to use. Options: Internet or Intranet. Default: Internet
net: Internet
# The AccessKey ID and AccessKey secret of the Alibaba Cloud account or RAM user.
accessKeyID: ""
accessKeySecret: ""
# A custom cluster ID. The ID can contain uppercase letters, lowercase letters, digits, and hyphens (-).
clusterID: ""
# ... Other optional parameters are omitted ...

Parameter description:

projectName
(String, Required) The name of the project to which LoongCollector uploads logs. The naming conventions are as follows:
The project name can contain only lowercase letters, digits, and hyphens (-).
It must start with a lowercase letter and end with a lowercase letter or a digit.
The name must be 3 to 63 characters in length.
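For illustration, these naming rules reduce to a single pattern check (valid_project_name is our own helper, not part of the installer):

```shell
# Validate a project name: lowercase letters, digits, hyphens; starts with a
# lowercase letter; ends with a lowercase letter or digit; 3-63 characters.
valid_project_name() {
  printf '%s' "$1" | grep -Eq '^[a-z][a-z0-9-]{1,61}[a-z0-9]$'
}

valid_project_name "k8s-log-custom-sd89ehdq" && echo "valid"
valid_project_name "1-bad-name" || echo "invalid"
```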
region
(String, Required) The ID of the region where the project is located. For more information, see Regions.

aliUid
(String, Required) The ID of the Alibaba Cloud account that owns the project.

net
(String, Required) The network type used to transmit log data.
Internet (default): The public network.
Intranet: The internal network.

accessKeyID
(String, Required) The AccessKey ID used to access the project. Use the AccessKey of a Resource Access Management (RAM) user and grant the AliyunLogFullAccess system policy to the RAM user. For more information about RAM, see Overview of RAM users.

accessKeySecret
(String, Required) The AccessKey secret that corresponds to the specified AccessKey ID.

clusterID
(String, Required) A custom ID for the cluster. The ID can contain only uppercase letters, lowercase letters, digits, and hyphens (-).
Important: Do not use the same cluster ID for different Kubernetes clusters.
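Because every cluster needs a distinct ID, one simple approach is to append random bytes to a readable prefix. This is a sketch; the my-cluster- prefix and the gen_cluster_id helper are our own, not part of the installation package.

```shell
# Generate a rule-compliant cluster ID (letters, digits, and hyphens only)
# with a random hex suffix so different clusters do not collide.
gen_cluster_id() {
  local suffix
  suffix=$(head -c 4 /dev/urandom | od -An -tx1 | tr -d ' \n')
  echo "my-cluster-${suffix}"
}

gen_cluster_id
```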
Execute the installation script: In the loongcollector-custom-k8s-package folder, run the following command to install LoongCollector and its dependent components.

bash k8s-custom-install.sh install

Verify the installation: After the installation is complete, run the following command to check the component status:

# Check the pod status
kubectl get po -n kube-system | grep loongcollector-ds

Sample result:

loongcollector-ds-gnmnh    1/1    Running    0    63s

If a component fails to start (its status is not Running):
Check the configuration: Verify that the configuration items in values.yaml are correct.
Check the image: Run the following command and check the Events section of the output to confirm that the container image was pulled successfully.

kubectl describe pod loongcollector-ds -n kube-system
After the components are installed, SLS automatically creates the following resources. Log on to the Simple Log Service console to view them.
Resource type | Resource name | Purpose |
Project | The value of projectName that you specified in the values.yaml file | A resource management unit that isolates logs of different services. |
Machine group | k8s-group-${cluster_id} | A collection of log collection nodes. |
Machine group | k8s-group-${cluster_id}-cluster | The machine group for loongcollector-cluster, mainly used for metric collection. |
Machine group | k8s-group-${cluster_id}-singleton | A single-instance machine group, mainly used for single-instance collection configurations. |
Logstore | config-operation-log (Important: Do not delete this logstore.) | Stores logs of the alibaba-log-controller component in LoongCollector. The billing is the same as that for a standard logstore. For more information, see Billing items for the pay-by-data-written mode. Do not create collection configurations in this logstore. |
Install using the sidecar pattern
Use the sidecar pattern for fine-grained log management, multi-tenant data isolation, or to bind log collection to the application lifecycle. This pattern injects a separate LoongCollector (Logtail) container into the application pod to enable dedicated log collection within that pod. If you have not deployed an application or want to test the process, use the Appendix: YAML examples to quickly verify the flow.
1. Modify the application pod YAML configuration
Define the shared volumes

In spec.template.spec.volumes, add three shared volumes at the same level as the containers section:

volumes:
  # Shared log directory (written by the application container, read by the Sidecar)
  - name: ${shared_volume_name}  # <-- The name must be the same as the name in volumeMounts.
    emptyDir: {}
  # Signaling directory for inter-container communication (for graceful start and stop)
  - name: tasksite
    emptyDir:
      medium: Memory  # Use memory as the medium for better performance.
      sizeLimit: "50Mi"
  # Shared host timezone configuration: Synchronize the timezones of all containers in the pod.
  - name: tz-config  # <-- The name must be the same as the name in volumeMounts.
    hostPath:
      path: /usr/share/zoneinfo/Asia/Shanghai  # Modify the timezone as needed.

Configure the volume mounts for the application container

In the volumeMounts section of your application container, such as your-business-app-container, add the following mount configurations. Ensure that the application container writes logs to the ${shared_volume_path} directory so that LoongCollector can collect them.

volumeMounts:
  # Mount the shared log volume to the application log output directory.
  - name: ${shared_volume_name}
    mountPath: ${shared_volume_path}  # Example: /var/log/app
  # Mount the communication directory.
  - name: tasksite
    mountPath: /tasksite  # Shared directory for communication with the LoongCollector container.
  # Mount the timezone file.
  - name: tz-config
    mountPath: /etc/localtime
    readOnly: true

Inject the LoongCollector Sidecar container

In the spec.template.spec.containers array, add the following Sidecar container definition:

- name: loongcollector
  image: aliyun-observability-release-registry.cn-shenzhen.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
  command: ["/bin/bash", "-c"]
  args:
    - |
      echo "[$(date)] LoongCollector: Starting initialization"
      # Start the LoongCollector service.
      /etc/init.d/loongcollectord start
      # Wait for the configuration to be downloaded and the service to be ready.
      sleep 15
      # Verify the service status.
      if /etc/init.d/loongcollectord status; then
        echo "[$(date)] LoongCollector: Service started successfully"
        touch /tasksite/cornerstone
      else
        echo "[$(date)] LoongCollector: Failed to start service"
        exit 1
      fi
      # Wait for the application container to complete (via the tombstone file signal).
      echo "[$(date)] LoongCollector: Waiting for business container to complete"
      until [[ -f /tasksite/tombstone ]]; do
        sleep 2
      done
      # Allow time to upload remaining logs.
      echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
      sleep 30
      # Stop the service.
      echo "[$(date)] LoongCollector: Stopping service"
      /etc/init.d/loongcollectord stop
      echo "[$(date)] LoongCollector: Shutdown complete"
  # Health check
  livenessProbe:
    exec:
      command: ["/etc/init.d/loongcollectord", "status"]
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 3
  # Resource configuration
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "2000m"
      memory: "2048Mi"
  # Environment variable configuration
  env:
    - name: ALIYUN_LOGTAIL_USER_ID
      value: "${your_aliyun_user_id}"
    - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
      value: "${your_machine_group_user_defined_id}"
    - name: ALIYUN_LOGTAIL_CONFIG
      value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
    # Enable full drain mode to ensure all logs are sent before the pod terminates.
    - name: enable_full_drain_mode
      value: "true"
    # Append pod environment information as log tags.
    - name: ALIYUN_LOG_ENV_TAGS
      value: "_pod_name_|_pod_ip_|_namespace_|_node_name_|_node_ip_"
    # Automatically inject pod and node metadata as log tags.
    - name: "_pod_name_"
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: "_pod_ip_"
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: "_namespace_"
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: "_node_name_"
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: "_node_ip_"
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
  # Volume mounts (shared with the application container)
  volumeMounts:
    # Read-only mount of the application log directory.
    - name: ${shared_volume_name}  # <-- Shared log directory name
      mountPath: ${dir_containing_your_files}  # <-- Path of the shared directory in the sidecar
      readOnly: true
    # Mount the communication directory.
    - name: tasksite
      mountPath: /tasksite
    # Mount the timezone.
    - name: tz-config
      mountPath: /etc/localtime
      readOnly: true
2. Modify the application container's lifecycle logic
Depending on the workload type, you must modify the application container to support a coordinated exit with the Sidecar.
Short-lived tasks (Job/CronJob)
# 1. Wait for LoongCollector to be ready.
echo "[$(date)] Business: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
sleep 1
done
echo "[$(date)] Business: LoongCollector is ready, starting business logic"
# 2. Execute the core business logic (ensure logs are written to the shared directory).
echo "Hello, World!" >> /app/logs/business.log
# 3. Save the exit code.
retcode=$?
echo "[$(date)] Business: Task completed with exit code: $retcode"
# 4. Notify LoongCollector that the business task is complete.
touch /tasksite/tombstone
echo "[$(date)] Business: Tombstone created, exiting"
exit $retcode

Long-lived services (Deployment/StatefulSet)
# Define the signal handler function.
_term_handler() {
echo "[$(date)] [nginx-demo] Caught SIGTERM, starting graceful shutdown..."
# Send a QUIT signal to Nginx for a graceful stop.
if [ -n "$NGINX_PID" ]; then
kill -QUIT "$NGINX_PID" 2>/dev/null || true
echo "[$(date)] [nginx-demo] Sent SIGQUIT to Nginx PID: $NGINX_PID"
# Wait for Nginx to stop gracefully.
wait "$NGINX_PID"
EXIT_CODE=$?
echo "[$(date)] [nginx-demo] Nginx stopped with exit code: $EXIT_CODE"
fi
# Notify LoongCollector that the application container has stopped.
echo "[$(date)] [nginx-demo] Writing tombstone file"
touch /tasksite/tombstone
exit $EXIT_CODE
}
# Register the signal handler.
trap _term_handler SIGTERM SIGINT SIGQUIT
# Wait for LoongCollector to be ready.
echo "[$(date)] [nginx-demo]: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
sleep 1
done
echo "[$(date)] [nginx-demo]: LoongCollector is ready, starting business logic"
# Start Nginx.
echo "[$(date)] [nginx-demo] Starting Nginx..."
nginx -g 'daemon off;' &
NGINX_PID=$!
echo "[$(date)] [nginx-demo] Nginx started with PID: $NGINX_PID"
# Wait for the Nginx process.
wait $NGINX_PID
EXIT_CODE=$?
# Also notify LoongCollector if the exit was not caused by a signal.
if [ ! -f /tasksite/tombstone ]; then
echo "[$(date)] [nginx-demo] Unexpected exit, writing tombstone"
touch /tasksite/tombstone
fi
exit $EXIT_CODE

3. Set the graceful termination period
In spec.template.spec, set a sufficient termination grace period to ensure that LoongCollector has enough time to upload the remaining logs.
spec:
# ... Your other existing spec configurations ...
template:
spec:
terminationGracePeriodSeconds: 600 # 10-minute graceful stop period
4. Parameter description
Variable | Description |
${your_aliyun_user_id} | Set this to the ID of your Alibaba Cloud account. For more information, see Configure user identifiers. |
${your_machine_group_user_defined_id} | Set a custom identifier for the machine group to create a custom machine group. Important: Make sure that this identifier is unique within the region of your project. |
${your_region_config} | Specify this based on the region of the Simple Log Service project and the network type for access. For information about regions, see Service regions. Example: If the project is in the China (Hangzhou) region, use cn-hangzhou for internal network access or cn-hangzhou-internet for public network access. |
${shared_volume_name} | Set a custom name for the volume. Important: The name must be the same as the volume name specified in the volumeMounts sections. |
${shared_volume_path} | Set the mount path, which is the directory in the container where the text logs to be collected are located. |
5. Apply the configuration and verify the result
Run the following command to deploy the changes:
kubectl apply -f <YOUR-YAML>

View the pod status to confirm that the LoongCollector container is successfully injected:

kubectl describe pod <YOUR-POD-NAME>

If you see two containers (the application container and the loongcollector container) and their status is normal, the injection is successful.
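The same check can be done non-interactively. The sidecar_injected helper below is our own illustration; it inspects the container-name list returned by a kubectl jsonpath query.

```shell
# Return success when "loongcollector" appears in a space-separated list of
# container names.
sidecar_injected() {
  case " $1 " in
    *" loongcollector "*) return 0 ;;
    *)                    return 1 ;;
  esac
}

# Against a live cluster (placeholder pod name):
# names=$(kubectl get pod <YOUR-POD-NAME> -o jsonpath='{.spec.containers[*].name}')
# sidecar_injected "$names" && echo "Sidecar injected" || echo "Sidecar missing"
```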
6. Create a machine group with a custom identifier
Log on to the Simple Log Service console and click the target project.
In the navigation pane on the left, go to the machine group list. Next to Machine Groups, click the create icon.
In the Create Machine Group dialog box, configure the following parameters and click OK.
Name: The name of the machine group. It cannot be modified after creation. The naming conventions are as follows:
Can contain only lowercase letters, digits, hyphens (-), and underscores (_).
Must start and end with a lowercase letter or a digit.
Must be 2 to 128 characters in length.
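For illustration, these rules also reduce to one pattern check (valid_group_name is our own helper, not part of the console):

```shell
# Validate a machine group name: lowercase letters, digits, hyphens, and
# underscores; starts and ends with a lowercase letter or digit; 2-128 chars.
valid_group_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9][a-z0-9_-]{0,126}[a-z0-9]$'
}

valid_group_name "k8s-group-my_app" && echo "valid"
valid_group_name "-bad-name" || echo "invalid"
```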
Machine Group Identifier: Select Custom Identifier.
Custom Identifier: Enter the value of the
ALIYUN_LOGTAIL_USER_DEFINED_IDenvironment variable that you set for the LoongCollector container in the YAML file in 1. Modify the application pod YAML configuration. The value must be an exact match. Otherwise, the association will fail.
Check the heartbeat status of the machine group: After the machine group is created, click its name and view the heartbeat status in the status area.
OK: Indicates that LoongCollector has successfully connected to SLS and the machine group is registered.
FAIL:
The configuration may not have taken effect yet. It takes about 2 minutes for the configuration to become effective. Refresh the page and try again later.
If the status is still FAIL after 2 minutes, see Troubleshoot Logtail machine group issues to diagnose the issue.
Each pod corresponds to a separate LoongCollector instance. Use different custom identifiers for different applications or environments to facilitate fine-grained management.
FAQ
How do I modify the LoongCollector configuration for an ACK managed cluster to collect logs across accounts or regions?
How do I collect container logs from Alibaba Cloud ACK Edge, ACK One, ACS, and ACK Serverless clusters?
Next steps
After you install LoongCollector, see Collect container logs from a Kubernetes cluster, which describes core principles, key processes, selection recommendations, and best practices. Then, create a collection configuration.
Appendix: YAML examples
This example provides a complete Kubernetes deployment configuration that includes an NGINX application container and a LoongCollector sidecar container. This configuration is suitable for collecting container logs using the sidecar pattern.
Before you start, make the following three replacements:

Replace ${your_aliyun_user_id} with your Alibaba Cloud account UID.

Replace ${your_machine_group_user_defined_id} with the custom identifier of the machine group that you created in 6. Create a machine group with a custom identifier. The value must be an exact match.

Replace ${your_region_config} with the configuration name that matches the region and network type of your SLS project. For example, for a project in the China (Hangzhou) region, use cn-hangzhou for internal network access or cn-hangzhou-internet for public network access.
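The region-to-config mapping follows the suffix convention above; the small helper below (a sketch based on that convention, not an official API) makes it explicit.

```shell
# Derive the LoongCollector region config name from the project region and
# network type: Internet access appends "-internet"; Intranet access uses
# the region ID as-is.
region_config() {
  local region="$1" net="$2"
  if [ "$net" = "Internet" ]; then
    echo "${region}-internet"
  else
    echo "${region}"
  fi
}

region_config cn-hangzhou Internet
region_config cn-hangzhou Intranet
```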
Short-lived tasks (Job/CronJob)
apiVersion: batch/v1
kind: Job
metadata:
name: demo-job
spec:
backoffLimit: 3
activeDeadlineSeconds: 3600
completions: 1
parallelism: 1
template:
spec:
restartPolicy: Never
terminationGracePeriodSeconds: 300
containers:
# Application container
- name: demo-job
image: debian:bookworm-slim
command: ["/bin/bash", "-c"]
args:
- |
# Wait for LoongCollector to be ready.
echo "[$(date)] Business: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
sleep 1
done
echo "[$(date)] Business: LoongCollector is ready, starting business logic"
# Execute business logic.
echo "Hello, World!" >> /app/logs/business.log
# Save the exit code.
retcode=$?
echo "[$(date)] Business: Task completed with exit code: $retcode"
# Notify LoongCollector that the business task is complete.
touch /tasksite/tombstone
echo "[$(date)] Business: Tombstone created, exiting"
exit $retcode
# Resource limits
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "512Mi"
# Volume mounts
volumeMounts:
- name: app-logs
mountPath: /app/logs
- name: tasksite
mountPath: /tasksite
# LoongCollector Sidecar container
- name: loongcollector
image: aliyun-observability-release-registry.cn-hongkong.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
command: ["/bin/bash", "-c"]
args:
- |
echo "[$(date)] LoongCollector: Starting initialization"
# Start the LoongCollector service.
/etc/init.d/loongcollectord start
# Wait for the configuration to be downloaded and the service to be ready.
sleep 15
# Verify the service status.
if /etc/init.d/loongcollectord status; then
echo "[$(date)] LoongCollector: Service started successfully"
touch /tasksite/cornerstone
else
echo "[$(date)] LoongCollector: Failed to start service"
exit 1
fi
# Wait for the application container to complete.
echo "[$(date)] LoongCollector: Waiting for business container to complete"
until [[ -f /tasksite/tombstone ]]; do
sleep 2
done
echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
# Allow enough time to transmit remaining logs.
sleep 30
echo "[$(date)] LoongCollector: Stopping service"
/etc/init.d/loongcollectord stop
echo "[$(date)] LoongCollector: Shutdown complete"
# Health check
livenessProbe:
exec:
command: ["/etc/init.d/loongcollectord", "status"]
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
# Resource configuration
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "512Mi"
# Environment variable configuration
env:
- name: ALIYUN_LOGTAIL_USER_ID
value: "your-user-id"
- name: ALIYUN_LOGTAIL_USER_DEFINED_ID
value: "your-user-defined-id"
- name: ALIYUN_LOGTAIL_CONFIG
value: "/etc/ilogtail/conf/cn-hongkong/ilogtail_config.json"
- name: ALIYUN_LOG_ENV_TAGS
value: "_pod_name_|_pod_ip_|_namespace_|_node_name_"
# Pod information injection
- name: "_pod_name_"
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: "_pod_ip_"
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: "_namespace_"
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: "_node_name_"
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# Volume mounts
volumeMounts:
- name: app-logs
mountPath: /app/logs
readOnly: true
- name: tasksite
mountPath: /tasksite
- name: tz-config
mountPath: /etc/localtime
readOnly: true
# Volume definitions
volumes:
- name: app-logs
emptyDir: {}
- name: tasksite
emptyDir:
medium: Memory
sizeLimit: "10Mi"
- name: tz-config
hostPath:
path: /usr/share/zoneinfo/Asia/Shanghai
Long-lived services (Deployment/StatefulSet)
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-demo
namespace: production
labels:
app: nginx-demo
version: v1.0.0
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
selector:
matchLabels:
app: nginx-demo
template:
metadata:
labels:
app: nginx-demo
version: v1.0.0
spec:
terminationGracePeriodSeconds: 600 # 10-minute graceful stop period
containers:
# Application container - Web application
- name: nginx-demo
image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
# Start command and signal handling
command: ["/bin/sh", "-c"]
args:
- |
# Define the signal handler function.
_term_handler() {
echo "[$(date)] [nginx-demo] Caught SIGTERM, starting graceful shutdown..."
# Send a QUIT signal to Nginx for a graceful stop.
if [ -n "$NGINX_PID" ]; then
kill -QUIT "$NGINX_PID" 2>/dev/null || true
echo "[$(date)] [nginx-demo] Sent SIGQUIT to Nginx PID: $NGINX_PID"
# Wait for Nginx to stop gracefully.
wait "$NGINX_PID"
EXIT_CODE=$?
echo "[$(date)] [nginx-demo] Nginx stopped with exit code: $EXIT_CODE"
fi
# Notify LoongCollector that the application container has stopped.
echo "[$(date)] [nginx-demo] Writing tombstone file"
touch /tasksite/tombstone
exit $EXIT_CODE
}
# Register the signal handler.
trap _term_handler SIGTERM SIGINT SIGQUIT
# Wait for LoongCollector to be ready.
echo "[$(date)] [nginx-demo]: Waiting for LoongCollector to be ready..."
until [[ -f /tasksite/cornerstone ]]; do
sleep 1
done
echo "[$(date)] [nginx-demo]: LoongCollector is ready, starting business logic"
# Start Nginx.
echo "[$(date)] [nginx-demo] Starting Nginx..."
nginx -g 'daemon off;' &
NGINX_PID=$!
echo "[$(date)] [nginx-demo] Nginx started with PID: $NGINX_PID"
# Wait for the Nginx process.
wait $NGINX_PID
EXIT_CODE=$?
# Also notify LoongCollector if the exit was not caused by a signal.
if [ ! -f /tasksite/tombstone ]; then
echo "[$(date)] [nginx-demo] Unexpected exit, writing tombstone"
touch /tasksite/tombstone
fi
exit $EXIT_CODE
# Resource configuration
resources:
requests:
cpu: "200m"
memory: "256Mi"
limits:
cpu: "1000m"
memory: "1Gi"
# Volume mounts
volumeMounts:
- name: nginx-logs
mountPath: /var/log/nginx
- name: tasksite
mountPath: /tasksite
- name: tz-config
mountPath: /etc/localtime
readOnly: true
# LoongCollector Sidecar container
- name: loongcollector
image: aliyun-observability-release-registry.cn-shenzhen.cr.aliyuncs.com/loongcollector/loongcollector:v3.1.1.0-20fa5eb-aliyun
command: ["/bin/bash", "-c"]
args:
- |
echo "[$(date)] LoongCollector: Starting initialization"
# Start the LoongCollector service.
/etc/init.d/loongcollectord start
# Wait for the configuration to be downloaded and the service to be ready.
sleep 15
# Verify the service status.
if /etc/init.d/loongcollectord status; then
echo "[$(date)] LoongCollector: Service started successfully"
touch /tasksite/cornerstone
else
echo "[$(date)] LoongCollector: Failed to start service"
exit 1
fi
# Wait for the application container to complete.
echo "[$(date)] LoongCollector: Waiting for business container to complete"
until [[ -f /tasksite/tombstone ]]; do
sleep 2
done
echo "[$(date)] LoongCollector: Business completed, waiting for log transmission"
# Allow enough time to transmit remaining logs.
sleep 30
echo "[$(date)] LoongCollector: Stopping service"
/etc/init.d/loongcollectord stop
echo "[$(date)] LoongCollector: Shutdown complete"
# Health check
livenessProbe:
exec:
command: ["/etc/init.d/loongcollectord", "status"]
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
# Resource configuration
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "2000m"
memory: "2048Mi"
# Environment variable configuration
env:
- name: ALIYUN_LOGTAIL_USER_ID
value: "${your_aliyun_user_id}"
- name: ALIYUN_LOGTAIL_USER_DEFINED_ID
value: "${your_machine_group_user_defined_id}"
- name: ALIYUN_LOGTAIL_CONFIG
value: "/etc/ilogtail/conf/${your_region_config}/ilogtail_config.json"
# Enable full drain mode to ensure all logs are sent when the pod stops.
- name: enable_full_drain_mode
value: "true"
# Append pod environment information as log tags.
- name: "ALIYUN_LOG_ENV_TAGS"
value: "_pod_name_|_pod_ip_|_namespace_|_node_name_|_node_ip_"
# Get pod and node information.
- name: "_pod_name_"
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: "_pod_ip_"
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: "_namespace_"
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: "_node_name_"
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: "_node_ip_"
valueFrom:
fieldRef:
fieldPath: status.hostIP
# Volume mounts
volumeMounts:
- name: nginx-logs
mountPath: /var/log/nginx
readOnly: true
- name: tasksite
mountPath: /tasksite
- name: tz-config
mountPath: /etc/localtime
readOnly: true
# Volume definitions
volumes:
- name: nginx-logs
emptyDir: {}
- name: tasksite
emptyDir:
medium: Memory
sizeLimit: "50Mi"
- name: tz-config
hostPath:
path: /usr/share/zoneinfo/Asia/Shanghai