A running GPU-HPN pod generates stdout and stderr logs. This topic describes how to collect these logs from a GPU-HPN pod by using a sidecar container.
This topic applies only to ACS GPU-HPN pods. For information about how to collect logs from general-purpose pods and accelerated pods, see Use a custom solution to collect application logs.
Procedure
Step 1: Deploy a multi-container workload
Deploy a GPU-HPN workload that contains three containers. The purpose of each container is as follows:
print-stdout: prints the stdout log.
print-stderr: prints the stderr log.
shared-std-volume: mounts the stdout and stderr logs of all containers in the pod to the /var/log/container-std directory of the container.
Create a file named dep-with-std-volume.yaml.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-with-std-volume
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dep-with-std-volume
  template:
    metadata:
      labels:
        alibabacloud.com/compute-class: "gpu-hpn"
        alibabacloud.com/hpn-type: "rdma"
        app: dep-with-std-volume
    spec:
      containers:
      - name: print-stdout
        image: registry.cn-wulanchabu.aliyuncs.com/acs/ubuntu:latest
        # Print the stdout log.
        command: ["bash", "-c", "echo 'this log is from print-stdout container'; tail -f /dev/null"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "1Gi"
      - name: print-stderr
        image: registry.cn-wulanchabu.aliyuncs.com/acs/ubuntu:latest
        # Print the stderr log.
        command: ["bash", "-c", "echo 'this log is from print-stderr container' >&2; tail -f /dev/null"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "1Gi"
      - name: shared-std-volume
        image: registry.cn-wulanchabu.aliyuncs.com/acs/ubuntu:latest
        command: ["sleep", "infinity"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "1Gi"
        volumeMounts:
        - mountPath: /var/log/container-std
          name: stdout-volume
      volumes:
      # Declare a shared volume for stdout.
      - name: stdout-volume
        emptyDir:
          medium: Stdout
Deploy the workload.
kubectl apply -f dep-with-std-volume.yaml
Query the status of the pods after the workload is deployed.
kubectl get pod -n default
Expected output:
NAME                            READY   STATUS    RESTARTS   AGE
dep-with-std-volume-xxxx-xxxx   3/3     Running   0          9m
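The readiness check above can also be scripted. The following is a minimal sketch, not an ACS tool: pods_ready is a hypothetical helper name, and it assumes the default table layout of kubectl get pod shown in the expected output (READY in the second column, STATUS in the third).

```shell
# Hypothetical helper: reads the table printed by `kubectl get pod` on stdin.
# Succeeds only if at least one pod is listed and every listed pod is
# Running with all of its containers ready (for example 3/3).
pods_ready() {
  awk '
    NR == 1 { next }                 # skip the header row
    {
      n++
      split($2, ready, "/")          # READY column, for example 3/3
      if (ready[1] != ready[2] || $3 != "Running") bad = 1
    }
    END { exit ((n == 0 || bad) ? 1 : 0) }
  '
}

# Usage (requires access to the cluster). The label selector comes from
# the manifest in this topic:
#   kubectl get pod -n default -l app=dep-with-std-volume | pods_ready \
#     && echo "all containers are ready"
```

If you only need to block until the pod is ready, kubectl wait --for=condition=Ready pod -l app=dep-with-std-volume -n default --timeout=120s is an alternative. The timeout value here is an arbitrary choice.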
Step 2: Verify the logs
Log on to the shared-std-volume container.
kubectl exec -it dep-with-std-volume-xxxx-xxxx -c shared-std-volume -- /bin/bash
View the shared directory.
ls /var/log/container-std/*
Expected output:
/var/log/container-std/print-stderr:
0.log

/var/log/container-std/print-stdout:
0.log

/var/log/container-std/shared-std-volume:
0.log
The directory contains three subdirectories, which are named after the three containers. Each subdirectory contains a file named 0.log.
Each entry in the log file consists of the UTC time, output source, line break mark, and original content, separated by spaces. The following example shows how to view the output of the print-stdout and print-stderr containers.
cat /var/log/container-std/print-stdout/*
cat /var/log/container-std/print-stderr/*
Expected output:
2024-10-18T08:48:57.567368449Z stdout F this log is from print-stdout container
2024-10-18T08:48:57.949367202Z stderr F this log is from print-stderr container
Time: The UTC time. This field cannot be customized.
Output source: stdout indicates the standard output and stderr indicates the standard error output.
Line break mark: Each line in the file contains a maximum of 4,096 characters. If the original content exceeds this limit, the remaining content is written to a new line. The characters F and P indicate whether the current line completes the original content. F indicates that the current line is the last line of the original content. P indicates that the current line is not the last line of the original content.
Original content: The actual output of the container.
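The field layout and the P and F marks described above can be handled with a short script inside the sidecar. The following is a minimal sketch under stated assumptions, not an ACS-provided collector: merge_std_log and collect_std_logs are hypothetical helper names, and the sketch assumes that the chunks of a split entry appear in order within one file and that an entry without a closing F line is still being written.

```shell
# Hypothetical helper: reads "<time> <stream> <P|F> <content>" entries on
# stdin and prints "<time> <stream> <content>" once per original output
# line, joining P chunks with the F chunk that ends them.
merge_std_log() {
  awk '
    NF < 3 { next }                            # ignore malformed lines
    {
      ts = $1; stream = $2; tag = $3
      content = $0
      sub(/^[^ ]+ [^ ]+ [^ ]+ ?/, "", content) # drop the three leading fields
      if (!(stream in start)) start[stream] = ts   # time of the first chunk
      buf[stream] = buf[stream] content
      if (tag == "F") {                        # final chunk: emit and reset
        print start[stream], stream, buf[stream]
        delete start[stream]
        delete buf[stream]
      }
    }
  '
}

# Hypothetical helper: merges every *.log file under a base directory and
# prefixes each entry with the container name (the subdirectory name).
collect_std_logs() (
  base=$1
  for dir in "$base"/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    for f in "$dir"*.log; do
      [ -f "$f" ] || continue
      merge_std_log < "$f" | awk -v c="$name" '{ print c, $0 }'
    done
  done
)

# Usage inside the shared-std-volume container:
#   collect_std_logs /var/log/container-std
```

Each printed line has the form container name, time, output source, and merged content. Buffers are kept per output source so that a split stdout entry is not mixed with stderr lines that appear between its chunks. A real collector would also need to follow the files as they grow, which this sketch does not do.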