Collect the stdout and stderr logs of GPU-HPN pods by using sidecar containers - Container Compute Service

A running GPU-HPN pod generates stdout and stderr logs. This topic describes how to collect these logs from a GPU-HPN pod by using a sidecar container.

Important

This topic applies only to ACS GPU-HPN pods. For information about how to collect logs from general-purpose pods and accelerated pods, see Use a custom solution to collect application logs.

Procedure

Step 1: Deploy a multi-container workload

Deploy a GPU-HPN workload that contains three containers. The purpose of each container is as follows:

- print-stdout: prints a log entry to stdout.
- print-stderr: prints a log entry to stderr.
- shared-std-volume: mounts the stdout and stderr logs of all containers in the pod to its own /var/log/container-std directory.

Create a file named dep-with-std-volume.yaml.

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-with-std-volume
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dep-with-std-volume
  template:
    metadata:
      labels:
        alibabacloud.com/compute-class: "gpu-hpn"
        alibabacloud.com/hpn-type: "rdma"
        app: dep-with-std-volume
    spec:
      containers:
      - name: print-stdout
        image: registry.cn-wulanchabu.aliyuncs.com/acs/ubuntu:latest
        # Print the stdout log.
        command: ["bash", "-c", "echo 'this log is from print-stdout container'; tail -f /dev/null"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "1Gi"
      - name: print-stderr
        image: registry.cn-wulanchabu.aliyuncs.com/acs/ubuntu:latest
        # Print the stderr log.
        command: ["bash", "-c", "echo 'this log is from print-stderr container' >&2; tail -f /dev/null"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "1Gi"
      - name: shared-std-volume
        image: registry.cn-wulanchabu.aliyuncs.com/acs/ubuntu:latest
        command: ["sleep", "infinity"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "1Gi"
        volumeMounts:
        - mountPath: /var/log/container-std
          name: stdout-volume
      volumes:
      # Declare a shared volume for stdout.
      - name: stdout-volume
        emptyDir:
          medium: Stdout
```

Deploy the workload.

```
kubectl apply -f dep-with-std-volume.yaml
```

Query the status of the pods after the workload is deployed.

```
kubectl get pod -n default
```

Expected output:

```
NAME                            READY   STATUS             RESTARTS         AGE
dep-with-std-volume-xxxx-xxxx   3/3     Running            0                9m
```

Step 2: Verify the logs

Log on to the shared-std-volume container. Replace dep-with-std-volume-xxxx-xxxx with the pod name returned in the previous step.

```
kubectl exec -it dep-with-std-volume-xxxx-xxxx -c shared-std-volume -- /bin/bash
```

View the shared directory.
```
ls /var/log/container-std/*
```
Expected output:
```
/var/log/container-std/print-stderr:
0.log

/var/log/container-std/print-stdout:
0.log

/var/log/container-std/shared-std-volume:
0.log
```
The directory contains three subdirectories, which are named after the three containers. You can find a file named 0.log in each subdirectory.
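
A log collector running in the sidecar needs to discover these files before it can read them. The following is a minimal sketch of that discovery step, assuming Python is available in the sidecar image (the sample ubuntu image may not include it) and that log files end in .log, as 0.log does. The function name find_container_logs is hypothetical and not part of any ACS tooling.

```python
from pathlib import Path

# Mount path of the shared volume, as declared in dep-with-std-volume.yaml.
LOG_ROOT = Path("/var/log/container-std")


def find_container_logs(root=LOG_ROOT):
    """Map each container name to the sorted list of its *.log files.

    Assumption: every subdirectory of `root` is named after a container
    and holds that container's log files (for example, 0.log).
    """
    root = Path(root)
    if not root.is_dir():
        return {}  # the volume is not mounted at this path
    logs = {}
    for subdir in sorted(root.iterdir()):
        if subdir.is_dir():
            logs[subdir.name] = sorted(subdir.glob("*.log"))
    return logs
```

Calling find_container_logs() inside the shared-std-volume container should return one key per container in the pod, each mapped to the paths of its log files.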
Each entry in a log file consists of four fields separated by spaces: the UTC time, the output source, a line-break flag, and the original content. The following example shows how to view the output of the print-stdout and print-stderr containers.
```
cat /var/log/container-std/print-stdout/*
cat /var/log/container-std/print-stderr/*
```
Expected output:
```
2024-10-18T08:48:57.567368449Z stdout F this log is from print-stdout container
2024-10-18T08:48:57.949367202Z stderr F this log is from print-stderr container
```
- Time: The UTC time, which cannot be customized.
- Output source: stdout indicates the standard output and stderr indicates the standard error output.
- Line-break flag: Each line in the file contains a maximum of 4,096 characters. If the original content exceeds this limit, the remaining content is written to a new line. The flag, F or P, indicates whether the current line ends the original content.
  - F (full) indicates that the current line is the last line of the original content.
  - P (partial) indicates that the current line is not the last line of the original content.
- Original content: The actual output of the container.
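
Because long output is split across several P lines followed by an F line, a collector typically has to rejoin the fragments before forwarding an entry. The following is a minimal sketch of such a parser, based only on the four-field format described above. The names LogEntry and parse_entries are hypothetical, and the sketch assumes that the fragments of one entry appear in order for a given output source.

```python
from typing import Iterable, Iterator, NamedTuple


class LogEntry(NamedTuple):
    timestamp: str  # UTC time of the first fragment of the entry
    stream: str     # output source: "stdout" or "stderr"
    content: str    # original content with all fragments rejoined


def parse_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Parse log lines and merge P fragments into the closing F line."""
    pending = {}  # stream -> (timestamp of first fragment, fragments so far)
    for raw in lines:
        parts = raw.rstrip("\n").split(" ", 3)
        # Skip lines that do not match "<time> <source> <F|P> <content>".
        if len(parts) < 3 or parts[2] not in ("F", "P"):
            continue
        timestamp, stream, flag = parts[0], parts[1], parts[2]
        content = parts[3] if len(parts) == 4 else ""
        first_ts, fragments = pending.pop(stream, (timestamp, []))
        fragments.append(content)
        if flag == "F":
            # F closes the entry: emit the rejoined content.
            yield LogEntry(first_ts, stream, "".join(fragments))
        else:
            # P means more fragments follow: keep buffering.
            pending[stream] = (first_ts, fragments)
```

For example, passing an open file handle for /var/log/container-std/print-stdout/0.log to parse_entries yields one LogEntry per original log line, regardless of how many 4,096-character fragments it was split into.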