Container Compute Service: Use core dumps to analyze program anomalies in an instance

Last Updated: Dec 17, 2024

If a program unexpectedly terminates or stops responding, the operating system records the content of the random access memory (RAM) allocated to the program and saves it to a file for subsequent debugging and analysis. This process is called a core dump. With core dump files, you can use the gdb debugging tool to locate the cause of program crashes. This topic describes how to enable core dumps for an ACS pod so that you can view and analyze the core dump files generated when a container exits unexpectedly, identify the cause of the issue, and fix it.

How it works

In Linux, if a program unexpectedly terminates or crashes, the operating system records the state of the RAM allocated to the program and saves it to a file. This process is called a core dump. The RAM state files generated during a core dump are known as core dump files, or core files. You can view and analyze core files with the gdb debugging tool to find the cause of the crash.
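
For example, a minimal gdb session looks similar to the following. The binary path and core file name are placeholders; use the binary that actually produced the core file, ideally a build with debugging symbols.

# Load the crashed binary together with its core file (placeholder paths).
gdb /usr/sbin/nginx /data/dump-a/core-xxx
# At the (gdb) prompt, print the call stack of the crashed thread.
(gdb) bt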

In Linux, signals whose default action is Core, such as SIGSEGV, SIGABRT, and SIGQUIT, generate core files by default. For more information, see Core dump file.

How to work with core dumps

By default, core dumps are disabled for ACS pods. Frequent core dumps can generate large numbers of core files, which exhaust disk space and affect your workloads. We recommend that you mount a remotely shared OSS or NAS volume to store core files. By specifying a custom path for core files on the shared volume, you keep the core files intact and prevent the core files generated by repeated CrashLoopBackOff events from exhausting the storage space at the container rootfs layer.

You can also store core files in an emptyDir volume and inject an ephemeral container into the pod. After a core dump event occurs, you can log on to the ephemeral container to view and analyze the core files in the shared volume.

You can add the pod annotation alibabacloud.com/core-pattern: core-path/core-pattern to enable core dumps and specify the path of core files. You can mount the path to a shared volume so that you can analyze the core files in the path. The core-pattern part specifies the name format of core files, for example core-%E-%p-%t:

  • %E: the path of the executable file that caused the crash, with slashes replaced by exclamation marks (!).

  • %p: the ID of the crashed process (PID).

  • %t: the time when the core dump occurred, expressed as a UNIX timestamp.

For more information about core-pattern file naming, see Man page of core.
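
For example, if /usr/sbin/nginx in a container crashes, the pattern core-%E-%p-%t produces a file name similar to the following. The PID and timestamp are hypothetical, and %E replaces the slashes in the executable path with exclamation marks (!):

core-!usr!sbin!nginx-1234-1718600000

The following example shows how to add the annotation to a pod: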

apiVersion: v1
kind: Pod
metadata:
  annotations:
    alibabacloud.com/core-pattern: "/data/dump-a/core-%E-%p-%t"
...
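
To check whether the annotation has taken effect, you can read the kernel core pattern from inside a running container. The pod name below is a placeholder, and this check assumes that ACS applies the annotation to the core_pattern setting of the instance kernel.

kubectl exec -it <pod-name> -- cat /proc/sys/kernel/core_pattern
# Expected output if the annotation is applied: /data/dump-a/core-%E-%p-%t
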
Note

You can perform the following steps by using kubectl on your client or in CloudShell. To use kubectl on your client, make sure that kubectl is installed and the kubeconfig file of the ACS cluster is configured. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
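
For example, you can run a quick check to confirm that kubectl can reach the cluster before you continue.

kubectl cluster-info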

Mount a remotely shared volume

Mount a NAS volume to store core files

Use a shared NAS volume to collect the core files generated by the kernel when a container crashes.

  1. Create a NAS file system and a mount target. For more information, see Create a NAS file system and a mount target.

  2. Create a Deployment named coredump-nas-volume-test based on the following YAML content. For more information, see Create a stateless application by using a Deployment. Replace the value of volumes.csi.volumeAttributes.server with the domain name of your NAS mount target.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: coredump-nas-volume-test
      labels:
        app: test
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          name: nginx-test
          labels:
            app: nginx
          annotations:
            alibabacloud.com/core-pattern: "/data/dump-a/core-%E-%p-%t" # Specify the path of core files.
        spec:
          containers:
          - name: nginx
            image: registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest
            volumeMounts:
              - name: nas-volume
                mountPath: /data/dump-a/
          volumes:     # Mount a shared NAS volume.
            - name: nas-volume
              csi:
                driver: nasplugin.csi.alibabacloud.com
                fsType: nas
                volumeAttributes:
                  server: "0389a***-nh7m.cn-shanghai.extreme.nas.aliyuncs.com"
                  path: "/"
                  vers: "3"
                  options: "nolock,tcp,noresvport"

    After the preceding pod triggers a core dump event, the core file is stored in the remote NAS volume.

  3. Run the following command to check whether the volume is mounted. Make sure that you can view core files in the remotely shared volume when core dumps occur.

    kubectl exec -it deploy/coredump-nas-volume-test -- sh -c 'df -h | grep aliyun'

    Expected results:

    0389a***-nh7m.cn-shanghai.extreme.nas.aliyuncs.com:/   10P     0   10P   0% /data/dump-a

    The mounted NAS volume is working.
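
To verify the end-to-end flow, you can trigger a test core dump in the container and then list the shared directory. The following command is a hypothetical test, not part of the procedure: it starts a short-lived sleep process, terminates it with SIGSEGV (a signal whose default action is Core), and then lists the NAS directory. It assumes that core dumps are enabled through the annotation shown above.

# Force a test core dump in the container, then list the files in the shared NAS directory.
kubectl exec -it deploy/coredump-nas-volume-test -- sh -c 'sleep 60 & kill -s SEGV $!; sleep 2; ls -lh /data/dump-a/'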

Mount an OSS volume to store core files

Use a shared OSS volume to collect the core files generated by the kernel when a container crashes.

  1. Create an OSS bucket. For more information, see Mount a statically provisioned OSS volume.

  2. Create a Deployment named coredump-oss-volume-test based on the following YAML content. For more information, see Create a stateless application by using a Deployment. Replace the values in spec.csi.volumeAttributes of the PersistentVolume with the name and endpoint of your OSS bucket and your credentials.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: coredump-oss-volume-test
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
          annotations:
            alibabacloud.com/core-pattern: "/data/dump-a/core-%E-%p-%t"    # Specify the path of core files.
        spec:
          containers:
          - name: nginx
            image: registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest
            volumeMounts:
              - name: oss-volume
                mountPath: /data/dump-a/
          volumes:
            - name: oss-volume
              persistentVolumeClaim:
                claimName: oss-pvc
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: oss-pvc
    spec:
      storageClassName: test # The StorageClass name is used only to bind the PVC to the PV. No StorageClass resource is created.
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 50Gi
      selector:
        matchLabels:
          alicloud-pvname: oss-csi-pv
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: oss-csi-pv
      labels:
        alicloud-pvname: oss-csi-pv
    spec:
      storageClassName: test # The StorageClass name is used only to bind the PVC to the PV. No StorageClass resource is created.
      capacity:
        storage: 50Gi
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
      csi:
        driver: ossplugin.csi.alibabacloud.com
        volumeHandle: oss-csi-pv
        volumeAttributes:
          bucket: "oss-test"
          url: "oss-cn-hangzhou-internal.aliyuncs.com"
          otherOpts: "-o max_stat_cache_size=0 -o allow_other"
          akId: "<your AccessKey ID>"
          akSecret: "<your AccessKey Secret>"

    After the preceding pod triggers a core dump event, the core file is stored in the remote OSS volume.

  3. Run the following command to check whether the volume is mounted. Make sure that you can view core files in the remotely shared volume when core dumps occur.

    kubectl exec -it deploy/coredump-oss-volume-test -- sh -c 'df -h | grep s3fs'

    Expected results:

    s3fs             16E     0   16E   0% /data/dump-a

    The mounted OSS volume is working.
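
If you prefer to analyze core files on your own machine, you can copy them out of the pod with kubectl cp and open them locally in gdb. The pod name, core file name, and local binary path below are placeholders, and kubectl cp requires the tar binary in the container image.

# Copy a core file from the pod to the current directory (placeholder names).
kubectl cp default/coredump-oss-volume-test-xxxxx:/data/dump-a/core-xxx ./core-xxx
# Analyze the core file locally together with the binary that crashed.
gdb ./nginx ./core-xxx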

Inject ephemeral containers

Store core files in an emptyDir volume and inject an ephemeral container that mounts the same volume so that you can analyze the core files.

  1. Create a Deployment named coredump-emptydir-volume-test based on the following YAML content. For more information, see Create a stateless application by using a Deployment.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: coredump-emptydir-volume-test
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
          annotations:
            alibabacloud.com/core-pattern: "/data/dump-a/core-%E-%p-%t"    # Specify the path of core files.
        spec:
          containers:
          - name: nginx
            image: registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest
            volumeMounts:
              - name: emptydir-volume
                mountPath: /data/dump-a/
          volumes:
            - name: emptydir-volume
              emptyDir: {}

    The Deployment mounts the volume named emptydir-volume to the pod. After the configuration is complete, you can log on to the ephemeral container and view the core files at the mount target.

    Important

    The open source kubectl debug command does not allow you to configure volume mounts when it injects ephemeral containers into a pod. The following steps use kubectl proxy to set up a local proxy to the cluster so that you can patch the pod and verify the mount target. Open two terminals on the client and keep the terminal that runs the proxy open.

  2. Run the following command in one terminal to start a proxy between the client and the cluster.

    kubectl proxy 

    Expected results:

    Starting to serve on 127.0.0.1:8001

    Note

    You can run the kubectl proxy command and set --port to specify a port. For more information, see kubectl proxy.

  3. Run the following command in the other terminal to inject the ephemeral container into the pod. Replace coredump-emptydir-volume-test-xxxxx and target-container with the actual values.

    curl -k http://127.0.0.1:8001/api/v1/namespaces/default/pods/coredump-emptydir-volume-test-xxxxx/ephemeralcontainers -X PATCH  -H 'Content-Type: application/strategic-merge-patch+json' -d '{
      "spec": {
        "ephemeralContainers": [
          {
            "name": "debugger-container-name",
            "command": [
              "/bin/sh",
              "-c",
              "sleep 3600"
            ],
            "image": "registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest",
            "stdin": true,
            "tty": true,
            "targetContainerName": "target-container", # Specify a container when the pod contains multiple containers.
            "volumeMounts": [
              {
                "name": "emptydir-volume",
                "mountPath": "/data/dump-a/"
              }
            ]
          }
        ]
      }
    }'

    The following list describes some of the parameters:

    • ...namespaces/${NAMESPACE}...: the namespace of the pod into which the ephemeral container is injected.

    • ...pods/${POD_NAME}...: the name of the pod into which the ephemeral container is injected.

    • spec: ${SPEC_DETAIL}: the spec content of the ephemeral container to inject. We recommend that you replace the values one by one and use a tool to validate the JSON format of the spec field.

    • spec.ephemeralContainers.name: the name of the ephemeral container. If you inject multiple ephemeral containers, their names must be unique.

    • spec.ephemeralContainers.command: the startup command of the ephemeral container. This setting is optional when a custom image is used.

    • spec.ephemeralContainers.targetContainerName: the name of the target container in the pod. Specify this parameter when the pod contains multiple containers.

    • spec.ephemeralContainers.volumeMounts: the volume mount of the ephemeral container. The volume name must match the volume that stores core files, and the mount path must contain the directory specified in the pod's core-pattern annotation.

  4. After the ephemeral container is running, run the following command to log on to it.

    kubectl exec -it -n default coredump-emptydir-volume-test-xxxxx -c debugger-container-name -- sh

  5. In the ephemeral container, run the following command to access the mount target.

    cd /data/dump-a && pwd

    Expected results:

    /data/dump-a
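
After you find a core file at the mount target, you can analyze it directly in the ephemeral container, provided that the debug image contains gdb and the binary that crashed. The file name below is a placeholder; the sample nginx image used above does not necessarily include gdb, so you may need a custom debug image.

# List the core files collected in the shared emptyDir volume.
ls -lh /data/dump-a/
# Analyze a core file with gdb (placeholder file name).
gdb /usr/sbin/nginx /data/dump-a/core-xxx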