Container Compute Service:Use core dumps to analyze program anomalies in an instance

Last Updated:Mar 26, 2026

When a container program crashes unexpectedly, the Linux kernel captures the program's memory state at the moment of failure and saves it to a core file. Use gdb to analyze the core file and identify the root cause of the crash. This page covers how to enable core dumps for ACS pods, choose a storage method for core files, and access them after a crash.

How it works

In Linux, when a program terminates abnormally, the kernel records the state of the memory allocated to that program and writes it to a file. This process is called a core dump, and the resulting file is called a core file.

The following figure shows the Linux signals that trigger a core dump. By default, signals whose action is Core generate core files. For details, see Core dump file.
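A process killed by one of these signals reports an exit status of 128 plus the signal number, which is how such a crash usually appears in a container's termination status. The sketch below is only an illustration on a local Linux shell: it kills a throwaway sleep process with SIGSEGV and decodes the status. The variable names are illustrative, and the core size limit is set to 0 so that the demo leaves no core file behind.

```shell
# Start a throwaway process with core files disabled for this demo only.
( ulimit -c 0; exec sleep 30 ) &
victim_pid=$!

# SIGSEGV is one of the signals whose default action is Core.
kill -s SEGV "$victim_pid"

# wait returns 128 + signal number for a process terminated by a signal.
status=0
wait "$victim_pid" || status=$?

signal_number=$((status - 128))
signal_name=$(kill -l "$signal_number")
echo "exit status $status = 128 + $signal_number (SIG$signal_name)"
```

On Linux, where SIGSEGV is signal 11, the reported status is 139.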

Prerequisites

Before you begin, ensure that you have:

You can also run the following kubectl commands in CloudShell without configuring a local kubeconfig file.

Enable core dumps

Core dumps are disabled by default for ACS pods. To enable them, add the alibabacloud.com/core-pattern annotation to your pod spec and set the path where core files are stored:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    alibabacloud.com/core-pattern: "/data/dump-a/core-%E-%p-%t"
...

The path value also sets the core file naming pattern. The supported format specifiers are:

Specifier | Description
%E | Path of the executable that crashed, with slashes (/) replaced by exclamation marks (!)
%p | Process ID (PID) of the crashed process
%t | Time of the crash, in seconds since the Unix epoch

For all supported specifiers, see Man page of core.
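To see what file name a given pattern produces, the expansion can be imitated in a few lines of shell. This is a rough sketch, not the kernel's implementation: expand_core_pattern is a hypothetical helper that handles only the three specifiers listed above, and the executable path, PID, and timestamp in the example are made-up values.

```shell
# Imitate how the kernel expands %E, %p, and %t in a core pattern.
# Hypothetical helper for illustration; only these three specifiers are handled.
expand_core_pattern() {
  pattern=$1   # for example /data/dump-a/core-%E-%p-%t
  exe=$2       # absolute path of the crashed executable
  pid=$3       # PID of the crashed process
  ts=$4        # crash time in seconds since the Unix epoch

  # %E replaces every slash in the executable path with an exclamation mark.
  exe_flat=$(printf '%s' "$exe" | tr '/' '!')

  printf '%s\n' "$pattern" |
    sed -e "s|%E|$exe_flat|g" -e "s|%p|$pid|g" -e "s|%t|$ts|g"
}

expand_core_pattern '/data/dump-a/core-%E-%p-%t' /usr/sbin/nginx 42 1711440000
```

With these sample values the function prints /data/dump-a/core-!usr!sbin!nginx-42-1711440000, so the directory part of the pattern has to exist inside the container for the file to be written.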

Choose a storage method

The storage method you choose determines how you access core files after a crash:

Method | When to use | Persistence
NAS volume | Multiple pods across nodes need access to core files; production environments | Persistent, shared across pods
OSS volume | You prefer object storage; core files need to be retained long-term | Persistent, shared across pods
emptyDir + ephemeral container | Quick, one-time debugging of a single pod crash; no shared storage available | Lost when the pod is deleted

Mount a remote shared volume (NAS or OSS) to keep core files intact and prevent the pod's rootfs layer from filling up, which would cause CrashLoopBackOff events.

Mount a remote shared volume

Mount a NAS volume to store core files

Use a shared NAS volume to collect core files when a container crashes.

  1. Create a NAS file system and a mount target. For details, see Create a NAS file system and a mount target.

  2. Create a Deployment using the following YAML. Replace the server value under volumes.csi.volumeAttributes with the address of your NAS mount target. For details on creating Deployments, see Create a stateless application by using a Deployment.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: coredump-nas-volume-test
      labels:
        app: test
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          name: nginx-test
          labels:
            app: nginx
          annotations:
            alibabacloud.com/core-pattern: "/data/dump-a/core-%E-%p-%t" # Core file storage path
        spec:
          containers:
          - name: nginx
            image: registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest
            volumeMounts:
              - name: nas-volume
                mountPath: /data/dump-a/
          volumes:
            - name: nas-volume
              csi:
                driver: nasplugin.csi.alibabacloud.com
                fsType: nas
                volumeAttributes:
                  server: "0389a***-nh7m.cn-shanghai.extreme.nas.aliyuncs.com"
                  path: "/"
                  vers: "3"
                  options: "nolock,tcp,noresvport"

    When a pod triggers a core dump, the core file is stored in the NAS volume.

  3. Verify that the NAS volume is mounted:

    kubectl exec -it deploy/coredump-nas-volume-test -- sh -c 'df -h | grep aliyun'

    Expected output:

    0389a***-nh7m.cn-shanghai.extreme.nas.aliyuncs.com:/   10P     0   10P   0% /data/dump-a

    The NAS volume is mounted and ready. Core files generated by crashes are stored at /data/dump-a/.
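To confirm end to end that a crash actually produces a file on the volume, one option is to crash a throwaway process inside the pod and then list the dump directory. The following is a sketch under assumptions: trigger_test_crash is a hypothetical helper, the container image is assumed to provide sh, sleep, and ls, and the pod is assumed to allow core dumps through the core-pattern annotation.

```shell
# Crash a throwaway sleep process inside the pod, then list the dump directory.
# Hypothetical helper for illustration; it does not touch the application process.
trigger_test_crash() {
  target=$1     # for example deploy/coredump-nas-volume-test
  dump_dir=$2   # directory part of the core-pattern annotation

  kubectl exec "$target" -- sh -c \
    'sleep 300 & kill -s SEGV $!; wait $!; ls -l "$1"' sh "$dump_dir"
}

# Usage (requires access to the cluster):
#   trigger_test_crash deploy/coredump-nas-volume-test /data/dump-a/
```

If the setup works, the listing should contain a new file whose name follows the configured pattern.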

Mount an OSS volume to store core files

Use a shared OSS volume to collect core files when a container crashes.

  1. Create an OSS bucket. For details, see Mount a statically provisioned OSS volume.

  2. Create a Deployment using the following YAML. In the PersistentVolume, replace the spec.csi.volumeAttributes values (bucket, url, akId, and akSecret) with your OSS bucket name, endpoint, and credentials. For details on creating Deployments, see Create a stateless application by using a Deployment.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: coredump-oss-volume-test
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
          annotations:
            alibabacloud.com/core-pattern: "/data/dump-a/core-%E-%p-%t" # Core file storage path
        spec:
          containers:
          - name: nginx
            image: registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest
            volumeMounts:
              - name: oss-volume
                mountPath: /data/dump-a/
          volumes:
            - name: oss-volume
              persistentVolumeClaim:
                claimName: oss-pvc
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: oss-pvc
    spec:
      storageClassName: test # Used for binding mapping only; no resource is created
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 50Gi
      selector:
        matchLabels:
          alicloud-pvname: oss-csi-pv
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: oss-csi-pv
      labels:
        alicloud-pvname: oss-csi-pv
    spec:
      storageClassName: test # Used for binding mapping only; no resource is created
      capacity:
        storage: 50Gi
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
      csi:
        driver: ossplugin.csi.alibabacloud.com
        volumeHandle: oss-csi-pv
        volumeAttributes:
          bucket: "oss-test"
          url: "oss-cn-hangzhou-internal.aliyuncs.com"
          otherOpts: "-o max_stat_cache_size=0 -o allow_other"
          akId: "<your AccessKey ID>"
          akSecret: "<your AccessKey Secret>"

    When a pod triggers a core dump, the core file is stored in the OSS volume.

  3. Verify that the OSS volume is mounted:

    kubectl exec -it deploy/coredump-oss-volume-test -- sh -c 'df -h | grep s3fs'

    Expected output:

    s3fs             16E     0   16E   0% /data/dump-a

    The OSS volume is mounted and ready. Core files generated by crashes are stored at /data/dump-a/.
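Because the volume is shared, core files can be listed from any running replica and copied to a workstation for analysis. The sketch below wraps two standard kubectl subcommands (exec and cp). The helper names list_cores and fetch_core are hypothetical, and kubectl cp assumes that the tar binary is present in the container image.

```shell
# List core files in the dump directory of a pod, newest first.
list_cores() {
  ns=$1; pod=$2; dump_dir=$3
  kubectl exec -n "$ns" "$pod" -- ls -lt "$dump_dir"
}

# Copy one core file from the pod to a local directory (default: current directory).
fetch_core() {
  ns=$1; pod=$2; remote_file=$3; local_dir=${4:-.}
  kubectl cp "$ns/$pod:$remote_file" "$local_dir/${remote_file##*/}"
}

# Usage (requires access to the cluster; the pod and file names are placeholders):
#   list_cores default <pod-name> /data/dump-a/
#   fetch_core default <pod-name> '/data/dump-a/<core-file-name>' /tmp
```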

Use an ephemeral container to access core files

Use this approach when excessive core files are generated or when you need to debug a specific pod crash without a remote shared volume. The core file is written to an emptyDir volume that is mounted to both the application container and an injected ephemeral container. After a core dump event occurs, log in to the ephemeral container to analyze the core file.

Important

The open-source kubectl debug command does not support volume mounts when injecting ephemeral containers. As a workaround, the steps below use the Kubernetes API directly via kubectl proxy and curl to inject an ephemeral container with a volume mount configured. Open two terminal windows and keep the proxy terminal running throughout.
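Because the ephemeral container is injected through a raw API request, a malformed JSON body is an easy mistake to make. One way to reduce hand editing is to generate the body from a few variables, as in the sketch below. build_ephemeral_patch is a hypothetical helper, not part of kubectl or ACS, and it assumes that the values contain no double quotes or backslashes.

```shell
# Print a strategic-merge-patch body that adds one ephemeral container.
# Hypothetical helper; values are inserted verbatim and are not JSON-escaped.
build_ephemeral_patch() {
  name=$1         # name of the ephemeral container
  image=$2        # image used for debugging
  target=$3       # name of the application container to attach to
  volume=$4       # name of the pod volume that holds core files
  mount_path=$5   # mount path inside the ephemeral container
  cat <<EOF
{
  "spec": {
    "ephemeralContainers": [
      {
        "name": "$name",
        "image": "$image",
        "command": ["/bin/sh", "-c", "sleep 3600"],
        "stdin": true,
        "tty": true,
        "targetContainerName": "$target",
        "volumeMounts": [
          { "name": "$volume", "mountPath": "$mount_path" }
        ]
      }
    ]
  }
}
EOF
}

# Usage: write the body to a file, validate it, then send it with curl -d @patch.json.
#   build_ephemeral_patch debugger registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest \
#     nginx emptydir-volume /data/dump-a/ > patch.json
#   python3 -m json.tool patch.json   # optional check, if python3 is installed
```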

  1. Create a Deployment using the following YAML. For details on creating Deployments, see Create a stateless application by using a Deployment.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: coredump-emptydir-volume-test
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
          annotations:
            alibabacloud.com/core-pattern: "/data/dump-a/core-%E-%p-%t" # Core file storage path
        spec:
          containers:
          - name: nginx
            image: registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest
            volumeMounts:
              - name: emptydir-volume
                mountPath: /data/dump-a/
          volumes:
            - name: emptydir-volume
              emptyDir: {}

    This creates a Deployment with an emptydir-volume volume mounted at /data/dump-a/. After a crash, inject an ephemeral container to access core files at that path.

  2. In the first terminal, start a local proxy between your client and the cluster:

    kubectl proxy

    Expected output:

    Starting to serve on 127.0.0.1:8001

    Use --port to specify a different port. For details, see kubectl proxy.

  3. In the second terminal, inject an ephemeral container into the pod. Replace coredump-emptydir-volume-test-xxxxx with the actual pod name and target-container with the name of the target container (nginx in the sample Deployment).

    curl -k http://127.0.0.1:8001/api/v1/namespaces/default/pods/coredump-emptydir-volume-test-xxxxx/ephemeralcontainers \
      -X PATCH \
      -H 'Content-Type: application/strategic-merge-patch+json' \
      -d '{
      "spec": {
        "ephemeralContainers": [
          {
            "name": "debugger-container-name",
            "command": [
              "/bin/sh",
              "-c",
              "sleep 3600"
            ],
            "image": "registry.cn-hangzhou.aliyuncs.com/acs-sample/nginx:latest",
            "stdin": true,
            "tty": true,
            "targetContainerName": "target-container",
            "volumeMounts": [
              {
                "name": "emptydir-volume",
                "mountPath": "/data/dump-a/"
              }
            ]
          }
        ]
      }
    }'

    The following table describes the key parameters:

    Parameter | Description
    ...namespaces/${NAMESPACE}... | Namespace of the pod
    ...pods/${POD_NAME}... | Name of the pod
    spec: ${SPEC_DETAIL} | Spec of the ephemeral container. Validate the JSON format before submitting.
    spec.ephemeralContainers.name | Name of the ephemeral container. Must be unique when multiple ephemeral containers are injected.
    spec.ephemeralContainers.command | Startup command. Optional when using a custom image with a default entrypoint.
    spec.ephemeralContainers.targetContainerName | Name of the target container in the pod. Required when the pod has multiple containers.
    spec.ephemeralContainers.volumeMounts | Volume name and mount path for the ephemeral container. The mount path must match the directory in the core-pattern annotation value.

  4. Log in to the ephemeral container:

    kubectl exec -it -n default coredump-emptydir-volume-test-xxxxx -c debugger-container-name -- sh

  5. Access the core file directory:

    cd /data/dump-a && pwd

    Expected output:

    /data/dump-a

    Core files generated by the crashed container are in this directory. Use gdb to analyze them.

  6. After finishing the debugging session, stop the proxy process in the first terminal by pressing Ctrl+C.
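For the analysis in step 5, gdb needs both the core file and the executable that produced it. With the core-%E-%p-%t pattern used on this page, the executable path can be recovered from the file name. The sketch below assumes that naming pattern, assumes gdb is installed in the debugging image (the sample nginx image may not include it), and assumes the same binary exists at the same path there. exe_from_core_name and analyze_core are hypothetical helpers.

```shell
# Recover the executable path from a file named core-%E-%p-%t.
exe_from_core_name() {
  name=${1##*/}       # drop the directory part
  name=${name#core-}  # drop the fixed prefix
  name=${name%-*}     # drop the trailing timestamp (%t)
  name=${name%-*}     # drop the trailing PID (%p)
  # %E stored slashes as exclamation marks; turn them back into slashes.
  printf '%s\n' "$name" | tr '!' '/'
}

# Print a backtrace of the crashed process in batch mode.
analyze_core() {
  core=$1
  gdb -batch -ex 'bt' "$(exe_from_core_name "$core")" "$core"
}

# Usage inside the debugging container (the file name is a placeholder):
#   analyze_core '/data/dump-a/<core-file-name>'
```

The backtrace is only a starting point. Debug symbols for the binary make the frames considerably easier to read.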
