If your application is written in Java and the heap size of the Java virtual machine (JVM) is too small, the application may encounter out of memory (OOM) errors. You can mount a Container Network File System (CNFS) volume to the directory to which the JVM writes heap dumps, which in this topic is the log directory of the application. This way, the heap dumps that are generated when OOM errors occur are automatically stored in the CNFS volume. This topic describes how to use CNFS to automatically collect the heap dumps of a JVM.

Prerequisites

Background information

  • CNFS allows you to abstract NAS file systems as Kubernetes resources by using CustomResourceDefinition (CRD) objects. You can use the CRD objects to create, delete, describe, mount, monitor, and expand NAS file systems. For more information, see CNFS overview.
  • Container Registry is a secure platform that allows you to manage and distribute cloud-native artifacts that meet the standards of Open Container Initiative (OCI) in an effective manner. The artifacts include container images and Helm charts. For more information, see What is Container Registry?.

Procedure

  1. You can use the registry.cn-hangzhou.aliyuncs.com/acs1/java-oom-test:v1.0 image to deploy a Java program that is used to trigger OOM errors.
    For more information about how to build an image, see Use Container Registry Enterprise Edition to build images.
  2. Use the following template to create a Deployment named java-application.

    When the Mycode program starts, the heap size is fixed at 80 MB and heap dumps are written to the /mnt/oom/logs directory. If the heap cannot meet the memory demand of the program, the JVM throws a java.lang.OutOfMemoryError and, because the -XX:+HeapDumpOnOutOfMemoryError option is set, generates a heap dump.

    cat << EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: java-application
    spec:
      selector:
        matchLabels:
          app: java-application
      template:
        metadata:
          labels:
            app: java-application
        spec:
          containers:
          - name: java-application
            image: registry.cn-hangzhou.aliyuncs.com/acs1/java-oom-test:v1.0  # The image address of the sample Java application. 
            imagePullPolicy: Always
            env:                               # Specify two environment variables. Set the key of one variable to POD_NAME and the value to metadata.name. Set the key of the other variable to POD_NAMESPACE and the value to metadata.namespace. 
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            args:
            - java                            # Run the Java command. 
            - -Xms80m                         # The minimum heap size. 
            - -Xmx80m                         # The maximum heap size. 
            - -XX:HeapDumpPath=/mnt/oom/logs  # The path in which heap dumps are stored when OOM errors occur. 
            - -XX:+HeapDumpOnOutOfMemoryError # Generate heap dumps when OOM errors occur. 
            - Mycode                          # Run the Mycode program. 
            volumeMounts:
            - name: java-oom-pv
              mountPath: "/mnt/oom/logs"      # Mount the CNFS volume to the /mnt/oom/logs directory. 
              subPathExpr: $(POD_NAMESPACE).$(POD_NAME)   # Create a subdirectory named $(POD_NAMESPACE).$(POD_NAME). The subdirectory is used to store heap dumps that are generated due to OOM errors. 
          volumes:
          - name: java-oom-pv
            persistentVolumeClaim:
              claimName: cnfs-nas-pvc         # The persistent volume claim (PVC) that is used to mount the CNFS volume. The PVC name is cnfs-nas-pvc. 
    EOF
  3. Go to the Event Center module of the Container Service for Kubernetes (ACK) console. If a Back-off restarting warning appears on the page, an OOM error has occurred in the java-application application and the container is being restarted.
    1. Log on to the ACK console.
    2. In the left-side navigation pane of the ACK console, click Clusters.
    3. On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
    4. In the left-side navigation pane of the cluster details page, choose Operations > Event Center.
  4. To view, upload, and download files in NAS file systems, you can deploy a File Browser application. This allows you to perform these operations on a web page. Mount the NAS file system to the rootDir path of the File Browser application. Then, run the kubectl port-forward command to map the container port of the File Browser application to your on-premises machine. This way, you can use your browser to access files in the NAS file system.
    1. Use the following template to create the ConfigMap that is used by File Browser and the File Browser Deployment. The ConfigMap sets the listening port of File Browser to 80.
      cat << EOF | kubectl apply -f -
      apiVersion: v1
      data:
        .filebrowser.json: |
          {
            "port": 80
          }
      kind: ConfigMap
      metadata:
        labels:
          app.kubernetes.io/instance: filebrowser
          app.kubernetes.io/name: filebrowser
        name: filebrowser
        namespace: default
      ---
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        labels:
          app.kubernetes.io/instance: filebrowser
          app.kubernetes.io/name: filebrowser
        name: filebrowser
        namespace: default
      spec:
        progressDeadlineSeconds: 600
        replicas: 1
        revisionHistoryLimit: 10
        selector:
          matchLabels:
            app.kubernetes.io/instance: filebrowser
            app.kubernetes.io/name: filebrowser
        template:
          metadata:
            labels:
              app.kubernetes.io/instance: filebrowser
              app.kubernetes.io/name: filebrowser
          spec:
            containers:
            - image: docker.io/filebrowser/filebrowser:v2.18.0
              imagePullPolicy: IfNotPresent
              name: filebrowser
              ports:
              - containerPort: 80
                name: http
                protocol: TCP
              resources: {}
              securityContext: {}
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
              - mountPath: /.filebrowser.json
                name: config
                subPath: .filebrowser.json
              - mountPath: /db
                name: rootdir
              - mountPath: /rootdir
                name: rootdir
            dnsPolicy: ClusterFirst
            restartPolicy: Always
            schedulerName: default-scheduler
            securityContext: {}
            terminationGracePeriodSeconds: 30
            volumes:
            - configMap:
                defaultMode: 420
                name: filebrowser
              name: config
            - name: rootdir
              persistentVolumeClaim:
                claimName: cnfs-nas-pvc
      EOF

      Expected output (if the resources do not exist yet, the status is created instead of unchanged or configured):

      configmap/filebrowser unchanged
      deployment.apps/filebrowser configured
    2. Map port 80 of File Browser to your on-premises machine.
      kubectl port-forward deployment/filebrowser 8080:80
      Expected output:
      Forwarding from 127.0.0.1:8080 -> 80
      Forwarding from [::1]:8080 -> 80
    3. Open your browser, enter 127.0.0.1:8080 in the address bar, and then press Enter. The File Browser logon page appears. Enter the default username (admin) and password (admin). Then, click Login.
    4. The cnfs-nas-pvc PVC is mounted to the /rootdir directory of the File Browser container. Double-click the rootdir directory to open the NAS file system.
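As a command-line alternative to the Event Center check in step 3, the following is a minimal sketch that assumes kubectl is already configured for the cluster. The function name show_oom_signals is hypothetical and is used only for illustration. The label app=java-application comes from the Deployment template above.

```shell
# Hypothetical helper: print signals that suggest the container crashed and
# is being restarted. Assumes kubectl points at the target cluster and that
# the Deployment runs in the current namespace.
show_oom_signals() {
  # Pod status and restart count of the sample application.
  kubectl get pods -l app=java-application
  # Events with the BackOff reason (the back-off restarting warnings).
  kubectl get events --field-selector reason=BackOff
  # Last lines of the previous (crashed) container instance, if one exists.
  kubectl logs -l app=java-application --previous --tail=20
}
```

Run show_oom_signals after the Deployment has been running for a few minutes. If the container has not restarted yet, the last command reports that no previous container exists.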

Result

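On the File Browser page, find the directory that is created for java-application. The directory is named in the <namespace>.<Pod name> format, as specified by the subPathExpr: $(POD_NAMESPACE).$(POD_NAME) configuration. In this example, the directory is default.java-application-76d8cd95b7-prrl2. The Pod name suffix differs in your cluster.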
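To know which directory names to look for, you can derive them from the running Pods. The following is a minimal sketch that assumes kubectl is configured for the cluster and that the Deployment runs in the current namespace. The function name expected_dump_dirs is hypothetical.

```shell
# Hypothetical helper: print the subdirectory name that subPathExpr produces
# for each Pod of the Deployment, in the <namespace>.<pod-name> format.
expected_dump_dirs() {
  kubectl get pods -l app=java-application \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"."}{.metadata.name}{"\n"}{end}'
}
```

Calling expected_dump_dirs prints one line per Pod. Each line should match a directory that is displayed in File Browser.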


Navigate to this directory and find the heap dump file java_pid1.hprof. If you want to locate the code that triggers the OOM error, download java_pid1.hprof to your on-premises machine and use Eclipse Memory Analyzer Tool (MAT) to analyze the heap dump.
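If you prefer to confirm from the command line that the heap dump exists before you download it, the following sketch lists the directory through the File Browser container, which mounts the same PVC at /rootdir. It assumes that the File Browser image provides the ls command, which is not confirmed by this topic. The function name list_heap_dumps is hypothetical.

```shell
# Hypothetical helper: list the heap dump files in one per-Pod directory.
# $1: directory name, for example default.java-application-76d8cd95b7-prrl2
# Assumption: the filebrowser container image includes the ls command.
list_heap_dumps() {
  kubectl exec deployment/filebrowser -- ls -lh "/rootdir/$1"
}
```

For example, list_heap_dumps default.java-application-76d8cd95b7-prrl2 is expected to show java_pid1.hprof if the heap dump was written.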
