Updated on 2025-04-27 GMT+08:00

Automatically Collecting Dump Files of JVMs That Exit Unexpectedly Using a General Purpose File System (SFS 3.0 Capacity-Oriented)

If you develop services in Java, the JVM may run out of memory (OOM) when its heap space is insufficient. To diagnose such failures, you can create a general purpose file system (SFS 3.0 Capacity-Oriented) and mount it to the directory in the container where the JVM writes its heap dumps. When a JVM OOM occurs, the dump file is written to the file system and remains available after the container restarts.

Prerequisites

  • A CCE standard cluster has been created. For details, see Buying a CCE Standard/Turbo Cluster.
  • Before using general purpose file systems (SFS 3.0 Capacity-Oriented) for CCE container storage, you need to configure a VPC endpoint to communicate with the general purpose file systems (SFS 3.0 Capacity-Oriented). For details, see Configure a VPC Endpoint.

Procedure

  1. Create a PVC based on a general purpose file system (SFS 3.0 Capacity-Oriented).

    cat << EOF | kubectl apply -f -
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: jvm-sfs-pvc
      namespace: default
      annotations: {}
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 10Gi
      storageClassName: csi-sfs
    EOF
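Before creating the workload, it can help to confirm that the PVC has been bound. The following is a minimal sketch, not part of the original procedure: the helper names pvc_phase and wait_for_pvc_bound are hypothetical, and it assumes kubectl is already configured for the cluster.

```shell
# Print the phase of a PVC (for example, Pending or Bound).
# $1: namespace, $2: PVC name
pvc_phase() {
  kubectl -n "$1" get pvc "$2" -o jsonpath='{.status.phase}'
}

# Poll until the PVC is Bound or the attempts run out.
# $1: namespace, $2: PVC name, $3: max attempts (default 30), $4: delay in seconds (default 2)
wait_for_pvc_bound() {
  ns=$1; name=$2; attempts=${3:-30}; delay=${4:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(pvc_phase "$ns" "$name")" = "Bound" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_for_pvc_bound default jvm-sfs-pvc returns 0 once the PVC created above reports the Bound phase, and 1 if it is still not bound after the last attempt.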

  2. Create a Deployment that simulates a Java OOM and writes the generated dump files to the PV associated with the general purpose file system (SFS 3.0 Capacity-Oriented).

    cat << EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: java-application
      namespace: default
    spec:
      selector:
        matchLabels:
          app: java-application
      template:
        metadata:
          labels:
            app: java-application
        spec:
          containers:
          - name: java-application
            image: swr.cn-east-3.myhuaweicloud.com/container/java-oom-demo:v1  #The image in this document is only an example.
            imagePullPolicy: Always
            env: 
            - name: POD_NAME     # Use metadata.name as the value of the POD_NAME environment variable.
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE     #Use metadata.namespace as the value of the POD_NAMESPACE environment variable.
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            args:
            - java                            #Run the Java command.
            - -Xms80m                         # Set the initial heap size.
            - -Xmx80m                         # Set the maximum heap size.
            - -XX:HeapDumpPath=/mnt/oom/logs  # Path where the heap dump is written when an OOM occurs.
            - -XX:+HeapDumpOnOutOfMemoryError # Generate a heap dump when an OutOfMemoryError is thrown.
            - Mycode                          # Execute the application in the example image.
            volumeMounts:
            - name: java-oom-pv
              mountPath: "/mnt/oom/logs"      # The container that is created using the example image uses /mnt/oom/logs as the log mount directory.
              subPathExpr: \$(POD_NAMESPACE).\$(POD_NAME)   # Create a subdirectory named <POD_NAMESPACE>.<POD_NAME> and generate OOM dump files in it. The backslashes keep the shell from expanding the variables inside the here-document.
          imagePullSecrets:
            - name: default-secret
          volumes:
          - name: java-oom-pv
            persistentVolumeClaim:
              claimName: jvm-sfs-pvc         #PVC using the SFS file system, named jvm-sfs-pvc
    EOF

    The parameters in this example are tailored to the example image and show how dump files are collected through a mounted SFS volume. Adjust them to match your own service image.

  3. Wait until the container restarts automatically due to the OOM. You can run the following command to check the pod status:

    kubectl -n default get pod

    Information similar to the following is displayed:

    NAME                                READY   STATUS    RESTARTS      AGE
    java-application-84dc6f897f-hc9q7   1/1     Running   1 (31s ago)   97s
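To avoid reading the table by eye, the restart check can be scripted. The following is a minimal sketch: the helper name restarted_pods is hypothetical, and it assumes the default kubectl get pod column layout shown above, where RESTARTS is the fourth column.

```shell
# Read `kubectl get pod` table output on stdin and print the names of pods
# whose RESTARTS count is greater than zero. The header row is skipped.
restarted_pods() {
  awk 'NR > 1 && $4 + 0 > 0 { print $1 }'
}
```

A possible invocation is kubectl -n default get pod | restarted_pods. An empty result means that no container has restarted yet.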

  4. Obtain the files generated by the Java program due to OOM.

    1. Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose Storage, click the PVCs tab, locate the row containing jvm-sfs-pvc, and click the name of the associated PV.
    2. After the system automatically switches to the row containing the corresponding PV, click the name of the associated storage volume.
    3. After the system automatically switches to the SFS console, copy the mounting command.

    4. Log in to a node, create a mount point, and run the mount command to mount the SFS volume to the node.
      mkdir /test-jvm
      mount -t nfs -o vers=3,timeo=600,noresvport,nolock,proto=tcp ***.com:/pvc-4ea9137e-4101-4610-a4d2-9f8bb37043a1 /test-jvm
    5. Check the files in the mounted file system. The dump file java_pid1.hprof is present in the subdirectory created for the pod. To identify the code that triggers the OOM error, download java_pid1.hprof to your local host and analyze the heap dump using Eclipse Memory Analyzer Tool (MAT).
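Locating the dump files on the node can also be scripted. The following is a minimal sketch: the helper name list_heap_dumps is hypothetical, and it assumes the file system is mounted at /test-jvm as in the preceding commands.

```shell
# Print the path of every heap dump (*.hprof) below a directory.
# $1: directory where the file system is mounted
list_heap_dumps() {
  find "$1" -type f -name '*.hprof'
}
```

For example, list_heap_dumps /test-jvm prints one path per dump file, including the subdirectory that the file was written to.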