When a Pod fails to start or runs abnormally due to a storage issue, work through the diagnostic steps below to identify the root cause. The steps follow the order in this diagram:

Jump to a specific error message:
0/x nodes are available: x node(s) had volume node affinity conflict
no persistent volumes available for this claim and no storage class is set
0/x nodes are available: x pod has unbound immediate PersistentVolumeClaims
1. Check whether the Pod abnormality is caused by a storage issue
Check the Pod and PersistentVolumeClaim (PVC) events to confirm that a storage issue is preventing the Pod from starting.
View the Pod events.
kubectl describe pod <pod-name>
If an event indicates a storage issue, continue with the steps in this topic. For example, the following FailedScheduling event reports a volume node affinity conflict:
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  4m37s  default-scheduler  0/1 nodes are available: 1 node(s) had volume node affinity conflict. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
If an event such as SuccessfulAttachVolume shows that the volume attached successfully but the Pod is not running (for example, it is in the CrashLoopBackOff state), the issue is not storage-related. Check the events for other causes, or submit a ticket.
Events:
  Type    Reason                  Age   From                     Message
  ----    ------                  ----  ----                     -------
  Normal  Scheduled               97s   default-scheduler        Successfully assigned default/disk-test-0 to cn-shanghai.192.168.5.2
  Normal  SuccessfulAttachVolume  97s   attachdetach-controller  AttachVolume.Attach succeeded for volume "d-uf6b8s2l5ypf48*****"
If there are no storage-related Pod events, check all cluster events.
kubectl get events
If an event indicates a storage issue, continue with the steps in this topic. For example, the following FailedBinding event shows that no PV is available for the claim:
LAST SEEN  TYPE    REASON                OBJECT                                                 MESSAGE
2m56s      Normal  FailedBinding         persistentvolumeclaim/data-my-release-mariadb-0        no persistent volumes available for this claim and no storage class is set
41s        Normal  ExternalProvisioning  persistentvolumeclaim/pvc-nas-dynamic-create-subpath8  waiting for a volume to be created, either by external provisioner "nasplugin.csi.alibabacloud.com" or manually created by system administrator
3m31s      Normal  Provisioning          persistentvolumeclaim/pvc-nas-dynamic-create-subpath8  External provisioner is provisioning volume for claim "default/pvc-nas-dynamic-create-subpath8"
If there are no storage-related events, check for other causes or submit a ticket for assistance.
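When the event list is long, it can help to narrow it down to entries that look storage-related before reading them. The following is a minimal sketch of such a filter. The function name storage_events and its keyword list are illustrative assumptions, and the list is not an exhaustive set of storage event reasons.

```shell
# Hypothetical helper: keep only event lines that mention common storage keywords.
# The keyword list is an assumption and is not exhaustive.
storage_events() {
  grep -E -i 'volume|PersistentVolumeClaim|FailedBinding|FailedMount|ProvisioningFailed' || true
}

# Example usage (requires cluster access):
#   kubectl describe pod <pod-name> | storage_events
#   kubectl get events | storage_events
```

The filter reads from standard input, so it works with the output of either command. Lines that it drops can still point to non-storage causes, so treat an empty result as a hint rather than proof.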
2. Check the CSI storage components
If your cluster uses FlexVolume components, migrate to Container Storage Interface (CSI) components as soon as possible because FlexVolume is deprecated. For more information, see Migrate from FlexVolume to CSI.
Check whether the CSI storage components are running.
CSI storage components include csi-plugin and csi-provisioner. By default, the managed version of csi-provisioner is installed and maintained by Alibaba Cloud. Its Pods are not visible in your cluster.
kubectl get pod -n kube-system | grep csi
All csi-plugin Pods should be in the Running state. The following is an example of healthy output:
NAME               READY   STATUS    RESTARTS   AGE
csi-plugin-bpz28   4/4     Running   0          3d
csi-plugin-h2tdg   4/4     Running   0          3d
csi-plugin-qpnm4   4/4     Running   0          3d
csi-plugin-wczgm   4/4     Running   0          3d
If a Pod is not in the Running state, run kubectl describe pods <pod-name> -n kube-system to view the Pod events and the reason why the container exited.
Check whether the CSI storage components are up to date.
kubectl get ds csi-plugin -n kube-system -o yaml | grep image
The image field shows the current version. The following is an example:
image: registry-cn-shanghai-vpc.ack.aliyuncs.com/acs/csi-plugin:v1.33.1-67e8986-aliyun
If the components are not on the latest version, upgrade the CSI components. For the latest version information, see csi-plugin. You can also go to the Add-ons page in the ACK console, find csi-plugin and csi-provisioner, and check or upgrade their versions there.
Check the YAML files of the PV, PVC, and StorageClass to confirm that the driver or provisioner field is set to use CSI storage components and is consistent with the components your cluster uses.
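The two checks on component status and version can be scripted against the command output. The following is a rough sketch. The function names are hypothetical, the first function assumes the default column order of kubectl get pod (NAME READY STATUS RESTARTS AGE), and the second assumes that images are referenced by tag rather than by digest.

```shell
# Hypothetical helper: list csi-* Pods whose STATUS column is not "Running".
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
csi_not_running() {
  awk '$1 ~ /^csi-/ && $3 != "Running" { print $1, $3 }'
}

# Hypothetical helper: print the tag (the text after the last colon) of each image line.
# Assumes images are referenced by tag, not by digest.
csi_image_tags() {
  awk -F: '/image:/ { print $NF }'
}

# Example usage (requires cluster access):
#   kubectl get pod -n kube-system | csi_not_running
#   kubectl get ds csi-plugin -n kube-system -o yaml | csi_image_tags
```

If csi_not_running prints nothing, every csi Pod in the listing reported the Running status.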
3. Check the PVC status
View the PVC status.
kubectl get pvc
If the PVC is not in the Bound state, resolve the issue based on whether it uses static or dynamic provisioning.
Static provisioning
The PVC failed to bind to a PV. Check the YAML files for configuration errors.
If the PV status is Released, the PV cannot be reused. Retrieve the storage resource information from the PV and re-create the PV.
Dynamic provisioning
An issue occurred with the csi-provisioner component. Run the following command to view the PVC events:
kubectl describe pvc <pvc-name> -n <namespace>
Handle the issue based on the event message. If there are no relevant event messages, submit a ticket for assistance.
For cloud disk persistent volumes, an ECS OpenAPI error may have occurred when creating the disk. See ECS Error Center to troubleshoot. If the issue persists, submit a ticket for assistance.
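In a cluster with many PVCs, the status check and the follow-up describe step can be combined. The sketch below turns a PVC listing into one describe command per PVC that is not Bound. The function name pvc_describe_cmds is hypothetical, and the input is assumed to come from kubectl get pvc -A --no-headers, where the first three columns are NAMESPACE, NAME, and STATUS.

```shell
# Hypothetical helper: for every PVC that is not Bound, print the describe command to run next.
# Assumes input columns NAMESPACE NAME STATUS ..., as printed by `kubectl get pvc -A --no-headers`.
pvc_describe_cmds() {
  awk '$3 != "Bound" { printf "kubectl describe pvc %s -n %s\n", $2, $1 }'
}

# Example usage (requires cluster access):
#   kubectl get pvc -A --no-headers | pvc_describe_cmds
```

The function only prints commands and does not run them, so the output can be reviewed before anything is executed.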
4. Check the Pod status
View the Pod status.
kubectl get pod
If the PVC is in the Bound state but the Pod is not in the Running state, troubleshoot based on the storage type.
Cloud disk persistent volumes
Make sure that the ECS node's instance type supports the cloud disk type you specified, and that the Pod and the cloud disk are in the same region and zone. For instance-type-to-disk-type mappings, see Instance families.
Common causes:
No eligible node is available for scheduling
An error occurred when attaching the cloud disk
The ECS instance type and cloud disk type do not match
Resolution:
To recover quickly, reschedule the Pod to another node. For more information, see Schedule applications to specified nodes.
Run the following command to view the Pod events, then handle the issue based on the event message:
kubectl describe pods <pod-name>
For event-specific guidance, see Cloud disk persistent volume FAQ. If the instance type and disk type do not match, select a compatible disk type from Instance families. For ECS OpenAPI errors, see ErrorCode.
If there are no relevant event messages, submit a ticket for assistance.
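Because the Pod and the cloud disk must be in the same zone, it is useful to know which nodes are in the disk's zone before rescheduling. The following is a minimal sketch. It assumes that nodes carry the standard topology.kubernetes.io/zone label and that the zone is printed as the last column. The function name nodes_in_zone and the zone ID in the usage line are hypothetical.

```shell
# Hypothetical helper: print the names of nodes whose last column equals the given zone.
# Assumes input from `kubectl get nodes -L topology.kubernetes.io/zone`,
# which appends the zone label as the final column after a header row.
nodes_in_zone() {
  awk -v zone="$1" 'NR > 1 && $NF == zone { print $1 }'
}

# Example usage (requires cluster access, zone ID is a placeholder):
#   kubectl get nodes -L topology.kubernetes.io/zone | nodes_in_zone cn-shanghai-b
```

An empty result suggests that no node exists in the disk's zone, which is consistent with a volume node affinity conflict during scheduling.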
NAS persistent volumes
The node and the NAS file system must be in the same VPC. If they are in different VPCs, use Cloud Enterprise Network (CEN) to connect them.
NAS supports cross-zone mounting.
The mount path for an Extreme NAS file system must start with /share.
Common causes:
fsGroup is set in the Pod's securityContext, and the recursive permission change on many files slows down the mount operation
Port 2049 is blocked in the security group, preventing the NAS file system from mounting
The NAS file system and the node are in different VPCs
Resolution:
If fsGroup is set in the Pod's securityContext, check whether it can be removed. If so, remove it, restart the Pod, and remount the volume.
If that does not apply, check whether port 2049 is blocked on the node. If it is, open port 2049 in the security group and remount. For more information, see Add a security group rule.
Confirm that the NAS file system and the node are in the same VPC. If they are not, use CEN to connect them.
For other issues, run the following command and handle the issue based on the event message:
kubectl describe pods <pod-name>
For event-specific guidance, see NAS persistent volume FAQ. If there are no relevant event messages, submit a ticket for assistance.
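The first two NAS checks can be approximated from the command line. The sketch below is illustrative only. Both function names are hypothetical, the fsGroup check is a plain text match on the Pod manifest, and the port check relies on bash's /dev/tcp redirection and the coreutils timeout command being available on the node.

```shell
# Hypothetical helper: succeed if the Pod manifest on stdin sets fsGroup.
has_fsgroup() {
  grep -q -E '^[[:space:]]*fsGroup:'
}

# Hypothetical helper: succeed if a TCP connection to port 2049 on the given host
# opens within 3 seconds. Relies on bash's /dev/tcp and the coreutils `timeout` command.
nas_port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/2049"' _ "$1"
}

# Example usage (the mount target address is a placeholder):
#   kubectl get pod <pod-name> -o yaml | has_fsgroup && echo "fsGroup is set"
#   nas_port_open <nas-mount-target> || echo "port 2049 is not reachable"
```

Run the port check from the node that hosts the Pod, because security group rules apply to the node's network path rather than to your workstation.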
OSS persistent volumes
The PV must contain AccessKey information to mount an OSS bucket. Store the credentials in a Secret and reference it from the PV.
For cross-region access, use a public endpoint. Within the same region, use an internal endpoint.
Common causes:
fsGroup is set in the Pod's securityContext, and the recursive permission change on many files slows down the mount operation
An internal endpoint is used for cross-region access, preventing the connection to the bucket
Resolution:
If fsGroup is set in the Pod's securityContext, check whether it can be removed. If so, remove it, restart the Pod, and remount the volume.
If you are accessing the bucket across regions, check that the endpoint is a public endpoint rather than an internal endpoint.
For other issues, run the following command and handle the issue based on the event message:
kubectl describe pods <pod-name>
For event-specific guidance, see ossfs 1.0 persistent volume FAQ or ossfs 2.0 persistent volume FAQ. If there are no relevant event messages, submit a ticket for assistance.
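To check which kind of endpoint a PV uses, the endpoint string can be classified by name. The following is a minimal sketch that assumes internal OSS endpoints follow the -internal.aliyuncs.com naming pattern. The function name oss_endpoint_kind is hypothetical, and the region in the usage line is only a sample value.

```shell
# Hypothetical helper: classify an OSS endpoint as "internal" or "public" by its host name.
# Assumes internal endpoints contain "-internal.aliyuncs.com"; anything else is treated as public.
oss_endpoint_kind() {
  case "$1" in
    *-internal.aliyuncs.com*) echo internal ;;
    *) echo public ;;
  esac
}

# Example usage (sample endpoint value):
#   oss_endpoint_kind oss-cn-hangzhou-internal.aliyuncs.com
```

If the result is internal and the bucket is in a different region from the cluster, switch the PV to the public endpoint.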
FAQ
When I create or mount a persistent volume, the PVC shows "no volume plugin matched"
Symptom
A PVC event shows the following message:
Unable to attach or mount volumes: unmounted volumes=[xxx], unattached volumes=[xxx]: failed to get Plugin from volumeSpec for volume "xxx" err=no volume plugin matched
Cause
The storage component specified in the YAML template is not installed in the cluster.
Solution
Check whether the storage component exists in the cluster.
If it is not installed, install it. For more information, see Manage components.
If it is installed, make sure the component type matches the YAML templates for the PV and PVC:
CSI storage components: deploy the PV and PVC by following the CSI documentation.
FlexVolume storage components: deploy the PV and PVC by following the FlexVolume documentation.
Important: FlexVolume is deprecated. Migrate to CSI components as soon as possible. For more information, see Migrate from FlexVolume to CSI.
Pod event shows "0/x nodes are available: x pod has unbound immediate PersistentVolumeClaims"
Symptom
The Pod fails to start with this event:
0/x nodes are available: x pod has unbound immediate PersistentVolumeClaims. preemption: 0/x nodes are available: x Preemption is not helpful for scheduling
Cause
The StorageClass referenced by the Pod does not exist in the cluster.
Solution
Check whether the StorageClass referenced by the Pod exists. If it does not, create it.
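One way to perform this check from the command line is to compare the StorageClass name in the PVC against the list of StorageClasses in the cluster. The sketch below assumes input from kubectl get storageclass -o name, which prints one storageclass.storage.k8s.io/<name> entry per line. The function name sc_exists is hypothetical.

```shell
# Hypothetical helper: succeed if the named StorageClass appears in the listing on stdin.
# Assumes input from `kubectl get storageclass -o name`.
sc_exists() {
  grep -q -x -F "storageclass.storage.k8s.io/$1"
}

# Example usage (requires cluster access):
#   sc=$(kubectl get pvc <pvc-name> -o jsonpath='{.spec.storageClassName}')
#   kubectl get storageclass -o name | sc_exists "$sc" || echo "StorageClass $sc not found"
```

The match is on the whole line, so a StorageClass whose name merely contains the requested name does not count as a match.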
A PV is in the Released state and cannot be bound by re-creating a PVC
Symptom
After a PVC is accidentally deleted, the PV enters the Released state and cannot be bound by re-creating the PVC.
Cause
When the reclaimPolicy of a PV is Retain, the PV enters the Released state when its bound PVC is deleted.
Solution
Delete the pv.spec.claimRef field from the PV, then re-bind it using the static volume method. The PV enters the Bound state.
A PV is in the Lost state and cannot be bound
Symptom
After a PVC and a PV are created, the PV enters the Lost state and cannot bind to the PVC.
Cause
The PVC name referenced in the PV's claimRef field does not exist.
Solution
Delete the pv.spec.claimRef field from the PV, then re-bind it using the static volume method. The PV enters the Bound state.
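The solutions for the Released and Lost states both remove pv.spec.claimRef. One possible way to do this is with a JSON patch. The following sketch only prints the command so that you can review it before running it. The function name claimref_patch_cmd is hypothetical, and the approach is an assumption about how you might make the edit, not the only supported method (editing the PV with kubectl edit is an alternative).

```shell
# Hypothetical helper: print (do not run) a JSON patch command that removes spec.claimRef from a PV.
claimref_patch_cmd() {
  printf "kubectl patch pv %s --type json -p '%s'\n" "$1" '[{"op":"remove","path":"/spec/claimRef"}]'
}

# Example usage:
#   claimref_patch_cmd <pv-name>
```

After the field is removed, create a PVC that references the PV as described for static volumes.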
Do StorageClass changes affect existing storage?
No. If the YAML files for the PVC and PV are not changed, changes to the StorageClass do not affect existing storage. For example, modifying allowVolumeExpansion in a StorageClass takes effect only after you modify the capacity of the PVC. If the PVC's YAML is unchanged, the existing configuration is not affected.