
Container Service for Kubernetes: FAQ about CSI

Last Updated: May 26, 2023

This topic describes how to troubleshoot common issues related to storage and provides answers to some frequently asked questions about disk volumes and Apsara File Storage NAS (NAS) volumes.

This FAQ covers the following types of issues:

  • Common issues
  • FAQ about disk volumes: FAQ about creating disks, FAQ about mounting disks, FAQ about unmounting disks, and FAQ about resizing disks (for example, why does the system fail to dynamically expand a disk and generate the "Waiting for user to (re-)start a pod to finish file system resize of volume on node" PVC event?)
  • FAQ about NAS volumes
  • FAQ about OSS volumes
  • FAQ about volume plug-ins
  • FAQ about cloud-native storage
  • FAQ about migrating from FlexVolume to CSI
  • Other storage issues

Common issues

Perform the following operations to view the log of a specified volume plug-in. This helps you locate issues. A consolidated diagnostic script is provided after this list.

  1. Run the following command to check whether events related to persistent volume claims (PVCs) or pods are generated:

    kubectl get events

    Expected output:

    LAST SEEN   TYPE      REASON                 OBJECT                                                  MESSAGE
    2m56s       Normal    FailedBinding          persistentvolumeclaim/data-my-release-mariadb-0         no persistent volumes available for this claim and no storage class is set
    41s         Normal    ExternalProvisioning   persistentvolumeclaim/pvc-nas-dynamic-create-subpath8   waiting for a volume to be created, either by external provisioner "nasplugin.csi.alibabacloud.com" or manually created by system administrator
    3m31s       Normal    Provisioning           persistentvolumeclaim/pvc-nas-dynamic-create-subpath8   External provisioner is provisioning volume for claim "default/pvc-nas-dynamic-create-subpath8"
  2. Check whether the FlexVolume or CSI plug-in is deployed in the cluster.

    • Run the following command to check whether the FlexVolume plug-in is deployed in the cluster:

      kubectl get pod -n kube-system | grep flexvolume

      Expected output:

      NAME                      READY   STATUS             RESTARTS   AGE
      flexvolume-***            4/4     Running            0          23d
    • Run the following command to check whether the CSI plug-in is deployed in the cluster:

      kubectl get pod -n kube-system | grep csi

      Expected output:

      NAME                       READY   STATUS             RESTARTS   AGE
      csi-plugin-***             4/4     Running            0          23d
      csi-provisioner-***        7/7     Running            0          14d
  3. Check whether the volume template matches the volume plug-in that is used in the cluster. The supported volume plug-ins are FlexVolume and CSI.

    If this is the first time you mount volumes in the cluster, check whether the driver specified in the persistent volume (PV) and StorageClass is a CSI driver or a FlexVolume driver. The name of the driver that you specified must be the same as the type of the volume plug-in that is deployed in the cluster.

  4. Check whether the volume plug-in is updated to the latest version.

    • Run the following command to query the image version of the FlexVolume plug-in:

      kubectl get ds flexvolume -n kube-system -o yaml | grep image

      Expected output:

      image: registry.cn-hangzhou.aliyuncs.com/acs/flexvolume:v1.14.8.109-649dc5a-aliyun

      For more information about the FlexVolume plug-in, see FlexVolume.

    • Run the following command to query the image version of the CSI plug-in:

      kubectl get ds csi-plugin -n kube-system -o yaml | grep image

      Expected output:

      image: registry.cn-hangzhou.aliyuncs.com/acs/csi-plugin:v1.18.8.45-1c5d2cd1-aliyun

      For more information about the CSI plug-in, see csi-plugin and csi-provisioner.

  5. View logs.

    • If a PVC of disk type is in the Pending state, the related PV is not created. You must check the log of the Provisioner plug-in.

      • If the FlexVolume plug-in is deployed in the cluster, run the following command to print the log of alicloud-disk-controller:

        podid=$(kubectl get pod -n kube-system | grep alicloud-disk-controller | awk '{print $1}')
        kubectl logs $podid -n kube-system
      • If the CSI plug-in is deployed in the cluster, run the following command to print the log of csi-provisioner:

        for podid in $(kubectl get pod -n kube-system | grep csi-provisioner | awk '{print $1}'); do
          kubectl logs $podid -n kube-system -c csi-provisioner
        done
        Note

        Two pods are created to run csi-provisioner. The kubectl get pod command therefore returns two pod names, and the loop above prints the log of the csi-provisioner container in each pod.

    • If a mounting error occurs when the system starts a pod, you must check the log of FlexVolume or csi-plugin.

      • If the FlexVolume plug-in is deployed in the cluster, run the following command to print the log of FlexVolume:

        kubectl get pod <pod-name> -o wide

        Log on to the Elastic Compute Service (ECS) instance where the pod runs and check the FlexVolume log files at /var/log/alicloud/flexvolume_**.log.

      • If the CSI plug-in is deployed in the cluster, run the following command to print the log of csi-plugin:

        nodeID=$(kubectl get pod <pod-name> -o wide | awk 'NR>1 {print $7}')
        podID=$(kubectl get pods -n kube-system -o wide -l app=csi-plugin | grep $nodeID | awk '{print $1}')
        kubectl logs $podID -n kube-system
    • View the log of kubelet.

      Run the following command to query the node on which the pod runs:

      kubectl get pod <pod-name> -o wide | awk 'NR>1 {print $7}'

      Log on to the node and check the /var/log/messages log file.
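
The preceding checks can be combined into a short diagnostic script. The following is a minimal sketch that assumes a CSI-based cluster and the default plug-in names in the kube-system namespace; adjust the grep patterns if your cluster uses FlexVolume.

    #!/bin/bash
    # Show recent events that may explain PVC or pod failures.
    kubectl get events

    # Identify the deployed volume plug-in pods and the csi-plugin image version.
    kubectl get pod -n kube-system | grep -E 'flexvolume|csi-plugin|csi-provisioner'
    kubectl get ds csi-plugin -n kube-system -o yaml | grep image

    # Print the log of the csi-provisioner container in each csi-provisioner pod.
    for podid in $(kubectl get pod -n kube-system | grep csi-provisioner | awk '{print $1}'); do
      kubectl logs $podid -n kube-system -c csi-provisioner
    done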

Quick recovery

If you fail to mount volumes to most of the pods on a node, you can schedule the pods to other nodes. For more information, see Schedule pods to specific nodes.
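
For example, you can cordon the faulty node so that no new pods are scheduled to it, and then delete the affected pods so that their controllers recreate them on other nodes. The following is a minimal sketch; <node-name> and <pod-name> are placeholders.

    # Prevent new pods from being scheduled to the faulty node.
    kubectl cordon <node-name>

    # Delete an affected pod so that its controller recreates it on another node.
    kubectl delete pod <pod-name>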

csi-plugin update failures

csi-plugin is deployed as a DaemonSet. If the cluster contains nodes that are in the NotReady state or csi-plugin pods that are in a state other than Running, ACK fails to update csi-plugin. You need to manually fix these nodes and pods and perform the update again. For more information, see Install and update the CSI plug-in.
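
The following sketch shows a quick way to find the nodes and pods that block the update; it assumes the default csi-plugin pod names.

    # List nodes whose status is not Ready.
    kubectl get nodes --no-headers | awk '$2 != "Ready"'

    # List csi-plugin pods that are not in the Running state.
    kubectl get pod -n kube-system -o wide | grep csi-plugin | grep -v Running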

What do I do if the csi-provisioner update fails because the number of nodes in the cluster does not meet the requirements of the update precheck?

Issue

The csi-provisioner plug-in fails to pass the precheck because the number of nodes in the cluster does not meet the requirement.

Cause

To ensure the high availability of csi-provisioner, csi-provisioner runs in a primary pod and a secondary pod. The primary and secondary pods are scheduled to different nodes. If your cluster has only one node, you cannot update csi-provisioner.

Solution

Create a file named csi-provisioner.yaml to manually update csi-provisioner, as sketched below.
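
The following sketch assumes that csi-provisioner runs as a Deployment named csi-provisioner in the kube-system namespace and uses the current configuration as the starting point for the manual update.

    # Export the current Deployment into csi-provisioner.yaml.
    kubectl get deployment csi-provisioner -n kube-system -o yaml > csi-provisioner.yaml

    # Edit csi-provisioner.yaml as needed. For example, update the image tag, or
    # relax the pod anti-affinity on a single-node cluster. Then apply the file.
    kubectl apply -f csi-provisioner.yaml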

What do I do if the csi-provisioner update fails due to StorageClasses attribute changes?

Issue

csi-provisioner fails to pass the precheck because the attributes of StorageClasses do not meet the requirements.

Cause

The attributes of the default StorageClasses are modified. You have deleted and recreated StorageClasses that have the same names as the default StorageClasses. The attributes of the default StorageClasses must not be changed. Otherwise, csi-provisioner may fail to be updated.

Solution

Delete the following default StorageClasses: alicloud-disk-essd, alicloud-disk-available, alicloud-disk-efficiency, alicloud-disk-ssd, and alicloud-disk-topology. The deletion operation does not affect the applications that are running in the cluster. Then, reinstall csi-provisioner. After csi-provisioner is installed, the preceding default StorageClasses are automatically recreated.
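
For example, the default StorageClasses can be deleted in one command:

    # Delete the modified default StorageClasses. Running applications are not affected.
    kubectl delete storageclass alicloud-disk-essd alicloud-disk-available alicloud-disk-efficiency alicloud-disk-ssd alicloud-disk-topology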

Important

If you want to create custom StorageClasses, use names that are different from the names of the preceding default StorageClasses.

What do I do if the "failed to renew lease xxx timed out waiting for the condition" error is displayed in the log of csi-provisioner?

Issue

After you run the kubectl logs csi-provisioner-xxxx -n kube-system command to query the log of csi-provisioner, the failed to renew lease xxx timed out waiting for the condition error appears in the log.

Cause

Multiple replicated pods are provisioned for csi-provisioner to implement high availability. Kubernetes uses Leases to perform leader election among the replicated pods of a component. During the election, csi-provisioner accesses the Kubernetes API server of the cluster to request a specified Lease. The replicated pod that acquires the Lease becomes the leader that provides services in the cluster. This issue occurs because csi-provisioner cannot access the Kubernetes API server of the cluster.

Solution

Check whether the cluster network and the Kubernetes API server of the cluster are working as expected. If the issue persists, submit a ticket.
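
To see which replicated pod currently holds the Lease, you can inspect the Lease objects. The following is a sketch that assumes the Leases are stored in the kube-system namespace; <lease-name> is a placeholder.

    # List the Leases used for leader election and check their current holders.
    kubectl get lease -n kube-system | grep csi
    kubectl describe lease <lease-name> -n kube-system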

OOM issues caused by volume plug-ins

csi-provisioner is a centralized volume plug-in. Its sidecar containers cache information about pods, PVs, and PVCs. As the cluster grows, these caches grow with it, and out of memory (OOM) errors may occur. If an OOM error occurs, modify the resource limits based on the size of the cluster, as described in the following steps and in the command-line sketch after them.

  1. Log on to the ACK console and click Clusters in the left-side navigation pane.

  2. On the Clusters page, click the name of a cluster and choose Operations > Add-ons in the left-side navigation pane.

  3. On the Add-ons page, click the icon in the lower-right part of the csi-provisioner card and click View in YAML.

  4. Modify the resource limits in the YAML file based on the size of the cluster.

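Alternatively, the limits can be modified from the command line. The following is a sketch with illustrative values; the Deployment name, container name, and suitable limits depend on your cluster.

    # Raise the memory limit of the csi-provisioner container to 4 GiB.
    kubectl -n kube-system set resources deployment csi-provisioner -c csi-provisioner --limits=memory=4Gi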

Why does the system prompt no volume plugin matched for the PVC when I create or mount a volume?

Issue

The system prompts Unable to attach or mount volumes: unmounted volumes=[xxx], unattached volumes=[xxx]: failed to get Plugin from volumeSpec for volume "xxx" err=no volume plugin matched for the PVC when you create or mount a volume.

Cause

The volume template (the PV, PVC, or StorageClass) does not match the volume plug-in that is deployed in the cluster. As a result, the system cannot find a matching volume plug-in when it creates or mounts the volume.

Solution

Check whether the volume plug-in exists in the cluster. A verification sketch is provided after the following list.

  • If the volume plug-in is not installed, install the plug-in. For more information, see Manage components.

  • If the volume plug-in is already installed, check whether the YAML templates of the PV and PVC meet the following requirements:

    • The CSI plug-in is deployed by following the steps as required. For more information, see CSI overview.

    • The FlexVolume plug-in is deployed by following the steps as required. For more information, see FlexVolume overview.

      Important

      FlexVolume is deprecated. If the version of your ACK cluster is earlier than 1.18, we recommend that you migrate from FlexVolume to CSI. For more information, see Migrate from FlexVolume to CSI.
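
The following sketch shows how to verify that a matching CSI driver is registered and which driver a PV references; <pv-name> is a placeholder.

    # List the CSI drivers that are registered in the cluster.
    kubectl get csidriver

    # Print the driver that the PV references. The driver must match the volume
    # plug-in that is deployed in the cluster.
    kubectl get pv <pv-name> -o jsonpath='{.spec.csi.driver}'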

What do I do if a large volume of traffic is recorded in the monitoring data of the csi-plugin pod?

Issue

A large volume of traffic is recorded in the monitoring data of the csi-plugin pod.

Cause

csi-plugin is responsible for mounting NAS volumes on nodes. If a NAS volume is mounted to a pod on a node, requests from the pod to the NAS volume pass through the network namespace in which csi-plugin runs and are counted by cluster monitoring. As a result, a large volume of traffic is recorded in the monitoring data of the csi-plugin pod.

Solution

You do not need to fix this issue. The traffic that flows through csi-plugin is not actually doubled and does not consume additional network bandwidth.

Why does the system generate the 0/x nodes are available: x pod has unbound immediate PersistentVolumeClaims event for a pod?

Issue

The system generates the 0/x nodes are available: x pod has unbound immediate PersistentVolumeClaims. preemption: 0/x nodes are available: x Preemption is not helpful for scheduling event for the pod.

Cause

The custom StorageClass that is referenced by the PVC of the pod does not exist. As a result, the PVC cannot be bound and the pod cannot be scheduled.

Solution

If the pod uses a dynamically provisioned volume, find the custom StorageClass that is referenced by the pod. If the StorageClass does not exist, create one.
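
For example, you can check which StorageClass the PVC references and whether that StorageClass exists; <pvc-name> is a placeholder.

    # Print the StorageClass that the PVC references.
    kubectl get pvc <pvc-name> -o jsonpath='{.spec.storageClassName}'

    # Check whether that StorageClass exists in the cluster.
    kubectl get storageclass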

What do I do if the PV is in the Released state and cannot be bound to the recreated PVC?

Issue

You accidentally deleted the PVC. The PV is in the Released state and cannot be bound to the PVC that you recreated.

Cause

If the reclaim policy (persistentVolumeReclaimPolicy) of the PV is Retain, the status of the PV changes to Released after you delete the PVC, and the PV cannot be automatically bound to a new PVC.

Solution

You need to delete the pv.spec.claimRef field of the PV and then bind the PV to the PVC as a statically provisioned volume. This way, the status of the PV changes to Bound.
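
For example, the claimRef field can be removed with a JSON patch; <pv-name> is a placeholder.

    # Remove spec.claimRef so that the PV becomes Available and can be bound again.
    kubectl patch pv <pv-name> --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'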

FAQ about migrating from FlexVolume to CSI

In earlier ACK versions, FlexVolume was used as the volume plug-in. FlexVolume is deprecated in later versions. If the version of your ACK cluster is earlier than 1.18, we recommend that you migrate from FlexVolume to CSI. For more information, see Migrate from FlexVolume to CSI.

Other storage issues

To avoid storage issues such as spelling errors in mountOptions, a PVC that references a StorageClass that does not exist, or a mount target domain name that does not exist, we recommend that you use Container Network File System (CNFS) volumes. For more information about CNFS, see CNFS overview.