This topic provides answers to some frequently asked questions about File Storage NAS (NAS) volumes.
Why does the system prompt chown: Operation not permitted when I mount a NAS volume?
Issue
The system prompts chown: Operation not permitted when I mount a NAS file system.
Cause
The user that runs the container process does not have the permissions required to manage the NAS volume.
Solution
The user that launches the container process does not have root permissions. You need to use a root account to perform chown and chgrp operations. When the accessModes of a persistent volume (PV) is set to ReadWriteOnce, you can also use securityContext.fsGroup to configure volume permissions and an ownership change policy for the pods. For more information, see Configure volume permission and ownership change policy for Pods.
If the issue persists after you use a root account, check whether the permission group of the NAS mount target sets user permissions to No Anonymity (no_squash). This configuration allows root users to access the file system. For more information, see Manage permission groups.
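For reference, the following is a minimal sketch of a pod that uses securityContext.fsGroup. The PVC name nas-pvc and the group ID 1000 are hypothetical example values. As noted above, this approach takes effect only when the accessModes of the PV is set to ReadWriteOnce.
apiVersion: v1
kind: Pod
metadata:
  name: nas-fsgroup-example
spec:
  securityContext:
    fsGroup: 1000              # Example GID; files on the volume are made accessible to this group.
  containers:
  - name: app
    image: nginx               # Example image.
    volumeMounts:
    - mountPath: /data
      name: nas-volume
  volumes:
  - name: nas-volume
    persistentVolumeClaim:
      claimName: nas-pvc       # Hypothetical PVC name; replace it with the PVC bound to your NAS PV.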
What do I do if the task queue of alicloud-nas-controller is full and PVs cannot be created when I use a dynamically provisioned NAS volume?
Issue
When you use a dynamically provisioned NAS volume, if subdirectories are created faster than they are deleted, the task queue of alicloud-nas-controller may become full and PVs cannot be created.
Cause
The reclaimPolicy parameter is set to Delete and the archiveOnDelete parameter is set to false in the configuration of the StorageClass that mounts the dynamically provisioned NAS volume.
Solution
Set archiveOnDelete to true. This way, when a PV is deleted, only the name of the mounted subdirectory in the NAS file system is modified. The files in the subdirectory are not deleted.
You must delete these files yourself. For example, you can configure a scheduled task on a node to automatically delete the archived files in the root directory, or start multiple pods to concurrently delete files of specific formats in the subdirectories.
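For example, the following StorageClass sketch enables archiving on deletion. The class name is an example and the server address is a placeholder for your own NAS mount target.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-nas-archive   # Example name.
mountOptions:
  - nolock,tcp,noresvport
  - vers=3
parameters:
  volumeAs: subpath
  server: "xxxx.cn-hangzhou.nas.aliyuncs.com:/k8s/"   # Placeholder mount target.
  archiveOnDelete: "true"      # Rename the subdirectory when the PV is deleted instead of deleting its files.
provisioner: nasplugin.csi.alibabacloud.com
reclaimPolicy: Delete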
Why does it take a long time to mount a NAS volume?
Issue
It takes a long time to mount a NAS volume.
Cause
If both of the following conditions are met, the chmod or chown operation is performed when the volume is mounted, which increases the mount time:
The AccessModes parameter is set to ReadWriteOnce in the persistent volume (PV) and persistent volume claim (PVC) templates.
The securityContext.fsGroup parameter is set in the application template.
Solution
If the securityContext.fsGroup parameter is set in the application template, delete the fsGroup parameter in the securityContext section.
If you want to configure the user ID (UID) and mode of the files in the mounted directory, you can manually mount the directory to an Elastic Compute Service (ECS) instance, run the chown and chmod commands through a CLI, and then provision the NAS volume through the CSI plug-in. For more information about how to use the CSI plug-in to mount NAS volumes, see Mount a statically provisioned NAS volume or Mount a dynamically provisioned NAS volume.
Apart from the preceding methods, for clusters that run Kubernetes 1.20 or later, you can set the fsGroupChangePolicy parameter to OnRootMismatch. This way, the chmod or chown operation is performed only when the system launches the pod for the first time. The issue does not occur when you mount NAS volumes after the first launch is complete. For more information about fsGroupChangePolicy, see Set the security context for a pod or a container.
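For reference, a minimal pod sketch that sets fsGroupChangePolicy. The GID 1000 and the PVC name nas-pvc are hypothetical example values; the field requires Kubernetes 1.20 or later.
apiVersion: v1
kind: Pod
metadata:
  name: nas-fsgroup-policy-example
spec:
  securityContext:
    fsGroup: 1000                          # Example GID.
    fsGroupChangePolicy: "OnRootMismatch"  # Skip chown/chmod if the volume root already matches.
  containers:
  - name: app
    image: nginx                           # Example image.
    volumeMounts:
    - mountPath: /data
      name: nas-volume
  volumes:
  - name: nas-volume
    persistentVolumeClaim:
      claimName: nas-pvc                   # Hypothetical PVC name.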
What do I do if I cannot create or modify directories in a NAS volume?
Issue
You cannot create or modify directories in a NAS volume.
Cause
Non-root users do not have the permissions to create or modify directories in a PV.
Solution
You can use one of the following methods to run the chmod or chown command to modify the permissions on the mount directory:
Launch an init container with root privileges, mount the PV, and then run the chmod or chown command to modify the permissions on the mount directory. Then, you can create and modify directories in the PV. A minimal sketch of this method is shown after this list.
Set the fsGroupChangePolicy parameter to OnRootMismatch. This way, when the system launches the pod for the first time, the system runs the chmod or chown command to modify the permissions on the mount directory. Then, you can create and modify directories in the PV.
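The following is a minimal sketch of the init container method. The UID and GID 1000, the mount path /data, and the PVC name nas-pvc are hypothetical example values.
apiVersion: v1
kind: Pod
metadata:
  name: nas-init-chown-example
spec:
  initContainers:
  - name: fix-permissions
    image: busybox               # Runs as root by default.
    # Change the owner and mode of the mount directory before the application starts.
    command: ["sh", "-c", "chown -R 1000:1000 /data && chmod -R 775 /data"]
    volumeMounts:
    - mountPath: /data
      name: nas-volume
  containers:
  - name: app
    image: nginx
    securityContext:
      runAsUser: 1000            # Example non-root user that can now write to the volume.
    volumeMounts:
    - mountPath: /data
      name: nas-volume
  volumes:
  - name: nas-volume
    persistentVolumeClaim:
      claimName: nas-pvc         # Hypothetical PVC name.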
Why does the system prompt unknown filesystem type "xxx" when I mount a NAS volume?
Issue
The system prompts unknown filesystem type "xxx" when you mount a NAS volume.
Cause
The dependencies of the NAS volume are not installed on the node on which the application is deployed.
Solution
Check whether the NAS volume is correctly configured.
For NAS volumes, see Use CNFS to manage shared NAS volumes (recommended).
For OSS volumes, see Mount a statically provisioned OSS volume.
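As a quick check, you can also log on to the node and verify that the NFS mount helper exists. This sketch assumes a yum-based distribution such as Alibaba Cloud Linux or CentOS; package names differ on other distributions.
# Check whether the NFS client is installed on the node; install it if it is missing.
rpm -q nfs-utils || sudo yum install -y nfs-utils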
Why does the system prompt NFS Stale File Handle when a client reads data from or writes data to a NAS volume?
Issue
The system prompts NFS Stale File Handle when a client reads data from or writes data to a NAS volume.
Cause
NAS does not ensure data consistency after you mount a NAS volume to a container. Assume that a NAS volume is mounted to two clients. Client 1 opens a file in the NAS volume to obtain the file descriptor (FD) of the file. If Client 2 deletes the file, Client 1 cannot read or write the file and the system prompts NFS Stale File Handle.
Solution
You need to resolve data consistency issues based on your business scenarios.
What do I do if a pod that uses two PVCs to mount two different NAS volumes remains in the ContainerCreating state?
Issue
A pod that uses two PVCs to mount two different NAS volumes fails to launch and remains in the ContainerCreating state. The pod can be launched if you configure it to use only one of the PVCs.
Cause
The two PVs associated with the two PVCs use the same spec.csi.volumeHandle value. As a result, the kubelet cannot differentiate the two PVs when it processes the PV mounting logic.
Solution
Set the spec.csi.volumeHandle parameter of each PV to the actual name of the PV.
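For example, for two statically provisioned PVs named pv-nas-1 and pv-nas-2 (hypothetical names), each PV declares its own name as the volumeHandle. A sketch of the first PV, with a placeholder mount target:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nas-1
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: nasplugin.csi.alibabacloud.com
    volumeHandle: pv-nas-1     # Must match the PV name and be unique; pv-nas-2 declares volumeHandle: pv-nas-2.
    volumeAttributes:
      server: "xxxx.cn-hangzhou.nas.aliyuncs.com"   # Placeholder mount target.
      path: "/share1"          # Example subdirectory.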
How do I use CSI to mount a NAS file system that has TLS enabled?
NAS uses TLS to ensure the security of data transmission between the NAS client and NAS service and prevent data theft and tampering. CSI allows you to use the NAS client (aliNAS) of Alibaba Cloud to mount volumes and enable TLS.
Precautions
The NAS client uses the stunnel process as a TLS encryption wrapper. For high-throughput applications, the stunnel process consumes a large amount of CPU resources to perform encryption and decryption. In extreme cases, each mount operation occupies a core. For more information, see Encryption in transit for NFS file systems.
Procedure
Install the NAS client.
Modify the ConfigMap of the csi-plugin component and restart csi-plugin:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: csi-plugin
  namespace: kube-system
data:
  cnfs-client-properties: |
    alinas-utils=true
EOF
kubectl rollout restart ds -n kube-system csi-plugin
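You can wait until the restart is complete before you continue:
kubectl rollout status ds/csi-plugin -n kube-system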
Mount a statically provisioned NAS volume or a dynamically provisioned NAS volume based on the following sample code:
Mount a dynamically provisioned NAS volume
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-nas-tls
mountOptions:
  - nolock,tcp,noresvport
  - vers=3
  - tls                    # Add a TLS mount option.
parameters:
  volumeAs: subpath
  server: "0cd8b4a576-g****.cn-hangzhou.nas.aliyuncs.com:/k8s/"
  mountProtocol: alinas    # Declare that the aliNAS client is used to mount the volume.
provisioner: nasplugin.csi.alibabacloud.com
reclaimPolicy: Retain
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nas-tls
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: alicloud-nas-tls
  resources:
    requests:
      storage: 20Gi
Parameter descriptions:
parameters.mountProtocol: Set the parameter to alinas to use the aliNAS client to mount the volume. By default, this parameter is left empty and NFS is used to mount the volume.
mountOptions: Add the tls option to enable TLS. Add this option only when mountProtocol is set to alinas. By default, TLS is disabled.
Mount a statically provisioned NAS volume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nas-tls
  labels:
    alicloud-pvname: pv-nas-tls
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: nasplugin.csi.alibabacloud.com
    volumeHandle: pv-nas-tls   # Enter the name of the PV.
    volumeAttributes:
      server: "2564f4****-ysu87.cn-shenzhen.nas.aliyuncs.com"
      path: "/csi"
      mountProtocol: alinas    # Declare that the aliNAS client is used to mount the volume.
  mountOptions:
    - nolock,tcp,noresvport
    - vers=3
    - tls                      # Add a TLS mount option.
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-nas-tls
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      alicloud-pvname: pv-nas-tls
Parameter descriptions:
spec.csi.volumeAttributes.mountProtocol: Set the parameter to alinas to use the aliNAS client to mount the volume. By default, this parameter is left empty and NFS is used to mount the volume.
spec.mountOptions: Add the tls option to enable TLS. Add this option only when mountProtocol is set to alinas. By default, TLS is disabled.
What do I do if the unmounting of the NAS volume times out and the pod remains in the Terminating state?
Issue
When you delete a pod with a NAS volume mounted, the NAS volume cannot be unmounted, and the pod remains in the Terminating state.
Cause
The csi-plugin component directly mounts the /var/run directory. To confirm this, run the following command; if the output is not empty, the issue exists.
kubectl get ds -n kube-system csi-plugin -ojsonpath='{.spec.template.spec.volumes[?(@.hostPath.path=="/var/run/")]}'
Solution
Run the following command to patch the csi-plugin DaemonSet and resolve the issue:
kubectl patch -n kube-system daemonset csi-plugin -p '
spec:
template:
spec:
containers:
- name: csi-plugin
volumeMounts:
- mountPath: /host/var/run/efc
name: efc-metrics-dir
- mountPath: /host/var/run/ossfs
name: ossfs-metrics-dir
- mountPath: /host/var/run/
$patch: delete
volumes:
- name: ossfs-metrics-dir
hostPath:
path: /var/run/ossfs
type: DirectoryOrCreate
- name: efc-metrics-dir
hostPath:
path: /var/run/efc
type: DirectoryOrCreate
- name: fuse-metrics-dir
$patch: delete'
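After the patch is applied, you can verify the fix: wait for the rollout to complete and rerun the command from the Cause section, which should now return an empty output.
kubectl rollout status ds/csi-plugin -n kube-system
kubectl get ds -n kube-system csi-plugin -ojsonpath='{.spec.template.spec.volumes[?(@.hostPath.path=="/var/run/")]}'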