OSS persistent volumes (PVs) are Filesystem in Userspace (FUSE) file systems mounted using ossfs. You can troubleshoot ossfs issues by analyzing debug logs or retrieving pod logs. This topic describes common ossfs issues and provides general troubleshooting methods with examples.
Troubleshooting instructions
| CSI plugin version | ossfs runtime mode | Troubleshooting method |
| --- | --- | --- |
| Earlier than v1.28 | ossfs runs as a background process on the node where the application pod is located. If an issue occurs, you must remount ossfs in the foreground. | Analyze debug logs |
| v1.28 to v1.30.4 | ossfs runs as a container in a pod within the `kube-system` namespace. | Query pod logs |
| v1.30.4 and later | ossfs runs as a container in a pod within the `ack-csi-fuse` namespace. | Query pod logs |
To prevent ossfs from generating excessive logs, the default log level when ossfs runs as a container is critical or error. During debugging, you may need to add debug parameters and remount the volume.
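To choose the right troubleshooting path, first confirm the installed CSI plugin version. A minimal sketch, assuming the plugin runs as a DaemonSet named `csi-plugin` in the `kube-system` namespace (the default in ACK):

```bash
# Print the image tags of the csi-plugin DaemonSet; the tag encodes the version, for example v1.30.4-xxx.
kubectl -n kube-system get ds csi-plugin \
  -o jsonpath='{.spec.template.spec.containers[*].image}'; echo
```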
Scenario 1: Mount failure
If a statically provisioned OSS volume fails to mount, the pod cannot start, and the event shows FailedMount, first perform a quick check by referring to An OSS volume fails to mount and the application pod event shows FailedMount.
CSI plugin version 1.26.6 and later
Symptoms
When the application pod starts, it remains in the ContainerCreating state for a long time.
Cause
First, confirm whether the ossfs container is running as expected. If it is, check whether the application pod is in the ContainerCreating state for other reasons, such as a node issue.
Note:
- In the command, `<NODE_NAME>` is the name of the node where the affected application pod is located.
- In the command, `<VOLUME_ID>` is typically the name of the application's PV. You must retrieve this value differently in the following two scenarios (a combined helper sketch follows this note):
  - If the PV name is different from the value of the PV's `volumeHandle` field, use the `volumeHandle` field. Run the following command to retrieve the `volumeHandle` value for a PV:

    ```bash
    kubectl get pv <PV_NAME> -o jsonpath='{.spec.csi.volumeHandle}'
    ```

  - If the PV name is too long, `<VOLUME_ID>` is formed by concatenating "h1." with its SHA-1 value. Run the following command to retrieve this value:

    ```bash
    echo -n "<PV_NAME>" | sha1sum | sed 's/^/h1./'
    ```

    If your system does not support the `sha1sum` operation, you can also use the OpenSSL library to retrieve the value:

    ```bash
    echo -n "<PV_NAME>" | openssl sha1 -binary | xxd -p -c 40 | sed 's/^/h1./'
    ```
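For convenience, the two rules above can be combined into one hypothetical helper script. This is a best-effort sketch: the exact length threshold for "too long" is not stated in this topic, so the 63-character limit (the Kubernetes label-value maximum) used below is an assumption.

```bash
#!/usr/bin/env bash
# derive-volume-id.sh <PV_NAME> -- best-effort guess of the <VOLUME_ID> label value.
PV_NAME="$1"
# Prefer the PV's volumeHandle when it is set and differs from the PV name.
VOLUME_ID=$(kubectl get pv "$PV_NAME" -o jsonpath='{.spec.csi.volumeHandle}')
VOLUME_ID="${VOLUME_ID:-$PV_NAME}"
# Assumption: "too long" means longer than the 63-character label-value limit,
# in which case the ID becomes "h1." plus the SHA-1 of the PV name.
if [ "${#VOLUME_ID}" -gt 63 ]; then
  VOLUME_ID="h1.$(echo -n "$PV_NAME" | sha1sum | awk '{print $1}')"
fi
echo "$VOLUME_ID"
```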
CSI v1.30.4 and later
```bash
kubectl -n ack-csi-fuse get pod -l csi.alibabacloud.com/volume-id=<VOLUME_ID> -o wide | grep <NODE_NAME>
```

Because this version adds a daemon process to the ossfs container, the expected output is the same regardless of whether the ossfs process is running as expected:

```
NAME                  READY   STATUS    RESTARTS   AGE
csi-fuse-ossfs-xxxx   1/1     Running   0          5s
```

If the ossfs container is not in the `Running` state, troubleshoot that issue first.
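Because the pod can stay `Running` even after the ossfs process itself has exited, you may want to check the process directly. A sketch, assuming the fuse container image ships a `ps` utility (this is not guaranteed):

```bash
# List processes inside the fuse pod and look for an ossfs entry in the output.
kubectl -n ack-csi-fuse exec csi-fuse-ossfs-xxxx -- ps aux
```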
CSI versions earlier than v1.30.4
```bash
kubectl -n kube-system get pod -l csi.alibabacloud.com/volume-id=<VOLUME_ID> -o wide | grep <NODE_NAME>
```

If the ossfs container is not running as expected, the output is as follows:

```
NAME                  READY   STATUS             RESTARTS     AGE
csi-fuse-ossfs-xxxx   0/1     CrashLoopBackOff   1 (4s ago)   5s
```
Cause of this event: When the CSI component starts the ossfs container, ossfs exits unexpectedly. This issue can be caused by initialization check errors, such as a failed OSS connectivity check (for example, the bucket does not exist or the permissions are incorrect), a non-existent OSS mount path, or insufficient read and write permissions.
Solution
Retrieve the ossfs container logs.
For CSI v1.30.4 and later:

```bash
kubectl -n ack-csi-fuse logs csi-fuse-ossfs-xxxx
```

For CSI versions earlier than v1.30.4:

```bash
kubectl -n kube-system logs csi-fuse-ossfs-xxxx
```

If the pod is in the `CrashLoopBackOff` state, retrieve the logs from the pod's previous unexpected exit:

```bash
kubectl -n kube-system logs -p csi-fuse-ossfs-xxxx
```
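If you do not want to copy the generated pod name, you can also select the pod by its volume-id label. A sketch, reusing the `<VOLUME_ID>` value derived earlier; note that label-selector log queries return only the last 10 lines unless `--tail` is set:

```bash
# Fetch recent ossfs logs by label instead of by pod name (CSI v1.30.4 and later shown).
kubectl -n ack-csi-fuse logs -l csi.alibabacloud.com/volume-id=<VOLUME_ID> --tail=100
```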
(Optional) If the logs are empty or do not provide enough information to identify the issue, the log level may be too high. In this case, enable debug logging using one of the following methods.

Method 1: Create an OSS PV for debugging. Based on the original PV configuration, add `-o dbglevel=debug -o curldbg` to the `otherOpts` field. After you mount the new OSS PV, run the `kubectl logs` command to retrieve debug logs from the ossfs pod.
Important: Debug logs can be large. We recommend that you use this setting for debugging only.

Method 2: Set the log level in a global ConfigMap.
Note: This method does not require you to redeploy a new debug PV and its corresponding persistent volume claim (PVC). However, you cannot directly retrieve the REST request responses from OSS.
Use the following content to create a ConfigMap named `csi-plugin` in the `kube-system` namespace. Set the log level to `debug`.
Note: For CSI plugin v1.28.2 and earlier, you can only set the log level to `info`.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: csi-plugin
  namespace: kube-system
data:
  fuse-ossfs: |
    dbglevel=debug # Log level
```

Restart the `csi-plugin` pod on the node where the application is located and all `csi-provisioner` pods to apply the ConfigMap configuration. Then, restart the application pod to trigger a remount, and confirm that the `csi-fuse-ossfs-xxxx` pod is redeployed after the remount.
Important: The ConfigMap is a global configuration. After debugging, delete the ConfigMap. Then, restart the `csi-plugin` pod on the node and all `csi-provisioner` pods again to turn off debug logging. Finally, restart the application pod to trigger another remount to prevent ossfs from generating excessive debug logs.
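The restart sequence can be performed by deleting the pods so that their controllers recreate them. A sketch, assuming the default ACK component labels `app=csi-plugin` and `app=csi-provisioner` in `kube-system`; verify the labels in your cluster before running it:

```bash
# 1. Restart the csi-plugin pod on the node that hosts the application pod.
kubectl -n kube-system delete pod -l app=csi-plugin \
  --field-selector spec.nodeName=<NODE_NAME>
# 2. Restart all csi-provisioner pods.
kubectl -n kube-system delete pod -l app=csi-provisioner
# 3. Restart the application pod to trigger a remount.
kubectl -n <POD_NAMESPACE> delete pod <POD_NAME>
```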
Analyze the ossfs container logs.
Typically, ossfs errors fall into two categories: errors from ossfs itself and non-200 error codes returned by the OSS server after a request. The following examples for each error type show the general troubleshooting methods.
Errors from ossfs itself
```
ossfs: MOUNTPOINT directory /test is not empty. if you are sure this is safe, can use the 'nonempty' mount option.
```

Based on the log message, the error occurs because the mount point path is not empty. You can resolve this issue by adding the `-o nonempty` configuration item.
Note: You can find a solution in the ossfs FAQ document based on the error log. If you cannot find the cause, submit a ticket.
Non-200 error code returned from the OSS server
Retrieve the logs. Before ossfs exits, it checks the bucket to be mounted. The OSS server returns error code 404, the error cause is NoSuchBucket, and the error message is The specified bucket does not exist.
```
[ERROR] 2023-10-16 12:38:38:/tmp/ossfs/src/curl.cpp:CheckBucket(3420): Check bucket failed, oss response: <?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>NoSuchBucket</Code>
  <Message>The specified bucket does not exist.</Message>
  <RequestId>652D2ECEE1159C3732F6E0EF</RequestId>
  <HostId><bucket-name>.oss-<region-id>-internal.aliyuncs.com</HostId>
  <BucketName><bucket-name></BucketName>
  <EC>0015-00000101</EC>
  <RecommendDoc>https://api.aliyun.com/troubleshoot?q=0015-00000101</RecommendDoc>
</Error>
```

Based on the log message, the error occurs because the specified OSS bucket does not exist. To resolve this issue, log on to the OSS console, create the bucket, and then remount the volume.
Note: You can find solutions in the HTTP error codes topic of the OSS documentation using the error code and error message.
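To verify the bucket's existence from the node without ossfs, you can send an unauthenticated request to the bucket endpoint. A sketch, assuming the node can reach the internal endpoint: an existing private bucket typically answers 403, while a missing bucket answers 404 with `NoSuchBucket`.

```bash
# Print only the HTTP status line; a 404 here matches the NoSuchBucket error above.
curl -sSI "http://<bucket-name>.oss-<region-id>-internal.aliyuncs.com/" | head -n 1
```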
Additional information
Log level instructions
If the log level is too high to identify the issue and you have the following requirements:

- You do not want to recreate the OSS PV.
- You are concerned that the global ConfigMap configuration may affect other operations, such as mounting or unmounting other OSS PVs during debugging.

In this case, you can mount ossfs in the foreground and retrieve debug logs to identify the issue. For specific steps, see the solution for CSI plugin versions earlier than 1.26.6.
Important: After ossfs is containerized, it is not installed on nodes by default. The version of ossfs that you install manually may be different from the version running in the pod. We recommend that you first try the solutions described above by changing the PV mount parameters or the global ConfigMap configuration.
To mount ossfs in the foreground, perform the following steps (the final foreground mount command is shown in the sketch after this list):

1. Install the latest version of ossfs.
2. On the node, run the following command to retrieve the SHA-256 value of the PV name:

   ```bash
   echo -n "<PV_NAME>" | sha256sum
   ```

   If your system does not support the `sha256sum` operation, you can also use the OpenSSL library to retrieve the value:

   ```bash
   echo -n "<PV_NAME>" | openssl sha256 -binary | xxd -p -c 256
   ```

   For "pv-oss", the expected output is:

   ```
   8f3e75e1af90a7dcc66882ec1544cb5c7c32c82c2b56b25a821ac77cea60a928
   ```

3. On the node, run the following command to retrieve the ossfs mount parameters:

   ```bash
   ps -aux | grep <sha256-value>
   ```

   The output contains the process record for ossfs.

4. On the node, run the following command to generate the authentication information required for mounting ossfs. After debugging, delete this file promptly.

   ```bash
   mkdir -p /etc/ossfs && echo "<bucket-name>:<akId>:<akSecret>" > /etc/ossfs/passwd-ossfs && chmod 600 /etc/ossfs/passwd-ossfs
   ```
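With the parameters recovered from the `ps` output and the credential file in place, the foreground debug mount follows the same pattern as the command used for CSI plugin versions earlier than 1.26.6. A sketch; `<BUCKET>`, `<PATH>`, and the endpoint are placeholders to fill in from your own `ps` output:

```bash
# Mount in the foreground (-f) with debug logging; logs stream to the terminal.
mkdir -p /test
/usr/local/bin/ossfs <BUCKET>:/<PATH> /test \
  -ourl=oss-<region-id>-internal.aliyuncs.com \
  -f -o allow_other -o dbglevel=debug -o curldbg
```

Press Ctrl+C to stop the foreground process, and unmount `/test` when you finish debugging.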
Troubleshooting segmentation faults
If the error log contains `"ossfs exited with error" err="signal: segmentation fault (core dumped)"`, it indicates that the ossfs process exited unexpectedly because of a segmentation fault during runtime. To help technical support identify the problem, log on to the node and follow the procedure below to obtain the core dump file. Then, submit a ticket.

1. Find the crash record of the ossfs process:

   ```bash
   coredumpctl list
   ```

   Expected output:

   ```
   TIME                            PID  UID  GID  SIG  COREFILE  EXE
   Mon 2025-11-17 11:21:44 CST 2108767    0    0    1  present   /usr/bin/xxx
   Tue 2025-11-18 19:35:58 CST  493791    0    0    1  present   /usr/local/bin/ossfs
   ```

   The preceding output indicates that two processes on this node have exited because of segmentation faults.

2. Find the record to troubleshoot based on the time (`TIME`) and process name (`EXE`), and note its `PID`. In the example above, the PID is `493791`.
3. Export the core dump file. Use the `PID` from the previous step and run the following command. The `--output` parameter specifies the name of the generated file.

   ```bash
   # Replace <PID> with the actual PID
   coredumpctl dump <PID> --output ossfs.dump
   ```

4. Submit a ticket and provide the generated `ossfs.dump` file.
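Optionally, you can take a quick look at the crash yourself before you submit the ticket. A sketch, assuming `gdb` is installed on the node; without debug symbols the backtrace may show raw addresses only:

```bash
# Summary of the crash, including the signal and, if available, a stack trace.
coredumpctl info <PID>
# Full backtrace from the exported core file.
gdb -batch -ex bt /usr/local/bin/ossfs ossfs.dump
```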
CSI plugin versions earlier than 1.26.6
Symptoms
When the application pod starts, it remains in the ContainerCreating state for a long time.
Cause
Run the following command to determine why the pod cannot start as expected. Replace the following variables:

- `<POD_NAMESPACE>`: the namespace where the application pod is located.
- `<POD_NAME>`: the name of the application pod.

```bash
kubectl -n <POD_NAMESPACE> describe pod <POD_NAME>
```

Check whether an event with the `FailedMount` cause exists in the Events section of the command output. The example event below uses the following placeholders:

- `<PV_NAME>`: the name of the OSS volume.
- `<BUCKET>`: the name of the mounted OSS bucket.
- `<PATH>`: the path of the mounted OSS bucket.
- `<POD_UID>`: the UID of the application pod.

```
Warning  FailedMount  3s  kubelet  MountVolume.SetUp failed for volume "<PV_NAME>" : rpc error: code = Unknown desc = Mount is failed in host, mntCmd:systemd-run --scope -- /usr/local/bin/ossfs <BUCKET>:/<PATH> /var/lib/kubelet/pods/<POD_UID>/volumes/kubernetes.io~csi/<PV_NAME>/mount -ourl=oss-cn-beijing-internal.aliyuncs.com -o allow_other , err: ..... with error: exit status 1
```

Cause of this event: This event occurs because ossfs exits unexpectedly when the CSI component starts it. As a result, no corresponding ossfs process is running on the node. This issue can be caused by initialization check errors, such as a failed OSS connectivity check (for example, the bucket does not exist or permissions are incorrect), a non-existent mount path, or insufficient read and write permissions.
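To list only the mount-failure events instead of reading the full `describe` output, you can filter the event stream. A sketch using the standard Kubernetes events API; `reason=FailedMount` is the event reason shown above:

```bash
# Show only FailedMount events for the affected pod.
kubectl -n <POD_NAMESPACE> get events \
  --field-selector involvedObject.name=<POD_NAME>,reason=FailedMount
```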
Solution
Step 1: Get the original ossfs startup command
Check the `FailedMount` event to view the output from the mount failure. For more information, see the Cause section above.
Retrieve the original ossfs startup command from the mount failure output. The following is an example of the mount failure output:

```
Warning  FailedMount  3s  kubelet  MountVolume.SetUp failed for volume "<PV_NAME>" : rpc error: code = Unknown desc = Mount is failed in host, mntCmd:systemd-run --scope -- /usr/local/bin/ossfs <BUCKET>:/<PATH> /var/lib/kubelet/pods/<POD_UID>/volumes/kubernetes.io~csi/<PV_NAME>/mount -ourl=oss-cn-beijing-internal.aliyuncs.com -o allow_other , err: ..... with error: exit status 1
```

In `mntCmd`, the content after `systemd-run --scope --` is the original ossfs startup command:

```
/usr/local/bin/ossfs <BUCKET>:/<PATH> /var/lib/kubelet/pods/<POD_UID>/volumes/kubernetes.io~csi/<PV_NAME>/mount -ourl=oss-cn-beijing-internal.aliyuncs.com -o allow_other
```
Step 2: Mount ossfs in the foreground and get debug logs
By default, only the user who runs the mount command can access the directory mounted by ossfs. Other users cannot access the directory. Therefore, if the original command does not include the -o allow_other configuration item, permission issues may occur on the root mount path.
Confirm whether a mount point path permission issue exists. If a permission issue exists, add the `-o allow_other` configuration item when you create the PV. For more information about how to configure access permissions for an ossfs mount point, see Mount a bucket. For more information about how to add configuration items, see Use a statically provisioned ossfs 1.0 volume.

Run the following command on the node where the application pod is located to run ossfs in the foreground and set the log mode to debug. In this command, `/test` is the test mount point path. The ossfs process that runs in the foreground mounts the OSS bucket to `/test`.

```bash
mkdir /test && /usr/local/bin/ossfs <BUCKET>:/<PATH> /test -ourl=oss-cn-beijing-internal.aliyuncs.com -f -o allow_other -o dbglevel=debug -o curldbg
```
| Parameter | Description |
| --- | --- |
| `-f` | Runs ossfs in the foreground instead of as a daemon process. In foreground mode, logs are output to the terminal. This parameter is typically used for debugging. |
| `-o allow_other` | Grants other users on the computer permission to access the mounted directory. This prevents new mount point path permission issues when mounting ossfs in the foreground. |
| `-o dbglevel=debug` | Sets the ossfs log level to debug. |
| `-o curldbg` | Enables libcurl logging to troubleshoot errors returned by the OSS server. |
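After you finish debugging, remember to remove the test mount. A brief sketch:

```bash
# Unmount the foreground test mount and remove the test directory.
umount /test || fusermount -u /test
rmdir /test
```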
Step 3: Analyze the debug logs
After you run ossfs in the foreground, logs are output to the terminal. Typically, ossfs errors fall into two categories: errors from ossfs itself and non-200 error codes returned by the OSS server after a request. The following examples for each error type show the general troubleshooting methods.
The following example shows how to troubleshoot an issue where ossfs exits soon after it starts during a mount failure.
Errors from ossfs itself
Check the logs. The error log that is printed before ossfs exits is as follows.
```
ossfs: MOUNTPOINT directory /test is not empty. if you are sure this is safe, can use the 'nonempty' mount option.
```

Based on the log message, the error occurs because the mount point path is not empty. You can resolve this issue by adding the `-o nonempty` configuration item.
You can find solutions in the ossfs FAQ topic of the OSS documentation based on the error log. If you cannot find the cause, submit a ticket.
Non-200 error code returned from the OSS server
Check the logs. The error code that is returned by the OSS server and printed before ossfs exits is 404. The cause is NoSuchBucket, and the message is The specified bucket does not exist.
```
[INFO] Jul 10 2023:13:03:47:/tmp/ossfs/src/curl.cpp:RequestPerform(2382): HTTP response code 404 was returned, returning ENOENT, Body Text: <?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>NoSuchBucket</Code>
  <Message>The specified bucket does not exist.</Message>
  <RequestId>xxxx</RequestId>
  <HostId><BUCKET>.oss-cn-beijing-internal.aliyuncs.com</HostId>
  <BucketName><BUCKET></BucketName>
  <EC>0015-00000101</EC>
</Error>
```

Based on the log message, the error occurs because the specified OSS bucket does not exist. To resolve this issue, log on to the OSS console, create the bucket, and then remount the volume.
You can find the solution in the HTTP error codes topic of the OSS documentation using the error code and error message.
Scenario 2: Error when executing a POSIX command
CSI plugin version 1.26.6 and later
Symptoms
The application pod is in the `Running` state, but ossfs reports an error when a POSIX command, such as a read or write command, is executed.
Cause
Confirm whether the ossfs container is running as expected.
The `<NODE_NAME>` and `<VOLUME_ID>` values, the `kubectl get pod` verification commands for each CSI version, and the expected output are identical to those in Scenario 1. For details, see the Cause section under Scenario 1: Mount failure for CSI plugin version 1.26.6 and later.
Check the application logs to confirm the command that caused the error and the error type that is returned. For example, an `I/O error` is returned when you run the `chmod -R 777 /mnt/path` command.
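You can also reproduce the failing call directly. A sketch, using the example `chmod` failure above as a one-off command inside the application pod:

```bash
# Rerun the failing POSIX operation inside the application pod to capture the exact error.
kubectl -n <POD_NAMESPACE> exec -it <POD_NAME> -- chmod -R 777 /mnt/path
```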
Cause of this event: After the CSI component starts the ossfs container, the ossfs pod runs as expected and mounts the OSS bucket to the specified mount point path on the node where the application pod is located. However, when a POSIX command such as `chmod`, `read`, or `open` is executed, ossfs runs abnormally, returns an error, and prints the corresponding error to the logs.
Solution
Retrieve the ossfs container logs. If the logs are empty or do not provide enough information, enable debug logging. The log retrieval commands, the debug PV method, and the global ConfigMap method are identical to those in Scenario 1. For details, see the first two steps in the Solution section of Scenario 1: Mount failure for CSI plugin version 1.26.6 and later.
Analyze the ossfs container logs.
Typically, ossfs errors fall into two categories: errors from ossfs itself and non-200 error codes returned by the OSS server after a request. The following examples for each error type show the general troubleshooting methods.
Errors from ossfs itself
This section uses an error that occurs when you run the `chmod -R 777 <mount_point_path_in_application_pod>` command as an example to show how to troubleshoot the issue. If the test mount path for ossfs is `/test`, run the following command:

```bash
chmod -R 777 /test
```

After you check the logs, you find that the `chmod` operation is successful for files in the `/test` mount point path, but the `chmod` operation on `/test` itself generates the following error log:

```
[ERROR] 2023-10-18 06:03:24:/tmp/ossfs/src/ossfs.cpp:ossfs_chmod(1745): Could not change mode for mount point.
```

Based on the log message, you cannot change the permissions of the mount point path using `chmod`. For more information about how to modify the permissions of a mount point, see ossfs 1.0 volume FAQ.
Note: You can find solutions in the ossfs FAQ topic of the OSS documentation based on the error log. If you cannot find the cause, submit a ticket.
Non-200 error code returned from the OSS server
The following example shows how to troubleshoot an issue where an error is returned for all operations on an object in a bucket.
```
[INFO] 2023-10-18 06:05:46:/tmp/ossfs/src/curl.cpp:HeadRequest(3014): [tpath=/xxxx]
[INFO] 2023-10-18 06:05:46:/tmp/ossfs/src/curl.cpp:PreHeadRequest(2971): [tpath=/xxxx][bpath=][save=][sseckeypos=-1]
[INFO] 2023-10-18 06:05:46:/tmp/ossfs/src/curl.cpp:prepare_url(4660): URL is http://oss-cn-beijing-internal.aliyuncs.com/<bucket>/<path>/xxxxx
[INFO] 2023-10-18 06:05:46:/tmp/ossfs/src/curl.cpp:prepare_url(4693): URL changed is http://<bucket>.oss-cn-beijing-internal.aliyuncs.com/<path>/xxxxx
[INFO] 2023-10-18 06:05:46:/tmp/ossfs/src/curl.cpp:RequestPerform(2383): HTTP response code 404 was returned, returning ENOENT, Body Text:
```

Run the command that caused the error. The HTTP return code from the OSS server that is printed before the exit is 404. The inferred cause is that the object does not exist on the OSS server. For more information about the causes of this issue and its solutions, see 404 error.
Note: You can find the solution in the HTTP error codes topic of the OSS documentation using the error code and error message.
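To confirm whether the object actually exists in the bucket, you can list it directly. A sketch, assuming the ossutil command-line tool is installed and configured with credentials for the bucket:

```bash
# List the object; an empty result (or a NoSuchKey-style error) confirms that it is missing.
ossutil ls oss://<bucket>/<path>/xxxxx
```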
Additional information
The log level instructions, the foreground mounting steps, and the segmentation fault troubleshooting procedure for this scenario are identical to those in Scenario 1. For details, see Additional information under Scenario 1: Mount failure for CSI plugin version 1.26.6 and later.
CSI plugin versions earlier than 1.26.6
Symptoms
The application pod is in the `Running` state, but ossfs reports an error when a POSIX command, such as a read or write command, is executed.
Cause
Check the application logs to confirm the command that caused the error and the error type that is returned. For example, an I/O error is returned when you run the chmod -R 777 /mnt/path command.
You can run the following command to enter the application pod and confirm.
```bash
kubectl -n <POD_NAMESPACE> exec -it <POD_NAME> -- /bin/bash
bash-4.4# chmod -R 777 /mnt/path
chmod: /mnt/path: I/O error
```

Cause of this event: This event occurs because after the CSI component starts ossfs, the ossfs process runs as expected and mounts the OSS bucket to the specified mount point path on the node where the application pod is located. However, when a POSIX command such as `chmod`, `read`, or `open` is executed, ossfs runs abnormally and returns an error.
Solution
Step 1: Get the original ossfs startup command
Because ossfs is already running on the node, you can run the following command on the node where the application pod is located to retrieve the original ossfs startup command using the OSS volume name.
```bash
ps -aux | grep ossfs | grep <PV_NAME>
```

Expected output:

```
root 2097450 0.0 0.2 124268 33900 ? Ssl 20:47 0:00 /usr/local/bin/ossfs <BUCKET> /<PATH> /var/lib/kubelet/pods/<POD_UID>/volumes/kubernetes.io~csi/<PV_NAME>/mount -ourl=oss-cn-beijing-internal.aliyuncs.com -o allow_other
```

In the output, replace the space after `<BUCKET>` with a colon (:). That is, change `<BUCKET> /<PATH>` to `<BUCKET>:/<PATH>`. The original ossfs startup command is as follows:

```
/usr/local/bin/ossfs <BUCKET>:/<PATH> /var/lib/kubelet/pods/<POD_UID>/volumes/kubernetes.io~csi/<PV_NAME>/mount -ourl=oss-cn-beijing-internal.aliyuncs.com -o allow_other
```

Step 2: Mount ossfs in the foreground and get debug logs
This step is identical to Step 2 in the solution for Scenario 1 (CSI plugin versions earlier than 1.26.6): confirm whether a mount point path permission issue exists, then run ossfs in the foreground with `-f -o allow_other -o dbglevel=debug -o curldbg`. For the full command and parameter descriptions, see Step 2 under Scenario 1: Mount failure for CSI plugin versions earlier than 1.26.6.
Step 3: Analyze the debug logs
After you run ossfs in the foreground, logs are output to the terminal. Typically, ossfs errors fall into two categories: errors from ossfs itself and non-200 error codes returned by the OSS server after a request. The following examples for each error type show the general troubleshooting methods.
If a POSIX command fails, you need to open another terminal to rerun the command and analyze the new ossfs logs.
Errors from ossfs itself
The following example describes how to troubleshoot an error that occurs when you run the `chmod -R 777 <mount_point_path_in_application_pod>` command. Because the test ossfs process is mounted to the `/test` path, the command is as follows:

```bash
chmod -R 777 /test
```

You can query the logs. The `chmod` operation is successful for files in the `/test` mount point path. However, the following error log is generated for the `chmod` operation on `/test` itself:

```
[ERROR] Jul 10 2023:13:03:18:/tmp/ossfs/src/ossfs.cpp:ossfs_chmod(1742): Could not change mode for mount point.
```

Based on the log message, you cannot change the permissions of the mount point path using `chmod`. For more information about how to modify the permissions of a mount point, see ossfs 1.0 volume FAQ.
You can find solutions in the ossfs FAQ topic of the OSS documentation based on the error log. If you cannot find the cause, submit a ticket.
Non-200 error code returned from the OSS server
The following example describes how to troubleshoot an issue where the server returns an error when you perform operations on an object in a bucket.
```
[INFO] Aug 23 2022:11:54:11:/tmp/ossfs/src/curl.cpp:RequestPerform(2377): HTTP response code 404 was returned, returning ENOENT, Body Text: <?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>NoSuchKey</Code>
  <Message>The specified key does not exist.</Message>
  <RequestId>xxxx</RequestId>
  <HostId><BUCKET>.oss-cn-beijing-internal.aliyuncs.com</HostId>
  <Key><object-name></Key>
  <EC>0026-00000001</EC>
</Error>
```

When the command failed to execute, the return code from the OSS server was 404, the error cause was NoSuchKey, and the error message was "The specified key does not exist."
The log data indicates that the object cannot be found on the OSS server. For more information about the causes of this issue and the corresponding solutions, see NoSuchKey.
You can find the solution in the HTTP error codes topic of the OSS documentation using the error code and error message.