In containerized environments, traditional solutions often use Filesystem in Userspace (FUSE)-based file systems, such as ossfs, to mount object storage data. However, for small file read-intensive scenarios (such as AI training dataset loading, time-series log analysis, and other business operations that require fast traversal of millions of small files), traditional solutions cannot meet high throughput and low latency requirements. In these cases, we recommend that you use a virtual block device solution by mounting object storage data through strmvol volumes to optimize small file read performance.
Usage notes
strmvol volumes
Only internal endpoints can be used to access Object Storage Service (OSS) data.
During initialization, the virtual block device builds a complete file metadata index, which consumes node resources. The application pod remains in the ContainerCreating state during this phase. The resources and time required scale with the number of files under the OSS bucket mount path.
When selecting the OSS bucket mount path, follow the principle of least privilege.
For example, if application A accesses content under /app/a/ in the OSS bucket, and application B accesses content under /app/b/, we recommend that you create separate storage volumes for A and B with target mount paths of /app/a/ and /app/b/ respectively.
For the time and resource overhead during initialization, see Metadata index building.
Except during the initialization phase, the resources that the virtual block device uses while mounted are limited by the configuration on the persistent volume (PV). Reserve sufficient node resources.
A virtual block device allows you to store up to 16 TiB of data in the attached OSS path.
Cluster and node requirements
The cluster must be an ACK managed Pro cluster or ACK dedicated cluster, with Kubernetes version 1.20 or later, and the storage plug-in type must be CSI.
The kernel version of the node is 4.19 or later. We recommend that you set the resource specifications to 4C8G or more.
Supported node operating systems: Alibaba Cloud Linux 3, Alibaba Cloud Linux 2, and CentOS 7. To optimize data access performance by using the erofs file system, we recommend that you use Alibaba Cloud Linux 3.
Note: CentOS 7 and Alibaba Cloud Linux 2 have reached end-of-life (EOL). For more information, see [Product Changes] EOL of Alibaba Cloud Linux 2 and CentOS 7.
You cannot mount a strmvol volume to a serverless pod scheduled to a virtual node.
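The kernel requirement above can be checked on a node before you deploy anything. The following is a minimal sketch, not an official tool: kernel_at_least is a hypothetical helper name, and it compares only the major and minor numbers of the release string against 4.19.

```shell
#!/bin/sh
# Return 0 if the kernel release in $1 is at least the version in $2 (default 4.19).
# Only the leading major.minor numbers are compared.
kernel_at_least() {
    release="$1"
    required="${2:-4.19}"
    have_major="${release%%.*}"        # 5.10.134-16.al8.x86_64 -> 5
    rest="${release#*.}"               # -> 10.134-16.al8.x86_64
    have_minor="${rest%%[!0-9]*}"      # -> 10
    need_major="${required%%.*}"
    need_minor="${required#*.}"
    [ "$have_major" -gt "$need_major" ] ||
        { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Check the kernel of the machine that runs this script.
if kernel_at_least "$(uname -r)"; then
    echo "kernel $(uname -r): meets the 4.19 minimum"
else
    echo "kernel $(uname -r): older than 4.19"
fi
```

Run the script on the node itself, or pass a release string that you obtained elsewhere to kernel_at_least.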
Preparations
Step 1: Deploy strmvol-csi-driver
strmvol volumes require a separate Container Storage Interface (CSI) driver (strmvol-csi-driver). After the CSI driver is deployed, it operates independently of the csi-provisioner and csi-plugin components in the ACK cluster.
Log on to the ACK console. In the left-side navigation pane, go to the Marketplace page.
On the Marketplace page, search for strmvol-csi-driver, and then click the corresponding card.
On the details page, click Deploy in the upper-right corner.
In the panel that appears, configure the basic information and parameters, and then click OK.
Step 2: Configure OSS access permissions
Create a RAM user and grant permissions.
Create a RAM user. You can skip this step if you have an existing RAM user. For more information about how to create a RAM user, see Create a RAM user.
Create a custom policy to grant OSS access permissions to the RAM user. For more information, see Create custom policies.
Select the read-only policy or read-write policy based on your business requirements. Replace mybucket with the name of the bucket you created.
Policy that provides read-only permissions on OSS
Policy that provides read-write permissions on OSS
Optional. If the objects in the OSS bucket are encrypted by using a specified customer master key (CMK) in Key Management Service (KMS), you need to grant KMS access permissions to the RAM user. For more information, see Encryption.
Grant OSS access permissions to the RAM user. For more information, see Grant permissions to a RAM user.
Create an AccessKey pair for the RAM user. For more information, see Create an AccessKey pair.
Create a Secret that stores authentication information for accessing OSS data.
The following is an example command. Replace the values of akId and akSecret with your actual AccessKey ID and AccessKey secret.
kubectl create -n default secret generic strmvol-secret --from-literal='akId=xxxxxx' --from-literal='akSecret=xxxxxx'
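If you prefer a manifest over literal values on the command line, the same Secret can be rendered from environment variables. This is a sketch under assumptions: render_strmvol_secret, OSS_AK_ID, and OSS_AK_SECRET are hypothetical names, while the Secret name, namespace, and keys are the ones used in the command above.

```shell
#!/bin/sh
# Print a Secret manifest for the given AccessKey ID ($1) and AccessKey secret ($2).
# stringData accepts plain text; Kubernetes stores it base64-encoded under data.
render_strmvol_secret() {
    cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: strmvol-secret
  namespace: default
type: Opaque
stringData:
  akId: "$1"
  akSecret: "$2"
EOF
}

# OSS_AK_ID and OSS_AK_SECRET are hypothetical variable names; xxxxxx is a placeholder.
render_strmvol_secret "${OSS_AK_ID:-xxxxxx}" "${OSS_AK_SECRET:-xxxxxx}"
```

To create the Secret, pipe the output of the script to kubectl apply -f -.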
Mount strmvol volumes
Step 1: Create strmvol volumes
Mount a statically provisioned strmvol volume
Create a PV.
Modify the following YAML file and save it as strmvol-pv.yaml.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-strmvol
spec:
  capacity:
    # The OSS mount target can store up to 16 TiB of data.
    storage: 20Gi
  # Only the ReadOnlyMany access mode is supported.
  accessModes:
    - ReadOnlyMany
  # To prevent remote data from being deleted, only Retain is supported.
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: strmvolplugin.csi.alibabacloud.com
    volumeHandle: pv-strmvol
    # Use the created Secret.
    nodeStageSecretRef:
      name: strmvol-secret
      namespace: default
    volumeAttributes:
      bucket: cnfs-oss-test
      path: /subpath
      # The strmvol volume can access OSS data only over the internal network.
      url: oss-cn-hangzhou-internal.aliyuncs.com
      umask: "000"
      directMode: "false"
      resourceLimit: "2c4g"
nodeStageSecretRef parameters
name (required): The name of the Secret that stores the AccessKey information.
namespace (required): The namespace of the Secret that stores the AccessKey information.
volumeAttributes parameters
bucket (required): The OSS bucket that you want to mount.
path (optional): The mount path in the OSS bucket, expressed as a directory relative to the root directory of the bucket.
Important: Select the mount path based on the principle of least privilege.
url (required): The internal endpoint that is used to mount OSS. You can view the internal endpoint on the Bucket page in the OSS console. Common endpoint format: http://oss-{{regionName}}-internal.aliyuncs.com or https://oss-{{regionName}}-internal.aliyuncs.com.
Important: The internal access endpoint format vpc100-oss-{{regionName}}.aliyuncs.com is deprecated. Switch to the new format at the earliest opportunity.
umask (optional): The mask of the default file system permissions after the virtual block device is mounted. If you want to use 755 as the default file system permission, set umask to 022.
directMode (optional): Specifies whether to enable the direct mode.
"true": When you enable the direct mode, prefetching and local data caching are disabled. This mode is suitable for random reading of small files, such as random batch reading of training datasets.
"false": By default, the direct mode is disabled. This mode is suitable for general scenarios such as sequential reading of small files and large files. If your workload has no distinct read pattern, keep the direct mode disabled.
resourceLimit (optional): The maximum node resources that the virtual block device can use after it is mounted. For example, "2c4g" indicates that the virtual block device can use up to 2 vCPUs and 4 GiB of memory from the node.
Note: The memory is used for data prefetching and local caching. When you enable the direct mode, the required memory is significantly lower than the default value.
Operating systems other than Alibaba Cloud Linux 3 have the same upper limit of data reading performance. Therefore, we do not recommend that you use advanced performance configurations. For more information, see Data reading performance test.
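The relationship between umask and the resulting default permission described above (022 yields 755) is ordinary mask arithmetic: the mask bits are cleared from 777. The following sketch illustrates it; effective_mode is a hypothetical helper name and is not part of the CSI driver.

```shell
#!/bin/sh
# Print the default permission that results from an octal umask ($1):
# the bits set in the umask are cleared from 0777.
effective_mode() {
    printf '%03o\n' $(( 0777 & ~0$1 ))
}

effective_mode 022   # prints 755
effective_mode 000   # prints 777
```

With the example value umask: "000" in the YAML file above, the default permission is 777.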
Create the PV.
kubectl create -f strmvol-pv.yaml
Check the PV status.
kubectl get pv pv-strmvol
Expected output:
NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
pv-strmvol   20Gi       ROX            Retain           Available                          <unset>                          18s
Create a persistent volume claim (PVC).
Modify the following YAML file and save it as strmvol-pvc-static.yaml.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-strmvol
  namespace: default
spec:
  # The following configurations must match those of the PV.
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 20Gi
  volumeName: pv-strmvol
Create the PVC.
kubectl create -f strmvol-pvc-static.yaml
Check the PVC status.
kubectl get pvc pvc-strmvol
The following output shows that a PV is bound to the PVC.
NAME          STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
pvc-strmvol   Bound    pv-strmvol   20Gi       ROX                           <unset>                 16s
Mount a dynamically provisioned strmvol volume
Create a StorageClass.
Modify the following YAML file and save it as strmvol-sc.yaml.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: strmvol-test
parameters:
  # Use the created Secret.
  csi.storage.k8s.io/node-stage-secret-name: strmvol-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
  bucket: cnfs-oss-test
  path: /subpath
  # The strmvol volume can access OSS data only over the internal network.
  url: oss-cn-hangzhou-internal.aliyuncs.com
  umask: "000"
  directMode: "false"
  resourceLimit: "2c4g"
provisioner: strmvolplugin.csi.alibabacloud.com
# To prevent remote data from being deleted, only Retain is supported.
reclaimPolicy: Retain
volumeBindingMode: Immediate
The following lists describe the fields under parameters.
Configure a Secret
csi.storage.k8s.io/node-stage-secret-name (required): The name of the Secret that stores the AccessKey information.
csi.storage.k8s.io/node-stage-secret-namespace (required): The namespace of the Secret that stores the AccessKey information.
Configure a PV
bucket (required): The OSS bucket that you want to mount.
path (optional): The mount path in the OSS bucket, expressed as a directory relative to the root directory of the bucket.
Important: Select the mount path based on the principle of least privilege.
url (required): The internal endpoint that is used to mount OSS. You can view the internal endpoint on the Bucket page in the OSS console. Common endpoint format: http://oss-{{regionName}}-internal.aliyuncs.com or https://oss-{{regionName}}-internal.aliyuncs.com.
Important: The internal access endpoint format vpc100-oss-{{regionName}}.aliyuncs.com is deprecated. Switch to the new format at the earliest opportunity.
umask (optional): The mask of the default file system permissions after the virtual block device is mounted. If you want to use 755 as the default file system permission, set umask to 022.
directMode (optional): Specifies whether to enable the direct mode.
"true": When you enable the direct mode, prefetching and local data caching are disabled. This mode is suitable for random reading of small files, such as random batch reading of training datasets.
"false": By default, the direct mode is disabled. This mode is suitable for general scenarios such as sequential reading of small files and large files. If your workload has no distinct read pattern, keep the direct mode disabled.
resourceLimit (optional): The maximum node resources that the virtual block device can use after it is mounted. For example, "2c4g" indicates that the virtual block device can use up to 2 vCPUs and 4 GiB of memory from the node.
Note: The memory is used for data prefetching and local caching. When you enable the direct mode, the required memory is significantly lower than the default value.
Operating systems other than Alibaba Cloud Linux 3 have the same upper limit of data reading performance. Therefore, we do not recommend that you use advanced performance configurations. For more information, see Data reading performance test.
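Because the url parameter accepts only internal endpoints, it can help to check the value before you apply the YAML file. The following sketch classifies an endpoint string purely by the formats listed above; classify_oss_endpoint is a hypothetical helper name, and the check does not test whether the endpoint is reachable.

```shell
#!/bin/sh
# Classify an OSS endpoint string by its format:
#   internal   - oss-{{regionName}}-internal.aliyuncs.com
#   deprecated - vpc100-oss-{{regionName}}.aliyuncs.com
#   other      - anything else, for example a public endpoint
classify_oss_endpoint() {
    host="${1#http://}"        # drop an optional scheme prefix
    host="${host#https://}"
    case "$host" in
        vpc100-oss-*.aliyuncs.com)   echo "deprecated" ;;
        oss-*-internal.aliyuncs.com) echo "internal" ;;
        *)                           echo "other" ;;
    esac
}

classify_oss_endpoint "oss-cn-hangzhou-internal.aliyuncs.com"    # prints internal
classify_oss_endpoint "vpc100-oss-cn-hangzhou.aliyuncs.com"      # prints deprecated
classify_oss_endpoint "oss-cn-hangzhou.aliyuncs.com"             # prints other
```

Only values that are classified as internal match the format that this topic recommends for the url parameter.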
Create a StorageClass.
kubectl create -f strmvol-sc.yaml
Create a PVC.
Modify the following YAML file and save it as strmvol-pvc-dynamic.yaml.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-strmvol
  namespace: default
spec:
  # Only the ReadOnlyMany access mode is supported.
  accessModes:
    - ReadOnlyMany
  # Specify a StorageClass.
  storageClassName: strmvol-test
  resources:
    requests:
      # The OSS mount target can store up to 16 TiB of data.
      storage: 20Gi
Create the PVC.
kubectl create -f strmvol-pvc-dynamic.yaml
Check the PVC status.
kubectl get pvc pvc-strmvol
The following output shows that a PV is automatically created by the CSI plug-in and bound to the PVC.
NAME          STATUS   VOLUME                                         CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
pvc-strmvol   Bound    strmvol-d8d1d22a-e1d7-4caa-b875-54f378dec769   20Gi       ROX            strmvol-test   <unset>                 3m
Step 2: Create an application and mount strmvol volumes
Create a file named strmvol-test.yaml and copy the following content to the file:
The following YAML example creates a StatefulSet with one pod. The pod requests storage resources through a PVC named pvc-strmvol and mounts the volume to the /data path.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: strmvol-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: strmvol-test
  template:
    metadata:
      labels:
        app: strmvol-test
    spec:
      containers:
        - name: nginx
          image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
          ports:
            - containerPort: 80
          volumeMounts:
            - name: pvc-strmvol
              mountPath: /data
      volumes:
        - name: pvc-strmvol
          persistentVolumeClaim:
            claimName: pvc-strmvol
Create the StatefulSet, which mounts the strmvol volume.
kubectl create -f strmvol-test.yaml
Check whether the pod provisioned by the StatefulSet is deployed.
kubectl get pod -l app=strmvol-test
Expected output:
NAME             READY   STATUS    RESTARTS   AGE
strmvol-test-0   1/1     Running   0          14s
Run the following command to verify that the mount point is a block device and that the application can access OSS data.
kubectl exec -it strmvol-test-0 -- sh -c "df /data && ls /data"
Output similar to the following is returned:
Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/ublkb1        24812 24812         0 100% /data
<data in OSS mountpath>
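Because the pod stays in the ContainerCreating state while the metadata index is built, a deployment script may need to wait longer than usual before it runs the verification command. The following is a minimal sketch: wait_for_strmvol_rollout is a hypothetical helper name and the default timeout of 600s is an assumed value, which you should size to the number of files under the mount path.

```shell
#!/bin/sh
# Block until the strmvol-test StatefulSet has finished rolling out,
# or until the timeout ($1, default 600s) expires.
# The 600s default is an assumption, not a value from the product documentation.
wait_for_strmvol_rollout() {
    timeout="${1:-600s}"
    kubectl rollout status statefulset/strmvol-test -n default --timeout="$timeout"
}
```

For example, call wait_for_strmvol_rollout 900s after kubectl create -f strmvol-test.yaml, and run the df and ls check only if the function returns successfully.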