In scenarios that involve high-density deployments of stateful applications (such as databases) or large numbers of short-lived containers (such as continuous integration and batch processing jobs), pods must attach a large number of disks to persist data. When many such pods are scheduled to the same node at the same time, the default serial mounting method increases pod startup time. To resolve this issue, you can enable the disk parallel mounting feature.
Prerequisites
An ACK managed cluster is created. The Kubernetes version of the cluster is 1.26 or later. The versions of the csi-plugin and csi-provisioner components are 1.30.4 or later (see the version check after this list). For more information, see csi-plugin and csi-provisioner.
Alibaba Cloud CLI is installed and configured. For more information, see Install Alibaba Cloud CLI.
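To check the installed component versions from the command line, you can inspect the container images of the CSI workloads. This is a minimal sketch that assumes the default ACK component names and namespace (a csi-provisioner Deployment and a csi-plugin DaemonSet in kube-system):
# Print the container images (and therefore the versions) of the CSI components.
kubectl -n kube-system get deployment csi-provisioner -o jsonpath='{.spec.template.spec.containers[*].image}{"\n"}'
kubectl -n kube-system get daemonset csi-plugin -o jsonpath='{.spec.template.spec.containers[*].image}{"\n"}'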
Usage notes
You can enable the parallel mounting feature only for disks that have serial numbers. For more information about how to query the serial number of a disk, see Query the serial numbers of block storage devices.
Disks that were created before June 10, 2020 do not have recognizable serial numbers. If you enable the parallel mounting feature for these disks, mounting failures will occur.
When multiple disks are unmounted from the same node, the disks are unmounted in serial mode.
After you enable parallel mounting, the Device field returned by API operations such as ECS DescribeDisks and the mount target displayed in the console may be inaccurate. Do not rely on this mount path in your workloads. Instead, use the serial number of the disk to confirm the actual mount path.
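On the node, you can map a disk serial number to the actual block device. This is a minimal sketch that assumes a Linux node; vdb is only an example device name.
# List block devices together with their serial numbers.
lsblk -o NAME,SERIAL
# Alternatively, read the serial number of a specific device, for example vdb.
cat /sys/block/vdb/serial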
Procedure
You can manually enable disk parallel mounting or use an automated script to enable this feature.
Use an automated script
Save the following script as a file named enable_parallel_attach.sh.
Run the script to enable the disk parallel mounting feature for the cluster.
bash enable_parallel_attach.sh <Cluster ID>
Manually enable the feature
Add an ECS tag to the node pool of the cluster. Set the tag key to supportConcurrencyAttach and the tag value to true. This ensures that the tag is automatically added to ECS instances that are later created in the node pool.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.
On the Node Pools page, find the node pool that you want to modify and click Edit in the Actions column.
In the lower part of the page, find the Advanced Options section and add an ECS tag. Set the key to supportConcurrencyAttach and the value to true.
Add a tag to the ECS instances of all existing nodes in the cluster. Set the key to supportConcurrencyAttach and the value to true. For more information, see Add a custom tag. A CLI-based sketch for tagging existing instances is provided after this procedure.
In the left-side navigation pane, choose Operations > Add-ons. Click the Storage tab, find the csi-provisioner component, click Configure in the lower-right corner of the component, and set the FeatureGate parameter to DiskADController=true,DiskParallelAttach=true.
Note: After you specify DiskADController=true, the attach and detach operations for disks are performed by csi-provisioner. After you specify DiskParallelAttach=true, the disk parallel mounting feature is enabled.
After you configure csi-provisioner, set the FeatureGate parameter of the csi-plugin component to DiskADController=true.
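For clusters that already contain many nodes, tagging each ECS instance in the console can be tedious. The following is a minimal sketch of how the tagging step could be scripted with the Alibaba Cloud CLI; it is not the automated script from the previous section. It assumes that kubectl points to the target cluster, that the aliyun CLI has ECS tagging permissions, and that you replace the region ID placeholder with your actual region.
# Tag the ECS instance behind every existing node with supportConcurrencyAttach=true.
REGION_ID="<your-region-id>"   # assumption: replace with the region of your cluster, for example cn-hangzhou
for provider_id in $(kubectl get nodes -o jsonpath='{range .items[*]}{.spec.providerID}{"\n"}{end}'); do
  # The provider ID of an ACK node has the form <region-id>.<instance-id>.
  instance_id="${provider_id#*.}"
  aliyun ecs AddTags \
    --RegionId "$REGION_ID" \
    --ResourceType instance \
    --ResourceId "$instance_id" \
    --Tag.1.Key supportConcurrencyAttach \
    --Tag.1.Value true
done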
Verify that the disk parallel mounting feature is enabled
In this example, a large number of disk-backed pods are created on the same node to verify that pod startup is accelerated after parallel mounting is enabled.
The statistics provided in this topic are only theoretical values. The actual values may vary based on your environment.
Add a node that supports multiple disks to an ACK cluster. For example, you can mount up to 56 disks to an instance of the ecs.g7se.16xlarge type.
Create a test file named attach-stress.yaml and copy the following content to the file. Replace the node name placeholder in the file with the actual name of the node.
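The manifest itself was not carried over into this extract. The following is a minimal sketch that is consistent with the expected output below (a StorageClass named alibabacloud-disk and a StatefulSet named attach-stress); the disk category, disk size, pod image, pod management policy, and the <node-name> placeholder are assumptions to adjust for your environment.
cat <<'EOF' > attach-stress.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alibabacloud-disk
provisioner: diskplugin.csi.alibabacloud.com
parameters:
  type: cloud_essd                     # assumption: any disk category supported by your node works
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: attach-stress
spec:
  serviceName: attach-stress
  replicas: 1
  podManagementPolicy: Parallel        # start pods concurrently so that disks are attached in one batch
  selector:
    matchLabels:
      app: attach-stress
  template:
    metadata:
      labels:
        app: attach-stress
    spec:
      nodeName: <node-name>            # replace with the actual name of the node
      containers:
      - name: app
        image: busybox                 # assumption: any small image that can idle works
        command: ["tail", "-f", "/dev/null"]
        volumeMounts:
        - name: disk
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: disk
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: alibabacloud-disk
      resources:
        requests:
          storage: 20Gi                # 20 GiB is the minimum size for an ESSD
EOF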
Run the following command to confirm that the application starts as expected. Then, scale in the number of pods to 0 to prepare for the subsequent batch mount tests.
kubectl apply -f attach-stress.yaml
kubectl rollout status sts attach-stress
kubectl scale sts attach-stress --replicas 0

Expected output:

storageclass.storage.k8s.io/alibabacloud-disk created
statefulset.apps/attach-stress created
partitioned roll out complete: 1 new pods have been updated...
statefulset.apps/attach-stress scaled

Run the following command to start the batch mount test and calculate the time required for starting pods:
Note: In this case, parallel mounting is disabled for the cluster. Adjust the number of pods for the test based on the maximum number of disks supported by your node.
date && \
kubectl scale sts attach-stress --replicas 28 && \
kubectl rollout status sts attach-stress && \
date

Expected output:

Tuesday October 15 19:21:36 CST 2024
statefulset.apps/attach-stress scaled
Waiting for 28 pods to be ready...
Waiting for 27 pods to be ready...
<Omitted...>
Waiting for 3 pods to be ready...
Waiting for 2 pods to be ready...
Waiting for 1 pods to be ready...
partitioned roll out complete: 28 new pods have been updated...
Tuesday October 15 19:24:55 CST 2024

The output indicates that more than 3 minutes is required for starting all 28 pods when parallel mounting is disabled.
Enable parallel mounting by following the instructions in the Procedure section.
Run the following command to delete the preceding pods and prepare for the subsequent round of testing:
Note: Pay attention to the volumeattachments resources in the cluster. After these resources are deleted, the disks are unmounted. This process requires a few minutes.

kubectl scale sts attach-stress --replicas 0

Run the following command again to calculate the time required for starting pods after parallel mounting is enabled. The expected time is approximately 40 seconds, which is much faster than the more than 3 minutes required when parallel mounting is disabled.
date && \
kubectl scale sts attach-stress --replicas 28 && \
kubectl rollout status sts attach-stress && \
date

Expected output:

Tuesday October 15 20:02:54 CST 2024
statefulset.apps/attach-stress scaled
Waiting for 28 pods to be ready...
Waiting for 27 pods to be ready...
<Omitted...>
Waiting for 3 pods to be ready...
Waiting for 2 pods to be ready...
Waiting for 1 pods to be ready...
partitioned roll out complete: 28 new pods have been updated...
Tuesday October 15 20:03:31 CST 2024
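Throughout both rounds of testing, you can follow attach and detach progress through the VolumeAttachment resources mentioned in the preceding note. A quick check that only assumes kubectl access to the cluster:
# List the current VolumeAttachment objects; one object exists for each disk attached to a node.
kubectl get volumeattachment
# Watch continuously while pods are starting or being deleted.
kubectl get volumeattachment -w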
Run the following command to delete the test applications in the cluster:
kubectl delete -f attach-stress.yaml
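Deleting the StatefulSet does not delete the PVCs created from its volumeClaimTemplates, so the test disks remain. If they are no longer needed, you can list and remove them; the following sketch assumes that the PVC names contain attach-stress, which follows from the StatefulSet name used above.
# PVCs created from volumeClaimTemplates are not removed together with the StatefulSet.
kubectl get pvc | grep attach-stress
# Delete the PVCs (and, with a Delete reclaim policy, the backing disks) if they are no longer needed.
kubectl delete $(kubectl get pvc -o name | grep attach-stress)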