In high-density deployments of stateful applications, such as databases, continuous integration, or batch processing, each pod requires multiple cloud disks for data storage. When many pods are scheduled to the same node simultaneously, the default serial attachment process significantly increases pod startup time. Parallel cloud disk attachment resolves this by attaching all required disks concurrently. In the test described in this topic, it reduces the startup time of 28 pods from over 3 minutes to approximately 40 seconds.
Prerequisites
Before you begin, ensure that you have:
An ACK managed cluster running version 1.26 or later
csi-plugin version 1.30.4 or later
csi-provisioner version 1.30.4 or later
Cloud Assistant CLI installed and configured. For more information, see Install Cloud Assistant CLI
Usage notes
Only cloud disks with a serial number support parallel attachment. Cloud disks created before June 10, 2020 do not contain identifiable serial number information and cannot be used with parallel attachment. For instructions on viewing serial numbers, see View Elastic Block Storage serial numbers.
Unmounting multiple cloud disks from the same node remains a serial operation. Only attachment is parallelized.
After enabling parallel attachment, the Device field returned by OpenAPI operations such as ECS DescribeDisks and the mount target displayed in the console may be unreliable. Confirm the actual mount path using the cloud disk's serial number instead.
The maximum number of cloud disks you can attach to a node depends on the ECS instance type. For example, an ecs.g7se.16xlarge instance supports up to 56 attached cloud disks. Check your instance type's disk limit before running the verification test.
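One way to confirm which device belongs to a cloud disk is to match its serial number against the block devices that the node reports. The following is a minimal sketch, not the documented procedure: the helper name device_for_serial is hypothetical, and it assumes a Linux node where lsblk reports a SERIAL column for the attached disks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "NAME SERIAL" pairs on stdin and prints the
# device path of every entry whose serial number equals the first argument.
device_for_serial() {
  awk -v serial="$1" '$2 == serial { print "/dev/" $1 }'
}

# Usage on a node (replace <serial> with the serial number of the cloud disk):
#   lsblk -dn -o NAME,SERIAL | device_for_serial <serial>
```

The helper reads from standard input so that the matching logic can be checked against sample text before it is run on a node.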
Enable parallel cloud disk attachment
Enable parallel attachment using an automated script or by manual configuration.
Use the automated script
Save the following script as enable_parallel_attach.sh.
Run the script.
bash enable_parallel_attach.sh <cluster-id>
Configure manually
Add an ECS tag to the node pool so that all new ECS instances include the tag.
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click Nodes > Node Pools.
On the Node Pools page, click Actions for the target node pool, and then click Edit.
On the Edit Node Pool page, scroll to the Advanced Options section and add an ECS Tags entry with the key supportConcurrencyAttach and the value true.
Add the tag supportConcurrencyAttach: true to the ECS instances of all existing nodes. For more information, see Create and attach custom tags.
Configure the csi-provisioner FeatureGate. In the left navigation pane, click Operations > Add-ons. On the Volumes tab, locate csi-provisioner and click Configure. Set FeatureGate to DiskADController=true,DiskParallelAttach=true.
DiskADController=true delegates cloud disk attach and detach operations to csi-provisioner. DiskParallelAttach=true enables parallel attachment.
Configure the csi-plugin FeatureGate. On the Volumes tab, locate csi-plugin and click Configure. Set FeatureGate to DiskADController=true.
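When many existing nodes need the supportConcurrencyAttach tag, generating the tagging commands for review can be less error-prone than tagging each instance in the console. The sketch below only prints commands and does not run them. It rests on assumptions that are not part of this topic: the Alibaba Cloud CLI (aliyun) is installed, the ECS TagResources operation is called with the parameters shown, and the helper name print_tag_commands is hypothetical. Check the printed commands against the CLI help before running any of them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not execute) one tagging command per
# ECS instance ID. The first argument is the region ID, the rest are instance IDs.
# Assumption: the aliyun CLI exposes ECS TagResources with these parameters.
print_tag_commands() {
  local region="$1"
  shift
  local id
  for id in "$@"; do
    echo "aliyun ecs TagResources --RegionId ${region} --ResourceType instance --ResourceId.1 ${id} --Tag.1.Key supportConcurrencyAttach --Tag.1.Value true"
  done
}

# Usage (placeholders, replace with your own values):
#   print_tag_commands <region-id> <instance-id-1> <instance-id-2>
```

Printing the commands first keeps the step reversible: nothing is changed until you copy a reviewed command into the terminal.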
Verify cloud disk parallel attach performance
This example creates multiple pods that mount cloud disks on the same node to measure the improvement from parallel attachment.
The test data in this topic is for reference only. Actual results may vary depending on your environment.
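Each pod in the verification below mounts two cloud disks, so the number of replicas that fit on one node follows from the disk limit of the instance type. The following sketch shows the arithmetic. The helper name max_replicas is hypothetical, and it assumes that the limit applies only to the disks attached for pods.

```shell
#!/usr/bin/env bash
# Hypothetical helper: integer number of pods that fit on a node, given the
# node's cloud disk limit and the number of cloud disks each pod mounts.
max_replicas() {
  local disk_limit="$1" disks_per_pod="$2"
  echo $(( disk_limit / disks_per_pod ))
}

# Example: a limit of 56 disks with 2 disks per pod.
#   max_replicas 56 2   # prints 28
```

With the ecs.g7se.16xlarge limit of 56 disks and two disks per pod, this gives the 28 replicas used in the test.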
Add a node that supports attaching multiple cloud disks to your ACK cluster. For example, an ecs.g7se.16xlarge instance supports up to 56 attached cloud disks.
Create a file named attach-stress.yaml with the following content. Replace <YOUR-HOSTNAME> with your actual node name.
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alibabacloud-disk
provisioner: diskplugin.csi.alibabacloud.com
parameters:
  type: cloud_auto
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: attach-stress
spec:
  selector:
    matchLabels:
      app: attach-stress
  serviceName: attach-stress
  replicas: 1
  podManagementPolicy: Parallel
  persistentVolumeClaimRetentionPolicy:
    whenScaled: Retain
    whenDeleted: Delete
  template:
    metadata:
      labels:
        app: attach-stress
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - <YOUR-HOSTNAME> # Replace with the actual node name.
      hostNetwork: true
      containers:
      - name: attach-stress
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/busybox
        command: ["/bin/sh", "-c", "trap exit TERM; while true; do date > /mnt/0/data; sleep 1; done"]
        volumeMounts:
        - name: volume-0
          mountPath: /mnt/0
        - name: volume-1
          mountPath: /mnt/1
  volumeClaimTemplates:
  - metadata:
      name: volume-0
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: alibabacloud-disk
      resources:
        requests:
          storage: 1Gi
  - metadata:
      name: volume-1
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: alibabacloud-disk
      resources:
        requests:
          storage: 1Gi
Deploy the StatefulSet and confirm it starts normally, then scale it down to prepare for the batch test.
kubectl apply -f attach-stress.yaml
kubectl rollout status sts attach-stress
kubectl scale sts attach-stress --replicas 0
Expected output:
storageclass.storage.k8s.io/alibabacloud-disk created
statefulset.apps/attach-stress created
partitioned roll out complete: 1 new pods have been updated...
statefulset.apps/attach-stress scaled
Run the baseline test before enabling parallel attachment and record the startup time.
Adjust the replica count based on the maximum number of cloud disks your node supports.
date && \
  kubectl scale sts attach-stress --replicas 28 && \
  kubectl rollout status sts attach-stress && \
  date
Expected output:
Tue Oct 15 19:21:36 CST 2024
statefulset.apps/attach-stress scaled
Waiting for 28 pods to be ready...
Waiting for 27 pods to be ready...
...
Waiting for 3 pods to be ready...
Waiting for 2 pods to be ready...
Waiting for 1 pods to be ready...
partitioned roll out complete: 28 new pods have been updated...
Tue Oct 15 19:24:55 CST 2024
Without parallel attachment, all 28 pods took more than 3 minutes to start.
Enable parallel attachment by following the steps in Enable parallel cloud disk attachment.
Scale down to 0 to clean up the test pods before running the next test.
Monitor the volumeattachments resources in the cluster. Cloud disks are detached after these resources are deleted. This process takes a few minutes.
kubectl scale sts attach-stress --replicas 0
Run the test again to measure startup time with parallel attachment enabled. The startup is expected to take about 40 seconds, a significant improvement over the previous 3 minutes.
date && \
  kubectl scale sts attach-stress --replicas 28 && \
  kubectl rollout status sts attach-stress && \
  date
Expected output:
Tue Oct 15 20:02:54 CST 2024
statefulset.apps/attach-stress scaled
Waiting for 28 pods to be ready...
Waiting for 27 pods to be ready...
...
Waiting for 3 pods to be ready...
Waiting for 2 pods to be ready...
Waiting for 1 pods to be ready...
partitioned roll out complete: 28 new pods have been updated...
Tue Oct 15 20:03:31 CST 2024
Remove the test application from the cluster.
kubectl delete -f attach-stress.yaml
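To compare the two runs, subtract the first timestamp printed by date from the last one in each expected output. The following sketch does this for HH:MM:SS values. The helper names to_seconds and elapsed are hypothetical, and the calculation assumes that both timestamps fall on the same day.

```shell
#!/usr/bin/env bash
# Hypothetical helper: converts an HH:MM:SS timestamp to seconds since midnight.
# The 10# prefix forces base 10 so that values such as "08" are not read as octal.
to_seconds() {
  local h m s
  IFS=: read -r h m s <<< "$1"
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Hypothetical helper: seconds elapsed between two same-day timestamps.
elapsed() {
  echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# Timestamps taken from the expected outputs in this topic:
#   elapsed 19:21:36 19:24:55   # prints 199 (serial attachment)
#   elapsed 20:02:54 20:03:31   # prints 37  (parallel attachment)
```

For the sample outputs, the baseline run takes 199 seconds and the run with parallel attachment takes 37 seconds, which is consistent with the figures of more than 3 minutes and about 40 seconds.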