This guide covers best practices for using disk volumes in ACS Clusters. It includes recommendations for StorageClass configurations and application settings for volumes used as ephemeral or persistent storage.
Overview of block storage volumes
Alibaba Cloud Elastic Block Service (EBS) provides a high-throughput, low-latency persistent storage solution for compute instances and is the core infrastructure for compute-intensive scenarios such as high-performance computing (HPC), artificial intelligence (AI), and big data analytics. For more information, see Overview of Block Storage.
Container Compute Service (ACS) supports the following types of disk volumes:
- `cloud_essd_entry`: ESSD Entry disk
- `cloud_auto`: ESSD AutoPL disk
- `cloud_essd` (default): ESSD
- `cloud_ssd`: standard SSD
- `cloud_efficiency`: ultra disk

You can combine these parameters, for example, `type: cloud_efficiency,cloud_ssd,cloud_essd`. The system tries to create a disk of each specified type in order until one is created.
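As a sketch, the fallback order described above goes in the `parameters.type` field of a StorageClass. The StorageClass name below is illustrative:

```yaml
# Illustrative StorageClass showing a multi-type fallback list.
# The provisioner tries cloud_efficiency first, then cloud_ssd, then cloud_essd.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-disk-fallback   # hypothetical name
provisioner: diskplugin.csi.alibabacloud.com
parameters:
  type: cloud_efficiency,cloud_ssd,cloud_essd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```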
Legacy products such as Standard SSD and Ultra Disk are gradually being phased out in some regions and zones. These disk types are not supported by ACS GPU clusters. We recommend using the ESSD disk series to ensure compatibility.
StorageClass configuration recommendations
Use a StorageClass to enable dynamic provisioning and topology-aware scheduling of storage resources.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-essd
parameters:
  type: cloud_auto,cloud_essd
  fstype: xfs
provisioner: diskplugin.csi.alibabacloud.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```

Configure the core parameters of a StorageClass based on the following principles:

- Bind to a storage type: Specify the disk type by using the `parameters.type` field. The example `cloud_auto,cloud_essd` indicates that the system prioritizes `cloud_auto` (ESSD AutoPL disks) and uses `cloud_essd` (ESSD disks) as a fallback. For ESSD disks, use the `performanceLevel` parameter to define a performance level (PL0 to PL3) to match the I/O requirements of scenarios like HPC and AI training.
- Optimize the binding mode: Set `volumeBindingMode` to `WaitForFirstConsumer`. In this mode, the system schedules the Pod first and then provisions the cloud disk in the same zone as the Pod. Combined with Alibaba Cloud's elastic provisioning, this ensures topology-aware scheduling, preventing cases where a Pod is scheduled in a different zone than its associated persistent volume claim (PVC) and persistent volume (PV).
- Select a file system: Specify the file system type by using the `parameters.fstype` field. We recommend XFS. It supports large metadata blocks and delayed logging, which improves performance for large-scale parallel I/O operations.
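For instance, a minimal sketch of a StorageClass that pins ESSD disks to a specific performance level. The name and the PL1 choice are illustrative:

```yaml
# Illustrative StorageClass requesting ESSD disks at performance level PL1.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-essd-pl1        # hypothetical name
provisioner: diskplugin.csi.alibabacloud.com
parameters:
  type: cloud_essd
  performanceLevel: PL1          # PL0 to PL3; choose based on workload I/O needs
  fstype: xfs
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```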
How to use block storage volumes in ACS
Comparison and selection
| Characteristic | Ephemeral storage | Persistent storage |
| --- | --- | --- |
| Data lifecycle | Bound to the Pod. Data is destroyed when the Pod is deleted. | Independent of the Pod. You must manually delete the PVC and PV. |
| Data recovery | Not available. Data loss is irreversible. | Supported. The PVC is retained, which allows the Pod to be rebuilt. |
| Storage cost | Billed on demand for the duration of the Pod's lifecycle. | Billed continuously until the storage resource is released. |
| Use cases | Temporary caches, intermediate computation results. | Databases, logging systems, stateful applications. |
Ephemeral storage is not supported in ACS clusters running Kubernetes version 1.24 or earlier.
Based on the characteristics above, we recommend the following:
- Use ephemeral storage if your application data can tolerate loss and requires rapid elastic scaling, such as for stateless web applications.
- Choose persistent storage for scenarios that require data persistence, data sharing across Pods, or high availability, such as for a MySQL cluster in a Kubernetes environment.
By selecting the appropriate storage solution, you can balance resource utilization and data reliability in your ACS cluster. You can also leverage Alibaba Cloud storage services, such as disk snapshots and cross-zone replication, to enhance your system's disaster recovery capabilities.
Ephemeral storage
Ephemeral storage is a lightweight Kubernetes storage solution implemented with the ephemeral volume type. Its lifecycle is bound to the Pod. When a Pod is deleted, the associated PVC and PV are automatically destroyed, and the data cannot be recovered. This mechanism is ideal for temporary storage, such as caches, log buffers, intermediate computation results, and temporary data processing.
This storage type also offers the following advantages:
- Resource isolation: Each Pod has its own independent storage resource, which avoids the performance contention or data corruption that can occur with shared storage.
- Simplified declarative configuration: The configuration is embedded directly in the Pod template with a `volumeClaimTemplate`. This eliminates the need to create PVCs and PVs in advance and simplifies resource management.
The following YAML shows a `Deployment` that mounts a disk volume as ephemeral storage:
In the YAML example, note the following key configurations:
```yaml
volumes:
  - name: scratch-volume
    ephemeral: # Explicitly claim ephemeral storage.
      volumeClaimTemplate:
        metadata:
          labels:
            type: scratch-volume
        spec:
          accessModes: [ "ReadWriteOncePod" ]
          storageClassName: alicloud-essd
          resources:
            requests:
              storage: 30Gi
```

- Select a StorageClass: Use `storageClassName` to select a pre-created StorageClass.
- Configure the access mode: Set `accessModes` to `ReadWriteOncePod` (RWOP). RWOP is the recommended mode for ephemeral volumes. It allows the volume to be mounted as read-write by a single Pod, which ensures data consistency.
- Plan storage capacity: Adjust the `storage` value based on the actual needs of your Pod. Setting this value too low can lead to storage shortages, while setting it too high wastes resources. Monitor the `ephemeral-storage` usage of Pods by using tools such as Prometheus or CloudMonitor to optimize resource allocation.
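Putting the pieces together, the following is a minimal sketch of a complete Deployment that mounts the ephemeral volume described above. The workload name, container image, and mount path are illustrative assumptions:

```yaml
# Illustrative Deployment mounting a disk-backed ephemeral volume.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scratch-demo            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: scratch-demo
  template:
    metadata:
      labels:
        app: scratch-demo
    spec:
      containers:
        - name: app
          image: nginx:latest   # illustrative image
          volumeMounts:
            - name: scratch-volume
              mountPath: /tmp/scratch   # illustrative mount path
      volumes:
        - name: scratch-volume
          ephemeral:
            volumeClaimTemplate:
              spec:
                accessModes: [ "ReadWriteOncePod" ]
                storageClassName: alicloud-essd
                resources:
                  requests:
                    storage: 30Gi
```

When the Deployment scales down or a Pod is deleted, the generated PVC and its backing disk are removed with the Pod, matching the ephemeral lifecycle described above.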
Persistent storage
Persistent storage is implemented with a StatefulSet controller. Its core principle is that the lifecycle of the PVC and PV is independent of the Pod. Even if a Pod is terminated due to scaling down, failure, or manual deletion, the PVC and PV retain the data and can be rebound to a new Pod. This mechanism is suitable for stateful applications, such as databases and message queues, or scenarios that require data sharing across Pods.
A StatefulSet ensures storage persistence through the following mechanisms:
- Independent PVC management: Each Pod is associated with an independent PVC. The PVCs are automatically generated from the `volumeClaimTemplates`, but their lifecycle is decoupled from the Pod.
- Ordered recovery: When a Pod is rebuilt, the `StatefulSet` controller prioritizes recovering the storage volume to ensure data consistency.
The following YAML shows a StatefulSet that mounts a disk volume as persistent storage:
In the YAML example, note the following key configurations:
```yaml
volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: alicloud-essd
      resources:
        requests:
          storage: 50Gi
```

- Select a StorageClass: Use `storageClassName` to select a pre-created StorageClass.
- Configure the access mode: Set `accessModes` to `ReadWriteOnce` (RWO). This ensures that the storage volume is mounted by only a single node, which meets the access requirements of a single-instance database.
- Ensure data persistence: The PVC is retained after the Pod is deleted. If you force a Pod to be rebuilt by running the `kubectl delete pod -n <namespace> <pod-name>` command, the new Pod rebinds to the original PVC.
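As a sketch, a complete StatefulSet wrapping the `volumeClaimTemplates` shown above. The workload name, headless Service name, container image, and mount path are illustrative assumptions:

```yaml
# Illustrative single-replica MySQL StatefulSet with a persistent disk volume.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql                   # hypothetical name
spec:
  serviceName: mysql            # assumes a matching headless Service exists
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0      # illustrative image
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: mysql-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: alicloud-essd
        resources:
          requests:
            storage: 50Gi
```

Because the PVC name is derived from the template name and the Pod ordinal (for example, `mysql-data-mysql-0`), a rebuilt Pod with the same ordinal rebinds to the same PVC and recovers its data.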