Non-volatile memory (NVM) is a type of persistent memory (PMEM) product provided by Intel. NVM lets you expand memory capacity at lower cost and access persistent data with lower latency, combining the benefits of memory and storage products. This topic describes how to use NVM volumes in Container Service for Kubernetes (ACK) clusters and provides examples.
Background information
PMEM provides high-performance memory that supports data persistence. PMEM resides on the memory bus and allows you to access data in the same way as when you use dynamic random access memory (DRAM). PMEM provides almost the same speed and latency as DRAM and the non-volatility of NAND flash. PMEM provides the following benefits:
- Lower latency than flash SSDs for data access.
- Higher throughput than flash storage.
- Lower costs than DRAM.
- Data caching in the CPU. This resolves the issue that data transmitted through Peripheral Component Interconnect Express (PCIe) cannot be cached in the CPU.
- Real-time access to data and ultra-high-speed access to large datasets.
- Data retention after power-off, the same benefit that flash memory provides.
The re6p Elastic Compute Service (ECS) instance family supports the first generation of PMEM and the re7p ECS instance family supports the second generation of PMEM.
The following ECS instance families support the first generation of PMEM:
- re6p, persistent memory-optimized instance family. For more information, see re6p, persistent memory-optimized instance family.
- ebmre6p, persistent memory-optimized ECS Bare Metal Instance family with enhanced performance. For more information, see ebmre6p, persistent memory-optimized ECS Bare Metal Instance family with enhanced performance.
How to use NVM volumes
You can use the Container Storage Interface (CSI) driver that is provided by Alibaba Cloud to manage the lifecycle of NVM devices in ACK clusters. This allows you to allocate, mount, and use NVM resources by using declarative claims.
- PMEM-LVM (use NVM as non-intrusive block storage)
You can directly claim NVM resources without modifying your applications. Logical Volume Manager (LVM) virtualizes the PMEM resources on a node into volume groups (VGs), from which you can create persistent volume claims (PVCs) of the required type and capacity. This suits applications that can use NVM without modification: serverless applications, low-latency and high-throughput data computing applications, and applications with short CI/CD cycles that require high-speed temporary storage. This can improve I/O throughput by 2 to 10 times. For more examples, see Use AEP non-volatile memory to improve read and write performance.
- PMEM-direct memory
You can use PMEM as direct memory by modifying the memory allocation functions of your applications. This allows you to access data in almost the same way as with DRAM, provision TB-level NVM resources as direct memory, and reduce costs by 30% to 50%. This meets the large-memory and cost-effectiveness requirements of in-memory databases such as Redis and SAP HANA. For more examples, see Deploy a Redis instance that has an NVM volume mounted as direct memory.
- PMEM-LVM: NVM resources can be used as block storage or file systems in ACK clusters without intrusion or modification to your applications. The I/O throughput is 2 to 10 times that of SSDs.
- PMEM-direct memory: NVM resources can be used as direct memory in ACK clusters. You must modify the applications so that they are adaptive to the logic in PMEM SDK for memory allocation. This offers high throughput and low latency that are comparable to DRAM.
Method | Support for fragmented storage | Support for online expansion | Support for memory persistence | Application modification required | Latency (4K/RW) | Throughput (4K/RW) | Maximum capacity of a single ECS instance (ecs.ebmre6p.26xlarge)
---|---|---|---|---|---|---|---
PMEM-LVM | No | Yes | Yes | No | 10 us | 100K IOPS | 1536 GB
PMEM-direct | Yes | No | No | Yes | 1.2 us | 560K IOPS | 768 GB
SSD | No | Yes | Yes | No | 100 us | 10K IOPS | 32 TB
Deploy CSI components
- CSI-Plugin: initializes PMEM devices and creates, deletes, mounts, and unmounts volumes.
- CSI-Provisioner: detects and initiates volume creation and deletion requests.
- CSI-Scheduler: schedules storage (The ACK scheduler is a preinstalled component).
- To enable automatic O&M for NVM devices, you must add the `pmem.csi.alibabacloud.com/type` label to each node that uses NVM.
- To use the PMEM-LVM method, add the `pmem.csi.alibabacloud.com/type: lvm` label to the node that uses NVM.
- To use the PMEM-direct memory method, add the `pmem.csi.alibabacloud.com/type: direct` label to the node that uses NVM.
- Create an ACK cluster. The cluster must contain ECS instances with PMEM resources, such as instances of ecs.ebmre6p.26xlarge. For more information, see Create an ACK managed cluster.
- Configure the node to use PMEM resources. To ensure that the CSI plug-in works as expected, add one of the following labels to the node:
  - To use the PMEM-LVM method: `pmem.csi.alibabacloud.com/type: lvm`
  - To use the PMEM-direct memory method: `pmem.csi.alibabacloud.com/type: direct`
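The labeling step above can be sketched as an excerpt of the resulting node object. The node name below is a placeholder; the label value selects the method:

```yaml
# Excerpt of a labeled node object (node name is a placeholder)
apiVersion: v1
kind: Node
metadata:
  name: cn-zhangjiakou.192.168.XX.XX
  labels:
    pmem.csi.alibabacloud.com/type: lvm   # use "direct" for the PMEM-direct memory method
```

You can apply the label with the `kubectl label node` command or through the node configuration when you create the cluster.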
- Deploy the CSI plug-in for PMEM.
```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: localplugin.csi.alibabacloud.com
spec:
  attachRequired: false
  podInfoOnMount: true
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: csi-local-plugin
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: csi-local-plugin
  template:
    metadata:
      labels:
        app: csi-local-plugin
    spec:
      tolerations:
        - operator: Exists
      serviceAccount: admin
      priorityClassName: system-node-critical
      hostNetwork: true
      hostPID: true
      containers:
        - name: driver-registrar
          image: registry.cn-hangzhou.aliyuncs.com/acs/csi-node-driver-registrar:v1.3.0-6e9fff3-aliyun
          imagePullPolicy: Always
          args:
            - "--v=5"
            - "--csi-address=/csi/csi.sock"
            - "--kubelet-registration-path=/var/lib/kubelet/csi-plugins/localplugin.csi.alibabacloud.com/csi.sock"
          env:
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
        - name: csi-localplugin
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
          image: registry.cn-hangzhou.aliyuncs.com/acs/csi-plugin:v1.20.6-2be29b1-aliyun
          imagePullPolicy: "Always"
          args:
            - "--endpoint=$(CSI_ENDPOINT)"
            - "--v=5"
            - "--nodeid=$(KUBE_NODE_NAME)"
            - "--driver=localplugin.csi.alibabacloud.com"
          env:
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
            - name: CSI_ENDPOINT
              value: unix://var/lib/kubelet/csi-plugins/localplugin.csi.alibabacloud.com/csi.sock
          volumeMounts:
            - name: pods-mount-dir
              mountPath: /var/lib/kubelet
              mountPropagation: "Bidirectional"
            - mountPath: /dev
              mountPropagation: "HostToContainer"
              name: host-dev
            - mountPath: /var/log/
              name: host-log
      volumes:
        - name: plugin-dir
          hostPath:
            path: /var/lib/kubelet/csi-plugins/localplugin.csi.alibabacloud.com
            type: DirectoryOrCreate
        - name: registration-dir
          hostPath:
            path: /var/lib/kubelet/plugins_registry
            type: DirectoryOrCreate
        - name: pods-mount-dir
          hostPath:
            path: /var/lib/kubelet
            type: Directory
        - name: host-dev
          hostPath:
            path: /dev
        - name: host-log
          hostPath:
            path: /var/log/
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
    type: RollingUpdate
```
```yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: csi-local-provisioner
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: csi-local-provisioner
  replicas: 2
  template:
    metadata:
      labels:
        app: csi-local-provisioner
    spec:
      tolerations:
        - operator: "Exists"
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: Exists
      priorityClassName: system-node-critical
      serviceAccount: admin
      hostNetwork: true
      containers:
        - name: external-local-provisioner
          image: registry.cn-hangzhou.aliyuncs.com/acs/csi-provisioner:v1.6.0-b6f763a43-ack
          args:
            - "--csi-address=$(ADDRESS)"
            - "--feature-gates=Topology=True"
            - "--volume-name-prefix=disk"
            - "--strict-topology=true"
            - "--timeout=150s"
            - "--extra-create-metadata=true"
            - "--enable-leader-election=true"
            - "--leader-election-type=leases"
            - "--retry-interval-start=500ms"
            - "--v=5"
          env:
            - name: ADDRESS
              value: /socketDir/csi.sock
          imagePullPolicy: "Always"
          volumeMounts:
            - name: socket-dir
              mountPath: /socketDir
        - name: external-local-resizer
          image: registry.cn-hangzhou.aliyuncs.com/acs/csi-resizer:v0.3.0
          args:
            - "--v=5"
            - "--csi-address=$(ADDRESS)"
            - "--leader-election"
          env:
            - name: ADDRESS
              value: /socketDir/csi.sock
          imagePullPolicy: "Always"
          volumeMounts:
            - name: socket-dir
              mountPath: /socketDir/
      volumes:
        - name: socket-dir
          hostPath:
            path: /var/lib/kubelet/csi-plugins/localplugin.csi.alibabacloud.com
            type: DirectoryOrCreate
```
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pmem-direct
provisioner: localplugin.csi.alibabacloud.com
mountOptions:
  - dax
parameters:
  volumeType: PMEM
  pmemType: "direct"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pmem-lvm
provisioner: localplugin.csi.alibabacloud.com
mountOptions:
  - dax
parameters:
  volumeType: PMEM
  nodeAffinity: "true"
  pmemType: "lvm"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```
Examples
Use AEP as block storage volumes
- Create a PVC with the following template:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.kubernetes.io/selected-node: cn-zhangjiakou.192.168.XX.XX
  name: pmem-lvm
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: pmem-lvm
```
To schedule the PVC to a specific NVM node, add the `volume.kubernetes.io/selected-node` annotation to the PVC configuration, as shown in the preceding template.
- Deploy a workload with the following template:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sts-lvm
  labels:
    app: busybox-lvm
spec:
  selector:
    matchLabels:
      app: busybox-lvm
  serviceName: "busybox"
  template:
    metadata:
      labels:
        app: busybox-lvm
    spec:
      containers:
        - name: busybox
          image: busybox
          command: ["sh", "-c"]
          args: ["sleep 10000"]
          volumeMounts:
            - name: pmem-pvc
              mountPath: "/data"
      volumes:
        - name: pmem-pvc
          persistentVolumeClaim:
            claimName: pmem-lvm
```
- View the results.
- Run the following command to query the created PVC:
kubectl get pvc
Expected output:
```
NAME       STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pmem-lvm   Bound    disk-****   10Gi       RWO            pmem-lvm       10m
```
- Run the following command to query the created pod:
kubectl get pod
Expected output:
```
NAME        READY   STATUS    RESTARTS   AGE
sts-lvm-0   1/1     Running   0          10m
```
- Run the following command to access the application and check the mount path of the volume:
kubectl exec -ti sts-lvm-0 -- df /data
Expected output:
```
Filesystem                            1K-blocks  Used   Available  Use%  Mounted on
/dev/mapper/pmemvgregion0-disk--****  10255636   36888  10202364   1%    /data
```
The output shows that a block storage volume is created and mounted to the application pod.
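Because the pmem-lvm StorageClass sets allowVolumeExpansion: true, a PMEM-LVM volume can also be expanded online by raising the storage request of the PVC. The following is a minimal sketch; the 20Gi target size is illustrative:

```yaml
# Edited PVC with an increased storage request (online expansion sketch)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pmem-lvm
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi   # increased from the original 10Gi
  storageClassName: pmem-lvm
```

After the edit is applied, the CSI resizer component expands the logical volume, and the new capacity appears in the output of `kubectl get pvc`.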
Use NVM as direct memory volumes
- Create a PVC with the following template:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.kubernetes.io/selected-node: cn-zhangjiakou.192.168.XX.XX
  name: pmem-direct
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 9Gi
  storageClassName: pmem-direct
```
To schedule the PVC to a specific NVM node, add the `volume.kubernetes.io/selected-node` annotation to the PVC configuration, as shown in the preceding template.
- Deploy a workload with the following template:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sts-direct
  labels:
    app: busybox-direct
spec:
  selector:
    matchLabels:
      app: busybox-direct
  serviceName: "busybox"
  template:
    metadata:
      labels:
        app: busybox-direct
    spec:
      containers:
        - name: busybox
          image: busybox
          command: ["sh", "-c"]
          args: ["sleep 1000"]
          volumeMounts:
            - name: pmem-pvc
              mountPath: "/data"
      volumes:
        - name: pmem-pvc
          persistentVolumeClaim:
            claimName: pmem-direct
```
- View the results.
- Run the following command to query information about the PVC:
kubectl get pvc pmem-direct
Expected output:
```
NAME          STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pmem-direct   Bound    disk-****   9Gi        RWO            pmem-direct    17m
```
- Run the following command to query the pod:
kubectl get pod
Expected output:
```
NAME           READY   STATUS    RESTARTS   AGE
sts-direct-0   1/1     Running   0          17m
```
- Run the following command to access the application and check the mount path of the volume:
kubectl exec -ti sts-direct-0 -- df /data
Expected output:
```
Filesystem   1K-blocks  Used   Available  Use%  Mounted on
/dev/pmem0   9076344    36888  9023072    1%    /data
```
The output shows that a PMEM volume is created and mounted to the application pod.
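For the Redis use case mentioned earlier, a direct-memory volume is consumed the same way as in the busybox example. The following is a minimal, hypothetical sketch, not the full deployment described in Deploy a Redis instance that has an NVM volume mounted as direct memory:

```yaml
# Hypothetical sketch: Redis with a PMEM-direct volume mounted at /data
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-pmem
spec:
  selector:
    matchLabels:
      app: redis-pmem
  serviceName: "redis"
  template:
    metadata:
      labels:
        app: redis-pmem
    spec:
      containers:
        - name: redis
          image: redis
          volumeMounts:
            - name: pmem-pvc
              mountPath: "/data"   # Redis working directory backed by PMEM
      volumes:
        - name: pmem-pvc
          persistentVolumeClaim:
            claimName: pmem-direct
```

Because the pmem-direct StorageClass mounts the volume with the dax option, reads and writes under /data bypass the page cache and go directly to persistent memory.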