By default, the smallest allocatable unit is 1 GiB for GPU sharing. If you require finer-grained GPU memory allocation, you can modify the smallest allocatable unit. This topic describes how to change the smallest allocatable unit to 128 MiB for GPU sharing.
Prerequisites
- ack-ai-installer must be installed in your cluster before you can modify the smallest allocatable unit for GPU sharing. For more information about how to install ack-ai-installer, see Install and use ack-ai-installer and the GPU inspection tool.
- If ack-ai-installer is already installed in your cluster, you must uninstall it and then install it again. Set gpuMemoryUnit to 128 MiB when you reinstall ack-ai-installer.
- Your cluster is a Container Service for Kubernetes (ACK) Pro cluster that runs Kubernetes 1.18.8 or later. For more information about how to create and update ACK Pro clusters, see Create a professional managed Kubernetes cluster and UpgradeCluster.
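If you manage the component with Helm, the gpuMemoryUnit parameter can be supplied through a values file when you reinstall ack-ai-installer. The following is only a sketch: the key name gpuMemoryUnit comes from the prerequisite above, but its exact position in the chart's values file may differ by component version, so verify it against your installed chart.

```yaml
# Hypothetical Helm values fragment for reinstalling ack-ai-installer.
# The key name gpuMemoryUnit is taken from the prerequisite above; its
# exact location in the chart's values file may vary by version.
gpuMemoryUnit: 128MiB
```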
Usage notes
- If the aliyun.com/gpu-mem field is configured for a pod, the pod requests GPU resources. If your cluster contains pods that request GPU resources, you cannot change the smallest allocatable unit for GPU sharing from 1 GiB to 128 MiB or from 128 MiB to 1 GiB. You must delete these pods before you can modify the smallest allocatable unit for GPU sharing.
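Before changing the unit, you therefore need to identify every pod that requests aliyun.com/gpu-mem. A minimal sketch of that check in plain Python, operating on the JSON structure returned by `kubectl get pods -A -o json` (the sample pod entries below are hypothetical):

```python
# Sketch: find pods that request the aliyun.com/gpu-mem resource and
# therefore block changing the smallest allocatable unit.
# pod_list mirrors the structure of `kubectl get pods -A -o json`;
# the sample entries are hypothetical.
pod_list = {
    "items": [
        {"metadata": {"namespace": "default", "name": "binpack-0"},
         "spec": {"containers": [
             {"name": "binpack-1",
              "resources": {"limits": {"aliyun.com/gpu-mem": "16"}}}]}},
        {"metadata": {"namespace": "default", "name": "web-0"},
         "spec": {"containers": [
             {"name": "nginx", "resources": {}}]}},
    ]
}

def pods_requesting_gpu_mem(pods):
    """Return (namespace, name) of pods with an aliyun.com/gpu-mem limit."""
    blocking = []
    for pod in pods["items"]:
        for container in pod["spec"]["containers"]:
            limits = container.get("resources", {}).get("limits", {})
            if "aliyun.com/gpu-mem" in limits:
                blocking.append((pod["metadata"]["namespace"],
                                 pod["metadata"]["name"]))
                break
    return blocking

print(pods_requesting_gpu_mem(pod_list))  # [('default', 'binpack-0')]
```

Any pod that this check reports must be deleted before you switch between the 1 GiB and 128 MiB units.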
- You can modify the smallest allocatable unit only for nodes for which GPU sharing is enabled but memory isolation is disabled. These nodes have the ack.node.gpu.schedule=share label. For nodes that have the ack.node.gpu.schedule=cgpu label, both GPU sharing and memory isolation are enabled. A GPU can be shared by up to 16 pods due to the limits of the memory isolation module. Therefore, you can create at most 16 pods to share a GPU with 32 GiB of memory even if you change the smallest allocatable unit to 128 MiB.
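The arithmetic behind the 16-pod cap can be sketched as follows (plain Python; the numbers are taken from the note above):

```python
# A 32 GiB GPU divided into 128 MiB units yields 256 allocatable units,
# but the memory isolation module caps sharing at 16 pods per GPU.
GPU_MEMORY_MIB = 32 * 1024   # 32 GiB expressed in MiB
UNIT_MIB = 128               # smallest allocatable unit
MAX_PODS_PER_GPU = 16        # limit imposed by the memory isolation module

units = GPU_MEMORY_MIB // UNIT_MIB
print(units)                          # 256 allocatable units
print(min(units, MAX_PODS_PER_GPU))  # at most 16 pods can share the GPU
```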
- If you set the smallest allocatable unit to 128 MiB, nodes in the cluster cannot be automatically scaled even if you enable auto scaling for nodes. For example, you set aliyun.com/gpu-mem to 32 for a pod. In this case, if the available GPU memory in the cluster is insufficient to meet the memory request of the pod, no new node is added and the pod remains in the Pending state.
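With a 128 MiB unit, an aliyun.com/gpu-mem value of 32 translates to 4 GiB of GPU memory. A small conversion helper in plain Python (the function name is ours, for illustration only):

```python
UNIT_MIB = 128  # smallest allocatable unit after the change

def gpu_mem_to_gib(gpu_mem_units):
    """Convert an aliyun.com/gpu-mem value into GiB under a 128 MiB unit."""
    return gpu_mem_units * UNIT_MIB / 1024

print(gpu_mem_to_gib(32))  # 4.0 GiB, as in the auto scaling note above
print(gpu_mem_to_gib(16))  # 2.0 GiB, matching the example in this topic
```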
- If you use a cluster that is created before October 20, 2021, you must Submit a ticket to restart the scheduler. The modification to the smallest allocatable unit is applied only after the scheduler restarts.
Modify the smallest allocatable unit
Examples
Use the following code block to create a pod. In the pod configuration, aliyun.com/gpu-mem
is set to 16. The smallest allocatable unit is 128 MiB. Therefore, the amount of
GPU memory requested by the pod is 2 GiB (16 × 128 MiB).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: binpack
  labels:
    app: binpack
spec:
  replicas: 1
  serviceName: "binpack-1"
  podManagementPolicy: "Parallel"
  selector: # Define how the StatefulSet finds the pods it manages.
    matchLabels:
      app: binpack-1
  template: # The pod specifications.
    metadata:
      labels:
        app: binpack-1
    spec:
      containers:
      - name: binpack-1
        image: registry.cn-beijing.aliyuncs.com/ai-samples/gpushare-sample:tensorflow-1.5
        command:
        - bash
        - gpushare/run.sh
        resources:
          limits:
            # The smallest allocatable unit is 128 MiB.
            aliyun.com/gpu-mem: 16 # 16 × 128 MiB = 2 GiB