You can use node pools to control the GPU sharing and memory isolation policies of cGPU. This topic creates two labeled node pools to demonstrate how to run GPU sharing jobs with and without memory isolation in the same cluster.
Prerequisites
- The GPU sharing component (cGPU) is installed in the cluster.
- Node pools are configured.
You can customize the names of the node pools. In this example, the node pools are named cgpu and cgpu-no-isolation.
| Node pool name | GPU sharing | Memory isolation | Labels |
| --- | --- | --- | --- |
| cgpu | Enabled | Enabled | cgpu=true, cgpu.disable.isolation=false |
| cgpu-no-isolation | Enabled | Disabled | cgpu=true, cgpu.disable.isolation=true |
Background information
When you use cGPU in a cluster of Container Service for Kubernetes (ACK), the following scenarios may occur at the same time:
- The amount of GPU memory that can be allocated to Job A is already specified in the script. In this case, the ACK cluster needs only to enable GPU sharing for Job A. No memory isolation is required.
- The amount of GPU memory that can be allocated to Job B is not specified in the script. In this case, the ACK cluster must enable both GPU sharing and memory isolation for Job B.
How do I configure an ACK cluster to support both scenarios?
To resolve this problem, you can use node pools to control cGPU. You must create two node pools:
- Create a node pool that supports only GPU sharing. Do not enable memory isolation. This node pool is used to run Job A.
- Create another node pool that supports both GPU sharing and memory isolation. This node pool is used to run Job B.
Usage notes
When you use node pools to control cGPU, take note of the following limits:
- If a job is not configured with a node selector, the pods of the job may be scheduled to nodes in other node pools. This may cause job execution errors.
  Notice: We recommend that you configure a node selector for each job.
- When a label of a node is changed, you must restart gpushare-device-plugin on the node to make the new memory isolation setting take effect. For example, if the label cgpu.disable.isolation=false on a node is changed to cgpu.disable.isolation=true, you must restart gpushare-device-plugin on that node.
  To restart gpushare-device-plugin, delete its pod on the node. ACK then automatically creates a new pod. Perform the following operations:
- Run the following command to query the pods of gpushare-device-plugin in the ACK cluster:
kubectl get po -n kube-system -l name=gpushare-device-plugin-ds -o wide
The following output is returned:
NAME                              READY   STATUS    RESTARTS   AGE   IP              NODE                        NOMINATED NODE   READINESS GATES
gpushare-device-plugin-ds-6r8gs   1/1     Running   0          18h   192.168.7.157   cn-shanghai.192.168.7.157   <none>           <none>
gpushare-device-plugin-ds-pjrvn   1/1     Running   0          15h   192.168.7.158   cn-shanghai.192.168.7.158   <none>           <none>
- In this example, the pod of gpushare-device-plugin on node cn-shanghai.192.168.7.157 is deleted. ACK then automatically creates a new pod. Run the following command:
kubectl delete po gpushare-device-plugin-ds-6r8gs -n kube-system
- Query the pods of gpushare-device-plugin again to verify that a new pod is created on the node.
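The query command is the same one used in the first step:
kubectl get po -n kube-system -l name=gpushare-device-plugin-ds -o wide
After the restart is complete, a new pod of gpushare-device-plugin with a different name runs on node cn-shanghai.192.168.7.157.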
Step 1: Create node pools
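Create the two node pools described in the Prerequisites section and add the labels listed in the table to the nodes in each pool. If you manage the labels with kubectl rather than in the node pool configuration, the following commands are a minimal sketch; the node names are placeholders taken from the example output above.
# Nodes in node pool cgpu: GPU sharing with memory isolation.
kubectl label nodes cn-shanghai.192.168.7.157 cgpu=true cgpu.disable.isolation=false
# Nodes in node pool cgpu-no-isolation: GPU sharing without memory isolation.
kubectl label nodes cn-shanghai.192.168.7.158 cgpu=true cgpu.disable.isolation=true
# Confirm that the labels are in place.
kubectl get nodes -L cgpu,cgpu.disable.isolation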
Step 2: Submit jobs
Submit two jobs named cgpu-test and cgpu-test-no-isolation. You must set nodeSelector in the YAML files of both jobs.
- cgpu-test: The amount of GPU memory to be allocated to this job is not specified in the script of the job. Therefore, memory isolation is required to run this job without errors. The following YAML template is an example:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cgpu-test
  labels:
    app: cgpu-test
spec:
  replicas: 1
  serviceName: "cgpu-test"
  podManagementPolicy: "Parallel"
  selector: # Define how the StatefulSet finds the pods it manages.
    matchLabels:
      app: cgpu-test
  template: # Define the pod specifications.
    metadata:
      labels:
        app: cgpu-test
    spec:
      nodeSelector: # Add a node selector and select node pool cgpu.
        cgpu.disable.isolation: "false"
      containers:
      - name: cgpu-test
        image: registry.cn-shanghai.aliyuncs.com/tensorflow-samples/tensorflow-gpu-mem:10.0-runtime-centos7
        command:
        - python3
        - /app/main.py
        resources:
          limits: # Apply for 3 GiB of GPU memory.
            aliyun.com/gpu-mem: 3
Note:
- nodeSelector: Select node pool cgpu.
- cgpu.disable.isolation=false: Schedule the job to nodes in node pool cgpu.
- aliyun.com/gpu-mem: Specify the amount of GPU memory requested by the job, in GiB.
- cgpu-test-no-isolation: The amount of GPU memory to be allocated to this job per GPU is specified in the script of the job. Therefore, memory isolation is not required. The following YAML template is an example:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cgpu-test-no-isolation
  labels:
    app: cgpu-test-no-isolation
spec:
  replicas: 1
  serviceName: "cgpu-test-no-isolation"
  podManagementPolicy: "Parallel"
  selector: # Define how the StatefulSet finds the pods it manages.
    matchLabels:
      app: cgpu-test-no-isolation
  template: # Define the pod specifications.
    metadata:
      labels:
        app: cgpu-test-no-isolation
    spec:
      nodeSelector: # Add a node selector and select node pool cgpu-no-isolation.
        cgpu.disable.isolation: "true"
      containers:
      - name: cgpu-test-no-isolation
        image: cheyang/gpu-player:v2
        resources:
          limits: # Apply for 3 GiB of GPU memory.
            aliyun.com/gpu-mem: 3
Note:
- nodeSelector: Select node pool cgpu-no-isolation.
- cgpu.disable.isolation=true: Schedule the job to nodes in node pool cgpu-no-isolation.
- aliyun.com/gpu-mem: Specify the amount of GPU memory requested by the job, in GiB.
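After you prepare the YAML files, submit the two jobs and check whether they are scheduled to the expected node pools. The following commands are a minimal sketch: the file names cgpu-test.yaml and cgpu-test-no-isolation.yaml are placeholders for wherever you saved the templates above, and the pod name cgpu-test-0 follows from the StatefulSet name cgpu-test with one replica.
# Submit the two jobs.
kubectl apply -f cgpu-test.yaml
kubectl apply -f cgpu-test-no-isolation.yaml
# Check that the pods are running on nodes in the expected node pools.
kubectl get po -o wide
# On the node pool with memory isolation enabled, nvidia-smi inside the container
# should report only about the requested 3 GiB of GPU memory.
kubectl exec -it cgpu-test-0 -- nvidia-smi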