Container Service for Kubernetes: Use eGPU to share and schedule GPU resources

Last Updated: Dec 07, 2023

This topic describes how to use eGPU to schedule and isolate GPU resources on the Lingjun nodes in a Container Service for Kubernetes (ACK) Lingjun managed cluster.

Prerequisites

An ACK Lingjun managed cluster is created and the cluster contains GPU-accelerated Lingjun nodes.

Note
  • By default, an eGPU-based GPU sharing and scheduling component is installed in ACK Lingjun managed clusters to allow you to directly use the GPU sharing and scheduling feature. For more information about how to check whether the eGPU-based GPU sharing and scheduling component is installed, see How do I check whether the eGPU-based GPU sharing and scheduling component is installed in my cluster?

  • The eGPU-based GPU sharing and scheduling component does not limit the instance types that you can use. However, H800 Lingjun nodes do not support GPU memory isolation or computing power isolation because eGPU does not support all features of these nodes. If you want to use GPU memory isolation and computing power isolation, use other types of Lingjun nodes.

Step 1: Enable GPU sharing and scheduling

To enable GPU sharing and scheduling for a Lingjun node, perform the following steps:

  1. Check whether the /etc/lingjun_metadata file exists on the node.

    • If the file exists, run the nvidia-smi command. If no error is returned, the node for which you want to enable GPU sharing and scheduling is a Lingjun node. You can proceed to the next step.

    • If the file does not exist, the node for which you want to enable GPU sharing and scheduling is not a Lingjun node. You cannot enable this feature for the node. To enable GPU sharing and scheduling in this case, you need to create a Lingjun node. For more information, see Lingjun node pools.

  2. Run the following command to add the ack.node.gpu.schedule label to the node to enable GPU sharing and scheduling:

    kubectl label node <NODE_NAME> ack.node.gpu.schedule=<SHARE_MODE>
    Note

    If the value of the label is egpu_mem, only GPU memory is isolated. If the value of the label is egpu_core_mem, both GPU memory and GPU computing power are isolated. GPU computing power cannot be requested separately and must be requested together with GPU memory. GPU memory can be requested separately.
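
    The following shell sketch combines the node check and the labeling command described above. It is only a reference: the node name lingjun-node-1 is a placeholder, the first two commands must be run on the node itself, and kubectl can be run from any machine that has access to the cluster.

    # Assumption: the node is named lingjun-node-1 (placeholder).
    # Run on the node: confirm that it is a GPU-accelerated Lingjun node.
    ls /etc/lingjun_metadata   # the file must exist
    nvidia-smi                 # must complete without errors

    # Run with kubectl: enable GPU memory and computing power isolation.
    kubectl label node lingjun-node-1 ack.node.gpu.schedule=egpu_core_mem

    # To isolate only GPU memory, use egpu_mem instead:
    # kubectl label node lingjun-node-1 ack.node.gpu.schedule=egpu_mem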

Step 2: Use shared GPU resources

In this example, the value of the label is set to egpu_core_mem.

  1. Wait until the node reports the GPU information.

  2. Run the following command to query the resources on the node:

    kubectl get node <NODE_NAME> -oyaml

    Expected output:

      allocatable:
        aliyun.com/gpu-core.percentage: "100"
        aliyun.com/gpu-count: "1"
        aliyun.com/gpu-mem: "80"
        ...
        nvidia.com/gpu: "0"
        ...
      capacity:
        aliyun.com/gpu-core.percentage: "100"
        aliyun.com/gpu-count: "1"
        aliyun.com/gpu-mem: "80
        ...
        nvidia.com/gpu: "0"
        ...

    The output indicates that the aliyun.com/gpu-mem and aliyun.com/gpu-core.percentage resources are available.

  3. Use shared GPU resources. For more information, see Configure the GPU sharing component.

    Note

    If you want to allocate an entire GPU to a pod when scheduling the pod, add the ack.gpushare.placement=require-whole-device label to the pod and specify the requested amount of GPU memory in gpu-mem. Then, a GPU that can provide the requested amount of GPU memory is automatically allocated to the pod.
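
    For reference, the following pod sketch requests the shared GPU resources reported in the previous step: 10 GB of GPU memory and 30% of the computing power of one GPU. This is a minimal sketch, not an official example; the pod name is a placeholder and the image is reused from the Benchmark job in Step 3.

    # Minimal sketch, not an official example; the pod name is a placeholder.
    apiVersion: v1
    kind: Pod
    metadata:
      name: shared-gpu-demo
      # Optional: uncomment the following label to allocate an entire GPU that can
      # provide the requested amount of GPU memory.
      # labels:
      #   ack.gpushare.placement: require-whole-device
    spec:
      containers:
      - name: shared-gpu-demo
        image: registry.cn-beijing.aliyuncs.com/ai-samples/gpushare-sample:benchmark-tensorflow-2.2.3
        command:
        - bash
        - run.sh
        - --num_batches=500
        - --batch_size=8
        resources:
          limits:
            aliyun.com/gpu-mem: 10                # GB of GPU memory
            aliyun.com/gpu-core.percentage: 30    # percentage of the computing power of one GPU
      restartPolicy: Never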

Step 3: Run a job to verify GPU sharing and scheduling

  1. Create a file named benchmark.yaml that contains the following content to submit a Benchmark job:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: benchmark-job
    spec:
      parallelism: 1
      template:
        spec:
          containers:
          - name: benchmark-job
            image: registry.cn-beijing.aliyuncs.com/ai-samples/gpushare-sample:benchmark-tensorflow-2.2.3
            command:
            - bash
            - run.sh
            - --num_batches=500000000
            - --batch_size=8
            resources:
              limits:
                aliyun.com/gpu-mem: 10
                aliyun.com/gpu-core.percentage: 60
            workingDir: /root
          restartPolicy: Never
          hostNetwork: true
          tolerations:
            - operator: Exists
  2. Run the following command to submit the job:

    kubectl apply -f benchmark.yaml
  3. Run the following command to access the pod after the pod enters the Running state:

    kubectl exec -ti benchmark-job-xxxx -- bash
  4. Run the following command in the pod to query the GPU isolation information:

    vgpu-smi

    Expected output:

    +------------------------------------------------------------------------------+
    |    VGPU_SMI 460.91.03     DRIVER_VERSION: 460.91.03     CUDA Version: 11.2   |
    +-------------------------------------------+----------------------------------+
    | GPU  Name                Bus-Id           |        Memory-Usage     GPU-Util |
    |===========================================+==================================|
    |   0  xxxxxxxx            00000000:00:07.0 |  8307MiB / 10782MiB   60% /  60% |
    +-------------------------------------------+----------------------------------+

    The output indicates that 10 GB of GPU memory and 60% of computing power are allocated to the pod.
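
    To cross-check the allocation from outside the pod, you can also query the cluster. The following sketch assumes the job-name label that Kubernetes adds to pods created by a Job; the exact layout of the kubectl describe output depends on your kubectl version.

    # Find the pod created by the Job and the node it runs on
    # (assumes the default job-name label on Job pods).
    kubectl get pods -l job-name=benchmark-job -o wide

    # Check the GPU resources accounted on that node. Replace <NODE_NAME> with
    # the node shown in the previous command.
    kubectl describe node <NODE_NAME> | grep -A 15 "Allocated resources"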

FAQ

How do I check whether the eGPU-based GPU sharing and scheduling component is installed in my cluster?

Run the following command to check whether the eGPU-based GPU sharing and scheduling component is installed:

kubectl get ds -nkube-system | grep gpushare

Expected output:

NAME                                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                    AGE
gpushare-egpu-device-plugin-ds       0         0         0       0            0           <none>
gpushare-egpucore-device-plugin-ds   0         0         0       0            0           <none>

The output indicates that the eGPU-based GPU sharing and scheduling component is installed.
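
If you also want to confirm that a device plugin pod is running on a node after you add the ack.node.gpu.schedule label to it, the following sketch can help. It assumes standard DaemonSet pod naming and may need adjustment for your cluster:

# List the eGPU device plugin pods and the nodes they run on.
kubectl get pods -n kube-system -o wide | grep gpushare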