GPUs provide higher parallel computing power than CPUs and can accelerate compute-intensive workloads by orders of magnitude. Windows containers support GPU acceleration for frameworks built on top of Direct eXtension (DirectX). This guide shows you how to install the DirectX device plugin on Windows nodes and enable DirectX GPU acceleration for your workloads.
Background: What is DirectX
DirectX is a suite of APIs that enhance 3D graphics and sound effects and improve performance for Windows-based games and multimedia applications. It provides a unified hardware driver standard, which simplifies installation and setup. DirectX also lets you offload parallel, compute-intensive tasks to GPUs, which reduces CPU load.
Prerequisites
Before you begin, make sure you have:
- An ACK managed cluster running Kubernetes 1.20.4 or later. See Create an ACK managed cluster.
- A kubeconfig file for the cluster, configured so that kubectl can connect to the cluster. See Obtain and configure a kubeconfig file.
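As a quick sanity check on both prerequisites, the sketch below asks the API server for its version through the configured kubeconfig and compares it with the 1.20.4 minimum. The function names are hypothetical, and the comparison assumes a `sort` that supports `-V` (as GNU coreutils does), so treat it as a starting point rather than an official check.

```shell
# parse_git_version: reads the /version JSON on stdin and prints the numeric
# part of gitVersion, for example "v1.20.4-aliyun.1" -> "1.20.4".
parse_git_version() {
  sed -n 's/.*"gitVersion": *"v\([0-9.]*\).*/\1/p'
}

# version_ge A B: succeeds when version A is greater than or equal to B.
# Assumes sort supports -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_cluster_version (hypothetical helper): fails when the cluster is
# unreachable or older than the minimum stated in the prerequisites.
check_cluster_version() {
  server_version=$(kubectl get --raw /version | parse_git_version)
  [ -n "$server_version" ] || return 1
  version_ge "$server_version" "1.20.4"
}
```

Running `check_cluster_version && echo ok` prints `ok` only when kubectl can reach the cluster and the reported version is at least 1.20.4.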
Step 1: Create a Windows node pool with GPU support
Two node pool types support DirectX GPU acceleration. Choose based on your requirements:
| | Standard Windows node pool | Elastic Windows node pool |
|---|---|---|
| Image source | ECS public images (default) | Custom image (required) |
| Supported OS | Windows Server 2019 only | Windows Server 2019 or Windows Server 2022 |
| Setup effort | Lower — activate a GRID driver and create the node pool | Higher — submit a ticket to request a shared image, then create the node pool |
| Use when | You need a straightforward setup on Windows Server 2019 | You need Windows Server 2022 or a pre-licensed custom image |
Create a standard Windows node pool
- Activate the GRID driver with a license. Two options are available:
  - NVIDIA enterprise users: Download and install the GRID driver from the NVIDIA enterprise licensing site.
  - Non-enterprise users: Use the community image with a pre-installed GRID driver provided by Alibaba Cloud.
- Create a Windows node pool with the following configuration:
  - Instance type: A GPU-accelerated compute-optimized instance type (gn, ebm, or scc series) or a vGPU-accelerated instance type (vgn or sgn series). See GPU-accelerated compute-optimized instance families or vGPU-accelerated instance families.
  - Operating system: Windows Server 2019.
Create an elastic Windows node pool
By default, ACK uses ECS public images as node images. An elastic Windows node pool requires a custom image instead.
- Submit a ticket to request a shared Windows image with an activated GRID driver license. Specify your required Windows version (Windows Server 2019 or Windows Server 2022) in the ticket.
- Create a Windows node pool with the following configuration:
  - Instance type: A GPU-accelerated compute-optimized instance type (gn, ebm, or scc series) or a vGPU-accelerated instance type (vgn or sgn series). See GPU-accelerated compute-optimized instance families or vGPU-accelerated instance families.
  - Operating system: Select based on your requirements, for example, Windows Server 2022.
  - Custom image: Select the shared image you requested.
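The DaemonSet in Step 2 schedules only onto Windows nodes labeled windows.alibabacloud.com/directx-supported=true and windows.alibabacloud.com/deployment-topology=2.0. Once the node pool is ready, a sketch like the following can list the nodes that match. The helper names are hypothetical, and this guide does not state how nodes receive these labels, so treat this as a diagnostic rather than a required step.

```shell
# eligible_rows: reads "NAME DIRECTX TOPOLOGY" rows (with a header line) on
# stdin and prints the names of nodes whose labels match the DaemonSet affinity.
eligible_rows() {
  awk 'NR > 1 && $2 == "true" && $3 == "2.0" { print $1 }'
}

# list_eligible_nodes (hypothetical helper): kubectl prints <none> for a
# missing label, so unlabeled nodes are filtered out by eligible_rows.
list_eligible_nodes() {
  kubectl get nodes -l kubernetes.io/os=windows \
    -o 'custom-columns=NAME:.metadata.name,DIRECTX:.metadata.labels.windows\.alibabacloud\.com/directx-supported,TOPOLOGY:.metadata.labels.windows\.alibabacloud\.com/deployment-topology' \
    | eligible_rows
}
```

If `list_eligible_nodes` prints nothing, the device plugin pods will have no node to run on.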
Step 2: Install the DirectX device plugin
Deploy the DirectX device plugin as a DaemonSet on your Windows nodes.
- Create a file named directx-device-plugin-windows.yaml with the following content:

      apiVersion: apps/v1
      kind: DaemonSet
      metadata:
        labels:
          k8s-app: directx-device-plugin-windows
        name: directx-device-plugin-windows
        namespace: kube-system
      spec:
        revisionHistoryLimit: 10
        selector:
          matchLabels:
            k8s-app: directx-device-plugin-windows
        template:
          metadata:
            annotations:
              scheduler.alpha.kubernetes.io/critical-pod: ""
            labels:
              k8s-app: directx-device-plugin-windows
          spec:
            tolerations:
              - operator: Exists
            # hostNetwork: true is supported for Windows workloads since Kubernetes 1.18,
            # allowing deployment without NetworkReady.
            hostNetwork: true
            affinity:
              nodeAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                  nodeSelectorTerms:
                    - matchExpressions:
                        - key: type
                          operator: NotIn
                          values:
                            - virtual-kubelet
                        - key: beta.kubernetes.io/os
                          operator: In
                          values:
                            - windows
                        - key: windows.alibabacloud.com/deployment-topology
                          operator: In
                          values:
                            - "2.0"
                        - key: windows.alibabacloud.com/directx-supported
                          operator: In
                          values:
                            - "true"
                    - matchExpressions:
                        - key: type
                          operator: NotIn
                          values:
                            - virtual-kubelet
                        - key: kubernetes.io/os
                          operator: In
                          values:
                            - windows
                        - key: windows.alibabacloud.com/deployment-topology
                          operator: In
                          values:
                            - "2.0"
                        - key: windows.alibabacloud.com/directx-supported
                          operator: In
                          values:
                            - "true"
            containers:
              - name: directx
                command:
                  - pwsh.exe
                  - -NoLogo
                  - -NonInteractive
                  - -File
                  - entrypoint.ps1
                # Replace the region in the image address with the region of your cluster.
                image: registry-cn-hangzhou-vpc.ack.aliyuncs.com/acs/directx-device-plugin-windows:v1.0.0
                imagePullPolicy: IfNotPresent
                volumeMounts:
                  - name: host-binary
                    mountPath: c:/host/opt/bin
                  - name: wins-pipe
                    mountPath: \\.\pipe\rancher_wins
            volumes:
              - name: host-binary
                hostPath:
                  path: c:/opt/bin
                  type: DirectoryOrCreate
              - name: wins-pipe
                hostPath:
                  path: \\.\pipe\rancher_wins

- Deploy the DaemonSet:

      kubectl create -f directx-device-plugin-windows.yaml
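The manifest hardcodes cn-hangzhou in the image address. One way to apply the "replace the region" comment without editing the file by hand is sketched below. The function names and the example region are assumptions for illustration, and the pattern relies on the registry-{region}-vpc.ack.aliyuncs.com address format used by the images in this guide.

```shell
# set_image_region REGION: rewrites registry-<region>-vpc.ack.aliyuncs.com
# image addresses read from stdin so that they point at REGION.
set_image_region() {
  sed "s/registry-[a-z0-9-]*-vpc\.ack\.aliyuncs\.com/registry-$1-vpc.ack.aliyuncs.com/"
}

# deploy_plugin REGION (hypothetical helper): creates the DaemonSet from the
# rewritten manifest, then waits for its pods to become ready.
deploy_plugin() {
  set_image_region "$1" < directx-device-plugin-windows.yaml | kubectl create -f - &&
    kubectl -n kube-system rollout status daemonset/directx-device-plugin-windows --timeout=10m
}
```

For example, `deploy_plugin cn-beijing` would deploy the plugin with the image address rewritten for a cluster in the cn-beijing region. Use it in place of the plain `kubectl create` command, not in addition to it.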
Step 3: Enable DirectX GPU acceleration for a workload
The DirectX device plugin automatically adds the class/<interface class GUID> device to Windows containers, enabling access to DirectX services on the Elastic Compute Service (ECS) host. For details, see Devices in containers on Windows.
Add the resources field to the container spec of any workload that requires GPU acceleration:

    spec:
      ...
      template:
        ...
        spec:
          ...
          containers:
            - name: gpu-user
              ...
    +         resources:
    +           limits:
    +             windows.alibabacloud.com/directx: "1"
    +           requests:
    +             windows.alibabacloud.com/directx: "1"

This configuration does not give a single container exclusive use of the GPU resources on the ECS host, and it does not block other applications from accessing the GPU. GPU resources are dynamically shared between the ECS host and containers, so multiple Windows containers on the same host can use DirectX hardware acceleration at the same time.
For more information, see GPU acceleration in Windows containers.
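Kubernetes device plugins generally advertise their resource in each node's status, so before scheduling workloads you can check which Windows nodes expose windows.alibabacloud.com/directx. The sketch below is a diagnostic under stated assumptions: the helper names are hypothetical, and the quantity a node reports is not specified in this guide.

```shell
# advertising_nodes: reads "NAME DIRECTX" rows (with a header line) on stdin
# and prints the nodes that report the resource; kubectl shows <none> otherwise.
advertising_nodes() {
  awk 'NR > 1 && $2 != "<none>" { print $1 }'
}

# list_directx_nodes (hypothetical helper): queries the allocatable resources
# of every Windows node and keeps those that advertise DirectX.
list_directx_nodes() {
  kubectl get nodes -l kubernetes.io/os=windows \
    -o 'custom-columns=NAME:.metadata.name,DIRECTX:.status.allocatable.windows\.alibabacloud\.com/directx' \
    | advertising_nodes
}
```

A pod that requests the resource stays in the Pending state while no node advertises it, so an empty result from `list_directx_nodes` usually points back to Step 2.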
Step 4: Verify GPU acceleration
Run a sample GPU job to confirm that DirectX acceleration is working end to end.
- Create a file named gpu-job-windows.yaml with the following content:

      apiVersion: batch/v1
      kind: Job
      metadata:
        labels:
          k8s-app: gpu-job-windows
        name: gpu-job-windows
        namespace: default
      spec:
        parallelism: 1
        completions: 1
        backoffLimit: 3
        manualSelector: true
        selector:
          matchLabels:
            k8s-app: gpu-job-windows
        template:
          metadata:
            labels:
              k8s-app: gpu-job-windows
          spec:
            restartPolicy: Never
            affinity:
              nodeAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                  nodeSelectorTerms:
                    - matchExpressions:
                        - key: type
                          operator: NotIn
                          values:
                            - virtual-kubelet
                        - key: beta.kubernetes.io/os
                          operator: In
                          values:
                            - windows
                    - matchExpressions:
                        - key: type
                          operator: NotIn
                          values:
                            - virtual-kubelet
                        - key: kubernetes.io/os
                          operator: In
                          values:
                            - windows
            tolerations:
              - key: os
                value: windows
            containers:
              - name: gpu
                # Replace the region in the image address with the region of your cluster.
                image: registry-cn-hangzhou-vpc.ack.aliyuncs.com/acs/sample-gpu-windows:v1.0.0
                imagePullPolicy: IfNotPresent
                resources:
                  limits:
                    windows.alibabacloud.com/directx: "1"
                  requests:
                    windows.alibabacloud.com/directx: "1"

  Note: The sample image registry-{region}-vpc.ack.aliyuncs.com/acs/sample-gpu-windows is built on top of Microsoft Windows. See microsoft-windows. The image is 15.3 GB and may take some time to pull. Inside the job, WinMLRunner runs 100 evaluations using the Tiny YOLOv2 model and outputs performance data. Actual results may vary depending on your environment.

- Deploy the job:

      kubectl create -f gpu-job-windows.yaml

- Check the job logs. The job/ prefix makes kubectl resolve the pod that belongs to the job:

      kubectl logs -f job/gpu-job-windows

  Expected output:

      INFO: Executing model of "tinyyolov2-7" 100 times within GPU driver ...
      Created LearningModelDevice with GPU: NVIDIA GRID T4-8Q
      Loading model (path = c:\data\tinyyolov2-7\model.onnx)...
      =================================================================
      Name: Example Model
      Author: OnnxMLTools
      Version: 0
      Domain: onnxconverter-common
      Description: The Tiny YOLO network from the paper 'YOLO9000: Better, Faster, Stronger' (2016), arXiv:1612.08242
      Path: c:\data\tinyyolov2-7\model.onnx
      Support FP16: false
      Input Feature Info:
      Name: image
      Feature Kind: Image (Height: 416, Width: 416)
      Output Feature Info:
      Name: grid
      Feature Kind: Float

The output confirms that the gpu-job-windows job is running with DirectX GPU acceleration enabled.
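To script this verification, for example in a smoke test, you could wait for the job to finish and then look for the log line that names the GPU device. This is a sketch: the helper names and the 30-minute timeout are assumptions (the large image can make the first run slow), and the matched text is taken from the sample output in this guide, so other WinMLRunner versions may word it differently.

```shell
# log_shows_gpu: succeeds when the log on stdin contains the line that the
# sample output shows after a GPU-backed device is created.
log_shows_gpu() {
  grep -q 'Created LearningModelDevice with GPU:'
}

# verify_gpu_job (hypothetical helper): waits for the job to complete, then
# checks its logs for the GPU device line.
verify_gpu_job() {
  kubectl wait --for=condition=complete --timeout=30m job/gpu-job-windows &&
    kubectl logs job/gpu-job-windows | log_shows_gpu
}
```

`verify_gpu_job && echo "DirectX acceleration confirmed"` prints the message only when the job completes and the GPU line is present. Afterwards, `kubectl delete -f gpu-job-windows.yaml` removes the sample job.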