Container Service for Kubernetes: Use DRA to schedule GPUs

Last Updated: Mar 26, 2026

In AI training and inference scenarios, multiple applications often share GPU resources. Deploy the NVIDIA Dynamic Resource Allocation (DRA) driver in your ACK cluster to overcome the scheduling limits of traditional device plugins. The Kubernetes DRA API dynamically allocates GPUs across pods and controls resources at a fine-grained level, improving GPU utilization and reducing costs.

How it works

Dynamic Resource Allocation (DRA) generalizes the persistent volume API to generic resources such as GPUs. The experience is similar to dynamic volume provisioning: just as you use a PersistentVolumeClaim to claim storage from a StorageClass, you use a ResourceClaim to claim GPU resources from a DeviceClass.

DRA supports more flexible and fine-grained resource allocation than traditional device plugins:

  • Flexible device filtering: Use the Common Expression Language (CEL) to filter devices by specific attributes.

  • Device sharing: Share the same GPU across multiple containers or pods by referencing the same ResourceClaim.

  • Simplified pod requests: Specify resource requirements declaratively without per-container device counts.
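
As a rough illustration of the CEL filtering described above, the following ResourceClaimTemplate sketch requests one GPU whose product name matches a given model. The template name a10-gpu is hypothetical, and the attribute path device.attributes["gpu.nvidia.com"].productName is an assumption about what the NVIDIA driver publishes; check the attributes listed in your own ResourceSlice objects before relying on it.

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: a10-gpu                      # hypothetical name
spec:
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: gpu.nvidia.com
          allocationMode: ExactCount
          count: 1
          selectors:
          - cel:
              # Assumed attribute name; verify it against your ResourceSlice.
              expression: 'device.attributes["gpu.nvidia.com"].productName == "NVIDIA A10"'
```

If no device on any node satisfies the expression, a pod that uses this template stays Pending until a matching device becomes available.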

NVIDIA DRA Driver for GPUs implements the DRA API for Kubernetes workloads. It supports controlled GPU sharing and dynamic GPU reconfiguration.
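
To make the sharing model concrete, here is a minimal sketch in which two containers of one pod reference the same ResourceClaim and therefore see the same GPU. The names shared-gpu, shared-gpu-pod, ctr0, and ctr1 are hypothetical; the image and device class are the ones used in the workload example in this guide. How the containers divide GPU memory and compute depends on the driver configuration, which this sketch leaves at its defaults.

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: shared-gpu                   # hypothetical name
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: gpu.nvidia.com
        allocationMode: ExactCount
        count: 1
---
apiVersion: v1
kind: Pod
metadata:
  name: shared-gpu-pod               # hypothetical name
spec:
  containers:
  - name: ctr0
    image: registry-cn-hangzhou.ack.aliyuncs.com/dev/ubuntu:22.04
    command: ["bash", "-c"]
    args: ["nvidia-smi -L; trap 'exit 0' TERM; sleep 9999 & wait"]
    resources:
      claims:
      - name: gpu                    # refers to the pod-level claim entry below
  - name: ctr1
    image: registry-cn-hangzhou.ack.aliyuncs.com/dev/ubuntu:22.04
    command: ["bash", "-c"]
    args: ["nvidia-smi -L; trap 'exit 0' TERM; sleep 9999 & wait"]
    resources:
      claims:
      - name: gpu                    # same claim, so the same GPU
  resourceClaims:
  - name: gpu
    resourceClaimName: shared-gpu    # an existing ResourceClaim, not a template
```

A second pod can reference the same resourceClaimName to share the device across pods, provided it can be scheduled to the node that holds the allocated GPU.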

Prerequisites

Before you begin, ensure that you have an ACK cluster that runs Kubernetes 1.34 or later and kubectl access to that cluster.

Set up the DRA GPU scheduling environment

Step 1: Create a GPU node pool

Create a node pool that uses DRA GPU scheduling. Add a node label to disable default GPU device plugin resource reporting and prevent duplicate GPU allocation.

  1. Log on to the Container Service console. In the left navigation pane, choose Clusters. Click the cluster name, then choose Node management > Node Pools.

  2. Click Create Node Pool. Select one of the GPU instance types supported by ACK, and keep all settings other than the following at their default values.

    1. Click Specify Instance Type. Enter an instance type name, such as ecs.gn7i-c8g1.2xlarge. Set Expected Nodes to 1.

    2. Click Advanced to expand the node pool configuration. Under Node Labels, add the following label:

      ack.node.gpu.schedule: disabled

      This disables exclusive GPU scheduling and stops GPU device plugin resource reporting on the node.

      Important: Running both the device plugin and DRA on the same node causes duplicate GPU allocation. Always add this label to nodes where DRA is enabled.

Step 2: Install the NVIDIA DRA driver

Install the NVIDIA DRA GPU driver, which provides the concrete implementation of the DRA API.

  1. Install the Helm CLI.

    curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
  2. Add the NVIDIA Helm repository and update it.

    helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
    && helm repo update
  3. Install version 25.3.2 of the NVIDIA DRA GPU driver.

    Important: --set controller.affinity=null removes the node affinity constraint from the controller workload, which allows it to be scheduled on any node. Evaluate this setting before you use it in production environments, because it may affect stability.

    helm install nvidia-dra-driver-gpu nvidia/nvidia-dra-driver-gpu --version="25.3.2" --create-namespace --namespace nvidia-dra-driver-gpu \
        --set gpuResourcesEnabledOverride=true \
        --set controller.affinity=null \
        --set "kubeletPlugin.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].key=ack.node.gpu.schedule" \
        --set "kubeletPlugin.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].operator=In" \
        --set "kubeletPlugin.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].values[0]=disabled"

    A successful installation produces output similar to:

    NAME: nvidia-dra-driver-gpu
    LAST DEPLOYED: Tue Oct 14 20:42:13 2025
    NAMESPACE: nvidia-dra-driver-gpu
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
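
The long --set flags in the install command can be hard to read and review. As an alternative sketch, the same overrides can be kept in a Helm values file. The file name dra-values.yaml is hypothetical, and the keys simply mirror the flags shown above.

```yaml
# dra-values.yaml (hypothetical file name)
gpuResourcesEnabledOverride: true

controller:
  # null removes the chart's default affinity for the controller workload.
  affinity: null

kubeletPlugin:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          # Run the kubelet plugin only on nodes labeled for DRA in Step 1.
          - key: ack.node.gpu.schedule
            operator: In
            values:
            - disabled
```

Passing this file with the -f option of helm install, in place of the --set flags, should produce the same release configuration.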

Step 3: Verify the environment

Verify that the NVIDIA DRA driver is running and GPU resources are reported to the cluster.

  1. Check that the DRA GPU driver pods are running.

    kubectl get pod -n nvidia-dra-driver-gpu

    All pods should show a Running status. If any pod is in Pending or CrashLoopBackOff, check whether the node label ack.node.gpu.schedule: disabled was applied correctly in Step 1.

  2. Check that DRA-related resources are created.

    kubectl get deviceclass,resourceslice

    The expected output is:

    NAME                                                                    AGE
    deviceclass.resource.k8s.io/compute-domain-daemon.nvidia.com            60s
    deviceclass.resource.k8s.io/compute-domain-default-channel.nvidia.com   60s
    deviceclass.resource.k8s.io/gpu.nvidia.com                              60s
    deviceclass.resource.k8s.io/mig.nvidia.com                              60s
    
    NAME                                                                                   NODE                      DRIVER                      POOL                      AGE
    resourceslice.resource.k8s.io/cn-beijing.1x.1x.3x.1x-compute-domain.nvidia.com-htjqn   cn-beijing.10.11.34.156   compute-domain.nvidia.com   cn-beijing.10.11.34.156   57s
    resourceslice.resource.k8s.io/cn-beijing.1x.1x.3x.1x-gpu.nvidia.com-bnwhj              cn-beijing.10.11.34.156   gpu.nvidia.com              cn-beijing.10.11.34.156   57s

    If the deviceclass resources do not appear, DRA may not be enabled on your cluster. Confirm that your cluster runs Kubernetes 1.34 or later. If no resourceslice resources appear, the driver pods may not be running. In that case, recheck Step 2.

  3. View GPU resource details from a ResourceSlice.

    Replace cn-beijing.1x.1x.3x.1x-gpu.nvidia.com-bnwhj with your actual ResourceSlice name from the previous step.

    kubectl get resourceslice.resource.k8s.io/cn-beijing.1x.1x.3x.1x-gpu.nvidia.com-bnwhj -o yaml

Deploy a workload that uses DRA GPU

The following steps use a ResourceClaimTemplate to automatically create a ResourceClaim per pod, so each pod gets independent access to a separate GPU.

  1. Create a file named resource-claim-template.yaml.

    apiVersion: resource.k8s.io/v1
    kind: ResourceClaimTemplate
    metadata:
      name: single-gpu
    spec:
      spec:
        devices:
          requests:
          - exactly:
              allocationMode: ExactCount
              deviceClassName: gpu.nvidia.com
              count: 1
            name: gpu

    Apply the template to the cluster.

    kubectl apply -f resource-claim-template.yaml
  2. Create a file named resource-claim-template-pod.yaml.

    apiVersion: v1
    kind: Pod
    metadata:
      name: pod1
      labels:
        app: pod
    spec:
      containers:
      - name: ctr
        image: registry-cn-hangzhou.ack.aliyuncs.com/dev/ubuntu:22.04
        command: ["bash", "-c"]
        args: ["nvidia-smi -L; trap 'exit 0' TERM; sleep 9999 & wait"]
        resources:
          claims:
          - name: gpu
      resourceClaims:
      - name: gpu
        resourceClaimTemplateName: single-gpu

    Deploy the pod.

    kubectl apply -f resource-claim-template-pod.yaml
  3. List the ResourceClaim objects created automatically for the pod.

    kubectl get resourceclaim

    The output includes an auto-generated ResourceClaim such as pod1-gpu-wstqm. To inspect it, run the following command. Replace pod1-gpu-wstqm with your actual ResourceClaim name.

    kubectl describe resourceclaim pod1-gpu-wstqm
  4. Verify that the pod is using the GPU by checking its logs. The nvidia-smi -L output should list the allocated device, for example a line that begins with GPU 0: NVIDIA A10.

    kubectl logs pod1
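
The same single-gpu template can also back a multi-replica workload. The sketch below assumes a hypothetical Deployment named gpu-deploy; because each replica references the template rather than a fixed claim, Kubernetes creates a separate ResourceClaim for every pod.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-deploy                   # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: gpu-deploy
  template:
    metadata:
      labels:
        app: gpu-deploy
    spec:
      containers:
      - name: ctr
        image: registry-cn-hangzhou.ack.aliyuncs.com/dev/ubuntu:22.04
        command: ["bash", "-c"]
        args: ["nvidia-smi -L; trap 'exit 0' TERM; sleep 9999 & wait"]
        resources:
          claims:
          - name: gpu
      resourceClaims:
      - name: gpu
        # One ResourceClaim is generated from this template per pod.
        resourceClaimTemplateName: single-gpu
```

Each replica needs its own GPU, so with the single one-GPU node created in Step 1 only one replica can run and the other stays Pending until more GPU capacity is added.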

(Optional) Clean up the environment

After testing, delete unused resources to avoid unnecessary charges.

  • Delete the pod and ResourceClaimTemplate.

    kubectl delete pod pod1
    kubectl delete resourceclaimtemplate single-gpu
  • Uninstall the NVIDIA DRA GPU driver.

    helm uninstall nvidia-dra-driver-gpu -n nvidia-dra-driver-gpu
  • Remove or release the GPU nodes that you created in Step 1 if you no longer need them.