
Container Service for Kubernetes: Configure a GPU selection policy for nodes with the GPU sharing feature enabled

Last Updated: Mar 26, 2026

By default, the scheduler fills one GPU completely before moving workloads to the next GPU on the same node. This prevents GPU memory fragmentation but concentrates risk: if that GPU fails, all pods sharing it are affected simultaneously. ACK's GPU sharing feature lets you choose between two GPU selection policies—binpack and spread—to match your fault-tolerance requirements.

Prerequisites

Before you begin, ensure that you have:

  • An ACK cluster that contains GPU-accelerated nodes and has the GPU sharing (cGPU) component installed.
  • kubectl connected to the cluster, with the kubectl-inspect-cgpu plugin installed. Step 3 uses the kubectl inspect cgpu command to verify GPU allocation.

GPU selection policies

If a node with GPU sharing enabled has multiple GPUs, you can apply one of the following policies:

  • Binpack (default): Fills one GPU completely before allocating pods to the next GPU. Use it when you want to maximize GPU utilization and GPU-level fault isolation is not required.
  • Spread: Distributes pods across all available GPUs on the node. Use it when you need to limit the blast radius of a single GPU failure or otherwise require fault isolation across GPUs.
The spread policy only takes effect when a node has more than one GPU. Select an instance type with multiple GPU cards when creating the node pool.

Example: A node has two GPUs, each with 15 GiB of GPU memory. Pod1 requests 2 GiB of GPU memory and Pod2 requests 3 GiB. Under binpack, both pods are placed on the same GPU: GPU0 has 5 GiB allocated and GPU1 stays idle. Under spread, Pod1 is placed on GPU0 and Pod2 on GPU1, so the failure of either GPU affects only one pod.
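
In manifest terms, the only GPU-related setting in this example is each pod's aliyun.com/gpu-mem resource limit; the node's ack.node.gpu.placement label, not the pod spec, determines which physical GPU serves each request. The following is a minimal sketch of the two example pods; the pod names, container name, and image are illustrative placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: pod1  # Illustrative name for the pod requesting 2 GiB.
spec:
  containers:
  - name: app  # Illustrative container name; any GPU workload image works here.
    image: registry.cn-beijing.aliyuncs.com/ai-samples/gpushare-sample:tensorflow-1.5
    resources:
      limits:
        aliyun.com/gpu-mem: 2  # Pod1 requests 2 GiB of GPU memory.
---
apiVersion: v1
kind: Pod
metadata:
  name: pod2  # Illustrative name for the pod requesting 3 GiB.
spec:
  containers:
  - name: app
    image: registry.cn-beijing.aliyuncs.com/ai-samples/gpushare-sample:tensorflow-1.5
    resources:
      limits:
        aliyun.com/gpu-mem: 3  # Pod2 requests 3 GiB of GPU memory.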

Configure the spread policy

By default, nodes use the binpack policy. To switch a node pool to the spread policy, complete the following steps:

  1. Create a node pool and apply the required node labels.

  2. Submit a GPU sharing job with a node selector.

  3. Verify that pods are distributed across GPUs.

Step 1: Create a node pool

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Nodes > Node Pools.

  3. In the upper-right corner of the Node Pools page, click Create Node Pool.

  4. In the Create Node Pool dialog box, configure the following key parameters, and then click Confirm Order. For all other parameters, see Create and manage a node pool.

    • Instance type: Set Architecture to GPU-accelerated and select an instance type with multiple GPUs. The spread policy only takes effect on nodes with more than one GPU.
    • Expected nodes: Specify the initial number of nodes in the node pool. Enter 0 if you do not want to create nodes immediately.
    • Node label: Add the following two labels:
      • Key ack.node.gpu.schedule, value cgpu: enables GPU sharing and GPU memory isolation.
      • Key ack.node.gpu.placement, value spread: enables the spread policy.
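
After the node pool scales out, you can confirm that both labels are present. This is a minimal check using standard kubectl; the label selector matches exactly the two labels configured above:

kubectl get nodes -l ack.node.gpu.schedule=cgpu,ack.node.gpu.placement=spread

Only nodes returned by this command are subject to the spread policy.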

Step 2: Submit a job

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Workloads > Jobs.

  3. In the upper-right corner of the page, click Create from YAML. Paste the following YAML into the Template editor, update the placeholder values based on the inline comments, and then click Create.

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: tensorflow-mnist-spread
    spec:
      parallelism: 3
      template:
        metadata:
          labels:
            app: tensorflow-mnist-spread
        spec:
          nodeSelector:
            kubernetes.io/hostname: <NODE_NAME> # Replace with the name of a GPU-accelerated node, for example, cn-shanghai.192.0.2.109.
          containers:
          - name: tensorflow-mnist-spread
            image: registry.cn-beijing.aliyuncs.com/ai-samples/gpushare-sample:tensorflow-1.5
            command:
            - python
            - tensorflow-sample-code/tfjob/docker/mnist/main.py
            - --max_steps=100000
            - --data_dir=tensorflow-sample-code/data
            resources:
              limits:
                aliyun.com/gpu-mem: 4  # Each pod requests 4 GiB of GPU memory.
            workingDir: /root
          restartPolicy: Never

    The YAML defines a TensorFlow MNIST job with the following behavior:

    • Creates 3 pods in parallel (parallelism: 3), each requesting 4 GiB of GPU memory (aliyun.com/gpu-mem: 4).

    • Pins all pods to a specific node using kubernetes.io/hostname: <NODE_NAME> so the GPU selection policy applies to that node.
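
    If you prefer the command line to the console, the same manifest can be submitted with standard kubectl. The file name tensorflow-mnist-spread.yaml is a placeholder for wherever you saved the YAML above:

    # Save the YAML above to a file, then create the job.
    kubectl apply -f tensorflow-mnist-spread.yaml

    # Watch the three pods start on the selected node.
    kubectl get pods -l app=tensorflow-mnist-spread -o wide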

Step 3: Verify the spread policy

Run the following command to query GPU allocation on the node:

kubectl inspect cgpu

Expected output:

NAME                       IPADDRESS      GPU0(Allocated/Total)  GPU1(Allocated/Total)  GPU2(Allocated/Total)  GPU3(Allocated/Total)  GPU Memory(GiB)
cn-shanghai.192.0.2.109    192.0.2.109    4/15                   4/15                   0/15                   4/15                   12/60
--------------------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster:
12/60 (20%)

Each GPU<N>(Allocated/Total) column shows how much memory is allocated on that GPU. In this example, GPU0, GPU1, and GPU3 each have 4 GiB allocated (one pod per GPU), while GPU2 has none. The pods are spread across multiple GPUs rather than stacked on one, which confirms the spread policy is in effect.
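
For contrast, under the default binpack policy the same three pods would stack onto a single GPU, because one GPU's 15 GiB is enough to hold all three 4 GiB requests. The following output is illustrative (constructed from the policy description above), not captured from a cluster:

NAME                       IPADDRESS      GPU0(Allocated/Total)  GPU1(Allocated/Total)  GPU2(Allocated/Total)  GPU3(Allocated/Total)  GPU Memory(GiB)
cn-shanghai.192.0.2.109    192.0.2.109    12/15                  0/15                   0/15                   0/15                   12/60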