
Container Service for Kubernetes: Enable scheduling features

Last Updated: Nov 18, 2025

In an ACK managed cluster Pro, you can assign scheduling labels to GPU nodes to optimize resource utilization and schedule applications precisely. The labels control scheduling behaviors such as exclusive access, shared use, topology awareness, and scheduling to specific GPU card models.

Scheduling label overview

GPU scheduling labels identify GPU models and resource allocation policies to support fine-grained resource management and efficient scheduling.

Exclusive scheduling (default)

  Label value: ack.node.gpu.schedule: default

  Use cases: Performance-critical tasks that require exclusive access to an entire GPU, such as model training and high-performance computing (HPC).

Shared scheduling

  Label values:

  • ack.node.gpu.schedule: cgpu: Shared computing power with isolated GPU memory, based on Alibaba Cloud's cGPU sharing technology.

  • ack.node.gpu.schedule: core_mem: Isolated computing power and GPU memory.

  • ack.node.gpu.schedule: share: Shared computing power and GPU memory with no isolation.

  • ack.node.gpu.schedule: mps: Shared computing power with isolated GPU memory, based on NVIDIA MPS technology combined with Alibaba Cloud cGPU.

  Use cases: Improves GPU utilization. Ideal for scenarios with multiple concurrent lightweight tasks, such as multitenancy and inference.

  Placement label values (optional; optimize the resource allocation strategy on multi-GPU nodes when cgpu, core_mem, share, or mps sharing is enabled):

  • ack.node.gpu.placement: binpack: (Default) Schedules Pods compactly across cards. Fills each GPU with Pods before assigning Pods to the next available GPU. This reduces resource fragmentation and is ideal for maximizing resource utilization or saving energy.

  • ack.node.gpu.placement: spread: Distributes Pods across different GPUs. This reduces the impact of a single-card failure and is suitable for high-availability (HA) tasks.

Topology-aware scheduling

  Label value: ack.node.gpu.schedule: topology

  Use cases: Automatically assigns the optimal combination of GPUs to a Pod based on the physical GPU topology of a single node. Suitable for tasks that are sensitive to GPU-to-GPU communication latency.

Card model scheduling

  Label values:

  • aliyun.accelerator/nvidia_name: <GPU_card_name>

  • aliyun.accelerator/nvidia_mem: <video_memory_per_card> (use together with nvidia_name for more specific targeting)

  • aliyun.accelerator/nvidia_count: <total_number_of_GPU_cards> (use together with nvidia_name for more specific targeting)

  Use cases: Schedules tasks to nodes with a specific GPU model or avoids nodes with a specific model.
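A quick way to see which of these labels are set on your GPU nodes is to list them as extra columns with kubectl. The -L flag simply prints the value of each listed label, and labels that are not set appear as empty columns.

    kubectl get nodes -L ack.node.gpu.schedule,ack.node.gpu.placement,aliyun.accelerator/nvidia_name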

Exclusive scheduling

If a node has no GPU scheduling labels, exclusive scheduling is enabled by default. In this mode, the node allocates GPU resources to Pods in whole-card units.

If you have enabled other GPU scheduling features, deleting the label alone does not restore exclusive scheduling. You must manually change the label value to ack.node.gpu.schedule: default to re-enable it.
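For example, run the following command to switch a node back to exclusive scheduling. Replace <NODE_NAME> with the name of the node.

    kubectl label node <NODE_NAME> ack.node.gpu.schedule=default --overwrite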

Shared scheduling

Shared scheduling is available only for ACK managed cluster Pro. For more information, see Limits.

  1. Install the ack-ai-installer component.

    1. Log on to the ACK console. In the left navigation pane, click Clusters.

    2. On the Clusters page, find the cluster you want and click its name. In the left-side navigation pane, choose Applications > Cloud-native AI Suite.

    3. On the Cloud-native AI Suite page, click Deploy. On the Deploy Cloud-native AI Suite page, select Scheduling Policy Extension (Batch Task Scheduling, GPU Sharing, Topology-aware GPU Scheduling).

      To learn how to set the computing power scheduling policy for the cGPU service, see Install and use the cGPU component.
    4. Click Deploy Cloud-native AI Suite.

      On the Cloud-native AI Suite page, verify that the ack-ai-installer component appears in the list of installed components.

  2. Enable shared scheduling.

    1. On the Clusters page, click the name of your target cluster. In the navigation pane on the left, choose Nodes > Node Pools.

    2. On the Node Pools page, click Create Node Pool, configure the node labels, then click Confirm.

      You can keep the default values for other configuration items. For details on each label's function, see Scheduling label overview.
      • Configure basic shared scheduling.

        Click the icon next to Node Labels to add a label, set the Key to ack.node.gpu.schedule, and set the Value to cgpu, core_mem, share, or mps (mps requires the MPS Control Daemon component to be installed).

      • Configure multi-card shared scheduling.

        On multi-GPU nodes, you can add a placement strategy to your basic shared scheduling configuration.

        Click the icon next to Node Labels to add a label, set the Key to ack.node.gpu.placement, and set the Value to binpack or spread.

  3. Verify that shared scheduling is enabled.

    cgpu/share/mps

    Replace <NODE_NAME> with the name of a node in the target node pool and run the following command to verify that cgpu, share, or mps shared scheduling is enabled on the node.

    kubectl get nodes <NODE_NAME> -o yaml | grep "aliyun.com/gpu-mem"

    Expected output:

    aliyun.com/gpu-mem: "60"

    If the aliyun.com/gpu-mem field is not 0, cgpu, share, or mps shared scheduling is enabled.

    core_mem

    Replace <NODE_NAME> with the name of a node in the target node pool and run the following command to verify that core_mem shared scheduling is enabled.

    kubectl get nodes <NODE_NAME> -o yaml | grep -E 'aliyun\.com/gpu-core\.percentage|aliyun\.com/gpu-mem'

    Expected output:

    aliyun.com/gpu-core.percentage: "80"
    aliyun.com/gpu-mem: "6"

    If both the aliyun.com/gpu-core.percentage and aliyun.com/gpu-mem fields are not 0, core_mem shared scheduling is enabled.

    binpack

    Use the shared GPU resource query tool to check the GPU resource allocation on the node:

    kubectl inspect cgpu

    Expected output:

    NAME                     IPADDRESS    GPU0(Allocated/Total)  GPU1(Allocated/Total)  GPU2(Allocated/Total)  GPU3(Allocated/Total)  GPU Memory(GiB)
    cn-shanghai.192.0.2.109  192.0.2.109  15/15                  9/15                   0/15                   0/15                   24/60
    --------------------------------------------------------------------------------------
    Allocated/Total GPU Memory In Cluster:
    24/60 (40%)

    The output shows that GPU0 is fully allocated (15/15) while GPU1 is partially allocated (9/15). This confirms that the binpack strategy is active, filling one GPU completely before allocating resources on the next.

    spread

    Use the shared GPU resource query tool to check the GPU resource allocation on the node:

    kubectl inspect cgpu

    Expected output:

    NAME                     IPADDRESS    GPU0(Allocated/Total)  GPU1(Allocated/Total)  GPU2(Allocated/Total)  GPU3(Allocated/Total)  GPU Memory(GiB)
    cn-shanghai.192.0.2.109  192.0.2.109  4/15                   4/15                   0/15                   4/15                   12/60
    --------------------------------------------------------------------------------------
    Allocated/Total GPU Memory In Cluster:
    12/60 (20%)

    The output shows that resources are allocated across GPU0 (4/15), GPU1 (4/15), and GPU3 (4/15). This confirms that the spread strategy, which distributes Pods across different GPUs, is active.
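After shared scheduling is enabled, Pods request a slice of GPU memory through the aliyun.com/gpu-mem resource instead of a whole card. The following Job is only a minimal sketch for illustration, reusing the sample image from the card model examples later in this topic; the 4 GiB request is a placeholder value, and on core_mem nodes you would typically also request computing power through aliyun.com/gpu-core.percentage.

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: gpu-share-sample        # Hypothetical name used for this sketch
    spec:
      parallelism: 1
      template:
        metadata:
          labels:
            app: gpu-share-sample
        spec:
          containers:
          - name: gpu-share-sample
            image: registry.cn-beijing.aliyuncs.com/acs/tensorflow-mnist-sample:v1.5
            command:
            - python
            - tensorflow-sample-code/tfjob/docker/mnist/main.py
            - --max_steps=1000
            - --data_dir=tensorflow-sample-code/data
            workingDir: /root
            resources:
              limits:
                aliyun.com/gpu-mem: 4                # Requests 4 GiB of GPU memory instead of a whole card (placeholder value)
                # aliyun.com/gpu-core.percentage: 30 # On core_mem nodes, also request a share of computing power (placeholder value)
          restartPolicy: Never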

Topology-aware scheduling

Topology-aware scheduling is available only for ACK managed cluster Pro. For more information, see System component version requirements.

  1. Install the ack-ai-installer component. For the installation procedure, see step 1 of Shared scheduling.

  2. Enable topology-aware scheduling.

    Replace <NODE_NAME> with the name of your target node and run the following command to add a label to the node and enable topology-aware GPU scheduling.

    kubectl label node <NODE_NAME> ack.node.gpu.schedule=topology

    A node with topology-aware scheduling enabled no longer supports GPU workloads that are not topology-aware. To restore exclusive scheduling, run kubectl label node <NODE_NAME> ack.node.gpu.schedule=default --overwrite.
  3. Verify that topology-aware scheduling is enabled.

    Replace <NODE_NAME> with the name of your target node and run the following command to verify that topology scheduling is enabled.

    kubectl get nodes <NODE_NAME> -o yaml | grep aliyun.com/gpu

    Expected output:

    aliyun.com/gpu: "2"

    If the aliyun.com/gpu field is not 0, topology scheduling is enabled.
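With topology-aware scheduling enabled, workloads consume GPUs through the aliyun.com/gpu resource shown in the verification output, and the scheduler selects the combination of GPUs with the best interconnect on the node. The following Pod is only a rough sketch of how that resource is requested; the name is hypothetical, the image is reused from the samples in this topic, and the complete topology-aware workflow for distributed training may involve additional components.

    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-topology-sample     # Hypothetical name used for this sketch
    spec:
      containers:
      - name: gpu-topology-sample
        image: registry.cn-beijing.aliyuncs.com/acs/tensorflow-mnist-sample:v1.5
        command:
        - python
        - tensorflow-sample-code/tfjob/docker/mnist/main.py
        resources:
          limits:
            aliyun.com/gpu: 2       # Requests 2 GPUs through the topology-aware resource (placeholder value)
      restartPolicy: Never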

Card model scheduling

Schedule tasks to nodes with a specific GPU model or avoid nodes with a specific model.

  1. View the GPU model on the node.

    Run the following command to query the GPU model of the nodes in your cluster. The NVIDIA_NAME column shows the GPU card model.

    kubectl get nodes -L aliyun.accelerator/nvidia_name

    The expected output is similar to the following:

    NAME                        STATUS   ROLES    AGE   VERSION            NVIDIA_NAME
    cn-shanghai.192.XX.XX.176   Ready    <none>   17d   v1.26.3-aliyun.1   Tesla-V100-SXM2-32GB
    cn-shanghai.192.XX.XX.177   Ready    <none>   17d   v1.26.3-aliyun.1   Tesla-V100-SXM2-32GB

    Other ways to check the GPU model:

    On the Clusters page, click the name of the target cluster. In the navigation pane on the left, choose Workloads > Pods. In the row of a created Pod (for example, tensorflow-mnist-multigpu-***), click Terminal in the Actions column. Select the container you want to log in to from the drop-down list and run the following commands.

    • Query the card model: nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 | sed -e 's/ /-/g'

    • Query the GPU memory of each card: nvidia-smi --id=0 --query-gpu=memory.total --format=csv,noheader | sed -e 's/ //g'

    • Query the total number of GPU cards on the node: nvidia-smi -L | wc -l


  2. Enable card model scheduling.

    1. On the Clusters page, find the cluster you want and click its name. In the left-side pane, choose Workloads > Jobs.

    2. On the Jobs page, click Create From YAML. Use the following examples to create an application and enable card model scheduling.


      Specify a particular card model

      Use the GPU card model scheduling label to ensure your application runs on nodes with a specific card model.

      In the code aliyun.accelerator/nvidia_name: "Tesla-V100-SXM2-32GB", replace Tesla-V100-SXM2-32GB with the actual card model of your node.

      Example YAML file:

      apiVersion: batch/v1
      kind: Job
      metadata:
        name: tensorflow-mnist
      spec:
        parallelism: 1
        template:
          metadata:
            labels:
              app: tensorflow-mnist
          spec:
            nodeSelector:
              aliyun.accelerator/nvidia_name: "Tesla-V100-SXM2-32GB" # Runs the application on a Tesla V100-SXM2-32GB GPU.
            containers:
            - name: tensorflow-mnist
              image: registry.cn-beijing.aliyuncs.com/acs/tensorflow-mnist-sample:v1.5
              command:
              - python
              - tensorflow-sample-code/tfjob/docker/mnist/main.py
              - --max_steps=1000
              - --data_dir=tensorflow-sample-code/data
              resources:
                limits:
                  nvidia.com/gpu: 1
              workingDir: /root
            restartPolicy: Never

      After the Job is created, choose Workloads > Pods from the navigation pane on the left. The Pod list shows the example Pod scheduled to a matching node, confirming that scheduling based on the GPU card model label is working.
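      To confirm the placement from the command line, you can also list the Job's Pods with their node assignments (the app=tensorflow-mnist label comes from the YAML above) and compare the node name with the card models queried in step 1:

      kubectl get pods -l app=tensorflow-mnist -o wide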

      Exclude a particular card model

      Use the GPU card model scheduling label with node affinity and anti-affinity to prevent your application from running on certain card models.

      In values: - "Tesla-V100-SXM2-32GB", replace Tesla-V100-SXM2-32GB with the actual card model of your node.

      Example YAML file:

      apiVersion: batch/v1
      kind: Job
      metadata:
        name: tensorflow-mnist
      spec:
        parallelism: 1
        template:
          metadata:
            labels:
              app: tensorflow-mnist
          spec:
            affinity:
              nodeAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                  nodeSelectorTerms:
                  - matchExpressions:
                    - key: aliyun.accelerator/nvidia_name  # Card model scheduling label
                      operator: NotIn
                      values:
                      - "Tesla-V100-SXM2-32GB"            # Prevents the pod from being scheduled to a node with a Tesla-V100-SXM2-32GB card.
            containers:
            - name: tensorflow-mnist
              image: registry.cn-beijing.aliyuncs.com/acs/tensorflow-mnist-sample:v1.5
              command:
              - python
              - tensorflow-sample-code/tfjob/docker/mnist/main.py
              - --max_steps=1000
              - --data_dir=tensorflow-sample-code/data
              resources:
                limits:
                  nvidia.com/gpu: 1
              workingDir: /root
            restartPolicy: Never

      After the Job is created, the application will not be scheduled on nodes with the aliyun.accelerator/nvidia_name: 'Tesla-V100-SXM2-32GB' label. It can, however, be scheduled on other GPU nodes.
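      To target nodes even more precisely, the aliyun.accelerator/nvidia_mem and aliyun.accelerator/nvidia_count labels can be combined with the card model label in the same nodeSelector. The following Pod is only a sketch: the memory and count values are placeholders, so first check the actual label values on your nodes, for example with kubectl get nodes -L aliyun.accelerator/nvidia_mem,aliyun.accelerator/nvidia_count.

      apiVersion: v1
      kind: Pod
      metadata:
        name: tensorflow-mnist-targeted                            # Hypothetical name used for this sketch
      spec:
        nodeSelector:
          aliyun.accelerator/nvidia_name: "Tesla-V100-SXM2-32GB"   # Card model
          aliyun.accelerator/nvidia_mem: "32510MiB"                # Placeholder; use the value reported on your nodes
          aliyun.accelerator/nvidia_count: "8"                     # Placeholder; use the value reported on your nodes
        containers:
        - name: tensorflow-mnist
          image: registry.cn-beijing.aliyuncs.com/acs/tensorflow-mnist-sample:v1.5
          resources:
            limits:
              nvidia.com/gpu: 1
        restartPolicy: Never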