On an ACK managed cluster Pro, assign scheduling labels to GPU nodes to optimize resource utilization and schedule applications with precision. These labels define properties such as exclusive access, shared use, topology awareness, and specific GPU card models.
Scheduling label overview
GPU scheduling labels identify GPU models and resource allocation policies to support fine-grained resource management and efficient scheduling.
| Scheduling feature | Label and value | Use cases |
| --- | --- | --- |
| Exclusive scheduling (default) | ack.node.gpu.schedule: default | Performance-critical tasks that require exclusive access to an entire GPU, such as model training and high-performance computing (HPC). |
| Shared scheduling | ack.node.gpu.schedule with the value cgpu, core_mem, share, or mps | Improves GPU utilization. Ideal for scenarios with multiple concurrent lightweight tasks, such as multitenancy and inference. |
| Shared scheduling (multi-GPU placement) | ack.node.gpu.placement with the value binpack or spread | Optimizes the resource allocation strategy on multi-GPU nodes when shared scheduling is enabled. |
| Topology-aware scheduling | ack.node.gpu.schedule: topology | Automatically assigns the optimal combination of GPUs to a Pod based on the physical GPU topology of a single node. Suitable for tasks that are sensitive to GPU-to-GPU communication latency. |
| Card model scheduling | aliyun.accelerator/nvidia_name. Use this label with node selectors or node affinity for more specific targeting. | Schedules tasks to nodes with a specific GPU model, or avoids nodes with a specific model. |
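The sections below show how to apply these labels through the console or kubectl. To check which scheduling labels are already set on your GPU nodes, you can list them as extra columns; this is a minimal sketch using the standard kubectl label-column option.

```shell
# List nodes together with their GPU scheduling and placement labels (columns are empty if the labels are not set).
kubectl get nodes -L ack.node.gpu.schedule -L ack.node.gpu.placement
```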
Exclusive scheduling
If a node has no GPU scheduling labels, Exclusive scheduling is enabled by default. In this mode, the node allocates GPU resources to Pods in whole-card units.
If you have enabled another GPU scheduling feature, deleting its label alone does not restore exclusive scheduling. You must manually change the label value to ack.node.gpu.schedule: default to re-enable it.
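As a minimal illustration of exclusive mode, a Pod requests whole GPUs through the standard nvidia.com/gpu resource, assuming the node exposes it through the NVIDIA device plugin; the Pod name and image below are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: exclusive-gpu-demo                              # hypothetical name
spec:
  containers:
  - name: cuda-container
    image: registry.example.com/cuda-sample:latest      # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1                                # request one whole GPU card
```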
Shared scheduling
Shared scheduling is available only for ACK managed cluster Pro. For more information, see Limits.
Install the ack-ai-installer component.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster you want and click its name. In the left-side navigation pane, go to the Cloud-native AI Suite page.
On the Cloud-native AI Suite page, click Deploy. On the Deploy Cloud-native AI Suite page, select Scheduling Policy Extension (Batch Task Scheduling, GPU Sharing, Topology-aware GPU Scheduling).
To learn how to set the computing power scheduling policy for the cGPU service, see Install and use the cGPU component.
Click Deploy Cloud-native AI Suite.
On the Cloud-native AI Suite page, verify that the ack-ai-installer component appears in the list of installed components.
Enable shared scheduling.
On the Clusters page, click the name of your target cluster. In the left-side navigation pane, go to the Node Pools page.
On the Node Pools page, click Create Node Pool, configure the node labels, then click Confirm.
You can keep the default values for other configuration items. For details on each label's function, see Scheduling label overview.
Configure basic shared scheduling.
For Node Labels, add a label with the Key ack.node.gpu.schedule and one of the following values: cgpu, core_mem, share, or mps (mps requires the MPS Control Daemon component to be installed).
Configure multi-card shared scheduling.
On multi-GPU nodes, you can add a placement strategy to your basic shared scheduling configuration.
For Node Labels, add another label with the Key ack.node.gpu.placement and set the value to either binpack or spread.
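If the GPU nodes already exist, the same labels can also be applied directly with kubectl rather than through the node pool configuration; this is a sketch analogous to the topology-aware example later in this topic, and <NODE_NAME> is a placeholder.

```shell
# Example: enable cgpu shared scheduling with the binpack placement policy on an existing node.
kubectl label node <NODE_NAME> ack.node.gpu.schedule=cgpu
kubectl label node <NODE_NAME> ack.node.gpu.placement=binpack
```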
Verify that shared scheduling is enabled.
cgpu/share/mps
Replace <NODE_NAME> with the name of a node in the target node pool and run the following command to verify that cgpu, share, or mps shared scheduling is enabled on the node.

```shell
kubectl get nodes <NODE_NAME> -o yaml | grep "aliyun.com/gpu-mem"
```

Expected output:

```yaml
aliyun.com/gpu-mem: "60"
```

If the aliyun.com/gpu-mem field is not 0, cgpu, share, or mps shared scheduling is enabled.
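A Pod then consumes the shared GPU by requesting GPU memory in GiB through the aliyun.com/gpu-mem resource instead of a whole card. This is a minimal sketch; the Pod name, image, and request size are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-demo                                   # hypothetical name
spec:
  containers:
  - name: inference
    image: registry.example.com/inference-sample:latest  # placeholder image
    resources:
      limits:
        aliyun.com/gpu-mem: 4                            # request 4 GiB of GPU memory on a shared GPU
```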
core_mem
Replace <NODE_NAME> with the name of a node in the target node pool and run the following command to verify that core_mem shared scheduling is enabled.

```shell
kubectl get nodes <NODE_NAME> -o yaml | grep -E 'aliyun\.com/gpu-core\.percentage|aliyun\.com/gpu-mem'
```

Expected output:

```yaml
aliyun.com/gpu-core.percentage: "80"
aliyun.com/gpu-mem: "6"
```

If both the aliyun.com/gpu-core.percentage and aliyun.com/gpu-mem fields are not 0, core_mem shared scheduling is enabled.
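On core_mem nodes, a Pod can request a share of GPU compute in percent through aliyun.com/gpu-core.percentage alongside GPU memory. A minimal sketch with placeholder names and values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-core-mem-demo                                # hypothetical name
spec:
  containers:
  - name: inference
    image: registry.example.com/inference-sample:latest  # placeholder image
    resources:
      limits:
        aliyun.com/gpu-core.percentage: 30               # 30% of one GPU's compute
        aliyun.com/gpu-mem: 4                            # 4 GiB of GPU memory
```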
binpack
Use the shared GPU resource query tool to check the GPU resource allocation on the node:

```shell
kubectl inspect cgpu
```

Expected output:

```
NAME                      IPADDRESS    GPU0(Allocated/Total)  GPU1(Allocated/Total)  GPU2(Allocated/Total)  GPU3(Allocated/Total)  GPU Memory(GiB)
cn-shanghai.192.0.2.109   192.0.2.109  15/15                  9/15                   0/15                   0/15                   24/60
--------------------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster:  24/60 (40%)
```

The output shows that GPU0 is fully allocated (15/15) while GPU1 is partially allocated (9/15). This confirms that the binpack strategy is active, filling one GPU completely before allocating resources on the next.
spread
Use the shared GPU resource query tool to check the GPU resource allocation on the node:

```shell
kubectl inspect cgpu
```

Expected output:

```
NAME                      IPADDRESS    GPU0(Allocated/Total)  GPU1(Allocated/Total)  GPU2(Allocated/Total)  GPU3(Allocated/Total)  GPU Memory(GiB)
cn-shanghai.192.0.2.109   192.0.2.109  4/15                   4/15                   0/15                   4/15                   12/60
--------------------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster:  12/60 (20%)
```

The output shows that resources are allocated across GPU0 (4/15), GPU1 (4/15), and GPU3 (4/15). This confirms that the spread strategy, which distributes Pods across different GPUs, is active.
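One way to observe the placement strategy is to deploy several small shared-GPU Pods and rerun kubectl inspect cgpu: under binpack the allocations fill one GPU before moving to the next, while under spread they land on different GPUs. A minimal sketch with placeholder names and values:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: placement-demo                                   # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: placement-demo
  template:
    metadata:
      labels:
        app: placement-demo
    spec:
      containers:
      - name: worker
        image: registry.example.com/inference-sample:latest  # placeholder image
        resources:
          limits:
            aliyun.com/gpu-mem: 4                        # each replica requests 4 GiB of shared GPU memory
```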
Topology-aware scheduling
Topology-aware scheduling is available only for ACK managed cluster Pro. For more information, see System component version requirements.
Enable topology-aware scheduling.
Replace <NODE_NAME> with the name of your target node and run the following command to add a label to the node and enable topology-aware GPU scheduling.

```shell
kubectl label node <NODE_NAME> ack.node.gpu.schedule=topology
```

A node with topology-aware scheduling enabled no longer supports GPU workloads that are not topology-aware. To restore exclusive scheduling, run kubectl label node <NODE_NAME> ack.node.gpu.schedule=default --overwrite.
Verify that topology-aware scheduling is enabled.
Replace <NODE_NAME> with the name of your target node and run the following command to verify that topology-aware scheduling is enabled.

```shell
kubectl get nodes <NODE_NAME> -o yaml | grep aliyun.com/gpu
```

Expected output:

```yaml
aliyun.com/gpu: "2"
```

If the aliyun.com/gpu field is not 0, topology-aware scheduling is enabled.
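On a topology-aware node, a Pod requests GPUs through the aliyun.com/gpu resource shown in the verification output, and the scheduler selects the combination of cards with the best interconnect. This is a hedged sketch with placeholder names; depending on your workload type, additional job-level configuration may be required.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: topology-demo                                    # hypothetical name
spec:
  containers:
  - name: training
    image: registry.example.com/training-sample:latest   # placeholder image
    resources:
      limits:
        aliyun.com/gpu: 2                                # request two GPUs chosen for optimal topology
```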
Card model scheduling
Schedule tasks to nodes with a specific GPU model or avoid nodes with a specific model.
View the GPU model on the node.
Run the following command to query the GPU model of the nodes in your cluster.
The NVIDIA_NAME field shows the GPU card model.
```shell
kubectl get nodes -L aliyun.accelerator/nvidia_name
```

The expected output is similar to the following:

```
NAME                        STATUS   ROLES    AGE   VERSION            NVIDIA_NAME
cn-shanghai.192.XX.XX.176   Ready    <none>   17d   v1.26.3-aliyun.1   Tesla-V100-SXM2-32GB
cn-shanghai.192.XX.XX.177   Ready    <none>   17d   v1.26.3-aliyun.1   Tesla-V100-SXM2-32GB
```

Enable card model scheduling.
On the Clusters page, find the cluster you want and click its name. In the left-side navigation pane, go to the Jobs page.
On the Jobs page, click Create From YAML. Use the following examples to create an application and enable card model scheduling.

Specify a particular card model
Use the GPU card model scheduling label to ensure your application runs on nodes with a specific card model.
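The following is a minimal Job sketch that pins the Pod to nodes carrying the card model label through a nodeSelector; the Job name, image, and GPU request are placeholders.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nvidia-name-demo                                 # hypothetical name
spec:
  template:
    spec:
      nodeSelector:
        aliyun.accelerator/nvidia_name: "Tesla-V100-SXM2-32GB"   # target this card model
      containers:
      - name: cuda-container
        image: registry.example.com/cuda-sample:latest   # placeholder image
        resources:
          limits:
            nvidia.com/gpu: 1                            # request one GPU
      restartPolicy: Never
```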
In the code aliyun.accelerator/nvidia_name: "Tesla-V100-SXM2-32GB", replace Tesla-V100-SXM2-32GB with the actual card model of your node.
After the Job is created, open the Pods page from the navigation pane on the left. The Pod list shows the example Pod scheduled to a matching node, confirming that scheduling based on the GPU card model label is working.
Exclude a particular card model
Use the GPU card model scheduling label with node affinity and anti-affinity to prevent your application from running on certain card models.
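The following is a minimal Job sketch that uses required node anti-affinity (a NotIn requirement on the card model label) to keep the Pod off nodes with that model; the Job name, image, and GPU request are placeholders.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: exclude-model-demo                               # hypothetical name
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: aliyun.accelerator/nvidia_name
                operator: NotIn
                values:
                - "Tesla-V100-SXM2-32GB"                 # avoid nodes with this card model
      containers:
      - name: cuda-container
        image: registry.example.com/cuda-sample:latest   # placeholder image
        resources:
          limits:
            nvidia.com/gpu: 1                            # request one GPU
      restartPolicy: Never
```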
In values: - "Tesla-V100-SXM2-32GB", replace Tesla-V100-SXM2-32GB with the actual card model of your node.
After the Job is created, the application will not be scheduled on nodes with the aliyun.accelerator/nvidia_name: "Tesla-V100-SXM2-32GB" label. It can, however, be scheduled on other GPU nodes.
