
Container Service for Kubernetes:Install the topology-aware GPU scheduling component

Last Updated: Mar 03, 2026

Install the ack-ai-installer component in your ACK cluster to enable topology-aware GPU scheduling. Based on the physical topology of the GPUs on a node, this feature selects the combination of GPUs that delivers the optimal training speed.

Before you begin

Before you begin, make sure that your cluster and nodes meet the following requirements.

Version requirements

Component                                          Required version
Kubernetes                                         1.18.8 or later
NVIDIA driver                                      418.87.01 or later
NVIDIA Collective Communications Library (NCCL)    2.7 or later
GPU model                                          V100
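The driver requirement above can be checked directly on a GPU node. A minimal sketch, assuming `nvidia-smi` is on the PATH of the node where you run it (check the Kubernetes server version separately with `kubectl version`):

```shell
#!/bin/sh
# Sketch: check the node's NVIDIA driver against the 418.87.01 minimum.
# version_ge VER MIN -> success if VER >= MIN (dotted numeric versions).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
  driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
  if version_ge "$driver" "418.87.01"; then
    echo "driver $driver meets the minimum"
  else
    echo "driver $driver is older than 418.87.01" >&2
  fi
fi
```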

Supported operating systems

  • CentOS 7.6, CentOS 7.7

  • Ubuntu 16.04, Ubuntu 18.04

  • Alibaba Cloud Linux 2, Alibaba Cloud Linux 3
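To compare a node against this list, you can read its os-release data. A minimal sketch; note that on CentOS, `VERSION_ID` reports only the major version ("7"), so check `/etc/centos-release` for the 7.6 / 7.7 minor version:

```shell
#!/bin/sh
# Sketch: read the node's distro ID and version to compare against the
# supported list (CentOS 7.6/7.7, Ubuntu 16.04/18.04,
# Alibaba Cloud Linux 2/3).
# os_line FILE -> "ID VERSION_ID" parsed from an os-release style file.
os_line() {
  ( . "$1" && echo "${ID:-unknown} ${VERSION_ID:-unknown}" )
}

[ -r /etc/os-release ] && os_line /etc/os-release
# CentOS reports only the major version in os-release; the minor
# version (7.6 / 7.7) is in /etc/centos-release.
[ -r /etc/centos-release ] && cat /etc/centos-release
exit 0
```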

Install the component from Cloud-native AI Suite

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find your cluster and click its name.

  3. In the left-side navigation pane, choose Applications > Cloud-native AI Suite.

  4. On the Cloud-native AI Suite page, click Deploy.

  5. In the Scheduling section, select Scheduling Policy Extension (Batch Task Scheduling, GPU Sharing, Topology-aware GPU Scheduling), and then click Deploy Cloud-native AI Suite. For more information about the parameters, see Install the cloud-native AI suite.

  6. Verify that ack-ai-installer appears in the Components list on the Cloud-native AI Suite page.

Note: If you have already installed a component of the Cloud-native AI Suite, find ack-ai-installer in the Components list and click Deploy in the Actions column.
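Beyond checking the console, you can verify from the command line that the deployed pods are healthy. A minimal sketch; the kube-system namespace is an assumption, so adjust it to wherever your cluster deploys the suite's components:

```shell
#!/bin/sh
# Sketch: flag any component pod that is not yet Running.
# not_running reads `kubectl get pods` output and prints the NAME of
# every pod whose STATUS column is not Running.
not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# kube-system is an assumed namespace; adjust as needed.
pending=$(kubectl get pods -n kube-system 2>/dev/null | not_running)
if [ -n "$pending" ]; then
  echo "pods not yet Running: $pending" >&2
else
  echo "all inspected pods are Running"
fi
```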

What to do next

After you install the component, configure topology-aware GPU scheduling policies for your workloads. For more information, see GPU topology-aware scheduling.
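As a hedged sketch of what such a workload configuration can look like: the `ack.node.gpu.schedule` node label and the `aliyun.com/gpu` extended resource name below are assumptions drawn from typical ACK usage, so verify both against the GPU topology-aware scheduling guide before use; the pod name and image are placeholders.

```yaml
# Sketch only; verify names against "GPU topology-aware scheduling".
# 1) Mark a GPU node for topology-aware scheduling (assumed label name):
#    kubectl label node <your-node-name> ack.node.gpu.schedule=topology
# 2) Request GPUs through the extended resource (assumed resource name):
apiVersion: v1
kind: Pod
metadata:
  name: topology-aware-training   # hypothetical example name
spec:
  containers:
  - name: trainer
    image: <your-training-image>  # placeholder
    resources:
      limits:
        aliyun.com/gpu: 2         # GPUs picked by physical topology
```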