Container Service for Kubernetes: Overview of heterogeneous computing clusters

Last Updated: Nov 26, 2025

Alibaba Cloud Container Service for Kubernetes (ACK) provides unified scheduling and operations management for heterogeneous computing resources, which improves resource utilization in heterogeneous computing clusters. This topic describes the types of Kubernetes clusters for heterogeneous computing that ACK supports.

Introduction to Kubernetes clusters for heterogeneous computing

ACK schedules and manages heterogeneous resources such as GPUs, application-specific integrated circuits (ASICs), and elastic Remote Direct Memory Access (eRDMA) devices in a unified way. The following table describes the cluster types and features that ACK supports for each heterogeneous resource.

Heterogeneous resource

Description

GPU

ACK lets you create clusters whose nodes use mainstream NVIDIA GPUs, such as T4, P100, and V100.

  • Supports resource requests for individual GPUs.

  • Supports auto scaling based on GPU metrics.

  • Supports GPU sharing and computing power fencing. The GPU sharing technology developed by Alibaba Cloud lets multiple model inference applications run on the same GPU at the same time, which reduces GPU costs. The cGPU solution provided by Alibaba Cloud fences GPU memory and computing power between containers without requiring changes to the application containers, which improves application stability. The following GPU device allocation policies are supported:

    • Single-pod-single-GPU sharing: This policy is commonly used in model inference scenarios.

    • Single-pod-multi-GPU sharing: This policy is commonly used when developing distributed model training jobs.

    • Binpack allocation policy: Multiple pods are preferentially scheduled to the same GPU card. This policy is suitable for scenarios where you need to improve GPU utilization.

    • Spread allocation policy: Multiple pods are scheduled to different GPU cards whenever possible. This policy is suitable for high availability (HA) scenarios.

  • Supports topology-aware GPU scheduling. This feature retrieves the topology of heterogeneous computing resources from nodes. The scheduler uses this topology information to choose the placement that makes the best use of NVLink, PCIe switch, QPI, and RDMA NIC connections, which helps achieve optimal performance.

  • Supports GPU resource monitoring. This feature provides monitoring metrics from both the node and application perspectives, automatically detects device exceptions (software and hardware) and raises alerts for them, and covers both dedicated and shared GPU scenarios.
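The difference between the binpack and spread allocation policies described above can be illustrated with a small simulation. The following Python sketch is not the ACK scheduler. It is a minimal model that assumes each pod requests a fixed amount of GPU memory in GiB and that ties are broken by the lowest GPU index. The names pick_gpu and place are hypothetical and exist only for this illustration.

```python
from typing import Dict, List, Optional


def pick_gpu(free_mem: Dict[int, int], request: int, policy: str) -> Optional[int]:
    """Return the index of the GPU chosen for one pod, or None if no GPU fits.

    free_mem maps a GPU index to its free memory in GiB (assumed unit).
    """
    candidates = [gpu for gpu, free in free_mem.items() if free >= request]
    if not candidates:
        return None
    if policy == "binpack":
        # Prefer the fullest GPU that still fits, so pods share one card.
        return min(candidates, key=lambda gpu: (free_mem[gpu], gpu))
    if policy == "spread":
        # Prefer the emptiest GPU, so pods land on different cards.
        return min(candidates, key=lambda gpu: (-free_mem[gpu], gpu))
    raise ValueError(f"unknown policy: {policy}")


def place(requests: List[int], gpu_mem: Dict[int, int], policy: str) -> List[Optional[int]]:
    """Place pods one by one and return the GPU index chosen for each pod."""
    free = dict(gpu_mem)  # copy so the caller's dictionary is not modified
    placements: List[Optional[int]] = []
    for request in requests:
        gpu = pick_gpu(free, request, policy)
        if gpu is not None:
            free[gpu] -= request
        placements.append(gpu)
    return placements


if __name__ == "__main__":
    gpus = {0: 16, 1: 16}  # two GPUs with 16 GiB each (illustrative values)
    pods = [4, 4, 4]       # three pods that each request 4 GiB
    print(place(pods, gpus, "binpack"))  # [0, 0, 0]
    print(place(pods, gpus, "spread"))   # [0, 1, 0]
```

With binpack, all three pods share GPU 0 and GPU 1 stays free for larger workloads. With spread, the pods alternate between the two cards, so the failure of one card affects fewer pods.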

ASIC

ACK lets you create clusters that contain NETINT ASIC devices and supports resource requests for individual NETINT ASIC cards.

eRDMA

ACK lets you create clusters that contain eRDMA devices.

  • Supports submitting distributed deep learning training jobs that use eRDMA devices through Arena.

  • Supports jobs that have high requirements for network bandwidth, such as distributed deep learning training jobs.

GPU instance types supported by ACK

ACK supports multiple GPU-accelerated compute-optimized instance families. To add GPU nodes to an ACK cluster, you can select an instance type from the ECS instance families listed below.

Confidential computing instances are not supported. These instance types contain the -tee field, such as ecs.gn8v-tee.4xlarge.

Note

You cannot select vGPU-accelerated instances as cluster nodes in the ACK console. For more information, see "Does Container Service for Kubernetes support vGPU-accelerated instances?"
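When you choose instance types programmatically, the -tee naming rule mentioned above can be checked before a node pool is created. The following Python sketch assumes that instance type names are dot-separated and that the confidential computing marker appears as a hyphen-separated tee field inside one segment, as in ecs.gn8v-tee.4xlarge. The helper names are hypothetical, and the check only applies this naming rule. It does not confirm that an instance type is otherwise supported by ACK.

```python
from typing import Iterable, List


def is_confidential_instance_type(instance_type: str) -> bool:
    """Return True if a segment of the name has a "tee" field after a hyphen."""
    for segment in instance_type.split("."):
        fields = segment.split("-")
        # fields[0] is the base name; later fields are suffixes such as "tee".
        if "tee" in fields[1:]:
            return True
    return False


def drop_confidential(instance_types: Iterable[str]) -> List[str]:
    """Keep only the instance types that do not match the -tee naming rule."""
    return [t for t in instance_types if not is_confidential_instance_type(t)]


if __name__ == "__main__":
    candidates = [
        "ecs.gn8v-tee.4xlarge",     # confidential computing example from this topic
        "ecs.video-trans.26xhevc",  # ASIC instance type from this topic
    ]
    print(drop_confidential(candidates))  # ['ecs.video-trans.26xhevc']
```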

ASIC instance types supported by ACK

To add ASIC nodes to an ACK cluster, you can select the instance type ecs.video-trans.26xhevc.

eRDMA instance types supported by ACK

ACK supports multiple eRDMA-accelerated instance families. You can select from the ECS instance families listed below. For more information, see Enable eRDMA on an enterprise-level instance and Enable eRDMA on a GPU-accelerated instance.