Container Compute Service: GPU-HPN capacity reservation

Last Updated: Dec 04, 2025

A GPU-HPN capacity reservation for an Alibaba Cloud Container Compute Service (ACS) cluster is a type of resource reservation that provides GPU computing power on High-Performance Networks (HPN). You can associate a GPU-HPN capacity reservation with virtual nodes in an ACS cluster. This enables affinity-based scheduling of GPU container computing power and efficient use of computing resources. This topic describes how to create a GPU-HPN capacity reservation, associate it with a cluster, and query the resources allocated from it.

Background information

ACS clusters do not require you to manage nodes. However, to maintain compatibility with native Kubernetes, virtual nodes are still visible in the cluster. Virtual nodes provide a large capacity of computing resources, which gives ACS clusters high elasticity to handle sudden increases in service traffic. The default virtual nodes are generated based on the vSwitchIds in the acs-profile ConfigMap and do not consume any computing resources.
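As a minimal sketch, the vSwitch settings behind the default virtual nodes could be inspected with kubectl. The ConfigMap name acs-profile and the vSwitchIds field come from the paragraph above. The helper name show_vswitch_ids and the kube-system default namespace are assumptions, so pass a different namespace if your cluster stores the ConfigMap elsewhere.

```shell
# Hypothetical helper: print the vSwitchIds line(s) of the acs-profile ConfigMap.
# $1 is the namespace; the kube-system default is an assumption, not from this topic.
show_vswitch_ids() {
  ns="${1:-kube-system}"
  kubectl get configmap acs-profile -n "$ns" -o yaml | grep -i 'vSwitchIds'
}
```

Call show_vswitch_ids with no argument to use the assumed default namespace, or pass a namespace as the first argument.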

About GPU-HPN capacity reservation

GPU-HPN capacity reservations currently support only the subscription billing method. After a successful reservation, the resources remain available throughout the subscription period. If you reserve multiple instances at once, you can associate them with multiple clusters simultaneously. Reserved nodes can run pods of different quantities and specifications. For example, in scenarios such as large model training and fine-tuning, which require an HPN for distributed communication between GPU cards, a GPU-HPN capacity reservation is deducted in units of whole GPU cards. The number of cards is determined by the pod specification, and there are no constraints on CPU and memory sizes.

Billing and limits

Capacity reservations are billed by the second and follow pay-as-you-go standards. The billing cycle for a capacity reservation is as follows:

Billing start: Billing starts when the reservation is successfully created and its status changes to active.

Billing end: Billing stops after the reserved capacity expires and is automatically released.

Important
  • You must submit a ticket to enable GPU-HPN capacity reservations.

  • ACS clusters support capacity reservation deductions only for the same GPU resource type. If you purchase different types of GPU cards, their costs cannot be offset against each other. ACS bills you based on the price of the card types that you purchased.

  • Nodes from a GPU-HPN capacity reservation can be used only by pods with the compute type set to High-Performance Network GPU (gpu-hpn).
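The compute type requirement in the last item can be illustrated with a small manifest generator. This is a sketch under stated assumptions: the label key alibabacloud.com/compute-class is assumed to be how a pod selects the gpu-hpn compute type, and the function name, pod name, and image are hypothetical. The nvidia.com/gpu resource name matches the node output shown later in this topic. Check the ACS pod documentation for the exact fields before you use it.

```shell
# Hypothetical helper: print a pod manifest that requests whole GPU cards.
# $1 = pod name, $2 = container image, $3 = number of GPU cards (default 1).
gpu_hpn_pod_manifest() {
  name="$1"; image="$2"; gpus="${3:-1}"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
  labels:
    # Assumed label key for choosing the gpu-hpn compute type.
    alibabacloud.com/compute-class: gpu-hpn
spec:
  containers:
  - name: main
    image: ${image}
    resources:
      requests:
        nvidia.com/gpu: "${gpus}"
      limits:
        nvidia.com/gpu: "${gpus}"
EOF
}
```

To validate the generated manifest without creating the pod, you can pipe it to a client-side dry run, for example: gpu_hpn_pod_manifest demo registry.example.com/train:latest 1 | kubectl apply --dry-run=client -f -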

Create a GPU-HPN capacity reservation

  1. Log on to the ACS console. In the left-side navigation pane, click Capacity Reservation.

  2. On the Capacity Reservation page, click Create GPU-HPN Capacity Reservation and enter the following information.

    Parameter              Description
    ---------              -----------
    Region                 Select the region where the resource reservation is located.
    Zone                   Select the zone where the resource reservation is located.
    Quantity               Currently, only Instance is supported.
    Category               Currently, only GPU is supported.
    HPN Zone               Select the HPN Zone where the resource reservation is located.
    Node Type              Refer to the options displayed in the console.
    Subscription Duration  1 Month, 1 Year, 3 Years.
    Down Payment (%)       0% down payment. Use the service immediately after purchase.
    Discount Information   Varies by subscription duration. A 1-month subscription has No Discount. A 1-year subscription has a Discount for 12-month Subscription. A 3-year subscription has a Discount for 36-month Subscription.
    Billing Cycle          Monthly.
    Installments           Depends on the subscription duration. For example, a 1-month subscription has 1 Installment, and a 1-year subscription has 12 Installments.
    Quantity               The number of instances to purchase.

    After you complete the configuration, click Buy Now. On the Confirm Order page, click Pay. On the payment page, click Subscribe to complete the purchase.
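The relationship between Subscription Duration and Installments can be sketched as a small lookup. The function name installments_for is hypothetical. The values for 1 Month and 1 Year come from the parameter descriptions above, while 36 installments for 3 Years is an assumption extrapolated from the monthly billing cycle, so confirm it in the console.

```shell
# Hypothetical helper: map a subscription duration to monthly installments.
installments_for() {
  case "$1" in
    "1 Month") echo 1 ;;
    "1 Year")  echo 12 ;;
    "3 Years") echo 36 ;;  # assumption: one installment per month over 36 months
    *) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
}
```

For example, installments_for "1 Year" prints 12, and an unsupported duration returns a nonzero exit status.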

Associate a cluster

  1. Log on to the ACS console. In the left-side navigation pane, click Capacity Reservation.

  2. On the Capacity Reservation page, find the capacity reservation that you want to associate with a cluster. In the status bar of the reservation, click Associate Cluster.

    Important

    Only ACS clusters, ACK managed clusters, ACK One registered clusters, and ACK One distributed workflow Argo clusters can be associated. Other cluster types are not supported.

  3. In the Resource Association dialog box, select or enter the cluster ID and the number of instances to associate. Then, click OK.

    Important

    In the Resource Association dialog box, the Associate Cluster drop-down list shows only the IDs of ACS clusters and ACK managed clusters. For ACK One registered clusters and ACK One distributed workflow Argo clusters, you must enter the cluster ID directly.


Query allocated resources in a GPU-HPN capacity reservation

A GPU-HPN capacity reservation is represented as a Kubernetes node in the cluster. You can use the kubectl tool to view the allocated resources of the node.

  1. Run the following command to view the GPU-HPN node by its label.

    kubectl get node -l alibabacloud.com/node-type=reserved

    Expected output:

    NAME                                      STATUS   ROLES   AGE   VERSION
    cn-wulanchabu-c.cr-rkccqmu0xz8rea1*****   Ready    agent   20m   v1.28.3-aliyun

  2. Run the following command to query the allocated resources in the GPU-HPN capacity reservation. Replace the node name in the command with the GPU-HPN node name that you obtained in the previous step.

    kubectl describe node cn-wulanchabu-c.cr-rkccqmu0xz8rea1***** | grep Allocated -A 10

    Expected output:

    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource              Requests    Limits
      --------              --------    ------
      cpu                   16 (8%)     16 (8%)
      memory                128Gi (7%)  128Gi (7%)
      ephemeral-storage     30Gi (0%)   30Gi (0%)
      nvidia.com/gpu        1           1
    Events:                 <none>

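    Lines 5 to 8 of the output show the allocated amounts for CPU, memory, ephemeral storage, and GPUs, respectively. For CPU, memory, and ephemeral storage, the percentage in parentheses is the allocation rate. No percentage is shown for nvidia.com/gpu.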
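The two steps above can be combined into a small script that reports allocation for every reserved node. This is a sketch and not part of the official procedure: the function names allocated_rows and reserved_node_allocations are hypothetical, and the parsing assumes the describe output keeps the table layout shown above.

```shell
# Read "kubectl describe node" text on stdin and print one tab-separated
# line per resource: name, requested amount, allocation rate ("-" if absent).
allocated_rows() {
  awk '
    /^[ \t]*Allocated resources:/ { in_block = 1; seen_rule = 0; next }
    in_block && /^[ \t]*Events:/  { in_block = 0 }         # table ends here
    in_block && $1 ~ /^-+$/       { seen_rule = 1; next }  # dashed rule before data
    in_block && seen_rule && NF >= 3 {
      pct = ($3 ~ /^\(/) ? $3 : "-"
      printf "%s\t%s\t%s\n", $1, $2, pct
    }
  '
}

# List nodes carrying the reserved label and summarize each one.
reserved_node_allocations() {
  kubectl get node -l alibabacloud.com/node-type=reserved -o name |
  while read -r node; do
    echo "== ${node}"
    kubectl describe "$node" | allocated_rows
  done
}
```

Run reserved_node_allocations against a cluster that has an associated reservation. Each node is printed as a header line followed by its resource rows.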