GPU-HPN Capacity Reservation is a feature for ACS Clusters that lets you reserve GPU computing resources with High-Performance Network (HPN) support. You can associate a GPU-HPN Capacity Reservation with a Virtual Node in your ACS Cluster, and then use Affinity-based Scheduling to run GPU container workloads on the reserved capacity and use your computing resources efficiently. This topic explains how to create a GPU-HPN Capacity Reservation and associate it with clusters.
Background information
With an ACS cluster, you do not need to manage nodes. However, to maintain compatibility with native Kubernetes, you can still see Virtual Nodes in the cluster. Virtual Nodes offer vast computing capacity, which gives ACS clusters significant elasticity so you do not have to worry about sudden bursts of service traffic. The default Virtual Node is generated based on the vSwitchIds in the acs-profile ConfigMap and does not consume any computing resources by itself.
Introduction to GPU-HPN Capacity Reservation
Currently, GPU-HPN Capacity Reservation is available only by subscription. Once your reservation is active, the resources are guaranteed for the entire subscription period. If you reserve multiple instances, you can associate them with multiple clusters simultaneously.
Use the reserved capacity to run Pod instances of various sizes and specifications. For example, in scenarios that require High-Performance Network (HPN) for distributed communication between GPUs, such as large-scale model training and fine-tuning, the reserved capacity is consumed based on the number of whole GPUs specified in the Pod, regardless of the Pod's CPU and memory allocation.
Billing and limitations
Capacity reservations are billed based on the pay-as-you-go rate of the instance type, calculated on a per-second basis. However, resources are reserved via a subscription model. The billing cycle is as follows:
Billing starts when the reservation is successfully created and its status changes to active.
Billing stops when the reserved capacity expires. The capacity is then released automatically.
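The per-second billing described above can be illustrated with a small calculation. This is only a sketch: the function name `estimate_reservation_cost`, the rate, and the duration are hypothetical inputs chosen for illustration, not real ACS prices, and the sketch assumes no subscription discount is applied.

```shell
# Estimate the charge for a capacity reservation as rate x seconds x quantity.
# All arguments are hypothetical inputs, not real ACS prices:
#   $1  pay-as-you-go rate per instance per second
#   $2  seconds between activation and expiry
#   $3  number of reserved instances
estimate_reservation_cost() {
  awk -v rate="$1" -v seconds="$2" -v qty="$3" \
    'BEGIN { printf "%.2f\n", rate * seconds * qty }'
}

# Example: 2 instances held for 30 days at a made-up rate of 0.001 per second.
estimate_reservation_cost 0.001 $((30 * 24 * 3600)) 2   # prints 5184.00
```

Check the actual rate and any applicable discount on the console before relying on an estimate like this.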
Submit a ticket to enable GPU-HPN Capacity Reservation.
Capacity reservations can only offset the costs of Pods that use the exact same GPU resource type. If you purchase a reservation for one GPU model, it cannot be used to deduct the costs of a different GPU model. In such cases, ACS will bill you at the standard rate for the actual GPU type used.
Only Pods with the compute type set to High-Performance Network GPU (gpu-hpn) can use nodes from a GPU-HPN Capacity Reservation.
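As a rough illustration of the last limitation, the following sketch writes a Pod manifest that requests one whole GPU and targets a reserved node. Several parts are assumptions rather than statements from this topic: the label key `alibabacloud.com/compute-class` is assumed to be how the compute type is set, `nodeSelector` is used as the simplest form of node affinity, and the helper name `write_gpu_hpn_pod`, the Pod name, and the image are placeholders. The node label `alibabacloud.com/node-type: reserved` is the one used later in this topic to list GPU-HPN nodes.

```shell
# Write a sample Pod manifest to the file given as $1.
write_gpu_hpn_pod() {
  cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-hpn-demo                          # hypothetical name
  labels:
    alibabacloud.com/compute-class: gpu-hpn   # assumed label key for the compute type
spec:
  nodeSelector:
    alibabacloud.com/node-type: reserved      # label carried by GPU-HPN reserved nodes
  containers:
  - name: main
    image: registry.example.com/train:latest  # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1                     # whole GPUs consumed from the reservation
EOF
}

manifest="${TMPDIR:-/tmp}/gpu-hpn-pod.yaml"
write_gpu_hpn_pod "$manifest"
# Then, against a cluster that has an associated reservation:
#   kubectl apply -f "$manifest"
```

Verify the exact label keys against the ACS Pod reference before using a manifest like this in a real cluster.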
Create a GPU-HPN Capacity Reservation
Log on to the ACS console. In the left navigation pane, click Capacity Reservation.
On the Capacity Reservation page, click Create GPU-HPN Capacity Reservation and enter the following parameters.
Parameter: Description
Region: Select the region where the resource reservation will be located.
Zone: Select the zone where the resource reservation will be located.
Resource Type: Currently, only Instance is supported.
Category: Currently, only GPU is supported.
HPN Zone: Select the HPN Zone where the resource reservation will be located.
Node Type: The available options are displayed on the console.
Subscription Duration: 1 month, 1 year, or 3 years.
Down Payment (%): No down payment. You can use the resources immediately after purchase.
Discount Information: Discounts vary based on the subscription duration. There is No Discount for a 1-month subscription, a Discount for 12-month Subscription for a 1-year term, and a Discount for 36-month Subscription for a 3-year term.
Billing Cycle: Monthly.
Installments: Depends on the subscription duration. For example, a 1-month subscription has 1 Installment, and a 1-year subscription has 12 Installments.
Quantity: The number of instances to purchase.
After you configure the parameters, click Create Capacity Reservation, then click OK to confirm.
Associate clusters with a capacity reservation
Log on to the ACS console. In the left navigation pane, click Capacity Reservation.
On the Capacity Reservation page, find the capacity reservation you want to associate and click Associate Cluster for that entry.
Important: You can only associate reservations with ACS Clusters, ACK Managed Clusters, ACK One Registered Clusters, and ACK One Distributed Workflow Argo Clusters. Other cluster types are not supported.
In the Resource Association dialog box, select or enter the cluster ID and specify the number of instances to associate. Then, click OK.
Important: In the Resource Association dialog box, the Associate Cluster drop-down list only displays the IDs of ACS Clusters and ACK Managed Clusters. To associate ACK One Registered Clusters and ACK One Distributed Workflow Argo Clusters, enter the cluster ID manually.

Query the allocated resources of a GPU-HPN capacity reservation
A GPU-HPN Capacity Reservation is represented as a Kubernetes Node in the cluster. You can use the kubectl tool to view the allocated resources of the Node.
Run the following command to view the GPU-HPN node by its label:
kubectl get node -l alibabacloud.com/node-type=reserved
Expected output:
NAME                                      STATUS   ROLES   AGE   VERSION
cn-wulanchabu-c.cr-rkccqmu0xz8rea1*****   Ready    agent   20m   v1.28.3-aliyun
Run the following command to query the resources allocated in the GPU-HPN Capacity Reservation. Replace the node name with the one obtained in the previous step.
kubectl describe node cn-wulanchabu-c.cr-rkccqmu0xz8rea1***** | grep Allocated -A 10
Expected output:
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                16 (8%)     16 (8%)
  memory             128Gi (7%)  128Gi (7%)
  ephemeral-storage  30Gi (0%)   30Gi (0%)
  nvidia.com/gpu     1           1
Events:              <none>
In the output, the cpu, memory, ephemeral-storage, and nvidia.com/gpu lines show the allocated resources and their utilization.
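If you want to consume these values in a script rather than read them by eye, the output can be reduced to one line per resource. The following is a minimal sketch, assuming the `kubectl describe node` layout shown in this topic; the helper name `summarize_allocated` is hypothetical, and the sample input is the expected output from the previous step.

```shell
# Read `kubectl describe node` output on stdin and print one line per
# allocated resource in the form "<resource> requests=<value> limits=<value>".
summarize_allocated() {
  awk '
    /Allocated resources:/ { in_block = 1; next }
    /^Events:/             { in_block = 0 }
    in_block && ($1 == "cpu" || $1 == "memory" ||
                 $1 == "ephemeral-storage" || $1 == "nvidia.com/gpu") {
      # A percentage, when present, follows each value as a separate field.
      limits = ($3 ~ /^\(/) ? $4 : $3
      printf "%s requests=%s limits=%s\n", $1, $2, limits
    }
  '
}

# Sample input copied from the expected output in this topic.
summarize_allocated <<'EOF'
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                16 (8%)     16 (8%)
  memory             128Gi (7%)  128Gi (7%)
  ephemeral-storage  30Gi (0%)   30Gi (0%)
  nvidia.com/gpu     1           1
Events:              <none>
EOF
```

Against a live cluster, the same helper could be fed directly, for example `kubectl describe node <node-name> | summarize_allocated`, where `<node-name>` is the node returned by the label query.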