In large language model (LLM) training and inference scenarios, a specific GPU model may be unavailable or GPU resources may be sold out in a region. When this happens, the computing power available in the region becomes insufficient and computing jobs remain pending. Registered clusters provided by Distributed Cloud Container Platform for Kubernetes (ACK One) use Container Service for Kubernetes (ACK) virtual nodes to seamlessly add serverless computing resources in multiple regions to your Kubernetes cluster. This allows you to dynamically schedule GPU resources and centrally manage GPU resources across regions. With ACK One registered clusters, you can resolve resource bottlenecks in multi-region scenarios, use heterogeneous computing resources in specific regions on demand, and prevent GPU scheduling failures caused by unavailable GPU models or insufficient inventory. This greatly improves resource utilization and business continuity and reduces the complexity and cost of hybrid cloud deployment.
How it works
The region information in the preceding figure indicates the region ID of serverless computing resources. For example, the ID of the China (Zhangjiakou) region is cn-zhangjiakou.
For each serverless pod that you create, the ACK virtual node creates a serverless computing instance in the cloud. You do not need to maintain additional nodes.
The data center connects to virtual private clouds (VPCs) in multiple regions by using Express Connect circuits.
Prerequisites
An ACK One registered cluster is created, and a Kubernetes cluster in a data center or a Kubernetes cluster of another cloud service provider is connected to the registered cluster. Kubernetes 1.24 or later is recommended.
Install ack-virtual-node and enable multi-region serverless computing power scheduling
Install ack-virtual-node.
Note: If ack-virtual-node is already installed, make sure that the installed version is 2.13.0 or later. If the installed version is earlier than 2.13.0, upgrade the component.
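If you prefer to check the installed version from the command line instead of the console, you can inspect the controller image tag. This is a minimal sketch; it assumes that the component runs as a Deployment named ack-virtual-node-controller in the kube-system namespace, which may differ depending on the component version.

# List the ack-virtual-node workloads in kube-system. Workload names may vary by version.
kubectl -n kube-system get deployments | grep virtual-node

# Print the controller image tag, which carries the component version.
# Assumes the Deployment is named ack-virtual-node-controller.
kubectl -n kube-system get deployment ack-virtual-node-controller \
  -o jsonpath='{.spec.template.spec.containers[0].image}'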
Configure ack-virtual-node.
On the Add-ons page, find ack-virtual-node and click Configuration.
Configure global parameters.
Parameter: Specify whether to use VPC internal access
Description: Specifies whether both the image and the API are accessed through the Virtual Private Cloud (VPC) endpoint.
Example: Selected

Parameter: APIServerHost
Description: The IP address of the API server of the Kubernetes cluster in the data center.
Example: 192.168.1.1

Parameter: APIServerPort
Description: The port exposed by the API server of the Kubernetes cluster in the data center.
Example: 6443

Parameter: Specify whether to enable multi-region virtual nodes
Description: Specifies whether to enable multi-region serverless computing power scheduling. If you enable this feature, you must specify region information.
Example: Selected
Specify the primary region.
Parameter: Region ID
Description: The ID of the region where serverless computing power is used.
Example: cn-beijing

Parameter: VPC ID
Description: The ID of the VPC where serverless computing power is used.
Example: vpc-xxxxx

Parameter: vSwitch ID(s)
Description: The IDs of the vSwitches used by serverless computing power. Separate multiple IDs with commas (,).
Example: vsw-xxxxx,vsw-xxxxx

Parameter: SecurityGroup ID
Description: The ID of the security group used by serverless computing power.
Example: sg-xxxxx

Parameter: Specify whether to use the region of virtual nodes as the default region
Description: Specifies whether to set the specified region as the primary region. Important: You can specify only one primary region.
Example: Selected
Specify secondary regions. Click Add in the lower-right corner to add more regions.
Parameter: Region ID
Description: The ID of the region where serverless computing power is used.
Example: cn-hangzhou

Parameter: VPC ID
Description: The ID of the VPC where serverless computing power is used.
Example: vpc-xxxxx

Parameter: vSwitch ID(s)
Description: The IDs of the vSwitches used by serverless computing power. Separate multiple IDs with commas (,).
Example: vsw-xxxxx,vsw-xxxxx

Parameter: SecurityGroup ID
Description: The ID of the security group used by serverless computing power.
Example: sg-xxxxx
After the configuration is complete, click OK.
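After you save the configuration and the component is redeployed, you can verify from the command line that virtual nodes for the configured regions are registered in the cluster. This is a minimal sketch; it assumes that the virtual nodes carry the type=virtual-kubelet label, which is a common convention for ACK virtual nodes but may differ across component versions.

# List the virtual nodes. Adjust the label selector if your component version labels nodes differently.
kubectl get nodes -l type=virtual-kubelet -o wide

# Confirm that the ack-virtual-node pods are running in kube-system.
kubectl -n kube-system get pods | grep virtual-node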
Examples
CPU scenarios
Use the default region.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-default-region
  name: nginx-deployment-default-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-default-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: general-purpose
        alibabacloud.com/compute-qos: default
        app: nginx-default-region
    spec:
      containers:
        - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 80
              protocol: TCP

Specify the region of serverless computing power. You must add the alibabacloud.com/serverless-region-id: <RegionID> label.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-specified-region
  name: nginx-deployment-specified-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-specified-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: general-purpose
        alibabacloud.com/compute-qos: default
        alibabacloud.com/serverless-region-id: cn-beijing # Specify the region.
        app: nginx-specified-region
    spec:
      containers:
        - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 80
              protocol: TCP
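To check which node and region a pod actually landed on, you can apply one of the manifests above and inspect the pod. This is a minimal sketch; nginx-specified-region.yaml is a hypothetical file name for the second manifest above.

# Save the second manifest above as nginx-specified-region.yaml (hypothetical file name) and apply it.
kubectl apply -f nginx-specified-region.yaml

# The NODE column shows the virtual node that hosts the serverless pod.
kubectl get pods -l app=nginx-specified-region -o wide

# The pod inherits the region label from the pod template.
kubectl get pods -l app=nginx-specified-region \
  -o jsonpath='{.items[0].metadata.labels.alibabacloud\.com/serverless-region-id}'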
GPU scenarios
Use the default region.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-gpu-default-region
  name: nginx-gpu-deployment-default-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-gpu-default-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model. Specify the actual model that you want to use, such as T4.
        app: nginx-gpu-default-region
    spec:
      containers:
        - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 80
              protocol: TCP
          resources:
            limits:
              cpu: 1
              memory: 1Gi
              nvidia.com/gpu: "1"
            requests:
              cpu: 1
              memory: 1Gi
              nvidia.com/gpu: "1"

Specify the region of serverless computing power. You must add the alibabacloud.com/serverless-region-id: <RegionID> label.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-gpu-specified-region
  name: nginx-gpu-deployment-specified-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-gpu-specified-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model. Specify the actual model that you want to use, such as T4.
        alibabacloud.com/serverless-region-id: cn-beijing # Specify the region.
        app: nginx-gpu-specified-region
    spec:
      containers:
        - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 80
              protocol: TCP
          resources:
            limits:
              cpu: 1
              memory: 1Gi
              nvidia.com/gpu: "1"
            requests:
              cpu: 1
              memory: 1Gi
              nvidia.com/gpu: "1"
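To confirm that the serverless GPU pod was scheduled to the specified region and requests a GPU, you can apply the manifest and inspect the pod. This is a minimal sketch; nginx-gpu-specified-region.yaml is a hypothetical file name for the second manifest above.

# Save the second manifest above as nginx-gpu-specified-region.yaml (hypothetical file name) and apply it.
kubectl apply -f nginx-gpu-specified-region.yaml

# The NODE column shows the virtual node in the specified region.
kubectl get pods -l app=nginx-gpu-specified-region -o wide

# Confirm that the container requests one GPU.
kubectl get pods -l app=nginx-gpu-specified-region \
  -o jsonpath='{.items[0].spec.containers[0].resources.limits.nvidia\.com/gpu}'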