In large language model (LLM) training and inference scenarios, a specific GPU model may be unavailable or GPU resources may be sold out in a region. When this happens, the computing power available in the region becomes insufficient and computing jobs remain pending. Registered clusters provided by Distributed Cloud Container Platform for Kubernetes (ACK One) use Container Service for Kubernetes (ACK) virtual nodes to seamlessly add serverless computing resources in multiple regions to your Kubernetes cluster. This allows you to dynamically schedule GPU resources and centrally manage GPU resources across regions. With ACK One registered clusters, you can resolve resource bottlenecks in multi-region scenarios, use heterogeneous computing resources in specific regions on demand, and prevent GPU scheduling failures caused by unavailable GPU models or insufficient inventory. This greatly improves resource utilization and business continuity and reduces the complexity and cost of hybrid cloud deployment.
How it works
The region information in the preceding figure indicates the region ID of serverless computing resources. For example, the ID of the China (Zhangjiakou) region is cn-zhangjiakou.
For each serverless pod that you create, the ACK virtual node creates a serverless computing instance in the cloud. You do not need to maintain additional nodes.
The data center connects to virtual private clouds (VPCs) in multiple regions by using Express Connect circuits.
Prerequisites
An ACK One registered cluster is created, and a Kubernetes cluster in a data center or a Kubernetes cluster of another cloud service provider is connected to the registered cluster. Kubernetes 1.24 or later is recommended.
Install ack-virtual-node and enable multi-region serverless computing power scheduling
Install ack-virtual-node.
Note: If ack-virtual-node is already installed, make sure that the installed version is 2.13.0 or later. If the installed version is earlier than 2.13.0, upgrade the component.
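If you prefer to check the installed version from the command line instead of the console, you can inspect the controller image tag. This is a minimal sketch; it assumes that the component runs as a Deployment named ack-virtual-node-controller in the kube-system namespace, which may differ depending on the component version.

# List the ack-virtual-node workloads in kube-system. Workload names may vary by version.
kubectl -n kube-system get deployments | grep virtual-node

# Print the controller image tag, which carries the component version.
# Assumes the Deployment is named ack-virtual-node-controller.
kubectl -n kube-system get deployment ack-virtual-node-controller \
  -o jsonpath='{.spec.template.spec.containers[0].image}'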
Configure ack-virtual-node.
On the Add-ons page, find ack-virtual-node and click Configuration.
Configure global parameters.
Parameter: Specify whether to use VPC internal access
Description: Specifies whether both the image and the API are accessed through the Virtual Private Cloud (VPC) endpoint.
Example: Selected

Parameter: APIServerHost
Description: The IP address of the API server of the Kubernetes cluster in the data center.
Example: 192.168.1.1

Parameter: APIServerPort
Description: The port exposed by the API server of the Kubernetes cluster in the data center.
Example: 6443

Parameter: Specify whether to enable multi-region virtual nodes
Description: Specifies whether to enable multi-region serverless computing power scheduling. If you enable this feature, you must specify region information.
Example: Selected
Specify the primary region.
Parameter: Region ID
Description: The ID of the region where serverless computing power is used.
Example: cn-beijing

Parameter: VPC ID
Description: The ID of the VPC where serverless computing power is used.
Example: vpc-xxxxx

Parameter: vSwitch ID(s)
Description: The IDs of the vSwitches used by serverless computing power. Separate multiple IDs with commas (,).
Example: vsw-xxxxx,vsw-xxxxx

Parameter: SecurityGroup ID
Description: The ID of the security group used by serverless computing power.
Example: sg-xxxxx

Parameter: Specify whether to use the region of virtual nodes as the default region
Description: Specifies whether to set the specified region as the primary region. Important: You can specify only one primary region.
Example: Selected
Specify secondary regions. Click Add in the lower-right corner to add more regions.
Parameter: Region ID
Description: The ID of the region where serverless computing power is used.
Example: cn-hangzhou

Parameter: VPC ID
Description: The ID of the VPC where serverless computing power is used.
Example: vpc-xxxxx

Parameter: vSwitch ID(s)
Description: The IDs of the vSwitches used by serverless computing power. Separate multiple IDs with commas (,).
Example: vsw-xxxxx,vsw-xxxxx

Parameter: SecurityGroup ID
Description: The ID of the security group used by serverless computing power.
Example: sg-xxxxx
After the configuration is complete, click OK.
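After you save the configuration and the component is redeployed, you can verify from the command line that virtual nodes for the configured regions are registered in the cluster. This is a minimal sketch; it assumes that the virtual nodes carry the type=virtual-kubelet label, which is a common convention for ACK virtual nodes but may differ across component versions.

# List the virtual nodes. Adjust the label selector if your component version labels nodes differently.
kubectl get nodes -l type=virtual-kubelet -o wide

# Confirm that the ack-virtual-node pods are running in kube-system.
kubectl -n kube-system get pods | grep virtual-node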
Examples
CPU scenarios
Use the default region.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-default-region
  name: nginx-deployment-default-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-default-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: general-purpose
        alibabacloud.com/compute-qos: default
        app: nginx-default-region
    spec:
      containers:
        - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 80
              protocol: TCP

Specify the region of serverless computing power. You must add the alibabacloud.com/serverless-region-id: <RegionID> label.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-specified-region
  name: nginx-deployment-specified-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-specified-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: general-purpose
        alibabacloud.com/compute-qos: default
        alibabacloud.com/serverless-region-id: cn-beijing # Specify the region.
        app: nginx-specified-region
    spec:
      containers:
        - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 80
              protocol: TCP
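To check which node and region a pod actually landed on, you can apply one of the manifests above and inspect the pod. This is a minimal sketch; nginx-specified-region.yaml is a hypothetical file name for the second manifest above.

# Save the second manifest above as nginx-specified-region.yaml (hypothetical file name) and apply it.
kubectl apply -f nginx-specified-region.yaml

# The NODE column shows the virtual node that hosts the serverless pod.
kubectl get pods -l app=nginx-specified-region -o wide

# The pod inherits the region label from the pod template.
kubectl get pods -l app=nginx-specified-region \
  -o jsonpath='{.items[0].metadata.labels.alibabacloud\.com/serverless-region-id}'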
GPU scenarios
Use the default region.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-gpu-default-region
  name: nginx-gpu-deployment-default-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-gpu-default-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model. Specify the actual model that you want to use, such as T4.
        app: nginx-gpu-default-region
    spec:
      containers:
        - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 80
              protocol: TCP
          resources:
            limits:
              cpu: 1
              memory: 1Gi
              nvidia.com/gpu: "1"
            requests:
              cpu: 1
              memory: 1Gi
              nvidia.com/gpu: "1"

Specify the region of serverless computing power. You must add the alibabacloud.com/serverless-region-id: <RegionID> label.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-gpu-specified-region
  name: nginx-gpu-deployment-specified-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-gpu-specified-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model. Specify the actual model that you want to use, such as T4.
        alibabacloud.com/serverless-region-id: cn-beijing # Specify the region.
        app: nginx-gpu-specified-region
    spec:
      containers:
        - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 80
              protocol: TCP
          resources:
            limits:
              cpu: 1
              memory: 1Gi
              nvidia.com/gpu: "1"
            requests:
              cpu: 1
              memory: 1Gi
              nvidia.com/gpu: "1"
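To confirm that the serverless GPU pod was scheduled to the specified region and requests a GPU, you can apply the manifest and inspect the pod. This is a minimal sketch; nginx-gpu-specified-region.yaml is a hypothetical file name for the second manifest above.

# Save the second manifest above as nginx-gpu-specified-region.yaml (hypothetical file name) and apply it.
kubectl apply -f nginx-gpu-specified-region.yaml

# The NODE column shows the virtual node in the specified region.
kubectl get pods -l app=nginx-gpu-specified-region -o wide

# Confirm that the container requests one GPU.
kubectl get pods -l app=nginx-gpu-specified-region \
  -o jsonpath='{.items[0].spec.containers[0].resources.limits.nvidia\.com/gpu}'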