Multi-region serverless computing power scheduling

Last Updated: May 21, 2025

In large language model (LLM) training and inference scenarios, a specific GPU model may be unavailable or out of stock in a region. When that happens, the region cannot supply enough computing power and computing jobs remain pending. The registered clusters provided by Distributed Cloud Container Platform for Kubernetes (ACK One) use the virtual nodes of Container Service for Kubernetes (ACK) to add serverless computing resources from multiple regions to a Kubernetes cluster. You can then schedule GPU resources dynamically and manage them centrally across regions.

With ACK One registered clusters, you can resolve resource bottlenecks in multi-region scenarios, use heterogeneous computing resources in specific regions on demand, and prevent scheduling failures caused by unavailable GPU models or insufficient inventory. This improves resource utilization and business continuity and reduces the complexity and cost of hybrid cloud deployment.

How it works

[Figure: architecture of multi-region serverless computing power scheduling]

  • The region information in the preceding figure indicates the region ID of serverless computing resources. For example, the ID of the China (Zhangjiakou) region is cn-zhangjiakou.

  • For each serverless pod that you create, the virtual node of ACK creates an on-cloud serverless computing instance. You do not need to maintain additional nodes.

  • The data center connects to virtual private clouds (VPCs) in multiple regions by using Express Connect circuits.

Prerequisites

An ACK One registered cluster is created and connected to a Kubernetes cluster that runs in a data center or on another cloud service provider. Kubernetes 1.24 or later is recommended.

Install ack-virtual-node and enable multi-region serverless computing power scheduling

  1. Install ack-virtual-node.

    1. Grant RAM permissions to ack-virtual-node.

    2. Install ack-virtual-node.

    Note

    If ack-virtual-node is already installed, make sure that the installed version is 2.13.0 or later. If the installed version is earlier than 2.13.0, upgrade the component.

  2. Configure ack-virtual-node.

    On the Add-ons page, find ack-virtual-node and click Configuration.

    1. Configure global parameters.

      | Parameter | Description | Example |
      | --- | --- | --- |
      | Specify whether to use VPC internal access | Specifies whether the image and API can both be accessed through the Virtual Private Cloud (VPC) endpoint. | Selected |
      | APIServerHost | The IP address of the API server of the Kubernetes cluster in the data center. | 192.168.1.1 |
      | APIServerPort | The port exposed for the API server of the Kubernetes cluster in the data center. | 6443 |
      | Specifies whether to enable multi-region virtual nodes | Specifies whether to enable multi-region serverless computing power scheduling. If you enable it, you must specify region information. | Selected |

    2. Specify the primary region.

      | Parameter | Description | Example |
      | --- | --- | --- |
      | Region ID | The ID of the region where serverless computing power is used. | cn-beijing |
      | VPC ID | The ID of the VPC where serverless computing power is used. | vpc-xxxxx |
      | vSwitch ID(s) | The IDs of the vSwitches used by serverless computing power. Separate multiple IDs with commas (,). | vsw-xxxxx,vsw-xxxxx |
      | SecurityGroup ID | The ID of the security group used by serverless computing power. | sg-xxxxx |
      | Specifies whether to use the region of virtual nodes as the default region | Specifies whether to set the specified region as the primary region. Important: You can specify only one primary region. | Selected |

    3. Specify secondary regions. Click Add in the lower-right corner to add more regions.

      | Parameter | Description | Example |
      | --- | --- | --- |
      | Region ID | The ID of the region where serverless computing power is used. | cn-hangzhou |
      | VPC ID | The ID of the VPC where serverless computing power is used. | vpc-xxxxx |
      | vSwitch ID(s) | The IDs of the vSwitches used by serverless computing power. Separate multiple IDs with commas (,). | vsw-xxxxx,vsw-xxxxx |
      | SecurityGroup ID | The ID of the security group used by serverless computing power. | sg-xxxxx |

  3. After the configuration is completed, click OK.
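
The constraints stated in the steps above (component version 2.13.0 or later, exactly one primary region, comma-separated vSwitch IDs) can be checked before you enter the values in the console. The following is a minimal sketch and not part of ack-virtual-node: the names RegionConfig, parse_version, split_ids, and preflight are hypothetical, the version string is assumed to be purely numeric and dot-separated, and treating an empty vSwitch list as an error is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Minimum ack-virtual-node version named in the note above.
MIN_VERSION = (2, 13, 0)


@dataclass
class RegionConfig:
    """Hypothetical container for one row set of the region tables above."""
    region_id: str
    vpc_id: str
    vswitch_ids: str          # comma-separated, as entered in the console
    security_group_id: str
    is_default: bool = False  # True for the primary (default) region


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '2.13.0' or 'v2.13.0' into (2, 13, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def split_ids(raw: str) -> List[str]:
    """Split a comma-separated ID list and drop empty entries and stray spaces."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def preflight(installed_version: str, regions: List[RegionConfig]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if parse_version(installed_version) < MIN_VERSION:
        problems.append("ack-virtual-node is older than 2.13.0; upgrade the component")
    default_count = sum(1 for region in regions if region.is_default)
    if default_count != 1:
        problems.append(f"expected exactly one primary region, found {default_count}")
    for region in regions:
        # Assumption of this sketch: every region needs at least one vSwitch ID.
        if not split_ids(region.vswitch_ids):
            problems.append(f"{region.region_id}: no vSwitch ID given")
    return problems
```

Comparing versions as integer tuples matters here because a plain string comparison would rank "2.9.4" above "2.13.0".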

Examples

CPU scenarios

  • Use the default region.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: nginx-default-region
      name: nginx-deployment-default-region
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-default-region
      template:
        metadata:
          labels:
            alibabacloud.com/acs: "true"
            alibabacloud.com/compute-class: general-purpose 
            alibabacloud.com/compute-qos: default 
            app: nginx-default-region
        spec:  
          containers:
            - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
              imagePullPolicy: IfNotPresent
              name: nginx
              ports:
                - containerPort: 80
                  protocol: TCP
    
    
  • Specify the region of serverless computing power. Add the alibabacloud.com/serverless-region-id: <RegionID> label to the pod template (spec.template.metadata.labels).

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: nginx-specified-region
      name: nginx-deployment-specified-region
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-specified-region
      template:
        metadata:
          labels:
            alibabacloud.com/acs: "true" 
            alibabacloud.com/compute-class: general-purpose 
            alibabacloud.com/compute-qos: default 
            alibabacloud.com/serverless-region-id: cn-beijing # Specify the region.
            app: nginx-specified-region
        spec:  
          containers:
            - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
              imagePullPolicy: IfNotPresent
              name: nginx
              ports:
                - containerPort: 80
                  protocol: TCP
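
The two CPU manifests above differ only in the alibabacloud.com/serverless-region-id label on the pod template (and in their names). If you generate manifests programmatically, the label can be added to a manifest that has already been loaded into a dictionary. The following is a minimal sketch under that assumption; pin_region is a hypothetical helper, not an ACK API, and loading or applying the manifest is left out.

```python
import copy

# Label key taken from the manifests above.
REGION_LABEL = "alibabacloud.com/serverless-region-id"


def pin_region(deployment: dict, region_id: str) -> dict:
    """Return a copy of a Deployment manifest whose pod template carries the region label.

    The input dictionary is left unchanged so the same base manifest can be
    reused for several regions.
    """
    pinned = copy.deepcopy(deployment)
    template_meta = pinned["spec"]["template"].setdefault("metadata", {})
    template_meta.setdefault("labels", {})[REGION_LABEL] = region_id
    return pinned


if __name__ == "__main__":
    base = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "nginx-deployment-default-region"},
        "spec": {
            "template": {
                "metadata": {"labels": {"app": "nginx-default-region"}},
                "spec": {"containers": []},
            }
        },
    }
    # cn-beijing is the example region ID used in this topic.
    print(pin_region(base, "cn-beijing")["spec"]["template"]["metadata"]["labels"])
```

The label is set on the pod template, not on the Deployment's own metadata, which matches where it appears in the manifests above.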

GPU scenarios

  • Use the default region.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: nginx-gpu-default-region
      name: nginx-gpu-deployment-default-region
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-gpu-default-region
      template:
        metadata:
          labels:
            alibabacloud.com/acs: "true"
            alibabacloud.com/compute-class: gpu
            alibabacloud.com/compute-qos: default
            alibabacloud.com/gpu-model-series: example-model  # The GPU model. Specify the actual model that you want to use, such as T4.
            app: nginx-gpu-default-region
        spec:  
          containers:
            - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
              imagePullPolicy: IfNotPresent
              name: nginx
              ports:
                - containerPort: 80
                  protocol: TCP
              resources:
                limits:
                  cpu: 1
                  memory: 1Gi
                  nvidia.com/gpu: "1"
                requests:
                  cpu: 1
                  memory: 1Gi
                  nvidia.com/gpu: "1"

  • Specify the region of serverless computing power. Add the alibabacloud.com/serverless-region-id: <RegionID> label to the pod template (spec.template.metadata.labels).

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: nginx-gpu-specified-region
      name: nginx-gpu-deployment-specified-region
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-gpu-specified-region
      template:
        metadata:
          labels:
            alibabacloud.com/acs: "true" 
            alibabacloud.com/compute-class: gpu
            alibabacloud.com/compute-qos: default
            alibabacloud.com/gpu-model-series: example-model  # The GPU model. Specify the actual model that you want to use, such as T4.
            alibabacloud.com/serverless-region-id: cn-beijing # Specify the region.
            app: nginx-gpu-specified-region
        spec:  
          containers:
            - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
              imagePullPolicy: IfNotPresent
              name: nginx
              ports:
                - containerPort: 80
                  protocol: TCP
              resources:
                limits:
                  cpu: 1
                  memory: 1Gi
                  nvidia.com/gpu: "1"
                requests:
                  cpu: 1
                  memory: 1Gi
                  nvidia.com/gpu: "1"
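
The pod template labels used across the CPU and GPU manifests in this topic follow one pattern, which can be summarized in a small helper. This is a sketch for illustration only: acs_pod_labels is a hypothetical function, and requiring a GPU model series whenever the compute class is gpu is an assumption drawn from the examples, not a documented rule.

```python
from typing import Dict, Optional


def acs_pod_labels(
    app: str,
    compute_class: str = "general-purpose",
    gpu_model: Optional[str] = None,
    region_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build the pod template labels shown in the example manifests."""
    labels = {
        "alibabacloud.com/acs": "true",
        "alibabacloud.com/compute-class": compute_class,
        "alibabacloud.com/compute-qos": "default",
        "app": app,
    }
    if compute_class == "gpu":
        # Assumption: the GPU examples always set a model series, so this
        # sketch treats it as required for the gpu compute class.
        if not gpu_model:
            raise ValueError("gpu_model is required when compute_class is 'gpu'")
        labels["alibabacloud.com/gpu-model-series"] = gpu_model
    if region_id:
        # Omitting the region label leaves the pod in the default region.
        labels["alibabacloud.com/serverless-region-id"] = region_id
    return labels
```

For example, acs_pod_labels("nginx-gpu-specified-region", compute_class="gpu", gpu_model="example-model", region_id="cn-beijing") reproduces the label set of the last manifest above, with example-model standing in for the actual GPU model.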