Kubernetes clusters deployed in data centers can run client pods on serverless elastic container instances. You can directly submit client pods in these clusters and run the pods on elastic container instances without the need to maintain node pools. The pods can use CPU and GPU resources provided by elastic container instances in a more flexible, efficient, and elastic manner. This topic describes how to use registered clusters to allow external Kubernetes clusters to use CPU and GPU resources provided by serverless Elastic Container Instance (ECI).
Background information
If you want to customize the configuration of nodes in an external Kubernetes cluster, such as the runtime, kubelet, NVIDIA settings, or use specific Elastic Compute Service (ECS) instance types, you can directly add on-cloud nodes or GPU-accelerated nodes to the cluster. However, this method requires you to manually maintain these nodes, which increases the O&M cost. To reduce O&M costs, you can run client pods on serverless elastic container instances. This saves you the need to manage node pools in the cloud and enables the cluster to use CPU and GPU resources provided by ECI in a more efficient and elastic manner.
Prerequisites
An external Kubernetes cluster is connected to a registered cluster. For more information, see Use onectl to create a registered cluster and Create a registered cluster in the ACK console.
A kubectl client is connected to the registered cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Use scenarios
Using CPU and GPU resources provided by serverless elastic container instances can significantly improve the elasticity of external Kubernetes clusters, helping them handle traffic spikes as your business grows.
The serverless architecture allows you to directly submit client pods in external Kubernetes clusters and run the pods on elastic container instances of Alibaba Cloud. Elastic container instances launch quickly and are billed only for the lifecycle of the client pods, because the lifecycle of an instance matches that of its pod. With the serverless approach, you do not need to create on-cloud nodes, plan on-cloud resources, or wait for the system to create ECS instances, which greatly reduces node O&M expenses. The following figure shows how an external Kubernetes cluster uses CPU and GPU resources provided by serverless elastic container instances.
Using serverless ECI to run pods in external Kubernetes clusters is suitable for the following scenarios:
Auto scaling based on traffic fluctuations: Serverless ECI is suitable for industries in which traffic fluctuates, such as online education and e-commerce. Serverless ECI can reduce O&M expenses on fixed resource pools.
Data computing: Serverless ECI is suitable for Spark, Presto, and Argo Workflows. Serverless ECI can efficiently reduce computing costs because pods are billed based on their uptime.
Continuous integration and continuous delivery (CI/CD) pipeline: Serverless ECI is suitable for Jenkins and GitLab Runner.
Jobs: Serverless ECI is suitable for AI jobs and CronJobs.
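As a sketch of the jobs scenario, the pod template of a CronJob can carry the alibabacloud.com/eci=true label (described in Step 3 below) so that each scheduled run lands on an elastic container instance and is billed only for its runtime. The job name, schedule, and image below are illustrative placeholders:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report          # Hypothetical job name.
spec:
  schedule: "0 2 * * *"         # Example schedule: 02:00 every day.
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            # Run each job pod on a serverless elastic container instance.
            alibabacloud.com/eci: "true"
        spec:
          restartPolicy: Never
          containers:
          - name: report
            image: busybox      # Placeholder image; replace with your job image.
            command: ["sh", "-c", "echo run report"]
```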
Step 1: Install components
To use registered clusters to allow external Kubernetes clusters to use CPU and GPU resources provided by serverless ECI, you need to install the following components:
ack-virtual-node: This component allows you to benefit from the elasticity of virtual nodes and ECI.
ack-co-scheduler: This component allows you to create ResourcePolicy custom resources (CRs) to use multilevel resource scheduling.
You can use the following methods to install the components.
Use onectl
Install onectl on your on-premises machine. For more information, see Use onectl to manage registered clusters.
Run the following command to install the ack-virtual-node and ack-co-scheduler components:
onectl addon install ack-virtual-node
onectl addon install ack-co-scheduler
Expected output:
Addon ack-virtual-node, version **** installed.
Addon ack-co-scheduler, version **** installed.
Use the console
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Add-ons in the left-side navigation pane.
On the Add-ons page, click the Others tab, find ack-virtual-node or ack-co-scheduler, and then click Install in the lower-right part of the card.
In the message that appears, click OK.
Step 2: View nodes
After the ack-virtual-node component is installed, you can use the kubeconfig file of the registered cluster to view the virtual node. The name of the virtual node is prefixed with virtual-kubelet. The virtual node runs pods on serverless elastic container instances.
Run the following command to query the node information:
kubectl get node
Expected output:
NAME                               STATUS   ROLES    AGE    VERSION
iz8vb1xtnuu0ne6b58hvx0z            Ready    master   4d3h   v1.20.9   # An on-premises node. In this example, the node serves as both a master node and a worker node, and can run client pods.
virtual-kubelet-cn-zhangjiakou-a   Ready    agent    99s    v1.20.9   # The virtual node created by the ack-virtual-node component.
Step 3: Use serverless elastic container instances to run pods (CPU-accelerated jobs and GPU-accelerated jobs)
You can use the following methods to run pods on serverless elastic container instances.
Method 1: Configure pod labels
In the following example, a GPU-accelerated elastic container instance is used to run a CUDA job. You do not need to install or configure the NVIDIA driver or runtime. The job runs in a serverless architecture.
Add the alibabacloud.com/eci=true label to a pod to run the pod on a serverless elastic container instance.
Use the following YAML template to submit a pod that runs on a serverless elastic container instance:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
  labels:
    alibabacloud.com/eci: "true"  # Run the pod on a serverless elastic container instance.
  annotations:
    k8s.aliyun.com/eci-use-specs: ecs.gn5-c4g1.xlarge  # Specify an ECS instance type. An ECS instance of this type is equipped with an NVIDIA P100 GPU.
spec:
  restartPolicy: Never
  containers:
  - name: cuda-container
    image: acr-multiple-clusters-registry.cn-hangzhou.cr.aliyuncs.com/ack-multiple-clusters/cuda10.2-vectoradd
    resources:
      limits:
        nvidia.com/gpu: 1  # Apply for a GPU.
  EOF
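If a single instance type may be out of stock in your region, the k8s.aliyun.com/eci-use-specs annotation also accepts a comma-separated list of instance types, which are tried from left to right. A minimal sketch; the fallback type here is an assumption, substitute a GPU-accelerated type available in your region:

```yaml
metadata:
  annotations:
    # Tried in order: gn5 first, then the (illustrative) gn6i fallback.
    k8s.aliyun.com/eci-use-specs: "ecs.gn5-c4g1.xlarge,ecs.gn6i-c4g1.xlarge"
```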
Run the following command to query pods:
kubectl get pod -o wide
Expected output:
NAME      READY   STATUS      RESTARTS   AGE     IP             NODE                               NOMINATED NODE   READINESS GATES
gpu-pod   0/1     Completed   0          5m30s   172.16.XX.XX   virtual-kubelet-cn-zhangjiakou-a   <none>           <none>
Run the following command to print the logs of the pod:
kubectl logs gpu-pod
Expected output:
Using CUDA Device [0]: Tesla P100-PCIE-16GB
GPU Device has SM 6.0 compute capability
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
The output indicates that the pod runs on the virtual node virtual-kubelet-cn-zhangjiakou-a, which is backed by a serverless elastic container instance.
Method 2: Configure namespace labels
Add the alibabacloud.com/eci=true label to a namespace to run all newly created pods in the namespace on serverless elastic container instances.
kubectl label namespace <namespace-name> alibabacloud.com/eci=true
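Alternatively, the label can be declared in the namespace manifest itself. A minimal sketch; the namespace name vk-test is a placeholder:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: vk-test  # Hypothetical namespace name.
  labels:
    # Pods subsequently created in this namespace run on
    # serverless elastic container instances.
    alibabacloud.com/eci: "true"
```

Note that only pods created after the label is applied are affected; pods that already exist in the namespace keep running where they are.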
Step 4: Use multilevel resource scheduling
After you complete Step 3, you can use the multilevel resource scheduling feature of the registered cluster of Distributed Cloud Container Platform for Kubernetes (ACK One). This feature preferentially schedules application pods to on-premises nodes. When on-premises resources are insufficient, the feature schedules the pods to serverless elastic container instances.
The ack-co-scheduler component allows you to create a ResourcePolicy CR to use multilevel resource scheduling. ResourcePolicy CRs are namespaced resources. The following table describes the parameters of ResourcePolicy CRs.
Parameter | Description
--- | ---
selector | Selects the pods to which the ResourcePolicy applies. The policy applies to pods in the same namespace that have the specified labels.
strategy | The scheduling policy. Set the value to prefer.
units | The custom scheduling units. During scale-out activities, the system attempts to obtain resources from the units in the order in which they are listed.
Create a ResourcePolicy CR based on the following content to prioritize on-premises resources over serverless elastic container instances.
cat << EOF | kubectl apply -f -
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: cost-balance-policy
spec:
  selector:
    app: nginx  # Select application pods.
    key1: value1
  strategy: prefer
  units:
  - resource: idc  # Prioritize on-premises nodes.
    max: 3
  - resource: eci  # Use serverless elastic container instances when on-premises nodes are insufficient.
    nodeSelector:
      key2: value2
EOF
Create a Deployment based on the following content to deploy two replicated pods. Each replicated pod requests two CPUs.
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      annotations:
        addannotion: "true"
      labels:
        app: nginx  # The pod label must be the same as the one that you specified for the selector in the ResourcePolicy.
    spec:
      schedulerName: ack-co-scheduler
      containers:
      - name: nginx
        image: acr-multiple-clusters-registry.cn-hangzhou.cr.aliyuncs.com/ack-multiple-clusters/nginx
        resources:
          requests:
            cpu: 2
          limits:
            cpu: 2
EOF
Run the following command to scale the number of replicated pods to four:
kubectl scale deployment nginx --replicas 4
The external Kubernetes cluster contains only one node with six CPUs. The node can host at most two NGINX pods because the system reserves some resources. The remaining two replicated pods are scheduled to serverless elastic container instances.
Run the following command to query the status of the pods:
kubectl get pod -o wide
Expected output:
NAME                     READY   STATUS    RESTARTS   AGE   IP             NODE
nginx-79cd98b4b5-97s47   1/1     Running   0          84s   10.100.XX.XX   iz8vb1xtnuu0ne6b58h****
nginx-79cd98b4b5-gxd8z   1/1     Running   0          84s   10.100.XX.XX   iz8vb1xtnuu0ne6b58h****
nginx-79cd98b4b5-k55rb   1/1     Running   0          58s   10.100.XX.XX   virtual-kubelet-cn-zhangjiakou-a
nginx-79cd98b4b5-m9jxm   1/1     Running   0          58s   10.100.XX.XX   virtual-kubelet-cn-zhangjiakou-a
The output indicates that two pods run on the on-premises node and two pods run on the virtual node, which is backed by serverless elastic container instances.