Use the computing power of ACS in ACK One registered clusters

Container Compute Service (ACS) is integrated into Distributed Cloud Container Platform for Kubernetes (ACK One) registered clusters. You can use ACK One registered clusters to quickly use the computing power provided by ACS. This topic describes how to use the computing power of ACS in registered clusters of ACK One.

How to use the computing power of ACS in ACK One registered clusters

Container Compute Service (ACS) is a cloud computing service that provides container compute resources that comply with the container specifications of Kubernetes. ACS adopts a layered architecture to implement Kubernetes control and computing power. The compute resources layer schedules and allocates resources to pods. The Kubernetes control layer manages workloads, such as Deployments, Services, StatefulSets, and CronJobs.

The computing power of ACS is made available to Kubernetes clusters through virtual nodes. This gives Kubernetes clusters high elasticity and removes the limit imposed by the computing capacity of cluster nodes. After ACS takes over infrastructure management for pods, the Kubernetes cluster no longer needs to schedule or launch individual pods, and no longer needs to be concerned about the resources of the underlying VMs. ACS can meet the resource requirements of pods at any time.

In ACK One registered clusters, you must install the ack-virtual-node component to deploy virtual nodes and create ACS pods. When you need to scale out your cluster, you can create ACS pods on virtual nodes without planning the resource capacity of the nodes. ACS pods can communicate with pods on physical nodes in the cluster. To use resources efficiently, shorten the scaling time, and reduce costs, we recommend that you schedule long-running workloads that have elastic traffic to virtual nodes. When business traffic decreases, the pods on virtual nodes can be quickly released to reduce costs. Pods on virtual nodes run in a secure and isolated environment that is built on top of ACS. Such a pod is referred to as an ACS pod. For more information, see Overview of registered clusters.

Prerequisites

An ACK One registered cluster is created and connected to a data center or a Kubernetes cluster of another cloud service provider. We recommend that you select Kubernetes 1.24 or later. For more information, see Create a registered cluster.
The ack-virtual-node component is installed and the version of the component is 2.13.0 or later. For more information, see Step 1: Grant RAM permissions to ack-virtual-node and Step 2: Install ack-virtual-node.

How to use the CPU computing power of ACS in ACK One registered clusters

After you install ack-virtual-node 2.13.0 or later, or update the component to that version, you can create both ACS pods and elastic container instances on virtual nodes.

Note

When you schedule pods to virtual nodes, if you do not specify the compute class of the pods, elastic container instances are prioritized for pod scheduling by default.

To use the computing power of ACS in an ACK One registered cluster, configure the following items:

Configure node selectors, affinity and anti-affinity rules, ResourcePolicies, or the alibabacloud.com/acs: "true" label to schedule pods to virtual nodes. For more information, see Node affinity scheduling.
When you create an ACS pod, add the alibabacloud.com/compute-class label to the pod and set its value to the compute class of the pod. For more information about the compute classes of ACS pods, see ACS pod overview.

Perform the following steps:

Create a Deployment.

NodeSelector

Run the following command to query the labels of a virtual node. Replace virtual-kubelet-cn-shanghai-l in the following command with the actual virtual node name.

kubectl get node virtual-kubelet-cn-shanghai-l -oyaml

Expected output:

apiVersion: v1
kind: Node
metadata:
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: virtual-kubelet-cn-shanghai-l
    kubernetes.io/os: linux
    kubernetes.io/role: agent
    service.alibabacloud.com/exclude-node: "true"
    topology.diskplugin.csi.alibabacloud.com/zone: cn-shanghai-l
    topology.kubernetes.io/region: cn-shanghai
    topology.kubernetes.io/zone: cn-shanghai-l
    type: virtual-kubelet # Each virtual node has this label. If you want to schedule a pod to a virtual node, you can configure this label as the node selector of the pod.
  name: virtual-kubelet-cn-shanghai-l
spec:
  taints:
  - effect: NoSchedule
    key: virtual-kubelet.io/provider
    value: alibabacloud
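
If the name of the virtual node is not known in advance, the type=virtual-kubelet label shown in the preceding output can be used to look it up. The following is a minimal sketch. The function name list_virtual_nodes is hypothetical, and the command assumes that kubectl is configured to access the registered cluster.

```shell
# List the names of all nodes that carry the type=virtual-kubelet label.
# Each result is printed in the form node/<node name>.
list_virtual_nodes() {
  kubectl get nodes -l type=virtual-kubelet -o name
}

# Example (requires access to the cluster):
# list_virtual_nodes
```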

Create a YAML file named nginx.yaml by using the following content to provision two pods:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx 
        alibabacloud.com/compute-class: general-purpose # The compute class of the ACS pod. Default value: general-purpose.
        alibabacloud.com/compute-qos: default # The quality of service (QoS) class of the ACS pod. Default value: default.
    spec:
      nodeSelector:
        type: virtual-kubelet # The node selector used to select a virtual node.
      tolerations:
      - key: "virtual-kubelet.io/provider" # The toleration used to tolerate virtual nodes.
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: nginx
        image: mirrors-ssl.aliyuncs.com/nginx:stable-alpine
        ports:
          - containerPort: 80
            protocol: TCP        
        resources:
          limits:
            cpu: 2
          requests:
            cpu: 2

Deploy an NGINX application and query the pods.

Run the following command to create an NGINX application:
```
kubectl apply -f nginx.yaml 
```

Run the following command to check whether the NGINX application is deployed:

kubectl get pods -o wide

Expected output:

NAME                     READY   STATUS    RESTARTS   AGE     IP               NODE                            NOMINATED NODE   READINESS GATES
nginx-54bcbc9b66-****   1/1     Running   0          3m29s   192.168.XX.XXX   virtual-kubelet-cn-shanghai-l   <none>           <none>
nginx-54bcbc9b66-****   1/1     Running   0          3m29s   192.168.XX.XXX   virtual-kubelet-cn-shanghai-l   <none>           <none>

The command output indicates that the two pods are deployed on nodes that have the type=virtual-kubelet label, which is specified by the nodeSelector parameter in the Deployment configurations.
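
The same placement could also be expressed as a node affinity rule instead of a node selector. The following sketch writes a patch file that adds a required node affinity on the type=virtual-kubelet label to the nginx Deployment. The file name affinity-patch.yaml is hypothetical, and the sketch assumes that the Deployment already contains the toleration for the virtual-kubelet.io/provider taint shown above.

```shell
# Write a patch that adds a required node affinity for virtual nodes
# to the pod template of a Deployment.
cat > affinity-patch.yaml <<'EOF'
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: type
                operator: In
                values:
                - virtual-kubelet
EOF

# Apply the patch to the nginx Deployment (requires access to the cluster):
# kubectl patch deployment nginx --patch-file affinity-patch.yaml
```

If the nodeSelector field is kept in the Deployment, both conditions apply. In this example they select the same nodes.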

Label scheduling

Create a file named nginx.yaml that contains the following content:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx 
        alibabacloud.com/acs: "true" # Use the computing power of ACS.
        alibabacloud.com/compute-class: general-purpose # The compute class of the ACS pod. Default value: general-purpose.
        alibabacloud.com/compute-qos: default # The QoS class of the ACS pod. Default value: default.
    spec:
      containers:
      - name: nginx
        image: mirrors-ssl.aliyuncs.com/nginx:stable-alpine
        ports:
          - containerPort: 80
            protocol: TCP 
        resources:
          limits:
            cpu: 2
          requests:
            cpu: 2

Run the following command to create an NGINX application:
```
kubectl apply -f nginx.yaml 
```

Run the following command to check whether the NGINX application is deployed:

kubectl get pods -o wide

Expected output:

NAME                     READY   STATUS    RESTARTS   AGE     IP               NODE                            NOMINATED NODE   READINESS GATES
nginx-54bcbc9b66-****   1/1     Running   0          3m29s   192.168.XX.XXX   virtual-kubelet-cn-shanghai-l   <none>           <none>
nginx-54bcbc9b66-****   1/1     Running   0          3m29s   192.168.XX.XXX   virtual-kubelet-cn-shanghai-l   <none>           <none>

The command output indicates that the two pods are deployed on a virtual node. In this example, the pods are scheduled to the virtual node based on the alibabacloud.com/acs: "true" label in the pod template instead of a node selector.
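
To confirm which ACS labels the pods actually carry, the labels can be displayed as extra columns. This is a minimal sketch; the function name show_acs_labels is hypothetical, and the app=nginx selector assumes the Deployment from this example.

```shell
# Show the ACS-related labels of the NGINX pods as additional columns.
show_acs_labels() {
  kubectl get pods -l app=nginx \
    -L alibabacloud.com/acs,alibabacloud.com/compute-class,alibabacloud.com/compute-qos
}

# Example (requires access to the cluster):
# show_acs_labels
```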

Run the following command to query the details of the pod created for the NGINX application:

kubectl describe pod nginx-54bcbc9b66-****

Expected output:

Annotations:  ProviderCreate: done
              alibabacloud.com/instance-id: acs-uf6008giwgjxlvn*****
              alibabacloud.com/pod-ephemeral-storage: 30Gi
              alibabacloud.com/pod-use-spec: 2-2Gi
              kubernetes.io/pod-stream-port: 10250
              network.alibabacloud.com/enable-dns-cache: false
              topology.kubernetes.io/region: cn-shanghai

The command output indicates that the configurations of the pod include the alibabacloud.com/instance-id: acs-uf6008giwgjxlvn***** annotation. This indicates that the pod is an ACS pod.
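
Instead of reading the full describe output, the annotation can be queried directly. The following sketch assumes that the alibabacloud.com/instance-id annotation is present on ACS pods, as in the preceding output. The function name acs_instance_id is hypothetical.

```shell
# Print the alibabacloud.com/instance-id annotation of a pod.
# Dots inside the annotation key are escaped for the JSONPath expression.
# Usage: acs_instance_id <pod name>
acs_instance_id() {
  kubectl get pod "$1" \
    -o jsonpath='{.metadata.annotations.alibabacloud\.com/instance-id}'
}

# Example (requires access to the cluster; replace the pod name):
# acs_instance_id nginx-54bcbc9b66-****
```

An empty result suggests that the annotation is not set on the pod.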

How to use the GPU computing power of ACS in ACK One registered clusters

The procedure for using the GPU computing power of ACS is similar to that for using the CPU computing power of ACS. However, you must also make sure that the scheduling components meet the version requirements, and add GPU-specific labels and resource requests to the pods.

Activation

The feature of using ACS GPU computing power in ACK clusters is in invitational preview. To use this feature, submit a ticket.

Add the following labels to the pod template to request ACS GPU computing power:

...
      labels:
        # Add labels to request ACS GPU resources.
        alibabacloud.com/compute-class: gpu     # Set the value to gpu to use GPU computing power.
        alibabacloud.com/compute-qos: default   # The QoS class, specified in the same way as for regular ACS computing power.
        alibabacloud.com/gpu-model-series: example-model  # The GPU model. Specify the actual model that you use, such as T4.
...

Note

For more information about the relationship between ACS compute classes and QoS classes, see Mappings between compute classes and computing power QoS classes.
For more information about the GPU models supported by gpu-model-series, see GPU models.

NodeSelector

Use the following content to create a GPU workload:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-node-selector-demo
  labels:
    app: node-selector-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-selector-demo
  template:
    metadata:
      labels:
        app: node-selector-demo
        # The ACS attributes.
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model  # The GPU model. Specify the actual model that you want to use, such as T4.
    spec:
      # The specified label.
      nodeSelector:
        type: virtual-kubelet
      # The taint to be tolerated.
      tolerations:
      - key: "virtual-kubelet.io/provider" # The toleration used to tolerate virtual nodes.
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: node-selector-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
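
The manifest can then be applied in the same way as in the CPU example. The following sketch assumes that the manifest is saved to a file. The file name gpu-deployment.yaml and the function name deploy_gpu_workload are hypothetical, and the 10-minute timeout is an arbitrary choice.

```shell
# Create the GPU workload from a manifest file and wait for the rollout.
# Usage: deploy_gpu_workload <manifest file>
deploy_gpu_workload() {
  kubectl apply -f "$1" &&
    kubectl rollout status deployment/dep-node-selector-demo --timeout=10m
}

# Example (requires access to the cluster):
# deploy_gpu_workload gpu-deployment.yaml
```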

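Run the following command to query the status of the GPU-accelerated workload. Pods created by the Deployment are prefixed with the Deployment name. Replace the pod name in the command with the actual pod name.

kubectl get pod dep-node-selector-demo-9cdf7bbf9-s**** -oyaml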

Expected output:

    phase: Running

    resources:
      limits:
        #other resources
        nvidia.com/gpu: "1"
      requests:
        #other resources
        nvidia.com/gpu: "1"
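
To check only the fields shown above without reading the full YAML, a JSONPath query can be used. This is a minimal sketch; the function name gpu_pod_summary is hypothetical, and the app=node-selector-demo selector assumes the labels from the example Deployment.

```shell
# Print name, phase, and GPU limit of the first container for each pod
# of the example workload, one pod per line, separated by tabs.
gpu_pod_summary() {
  kubectl get pods -l app=node-selector-demo \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.spec.containers[0].resources.limits.nvidia\.com/gpu}{"\n"}{end}'
}

# Example (requires access to the cluster):
# gpu_pod_summary
```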

Label scheduling

Use the following content to create a GPU workload:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-node-selector-demo
  labels:
    app: node-selector-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-selector-demo
  template:
    metadata:
      labels:
        app: node-selector-demo
        # The ACS attributes.
        alibabacloud.com/acs: "true" # Use the computing power of ACS.
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model  # The GPU model. Specify the actual model that you want to use, such as T4.
    spec:
      containers:
      - name: node-selector-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"

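Run the following command to query the status of the GPU-accelerated workload. Pods created by the Deployment are prefixed with the Deployment name. Replace the pod name in the command with the actual pod name.

kubectl get pod dep-node-selector-demo-9cdf7bbf9-s**** -oyaml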

Expected output:

    phase: Running

    resources:
      limits:
        #other resources
        nvidia.com/gpu: "1"
      requests:
        #other resources
        nvidia.com/gpu: "1"