This topic describes how to schedule pods from an ACK One registered cluster to Alibaba Cloud Serverless (ACS) computing power by using virtual nodes. Virtual nodes allow you to run workloads on ACS without provisioning or managing physical nodes.
How it works
Virtual nodes integrate ACS computing power into your ACK One registered cluster. After you install the ack-virtual-node component, a virtual node is created in your cluster. When you schedule a pod to the virtual node, ACS automatically provisions serverless compute resources for the pod. The pod runs in an isolated environment on ACS and can communicate with other pods in the cluster. Virtual nodes are ideal for workloads with variable or unpredictable traffic patterns because they scale on demand without requiring you to provision or manage physical nodes.
For more information about virtual nodes, see Overview of registered clusters.
Prerequisites
You have created an ACK One registered cluster and connected it to a Kubernetes cluster. Kubernetes 1.24 or later is required. For more information, see Create an ACK One registered cluster.
The ack-virtual-node component version 2.13.0 or later is installed in the registered cluster. For more information, see Install the ack-virtual-node component.
Configure RAM permissions for the ack-virtual-node component
onectl
Install onectl on your on-premises machine. For more information, see Use onectl to manage registered clusters.
Run the following command to configure RAM permissions for the ack-virtual-node component:
    onectl ram-user grant --addon ack-virtual-node
Expected output:
    Ram policy ack-one-registered-cluster-policy-ack-virtual-node granted to ram user ack-one-user-ce313528c3 successfully.
Console
Before installing the component, create a RAM user, grant it the necessary permissions, and create an AccessKey for it. You will use this AccessKey to create a Secret that allows the component to access cloud services.
(Optional) Create a custom policy. Use the following policy content.
Grant permissions to the RAM user. You can attach the system policies AliyunECIFullAccess, AliyunVPCReadOnlyAccess, and AliyunAccFullAccess, or attach the custom policy you created.
Create an AccessKey for the RAM user.
Warning: Configure a network access policy as described in Network access control policies for AccessKeys to restrict AccessKey calls to trusted network environments. This enhances the security of your AccessKey.
Use the AccessKey to create a Secret named alibaba-addon-secret in the registered cluster.
    kubectl -n kube-system create secret generic alibaba-addon-secret --from-literal='access-key-id=<your access key id>' --from-literal='access-key-secret=<your access key secret>'
Here, <your access key id> and <your access key secret> are the AccessKey values that you obtained in the previous step.
When you install the ack-virtual-node component, it automatically uses this AccessKey to access the corresponding cloud services.
Example: Use ACS CPU computing power
After you install or upgrade the ack-virtual-node component to version 2.13.0 or later, the component supports both ACS and Elastic Container Instance (ECI) computing power.
When a pod is scheduled to a virtual node, ECI is used by default unless you specify ACS as the computing power type.
Perform the following steps to use ACS CPU computing power with an ACK One registered cluster:
Update the security group configuration of the registered cluster.
On the Basic Information page of the cluster, click the ID of the Control Plane Security Group.
On the security group details page, click Add Rule. Configure the rule by using the following values.
Rule Type | Protocol | Port Range | Source IP Range | Description
Inbound | TCP | 80 | CIDR block of the IDC cluster, for example, 192.168.1.0/24. | For configuring ACS endpoint scenarios.
Inbound | TCP | 443 | CIDR block of the IDC cluster, for example, 192.168.1.0/24. | For configuring ACS endpoint scenarios.
Inbound | TCP | 10250 | CIDR block of the IDC cluster, for example, 192.168.1.0/24. | The port listened on by the serverless kubelet service.
Create a Deployment that uses ACS computing power.
Create a file named nginx.yaml and copy the following content to the file:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
            alibabacloud.com/acs: "true" # Configure to use ACS computing power.
            alibabacloud.com/compute-class: general-purpose # Configure the computing power type for the ACS pod.
            alibabacloud.com/compute-qos: default # Configure the computing power quality for the ACS pod.
        spec:
          containers:
          - name: nginx
            image: mirrors-ssl.aliyuncs.com/nginx:stable-alpine
            ports:
            - containerPort: 80
              protocol: TCP
            resources:
              limits:
                cpu: 2
              requests:
                cpu: 2
Run the following command to create the nginx application:
    kubectl apply -f nginx.yaml
Run the following command to check the deployment status:
    kubectl get pods -o wide
Expected output (simplified):
    NAME                    READY   STATUS    RESTARTS   AGE     IP               NODE                            NOMINATED NODE   READINESS GATES
    nginx-54bcbc9b66-****   1/1     Running   0          3m29s   192.168.XX.XXX   virtual-kubelet-cn-shanghai-l   <none>           <none>
    nginx-54bcbc9b66-****   1/1     Running   0          3m29s   192.168.XX.XXX   virtual-kubelet-cn-shanghai-l   <none>           <none>
The output shows that the two pods are scheduled to the node labeled type=virtual-kubelet.
Run the following command to view the details of an nginx pod:
    kubectl describe pod nginx-54bcbc9b66-****
Expected output:
    Annotations:  ProviderCreate: done
                  alibabacloud.com/instance-id: acs-uf6008giwgjxlvn*****
                  alibabacloud.com/pod-ephemeral-storage: 30Gi
                  alibabacloud.com/pod-use-spec: 2-2Gi
                  kubernetes.io/pod-stream-port: 10250
                  network.alibabacloud.com/enable-dns-cache: false
                  topology.kubernetes.io/region: cn-shanghai
If the output contains the annotation alibabacloud.com/instance-id: acs-uf6008giwgjxlvn*****, the pod is an ACS pod instance.
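The annotation check described above can be scripted instead of reading the kubectl describe output by eye. The following is a minimal sketch under stated assumptions: the function name is_acs_pod is hypothetical, and treating any instance ID that starts with acs- as an ACS pod is inferred from the sample output above rather than from a documented rule. The query uses the kubectl JSONPath output format, in which dots inside a key are escaped with a backslash.

```shell
# Sketch: report whether a pod carries an ACS instance ID annotation.
# Usage: is_acs_pod <pod-name> [namespace]
is_acs_pod() {
    pod_name="$1"
    namespace="${2:-default}"
    # Read only the alibabacloud.com/instance-id annotation of the pod.
    instance_id=$(kubectl -n "$namespace" get pod "$pod_name" \
        -o jsonpath='{.metadata.annotations.alibabacloud\.com/instance-id}')
    case "$instance_id" in
        # Assumption: ACS instance IDs start with "acs-", as in the sample output.
        acs-*) echo "ACS pod: $instance_id"; return 0 ;;
        *)     echo "no ACS instance ID found on $pod_name" >&2; return 1 ;;
    esac
}
```

Pass the name of one of the nginx pods as the first argument. The function returns a non-zero status when the annotation is missing or has a different prefix, which may indicate that the pod was created on ECI, the default described earlier.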
Example: Use ACS GPU computing power
ACS GPU computing power for ACK One registered clusters is in invitational preview. To enable this feature, submit a ticket.
Configure a GPU workload
After the feature is enabled, you can configure a GPU workload. The following example shows the required labels for scheduling pods to ACS GPU computing power:
    ...
    labels:
      # Declare the ACS GPU resource requirements in the label.
      alibabacloud.com/compute-class: gpu # If the type is GPU, set this to gpu.
      alibabacloud.com/compute-qos: default # The QoS type for computing. The meaning is the same as for normal ACS computing power.
      alibabacloud.com/gpu-model-series: GN8IS # The GPU model. Replace this with the actual model.
    ...
For more information about ACS compute classes and QoS classes, see Mapping between computing types and computing power quality.
For available GPU models for the gpu-model-series label, see Specify a GPU model and driver version for an ACS GPU pod.
Create a GPU workload using the following sample YAML:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: dep-node-selector-demo
      labels:
        app: node-selector-demo
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: node-selector-demo
      template:
        metadata:
          labels:
            app: node-selector-demo
            # ACS properties
            alibabacloud.com/acs: "true" # Configure to use ACS computing power.
            alibabacloud.com/compute-class: gpu
            alibabacloud.com/compute-qos: default
            alibabacloud.com/gpu-model-series: example-model # The GPU card model. Replace it as needed, for example, T4.
        spec:
          containers:
          - name: node-selector-demo
            image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
            command:
            - "sleep"
            - "1000h"
            resources:
              limits:
                cpu: 1
                memory: 1Gi
                nvidia.com/gpu: "1"
              requests:
                cpu: 1
                memory: 1Gi
                nvidia.com/gpu: "1"
Run the following command to check the running status of the GPU workload. Replace the pod name with the name of a pod created by the Deployment:
    kubectl get pod dep-node-selector-demo-9cdf7bbf9-s**** -o yaml
Expected output:
    phase: Running
    resources:
      limits:
        #other resources
        nvidia.com/gpu: "1"
      requests:
        #other resources
        nvidia.com/gpu: "1"
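Rather than scanning the full YAML output, you can extract only the two fields shown above. The following is a minimal sketch, assuming the GPU limit is declared on the first container of the pod and uses the nvidia.com/gpu resource name from the sample Deployment. The function name check_gpu_pod is hypothetical.

```shell
# Sketch: print the phase and GPU limit of a pod, and succeed only if the pod
# is Running and declares a GPU limit on its first container.
# Usage: check_gpu_pod <pod-name> [namespace]
check_gpu_pod() {
    pod_name="$1"
    namespace="${2:-default}"
    # One JSONPath query returns "<phase> <gpu-limit>", separated by a literal space.
    summary=$(kubectl -n "$namespace" get pod "$pod_name" \
        -o jsonpath='{.status.phase}{" "}{.spec.containers[0].resources.limits.nvidia\.com/gpu}')
    phase=${summary%% *}   # text before the first space
    gpus=${summary#* }     # text after the first space (empty if no GPU limit)
    echo "phase=${phase} gpus=${gpus:-none}"
    [ "$phase" = "Running" ] && [ -n "$gpus" ]
}
```

Pass the name of a pod created by the Deployment. If the actual card model uses a different resource name, adjust the last path segment of the query accordingly.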
Example: Use ACS GPU HPN computing power
The process for using ACS GPU HPN computing power is similar to using ACS CPU computing power, but has the following requirements:
You must purchase a GPU-HPN capacity reservation in advance and associate it with the cluster.
ACS GPU HPN computing power requires upgrading the ack-virtual-node component. The required component version is currently in invitational preview. To enable this feature, submit a ticket.
Configure a GPU HPN workload
To use ACS GPU HPN computing power, configure the following labels in your pod specification:
    ...
    labels:
      # Declare the ACS GPU resource requirements in the label.
      alibabacloud.com/compute-class: gpu-hpn # Set to the gpu-hpn type.
      alibabacloud.com/compute-qos: default # The computing QoS type. The meaning is the same as for regular ACS computing power.
      alibabacloud.com/acs: "true" # The label to configure the use of ACS computing power.
    ...
For more information about the relationship between ACS compute types and computing power quality, see Relationship between compute types and computing power quality.
For more information about other parameters of ACS pods, see ACS Pod.
Nodes of the ACS GPU HPN type can schedule only pods of the gpu-hpn compute class. The GPU resource requirement can be omitted from the pod resource declaration. These nodes cannot schedule pods of other compute classes or pods that do not have a compute class declared.
Use a Kubernetes nodeSelector to schedule pods to GPU HPN nodes. For example:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: dep-node-selector-demo
      labels:
        app: node-selector-demo
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: node-selector-demo
      template:
        metadata:
          labels:
            app: node-selector-demo
            # ACS properties
            alibabacloud.com/compute-class: gpu-hpn
            alibabacloud.com/compute-qos: default
            alibabacloud.com/acs: "true"
        spec:
          # Specify the gpu-hpn reserved node label
          nodeSelector:
            alibabacloud.com/node-type: reserved
          containers:
          - name: node-selector-demo
            image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
            command:
            - "sleep"
            - "1000h"
            resources:
              limits:
                cpu: 1
                memory: 1Gi
                nvidia.com/gpu: "1" # Enter the corresponding resource name based on the actual card model.
              requests:
                cpu: 1
                memory: 1Gi
                nvidia.com/gpu: "1" # Enter the corresponding resource name based on the actual card model.
Important: For pods of the ACS GPU HPN type, take note of the following field configurations:
Specify the compute class: alibabacloud.com/compute-class: gpu-hpn.
Specify the reserved node label: alibabacloud.com/node-type: reserved.
For the device resource names in the requests and limits fields of the resource specification, use the name that matches the actual device model, such as nvidia.com/gpu for NVIDIA GPUs.
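Because the Deployment above relies on the nodeSelector alibabacloud.com/node-type: reserved, its pods stay in the Pending state if no node carries that label. The following is a minimal sketch of a pre-check that counts matching nodes before you apply the manifest. The function names count_reserved_nodes and require_reserved_nodes are hypothetical; the label key and value are taken from the example above.

```shell
# Sketch: count the nodes that carry the reserved node label used by the nodeSelector.
count_reserved_nodes() {
    # -o name prints one "node/<name>" line per matching node and no header.
    kubectl get nodes -l alibabacloud.com/node-type=reserved -o name 2>/dev/null \
        | wc -l | tr -d '[:space:]'
}

# Succeed only if at least one reserved node is visible in the cluster.
require_reserved_nodes() {
    count=$(count_reserved_nodes)
    if [ "$count" -eq 0 ]; then
        echo "no nodes labeled alibabacloud.com/node-type=reserved" >&2
        return 1
    fi
    echo "$count reserved node(s) found"
}
```

A result of zero may mean that the GPU-HPN capacity reservation has not yet been associated with the cluster, which is one of the requirements listed at the start of this example.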
View the running status of the GPU workload.
    kubectl get pod dep-node-selector-demo-9cdf7bbf9-s**** -o yaml
The following output is a snippet of the key information:
    phase: Running
    resources:
      limits:
        #other resources
        nvidia.com/gpu: "1"
      requests:
        #other resources
        nvidia.com/gpu: "1"