When on-premises compute resources reach capacity, you can create a node pool in ACK One to scale out with cloud-based Elastic Compute Service (ECS) nodes. This topic explains how to write and deploy the custom bootstrap scripts that provision those ECS nodes into your registered cluster.
Prerequisites
Before you begin, ensure that you have:
- An ACK One registered cluster with an external Kubernetes cluster connected to it. See Create an ACK One registered cluster.
- Network connectivity between the external Kubernetes cluster and the virtual private cloud (VPC) of the ACK One registered cluster. See Scenario-based networking for VPC connections.
- The proxy configuration of the external Kubernetes cluster imported into the ACK One registered cluster (private network mode). See Associate an external Kubernetes cluster with an ACK One registered cluster.
Limitations
- The sample scripts in this topic support only operating systems that use YUM (Yellowdog Updater, Modified) as the package manager.
- The sample scripts apply only to regular ECS nodes. For GPU-accelerated nodes, submit a ticket to obtain a customized script. Unlike regular ECS nodes, GPU-accelerated nodes require you to install drivers and device plug-ins. For more information, see Manually update the NVIDIA driver of a node.
Overview
The process has three steps:
1. Prepare the node script: either adapt your existing custom script or start from a sample script.
2. Upload the script to a file server accessible from your registered cluster.
3. Register the script path in the ack-agent-config ConfigMap so the node pool can retrieve it during scale-out.
Which path to take in step 1:
| Situation | Action |
|---|---|
| You have an existing bootstrap script | Use Step 1(A): add the required Alibaba Cloud environment variables to it |
| You are starting from scratch | Use Step 1(B): download and customize a sample script |
Step 1(A): Add Alibaba Cloud environment variables to an existing script
Your node pool needs four environment variables from the registered cluster to manage nodes correctly. The registered cluster injects these at scale-out time.
ALIBABA_CLOUD_PROVIDER_ID is required for the registered cluster to function. If your script does not consume this variable, the cluster cannot run normally.
The four environment variables are:
| Variable | Required | Effect if missing | Example value |
|---|---|---|---|
| ALIBABA_CLOUD_PROVIDER_ID | Yes | The registered cluster cannot run normally | cn-shenzhen.i-wz92ewt14n9wx9mo*** |
| ALIBABA_CLOUD_NODE_NAME | Yes | Nodes in the node pool may enter an abnormal state | cn-shenzhen.192.168.1.*** |
| ALIBABA_CLOUD_LABELS | Yes | Node pool management and workload scheduling between cloud and on-premises nodes may fail | alibabacloud.com/nodepool-id=np0e2031e952c4492bab32f512ce142*,ack.aliyun.com=cc3df6d939b0d4463b493b82d0d670*,alibabacloud.com/instance-id=i-wz960ockeekr3dok0***,alibabacloud.com/external=true,workload=cpu |
| ALIBABA_CLOUD_TAINTS | Yes | Taints defined in the node pool configuration do not take effect | workload=ack:NoSchedule |
In ALIBABA_CLOUD_LABELS, the workload=cpu label is a custom label that you define in the node pool configuration. All other labels are system labels added automatically.
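If you adapt an existing script (Step 1(A)), a defensive check near the top of the script can fail fast when these variables were not injected. This is a minimal sketch, not part of the official procedure; the example values below exist only to make the snippet self-contained:

```shell
# Hypothetical values for illustration only; in a real scale-out the
# registered cluster injects these variables.
ALIBABA_CLOUD_PROVIDER_ID="cn-shenzhen.i-example"
ALIBABA_CLOUD_NODE_NAME="cn-shenzhen.192.168.1.10"
ALIBABA_CLOUD_LABELS="workload=cpu"
ALIBABA_CLOUD_TAINTS=""

# Fail fast if a required variable is missing. ALIBABA_CLOUD_TAINTS may be
# empty when the node pool defines no taints, so it only produces a note.
missing=""
for var in ALIBABA_CLOUD_PROVIDER_ID ALIBABA_CLOUD_NODE_NAME ALIBABA_CLOUD_LABELS; do
    if [[ -z "${!var}" ]]; then
        missing="${missing} ${var}"
    fi
done
if [[ -n "$missing" ]]; then
    echo "ERROR: missing required variable(s):${missing}" >&2
    exit 1
fi
if [[ -z "$ALIBABA_CLOUD_TAINTS" ]]; then
    echo "NOTE: ALIBABA_CLOUD_TAINTS is empty; no taints will be applied." >&2
fi
echo "required variables present"
```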
Add the variable-injection block to your script based on how your cluster was installed.
The cluster was created by using kubeadm
Insert the following block into your script. Place it after your main node configuration and before restarting kubelet.
```shell
####### Begin: Alibaba Cloud environment variable injection
# Configure node labels, taints, node name, and provider ID.
KUBELET_CONFIG_FILE="/etc/sysconfig/kubelet"

if [[ $ALIBABA_CLOUD_LABELS != "" ]]; then
    option="--node-labels"
    if grep -- "${option}=" $KUBELET_CONFIG_FILE &> /dev/null; then
        sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_LABELS},@g" $KUBELET_CONFIG_FILE
    elif grep "KUBELET_EXTRA_ARGS=" $KUBELET_CONFIG_FILE &> /dev/null; then
        sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_LABELS} @g" $KUBELET_CONFIG_FILE
    else
        sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_LABELS}\"" $KUBELET_CONFIG_FILE
    fi
fi

if [[ $ALIBABA_CLOUD_TAINTS != "" ]]; then
    option="--register-with-taints"
    if grep -- "${option}=" $KUBELET_CONFIG_FILE &> /dev/null; then
        sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_TAINTS},@g" $KUBELET_CONFIG_FILE
    elif grep "KUBELET_EXTRA_ARGS=" $KUBELET_CONFIG_FILE &> /dev/null; then
        sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_TAINTS} @g" $KUBELET_CONFIG_FILE
    else
        sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_TAINTS}\"" $KUBELET_CONFIG_FILE
    fi
fi

if [[ $ALIBABA_CLOUD_NODE_NAME != "" ]]; then
    option="--hostname-override"
    if grep -- "${option}=" $KUBELET_CONFIG_FILE &> /dev/null; then
        sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_NODE_NAME},@g" $KUBELET_CONFIG_FILE
    elif grep "KUBELET_EXTRA_ARGS=" $KUBELET_CONFIG_FILE &> /dev/null; then
        sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_NODE_NAME} @g" $KUBELET_CONFIG_FILE
    else
        sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_NODE_NAME}\"" $KUBELET_CONFIG_FILE
    fi
fi

if [[ $ALIBABA_CLOUD_PROVIDER_ID != "" ]]; then
    option="--provider-id"
    if grep -- "${option}=" $KUBELET_CONFIG_FILE &> /dev/null; then
        sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_PROVIDER_ID},@g" $KUBELET_CONFIG_FILE
    elif grep "KUBELET_EXTRA_ARGS=" $KUBELET_CONFIG_FILE &> /dev/null; then
        sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_PROVIDER_ID} @g" $KUBELET_CONFIG_FILE
    else
        sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_PROVIDER_ID}\"" $KUBELET_CONFIG_FILE
    fi
fi
####### End: Alibaba Cloud environment variable injection
```

After adding this block, proceed to Step 2: Upload the script.
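Before shipping the script, you can dry-run the injection logic by pointing KUBELET_CONFIG_FILE at a temporary file instead of the real /etc/sysconfig/kubelet. A minimal sketch, assuming GNU sed and a sysconfig file that already contains a KUBELET_EXTRA_ARGS= line (the label value is hypothetical):

```shell
# Local dry run of the node-labels branch against a temporary file.
KUBELET_CONFIG_FILE="$(mktemp)"
echo 'KUBELET_EXTRA_ARGS=' > "$KUBELET_CONFIG_FILE"

ALIBABA_CLOUD_LABELS="workload=cpu"   # hypothetical injected value
option="--node-labels"
if grep -- "${option}=" "$KUBELET_CONFIG_FILE" &> /dev/null; then
    # Flag already present: prepend the new labels to the existing list.
    sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_LABELS},@g" "$KUBELET_CONFIG_FILE"
elif grep "KUBELET_EXTRA_ARGS=" "$KUBELET_CONFIG_FILE" &> /dev/null; then
    # KUBELET_EXTRA_ARGS exists but the flag does not: add the flag.
    sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_LABELS} @g" "$KUBELET_CONFIG_FILE"
else
    # Neither exists: append an Environment= line under [Service].
    sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_LABELS}\"" "$KUBELET_CONFIG_FILE"
fi

RESULT="$(cat "$KUBELET_CONFIG_FILE")"
echo "$RESULT"
rm -f "$KUBELET_CONFIG_FILE"
```

Here the second branch fires, so the file ends up containing KUBELET_EXTRA_ARGS=--node-labels=workload=cpu.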
The cluster was installed from binary files
If your cluster was set up by installing Kubernetes binary files directly, modify the kubelet boot configuration in your script. The kubelet.service file is typically located at /usr/lib/systemd/system/.
```shell
cat >/usr/lib/systemd/system/kubelet.service <<EOF
# Custom configurations are not shown.
...
[Service]
ExecStart=/data0/kubernetes/bin/kubelet \
  # Add the following flags:
  --node-ip=${ALIBABA_CLOUD_NODE_NAME} \
  --hostname-override=${ALIBABA_CLOUD_NODE_NAME} \
  --node-labels=${ALIBABA_CLOUD_LABELS} \
  --provider-id=${ALIBABA_CLOUD_PROVIDER_ID} \
  --register-with-taints=${ALIBABA_CLOUD_TAINTS} \
  ...
  --v=4
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
```

After adding these flags, proceed to Step 2: Upload the script.
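Because the heredoc delimiter (EOF) is unquoted, the ${ALIBABA_CLOUD_*} variables are expanded when the bootstrap script runs, so the generated unit file contains literal values rather than variable references. A minimal check of this behavior against a temporary file (the path and provider ID are placeholders, not real values):

```shell
# Unquoted EOF expands variables at write time; verify with a temp file.
ALIBABA_CLOUD_PROVIDER_ID="cn-shenzhen.i-example"   # hypothetical value
UNIT_FILE="$(mktemp)"
cat > "$UNIT_FILE" <<EOF
[Service]
ExecStart=/usr/bin/kubelet --provider-id=${ALIBABA_CLOUD_PROVIDER_ID}
EOF
# The written file contains the expanded value, not the variable name.
grep provider-id "$UNIT_FILE"
```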
Step 1(B): Use a sample script
If you do not have an existing bootstrap script, use one of the sample scripts below as a starting point. The samples cover the two most common container runtimes: Docker and containerd.
Collect the required values
Before filling in the script, collect three values from your external cluster.
Step 1: Get the Kubernetes version
Run the appropriate command based on your cluster's Kubernetes version.
For Kubernetes 1.18 and later (use control-plane label):
```shell
kubectl get no $(kubectl get nodes -l node-role.kubernetes.io/control-plane -o json | jq -r '.items[0].metadata.name') -o json | jq -r '.status.nodeInfo.kubeletVersion'
```

For Kubernetes earlier than 1.18 (use the master label):

```shell
kubectl get no $(kubectl get nodes -l node-role.kubernetes.io/master -o json | jq -r '.items[0].metadata.name') -o json | jq -r '.status.nodeInfo.kubeletVersion'
```

Expected output:

```
v1.14.10
```

Record this as your KUBE_VERSION value.
Step 2: Get the container runtime and version
For Kubernetes 1.18 and later:
```shell
kubectl get no $(kubectl get nodes -l node-role.kubernetes.io/control-plane -o json | jq -r '.items[0].metadata.name') -o json | jq -r '.status.nodeInfo.containerRuntimeVersion'
```

For Kubernetes earlier than 1.18:

```shell
kubectl get no $(kubectl get nodes -l node-role.kubernetes.io/master -o json | jq -r '.items[0].metadata.name') -o json | jq -r '.status.nodeInfo.containerRuntimeVersion'
```

Expected output:

```
docker://18.6.3      # Docker runtime
containerd://1.4.3   # containerd runtime
```

Record the version number (without the docker:// or containerd:// prefix) as your RUNTIME_VERSION value.
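The prefix can also be stripped in the shell rather than by hand. A small sketch using parameter expansion (the raw value is an example taken from the output above):

```shell
# containerRuntimeVersion reports "<runtime>://<version>"; strip the prefix
# to obtain the bare RUNTIME_VERSION value.
raw="docker://18.6.3"            # or containerd://1.4.3
RUNTIME_VERSION="${raw#*://}"    # removes everything up to and including ://
echo "$RUNTIME_VERSION"
```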
Step 3: Generate the kubeadm join command
The --ttl 0 flag is required. Without it, the join token expires and auto scaling stops working.
```shell
kubeadm token create --ttl 0 --print-join-command
```

Expected output:

```
kubeadm join 192.168.8.XXX:6443 --token k8xsq8.4oo8va9wcqpb*** \
    --discovery-token-ca-cert-hash sha256:cb5fc894ab965dfbc4c194e1065869268f8845c3ec40f78f9021dde24610d***
```

Record the full command as your KUBEADM_JOIN_CMD value.
Sample script: Docker
Replace the three placeholders at the top of the script with the values you collected:
- <KUBE_VERSION>: the Kubernetes version (for example, 1.21.0)
- <RUNTIME_VERSION>: the Docker version (for example, 18.6.3)
- <KUBEADM_JOIN_CMD>: the full kubeadm join ... command
Sample script: containerd
Replace the three placeholders at the top of the script with the values you collected:
- <KUBE_VERSION>: the Kubernetes version (for example, 1.21.0)
- <RUNTIME_VERSION>: the containerd version (for example, 1.4.3)
- <KUBEADM_JOIN_CMD>: the full kubeadm join ... command
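Whichever sample you use, the substitution can be scripted instead of done by hand. A sketch using sed; the values, the stand-in file, and the filename join-ecs-nodes.sh are illustrative:

```shell
# Illustrative values; use the ones you collected above.
KUBE_VERSION="1.21.0"
RUNTIME_VERSION="1.4.3"
KUBEADM_JOIN_CMD="kubeadm join 192.168.8.1:6443 --token example --discovery-token-ca-cert-hash sha256:example"

# Stand-in for the downloaded sample script (normally join-ecs-nodes.sh).
SCRIPT="$(mktemp)"
printf 'KUBE=<KUBE_VERSION> RUNTIME=<RUNTIME_VERSION>\n<KUBEADM_JOIN_CMD>\n' > "$SCRIPT"

# Replace all three placeholders in place.
sed -i \
    -e "s@<KUBE_VERSION>@${KUBE_VERSION}@g" \
    -e "s@<RUNTIME_VERSION>@${RUNTIME_VERSION}@g" \
    -e "s@<KUBEADM_JOIN_CMD>@${KUBEADM_JOIN_CMD}@g" \
    "$SCRIPT"
cat "$SCRIPT"
```

The @ delimiter avoids escaping the slashes inside the join command.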
Step 2: Upload the script
Upload the script to a file server that your registered cluster can reach. Object Storage Service (OSS) is a common choice.
Example URL after upload:
```
https://kubelet-****.oss-cn-hangzhou-internal.aliyuncs.com/join-ecs-nodes.sh
```

Record the URL; you will need it in the next step.
Step 3: Register the script path
Complete this step before creating a node pool. If the script path is not registered, the node pool cannot retrieve the script during scale-out and the operation will fail.
Set the addNodeScriptPath field in the ack-agent-config ConfigMap in the kube-system namespace.
```shell
kubectl edit cm ack-agent-config -n kube-system
```

Update the addNodeScriptPath field with the URL of your uploaded script:
```yaml
apiVersion: v1
data:
  addNodeScriptPath: https://kubelet-****.oss-cn-hangzhou-internal.aliyuncs.com/join-ecs-nodes.sh
kind: ConfigMap
metadata:
  name: ack-agent-config
  namespace: kube-system
```

Save and close the file. The registered cluster now uses this script to bootstrap ECS nodes when you scale out the node pool.