FAQ about adding edge nodes and solutions - Container Service for Kubernetes

This topic provides answers to some frequently asked questions about using edge nodes in ACK Edge clusters.

How do I connect edge nodes to the cloud through Express Connect circuits?

Pay attention to the following requirements when you connect edge nodes in ACK Edge clusters to the cloud through Express Connect circuits. For more information, see Special configurations of ACK Edge clusters when Express Connect circuits are used.

- When you generate the script for connecting edge nodes to the cloud, set inDedicatedNetwork to true.
- Edge nodes that are connected through Express Connect circuits need to communicate with Alibaba Cloud services through private addresses. Make sure that the edge nodes can reach the relevant Alibaba Cloud services, such as Object Storage Service (OSS), Container Registry, and Server Load Balancer (SLB).

How do I connect GPU-accelerated nodes to the cloud?

You need to configure the gpuVersion parameter when you generate the node connection script. The following GPU models are supported:
- "Nvidia_Tesla_T4"
- "Nvidia_Tesla_P4"
- "Nvidia_Tesla_P100"
- "Nvidia_Tesla_V100"
- "Nvidia_Tesla_A10"
After you configure the parameter, the connection tool will automatically install nvidia-containerd-runtime. For more information, see nvidia-containerd-runtime.
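
Because gpuVersion takes one of a fixed set of values, you can check the value before you generate the script. The following is a minimal sketch, not part of the connection tool: is_supported_gpu_version is a hypothetical helper, and the exact, case-sensitive match is an assumption.

```shell
# Hypothetical helper: succeeds only if the argument is one of the
# GPU models listed above. Case-sensitive matching is an assumption.
is_supported_gpu_version() {
  case "$1" in
    Nvidia_Tesla_T4|Nvidia_Tesla_P4|Nvidia_Tesla_P100|Nvidia_Tesla_V100|Nvidia_Tesla_A10)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Usage:
#   is_supported_gpu_version "Nvidia_Tesla_T4" && echo "supported"
```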

How do I handle node connection script execution failures?

The following table describes how to handle a script execution failure. If your issue is not described in the following table, collect the node diagnostic information and submit a ticket. For more information about how to collect edge node diagnostic information, see How do I collect the diagnostic information of nodes in an ACK Edge cluster?

| Error message | Cause of failure | Suggested solution |
| --- | --- | --- |
| The os XXX unsupport | The OS version of the edge node is not supported. | For more information about the supported OS versions, see Add an edge node. |
| invalid nodeName | The node name is invalid. | The node name can contain lowercase letters, hyphens (-), and periods (.). The node name must be 1 to 253 characters in length. The node name cannot start with localhost. |
| Node route overlaps with service cidr | The route of the node conflicts with the pod CIDR block or Service CIDR block of the cluster. | Recreate the cluster and reconfigure the pod CIDR block or Service CIDR block. Make sure that these CIDR blocks do not conflict with the NameServer address and route of the node. |
| Unfortunately, an error has occurred: response error msg: TOKEN_EXPIRED | The token for connecting the node to the cloud is expired. | Generate another script to connect the node to the cloud. Check whether the system clock of the node is normal. |
| A node named XXX is already exist in the cluster | A node with the same name already exists in the cluster. | Remove the node from the cluster. |
| Unfortunately, an error has occurred: error run phase post-check: timed out waiting for the condition | The system components fail to start up. | Check whether the edge node can access the relevant public addresses as normal. For more information about the public addresses, see Add an edge node. Collect the diagnostic information of the node and submit a ticket. For more information about how to collect diagnostic information, see How do I collect the diagnostic information of nodes in an ACK Edge cluster? |
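
You can check the naming rule for the invalid nodeName error before you run the connection script. The following is a minimal sketch, not part of the connection tool. is_valid_node_name is a hypothetical helper. It also accepts digits, which is an assumption based on the Kubernetes DNS subdomain naming rules, because the rule above lists only lowercase letters, hyphens, and periods.

```shell
# Hypothetical helper: succeeds if the name satisfies the rule described
# for the "invalid nodeName" error.
is_valid_node_name() {
  name="$1"

  # The name must be 1 to 253 characters in length.
  len=${#name}
  if [ "$len" -lt 1 ] || [ "$len" -gt 253 ]; then
    return 1
  fi

  # The name cannot start with localhost.
  case "$name" in
    localhost*) return 1 ;;
  esac

  # Reject any character outside lowercase letters, digits (assumption),
  # hyphens, and periods. The letters are spelled out so that the match
  # does not depend on the locale.
  case "$name" in
    *[!abcdefghijklmnopqrstuvwxyz0123456789.-]*) return 1 ;;
  esac

  return 0
}

# Usage:
#   is_valid_node_name "edge-node-01" || echo "invalid node name"
```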

How do I collect the diagnostic information of nodes in an ACK Edge cluster?

If a node in an ACK Edge cluster encounters an exception, perform the following steps to collect the diagnostic information of the node for data analysis:

Log on to the abnormal node in the ACK Edge cluster.

Run the following command to download the diagnostic script:

```
curl -o /usr/local/bin/diagnose_edge_node.sh https://aliacs-k8s-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/public/diagnose/diagnose_k8s.sh
```

Run the following command to make the diagnostic script executable:
```
chmod u+x /usr/local/bin/diagnose_edge_node.sh
```
Run the following command to switch to the specified directory:
```
cd /usr/local/bin/
```

Run the following command to run the diagnostic script:

```
./diagnose_edge_node.sh
```

Each time you run the diagnostic script, it generates an archive with a different name. In this example, the archive is named diagnose_1578310147.tar.gz. Expected output:

```
......
+ echo 'please get diagnose_1578310147.tar.gz for diagnostics'
please get diagnose_1578310147.tar.gz for diagnostics
+ echo 'Submit the file named diagnose_1578310147.tar.gz to request technical support.'
Submit the file named diagnose_1578310147.tar.gz to request technical support.
```

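Run the ll command to verify that the diagnostic report named diagnose_1578310147.tar.gz exists. If ll is not defined on the node, run ls -l instead. On many Linux distributions, ll is an alias for ls -l.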
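
If you run the script several times, several archives accumulate in the directory. The following sketch picks the one with the largest numeric suffix in its file name. latest_diagnose_archive is a hypothetical helper. It assumes that the suffix is a timestamp that increases with each run, as the example name diagnose_1578310147.tar.gz suggests.

```shell
# Hypothetical helper: prints the path of the diagnose_<number>.tar.gz
# file with the largest numeric suffix in the given directory
# (default: the current directory). Fails if no such file exists.
latest_diagnose_archive() {
  dir="${1:-.}"
  best=""
  best_ts=-1
  for f in "$dir"/diagnose_*.tar.gz; do
    [ -e "$f" ] || continue            # the glob matched nothing
    name="${f##*/}"                    # strip the directory part
    ts="${name#diagnose_}"             # strip the prefix
    ts="${ts%.tar.gz}"                 # strip the extension
    case "$ts" in
      ''|*[!0123456789]*) continue ;;  # skip non-numeric suffixes
    esac
    if [ "$ts" -gt "$best_ts" ]; then
      best_ts="$ts"
      best="$f"
    fi
  done
  [ -n "$best" ] || return 1
  printf '%s\n' "$best"
}

# Usage:
#   latest_diagnose_archive /usr/local/bin
```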