This topic provides answers to some frequently asked questions about using edge nodes in ACK Edge clusters.
How do I connect edge nodes to the cloud through Express Connect circuits?
Pay attention to the following requirements when you connect edge nodes in ACK Edge clusters to the cloud through Express Connect circuits. For more information, see Special configurations of ACK Edge clusters when Express Connect circuits are used.
When you generate the script for connecting edge nodes to the cloud, set inDedicatedNetwork to true.
When you connect edge nodes to the cloud through Express Connect circuits, the edge nodes need to communicate with Alibaba Cloud services through private addresses. Make sure that the edge nodes are connected to the relevant Alibaba Cloud services, such as Object Storage Service (OSS), Container Registry, and Server Load Balancer (SLB).
How do I connect GPU-accelerated nodes to the cloud?
You need to configure the gpuVersion parameter when you generate the node connection script. The following GPU models are supported:
"Nvidia_Tesla_T4"
"Nvidia_Tesla_P4"
"Nvidia_Tesla_P100"
"Nvidia_Tesla_V100"
"Nvidia_Tesla_A10"
After you configure the parameter, the connection tool will automatically install nvidia-containerd-runtime. For more information, see nvidia-containerd-runtime.
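Because gpuVersion accepts only the model strings listed above, it can help to check the value before you generate the connection script. The following is a minimal sketch, not part of the ACK tooling: the function name is_supported_gpu_version is hypothetical, and the accepted values are copied from the list in this topic.

```shell
# Hypothetical helper (not part of the ACK connection tool).
# Returns 0 if the given value is one of the gpuVersion models listed in this topic,
# and 1 otherwise.
is_supported_gpu_version() {
  case "$1" in
    Nvidia_Tesla_T4|Nvidia_Tesla_P4|Nvidia_Tesla_P100|Nvidia_Tesla_V100|Nvidia_Tesla_A10)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

For example, is_supported_gpu_version "Nvidia_Tesla_T4" succeeds, while a misspelled or unlisted model name fails, so the check can guard the step that generates the script.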
How do I handle node connection script execution failures?
The following table describes how to handle a script execution failure. If your issue is not described in the following table, collect the node diagnostic information and submit a ticket. For more information about how to collect edge node diagnostic information, see How do I collect the diagnostic information of nodes in an ACK Edge cluster?
Error message | Cause of failure | Suggested solution |
The os XXX unsupport | The OS version of the edge node is not supported. | For more information about the supported OS versions, see Add an edge node. |
invalid nodeName | The node name is invalid. | |
Node route overlaps with service cidr | The route of the node conflicts with the pod CIDR block or Service CIDR block of the cluster. | Recreate the cluster and reconfigure the pod CIDR block or Service CIDR block. Make sure that these CIDR blocks do not conflict with the NameServer address and route of the node. |
Unfortunately, an error has occurred: response error msg: TOKEN_EXPIRED | The token for connecting the node to the cloud is expired. | |
A node named XXX is already exist in the cluster | A node with the same name already exists in the cluster. | Remove the node from the cluster. |
Unfortunately, an error has occurred: error run phase post-check: timed out waiting for the condition | The system components fail to start up. | |
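When the script output is long, matching it against the table by hand can be tedious. The following sketch maps a line of output to the cause listed in the table. It is an illustration only: the function name explain_join_error is hypothetical, and the patterns assume that the real messages contain the same text as the table entries, with XXX standing for a variable part.

```shell
# Hypothetical helper (not part of the ACK connection tool).
# Prints the cause from the table above for a given error message.
# Returns 1 if the message does not match any table entry.
explain_join_error() {
  case "$1" in
    *"The os "*" unsupport"*)
      echo "The OS version of the edge node is not supported." ;;
    *"invalid nodeName"*)
      echo "The node name is invalid." ;;
    *"Node route overlaps with service cidr"*)
      echo "The route of the node conflicts with the pod CIDR block or Service CIDR block of the cluster." ;;
    *"TOKEN_EXPIRED"*)
      echo "The token for connecting the node to the cloud is expired." ;;
    *"is already exist in the cluster"*)
      echo "A node with the same name already exists in the cluster." ;;
    *"error run phase post-check"*)
      echo "The system components fail to start up." ;;
    *)
      echo "Not in the table: collect the node diagnostic information and submit a ticket."
      return 1 ;;
  esac
}
```

A message that matches no entry falls through to the last branch, which mirrors the guidance above to collect diagnostic information and submit a ticket.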
How do I collect the diagnostic information of nodes in an ACK Edge cluster?
If a node in an ACK Edge cluster encounters an exception, perform the following steps to collect the diagnostic information of the node for data analysis:
Log on to the abnormal node in the ACK Edge cluster.
Run the following command to download the diagnostic script:
curl -o /usr/local/bin/diagnose_edge_node.sh https://aliacs-k8s-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/public/diagnose/diagnose_k8s.sh
Run the following command to make the diagnostic script executable:
chmod u+x /usr/local/bin/diagnose_edge_node.sh
Run the following command to switch to the specified directory:
cd /usr/local/bin/
Run the following command to run the diagnostic script:
./diagnose_edge_node.sh
Each time you run the diagnostic script, a file with a different name is generated. In this example, the log file is named diagnose_1578310147.tar.gz. Expected output:
.......
+ echo 'please get diagnose_1578310147.tar.gz for diagnostics'
please get diagnose_1578310147.tar.gz for diagnostics
+ echo 'Submit the file named diagnose_1578310147.tar.gz to request technical support.'
Submit the file named diagnose_1578310147.tar.gz to request technical support.
Run the ll command to verify that the diagnostic report named diagnose_1578310147.tar.gz exists.
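Because the file name changes on every run, you may prefer to locate the newest report programmatically instead of reading it from the output. The following is a minimal sketch under two assumptions: the report is written to the directory that you pass in (the current directory by default), and its name matches diagnose_*.tar.gz as in the example above. The function name latest_diagnose_archive is hypothetical.

```shell
# Hypothetical helper (not part of the diagnostic script).
# Prints the most recently modified diagnose_*.tar.gz in the given directory
# (default: current directory). Returns 1 if no such file exists.
latest_diagnose_archive() {
  dir="${1:-.}"
  # ls -t sorts by modification time, newest first. Parsing ls output is
  # acceptable here only because the expected names contain no spaces.
  newest=$(ls -t "$dir"/diagnose_*.tar.gz 2>/dev/null | head -n 1)
  [ -n "$newest" ] || return 1
  echo "$newest"
}
```

For example, after you run the diagnostic script in /usr/local/bin/, calling latest_diagnose_archive /usr/local/bin prints the path of the report to attach to your ticket.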