
How can I troubleshoot the network issues that occur when ENIs are enabled for a Kubernetes cluster that uses the Terway network plug-in?

Last Updated: May 17, 2022

Problem description

The following issues may occur when elastic network interfaces (ENIs) are enabled for a Kubernetes cluster that uses the Terway network plug-in:

  • Issue 1: A Domain Name System (DNS) resolution failure occurs in a pod.
  • Issue 2: The Elastic Compute Service (ECS) instance that serves as a node of the Kubernetes cluster can access the Internet, but the pods on that node cannot.
  • Issue 3: The monitoring data of the cluster cannot be retrieved.
  • Issue 4: The application in a pod cannot access the target Relational Database Service (RDS) instance.

 

Causes

The Flannel plug-in and the Terway plug-in provide different networking solutions. If you enable ENIs when you select the Terway plug-in, pod CIDR blocks are not used. Instead, each pod is assigned an IP address from the CIDR block of the vSwitch in the virtual private cloud (VPC), which is the same CIDR block that contains the ECS instance that runs the pod. The private IP address of the pod is the private IP address of a secondary ENI that is associated with that ECS instance. Outbound traffic from the pod passes through the ENI, so it does not rely on the source network address translation (SNAT) that forwards traffic to the Internet from the IP address of the ECS instance. Possible causes of the issues include:

  • Cause 1: The ENI of the pod and the ECS instance that runs the pod do not belong to the same security group. DNS resolution in a pod requires cross-host communication, so the ENI of the pod must belong to the same security group as the ECS instance.
  • Cause 2: The required SNAT entry is not configured on the vSwitch. The SNAT entry must point to the ENI that is associated with the pod.
  • Cause 3: To collect cluster monitoring data, the API server must access the kubelet on a target node. Therefore, the ENI of the API server and the ENI of the target pod must belong to the same security group as the ECS instance that runs the pod.
    Note: When you create a cluster, a default security group is created. For a managed cluster, the ENI that is associated with the API server must belong to the default security group.
  • Cause 4: The settings of the RDS whitelist are invalid. In the RDS whitelist, you must include the CIDR block of the vSwitch with which the pod is associated, instead of the CIDR block of the ECS instance.
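
Because a pod's IP address comes from the vSwitch CIDR block, a quick way to reason about Cause 4 is to check whether a given pod IP address falls inside a given CIDR block, such as an entry in the RDS whitelist. The following is a minimal sketch in POSIX shell, not an official tool. The function names ip_to_int and ip_in_cidr are hypothetical helpers introduced here for illustration, and the sketch assumes well-formed IPv4 input without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Intended to be called through $(...), so the IFS change stays in a subshell.
ip_to_int() {
  IFS=.
  set -- $1   # split the address into four octets
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) if IPv4 address $1 lies inside CIDR block $2.
# Note: uses plain shell variables, which are global in POSIX sh.
ip_in_cidr() {
  net=${2%/*}     # network part, for example 192.168.0.0
  bits=${2#*/}    # prefix length, for example 20
  mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  ip_n=$(ip_to_int "$1")
  net_n=$(ip_to_int "$net")
  [ $(( $ip_n & $mask )) -eq $(( $net_n & $mask )) ]
}
```

For example, with a hypothetical pod IP address and a hypothetical whitelist entry, `ip_in_cidr 192.168.1.25 192.168.0.0/20 && echo "covered by the whitelist entry"` reports whether the entry covers the pod. The actual pod IP address and vSwitch CIDR block must be taken from your own cluster.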

 

Solutions

 

Additional information

To check and modify the security group and secondary private IP addresses of the ENI, perform the following steps:

  1. Log on to the Container Service for Kubernetes console and find the IP address of the pod.
  2. Connect to a master node of the Kubernetes cluster by using kubectl. For more information, see Connect to a Kubernetes cluster by using kubectl.
  3. On the command line, run the following command to view the vSwitch that is associated with the pod in the cluster.
    kubectl get cm eni-config -n kube-system -o yaml
    The following command output is returned.
  4. Log on to the ECS console, go to the Network Interfaces page, and then search for ENIs by vSwitch ID. Find the ENI that is associated with the pod IP address that you obtained in Step 1. Then, click Modify.
  5. In the dialog box that appears, check whether the ENI belongs to the required security group. If not, you can modify the security group.
  6. Return to the Network Interfaces page, find the target ENI, and then click Manage Secondary Private IP Address.
  7. In the dialog box that appears, check and manage the secondary private IP addresses of the ENI.
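
As an alternative to looking up the pod IP address in the console (Step 1), the address and the node that runs the pod can also be read with kubectl once the connection from Step 2 is in place. The following is a minimal sketch. The function names pod_ip and pod_node are hypothetical wrappers introduced here, and the pod name and namespace in the usage comment are placeholders.

```shell
# Print the IP address of a pod. $1 is the pod name, $2 the namespace
# (falls back to "default" when omitted).
pod_ip() {
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.status.podIP}'
}

# Print the name of the node (the ECS instance) that runs the pod.
pod_node() {
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.spec.nodeName}'
}

# Usage (placeholder names):
#   pod_ip my-app-pod my-namespace
#   pod_node my-app-pod my-namespace
```

The IP address returned by pod_ip is the value to match against the private IP addresses of the ENIs in Step 4, and the node returned by pod_node identifies the ECS instance whose security group the ENI is compared with in Step 5.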

 

Application scope

  • Dedicated clusters of Container Service for Kubernetes
  • Managed clusters of Container Service for Kubernetes

Note: This topic applies to environments in which ENIs are enabled for the Terway network plug-in.