
Learning More about Container Network Communication

This article explains container networking and offers a detailed analysis of the pod data link for Hybridnet VXLAN, Hybridnet VLAN, Hybridnet BGP, Calico IPIP, Calico BGP, and the MACVLAN-based Bifrost.

By Chen Yunhao (Huanhe)

Background

Origin of Container Network

With the development of cloud computing, communication between applications has evolved from physical machine networks and virtual machine networks to today's container networks. Unlike physical machines and virtual machines, a container can be understood as a standard, lightweight, portable, and independent sandbox. Containers are isolated from each other, each with its own environment and resources. However, since environments change, containers need to exchange information with other containers, or with endpoints outside the cluster, at runtime. To do so, a container must have a name (an IP address) at the network level. This is why the container network was created.

Let's talk about the origin of the container network from a technical viewpoint, starting with the essence of the container, which is realized through the following mechanisms:

  • cgroup: It enforces resource quotas.
  • overlayfs: It provides the container's file system, enabling isolation and portability.
  • namespace: It implements resource isolation.

    • IPC: System V IPC and POSIX Message Queue
    • Network: Network Device, Network Protocol Stack, Network Port, etc.
    • PID: Process
    • Mount: Mount Point
    • UTS: Host Name and Domain Name
    • USR: User and User Group

The network stacks of hosts and containers (and of different containers) are not connected to each other, and there is no unified control plane. As a result, containers have no direct awareness of one another. The container network discussed in this article appeared to solve this problem, and diversified container network solutions have emerged alongside different network virtualization technologies.

Basic Requirements for Container Networks

  • IP-per-Pod: Each pod has an independent IP address, and all containers in the pod share one network namespace.
  • All pods in the cluster are in a directly connected flat network and can be accessed through IP addresses.

    • All containers can access each other without NAT.
    • All nodes and containers can access each other without NAT.
    • The IP seen by the container is the same as that seen by other containers.
  • Service cluster IPs can be accessed within the cluster. External requests must access them through NodePort, LoadBalancer, or Ingress.

An Introduction to Network Plug-ins

Network Plug-in Overview

The container and the host where the container is located are separated. To connect them, you must build a bridge. However, since the container side does not yet have a name (a network identity), the bridge cannot be built until the container side is named. Network plug-ins are used to name the container side and build the bridge.

A network plug-in inserts a network interface into the container network namespace (such as one end of a veth pair) and makes the necessary changes on the host (such as connecting the other end of the veth pair to a bridge). It then assigns an idle IP address to the interface by calling an IPAM plug-in (IP address management plug-in) and sets up the routing rules corresponding to that IP address.
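As an illustration, a minimal sketch of these host-side steps might look like the following Go program, using the github.com/vishvananda/netlink package that many Go network plug-ins build on. The interface names and the pod IP are assumptions for this example, and moving the peer interface into the pod's network namespace is omitted:

```go
// A minimal sketch of what a CNI-style plug-in does on the host side.
package main

import (
	"net"

	"github.com/vishvananda/netlink"
)

func main() {
	// 1. Create a veth pair. "eth0-pod" would later be moved into the pod's
	// network namespace; "hybr1234" stays on the host side.
	veth := &netlink.Veth{
		LinkAttrs: netlink.LinkAttrs{Name: "hybr1234"},
		PeerName:  "eth0-pod",
	}
	if err := netlink.LinkAdd(veth); err != nil {
		panic(err)
	}
	if err := netlink.LinkSetUp(veth); err != nil {
		panic(err)
	}

	// 2. An IPAM plug-in would hand back a free pod IP; hard-coded here.
	_, podIP, _ := net.ParseCIDR("10.244.1.10/32")

	// 3. Add a host route steering traffic for the pod IP to the host-side veth.
	if err := netlink.RouteAdd(&netlink.Route{
		LinkIndex: veth.Attrs().Index,
		Dst:       podIP,
		Scope:     netlink.SCOPE_LINK,
	}); err != nil {
		panic(err)
	}
}
```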

The network is one of the most important functions of Kubernetes. Without a good network, pods on different nodes of the cluster (or even on the same node) cannot communicate well.

However, when designing the network, Kubernetes adopted a flexible principle. How is it flexible? Kubernetes does not implement many network-related operations itself; instead, it formulates a specification:

  1. There is a configuration file that provides the name of the network plug-in to be used and the information the plug-in requires.
  2. The CRI calls the plug-in and passes the runtime information of the container (including the namespace and ID) to it.
  3. The internal implementation of the network plug-in is not prescribed; the plug-in is only required to output the pod IP it has configured.

These are the three points of the famous, simple, and flexible CNI specification. A minimal example of such a configuration file is sketched below.
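For illustration, point 1's configuration file, here using the community bridge plug-in with host-local IPAM, might look like the following; the network name, bridge name, and subnet are example values, not taken from this article's environment:

```json
{
  "cniVersion": "0.4.0",
  "name": "example-net",
  "type": "bridge",
  "bridge": "cni0",
  "ipam": {
    "type": "host-local",
    "subnet": "10.244.1.0/24"
  }
}
```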

However, it is precisely because Kubernetes prescribes so little that everyone can freely implement different CNI plug-ins, namely network plug-ins. In addition to the well-known Calico and Bifrost network plug-ins in the community, Alibaba developed Hybridnet, a network plug-in with excellent functions and performance.

  • Hybridnet

Hybridnet is an open-source container networking solution designed for hybrid clouds. It integrates with Kubernetes and is used by the following PaaS platforms:

  • Alibaba Cloud ACK Released Edition
  • Alibaba Cloud AECP
  • Ant Financial SOFAStack

Hybridnet focuses on efficient large-scale clusters, heterogeneous infrastructure, and user-friendliness.

  • Calico

Calico is a widely adopted and proven open-source networking and network security solution for Kubernetes, virtual machine, and bare-metal workloads. Calico offers two major services for cloud-native applications:

  • Network Connections among Workloads
  • Network Security Policies among Workloads

  • Bifrost

Bifrost is an open-source solution that enables L2 networking for Kubernetes and supports the following features:

  • Network traffic in Bifrost can be managed and monitored through traditional devices.
  • It supports Service traffic access for MACVLAN-attached pods.

An Introduction to Communication Path

Overlay Solution: A cross-host network that places containers on different hosts in the same virtual network.

  • VXLAN

Virtual eXtensible Local Area Network (VXLAN) is one of the Network Virtualization over Layer 3 (NVO3) standard technologies defined by the IETF. It adopts an L2-over-L4 (MAC-in-UDP) encapsulation mode, wrapping layer 2 frames inside layer 3/4 packets. This extends a layer 2 network across a layer 3 underlay and meets data center needs such as large layer 2 domains, virtual machine migration, and multi-tenancy.
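A minimal netlink sketch of creating a VXLAN device such as the eth0.vxlan20 interface discussed later in this article; the VNI, device name, and parent interface are assumptions:

```go
// A minimal sketch of creating a VXLAN device with netlink.
package main

import "github.com/vishvananda/netlink"

func main() {
	// The underlay interface that carries the encapsulated VXLAN traffic.
	parent, err := netlink.LinkByName("eth0")
	if err != nil {
		panic(err)
	}
	vxlan := &netlink.Vxlan{
		LinkAttrs:    netlink.LinkAttrs{Name: "eth0.vxlan20"},
		VxlanId:      20, // the VNI
		VtepDevIndex: parent.Attrs().Index,
		Port:         4789, // IANA-assigned VXLAN UDP port
	}
	if err := netlink.LinkAdd(vxlan); err != nil {
		panic(err)
	}
}
```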

  • IPIP

An IPIP tunnel is implemented based on a TUN device. A TUN device can encapsulate a layer 3 packet (an IP packet) inside another layer 3 packet. Linux natively supports several different IPIP tunnel types, all of which depend on the TUN device:

  • ipip: The ordinary IPIP tunnel, which encapsulates an IPv4 packet inside another IPv4 packet.
  • gre: Generic Routing Encapsulation, which defines a mechanism for encapsulating one network layer protocol inside any other network layer protocol. It applies to both IPv4 and IPv6.
  • sit: This mode is mainly used to encapsulate IPv6 packets inside IPv4 packets, which is IPv6 over IPv4.
  • isatap: The Intra-Site Automatic Tunnel Addressing Protocol, which is also used for IPv6 tunnel encapsulation (like sit).
  • vti: A virtual tunnel interface used for IPsec tunneling.

This article uses ipip, the common IPIP tunnel; a minimal creation sketch follows.
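The sketch below creates an IPIP tunnel device (analogous to Calico's tunl0) with netlink; the device name and the local and remote underlay addresses are assumptions:

```go
// A minimal sketch of creating an IPIP tunnel device with netlink.
package main

import (
	"net"

	"github.com/vishvananda/netlink"
)

func main() {
	ipip := &netlink.Iptun{
		LinkAttrs: netlink.LinkAttrs{Name: "tunl1"},
		Local:     net.ParseIP("192.168.1.10"), // this node's underlay IP
		Remote:    net.ParseIP("192.168.1.11"), // peer node's underlay IP
	}
	if err := netlink.LinkAdd(ipip); err != nil {
		panic(err)
	}
}
```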

Underlay Solution: A network that consists of physical devices (such as switches and routers) and is driven by Ethernet protocols, routing protocols, VLAN protocols, etc.

  • BGP

Border Gateway Protocol (BGP) is a path vector routing protocol that realizes route reachability between autonomous systems (AS) and selects the best routes.

  • VLAN

Virtual Local Area Network (VLAN) is a communication technology that logically divides a physical LAN into multiple broadcast domains. Hosts within a VLAN can communicate with each other directly, while hosts in different VLANs cannot, which confines broadcast messages to a single VLAN.
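A minimal netlink sketch of creating an 802.1Q VLAN sub-interface such as the eth0.20 device used later in this article; the parent interface and VLAN ID are assumptions:

```go
// A minimal sketch of creating a VLAN sub-interface with netlink.
package main

import "github.com/vishvananda/netlink"

func main() {
	parent, err := netlink.LinkByName("eth0")
	if err != nil {
		panic(err)
	}
	vlan := &netlink.Vlan{
		LinkAttrs: netlink.LinkAttrs{Name: "eth0.20", ParentIndex: parent.Attrs().Index},
		VlanId:    20,
	}
	if err := netlink.LinkAdd(vlan); err != nil {
		panic(err)
	}
}
```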

Principles of Network Plug-ins

  • Calico uses IPIP (or other tunnels) or establishes BGP connections between hosts for mutual learning of container routes, which solves the cross-node communication problem.
  • Hybridnet uses VXLAN tunnels, establishes BGP connections between hosts for mutual learning of container routes, or uses ARP proxying (in VLAN mode) to solve the cross-node communication problem.
  • Bifrost solves container communication problems by leveraging the VLAN capabilities of switches through the kernel MACVLAN module.

Classification and Comparison of Network Plug-ins

  • Network Plug-in Classification

[Table 1: Network plug-in classification]

  • Comparison of Network Plug-ins

[Table 2: Comparison of network plug-ins]

  • SNAT: It converts the source IP address of a data packet.
  • podIP: Pods communicate directly using the pod IP.
  • Veth Pair: Under Linux, you can create a pair of veth network interface controllers that behave like a pipe: packets sent into one end are received at the other. Container traffic enters the host network stack through the host-side veth network interface controller, so it passes the host's iptables rules before being sent out by the physical network interface controller.
  • MACVLAN Sub-Interface: A MACVLAN sub-interface is independent of the original host interface, and its MAC address and IP address can be configured separately. During external communication, container traffic does not enter the host network stack, so it bypasses the host's iptables rules and is sent out by the physical network interface controller directly at layer 2. (Both attachments are sketched after this list.)
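A minimal netlink sketch contrasting the two attachments compared above; the interface names and parent device are assumptions:

```go
// A minimal sketch of the two host-side attachments: a veth pair
// (traffic traverses the host stack) and a MACVLAN sub-interface
// in bridge mode (traffic bypasses the host stack at layer 2).
package main

import "github.com/vishvananda/netlink"

func main() {
	// Veth pair: the host-side end feeds traffic into the host network stack.
	veth := &netlink.Veth{
		LinkAttrs: netlink.LinkAttrs{Name: "hybr1234"},
		PeerName:  "eth0-pod",
	}
	if err := netlink.LinkAdd(veth); err != nil {
		panic(err)
	}

	// MACVLAN sub-interface: switched at layer 2 by the parent interface.
	parent, err := netlink.LinkByName("eth0")
	if err != nil {
		panic(err)
	}
	mv := &netlink.Macvlan{
		LinkAttrs: netlink.LinkAttrs{Name: "mv0", ParentIndex: parent.Attrs().Index},
		Mode:      netlink.MACVLAN_MODE_BRIDGE,
	}
	if err := netlink.LinkAdd(mv); err != nil {
		panic(err)
	}
}
```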

Network Plug-in Applications

Given the complex network conditions in data centers, the container network solution should be chosen according to the actual requirements.

  • If you want to be less intrusive to the physical network of the data center, you can use a tunneling solution.

    • If dual stacks are supported, the Hybridnet VXLAN solution is preferred.
    • If only single-stack IPv4 is supported, Calico IPIP and Calico VXLAN solutions are preferred.
  • If the data center supports and uses BGP:

    • If the hosts are in the same network segment, the Calico BGP solution is preferred. (Dual stacks are supported.)
    • If the hosts are in different network segments, the Hybridnet BGP solution is preferred. (Dual stacks are supported.)
  • Solutions such as MACVLAN and IPVLAN L2 have emerged for businesses pursuing high performance and low latency.
  • The Terway solution, other IPVLAN L3 solutions, or tunnel solutions can be selected in public cloud scenarios.
  • There are also solutions developed to cover all scenarios (such as Hybridnet and Multus). Multus is an open-source container network plug-in that attaches additional networks by invoking other CNI plug-ins.

This article contains a detailed analysis of the pod data link for Hybridnet VXLAN, Hybridnet VLAN, Hybridnet BGP, Calico IPIP, Calico BGP, and Bifrost (transformed to use MACVLAN).

Network Plug-in Architecture and Communication Path

Hybridnet

  • Overall Architecture

[Figure: Hybridnet overall architecture]

  • Hybridnet-daemon: It controls the data plane configuration on each node, such as iptables rules and policy routing.
  • Communication Path

1.  VXLAN Mode

  • Same-Node Communication

[Figure: Hybridnet VXLAN same-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic is matched to route table 39999 in the policy routes of the host and to the routing rule for Pod2 in route table 39999 (this policy routing setup is sketched after this section).
  3. Traffic enters the Pod2 container network stack from the hybrYYY network interface controller to complete the packet sending action.

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrYYY on the host side of Pod2, to the host network stack.
  2. According to the destination IP address, the traffic is matched to the route table 39999 in the policy routes of the host and to the routing rules of Pod1 in the route table 39999.
  3. Traffic enters the Pod1 container network stack from the hybrXXX network interface controller to complete the packet return.
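A minimal netlink sketch of the policy routing referenced above: a rule steering pod-bound traffic into table 39999, plus a per-pod /32 route in that table. The table number follows the article; the subnet, pod IP, and veth name are assumptions:

```go
// A minimal sketch of Hybridnet-style policy routing with netlink.
package main

import (
	"net"

	"github.com/vishvananda/netlink"
)

func main() {
	// Equivalent of: ip rule add to 10.244.1.0/24 lookup 39999
	rule := netlink.NewRule()
	rule.Table = 39999
	_, podSubnet, _ := net.ParseCIDR("10.244.1.0/24")
	rule.Dst = podSubnet
	if err := netlink.RuleAdd(rule); err != nil {
		panic(err)
	}

	// Equivalent of: ip route add 10.244.1.10/32 dev hybr1234 table 39999
	link, err := netlink.LinkByName("hybr1234")
	if err != nil {
		panic(err)
	}
	_, podIP, _ := net.ParseCIDR("10.244.1.10/32")
	if err := netlink.RouteAdd(&netlink.Route{
		LinkIndex: link.Attrs().Index,
		Dst:       podIP,
		Table:     39999,
		Scope:     netlink.SCOPE_LINK,
	}); err != nil {
		panic(err)
	}
}
```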
  • Cross-Node Communication

[Figure: Hybridnet VXLAN cross-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic is matched to route table 40000 in the policy routes of the host, where the route for the network segment of Pod2 points to the eth0.vxlan20 network interface controller.
  3. The forwarding (FDB) table of the eth0.vxlan20 device records the mapping between the peer VTEP's MAC address and its remote IP (see the sketch after this list).
  4. Traffic passes through the eth0.vxlan20 network interface controller and is VXLAN-encapsulated with outer UDP/IP headers.
  5. A route lookup shows the destination node is in the same network segment as this machine, so the MAC address of the peer physical network interface controller is resolved, and the packet is sent through the Node1 eth0 physical network interface controller.
  6. Traffic enters from the Node2 eth0 physical network interface controller and is decapsulated (the outer UDP/IP headers are removed) by the eth0.vxlan20 network interface controller.
  7. According to route table 39999, traffic enters the Pod2 container network stack from the hybrYYY network interface controller to complete the packet sending.
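The FDB mapping in step 3 could be programmed like this netlink sketch, the equivalent of `bridge fdb append <peer-vtep-mac> dev eth0.vxlan20 dst <peer-node-ip>`; the MAC address and peer node IP are assumptions:

```go
// A minimal sketch of adding a VXLAN forwarding (FDB) entry with netlink.
package main

import (
	"net"
	"syscall"

	"github.com/vishvananda/netlink"
)

func main() {
	link, err := netlink.LinkByName("eth0.vxlan20")
	if err != nil {
		panic(err)
	}
	mac, _ := net.ParseMAC("aa:bb:cc:dd:ee:ff") // peer VTEP's MAC address
	if err := netlink.NeighAppend(&netlink.Neigh{
		LinkIndex:    link.Attrs().Index,
		Family:       syscall.AF_BRIDGE,
		Flags:        netlink.NTF_SELF,
		State:        netlink.NUD_PERMANENT,
		IP:           net.ParseIP("192.168.1.11"), // peer node (remote VTEP) IP
		HardwareAddr: mac,
	}); err != nil {
		panic(err)
	}
}
```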

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrYYY on the host side of Pod2, and enters the host network stack.
  2. According to the destination IP address, the traffic is matched to route table 40000 in the policy routes of the host, where the route for the network segment of Pod1 points to the eth0.vxlan20 network interface controller.
  3. The forwarding (FDB) table of the eth0.vxlan20 device records the mapping between the peer VTEP's MAC address and its remote IP.
  4. Traffic passes through the eth0.vxlan20 network interface controller and is VXLAN-encapsulated with outer UDP/IP headers.
  5. A route lookup shows the destination node is in the same network segment as this machine, so the MAC address of the peer physical network interface controller is resolved, and the packet is sent through the Node2 eth0 physical network interface controller.
  6. Traffic enters from the Node1 eth0 physical network interface controller and is decapsulated by the eth0.vxlan20 network interface controller.
  7. According to route table 39999, traffic enters the Pod1 container network stack from the hybrXXX network interface controller to complete the packet return.

2.  VLAN Mode

  • Same-Node Communication

[Figure: Hybridnet VLAN same-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic is matched to the route table 39999 in the policy routes of the host and to the routing rules of Pod2 in the route table 39999.
  3. Traffic enters the Pod2 container network stack from the hybrYYY network interface controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrYYY on the host side of Pod2, and enters the host network stack.
  2. According to the destination IP address, the traffic is matched to the route table 39999 in the policy routes of the host and to the routing rules of Pod1 in the route table 39999.
  3. Traffic enters the Pod1 container network stack from the hybrXXX network interface controller to complete the packet return.

  • Cross-Node Communication

[Figure: Hybridnet VLAN cross-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic is matched to route table 10001 in the policy routes of the host and to the routing rule for Pod2 in route table 10001.
  3. According to the routing rule, traffic is sent through the eth0.20 VLAN network interface controller and out of its underlying eth0 physical network interface controller to the switch.
  4. The MAC address of Pod2 is matched on the switch, so the traffic is sent to the eth0 physical network interface controller of Node2.
  5. The traffic is received by the eth0.20 VLAN network interface controller. According to the route matched in route table 39999, the traffic enters the Pod2 container network stack from the hybrYYY network interface controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrYYY on the host side of Pod2, to the host network stack.
  2. According to the destination IP address, the traffic is matched to route table 10001 in the policy routes of the host and to the routing rule for Pod1 in route table 10001.
  3. According to the routing rule, traffic is sent through the eth0.20 VLAN network interface controller and out of its underlying eth0 physical network interface controller to the switch.
  4. The MAC address of Pod1 is matched on the switch, so the traffic is sent to the eth0 physical network interface controller of Node1.
  5. The traffic is received by the eth0.20 VLAN network interface controller. According to the route matched in route table 39999, the traffic enters the Pod1 container network stack from the hybrXXX network interface controller to complete the packet return.

3.  BGP Mode

  • Same-Node Communication

[Figure: Hybridnet BGP same-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic is matched to the route table 39999 in the policy routes of the host and to the routing rules of Pod2 in the route table 39999.
  3. Traffic enters the Pod2 container network stack from the hybrYYY network interface controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrYYY on the host side of Pod2, to the host network stack.
  2. According to the destination IP address, the traffic is matched to the route table 39999 in the policy routes of the host and to the routing rules of Pod1 in the route table 39999.
  3. Traffic enters the Pod1 container network stack from the hybrXXX network interface controller to complete the packet return.
  • Cross-Node Communication

[Figure: Hybridnet BGP cross-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic is matched to the route table 10001 in the policy routes of the host and to the default route in the route table 10001.
  3. Based on the default route, traffic is sent to the gateway switch at 10.0.0.1.
  4. The switch matches the specific route for Pod2 and sends traffic to the Node2 eth0 physical network interface controller.
  5. Traffic enters the Pod2 container network stack from the hybrYYY network interface controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> hybrYYY on the host side of Pod2, to the host network stack.
  2. According to the destination IP address, the traffic is matched to the route table 10001 in the policy routes of the host and to the default route in the route table 10001.
  3. Based on the default route, traffic is sent to the gateway switch at 10.0.0.1.
  4. The switch matches the specific route for Pod1 and sends traffic to the Node1 eth0 physical network interface controller.
  5. Traffic enters the Pod1 container network stack from the hybrXXX network interface controller to complete the packet return.

Calico

Basic Concepts:

  • A pure layer 3 data center network solution
  • The Linux kernel enables each host to act as a vRouter responsible for data forwarding.
  • vRouters propagate routing information through the BGP protocol.
  • Based on iptables, it also provides rich and flexible network policy rules.
  • Overall Architecture

[Figure: Calico overall architecture]

  • Felix: It runs on each container host node. It is mainly responsible for configuring routes, ACLs, and other information to ensure the connectivity of containers.
  • BIRD: It distributes the routing information that Felix writes into the kernel to the rest of the Calico network to ensure effective communication between containers.
  • etcd: It is a distributed key/value store responsible for the consistency of network metadata, ensuring the accuracy of the Calico network state.
  • RR: It is the route reflector. Calico works in node-mesh mode by default, in which all nodes establish BGP connections with each other. Node-mesh works without problems in small-scale deployments, but in large-scale deployments, the number of connections becomes large and consumes many resources. BGP RR can be used to avoid this: centralized route distribution is completed through one or more BGP RRs, which reduces the consumption of network resources and improves Calico's efficiency and stability.
  • Communication Path

1.  IPIP Mode

  • Same-Node Communication

[Figure: Calico IPIP same-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> caliXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic matches the routing rules to Pod2 in the route table.
  3. Traffic enters the Pod2 container network stack from the caliYYY network interface controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> caliYYY on the host side of Pod2, to the host network stack.
  2. According to the destination IP address, the traffic matches the routing rules to Pod1 in the route table.
  3. Traffic enters the Pod1 container network stack from the caliXXX network interface controller to complete the packet return.
  • Cross-Node Communication

[Figure: Calico IPIP cross-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

1.  Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> caliXXX on the host side of Pod1, to the host network stack.

  • src: pod1IP
  • dst: pod2IP

2.  According to the destination IP address, the traffic is matched in the route table to the routing rule that forwards the traffic to the tunl0 network interface controller.

  • src: pod1IP
  • dst: pod2IP

3.  Traffic is IPIP-encapsulated by tunl0 (which means an outer IP header is added) and sent through the eth0 physical network interface controller (the MTU arithmetic for this encapsulation is sketched after the packet sending process).

  • src: Node1IP
  • dst: Node2IP

4.  Traffic enters the host network stack of Node2 from the eth0 network interface controller of Node2.

  • src: Node1IP
  • dst: Node2IP

5.  Traffic enters tunl0 for IPIP unpacking.

  • src: pod1IP
  • dst: pod2IP

6.  Traffic enters the Pod2 container network stack from the caliYYY network interface controller to complete the packet sending.

  • src: pod1IP
  • dst: pod2IP
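Tunnel encapsulation shrinks the usable payload: IPIP adds one outer IPv4 header (20 bytes), while VXLAN (for comparison) adds outer Ethernet, IPv4, UDP, and VXLAN headers (50 bytes in total for IPv4). A small worked sketch of the resulting device MTUs, assuming a 1500-byte underlay MTU:

```go
// A small worked example of tunnel MTU arithmetic. The 1500-byte underlay
// MTU is an assumption; the overheads follow the encapsulation formats.
package main

import "fmt"

const (
	underlayMTU  = 1500
	ipipOverhead = 20 // one extra IPv4 header
	// VXLAN over IPv4: outer Ethernet (14) + IPv4 (20) + UDP (8) + VXLAN (8)
	vxlanOverhead = 50
)

func main() {
	fmt.Println("tunl0 MTU:       ", underlayMTU-ipipOverhead)  // 1480
	fmt.Println("vxlan device MTU:", underlayMTU-vxlanOverhead) // 1450
}
```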

Packet Return Process:

1.  Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> caliYYY on the host side of Pod2, to the host network stack.

  • src: pod2IP
  • dst: pod1IP

2.  According to the destination IP address, the traffic is matched in the route table to the routing rule that forwards the traffic to the tunl0 network interface controller.

  • src: pod2IP
  • dst: pod1IP

3.  Traffic is IPIP-encapsulated by tunl0 (which means an outer IP header is added) and sent through the eth0 physical network interface controller.

  • src: Node2IP
  • dst: Node1IP

4.  Traffic enters the host network stack of Node1 from the eth0 network interface controller of Node1.

  • src: Node2IP
  • dst: Node1IP

5.  Traffic enters tunl0 for IPIP unpacking.

  • src: pod2IP
  • dst: pod1IP

6.  Traffic enters the Pod1 container network stack from the caliXXX network interface controller to complete the packet return.

  • src: pod2IP
  • dst: pod1IP

2.  BGP Mode

  • Same-Node Communication

[Figure: Calico BGP same-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> caliXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic matches the routing rules to Pod2 in the route table.
  3. Traffic enters the Pod2 container network stack from the caliYYY network interface controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> caliYYY on the host side of Pod2, to the host network stack.
  2. According to the destination IP address, the traffic matches the routing rules to Pod1 in the route table.
  3. Traffic enters the Pod1 container network stack from the caliXXX network interface controller to complete the packet return.
  • Cross-Node Communication

[Figure: Calico BGP cross-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the veth-pair network interface controller, which is from eth0 -> caliXXX on the host side of Pod1, to the host network stack.
  2. According to the destination IP address, the traffic is matched in the route table to the routing rules of the corresponding network segment of Pod2 and sent from the Node1 eth0 physical network interface controller.
  3. Traffic enters from the Node2 eth0 physical network interface controller and enters the Pod2 container network stack from the caliYYY network interface controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the veth-pair network interface controller, which is from eth0 -> caliYYY on the host side of Pod2, to the host network stack.
  2. According to the destination IP address, the traffic is matched in the route table to the routing rule for the network segment of Pod1 and sent from the Node2 eth0 physical network interface controller (a sketch of this kind of route follows below).
  3. Traffic enters from the Node1 eth0 physical network interface controller and enters the Pod1 container network stack from the caliXXX network interface controller to complete the packet return.
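The per-segment routes matched in step 2 (programmed into the kernel by BIRD) could look like this minimal netlink sketch; the pod subnet and next-hop node address are assumptions:

```go
// A minimal sketch of a Calico-BGP-style route: traffic for the peer node's
// pod network segment is sent to that node's IP as the next hop.
package main

import (
	"net"

	"github.com/vishvananda/netlink"
)

func main() {
	// Equivalent of: ip route add 10.244.2.0/26 via 192.168.1.11
	_, podSegment, _ := net.ParseCIDR("10.244.2.0/26")
	if err := netlink.RouteAdd(&netlink.Route{
		Dst: podSegment,
		Gw:  net.ParseIP("192.168.1.11"), // the peer node's address
	}); err != nil {
		panic(err)
	}
}
```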

Bifrost

  • Overall Architecture

[Figure: Bifrost overall architecture]

  • veth0-bifrXXX: Bifrost implements a solution for Service access with MACVLAN. Service traffic destined for pods is steered through this veth-pair network interface controller so that it completes the kube-proxy + iptables processing in the host network stack.
  • eth0: The eth0 network interface controller in the container is a MACVLAN network interface controller created on the corresponding VLAN sub-interface of the host.
  • Communication Path

1.  MACVLAN Mode

  • Same Node and Same VLAN Communication

[Figure: Bifrost same-node, same-VLAN communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the MACVLAN network interface controller, which means eth0 of Pod1 enters the eth0-10 VLAN network interface sub-controller through the layer 2 network.
  2. Since MACVLAN works in bridge mode, the MAC address of Pod2 can be matched directly.
  3. Traffic enters the Pod2 container network stack from the eth0-10 VLAN network interface sub-controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the MACVLAN network interface controller, which means eth0 of Pod2 enters the eth0-10 VLAN network interface sub-controller through the layer 2 network.
  2. Since MACVLAN works in bridge mode, the MAC address of Pod1 can be matched directly.
  3. Traffic enters the Pod1 container network stack from the eth0-10 VLAN network interface sub-controller to complete the packet return.
  • Same-Node Cross-VLAN Communication

[Figure: Bifrost same-node, cross-VLAN communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the MACVLAN network interface controller, which means eth0 of Pod1 takes the default route (gateway 5.0.0.1) to enter the eth0-5 VLAN network interface sub-controller.
  2. Since the MAC address of gateway 5.0.0.1 is found on the eth0-5, traffic is sent out from the eth0 physical network interface controller to the switch.
  3. Traffic matches the MAC address of Pod2 on the switch.
  4. Traffic enters the physical network interface controller of the host where Pod2 resides and then the corresponding eth0-10 VLAN network interface sub-controller.
  5. Traffic enters the Pod2 container network stack from the eth0-10 VLAN network interface sub-controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the MACVLAN network interface controller, which means eth0 of Pod2 takes the default route (the gateway is 10.0.0.1) to enter the eth0-10 VLAN network interface sub-controller.
  2. Since the MAC address of gateway 10.0.0.1 is found on the eth0-10, traffic is sent out from the eth0 physical network interface controller to the switch.
  3. Traffic matches the MAC address of Pod1 on the switch.
  4. Traffic enters the physical network interface controller of the host where Pod1 is located and enters the corresponding eth0-5 VLAN network interface sub-controller.
  5. Traffic enters the Pod1 container network stack from the eth0-5 VLAN network interface sub-controller to complete the packet return.
  • Cross-Node Communication

[Figure: Bifrost cross-node communication]

Communication of Pod1 Accessing Pod2

Packet Sending Process:

  1. Pod1 traffic passes through the MACVLAN network interface controller, which means eth0 of Pod1 takes the default route (gateway 5.0.0.1) to enter the eth0-5 VLAN network interface sub-controller.
  2. Since the MAC address of gateway 5.0.0.1 is found on the eth0-5, traffic is sent out from the eth0 physical network interface controller to the switch.
  3. Traffic matches the MAC address of Pod2 on the switch.
  4. Traffic enters the physical network interface controller of the host where Pod2 resides and enters the corresponding eth0-10 VLAN network interface sub-controller.
  5. Traffic enters the Pod2 container network stack from the eth0-10 VLAN network interface sub-controller to complete the packet sending.

Packet Return Process:

  1. Pod2 traffic passes through the MACVLAN network interface controller, which means eth0 of Pod2 takes the default route (the gateway is 10.0.0.1) to enter the eth0-10 VLAN network interface sub-controller.
  2. Since the MAC address of gateway 10.0.0.1 is found on the eth0-10, traffic is sent out from the eth0 physical network interface controller to the switch.
  3. Traffic matches the MAC address of Pod1 on the switch.
  4. Traffic enters the physical network interface controller of the host where Pod1 is located and enters the corresponding eth0-5 VLAN network interface sub-controller.
  5. Traffic enters the Pod1 container network stack from the eth0-5 VLAN network interface sub-controller to complete the packet return.

Problems and Development

IPv4/IPv6 Dual Stacks

  • Background

As the most basic element of the Internet, IP is a protocol designed for computer networks to communicate with each other. It is because of the IP protocol that the Internet has rapidly developed into the world's largest open computer communication network. With the development of the Internet, the IP protocol has evolved into IPv4 and IPv6.

  • IPv4

IPv4 is the fourth version of the Internet Protocol and the first widely deployed one; it is the datagram transmission mechanism used by computer networks. Every device connected to the Internet (a switch, a PC, or another device) is assigned a unique IP address (such as 192.149.252.76). IPv4 uses 32-bit (4-byte) addresses and can hold about 4.3 billion addresses. As more and more users accessed the Internet, the global pool of IPv4 addresses was exhausted in November 2019. This is one of the reasons the Internet Engineering Task Force (IETF) proposed IPv6.

  • IPv6

IPv6 is the sixth version of the Internet Protocol, proposed by the IETF as the next-generation protocol to replace IPv4. It solves the problem of scarce network address resources and removes obstacles for various devices to access the Internet. An IPv6 address is 128 bits long and can support about 3.4×10^38 addresses. For example, 3ffe:1900:fe21:4545:0000:0000:0000:0000 is an IPv6 address. IPv6 addresses are divided into eight groups of four hexadecimal digits, separated by colons.

With IPv4 still mainstream and IPv6 adoption incomplete, the main problems are listed below:

  1. The number of IPv4 addresses no longer meets the requirements, and IPv6 addresses are required for expansion.
  2. With the clarification of domestic next-generation Internet development policies, customer data centers need to use IPv6 to comply with stricter regulations.
  • Status Quo
    IPv6/dual-stack support by plug-in:

      • Hybridnet: Supported
      • Calico IPIP: Not Supported
      • Calico BGP: Supported
      • Bifrost: Not Supported

Why does Calico IPIP not support IPv6?

  • The ipip tunnel type encapsulates an IPv4 packet inside another IPv4 packet, so IPv6 packets are not supported (IPv6 over IPv4 requires the sit tunnel type instead).

Multi Network Interface Controller (Multi-Communication Mechanism)

  • Background

Typically, a pod in Kubernetes has only one interface, which is a single network interface controller, used for communication in the cluster network. If a pod needs to communicate with a heterogeneous network, you can create multiple interfaces in the pod, which is the multiple network interface controller scenario.

Current Issues:

  1. Some customers are short of real IP resources, which makes it impossible to place all traffic on the underlay solution.
  2. Some customers want to separate the UDP network from the TCP network, which a network model with a single network interface controller cannot provide.
  • Status Quo

Currently, there are two schemes to realize multiple network interface controllers.

  • A single CNI generates the corresponding network interface controllers from the CNI config and allocates the appropriate IP resources when calling IPAM.
  • A meta CNI calls each underlying CNI to create the corresponding network interface controllers and allocate the appropriate IP resources, such as the Multus solution.

Network Traffic Control

  • Background

Generally, we divide network traffic into two types in a data center. One type is the traffic that interacts between users outside the data center and internal servers. Such traffic is called north-south traffic or vertical traffic. The other type is the traffic that interacts between servers inside the data center, which is also called east-west traffic or horizontal traffic.

In the container cloud, we define east-west traffic as the traffic between hosts and containers, between containers, or between hosts in the cluster, while north-south traffic is the traffic between the inside and the outside of the container cloud.

Current Issues:

1.  It is difficult for traditional firewalls to control east-west traffic in the container cloud, which requires inter-service and inter-container traffic control capabilities.

  • Status Quo
      • Calico: technical basis iptables; applies to layer 3 routing where traffic passes through the host network stack.
      • Cilium: technical basis eBPF; applies to layer 2 routing with the Cilium communication modes.
      • Bifrost (commercial version): technical basis eBPF; adapts to mainstream CNI plug-ins.
