How to Inspect Kubernetes Networking

By Ghulam Qadir, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

Kubernetes is a container orchestration system that can manage containerized applications across a cluster of server nodes. Maintaining network connectivity between all the containers in a cluster requires some advanced networking techniques. In this article, we will briefly cover some tools and techniques for inspecting this networking setup.

These tools may be useful if you are debugging connectivity issues, investigating network throughput problems, or exploring Kubernetes to learn how it operates.

Here are some of the key things to know about Kubernetes:

Master Node: This is the node which manages worker nodes in a cluster and controls the deployment of pods.

Worker Node: These servers typically run the application containers (in Pods) and other Kubernetes components such as proxies.

Service: A service functions as a proxy to replicated pods, and requests to the service can be load balanced across those pods.

Pod: The basic object for deployments in Kubernetes. Each pod has its own IP address and can contain one or more containers.

Other important system components which Kubernetes deploys include the API Server, Kubelet, and etcd.

How Kubernetes Networking is Different

Let's set the ground rules by stating that Kubernetes has some requirements for any network implementation that it runs on:

All containers can communicate with all other containers without NAT.
All nodes can communicate with all containers (and vice versa) without NAT.
The IP a container sees itself as is the same IP which others see it as.

Strictly speaking, Kubernetes doesn't manage individual containers; it manages pods. A pod is one or more containers that together make up a service offering. If you have a Java app using Spring Boot that has no hard dependencies, it could run as a single-container pod. If the Java app also requires a container running an NGINX reverse proxy for SSL termination and a container running MariaDB, those three containers would make up a pod. The idea is that if any member of the pod fails, Kubernetes will shut it down and, based on the rules you put in place, replace the entire pod with a new one. Kubernetes really shines at scale, when multiple copies of each pod run across multiple hosts to meet the client traffic or availability requirements you have defined.

To expose all the instances of a pod to the outside world, or to evenly distribute traffic from internal clients, the instances are grouped together as a service, which is load-balanced and managed by Kubernetes.

Advanced Networking Add-ons for Kubernetes

Kubernetes is happy to hand over control of network management to a third-party network implementation. That implementation needs to leverage the Kubernetes Container Network Interface (CNI) plugin framework.

By leveraging this framework, replacing the entire underlying networking infrastructure can be as straightforward as running a single command. Each of the drop-in replacements offers its own benefits and drawbacks.

Popular drop-in replacements include:

kubenet backed by Kubernetes - A basic networking plugin that only handles container networking at the host level. It is often used when running on a cloud provider that will handle routing the IPs to the host (for example, when using Google Compute Engine (GCE) to map individual subnets to the VMs running on the platform).
Contrail backed by Juniper - Contrail is the commercial offering of OpenContrail. Contrail is a truly open networking platform built around essential networking standards which can integrate across multiple cloud platforms and container orchestration engines.
Flannel backed by CoreOS - A simple networking plugin for Kubernetes that satisfies all the core principles which Kubernetes outlines, including routing between nodes.
Open vSwitch - Originally by Nicira, and used as the base for NSX, Open vSwitch is a full-featured, software-defined networking platform which is more complicated than other options, and is backed by several large technology firms including VMware and HPE.
Project Calico backed by Tigera - From Kubernetes.io: Calico provides a highly scalable networking and network policy solution for connecting Kubernetes pods based on the same IP networking principles as the Internet. Calico can be deployed without encapsulation or overlays to provide high-performance, high-scale data center networking. Calico also provides fine-grained, intent-based network security policy for Kubernetes pods via its distributed firewall. Calico can also be run in policy enforcement mode in conjunction with other networking solutions such as Flannel (aka canal), or native GCE networking.

Getting Started

In this tutorial, we will learn how to inspect Kubernetes networking on an Alibaba Cloud Elastic Compute Service (ECS) instance with Ubuntu 16.04.

Prerequisites

You must have Alibaba Cloud Elastic Compute Service (ECS) activated. If you are a new user, you can enjoy the Free Trial worth $300 in your Alibaba Cloud account. If you don't know how to set up your ECS instance, you can refer to this tutorial or check the quick-start guide.

You should set up your server's hostname, and have access to the VNC console in your Alibaba Cloud account or an SSH client installed on your PC.

Logging in to Your Console

After completing the prerequisites, log in as the root user with your root username and password via an SSH client (e.g. PuTTY) or the VNC console available in your Alibaba Cloud account dashboard.

Before you begin this tutorial, you'll need:

One Ubuntu 16.04 server with at least 2 GB of RAM, private networking enabled, and a non-root user. This can be set up by following the Initial Server Setup with Ubuntu 16.04.

This tutorial will assume that you have a Kubernetes cluster, with kubectl installed locally and configured to connect to the cluster.

The following sections contain many commands that are intended to be run on a Kubernetes node. They will look like this:

 # echo 'this is a node command'

Commands that should be run on your local machine are shown the same way; the text introducing each command says where to run it:

 # echo 'this is a local command'

Note: Most of the commands in this tutorial will need to be run as the root user. If you instead use a sudo-enabled user on your Kubernetes nodes, please add sudo to run the commands when necessary.

Finding a Pod's Cluster IP

To find the cluster IP address of a Kubernetes pod, use the kubectl get pod command on your local machine, with the option -o wide. This option will list more information, including the node the pod resides on, and the pod's cluster IP.


 # kubectl get pod -o wide



 Output

NAME                           READY     STATUS    RESTARTS   AGE       IP            NODE
hello-world-5b446dd74b-7c7pk   1/1       Running   0          22m       10.244.18.4   node-one
hello-world-5b446dd74b-pxtzt   1/1       Running   0          22m       10.244.3.4    node-two

The IP column will contain the internal cluster IP address for each pod.

If you don't see the pod you're looking for, make sure you're in the right namespace. You can list all pods in all namespaces by adding the flag --all-namespaces.
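When you want a single pod's IP in a script rather than the whole table, the IP column can be picked out by its header name. The following is a minimal sketch: `pod_ip` is a hypothetical helper name, not a kubectl feature, and it assumes the column layout shown in the example output above.

```shell
# pod_ip NAME: read `kubectl get pod -o wide` output on stdin and print
# the value in the IP column for the named pod.
# pod_ip is an illustrative helper, not part of kubectl.
pod_ip() {
  awk -v pod="$1" '
    # Header row: remember which column is titled "IP".
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "IP") col = i; next }
    # Data rows: print that column for the matching pod name.
    $1 == pod && col { print $col }
  '
}

# Example use on your local machine:
#   kubectl get pod -o wide | pod_ip hello-world-5b446dd74b-pxtzt
```

If you would rather not parse the table at all, kubectl can print the same field directly with kubectl get pod pod-name -o jsonpath='{.status.podIP}'.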

Finding a Service's IP

We can find a Service IP using kubectl as well. In this case we will list all services in all namespaces:

 # kubectl get service --all-namespaces



Output
NAMESPACE     NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
default       kubernetes                 ClusterIP   10.32.0.1       <none>        443/TCP         6d
kube-system   csi-attacher-doplugin      ClusterIP   10.32.159.128   <none>        12345/TCP       6d
kube-system   csi-provisioner-doplugin   ClusterIP   10.32.61.61     <none>        12345/TCP       6d
kube-system   kube-dns                   ClusterIP   10.32.0.10      <none>        53/UDP,53/TCP   6d
kube-system   kubernetes-dashboard       ClusterIP   10.32.226.209   <none>        443/TCP         6d

The service IP can be found in the CLUSTER-IP column.

Finding and Entering Pod Network Namespaces

Each Kubernetes pod gets assigned its own network namespace. Network namespaces (or netns) are a Linux networking primitive that provides isolation between network devices.

It can be useful to run commands from within a pod's netns, to check DNS resolution or general network connectivity. To do so, we first need to look up the process ID of one of the containers in a pod. For Docker, we can do that with a series of two commands. First, list the containers running on a node:

 # docker ps



Output
CONTAINER ID        IMAGE                                   COMMAND                  CREATED             STATUS              PORTS               NAMES
173ee46a3926        gcr.io/google-samples/node-hello        "/bin/sh -c 'node se…"   9 days ago          Up 9 days                               k8s_hello-world_hello-world-5b446dd74b-pxtzt_default_386a9073-7e35-11e8-8a3d-bae97d2c1afd_0
11ad51cb72df        k8s.gcr.io/pause-amd64:3.1              "/pause"                 9 days ago          Up 9 days                               k8s_POD_hello-world-5b446dd74b-pxtzt_default_386a9073-7e35-11e8-8a3d-bae97d2c1afd_0
. . .

Find the container ID or name of any container in the pod you're interested in. In the above output we're showing two containers:

The first container is the hello-world app running in the hello-world pod
The second is a pause container running in the hello-world pod. This container exists solely to hold onto the pod's network namespace

To get the process ID of either container, take note of the container ID or name, and use it in the following docker command:

 # docker inspect --format '{{ .State.Pid }}' container-id-or-name


 
 Output
 14552

A process ID (or PID) will be output. Now we can use the nsenter program to run a command in that process's network namespace:

 # nsenter -t your-container-pid -n ip addr

Be sure to use your own PID, and replace ip addr with the command you'd like to run inside the pod's network namespace.

Note: One advantage of using nsenter to run commands in a pod's namespace – versus using something like docker exec – is that you have access to all of the commands available on the node, instead of the typically limited set of commands installed in containers.
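The two steps above (look up the PID, then enter the namespace) can be combined into one small shell function. This is only a sketch: the name `netns_exec` is made up for illustration, and it assumes a Docker-based node where you are running as root.

```shell
# netns_exec CONTAINER COMMAND...: look up the PID of a Docker container
# and run COMMAND inside that process's network namespace.
# netns_exec is an illustrative name; it needs root on the node.
netns_exec() {
  container="$1"
  shift
  # Same lookup as the docker inspect command above.
  pid="$(docker inspect --format '{{ .State.Pid }}' "$container")" || return 1
  # -t selects the target process, -n enters only its network namespace.
  nsenter -t "$pid" -n "$@"
}

# Example use on a node:
#   netns_exec 173ee46a3926 ip addr
```

Because the command after the container name is passed through unchanged, the same helper works for ip addr, dig, or any other tool installed on the node.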

Finding a Pod's Virtual Ethernet Interface

Each pod's network namespace communicates with the node's root netns through a virtual ethernet pipe. On the node side, this pipe appears as a device that typically begins with veth and ends in a unique identifier, such as veth77f2275 or veth01. Inside the pod this pipe appears as eth0.

It can be useful to correlate which veth device is paired with a particular pod. To do so, we will list all network devices on the node, then list the devices in the pod's network namespace. We can then correlate device numbers between the two listings to make the connection.

First, run ip addr in the pod's network namespace using nsenter. Refer to the previous section Finding and Entering Pod Network Namespaces for details on how to do this:

 # nsenter -t your-container-pid -n ip addr



Output
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
10: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 02:42:0a:f4:03:04 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.3.4/24 brd 10.244.3.255 scope global eth0
       valid_lft forever preferred_lft forever

The command will output a list of the pod's interfaces. Note the if11 number after eth0@ in the example output. This means this pod's eth0 is linked to the node's 11th interface. Now run ip addr in the node's default namespace to list out its interfaces:

 # ip addr



Output
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

. . .

7: veth77f2275@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master docker0 state UP group default
    link/ether 26:05:99:58:0d:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::2405:99ff:fe58:db9/64 scope link
       valid_lft forever preferred_lft forever
9: vethd36cef3@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master docker0 state UP group default
    link/ether ae:05:21:a2:9a:2b brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::ac05:21ff:fea2:9a2b/64 scope link
       valid_lft forever preferred_lft forever
11: veth4f7342d@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master docker0 state UP group default
    link/ether e6:4d:7b:6f:56:4c brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::e44d:7bff:fe6f:564c/64 scope link
       valid_lft forever preferred_lft forever

The 11th interface is veth4f7342d in this example output. This is the virtual ethernet pipe to the pod we're investigating.
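The matching described above can also be scripted. The sketch below assumes the pod's interface is named eth0 and that ip addr prints in the format shown in the example output; `peer_ifindex` and `iface_by_index` are hypothetical helper names.

```shell
# peer_ifindex: read `ip addr` output from inside the pod on stdin and
# print the number N from the "eth0@ifN" entry.
peer_ifindex() {
  sed -n 's/^[0-9][0-9]*: eth0@if\([0-9][0-9]*\):.*/\1/p'
}

# iface_by_index N: read the node's `ip addr` output on stdin and print
# the name of the interface whose index is N (without the @ifM suffix).
iface_by_index() {
  awk -F': ' -v idx="$1" '$1 == idx { sub(/@.*/, "", $2); print $2 }'
}

# Example use on a node (your-container-pid as found earlier):
#   idx="$(nsenter -t your-container-pid -n ip addr | peer_ifindex)"
#   ip addr | iface_by_index "$idx"
```

With the example output above, the first function would yield 11 and the second would yield veth4f7342d.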

Inspecting Conntrack Connection Tracking

Prior to version 1.11, Kubernetes used iptables NAT and the conntrack kernel module to track connections, and clusters where kube-proxy still runs in iptables mode continue to do so. To list all the connections currently being tracked, use the conntrack command:

 # conntrack -L

To watch continuously for new connections, use the -E flag:

 # conntrack -E

To list conntrack-tracked connections to a particular destination address, use the -d flag:

 # conntrack -L -d 10.32.0.1

If your nodes are having issues making reliable connections to services, it's possible your connection tracking table is full and new connections are being dropped. If that's the case you may see messages like the following in your system logs:

/var/log/syslog
Jul 12 15:32:11 worker-528 kernel: nf_conntrack: table full, dropping packet.

There is a sysctl setting for the maximum number of connections to track. You can list out your current value with the following command:

 # sysctl net.netfilter.nf_conntrack_max



 Output
 net.netfilter.nf_conntrack_max = 131072

To set a new value, use the -w flag:

 # sysctl -w net.netfilter.nf_conntrack_max=198000

To make this setting permanent, add it to the sysctl.conf file:

/etc/sysctl.conf
 . . .
 net.netfilter.nf_conntrack_max = 198000
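Before raising the limit, it can help to see how close the table is to full. The kernel exposes the current number of tracked connections as net.netfilter.nf_conntrack_count alongside nf_conntrack_max. The sketch below turns the two numbers into a percentage; `conntrack_usage_pct` is a hypothetical helper name.

```shell
# conntrack_usage_pct COUNT MAX: print how full the connection tracking
# table is, as a whole-number percentage (rounded down).
# conntrack_usage_pct is an illustrative helper name.
conntrack_usage_pct() {
  echo $(( $1 * 100 / $2 ))
}

# Example use on a node:
#   conntrack_usage_pct "$(sysctl -n net.netfilter.nf_conntrack_count)" \
#                       "$(sysctl -n net.netfilter.nf_conntrack_max)"
```

A value that stays near 100 suggests the "table full" log message is likely to appear under load.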

Inspecting Iptables Rules

Prior to version 1.11, Kubernetes used iptables NAT to implement virtual IP translation and load balancing for Service IPs. This remains the case wherever kube-proxy runs in iptables mode.

To dump all iptables rules on a node, use the iptables-save command:

 # iptables-save

Because the output can be lengthy, you may want to pipe to a file (iptables-save > output.txt) or a pager (iptables-save | less) to more easily review the rules.

To list just the Kubernetes Service NAT rules, use the iptables command and the -L flag to specify the correct chain:

 # iptables -t nat -L KUBE-SERVICES



 Output
Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  anywhere             10.32.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  anywhere             10.32.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SVC-XGLOHA7QRQ3V22RZ  tcp  --  anywhere             10.32.226.209        /* kube-system/kubernetes-dashboard: cluster IP */ tcp dpt:https
. . .
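To follow a single Service through these rules, you can filter the listing by the Service's cluster IP and read off the per-service chains it jumps to. This is a sketch that assumes the five-column layout shown above, with the destination in the fifth column; `svc_chains_for_ip` is a hypothetical helper name.

```shell
# svc_chains_for_ip IP: read `iptables -t nat -L KUBE-SERVICES` output on
# stdin and print the target chain of every rule whose destination is IP.
# svc_chains_for_ip is an illustrative helper name.
svc_chains_for_ip() {
  awk -v ip="$1" '$5 == ip { print $1 }'
}

# Example use on a node (-n keeps addresses numeric):
#   iptables -t nat -L KUBE-SERVICES -n | svc_chains_for_ip 10.32.0.10
```

Each chain name it prints can then be listed in turn with iptables -t nat -L chain-name to see the rules for that Service.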

Querying Cluster DNS

One way to debug your cluster DNS resolution is to deploy a debug container with all the tools you need, then use kubectl to exec nslookup on it.

Another way to query the cluster DNS is using dig and nsenter from a node. If dig is not installed, it can be installed with apt on Ubuntu-based Linux distributions:

 # apt install dnsutils

First, find the cluster IP of the kube-dns service:

 # kubectl get service -n kube-system kube-dns



 Output
 NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
 kube-dns   ClusterIP   10.32.0.10   <none>        53/UDP,53/TCP   15d

The cluster IP appears in the CLUSTER-IP column above (10.32.0.10 in this example). Next we'll use nsenter to run dig in a container's network namespace:

 # nsenter -t 14346 -n dig kubernetes.default.svc.cluster.local @10.32.0.10

This dig command looks up the Service's full domain name, which has the form service-name.namespace.svc.cluster.local, and specifies the IP of the cluster DNS service (@10.32.0.10).
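The naming pattern is regular enough to build in a script. The sketch below assumes the cluster domain is cluster.local, as in the example; your cluster may be configured with a different one, so the function takes it as an optional third argument. `svc_fqdn` is a hypothetical helper name.

```shell
# svc_fqdn SERVICE NAMESPACE [CLUSTER_DOMAIN]: print the DNS name of a
# Service. The domain defaults to cluster.local (an assumption; pass the
# third argument if your cluster uses another domain).
svc_fqdn() {
  echo "$1.$2.svc.${3:-cluster.local}"
}

# Example use on a node, with the PID and DNS IP from the steps above:
#   nsenter -t 14346 -n dig "$(svc_fqdn kubernetes default)" @10.32.0.10
```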

Looking at IPVS Details

As of Kubernetes 1.11, kube-proxy can configure IPVS to handle the translation of virtual Service IPs to pod IPs. You can list the translation table of IPs with ipvsadm:

 # ipvsadm -Ln




 Output
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  100.64.0.1:443 rr
  -> 178.128.226.86:443           Masq    1      0          0
TCP  100.64.0.10:53 rr
  -> 100.96.1.3:53                Masq    1      0          0
  -> 100.96.1.4:53                Masq    1      0          0
UDP  100.64.0.10:53 rr
  -> 100.96.1.3:53                Masq    1      0          0
  -> 100.96.1.4:53                Masq    1      0          0

To show a single Service IP, use the -t option and specify the desired IP:

 # ipvsadm -Ln -t 100.64.0.10:53



Output
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  100.64.0.10:53 rr
  -> 100.96.1.3:53                Masq    1      0          0
  -> 100.96.1.4:53                Masq    1      0          0
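If you only need the pod addresses behind one virtual Service, the full table can be filtered in a script. This sketch assumes the output layout shown above, where each virtual service line starts with TCP or UDP and its backends follow on lines starting with "->". `ipvs_backends` is a hypothetical helper name.

```shell
# ipvs_backends PROTO VIP:PORT: read `ipvsadm -Ln` output on stdin and
# print the RemoteAddress:Port of each backend behind that virtual service.
# ipvs_backends is an illustrative helper name.
ipvs_backends() {
  awk -v proto="$1" -v vip="$2" '
    # A line starting with TCP or UDP opens a new virtual service.
    $1 == "TCP" || $1 == "UDP" { hit = ($1 == proto && $2 == vip); next }
    # Lines starting with "->" are backends of the current service.
    hit && $1 == "->" { print $2 }
  '
}

# Example use on a node:
#   ipvsadm -Ln | ipvs_backends UDP 100.64.0.10:53
```

Unlike the -t option, this also lets you select the UDP entry for the same address and port.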

That's all! These are the main ways to inspect Kubernetes networking on Ubuntu 16.04.

Conclusion

For years, Kubernetes and its ecosystem have run large-scale infrastructures. This has translated into a unique approach to how to handle networking between containers and the external services that rely on them.

In this article we've reviewed some commands and techniques for exploring and inspecting the details of your Kubernetes cluster's networking on your ECS instance.
