
Container Service for Kubernetes:Use KubeSkoop to troubleshoot network issues

Last Updated:Mar 25, 2026

Container network failures—intermittent DNS timeouts, TCP resets, Nginx 499/502/503/504 errors, and latency spikes—are notoriously hard to pin down because they are transient and difficult to reproduce. ACK KubeSkoop (formerly ACK Net Exporter) is an open-source network monitoring and troubleshooting suite that uses eBPF to collect per-Pod network metrics directly from each node, so you can identify root causes without manual packet captures or guesswork.

This topic shows you how to install KubeSkoop in a managed ACK cluster, configure monitoring, and use the per-Pod metrics to diagnose four common network failure types.

How it works

KubeSkoop runs as a DaemonSet, placing one agent on every node. Each agent uses eBPF to collect low-level network data from the node and aggregates it per Pod. The agent exposes Prometheus metrics and abnormal events through an HTTP endpoint on port 9102.

KubeSkoop supports:

  • Deep network monitoring

  • Connectivity diagnostics

  • Packet capturing

  • Latency probing

The following figure shows the core architecture of KubeSkoop.

KubeSkoop architecture

Prerequisites

Before you begin, make sure you have:

  • An ACK managed cluster.

  • A kubectl client that can connect to the cluster, or access to the cluster through the ACK console.

Install and configure ACK KubeSkoop

Install the ACK KubeSkoop add-on

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Add-ons.

  3. On the Add-ons page, search for ACK KubeSkoop, find the add-on, and click Install.

  4. On the Install Component ACK KubeSkoop page, click Confirm.

Configure the KubeSkoop add-on

KubeSkoop is configured through a ConfigMap named kubeskoop-config in the ack-kubeskoop namespace. Changes take effect immediately—no restart is required.

Option 1: Edit via kubectl

kubectl edit cm kubeskoop-config -n ack-kubeskoop

Option 2: Edit via the console

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. Click the name of your cluster. In the left navigation pane, choose Configurations > ConfigMaps.

  3. On the ConfigMaps page, set Namespace to ack-kubeskoop, search for kubeskoop-config, and click Edit in the Actions column.

  4. In the Edit panel, update the parameters and click OK.

The following parameters are supported.

  • debugmode: Enables debug mode. When set to true, provides DEBUG-level logs, debugging interfaces, and the Go pprof and gops diagnostic tools. Default: false

  • port: Port for the metrics HTTP endpoint. Default: 9102

  • enableController: Enables the Controller component, which interacts with the Kubernetes API to perform monitoring and management tasks. Default: true

  • controllerAddr: Address of the KubeSkoop Controller component. Default: dns:kubeskoop-controller:10263

  • metrics.probes: List of probe types to collect. Each probe maps to a metric category. Default probes: conntrack, qdisc, netdev, io, sock, tcpsummary, tcp, tcpext, udp, rdma. For the full probe reference, see Probes, metrics, and events.
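As an illustration, a kubeskoop-config ConfigMap that keeps the defaults above might look like the following sketch. The data key name (config.yaml here) and the exact key nesting are assumptions that can differ between add-on versions, so treat this as a template and verify it against the ConfigMap actually shipped with the add-on.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeskoop-config
  namespace: ack-kubeskoop
data:
  # Key name and nesting are illustrative; check the installed ConfigMap.
  config.yaml: |
    debugmode: false
    port: 9102
    enableController: true
    controllerAddr: dns:kubeskoop-controller:10263
    metrics:
      probes:
        - conntrack
        - qdisc
        - netdev
        - io
        - sock
        - tcpsummary
        - tcp
        - tcpext
        - udp
        - rdma
```

Because changes take effect immediately, trimming the probe list is also a quick way to reduce agent overhead when you only need one metric category.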

Set up the ARMS Prometheus dashboard

  1. Log on to the ARMS console. In the left navigation pane, click Integration Management.

  2. On the Integration Management page, click Add Integration. Search for KubeSkoop and click ACK KubeSkoop Network Monitoring.

  3. In the ACK KubeSkoop Network Monitoring dialog box, select the ACK cluster, enter an Integration Name, and click OK.

  4. Log on to the ACK console. Click the cluster name. In the left navigation pane, choose Operations > Prometheus Monitoring.

  5. Click the Others tab. The node and Pod monitoring dashboards created by KubeSkoop appear in the dashboard list.

KubeSkoop Prometheus dashboards in the ACK console
For more information about Alibaba Cloud Prometheus Service, see Use Alibaba Cloud Prometheus Service.

Access KubeSkoop metrics manually

KubeSkoop exposes metrics in Prometheus format at port 9102 on each agent Pod. To query metrics directly:

  1. Get the list of KubeSkoop agent Pods and their node IPs:

    kubectl get pod -n ack-kubeskoop -o wide | grep kubeskoop-agent

    Expected output:

    kubeskoop-agent-2chvw   1/1   Running   0   43m   172.16.16.xxx   cn-hangzhou.172.16.16.xxx   <none>   <none>
    kubeskoop-agent-2qtbf   1/1   Running   0   43m   172.16.16.xxx   cn-hangzhou.172.16.16.xxx   <none>   <none>
    kubeskoop-agent-72pgf   1/1   Running   0   43m   172.16.16.xxx   cn-hangzhou.172.16.16.xxx   <none>   <none>
  2. Fetch all metrics from an agent. Replace 172.16.16.xxx with the IP from the previous step.

    curl http://172.16.16.xxx:9102/metrics

Metrics follow this format:

kubeskoop_netdev_rxbytes{k8s_namespace="",k8s_node="cn-hangzhou.172.16.16.xxx",k8s_pod=""} 2.970963745e+09

Each metric is labeled with k8s_namespace, k8s_node, and k8s_pod, so you can filter by namespace, node, or Pod in Prometheus queries.
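To use these labels outside Prometheus, you can filter a raw scrape with standard shell tools. The sketch below keeps only the lines for one Pod and strips the label block so each line becomes a plain name/value pair; the scrape text is an inline sample with made-up values, and in practice you would pipe the output of the curl command above into the same pipeline.

```shell
#!/bin/sh
# Sample scrape text with illustrative values. In practice, replace it with:
#   curl -s http://<agent-ip>:9102/metrics
scrape='kubeskoop_netdev_rxbytes{k8s_namespace="default",k8s_node="node-1",k8s_pod="app-1"} 2.9e+09
kubeskoop_netdev_txdropped{k8s_namespace="default",k8s_node="node-1",k8s_pod="app-1"} 0
kubeskoop_netdev_rxbytes{k8s_namespace="kube-system",k8s_node="node-1",k8s_pod="coredns-x"} 1.2e+08'

# Keep only the lines for one Pod, then strip the {label} block so each
# output line is "<metric-name> <value>"
filtered=$(printf '%s\n' "$scrape" \
  | grep 'k8s_pod="app-1"' \
  | awk '{sub(/\{.*\}/, "", $1); print $1, $2}')
echo "$filtered"
```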

Troubleshoot common network issues

The following sections describe the metrics to watch for each failure type and how to interpret them.

Troubleshoot DNS timeouts

DNS timeouts in container environments typically have one of three root causes:

  • The DNS server responds slowly, and the response arrives only after the client-side timeout has expired.

  • The sender cannot dispatch the DNS query packet promptly.

  • The server responds in time, but the sender drops the response due to resource constraints such as insufficient memory.

Because most cloud-native workloads rely on CoreDNS, monitor the following metrics for both your application Pods and any CoreDNS Pods.

  • kubeskoop_pod_udpsndbuferrors: Errors when sending UDP packets through the network layer

  • kubeskoop_pod_udpincsumerrors: Checksum errors when receiving UDP packets

  • kubeskoop_pod_udpnoports: Times the network layer cannot find a matching socket when receiving with __udp4_lib_rcv

  • kubeskoop_pod_udpinerrors: Errors when receiving UDP packets

  • kubeskoop_pod_udpoutdatagrams: Packets successfully sent by UDP through the network layer

  • kubeskoop_pod_udprcvbuferrors: Errors from an insufficient socket receive queue when copying data to the application layer
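Because these metrics are cumulative counters, what matters is whether they increase during an error window, not their absolute value. The sketch below diffs two scrapes of kubeskoop_pod_udpinerrors for a hypothetical CoreDNS Pod; the two sample lines, with made-up values, stand in for snapshots taken before and after the incident.

```shell
#!/bin/sh
# Two scrapes of the same cumulative counter, taken before and after a DNS
# error window (illustrative values; in practice save two snapshots of
# `curl -s http://<agent-ip>:9102/metrics`).
before='kubeskoop_pod_udpinerrors{k8s_namespace="kube-system",k8s_node="node-1",k8s_pod="coredns-abc"} 42'
after='kubeskoop_pod_udpinerrors{k8s_namespace="kube-system",k8s_node="node-1",k8s_pod="coredns-abc"} 45'

# The delta across the window is what matters: a nonzero delta means UDP
# receive errors occurred on the Pod during the incident.
v1=$(printf '%s\n' "$before" | awk '{print $2}')
v2=$(printf '%s\n' "$after" | awk '{print $2}')
delta=$((v2 - v1))
echo "udpinerrors delta: $delta"
```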

Troubleshoot Nginx Ingress 499/502/503/504 errors

These four status codes each point to a different failure layer.

  • 499 (client closed the TCP connection before Nginx responded): the client-side timeout was reached while Nginx was still processing; the server processed the connection slowly after establishment; the upstream backend was slow.

  • 502 (Nginx could not get a valid response from the upstream): DNS resolution for the backend failed (common when using a Kubernetes Service); Nginx failed to establish a connection with the upstream; the upstream request or response was too large, causing memory allocation failures.

  • 503 (all upstream servers unavailable): no available backends; traffic throttled by the Ingress limit-req setting.

  • 504 (Nginx timed out waiting for the upstream): delayed response from the upstream backend.

Collect baseline information first. Before checking KubeSkoop metrics, gather:

  • Nginx access_log, specifically request_time, upstream_connect_time, and upstream_response_time

  • Nginx error_log entries around the time of the issue

  • Liveness and readiness probe status if configured

If you suspect connection failures, check for changes in these metrics:

  • kubeskoop_tcpext_listenoverflow: Half-connection queue overflow on a socket in the LISTEN state

  • kubeskoop_tcpext_listendrops: Failures to create a SYN_RECV socket from a LISTEN socket

  • kubeskoop_netdev_txdropped: Packets dropped by the NIC (network interface card) due to a transmission error

  • kubeskoop_netdev_rxdropped: Packets dropped by the NIC due to a reception error

  • kubeskoop_tcp_activeopens: Times a Pod initiates a TCP handshake with a SYN packet (failed connections also increment this counter)

  • kubeskoop_tcp_passiveopens: Times a Pod completes a TCP handshake and allocates a socket (equivalent to successfully established connections)

  • kubeskoop_tcp_retranssegs: Total retransmitted segments in a single Pod, calculated after TCP Segmentation Offload (TSO)

  • kubeskoop_tcp_estabresets: Times a TCP connection is abnormally closed in a single Pod

  • kubeskoop_tcp_outrsts: Reset packets sent by TCP in a single Pod

  • kubeskoop_conntrack_invalid: Times a connection tracking (conntrack) entry cannot be established, but the packet is not dropped

  • kubeskoop_conntrack_drop: Packets dropped because a conntrack entry could not be established

If Nginx responses are slow (for example, request_time is long but upstream_response_time is short), check for queue buildup:

  • kubeskoop_tcpsummary_tcpestablishedconn: Current TCP connections in the ESTABLISHED state

  • kubeskoop_tcpsummary_tcptimewaitconn: Current TCP connections in the TIME_WAIT state

  • kubeskoop_tcpsummary_tcptxqueue: Total bytes in the send queue of ESTABLISHED TCP connections

  • kubeskoop_tcpsummary_tcprxqueue: Total bytes in the receive queue of ESTABLISHED TCP connections

  • kubeskoop_tcpext_tcpretransfail: Retransmitted packets that returned an error other than EBUSY, indicating retransmission failure
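One quick way to spot queue buildup is to flag Pods whose receive queue exceeds a threshold at scrape time. The snippet below runs that check against inline sample data; the Pod names, the gauge values, and the 1 MiB threshold are all illustrative, and in practice you would feed it the curl output from an agent.

```shell
#!/bin/sh
# Sample tcprxqueue gauges in bytes (illustrative values)
scrape='kubeskoop_tcpsummary_tcprxqueue{k8s_namespace="ingress",k8s_node="node-1",k8s_pod="nginx-ingress-a"} 2097152
kubeskoop_tcpsummary_tcprxqueue{k8s_namespace="ingress",k8s_node="node-2",k8s_pod="nginx-ingress-b"} 4096'

# Print the Pods whose receive queue exceeds 1 MiB: packets are arriving
# but the user-space process is not draining them fast enough.
flagged=$(printf '%s\n' "$scrape" | awk -F'[" ]' '$NF > 1048576 {
  for (i = 1; i <= NF; i++) if ($i ~ /k8s_pod=$/) print $(i + 1)
}')
echo "$flagged"
```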

Troubleshoot TCP resets

A TCP reset surfaces in applications as a connection reset by peer error (common in C-based apps like Nginx) or a Broken pipe error (common in Java or Python apps that use TCP connection wrappers).

Common causes in cloud-native environments:

  • Server-side resource exhaustion, such as insufficient TCP memory, triggers a proactive reset.

  • Load Balancing or a Kubernetes Service forwards traffic to an unexpected backend due to anomalies in Endpoint selection or conntrack state.

  • Security policies close the connection.

  • In NAT environments or under high concurrency, Protection Against Wrapped Sequence Numbers (PAWS) or sequence number wraparound occurs.

  • TCP keepalive expires because no business traffic has passed for an extended period.

Start by mapping the network topology between the client and server at the time the reset occurs. Then check the following metrics:

  • kubeskoop_tcpext_tcpabortontimeout: Resets sent because the maximum number of keepalive, window probe, or retransmission attempts was exceeded

  • kubeskoop_tcpext_tcpabortonlinger: Resets sent to reclaim connections in the FIN_WAIT2 state when the TCP Linger2 option is enabled

  • kubeskoop_tcpext_tcpabortonclose: Resets sent because unread data was present when a TCP connection was closed

  • kubeskoop_tcpext_tcpabortonmemory: Resets sent due to insufficient memory during allocation of tw_sock or tcp_sock resources

  • kubeskoop_tcpext_tcpabortondata: Resets sent for fast connection reclamation when the Linger or Linger2 option is enabled

  • kubeskoop_tcpext_tcpackskippedsynrecv: Times a socket in the SYN_RECV state did not reply with an ACK

  • kubeskoop_tcpext_tcpackskippedpaws: Times an ACK was not sent due to out-of-window (OOW) rate limiting, even though the PAWS mechanism triggered a correction

  • kubeskoop_tcp_estabresets: Times a TCP connection was abnormally closed in a single Pod

  • kubeskoop_tcp_outrsts: Reset packets sent by TCP in a single Pod
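To attribute a burst of resets to a specific abort path, it helps to rank the reset-related counters side by side: outrsts gives the total, and the largest abort counter suggests which path produced it. The sketch below sorts a sample scrape with made-up values; in practice you would pipe in the agent's curl output filtered for the affected Pod.

```shell
#!/bin/sh
# Reset-related counters from a single scrape (illustrative values)
scrape='kubeskoop_tcpext_tcpabortontimeout{k8s_pod="app-1"} 0
kubeskoop_tcpext_tcpabortonmemory{k8s_pod="app-1"} 17
kubeskoop_tcpext_tcpabortonclose{k8s_pod="app-1"} 2
kubeskoop_tcp_outrsts{k8s_pod="app-1"} 19'

# Strip labels, put the value first, and sort highest first so the
# dominant abort counter stands out.
ranked=$(printf '%s\n' "$scrape" \
  | awk '{sub(/\{.*\}/, "", $1); print $2, $1}' \
  | sort -rn)
echo "$ranked"
```

In this sample, tcpabortonmemory accounts for nearly all of outrsts, which would point at TCP memory pressure on the server side.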

Troubleshoot network latency jitter

Intermittent latency spikes in container networks often have non-obvious causes. Although the symptoms appear as network delays, the root cause is frequently an operating system scheduling or resource issue.

Common causes:

  • A real-time process managed by the RT scheduler runs for too long, starving business processes or network kernel threads.

  • The workload experiences occasional slow external calls, such as high-latency cloud disk I/O or intermittent RDS Round-Trip Time (RTT) spikes.

  • Uneven load between CPUs or NUMA nodes causes some CPUs to lag.

  • Stateful kernel mechanisms such as the conntrack confirm operation, or a large number of orphan sockets, slow down socket lookups.

Monitor the following metrics to narrow down the cause:

  • kubeskoop_io_ioreadsyscall: Times a process performs file system read operations (read, pread)

  • kubeskoop_io_iowritesyscall: Times a process performs file system write operations (write, pwrite)

  • kubeskoop_io_ioreadbytes: Bytes a process reads from the file system (typically from a block device)

  • kubeskoop_io_iowritebytes: Bytes a process writes to the file system

  • kubeskoop_tcpext_tcptimeouts: SYN packets that were not acknowledged and were retransmitted (incremented when the congestion avoidance state has not entered recovery, loss, or disorder)

  • kubeskoop_tcpsummary_tcpestablishedconn: Current TCP connections in the ESTABLISHED state

  • kubeskoop_tcpsummary_tcptimewaitconn: Current TCP connections in the TIME_WAIT state

  • kubeskoop_tcpsummary_tcptxqueue: Total bytes in the send queue of ESTABLISHED TCP connections

  • kubeskoop_tcpsummary_tcprxqueue: Total bytes in the receive queue of ESTABLISHED TCP connections

  • kubeskoop_softnet_processed: Packets from the NIC backlog processed by all CPUs within a single Pod

  • kubeskoop_softnet_dropped: Packets dropped by all CPUs within a single Pod
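A coarse signal for the CPU-side causes described above is the ratio of dropped to processed backlog packets. The sketch below computes it from two sample counter values (made up for illustration); a ratio that rises during a latency window points at CPU scheduling pressure rather than the network path.

```shell
#!/bin/sh
# softnet counters for one Pod from a single scrape (illustrative values)
processed=200000
dropped=300

# Percentage of backlog packets the Pod's CPUs dropped instead of
# processing; compare this across latency windows rather than in isolation.
ratio=$(awk -v p="$processed" -v d="$dropped" 'BEGIN { printf "%.2f", 100 * d / p }')
echo "softnet drop ratio: ${ratio}%"
```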

Customer use cases

The following cases show how ACK KubeSkoop was used to identify and resolve real production issues.

Case 1: Intermittent DNS timeouts

Problem: A customer running a PHP application experienced intermittent DNS resolution timeouts. The DNS service was CoreDNS, with the configured upstream resolver pointing to a public provider.

Findings: During error windows:

  • kubeskoop_udp_noports increased by 1.

  • kubeskoop_packetloss_total increased by 1.

Both changes were small in absolute terms.

Root cause: The public DNS server was responding slowly. The response arrived after the PHP application had already timed out on the client side. The small metric increments were consistent with occasional, not systematic, packet delays.

Case 2: Intermittent connection failures in a Java application

Problem: A customer's Tomcat service became unavailable intermittently, with each outage lasting approximately 5 to 10 seconds.

Findings: Log analysis confirmed that a Garbage Collection (GC) pause was occurring at the time of each incident. After deploying KubeSkoop, kubeskoop_tcpext_listendrops increased significantly during GC events.

Root cause: During GC, request processing slowed and connections were not released quickly. Incoming connection requests continued to accumulate, filling the listen socket's backlog and causing it to overflow, which triggered the spike in kubeskoop_tcpext_listendrops.

Resolution: The customer adjusted Tomcat's connection backlog parameters, which resolved the issue.
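For reference, the backlog of a Tomcat HTTP connector is controlled by the acceptCount attribute of the Connector element in server.xml. The fragment below is illustrative only, not the customer's actual setting; the right value depends on expected connection bursts and GC pause length.

```xml
<!-- server.xml: acceptCount sets the listen backlog for this connector,
     i.e. how many pending connections may queue while all worker
     threads are busy (for example, during a GC pause). -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           acceptCount="1024" />
```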

Case 3: Intermittent RTT spikes to Redis

Problem: A customer observed intermittent Redis requests with total response times exceeding 300 ms, but could not reproduce the issue on demand.

Findings: After deploying KubeSkoop, kubeskoop_virtcmdlatency_latency increased during the latency windows:

  • Bucket le=15 (threshold ~36 ms) increased, indicating at least one call exceeded 36 ms.

  • Bucket le=18 (threshold ~200 ms) increased, indicating a call exceeding 200 ms.

Root cause: Kernel virtualization calls triggered during batch Pod creation and deletion occupied CPU cores and could not be preempted, causing the customer's intermittent latency.

Case 4: Intermittent health check failures for Nginx Ingress

Problem: Nginx Ingress health checks failed intermittently, accompanied by business request failures.

Findings: At the time of each failure:

  • kubeskoop_tcpsummary_tcprxqueue and kubeskoop_tcpsummary_tcptxqueue both increased.

  • kubeskoop_tcpext_tcptimeouts increased.

  • kubeskoop_tcpsummary_tcptimewaitconn decreased while kubeskoop_tcpsummary_tcpestablishedconn increased.

The kernel and connection establishment paths appeared normal, but the user-space process was not consuming packets from the receive queue or flushing the send queue.

Root cause: cgroup CPU throttling was intermittently preventing the Ingress process from being scheduled, stalling both packet processing and transmission.

Resolution: The customer enabled CPU Burst for the Nginx Ingress Pods by following Enable CPU Burst, which eliminated the throttling.

What's next