ACK Net Exporter is a component that enhances the observability of cluster networks. You can deploy ACK Net Exporter in your cluster to collect various metrics of container networks. This allows you to identify and troubleshoot network issues at the earliest opportunity. This topic describes how to use ACK Net Exporter to troubleshoot container network issues.
Prerequisites
A Container Service for Kubernetes (ACK) managed cluster is created. For more information, see Create an ACK managed cluster.
Background information
ACK Net Exporter runs in a daemon pod on each node. ACK Net Exporter uses the Extended Berkeley Packet Filter (eBPF) technology to collect network information from the node and aggregates the information at the pod level. ACK Net Exporter provides a standard interface to allow you to monitor high-level network information. The following figure shows the architecture of ACK Net Exporter.
Install and configure ACK Net Exporter
Install ACK Net Exporter
- Log on to the ACK console and choose in the left-side navigation pane.
- Find and click ack-net-exporter on the Marketplace page.
- On the ack-net-exporter page, click Deploy in the upper-right corner.
- In the Basic Information step, specify Cluster and Namespace, and then click Next.
- In the Parameters step, configure the following parameters and click OK.
Parameter | Description | Default value |
---|---|---|
name | The name of the ACK Net Exporter component. | ack-net-exporter-default |
namespace | The namespace to which ACK Net Exporter belongs. | kube-system |
config.enableEventServer | Specify whether to enable event tracing. Valid values: false (disables event tracing) and true (enables event tracing). | false |
config.enableMetricServer | Specify whether to enable metric collection. Valid values: false (disables metric collection) and true (enables metric collection). | true |
config.enableLegacyVersion | Specify whether to enable the compatibility mode. Valid values: false (disables the compatibility mode) and true (enables the compatibility mode). After you enable this mode, ACK Net Exporter supports more operating systems. However, you cannot use new features provided by ACK Net Exporter. | true |
config.remoteLokiAddress | The Grafana Loki service address to which events are pushed. | By default, this parameter is empty. |
config.metricLabelVerbose | Specify whether to enable metric verbose. Valid values: false (disables metric verbose) and true (enables metric verbose). After you enable this feature, pod IP addresses and labels are saved as the label information of metrics. | false |
config.metricServerPort | The port that is used by the metric service to provide HTTP services. | 9102 |
config.eventServerPort | The port that is used by the event service to provide gRPC streaming services. | 19102 |
config.metricProbes | The metric probes that you want to enable. For more information, see ACK Net Exporter metrics. | By default, this parameter is empty and only the required metric probes are enabled. |
config.eventProbes | The event probes that you want to enable. For more information, see ACK Net Exporter events. | By default, this parameter is empty and only the required event probes are enabled. |
Configure ACK Net Exporter
- You can run the following command to modify the ConfigMap of ACK Net Exporter:
kubectl edit cm inspector-config -n kube-system
- You can also configure ACK Net Exporter in the ACK console.
- Log on to the ACK console and click Clusters in the left-side navigation pane.
- On the Clusters page, click the name of a cluster and choose in the left-side navigation pane.
- On the ConfigMap page, set Namespace to kube-system, search for inspector-config, and then click Edit in the Actions of inspector-config.
- In the Edit panel, configure the parameters and click OK.
The following table describes the parameters supported by ACK Net Exporter.
Parameter | Description | Default value |
---|---|---|
debugmode | Specify whether to enable the debugging mode. Valid values: false (disables the debugging mode) and true (enables the debugging mode). After you enable this feature, debug-level logs, interface debugging, Go pprof, and gops are supported. | false |
event_config.loki_enable | Specify whether to enable the feature of pushing events to Grafana Loki. For more information, see Use Grafana Loki to collect and visualize events. Valid values: false (disables the feature) and true (enables the feature). | false |
event_config.loki_address | The Grafana Loki service address. By default, the system automatically discovers a service named grafana-loki in the specified namespace. | By default, this parameter is empty. |
event_config.probes | The event probes that you want to enable. For more information, see ACK Net Exporter events. | By default, this parameter is empty and only the required event probes are enabled. |
event_config.port | The port used by the event service to provide gRPC streaming services. | 19102 |
metric_config.verbose | Specify whether to enable metric verbose. Valid values: false (disables metric verbose) and true (enables metric verbose). After you enable this feature, pod IP addresses and labels are saved as the label information of metrics. | false |
metric_config.port | The port that is used by the metric service to provide HTTP services. | 9102 |
metric_config.probes | The metric probes that you want to enable. For more information, see ACK Net Exporter metrics. | By default, this parameter is empty and only the required metric probes are enabled. |
metric_config.interval | The interval at which metrics are collected. Metric collection compromises performance. Therefore, ACK Net Exporter caches the periodically collected metrics in memory. | 5 |
In earlier ACK Net Exporter versions, you need to trigger the system to recreate all ACK Net Exporter containers after you modify the configuration of ACK Net Exporter. The modified configuration takes effect after the containers are recreated. You no longer need to perform this operation in ACK Net Exporter 0.2.3 and later versions because these versions support hot updates.
Usage notes for ACK Net Exporter
Use ACK Net Exporter in operating systems other than Alinux
Some key features of ACK Net Exporter rely on eBPF programs to collect information. To meet the requirements of different operating system kernels, ACK Net Exporter uses CO-RE to distribute eBPF programs. When ACK Net Exporter starts up, it needs to load the BTF file that is associated with the operating system kernel. The BTF file stores the metadata of the kernel debug information. If no corresponding BTF file is loaded, the key features become unavailable. Most later operating system versions have built-in BTF files. For more information about the operating systems, see BPF Type Format.
- The kernel version of the operating system must be later than 4.10.
- One of the following files is installed:
- The kernel-debuginfo file, which stores the kernel debug information.
- The vmlinux file, which stores the debug information. The file is compiled by the operating system kernel but has not been compressed.
- The BTF file provided by the operating system.
- ACK Net Exporter is updated to 0.2.9 or later, and config.enableLegacyVersion is set to false when you install ACK Net Exporter.
- Store the BTF file in the /boot/ path of the node.
- If you installed a complete vmlinux file, you can store the vmlinux file in the /boot/ path of the operating system.
- If you installed the kernel-debuginfo package, find the vmlinux file in the /usr/lib/debug/lib/modules/ path of the node and copy it to the /boot/ path.
- Run the following command to check whether valid BTF information is loaded and ACK Net Exporter can run as expected:
# You can run commands such as docker, podman, and ctr to perform the test.
nerdctl run -it -v /boot:/boot registry.cn-hangzhou.aliyuncs.com/acs/btfhack:latest -- btfhack discover
If the path of the BTF file is returned, the configuration is completed. You can trigger the system to recreate the containers of ACK Net Exporter and wait a period of time. Then, you can view the collected metrics and events.
Metrics and metric format supported by ACK Net Exporter
- If you install ACK Net Exporter from the Marketplace page of the ACK console, you can run the following command to query all ACK Net Exporter pods:
kubectl get pod -l app=net-exporter -n kube-system -o wide
Expected output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
anp-*** 1/1 Running 0 32s 10.1.XX.XX cn-*** <none> <none>
- Run the following command to query metrics. Replace 10.1.XX.XX with the IP address of ACK Net Exporter obtained in the preceding step.
curl http://<10.1.XX.XX>:9102/metrics
inspector_pod_udprcvbuferrors{namespace="elastic-system",netns="ns402653****",node="iZbp179u0bgzhofjupc****",pod="elastic-operator-0"} 0 1654487977826
- inspector_pod_udprcvbuferrors: indicates that the metric is provided by ACK Net Exporter and that it is a pod metric. Metrics of both pods and nodes are collected. The name of the metric is udprcvbuferrors, which indicates the number of UDP receive buffer errors that occur because the receive queue within a pod is full.
- namespace, pod, node, and netns: the labels of metrics. You can use PromQL statements to filter labels. The pod label indicates the pod that the metric describes. The namespace label indicates the namespace to which the pod belongs. The node label indicates the name of the node that hosts the pod. The hostname specified in the /etc/hostname file is used as the default hostname. The netns label indicates the network namespace ID of a container in the pod.
- 0 and 1654487977826: the value of the metric and the point in time when the metric value is collected. The point in time is a UNIX timestamp.
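The metric format described above can also be split apart programmatically. The following Python sketch is an illustration only, not part of ACK Net Exporter: the function name parse_metric_line and the returned keys are hypothetical, and the sample line is copied from the output above. It assumes that the metric name always has the form inspector_<scope>_<metric> and that the timestamp is in milliseconds, as is usual for the Prometheus text format.

```python
import re

# Sample line copied from the curl output shown in this topic.
SAMPLE = ('inspector_pod_udprcvbuferrors{namespace="elastic-system",'
          'netns="ns402653****",node="iZbp179u0bgzhofjupc****",'
          'pod="elastic-operator-0"} 0 1654487977826')

# name{labels} value [timestamp]
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
                     r'(?:\{(?P<labels>.*)\})?\s+'
                     r'(?P<value>\S+)(?:\s+(?P<ts>-?\d+))?$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metric_line(line):
    """Split one sample line into name, scope, labels, value, and timestamp."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    name = match.group("name")
    # Assumption: names look like inspector_<scope>_<metric>, scope = pod or node.
    _prefix, scope, short_name = name.split("_", 2)
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    ts = match.group("ts")
    return {
        "name": name,
        "scope": scope,
        "metric": short_name,
        "labels": labels,
        "value": float(match.group("value")),
        "timestamp_ms": int(ts) if ts is not None else None,
    }


parsed = parse_metric_line(SAMPLE)
print(parsed["scope"], parsed["metric"], parsed["labels"]["pod"], parsed["value"])
```

The scope and label fields are the same values that you would filter on in PromQL, for example by pod or namespace.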
Events and event format supported by ACK Net Exporter
ACK Net Exporter can collect events of network exceptions that occur on nodes. This section describes the network exceptions that you may encounter. These exceptions occasionally occur and are difficult to reproduce. Currently, no efficient methods can be used to troubleshoot these exceptions.
- Connection failures and request timeouts caused by data packet loss.
- Performance issues caused by time-consuming data processing.
- Business interruptions that occur due to the anomalies of the stateful connection mechanism, such as TCP or connection tracking.
ACK Net Exporter provides eBPF-based context observability for operating system kernels to help you troubleshoot the preceding issues. ACK Net Exporter can capture the status of the operating system in real time when an exception occurs and then generates an event log. For more information about the events and event probes supported by ACK Net Exporter, see ACK Net Exporter events.
type=TCPRESET_NOSOCK pod=storage-monitor-5775dfdc77-fj767 namespace=kube-system protocol=TCP saddr=100.103.42.233 sport=443 daddr=10.1.17.188 dport=33488
- type=TCPRESET_NOSOCK: indicates the TCPRESET_NOSOCK event. This type of event is captured by the tcp_reset probe. The event indicates that a reset packet is returned for a packet that is destined for an unknown port because no matching socket can be found. This event usually occurs when NAT fails. For example, this event occurs when an IPVS timeout occurs.
- pod/namespace: the pod metadata that is associated with the event after ACK Net Exporter finds the matching IP address and network device serial number based on the network namespace of the packet.
- saddr/sport/daddr/dport: the packet information obtained by ACK Net Exporter from the kernel. The packet information varies based on the event. For example, an event captured by the net_softirq probe does not contain IP addresses. Instead, the event contains the serial number of the CPU in which the interruption occurs and the delay.
For events that require valid kernel stack information, ACK Net Exporter captures the stack context in the operating system kernel when the events occur, such as the following event:
type=PACKETLOSS pod=hostNetwork namespace=hostNetwork protocol=TCP saddr=10.1.17.172 sport=6443 daddr=10.1.17.176 dport=43018 stacktrace:skb_release_data+0xA3 __kfree_skb+0xE tcp_recvmsg+0x61D inet_recvmsg+0x58 sock_read_iter+0x92 new_sync_read+0xE8 vfs_read+0x89 ksys_read+0x5A
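The two sample events above share a simple key=value layout, which makes them easy to process in scripts. The following Python sketch is a hypothetical helper (parse_event is not an ACK Net Exporter API). It assumes that an event is a single line of space-separated key=value pairs, optionally followed by a stacktrace: section, as in the samples in this topic.

```python
def parse_event(line):
    """Parse a key=value event line and keep an optional stack trace as a list."""
    body, _, trace = line.partition(" stacktrace:")
    fields = {}
    for token in body.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    fields["stacktrace"] = trace.split() if trace else []
    return fields


# Both samples are copied from this topic.
RESET_EVENT = ("type=TCPRESET_NOSOCK pod=storage-monitor-5775dfdc77-fj767 "
               "namespace=kube-system protocol=TCP saddr=100.103.42.233 "
               "sport=443 daddr=10.1.17.188 dport=33488")
LOSS_EVENT = ("type=PACKETLOSS pod=hostNetwork namespace=hostNetwork protocol=TCP "
              "saddr=10.1.17.172 sport=6443 daddr=10.1.17.176 dport=43018 "
              "stacktrace:skb_release_data+0xA3 __kfree_skb+0xE tcp_recvmsg+0x61D "
              "inet_recvmsg+0x58 sock_read_iter+0x92 new_sync_read+0xE8 "
              "vfs_read+0x89 ksys_read+0x5A")

reset = parse_event(RESET_EVENT)
loss = parse_event(LOSS_EVENT)
print(reset["type"], reset["saddr"], reset["dport"])
print(loss["type"], loss["stacktrace"][0])
```

In the PACKETLOSS sample, the first stack frame (skb_release_data) is where the packet buffer is released, and the frames after it show the call path that led there.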
ACK Net Exporter allows you to view events by using multiple methods. For more information, see the Collect monitoring data from ACK Net Exporter topic.
Collect monitoring data from ACK Net Exporter
Scenario 1: Export monitoring data to Prometheus or Grafana and visualize the data
Configure scrape_config as follows to enable the Prometheus server to collect monitoring data from ACK Net Exporter:
# In the following example, only one endpoint is specified for data collection.
scrape_configs:
  # The job=<job_name> label is added to each time series that is collected based on the configuration. In this example, the job name is set to net-exporter_sample.
  - job_name: "net-exporter_sample"
    static_configs:
      - targets: ["{kubernetes pod ip}:9102"]
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: kube-system
data:
  prometheus.yml: |-
    # Add the following configuration to the Prometheus server:
    - job_name: 'net-exporter'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: 'net-exporter'
        action: keep
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
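The relabel rule that rewrites __address__ in the preceding configuration is the least obvious part of the scrape job. The following Python sketch imitates that single rule to show its effect. It is an approximation rather than Prometheus code: relabel_address is a hypothetical helper, the IP addresses are made-up examples, and the sketch relies on the documented Prometheus behavior that source label values are joined with a semicolon (;) and that the regular expression must match the whole joined string.

```python
import re

# Regex copied from the relabel rule in the preceding configuration.
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def relabel_address(address, port_annotation):
    """Return the rewritten __address__, or the original if the rule does not match."""
    # Prometheus joins the source label values with ";" before matching.
    joined = f"{address};{port_annotation}"
    match = ADDRESS_RE.fullmatch(joined)
    if match is None:
        return address  # action "replace" leaves the label unchanged on no match
    return f"{match.group(1)}:{match.group(2)}"  # equivalent to replacement $1:$2


# Hypothetical pod addresses, with and without an existing port.
print(relabel_address("10.1.2.3:8080", "9102"))
print(relabel_address("10.1.2.3", "9102"))
print(relabel_address("10.1.2.3:8080", ""))
```

In the first two calls, the port from the prometheus.io/port annotation replaces or completes the port of the discovered address. In the last call, the annotation is empty, so the address is kept as discovered.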
After you add the configuration, the Prometheus server shows the ACK Net Exporter pods that run as normal. You can also enter inspector into the search box on the Graph page of the Prometheus server to view the ACK Net Exporter metrics.
You can configure Grafana to visualize the monitoring data that is collected to Prometheus:
- In the left-side navigation pane of the Grafana page, choose .
- On the New dashboard page, click Add an empty panel.
- In the lower part of the Edit Panel page, enter Prometheus into the Data source field. Then, enter the address of the Prometheus server.
- Click Metric browser and enter inspector. Then, Grafana displays all available ACK Net Exporter metrics. Click Save in the upper-right part. In the dialog box that appears, click Save. Grafana then displays the visualized data, as shown in the following figure.
- You can configure how the metrics are displayed on a Grafana dashboard based on the configurations that are displayed in the preceding figure. For example, you can display the increment trend of the inspector_pod_tcppassiveopens metric. This metric indicates the total number of sockets that are created due to handshake requests sent by clients to establish TCP connections within a network space after the system is started or the container is created. To view the increment trend of this metric, use the following configurations:
// Use the rate() function provided by PromQL to calculate the increment trend of the metric.
rate(inspector_pod_tcppassiveopens[1m])
// Use the labels provided by net-exporter to configure a legend to display the metric.
{{namespace}}/{{pod}}/{{node}}
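To make the meaning of the rate() expression concrete, the following Python sketch computes a per-second increase from a list of counter samples. It is a deliberate simplification: the real PromQL rate() also handles counter resets and extrapolates to the window boundaries, which this sketch does not do. The function name simple_rate and the sample values are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase between the first and last sample in a window.

    samples: list of (unix_seconds, counter_value) tuples, oldest first.
    Simplification: no counter-reset handling and no extrapolation.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed to compute a rate
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    return (v1 - v0) / (t1 - t0)


# Hypothetical inspector_pod_tcppassiveopens samples, scraped every 15 seconds.
window = [(0, 1200), (15, 1215), (30, 1245), (45, 1260), (60, 1290)]
print(simple_rate(window))  # 90 new passive opens in 60 seconds -> 1.5 per second
```

Because the metric is a counter that only grows after the system or container starts, the raw value is of little use on a dashboard. The rate shows how many new inbound connections are accepted per second.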
Scenario 2: Export monitoring data to ARMS and visualize the data
To export monitoring data from ACK Net Exporter to Application Real-Time Monitoring Service (ARMS) and visualize the data, perform the following steps.
- Enable Prometheus Service.
- Configure custom ACK Net Exporter metrics.
Scenario 3: Export monitoring data to Grafana Loki and visualize the data
You can push anomaly events collected by ACK Net Exporter to your pre-configured Grafana Loki service in real time. This helps you manage these events in a centralized manner. To export monitoring data from ACK Net Exporter to Grafana Loki, perform the following steps.
- Set up Grafana Loki.
Note: Deploy Grafana Loki in a network that is accessible to the ACK Net Exporter pods. ACK Net Exporter can automatically push event logs to Grafana Loki.
- On the configuration page of ACK Net Exporter, set enableEventServer to true and lokiServerAddress to the address of the Grafana Loki service. You can specify the IP address or domain name of the Grafana Loki service.
- Run the following command to access the service address and check whether Grafana Loki is ready:
curl http://[Address of Grafana Loki]:3100/ready
- When Grafana Loki is ready, add Grafana Loki as a Grafana data source. Open Grafana, add a Loki data source, enter the address of Grafana Loki, and then click Save&test.
- In the left-side navigation pane, click Explore. On the top of the page, set the data source to Loki and view the events pushed to Grafana Loki. You can view the events of a node by selecting the node from the Label filters drop-down list or specify keywords in the Line contains field to search for specific events.
You can click Add to dashboard on the top of the page to add a configured event panel to the dashboard.
The content of the events provided by ACK Net Exporter varies based on the event type. You can check the event details to view the relevant content.
For more information about the LogQL query language supported by Grafana Loki, see LogQL.
Scenario 4: Use the ACK Net Exporter CLI to collect events
The ACK Net Exporter CLI (inspector-cli) is a scenario-specific troubleshooting and analysis tool developed by the ACK team based on ACK Net Exporter. You can use inspector-cli to collect kernel exception events in real time. inspector-cli can help quickly identify the cause of common exceptions in cloud-native scenarios.
# Launch a temporary container to run inspector-cli. You can replace the image with a later version to update inspector-cli.
docker run -it --name=inspector-cli --network=host registry.cn-hangzhou.aliyuncs.com/acs/inspector:v0.0.1-12-gff0558c-aliyun
which inspector
# /bin/inspector is the working path of inspector-cli. You can directly run inspector-cli in the container.
# Set '-e' to specify the address of the event service of ACK Net Exporter.
inspector watch -e 10.1.16.255
# Expected output:
INFO TCP_RCV_RST_ESTAB Namespace=kube-system Pod=kube-proxy-worker-tbv5s Node=iZbp1jesgumdx66l8ym8j8Z Netns=4026531993 10.1.16.255:43186 -> 100.100.27.15:3128
...
You can also log on to the inspector container of ACK Net Exporter to troubleshoot issues.
# When you run the following command, set the -n parameter to the namespace of net-exporter and specify the net-exporter pod that you want to access.
kubectl exec -it -n kube-system -c inspector net-exporter-2rvfh -- sh
# Run the following command to view the distribution of network entities on the current node.
inspector list entity
# Run the following command to listen for network exception events and other relevant information in the local network.
inspector watch -d -v
#{"time":"2023-02-03T09:01:03.402118044Z","level":"INFO","source":"/go/src/net-exporter/cmd/watch.go:63","msg":"TCPRESET_PROCESS","meta":"hostNetwork/hostNetwork node=izbp1dnsn1bwv9oyu2gaupz netns=ns0 ","event":"protocol=TCP saddr=10.1.17.113 sport=6443 daddr=10.1.17.113 dport=44226 state:TCP_OTHER "}
# You can also specify multiple ACK Net Exporter nodes to view the time when the event occurs on these nodes.
inspector watch -s 10.1.17.113 -s 10.1.18.14 -d -v
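The JSON line printed by inspector watch -d -v can be post-processed when you need to filter or aggregate events. The following Python sketch parses the sample line shown above. It is only an illustration: parse_watch_line is a hypothetical helper, and the field names (time, msg, meta, event) are taken from that single sample, so other versions of inspector-cli may use a different layout.

```python
import json

# Sample copied from the commented output line in this topic.
SAMPLE = ('{"time":"2023-02-03T09:01:03.402118044Z","level":"INFO",'
          '"source":"/go/src/net-exporter/cmd/watch.go:63","msg":"TCPRESET_PROCESS",'
          '"meta":"hostNetwork/hostNetwork node=izbp1dnsn1bwv9oyu2gaupz netns=ns0 ",'
          '"event":"protocol=TCP saddr=10.1.17.113 sport=6443 daddr=10.1.17.113 '
          'dport=44226 state:TCP_OTHER "}')


def parse_watch_line(line):
    """Decode one JSON event line and split the event string into fields."""
    record = json.loads(line.lstrip("#").strip())
    details = {}
    for token in record["event"].split():
        # Most fields use key=value; the sample also contains state:TCP_OTHER.
        for sep in ("=", ":"):
            key, found, value = token.partition(sep)
            if found:
                details[key] = value
                break
    return {"time": record["time"], "type": record["msg"],
            "meta": record["meta"].strip(), "details": details}


event = parse_watch_line(SAMPLE)
print(event["type"], event["details"]["sport"], event["details"]["state"])
```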
How to use ACK Net Exporter to troubleshoot occasional container network issues
This section describes how to troubleshoot occasional network issues in cloud-native scenarios. With the help of ACK Net Exporter, you can quickly obtain information that is required for fixing these issues.
DNS timeout issues
- The DNS server fails to reply before the DNS query times out.
- The DNS client cannot deliver the DNS query promptly or fails to deliver the DNS query.
- The DNS server responds to the DNS query. However, the response is lost due to a DNS client issue, such as insufficient memory.
Metric | Description |
---|---|
inspector_pod_udpsndbuferrors | The number of UDP packet send errors. |
inspector_pod_udpincsumerrors | The number of UDP packet checksum errors. |
inspector_pod_udpnoports | The number of times that the __udp4_lib_rcv function fails to find the socket when the function is invoked to receive UDP packets. |
inspector_pod_udpinerrors | The number of UDP packet receive errors. |
inspector_pod_udpoutdatagrams | The number of UDP packets that are successfully sent. |
inspector_pod_udprcvbuferrors | The number of times that UDP fails to replicate protocol data from the application layer to a socket queue because the socket queue is full. |
A large number of services in cloud-native environments rely on the DNS resolution service provided by CoreDNS. If a DNS issue correlated to CoreDNS occurs, you need to check the metrics of the CoreDNS pod.
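When a DNS timeout occurs, the useful signal is which of these counters changed around that time. The following Python sketch compares two snapshots of the UDP metrics for one pod and reports the error counters that increased. The snapshot values and the helper name increased_counters are hypothetical. In practice, the values would come from Prometheus or from the /metrics endpoint of ACK Net Exporter.

```python
# Error counters from the preceding table (udpoutdatagrams counts successes, so it is excluded).
UDP_ERROR_METRICS = (
    "inspector_pod_udpsndbuferrors",
    "inspector_pod_udpincsumerrors",
    "inspector_pod_udpnoports",
    "inspector_pod_udpinerrors",
    "inspector_pod_udprcvbuferrors",
)


def increased_counters(before, after, names=UDP_ERROR_METRICS):
    """Return {metric: delta} for the counters that grew between two snapshots."""
    deltas = {}
    for name in names:
        delta = after.get(name, 0) - before.get(name, 0)
        if delta > 0:
            deltas[name] = delta
    return deltas


# Hypothetical values for one pod, taken before and after a DNS timeout.
before = {"inspector_pod_udpnoports": 3,
          "inspector_pod_udprcvbuferrors": 0,
          "inspector_pod_udpoutdatagrams": 1000}
after = {"inspector_pod_udpnoports": 4,
         "inspector_pod_udprcvbuferrors": 0,
         "inspector_pod_udpoutdatagrams": 1040}

print(increased_counters(before, after))
```

In this made-up example, only inspector_pod_udpnoports increases, which would point to a response that arrived after the client socket was already closed rather than to a full receive buffer.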
Nginx Ingress 499, 502, 503, and 504 issues
- 499: This error is returned if the NGINX client closes the TCP connection without receiving a response from the NGINX server. Common reasons:
  - The NGINX client does not send the request immediately after the TCP connection is created. As a result, the client times out before the NGINX server replies. This issue commonly occurs with asynchronous requests sent by Android clients.
  - The NGINX server requires a period of time to handle the TCP connection. In this scenario, you need to check all possible causes.
  - The NGINX server is waiting for the response from the upstream backend.
- 502: This error is usually caused by connection issues between the NGINX server and upstream backend, such as connection failures or unexpected connection disruptions. Common reasons:
  - A DNS resolution failure occurs on the backend. This issue commonly occurs when a Kubernetes Service is specified as the backend.
  - The NGINX server fails to connect to the upstream backend.
  - Business interaction is interrupted because the size of the upstream request or response is too large or no memory can be allocated.
- 503: This error is returned to the client when all upstream backends are unavailable. Common reasons in cloud-native environments:
  - No backends are available. This issue only occasionally occurs.
  - The Ingress triggers rate limiting due to the heavy traffic.
- 504: This error is returned when packets exchanged between the NGINX server and upstream backend time out. One of the common reasons is that the response from the upstream backend fails to reach the NGINX server before the timeout period ends.
- The access_log information provided by NGINX, including request_time, upstream_connect_time, and upstream_response_time.
- The error_log information provided by NGINX. You need to check whether error messages are returned when the issue occurs.
- If you have configured liveness probing or readiness probing, you need to check the health check information.
Metric | Description |
---|---|
inspector_pod_tcpextlistenoverflows | The number of times that the SYN queue is full when the socket in the LISTEN state accepts connections. |
inspector_pod_tcpextlistendrops | The number of times that the socket in the LISTEN state fails to create a socket in the SYN_RECV state. |
inspector_pod_netdevtxdropped | The number of packet drops due to NIC send errors. |
inspector_pod_netdevrxdropped | The number of packet drops due to NIC receive errors. |
inspector_pod_tcpactiveopens | The number of times that TCP SYN succeeds within a pod, excluding SYN retransmissions. The value of this metric also increases when connection failures occur. |
inspector_pod_tcppassiveopens | The number of times that TCP handshake succeeds and a socket is allocated within a pod. In most cases, this metric indicates the number of new connections. |
inspector_pod_tcpretranssegs | The total number of packets that are retransmitted within a pod. TCP segments generated by TCP segmentation offload (TSO) are already counted. |
inspector_pod_tcpestabresets | The number of TCP connections that are exceptionally closed within a pod. The value is calculated only based on results. |
inspector_pod_tcpoutrsts | The number of TCP reset packets sent within a pod. |
inspector_pod_conntrackinvalid | The number of times that connection tracking fails to create connections but does not drop the packets. |
inspector_pod_conntrackdrop | The number of times that connection tracking drops packets due to connection failures. |
Check the following metrics when the request time (request_time) is short but the request times out.
Metric | Description |
---|---|
inspector_pod_tcpsummarytcpestablishedconn | The number of TCP connections in the ESTABLISHED state. |
inspector_pod_tcpsummarytcptimewaitconn | The number of TCP connections in the TIMEWAIT state. |
inspector_pod_tcpsummarytcptxqueue | The size of data packets in the send queue of TCP connections in the ESTABLISHED state. Unit: bytes. |
inspector_pod_tcpsummarytcprxqueue | The size of data packets in the receive queue of TCP connections in the ESTABLISHED state. Unit: bytes. |
inspector_pod_tcpexttcpretransfail | The number of errors other than EBUSY that are returned after a retransmission. The errors indicate that the retransmission fails. |
You can check the changes of the preceding metrics at the point in time when the issue occurs to narrow down the scope. If you still cannot locate the cause, submit a ticket and include the preceding information in your ticket to request technical support.
TCP reset issues
- connection reset by peer: This error usually occurs on NGINX services that rely on the C library.
- Broken pipe: This error usually occurs on Java and Python applications that are encapsulated with TCP.
- The server cannot provide services as normal. For example, the memory allocated to TCP is insufficient. In this scenario, TCP proactively sends reset packets.
- Requests are forwarded to an unexpected backend due to a stateful mechanism error, such as an endpoint or Conntrack error, when Services or load balancers are used.
- Connections are released due to security reasons.
- Protection Against Wrapped Sequence numbers (PAWS) or sequence number wrapping issues occur in NAT or high-concurrency scenarios.
- Connections remain idle for a long period of time when TCP keepalive is used.
- Analyze the network topology between the client and server when TCP reset packets are generated.
- Pay attention to the following metrics.
Metric | Description |
---|---|
inspector_pod_tcpexttcpabortontimeout | The number of times that TCP reset packets are sent to close connections because the upper limit of keepalive, window probe, and retransmission calls is reached. |
inspector_pod_tcpexttcpabortonlinger | The number of times that TCP reset packets are sent to close FIN_WAIT2 connections when the TCP Linger_2 option is enabled. |
inspector_pod_tcpexttcpabortonclose | The number of times that TCP reset packets are sent to close TCP connections when data reception is still in progress due to a reason other than the state machine. |
inspector_pod_tcpexttcpabortonmemory | The number of times that TCP reset packets are sent to close connections because tcp_check_oom triggers an out of memory error during memory allocation to tw_sock or tcp_sock. |
inspector_pod_tcpexttcpabortondata* | The number of times that TCP reset packets are sent to close connections because the Linger or Linger2 option is enabled. |
inspector_pod_tcpexttcpackskippedsynrecv | The number of times that the socket in the SYN_RECV state does not respond to ACK. |
inspector_pod_tcpexttcpackskippedpaws | The number of times that ACK packets are limited by the Out-of-Window (OOW) rate limiting mechanism because PAWS is triggered. |
inspector_pod_tcpestabresets | The number of TCP connections that are exceptionally closed within a pod. The value is calculated only based on results. |
inspector_pod_tcpoutrsts | The number of TCP reset packets sent within a pod. |
- If TCP reset occurs in a specific pattern, you can enable the events feature of ACK Net Exporter to collect the corresponding events.
Event | Description |
---|---|
TCP_SEND_RST | This event is generated when TCP reset packets are sent to close connections, unless one of the following more specific events (TCP_SEND_RST_NOSock or TCP_SEND_RST_ACTIVE) occurs. |
TCP_SEND_RST_NOSock | This event is generated when TCP reset packets are sent because no local socket is found. |
TCP_SEND_RST_ACTIVE | This event is generated when TCP reset packets are sent due to a resource issue or because the user mode is disabled. |
TCP_RCV_RST_SYN | This event is generated when TCP reset packets are received during the three-way handshake phase. |
TCP_RCV_RST_ESTAB | This event is generated when TCP reset packets are received after connections are established. |
TCP_RCV_RST_TW | This event is generated when TCP reset packets are received during the four-way handshake phase. |
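To check whether TCP resets follow a pattern, you can count the collected events by type and pod. The following Python sketch tallies lines in the text format printed by inspector watch. It is a hypothetical helper, not part of inspector-cli: the first sample line is copied from the inspector watch output in this topic, the other two lines are made up for illustration, and the parsing assumes that the second token is the event type and that metadata is given as Key=Value pairs.

```python
from collections import Counter


def tally_resets(lines):
    """Count TCP reset events per (event type, namespace, pod)."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or not parts[1].startswith("TCP_"):
            continue  # not an event line in the assumed format
        fields = dict(p.split("=", 1) for p in parts[2:] if "=" in p)
        key = (parts[1], fields.get("Namespace", "?"), fields.get("Pod", "?"))
        counts[key] += 1
    return counts


LINES = [
    # Copied from the inspector watch output in this topic.
    "INFO TCP_RCV_RST_ESTAB Namespace=kube-system Pod=kube-proxy-worker-tbv5s "
    "Node=iZbp1jesgumdx66l8ym8j8Z Netns=4026531993 10.1.16.255:43186 -> 100.100.27.15:3128",
    # Hypothetical additional events.
    "INFO TCP_RCV_RST_ESTAB Namespace=kube-system Pod=kube-proxy-worker-tbv5s "
    "Node=iZbp1jesgumdx66l8ym8j8Z Netns=4026531993 10.1.16.255:43190 -> 100.100.27.15:3128",
    "INFO TCP_SEND_RST_NOSock Namespace=default Pod=demo-app "
    "Node=iZbp1jesgumdx66l8ym8j8Z Netns=4026532001 10.1.16.20:8080 -> 10.1.17.5:51000",
]

for (event_type, namespace, pod), count in tally_resets(LINES).most_common():
    print(count, event_type, f"{namespace}/{pod}")
```

A pod or event type that dominates the tally is a reasonable place to start when you analyze the network topology between the client and the server.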
Occasional network latency and jitter issues
- A real-time process managed by the RT scheduler requires a long period of time to complete. As a result, user processes or network kernel processes are piled in the queue or run slowly.
- An external call made by the user process occasionally requires a long period of time to complete. For example, requests are processed slowly because the disk responds slowly or the round-trip time of an RDS instance increases.
- Some CPUs or NUMA nodes are overwhelmed due to the improper node configuration. As a result, system stuttering occurs.
- The stateful mechanism of the kernel causes the increased latency. For example, due to the confirm operation performed by connection tracking, a large number of orphan sockets adversely affect socket search.
Metric | Description |
---|---|
inspector_node_netsoftirqshed | The duration from the time when a software interrupt is initiated to the time when the ksoftirqd process starts to perform the software interrupt. |
inspector_node_netsoftirq | The duration from the time when the ksoftirqd process starts to perform the software interrupt to the time when the ksoftirqd process changes to the offcpu state. |
inspector_pod_ioioreadsyscall | The number of read operations performed by the process, such as the number of reads or preads. |
inspector_pod_ioiowritesyscall | The number of write operations performed by the process, such as the number of writes or pwrites. |
inspector_pod_ioioreadbytes | The number of bytes that the process reads from a file system (a block device in most cases). |
inspector_pod_ioiowritebytes | The number of bytes that the process writes into a file system. |
inspector_node_virtsendcmdlat | The duration of virtual calls for NIC operations. |
inspector_pod_tcpexttcptimeouts | The number of times that SYN packets are retransmitted because the SYN packets are not answered while the status of TCP_CA is not recovery, loss, or disorder. |
inspector_pod_tcpsummarytcpestablishedconn | The number of TCP connections in the ESTABLISHED state. |
inspector_pod_tcpsummarytcptimewaitconn | The number of TCP connections in the TIMEWAIT state. |
inspector_pod_tcpsummarytcptxqueue | The size of data packets in the send queue of TCP connections in the ESTABLISHED state. Unit: bytes. |
inspector_pod_tcpsummarytcprxqueue | The size of data packets in the receive queue of TCP connections in the ESTABLISHED state. Unit: bytes. |
inspector_pod_softnetprocessed | The number of backlog packets that all CPUs receive from the NIC within a pod. |
inspector_pod_softnettimesqueeze | The number of times that all CPUs fail to receive the complete packet or the receive operation times out within a pod. |
Case study
The following cases show how to use ACK Net Exporter to help troubleshoot container network issues.
Case 1: Occasional DNS resolution timeout
Symptom
Customer A submitted a ticket to request technical support to handle DNS resolution timeouts that occasionally occur. The application of Customer A is written in PHP. CoreDNS is configured to perform DNS resolution.
Troubleshooting
- Obtain DNS metrics from the monitoring system of Customer A.
- The obtained metrics show the following:
  - Each time a DNS resolution timeout occurs, the value of inspector_pod_udpnoports increases by 1. The overall value of this metric is small.
  - The number of __udp4_lib_rcv packet drops indicated by the inspector_pod_packetloss metric also increases by 1. However, the change in the number of packet drops is minor.
- Customer A specifies that the IP address of the DNS server is a public IP address provided by an Internet service provider (ISP). Based on the obtained metrics, the DNS timeouts occur because the response takes too long to reach the client. The response arrives after the DNS query has already timed out in user mode, so the kernel can no longer find a matching socket and drops the response.
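The correlation described in the troubleshooting steps can also be checked offline. The following Python snippet is a minimal sketch that assumes you have exported timestamped samples of the inspector_pod_udpnoports counter from your monitoring system and collected the timestamps of DNS timeouts from the application log. The function names, the 15-second window, and the sample data are hypothetical and only illustrate the idea.

```python
from bisect import bisect_right


def increase_around(samples, ts, window):
    """Return the counter increase between ts - window and ts + window.

    samples: list of (timestamp_seconds, counter_value), sorted by timestamp.
    Returns None if no sample exists at or before the start of the window.
    """
    times = [t for t, _ in samples]
    # Index of the last sample at or before the window start and window end.
    lo = bisect_right(times, ts - window) - 1
    hi = bisect_right(times, ts + window) - 1
    if lo < 0 or hi < 0:
        return None
    return samples[hi][1] - samples[lo][1]


def correlate(samples, timeout_events, window=15):
    """Map each timeout timestamp to the counter increase around it."""
    return {ts: increase_around(samples, ts, window) for ts in timeout_events}


# Hypothetical data: counter samples every 15 seconds and two logged timeouts.
udpnoports = [(0, 3), (15, 3), (30, 4), (45, 4), (60, 4), (75, 5)]
print(correlate(udpnoports, [25, 70]))
```

If every timeout maps to an increase of 1 while quiet periods map to 0, the late-response explanation is consistent with the data.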
Case 2: Occasional Java application connection failure
Symptom
Customer B submitted a ticket to request technical support to resolve the following issue: Tomcat occasionally becomes unavailable and the issue lasts 5 to 10 seconds each time.
Troubleshooting
- The log analysis result shows that the Java runtime was performing a garbage collection operation when the issue occurred.
- Customer B deployed ACK Net Exporter and analyzed the monitoring data. Customer B found that the value of the inspector_pod_tcpextlistendrops metric increased significantly at the time when the issue occurred.
- The analysis result shows that request processing slowed down while the Java runtime performed the garbage collection operation. However, new requests were not throttled. As a result, a large number of connections were created and the backlog of the LISTEN socket overflowed. This caused the value of the inspector_pod_tcpextlistendrops metric to increase.
- The pile-up of TCP connections lasts only a short period of time and is not caused by the request processing capability of TCP. In this scenario, Customer B modified the Tomcat parameters as we recommended and resolved the issue.
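The overflow mechanism described above can be illustrated with a small simulation. The following Python snippet is a simplified sketch, not the accounting that the kernel actually performs: it models a fixed-size backlog that keeps receiving connections at a constant rate while the application stops accepting them for a few ticks, as happens during a garbage collection pause. All parameter names and values are hypothetical.

```python
def simulate_listen_queue(arrivals_per_tick, accepts_per_tick, backlog, ticks, pause):
    """Return the number of dropped connections for each tick.

    pause: collection of ticks during which the application accepts nothing,
    for example because the runtime is paused for garbage collection.
    """
    queued = 0
    drops = []
    for tick in range(ticks):
        # New connections arrive; those that do not fit in the backlog are dropped.
        admitted = min(arrivals_per_tick, backlog - queued)
        drops.append(arrivals_per_tick - admitted)
        queued += admitted
        # The application drains the queue unless it is paused.
        if tick not in pause:
            queued -= min(queued, accepts_per_tick)
    return drops


# Hypothetical workload: 10 connections per tick, a pause from tick 2 to tick 9.
print(simulate_listen_queue(10, 20, backlog=50, ticks=12, pause=range(2, 10)))
```

In this model, drops appear only while the pause lasts and shortly after it ends, and a backlog that is large enough to absorb the arrivals during the pause produces no drops at all. This matches the observation that the issue is short-lived and is not caused by insufficient processing capability.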
Case 3: Occasional network jitter
Symptom
Customer C submitted a ticket to request technical support to resolve the following issue: The round-trip time between the Redis instance and application significantly increases. As a result, timeout errors occur. The issue cannot be reproduced.
Troubleshooting
- After analyzing the log, Customer C identified that the response time of Redis requests occasionally exceeds 300 milliseconds.
- Customer C also identified that the value of the inspector_node_virtsendcmdlat metric increased at the time when the issue occurred. The affected monitoring levels in Prometheus Service are 15 and 18. After calculation, Customer C identified two virtual calls with a long response time. The response time of the call with a monitoring level of 15 exceeds 36 milliseconds and the response time of the call with a monitoring level of 18 exceeds 200 milliseconds.
- The kernel occupies the CPU when processing virtual calls. In this case, CPU resources cannot be preempted by other operations. As a result, the execution of virtual calls slows down when pods are added or deleted in batches, which further increases the response time.
Case 4: NGINX Ingress occasionally fails to pass health checks
Symptom
Customer D submitted a ticket to request technical support to resolve the following issue: The NGINX Ingress occasionally fails to pass health checks. As a result, request failures occur.
Troubleshooting
- After Customer D deployed ACK Net Exporter, Customer D identified that the following metrics are abnormal:
  - The values of the inspector_pod_tcpsummarytcprxqueue and inspector_pod_tcpsummarytcptxqueue metrics increased.
  - The value of the inspector_pod_tcpexttcptimeouts metric increased.
  - The value of the inspector_pod_tcpsummarytcptimewaitconn metric decreased and the value of the inspector_pod_tcpsummarytcpestablishedconn metric increased.
- The analysis result shows that the kernel ran as normal when the issue occurred. Connections were created as normal. However, exceptions occurred when the user process handled the packets in the receive socket and sent packets. In this scenario, the health check failure may be caused by a scheduling or rate limiting issue.
- Customer D checked the monitoring data of the cgroups as we recommended and identified CPU throttling at the point in time when the health check failure occurred. This indicates that the user process occasionally could not obtain CPU time because it was throttled by its cgroup.
- To resolve this issue, refer to CPU Burst and configure CPU Burst for the NGINX Ingress.
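CPU throttling of the kind identified above can be quantified from the cpu.stat file of the cgroup CPU controller, which reports the nr_periods and nr_throttled counters. The following Python snippet is a minimal sketch that compares two snapshots of that file. The location of cpu.stat depends on the cgroup version and the container runtime, so the snippet takes the file content as text. The function names and the sample values are hypothetical.

```python
def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_ratio(before, after):
    """Return the share of CFS periods that were throttled between two snapshots."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    return throttled / periods if periods > 0 else 0.0


# Hypothetical snapshots taken before and after a failed health check.
snapshot_1 = parse_cpu_stat("nr_periods 1000\nnr_throttled 10\nthrottled_time 500000000\n")
snapshot_2 = parse_cpu_stat("nr_periods 1100\nnr_throttled 60\nthrottled_time 900000000\n")
print(throttle_ratio(snapshot_1, snapshot_2))
```

A ratio that rises sharply around the time of a failed health check suggests that the process was runnable but was not allowed to run, which is the situation that CPU Burst is intended to relieve.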
References
ACK Net Exporter metrics
The metrics supported by ACK Net Exporter are constantly updated. For more information, see the instructions on the Marketplace page of the ACK console. All metrics and events provide pod-specific information, except net_softirq and virtcmdlat, which are not related to pods.
Metric | Description | Probe name |
---|---|---|
inspector_pod_netdevrxbytes | The number of bytes received by the NIC. | netdev |
inspector_pod_netdevtxbytes | The number of bytes sent by the NIC. | netdev |
inspector_pod_netdevtxerrors | The number of NIC send errors. | netdev |
inspector_pod_netdevrxerrors | The number of NIC receive errors. | netdev |
inspector_pod_netdevtxdropped | The number of packet drops due to NIC send errors. | netdev |
inspector_pod_netdevrxdropped | The number of packet drops due to NIC receive errors. | netdev |
inspector_pod_netdevtxpackets | The number of packets that are successfully sent by the NIC. | netdev |
inspector_pod_netdevrxpackets | The number of packets that are successfully received by the NIC. | netdev |
inspector_pod_softnetprocessed | The number of backlog packets that all CPUs receive from the NIC within a pod. | softnet |
inspector_pod_softnetdropped | The number of backlog packets that are dropped by all CPUs after the CPUs receive the packets from the NIC within a pod. | softnet |
inspector_pod_softnettimesqueeze | The number of times that all CPUs fail to receive the complete packet or the receive operation times out within a pod. | softnet |
inspector_pod_tcpactiveopens | The number of times that TCP SYN succeeds within a pod, excluding SYN retransmissions. The value of this metric also increases when connection failures occur. | tcp |
inspector_pod_tcppassiveopens | The number of times that TCP handshake succeeds and a socket is allocated within a pod. In most cases, this metric indicates the number of new connections. | tcp |
inspector_pod_tcpretranssegs | The total number of packets that are retransmitted within a pod. TCP segments generated by TSO are already counted. | tcp |
inspector_pod_tcpestabresets | The number of TCP connections that are exceptionally closed within a pod. The value is calculated only based on results. | tcp |
inspector_pod_tcpoutrsts | The number of TCP reset packets sent within a pod. | tcp |
inspector_pod_tcpcurrestab | The number of active TCP connections within a pod. | tcp |
inspector_pod_tcpexttcpabortontimeout | The number of times that TCP reset packets are sent to close connections because the upper limit of keepalive, window probe, and retransmission calls is reached. | tcpext |
inspector_pod_tcpexttcpabortonlinger | The number of times that TCP reset packets are sent to close FIN_WAIT2 connections when the TCP Linger_2 option is enabled. | tcpext |
inspector_pod_tcpexttcpabortonclose | The number of times that TCP reset packets are sent to close TCP connections when data reception is still in progress due to a reason other than the state machine. | tcpext |
inspector_pod_tcpexttcpabortonmemory | The number of times that TCP reset packets are sent to close connections because tcp_check_oom triggers an out of memory error during memory allocation to tw_sock or tcp_sock. | tcpext |
inspector_pod_tcpexttcpabortondata* | The number of times that TCP reset packets are sent to close connections because the Linger or Linger2 option is enabled. | tcpext |
inspector_pod_tcpextlistenoverflows | The number of times that the SYN queue is full when the socket in the LISTEN state accepts connections. | tcpext |
inspector_pod_tcpextlistendrops | The number of times that a socket in the LISTEN state fails to create a socket in the SYN_RECV state. | tcpext |
inspector_pod_tcpexttcpackskippedsynrecv | The number of times that the socket in the SYN_RECV state does not respond to ACK. | tcpext |
inspector_pod_tcpexttcpackskippedpaws | The number of times that ACK packets are limited by the OOW rate limiting mechanism because PAWS is triggered. | tcpext |
inspector_pod_tcpexttcpackskippedseq | The number of times that ACK packets are limited by the OOW rate limiting mechanism because sequence numbers are out of window. | tcpext |
inspector_pod_tcpexttcpackskippedchallenge | The number of times that challenge ack packets are limited by the OOW rate limiting mechanism. These packets are usually sent to confirm TCP reset packets. | tcpext |
inspector_pod_tcpexttcpackskippedtimewait | The number of times that ACK packets are ignored by the OOW rate limiting mechanism in the TIMEWAIT state. | tcpext |
inspector_pod_tcpexttcpackskippedfinwait2 | The number of times that ACK packets are ignored by the OOW rate limiting mechanism in the fin_wait_2 state. | tcpext |
inspector_pod_tcpextpawsestabrejected* | The number of times that TCP inbound packets are dropped because PAWS is triggered. | tcpext |
inspector_pod_tcpexttcprcvqdrop | The value of this metric increases when memory allocation fails and the TCP receive queue is full. | tcpext |
inspector_pod_tcpexttcpretransfail | The number of errors other than EBUSY that are returned after a retransmission. The errors indicate that the retransmission fails. | tcpext |
inspector_pod_tcpexttcpsynretrans | The number of SYN packets that are retransmitted. | tcpext |
inspector_pod_tcpexttcpfastretrans | The number of times that retransmission is triggered when the status of TCP_CA is not Loss. | tcpext |
inspector_pod_tcpexttcptimeouts | The number of times that SYN packets are retransmitted because the SYN packets are not answered while the status of TCP_CA is not recovery, loss, or disorder. | tcpext |
inspector_pod_tcpsummarytcpestablishedconn | The number of TCP connections in the ESTABLISHED state. | tcpsummary |
inspector_pod_tcpsummarytcptimewaitconn | The number of TCP connections in the TIMEWAIT state. | tcpsummary |
inspector_pod_tcpsummarytcptxqueue | The size of data packets in the send queue of TCP connections in the ESTABLISHED state. Unit: bytes. | tcpsummary |
inspector_pod_tcpsummarytcprxqueue | The size of data packets in the receive queue of TCP connections in the ESTABLISHED state. Unit: bytes. | tcpsummary |
inspector_pod_udpindatagrams | The number of UDP packets that are successfully received. | udp |
inspector_pod_udpsndbuferrors | The number of UDP packet send errors. | udp |
inspector_pod_udpincsumerrors | The number of UDP packet checksum errors. | udp |
inspector_pod_udpignoredmulti | The number of multicast packets that are ignored by UDP. | udp |
inspector_pod_udpnoports | The number of times that the corresponding socket cannot be found when the network layer invokes __udp4_lib_rcv to receive packets. | udp |
inspector_pod_udpinerrors | The number of UDP packet receive errors. | udp |
inspector_pod_udpoutdatagrams | The number of UDP packets that are successfully sent. | udp |
inspector_pod_udprcvbuferrors | The number of times that UDP fails to replicate protocol data from the application layer to a socket queue because the socket queue is full. | udp |
inspector_pod_conntrackentries* | The number of existing entries. | conntrack |
inspector_pod_conntrackfound | The number of times that connection tracking records are found. | conntrack |
inspector_pod_conntrackinsert | The metric is not in use. | conntrack |
inspector_pod_conntrackinvalid | The number of times that connection tracking fails to create connections but does not drop the packets. | conntrack |
inspector_pod_conntrackignore | The number of times that connection tracking is skipped because connections are already created or connection tracking is not required. | conntrack |
inspector_pod_conntrackinsertfailed | The metric is not in use. | conntrack |
inspector_pod_conntrackdrop | The number of times that connection tracking drops packets due to connection failures. | conntrack |
inspector_pod_conntrackearlydrop | The metric is not in use. | conntrack |
inspector_pod_conntracksearchrestart | The number of attempts to retry a search during connection tracking. | conntrack |
inspector_pod_fdopenfd | The number of file descriptors of all processes within a pod. | fd |
inspector_pod_fdopensocket | The number of file descriptors of socket type within a pod. | fd |
inspector_pod_slabtcpslabobjperslab | The number of objects included in a single page of a TCP slab. | slab |
inspector_pod_slabtcpslabpagesperslab | The number of pages in a TCP slab. | slab |
inspector_pod_slabtcpslabobjactive | The number of active objects in a TCP slab. | slab |
inspector_pod_slabtcpslabobjnum | The number of objects in a TCP slab. | slab |
inspector_pod_slabtcpslabobjsize | The size of each object in a TCP slab. The size varies based on the kernel version. | slab |
inspector_pod_ioioreadsyscall | The number of read operations performed by the process, such as the number of reads or preads. | io |
inspector_pod_ioiowritesyscall | The number of write operations performed by the process, such as the number of writes or pwrites. | io |
inspector_pod_ioioreadbytes | The number of bytes that the process reads from a file system (a block device in most cases). | io |
inspector_pod_ioiowritebyres | The number of bytes that the process writes into a file system. | io |
inspector_pod_net_softirq_schedslow100ms | The number of times that a network software interrupt waits more than 100 milliseconds to be scheduled. | net_softirq |
inspector_pod_net_softirq_excuteslow100ms | The number of times that the execution of a network software interrupt lasts more than 100 milliseconds. | net_softirq |
inspector_pod_abnormalloss (inspector_pod_packetloss_abnormal) | The number of times that packets are dropped by the kernel for reasons other than issues with the packets themselves (such as packet integrity issues or packet checksum errors). | packetloss |
inspector_pod_totalloss (inspector_pod_packetloss_total) | The total number of packets dropped by the kernel. | packetloss |
inspector_pod_virtcmdlatency100ms | The number of times that virtualized communication performed by the NIC lasts more than 100 milliseconds. | virtcmdlat |
inspector_pod_socketlatencyread100ms | The number of times that the user program requires more than 100 milliseconds to read content from the network socket file. | socketlatency |
inspector_pod_socketlatencywrite100ms | The number of times that the user program requires more than 100 milliseconds to write content to the network socket file. | socketlatency |
kernellatency_rxslow100ms | The number of times that the operating system kernel requires more than 100 milliseconds to receive a packet. | kernellatency |
kernellatency_txslow100ms | The number of times that the operating system kernel requires more than 100 milliseconds to send a packet. | kernellatency |
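If you collect these metrics in the Prometheus text exposition format, you can aggregate them with a few lines of code. The following Python snippet is a minimal sketch that sums the samples of each metric whose name starts with inspector_. The label names in the sample text are hypothetical because the labels that ACK Net Exporter attaches are not listed in this topic. The parser also assumes that label values do not contain a closing brace.

```python
def sum_by_metric(exposition_text, prefix="inspector_"):
    """Sum the sample values of each metric whose name starts with prefix."""
    totals = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            name, rest = line.split("{", 1)
            value_part = rest.rsplit("}", 1)[1]
        else:
            name, value_part = line.split(None, 1)
        if not name.startswith(prefix):
            continue
        # The first token after the labels is the value; a timestamp may follow.
        value = float(value_part.split()[0])
        totals[name] = totals.get(name, 0.0) + value
    return totals


# Hypothetical scrape output; the label names are assumptions.
sample = """\
# TYPE inspector_pod_udpnoports counter
inspector_pod_udpnoports{pod="app-a"} 2
inspector_pod_udpnoports{pod="app-b"} 3
inspector_pod_tcpextlistendrops{pod="app-a"} 7
go_goroutines 12
"""
print(sum_by_metric(sample))
```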
ACK Net Exporter events
The following table describes the operating system network-related events that can be captured by using the latest ACK Net Exporter version.
Probe name | Description |
---|---|
netiftxlat | A queuing discipline (qdisc) of traffic control needs to wait a long period of time before it can send the data packets in the queue. |
packetloss | Normal data packets are dropped by the operating system kernel. |
net_softirq | Packet scheduling by NET_RX or NET_TX is interrupted or packet processing is severely delayed due to kernel process software interruption. |
socketlatency | Processes in a pod require a long period of time to complete socket-related read and write operations. |
kernellatency | The kernel requires a long period of time to process packets at the network layer. |
virtcmdlatency | Communication between Virtio-net and the host requires a long period of time. |
tcpreset | TCP reset packets are received or sent. |
tcptwrcv | TCP receives and processes packets when TCP is in the TIMEWAIT state. |
Recommended Grafana configuration file
- If you use a Grafana version later than 8.4.0, click ACK Net Exporter-0.2.9.json to download the Grafana configuration file.
- If you use Grafana 8.4.0 or earlier, click ACK Net Exporter-legacy.json to download the Grafana configuration file.