Container Service for Kubernetes: Observability system overview

Last Updated: Jan 22, 2024

Observability is the ability to infer the internal state of a system from its external outputs. In Kubernetes, observability covers monitoring and logging: monitoring lets developers keep track of how a system is operating, and logging supports diagnostics and troubleshooting. This topic describes the observability system of Container Service for Kubernetes (ACK) and the observability of each of its layers.

Observability of ACK

The architecture of an observability system built on top of ACK consists of four layers. From bottom to top, they are infrastructure, container performance, application performance, and business.

The following section describes the observability of each layer.

Infrastructure observability

Infrastructure observability covers the underlying resources of ACK. It allows you to trace activity across resource pools that consist of pods and nodes, visualize topological relationships, and monitor infrastructure. For example, you can monitor the performance of hosts and basic network plug-ins.

Solution | Description | Scenario | References

Visualized architecture discovery

Workloads in an ACK cluster run in resource pools that are composed of nodes, which makes it difficult to trace pods and their topological relationships. The challenges are how to visualize the status of Kubernetes workloads and how to visualize the traffic throughput of Kubernetes clusters.

Kubernetes monitoring in ACK integrates extended Berkeley Packet Filter (eBPF) and Managed Service for Prometheus to support metric collection, application tracing, log analysis, and event monitoring. It lets you monitor ACK clusters end to end and adds network monitoring and visualized architecture awareness to ACK clusters. It gives developers and O&M engineers a non-intrusive observability solution.

All scenarios are supported.

  • Monitor network traffic between nodes and pods in ACK clusters.

  • Monitor network traffic between pods at Layer 4 and above, including connections established over TCP, HTTP, and other protocols, and monitor DNS resolution.

For more information, see Cluster topology monitoring.

Kernel-level container monitoring

ACK provides operating system kernel-level container monitoring based on System Observer Monitoring (SysOM). This capability can help you better deploy and migrate containerized applications and monitor containers.

All scenarios are supported.

For more information, see Kernel-level container monitoring based on SysOM.

Collection of infrastructure metrics

Resource monitoring is the most common way to monitor the underlying resources of ACK. You can monitor the usage of CPU, memory, and network resources. Resource monitoring in ACK is integrated with CloudMonitor, and the CloudMonitor agent is automatically installed in newly created ACK clusters.

All scenarios are supported.

For more information, see Monitor basic resources.
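The pod-to-pod traffic monitoring described under visualized architecture discovery amounts to aggregating network flows into a topology graph. The following is a minimal sketch of that idea, not how Kubernetes monitoring in ACK is implemented. The flow records, pod names, and the `build_topology` helper are hypothetical; in a real cluster the flows would be captured by eBPF probes rather than listed by hand.

```python
from collections import defaultdict

# Hypothetical flow records: (source pod, destination pod, protocol, bytes).
# Real data would come from eBPF probes, not from a hard-coded list.
FLOWS = [
    ("frontend-1", "cart-1", "HTTP", 1200),
    ("frontend-1", "cart-1", "HTTP", 800),
    ("cart-1", "redis-0", "TCP", 300),
    ("frontend-1", "kube-dns", "DNS", 90),
]


def build_topology(flows):
    """Aggregate flow records into edges keyed by (source, destination, protocol)."""
    edges = defaultdict(lambda: {"bytes": 0, "flows": 0})
    for src, dst, proto, size in flows:
        edge = edges[(src, dst, proto)]
        edge["bytes"] += size
        edge["flows"] += 1
    return dict(edges)


topology = build_topology(FLOWS)
for (src, dst, proto), stats in sorted(topology.items()):
    print(f"{src} -> {dst} [{proto}] {stats['bytes']} bytes in {stats['flows']} flows")
```

Each edge in the result corresponds to one line in a topology view, with the byte and flow counts available as edge weights.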

Container performance observability

Container performance observability covers the container abstractions in the observability system built on top of ACK. It allows you to monitor the performance of clusters, containers, and container components, and to detect cluster events.

Collect the performance metrics of clusters and containers

Solution | Description | Scenario | References

Integration of CloudMonitor with ACK

By default, the CloudMonitor agent is installed and integrated in newly created ACK clusters. The agent can collect cluster and container metrics and allows you to view these metrics in the ACK console.

Only certain scenarios are supported.

Provide custom container performance metrics and observability.

For more information, see Monitor basic resources.

Managed Service for Prometheus

Prometheus is an open source monitoring system that is widely used to collect cloud-native and container metrics. Managed Service for Prometheus is a managed monitoring service that is fully integrated with the open source Prometheus ecosystem. It monitors a wide array of components and provides multiple ready-to-use dashboards. With Managed Service for Prometheus, you do not need to build a self-managed monitoring system or worry about the underlying data storage, data display, or system O&M. We recommend that you use Managed Service for Prometheus.

All scenarios are supported, such as microservices scenarios, cluster component metric collection, and observability customization for advanced monitoring features.

For more information, see Managed Service for Prometheus.

Open source Prometheus

The open source version of Prometheus is available in the marketplace of the ACK console.

All scenarios are supported, such as microservices (Service Mesh) scenarios, cluster component metric collection, and observability customization for advanced monitoring features.

For more information, see Use open source Prometheus to monitor an ACK cluster.
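Both Prometheus options above collect metrics by scraping text that applications expose in the Prometheus text exposition format. The following is a minimal sketch of that format using only the Python standard library. The metric name, labels, and the `render_metric` helper are hypothetical, and a real application would normally use a Prometheus client library instead of formatting the text by hand.

```python
def render_metric(name, help_text, metric_type, samples):
    """Render one metric family in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs. Label values are not
    escaped here, so this sketch assumes they contain no quotes, backslashes,
    or newlines.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical business counter exposed by an application running in a pod.
exposition = render_metric(
    "orders_processed_total",
    "Orders processed by the hypothetical shop service.",
    "counter",
    [({"status": "ok"}, 42), ({"status": "failed"}, 3)],
)
print(exposition, end="")
```

Serving this text over HTTP on a metrics endpoint is what allows a Prometheus server to scrape it.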

Monitor the events of clusters and containers

Solution | Description | Scenario | References

Event monitoring

Event monitoring outperforms resource monitoring in timeliness and accuracy and supports more scenarios. Developers can diagnose cluster anomalies based on the events that are collected in real time. We recommend that you use Simple Log Service to monitor events.

All scenarios are supported.

For more information, see Overview of event monitoring.
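Diagnosing anomalies from events usually starts with filtering out the warnings and grouping them by cause. The following is a minimal sketch of that step. The sample events are hypothetical and abridged, although the `type`, `reason`, and `involvedObject` field names follow the Kubernetes Event object; `summarize_warnings` is an illustrative helper, not part of any ACK or Simple Log Service API.

```python
from collections import Counter

# Hypothetical events shaped like abridged Kubernetes Event objects.
EVENTS = [
    {"type": "Normal", "reason": "Scheduled", "involvedObject": {"kind": "Pod", "name": "web-1"}},
    {"type": "Warning", "reason": "BackOff", "involvedObject": {"kind": "Pod", "name": "web-2"}},
    {"type": "Warning", "reason": "FailedScheduling", "involvedObject": {"kind": "Pod", "name": "job-7"}},
    {"type": "Warning", "reason": "BackOff", "involvedObject": {"kind": "Pod", "name": "web-2"}},
]


def summarize_warnings(events):
    """Count Warning events per (reason, object name), most frequent first."""
    counts = Counter(
        (event["reason"], event["involvedObject"]["name"])
        for event in events
        if event["type"] == "Warning"
    )
    return counts.most_common()


warning_summary = summarize_warnings(EVENTS)
for (reason, name), count in warning_summary:
    print(f"{reason} on {name}: {count}")
```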

Application performance observability

Application performance observability covers application metrics, tracing, and logging in the observability system built on top of ACK. For example, you can deploy a Java application in ACK and monitor the number of threads of the application.

Solution | Description | Scenario | References

Non-intrusive APM for monitoring Java applications

We recommend that you use Application Real-Time Monitoring Service (ARMS) to monitor application performance. ARMS is an Alibaba Cloud service for application performance management (APM). To monitor a Java application deployed in an ACK cluster, you only need to install the ARMS component for the application. No code changes are required. You can use the component to locate faulty interfaces and slow interfaces, tune parameters, detect memory leaks, and identify system performance bottlenecks. This greatly improves troubleshooting efficiency.

Only certain scenarios are supported, such as Java application monitoring. The solution is non-intrusive.

For more information, see Monitor application performance.

APM with code rewriting

Tracing Analysis provides developers with a set of tools to diagnose performance bottlenecks in a distributed application architecture. These tools include trace mapping, request counter, trace topology, and application dependency analytics. Tracing Analysis improves the efficiency of microservices development and diagnostics. Tracing Analysis supports various open source SDKs, and supports the standards of OpenTracing and OpenTelemetry.

All scenarios are supported, including microservices (Service Mesh) and applications that use different programming languages. The solution complies with the OpenTelemetry standards. You must modify your application code to use this solution.

For more information, see Enable distributed tracing in ASM.

Tracing Analysis provides a set of tools for you to develop distributed applications. These tools include trace mapping, call request statistics, trace topology, and application dependency analysis. You can use these tools to analyze and diagnose performance bottlenecks in a distributed application architecture and make microservice development and diagnostics more efficient.

The solution complies with the standards of OpenTracing and supports open source tracing platforms, such as Jaeger and Zipkin. The solution supports applications developed based on the following programming languages: Java, PHP, Go, Python, Node.js, .NET, C++, Ruby, and Swift.

For more information, see What is Tracing Analysis OpenTelemetry Edition? and Connection Description.
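The code changes that distributed tracing requires mostly concern passing trace context from one service to the next. The following is a minimal sketch of that propagation step. It assumes the W3C Trace Context `traceparent` header, which is the default propagation format in OpenTelemetry; the source does not state which format Tracing Analysis uses. The helper names are hypothetical, and in practice an OpenTelemetry SDK would generate and forward this header for you.

```python
import re
import secrets

# traceparent layout: version "00", 32-hex trace ID, 16-hex span ID, 2-hex flags.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled=True):
    """Start a new trace: random trace ID and span ID, sampled flag set or cleared."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def child_traceparent(parent):
    """Build the header for a downstream call.

    Keeps the trace ID and flags of the incoming header and issues a new
    span ID. Falls back to a new trace if the incoming header is malformed.
    """
    match = TRACEPARENT_RE.match(parent)
    if match is None:
        return new_traceparent()
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


incoming = new_traceparent()
outgoing = child_traceparent(incoming)
print("incoming:", incoming)
print("outgoing:", outgoing)
```

Because every hop reuses the same trace ID, the tracing backend can join the spans from all services into one trace and derive the trace topology from them.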

Business observability

Business observability covers the business layer of the observability system built on top of ACK. For example, after you deploy a highly available and scalable website on ACK, you can view statistics such as page views (PVs) and unique visitors (UVs). Business observability also allows you to audit application costs.

Solution | Description | Scenario | References

Custom logging and monitoring

We recommend that you use Simple Log Service to observe custom metrics. You can customize the content and format of application logs, use Simple Log Service to collect logs, and then configure dashboards in Simple Log Service to monitor your businesses or perform system auditing.

All scenarios are supported, such as traffic monitoring, cost auditing and statistics, and order trend analysis.

For more information, see Collect log data from containers by using Simple Log Service.

Custom dashboards with Managed Service for Grafana

Managed Service for Grafana is a cloud-native O&M data visualization platform. This platform provides O&M-free Grafana runtime environments that can be quickly launched. By default, Managed Service for Grafana can ingest data from Alibaba Cloud services such as database services, Message Queue, Managed Service for Prometheus, and Simple Log Service. Managed Service for Grafana also provides a variety of dashboards to allow you to monitor and maintain systems in a fine-grained manner.

Managed Service for Grafana allows you to analyze and view metrics, logs, and traces. You do not need to worry about server configurations or software updates. This greatly simplifies your O&M work. Empowered by the cloud-native capabilities of Alibaba Cloud, Managed Service for Grafana also comes with higher security and availability.

All scenarios are supported.

You can use Managed Service for Grafana to configure dashboards based on your business requirements. For example, you can create real-time dashboards to monitor PVs and UVs.

For more information, see What is Grafana?

Business traffic and business health monitoring with ARMS Browser Monitoring

ARMS Browser Monitoring is intended for web application, Weex, and mini-program monitoring. It monitors the health of web applications and mini-programs by measuring web page loading speeds (speed testing), web page stability (JS error diagnostics), and the success rate of external service calls (APIs).

This solution is suitable for front-end applications that use JavaScript.

For more information, see What is ARMS Browser Monitoring?
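The PV and UV statistics mentioned for custom logging can be derived from structured application logs. The following is a minimal sketch of that calculation. The JSON log format, the `event` and `visitor_id` field names, and the `count_pv_uv` helper are all hypothetical; in Simple Log Service the same aggregation would typically be written as a query behind a dashboard rather than in application code.

```python
import json

# Hypothetical application log lines in a custom JSON format.
LOG_LINES = [
    '{"event": "page_view", "path": "/", "visitor_id": "u1"}',
    '{"event": "page_view", "path": "/cart", "visitor_id": "u1"}',
    '{"event": "order_created", "visitor_id": "u1"}',
    '{"event": "page_view", "path": "/", "visitor_id": "u2"}',
    "not json",
]


def count_pv_uv(lines):
    """Return (page views, unique visitors), skipping lines that are not page views."""
    page_views = 0
    visitors = set()
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore malformed lines instead of failing the whole batch
        if not isinstance(record, dict) or record.get("event") != "page_view":
            continue
        page_views += 1
        visitors.add(record.get("visitor_id"))
    return page_views, len(visitors)


pv, uv = count_pv_uv(LOG_LINES)
print(f"PV={pv} UV={uv}")  # prints "PV=3 UV=2"
```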
