Observability is the ability to infer the internal states of a system from its external outputs. The observability of Kubernetes includes monitoring and logging. Monitoring allows developers to keep track of the operations of a system. Logging facilitates diagnostics and troubleshooting. This topic describes the observability of Container Service for Kubernetes (ACK) and the observability of each layer to help you gain a comprehensive understanding of observability.

Observability of ACK

The observability of a system architecture that is built on top of ACK can be achieved at four layers. The four layers from bottom to top are: infrastructure, container performance, application performance, and business.

The following section describes the observability of each layer.

Infrastructure observability

This layer covers the underlying resources of ACK. It allows you to locate the traces of resource pools that are composed of pods and nodes, visualize topological relationships, and monitor infrastructure. For example, you can monitor the performance of hosts and basic network plug-ins.

Solution: Visualization of architecture discovery
Description: Business within an ACK cluster runs in resource pools that are composed of nodes. It is difficult to locate the traces and topological relationships of pods. Therefore, the challenges are how to monitor and visualize the loads of the ACK cluster and how to visualize the throughput of the ACK cluster. We recommend that you use the architecture discovery feature that is provided by Application High Availability Service (AHAS).
Scenario: Applicable to all scenarios.
References: For more information, see Architecture-aware monitoring.

Solution: Collection of infrastructure metrics
Description: Resource monitoring is the most commonly used method to monitor the underlying resources of ACK. You can monitor the usage of CPU, memory, and network resources. In ACK, the resource monitoring feature is integrated with CloudMonitor. By default, the CloudMonitor agent is installed for ACK clusters.
Scenario: Applicable to all scenarios.
References: For more information, see Monitor basic resources.

Container performance observability

This layer covers the ACK containers on which the system is built. It allows you to monitor the performance of clusters, containers, and container components, and to detect cluster events.

  • Collect the performance metrics of clusters and containers

    Solution: Integration of CloudMonitor into ACK
    Description: ACK is integrated with CloudMonitor. By default, the CloudMonitor agent is installed in ACK clusters to collect some of the performance metrics for clusters and containers. You can view the monitoring data in the ACK console.
    Scenario: Applicable to some scenarios. This solution allows you to customize the observability of basic performance metrics for containers.
    References: For more information, see Monitor basic resources.

    Solution: Prometheus Service
    Description: Prometheus is an open source service that is used to observe cloud-native monitoring metrics of containers. Prometheus Service is a managed monitoring service that is fully compatible with the open source Prometheus ecosystem. Prometheus Service monitors a wide array of components and provides multiple ready-to-use dashboards. Prometheus Service saves you the effort of managing underlying services, such as data storage, data presentation, and system maintenance. We recommend that you use Prometheus Service.
    Scenario: Applicable to all scenarios. This solution allows you to monitor microservices and cluster components. You can also use Prometheus Service to customize monitoring metrics.
    References: For more information, see Use Alibaba Cloud Prometheus Service to monitor an ACK cluster.

    Solution: Open source Prometheus
    Description: The open source version of Prometheus is available in the marketplace of the ACK console.
    Scenario: Applicable to all scenarios. This solution allows you to monitor microservices and cluster components. You can also use open source Prometheus to customize monitoring metrics.
    References: For more information, see Use open source Prometheus to monitor an ACK cluster.
  • Monitor the events of clusters and containers

    Solution: Event monitoring
    Description: Event monitoring is a monitoring method provided by Kubernetes. It makes up for the disadvantages of resource monitoring in terms of timeliness, accuracy, and scenarios. Developers can diagnose cluster anomalies based on the events that are collected in real time. We recommend that you use Log Service to monitor events.
    Scenario: Applicable to all scenarios.
    References: For more information, see Event monitoring.
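
To illustrate how events can support diagnosis, the following minimal sketch filters Warning events and counts them by object and reason. It assumes event data shaped like the JSON list returned by `kubectl get events -o json`, using the standard Kubernetes Event fields `type`, `reason`, and `involvedObject`. The sample data and the function names `warning_events` and `count_by_reason` are hypothetical, and the sketch does not use Log Service or any ACK API.

```python
import json
from collections import Counter

# Hypothetical sample, shaped like the output of `kubectl get events -o json`.
SAMPLE = """
{"items": [
  {"type": "Normal", "reason": "Scheduled",
   "message": "Successfully assigned default/web-1 to node-a",
   "involvedObject": {"kind": "Pod", "namespace": "default", "name": "web-1"}},
  {"type": "Warning", "reason": "BackOff",
   "message": "Back-off restarting failed container",
   "involvedObject": {"kind": "Pod", "namespace": "default", "name": "web-2"}},
  {"type": "Warning", "reason": "BackOff",
   "message": "Back-off restarting failed container",
   "involvedObject": {"kind": "Pod", "namespace": "default", "name": "web-2"}},
  {"type": "Warning", "reason": "FailedScheduling",
   "message": "0/3 nodes are available",
   "involvedObject": {"kind": "Pod", "namespace": "default", "name": "web-3"}}
]}
"""


def warning_events(event_list):
    """Return only the events whose type is Warning."""
    return [e for e in event_list.get("items", []) if e.get("type") == "Warning"]


def count_by_reason(events):
    """Count events per (kind/namespace/name, reason) pair."""
    counts = Counter()
    for event in events:
        obj = event.get("involvedObject", {})
        target = f'{obj.get("kind")}/{obj.get("namespace")}/{obj.get("name")}'
        counts[(target, event.get("reason"))] += 1
    return counts


if __name__ == "__main__":
    warnings = warning_events(json.loads(SAMPLE))
    # Print the noisiest object/reason pairs first.
    for (target, reason), n in count_by_reason(warnings).most_common():
        print(f"{n:3d}  {reason:18s} {target}")
```

Grouping by object and reason surfaces repeated failures, such as a pod that keeps restarting, without reading every event message.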

Application performance observability

This layer covers applications that are deployed in ACK and includes metrics, tracing, and logging. For example, you can monitor the number of threads of a Java application after you deploy the application in an ACK cluster.

Solution: APM without code rewriting
Description: We recommend that you use Application Real-Time Monitoring Service (ARMS) to monitor application performance. ARMS is an Alibaba Cloud monitoring service for Application Performance Management (APM). After you deploy the ARMS agent for Java or PHP applications that are deployed in an ACK cluster, you can use ARMS to monitor the applications without code rewriting. ARMS allows you to locate abnormal and slow API operations, reproduce the parameters of API calls, detect memory leaks, and discover application bottlenecks. This significantly improves the efficiency of online diagnostics.
Scenario: Applicable to Java and PHP applications. You can use this solution without the need to modify the code of the applications.
References: For more information, see Monitor application performance.

Solution: APM with code rewriting
Description: Tracing Analysis provides developers with a set of tools to diagnose performance bottlenecks in a distributed application architecture. These tools include trace mapping, request counter, trace topology, and application dependency analytics. Tracing Analysis improves the efficiency of microservice development and diagnostics. Tracing Analysis is compatible with open source SDKs and supports the OpenTracing standard.
Scenario: Applicable to all scenarios. You can use this solution to monitor microservices and applications in diverse programming languages. This solution requires you to modify the code of the applications.
References: For more information, see Enable distributed tracing in ASM.

Business observability

This layer covers the business that runs in the system built on top of ACK. For example, after you deploy a highly available and scalable website based on ACK, you can view statistics such as page views (PVs) and unique visitors (UVs). Business observability also allows you to audit application costs.

Solution: Custom logging and monitoring
Description: We recommend that you use Log Service to observe custom metrics. You can customize the content and format of the logs of application systems, use Log Service to collect log data, and configure dashboards in Log Service. This way, you can observe your business or perform system audits.
Scenario: Applicable to all scenarios. You can use this solution to monitor traffic, audit costs, and perform business order statistics.
References: For more information, see Collect log data from containers by using Log Service.
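
As an illustration of how business statistics can be derived from custom logs, the following minimal sketch counts PVs and UVs from access log lines. The log format (a timestamp followed by `key=value` pairs with `user` and `path` keys) and the function names are assumptions made for this example. In practice, Log Service would collect the logs and compute such statistics in dashboards.

```python
def parse_line(line):
    """Parse 'timestamp key=value ...' into a dict; return None if malformed."""
    parts = line.split()
    if len(parts) < 2:
        return None
    record = {"time": parts[0]}
    for token in parts[1:]:
        key, sep, value = token.partition("=")
        if not sep:
            return None  # a token without '=' means the line is not in our format
        record[key] = value
    return record


def pv_uv(lines):
    """Return (page views, unique visitors) for an iterable of log lines."""
    page_views = 0
    visitors = set()
    for line in lines:
        record = parse_line(line)
        if record is None or "user" not in record or "path" not in record:
            continue  # skip lines that are not page-view records
        page_views += 1
        visitors.add(record["user"])
    return page_views, len(visitors)


# Hypothetical application log in the assumed format.
SAMPLE_LOG = [
    "2024-01-01T12:00:00Z user=alice path=/home",
    "2024-01-01T12:00:05Z user=bob path=/home",
    "2024-01-01T12:00:09Z user=alice path=/cart",
    "garbage line here",
]

if __name__ == "__main__":
    pv, uv = pv_uv(SAMPLE_LOG)
    print(f"PV={pv} UV={uv}")
```

Keeping the log format simple and consistent is what makes this kind of aggregation straightforward, whether it runs in a script or in a log analysis service.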

References