Building ubiquitous observability infrastructure

Gartner: Observability becomes the strongest support for data-driven decision-making

Gartner, a leading global IT research and advisory firm, recently released its report on the top ten strategic technology trends for 2023. The report is organized around three themes: optimization, expansion, and development. "Applied observability" is once again among the highlighted trends.

Frances Karamouzis, Distinguished VP Analyst at Gartner, said: "In order to increase profits while continuing to accelerate digital transformation, enterprise IT executives need to shift their focus from cost savings to new forms of operational excellence. Observability feeds the observable data generated by users' digital operations back in a highly coordinated and integrated way, creating a decision-making loop that makes organizational decisions more effective. If it is planned and implemented as part of the strategy, observability will become the strongest support for data-driven decision-making."

However, as IT evolves rapidly, enterprises inevitably run into three obstacles when putting observability into practice. First, the booming ecosystem of open source and commercial observability products sits alongside traditional enterprise monitoring systems that increasingly fail to meet cloud native operations requirements, which leaves tools and data fragmented between old and new. CTOs and CIOs must decide how to choose among them and strike a balance. Second, as microservice and distributed architectures are applied ever more widely to enterprise business, the compute and storage costs of typical observability data, such as logs, are growing exponentially. In an increasingly difficult business climate, the investment in observability is high and hard to predict. Third, application scenarios often stop at single-point troubleshooting or basic monitoring and alerting, so observability infrastructure is launched with great fanfare while its return remains unclear. Together, these issues make it hard to persuade CTOs and CIOs to commit increasingly tight operations budgets and staff to building observability.

To address these problems, Alibaba Cloud, which has long been invested in the field of observability, launched the Alibaba Cloud Native Observability Suite (ACOS) in June this year. The suite consists of the Alibaba Cloud Prometheus service, the Alibaba Cloud Grafana service, and OpenTelemetry-based distributed tracing. These three de facto standards, the most popular in open source and the most widely integrated across the ecosystem, form the core of ACOS. The suite aims to standardize data across the full link through open standards for all Alibaba Cloud observability products, connect an enterprise's existing observability data assets, and integrate with Alibaba Cloud's application hosting platforms.

It covers user experience monitoring (UEM), application performance monitoring (APM), cloud service observability, cost management, incident response collaboration, and other scenarios, helping enterprises efficiently build an open, high-quality, and low-cost unified observability system.

Unique value of the cloud native observability suite ACOS

Compared with other commercial or open source observability solutions, the cloud native observability suite is fully compatible with open source standards, and optimized on top of them, across six stages: collection, storage, computation, alerting, query, and visualization. It also packages the extensive experience of Alibaba Group and Alibaba Cloud's very large user base into the product. This includes preset operational metrics, dashboards, and alert rule templates for more than 50 mainstream Alibaba Cloud services. From infrastructure to containers, from applications to user experience, and from cost analysis to operations efficiency analysis, users can achieve high-quality, full-link observability on the first day of onboarding.

Since its release, customers in many industries have rapidly built unified observability systems with the help of ACOS. Take AIA Life as an example. AIA Life moved its applications to containers and microservices to meet business and performance requirements. However, as access paths and deployments grew more complex, observing the operation of microservices and K8s and building full-stack observability became a major challenge. With ACOS, AIA Life extends observability across the whole R&D and production cycle, correlating and displaying R&D metrics alongside operations metrics so that R&D efficiency can be measured effectively. It also unifies the observation of multiple container clusters and application services, and integrates application performance metrics, global call chains, and logs for rapid root cause analysis. In addition, it provides multi-dimensional capabilities for command and decision-making, dashboard display, and alert delivery, which greatly improves the efficiency of operations services.

Upgrades to the cloud native observability suite ACOS

The three components of the Alibaba Cloud Native Observability Suite (ACOS) also received important upgrades at the Apsara Conference.

First, Alibaba Cloud Prometheus monitoring, the de facto standard for container observability, has extended its coverage from containers alone to the full stack. To help more enterprises build a unified observability system, Prometheus monitoring has become the default observability infrastructure for more than 50 Alibaba Cloud products. It is connected with the APM metrics, eBPF metrics, and OpenTelemetry metrics of the Application Real-Time Monitoring Service (ARMS), and it can aggregate Prometheus instances across enterprise ECS (non-K8s) environments, K8s clusters, and clusters outside Alibaba Cloud. This helps enterprises set up a unified observability center for global and heterogeneous architectures with one click.

While serving external customers, Alibaba Cloud Prometheus monitoring is continually refined through internal scenarios. It can currently support container observability for tens of millions of cores and time series storage for billions of time series. Targeted optimizations address the core technical difficulties of time series monitoring, such as collecting from massive numbers of dynamic monitoring targets, the divergence and convergence of high-cardinality time series, long-range queries, and false positives and false negatives under sudden traffic. These optimizations have made Alibaba Cloud Prometheus monitoring a ubiquitous, production-ready observability infrastructure.
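The cardinality problem mentioned above (every distinct combination of label values becomes its own time series) can be illustrated with a minimal Python sketch. The metric name, label names, and label values are hypothetical and chosen only for illustration; the sketch shows why series counts grow multiplicatively and does not describe how Alibaba Cloud Prometheus monitoring handles cardinality internally.

```python
from itertools import product
from typing import Dict, Iterator, List


def series_count(label_values: Dict[str, List[str]]) -> int:
    """Upper bound on distinct series for one metric:
    the product of the number of values of each label."""
    count = 1
    for values in label_values.values():
        count *= len(values)
    return count


def enumerate_series(metric: str, label_values: Dict[str, List[str]]) -> Iterator[str]:
    """Yield every possible series identifier in Prometheus-style notation."""
    names = sorted(label_values)
    for combo in product(*(label_values[n] for n in names)):
        labels = ",".join(f'{n}="{v}"' for n, v in zip(names, combo))
        yield f"{metric}{{{labels}}}"


# Hypothetical labels for an HTTP request counter.
labels = {
    "method": ["GET", "POST"],
    "status": ["200", "404", "500"],
    "pod": [f"pod-{i}" for i in range(100)],
}

base = series_count(labels)  # 2 * 3 * 100 series

# Adding one unbounded label (here a hypothetical user ID) multiplies the total.
with_user_id = series_count({**labels, "user_id": [str(i) for i in range(10_000)]})
```

In this sketch the three bounded labels yield 600 series, while adding a 10,000-value label raises the upper bound to 6,000,000, which is why unbounded label values are usually kept out of metric labels.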

While giving enterprises strong observability capabilities, Alibaba Cloud Prometheus monitoring has also introduced a new monthly subscription billing option. At the same business scale, the average cost is 60% lower than that of a self-built deployment. This meets the needs of users at different business scales and relieves as much of the pressure on enterprise operations costs as possible.

Second, as the observability interface, the Alibaba Cloud Grafana service is being upgraded to version 9.0. The new Prometheus and Loki query builders and the enhanced search and Explore features give users stronger data query and analysis capabilities and lower the barrier to creating dashboards and alerts. To cope with increasingly diverse and heterogeneous observability data sources, the Grafana service is also integrated with more than 20 observability storage services, such as Log Service (SLS) and Elasticsearch, which makes it easier for enterprises to build a unified observability interface for both operations and business. Enterprise features such as one-click import and export of self-built instances, automated report export, one-click data backup and recovery, and user operation auditing have been further enhanced.

Finally, to give enterprises' cloud applications a multi-dimensional view, the Application Real-Time Monitoring Service (ARMS) has also received a major upgrade. For data collection, ARMS fully supports the OpenTelemetry SDK, and metric data is stored and computed entirely according to the Prometheus standard, so users can add instrumentation for their own services and custom components. This avoids vendor lock-in while broadening the dimensions of observation. TraceExplorer provides unified querying of traces from multiple sources.
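As a rough illustration of what custom instrumentation in the Prometheus standard looks like, the following minimal Python sketch renders a counter in the Prometheus text exposition format. The `Counter` class, the metric name `orders_processed_total`, and the `result` label are hypothetical and written only for this example; in practice one would use the OpenTelemetry SDK or a Prometheus client library rather than hand-rolled code, and nothing here describes the ARMS implementation.

```python
from collections import defaultdict


class Counter:
    """A tiny, illustrative counter that renders Prometheus text exposition format.

    Simplification: label values are not escaped, so they must not contain
    backslashes, double quotes, or newlines.
    """

    def __init__(self, name, help_text, label_names):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values = defaultdict(float)

    def inc(self, amount=1.0, **labels):
        """Increase the series identified by the given label values."""
        if set(labels) != set(self.label_names):
            raise ValueError(f"expected labels {self.label_names}, got {tuple(labels)}")
        key = tuple(labels[n] for n in self.label_names)
        self._values[key] += amount

    def render(self):
        """Return the metric as HELP and TYPE lines followed by one line per series."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            pairs = ",".join(f'{n}="{v}"' for n, v in zip(self.label_names, key))
            lines.append(f"{self.name}{{{pairs}}} {value}")
        return "\n".join(lines) + "\n"


# Hypothetical instrumentation point inside a custom component.
orders = Counter("orders_processed_total", "Orders processed by the checkout component.", ["result"])
orders.inc(result="ok")
orders.inc(result="ok")
orders.inc(result="error")

exposition = orders.render()
```

Because the output follows an open text format, any Prometheus-compatible backend could scrape and store it, which is the property that helps avoid lock-in to a single vendor's agent.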

eBPF and Continuous Profiling are currently among the most popular areas of the observability field, and the Alibaba Cloud observability team is actively exploring both. At this conference, the team opened a preview of "lightweight application monitoring" based on eBPF, which helps enterprises quickly obtain non-intrusive application monitoring for all languages and see the global topology of a cluster in a timely manner.

In addition, the team jointly launched the Continuous Profiling feature with the Alibaba Dragonwell team. It continuously analyzes code-level performance overhead at very low cost and covers details that traditional traces, metrics, and logs cannot. It enables code-level location and proactive analysis of performance problems in production around the clock, giving application observability a richer perspective and finer granularity.

While continuing to explore more observability scenarios in service of Alibaba Group and a large number of enterprise users, Alibaba Cloud's observability products have won strong recognition from industry institutions in China and abroad for their complete capabilities, good ecosystem integration, and cost advantages. This year, Alibaba Cloud Application Real-Time Monitoring Service (ARMS) received advanced-level certification in the first batch of observability product evaluations by the China Academy of Information and Communications Technology (CAICT). Alibaba Cloud has also been included in the Gartner Magic Quadrant for APM and Observability for two consecutive years, and this year it is the only Chinese vendor selected.

In the era of cloud computing, observability makes the cloud easier to use and more efficient, and maximizes the value of business stability, security, and economy. Observability has become an essential core skill for every IT professional. Beyond observation itself, observability helps enterprises analyze data, gain insight, and achieve high-quality decision-making and business innovation. Alibaba Cloud will continue to promote the evolution and adoption of observability technologies, help enterprises achieve the most cost-effective observability, and truly realize high-quality digital transformation and innovation.
