The Application Monitoring sub-service of Application Real-Time Monitoring Service (ARMS) is an application performance management (APM) service. After you install an ARMS agent for your application, you can comprehensively monitor the application without modifying your code. You can also keep track of the status of the application, quickly locate abnormal and slow interfaces, identify performance bottlenecks, and reconstruct request parameters. This makes error diagnosis much more efficient. Application Monitoring provides the following benefits:
Out-of-the-box use | Guaranteed stability | Unlimited scale |
Advanced diagnostic capability | Integration capability and open source compatibility | Cost-effectiveness |
Comparison between Application Monitoring and open source APM services
Item | Application Monitoring | Open source APM service |
Resource purchase and system construction | Resources are fully managed by Alibaba Cloud. | You must purchase related resources and deploy systems on your own. |
O&M cost | No O&M operations are required. | Routine O&M operations are required. |
Application integration | Applications deployed in ACK or ECS can be integrated into Application Monitoring with simple configurations. The ARMS agent can be automatically upgraded. | Applications are manually integrated, and the agent is manually upgraded. This requires a heavy workload. |
Performance overhead | The performance overhead is less than 5%. By using methods such as lazy loading, lossless calculation, trace throttling, sampling protection, automatic URL convergence, long text compression and encoding, and memory control, Application Monitoring ensures the persistent stability of the ARMS agent. | In high-throughput scenarios, the performance overhead exceeds 10%, and the stability cannot be guaranteed. |
SLA guarantee | A service availability of 99.5% is provided based on the SLA. This is backed by measures such as multi-zone disaster recovery, service level objective (SLO) monitoring and alerting, and emergency response rotation. | Not supported. |
Performance and horizontal scaling | Automatic horizontal scaling is supported. A maximum of 100,000 nodes can be added. | Distributed horizontal scaling capabilities are not supported. |
Application and instance tags | You can query the topology, monitoring data, and trace data by tag. | Not supported. |
Dubbo instrumentation | The durations of routing, addressing, and encoding are recorded in detail. | Instrumentation is more coarse-grained. |
Lossless calculation | Agent-side pre-aggregation and adaptive sampling are used to collect application traces. This ensures that the sampling rate does not affect the accuracy of the collected data. | Not supported. You can only rely on sampling. |
Service interface monitoring | You can construct service requests in a visualized manner without modifying the service code. A wide range of performance metrics and diagnostic capabilities that fit your business are provided. | You need to modify the service code. |
Interface name convergence | Automatic convergence and manual convergence based on regular expressions can be directly configured without restarting the application. | You must manually modify the configuration file and restart the application. |
Local method stack analysis | Local method stack information related to slow calls is automatically saved. This helps you analyze performance bottlenecks that occur during the execution of local method stacks. | You can manually save local method stack information only for specific services. |
Thread profiling | Thread-specific statistics of CPU time consumption and the number of threads for each type are provided to simulate the code execution process. | Not supported. |
Thread pool monitoring and connection pool monitoring | You can monitor specific thread pools, such as Tomcat and Dubbo, and specific connection pools, such as Druid. | Not supported. |
Exception analysis and error analysis | Exception analysis and error analysis views are provided. | Not supported. |
End-to-end trace query | Integrated with the Browser Monitoring sub-service of ARMS, Application Monitoring connects the user interface to the server application. End-to-end trace query is supported. | Not supported. |
Insights | Based on the SRE experience accumulated in business scenarios, intelligent insight capabilities help you troubleshoot complex issues as well as traffic and latency spikes. | Not supported. |
Memory snapshot | You can create and analyze memory snapshots to troubleshoot memory issues such as memory leaks and memory waste. | Not supported. |
Arthas integration | Application Monitoring uses the bytecode enhancement technology to display the details of application runtime, such as method parameters, exceptions, and returned values, without restarting processes. | Not supported. |
Alert rule | Application Monitoring provides more than 50 preset alert rules for metrics about JVMs, hosts, and interfaces. You can configure common operators, perform period-over-period comparison, and specify threshold values in the ARMS console. | You must manually modify the configuration file. Only basic operators such as equal (=), less than (<), and greater than (>) are supported. |
Alert notification | Integrated with the Alert Management sub-service of ARMS, Application Monitoring supports multi-channel alert push, alert workflows, grouping, compression, and denoising to help you close the loop of IT service management. | You must manually build components to configure alerting, and such setups cannot effectively prevent false positives or alert storms. |
Prometheus integration | Application metrics collected and processed by Application Monitoring are stored in Managed Service for Prometheus instances that belong to your Alibaba Cloud account. Default Grafana dashboards are provided. You can use PromQL to customize and develop the dashboards. | Not supported. |
Cost | You can start or stop using Application Monitoring at any time, and billing starts or stops accordingly. If you use Application Monitoring to monitor applications deployed in ACK and purchase resource plans, you can get a discount and further reduce costs. | You need to build a complete set of components and properly manage the capacity. If a large number of requests are initiated, complete dependence on sampling results in huge costs. |
Technical support | You can use the ticket system to obtain technical support from SRE experts. | Not supported. |
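
The "Lossless calculation" row in the table above says that pre-aggregation keeps collected data accurate regardless of the sampling rate. The following Python sketch illustrates the general idea only and is not the ARMS agent's implementation: metrics are aggregated over every request before sampling, and sampling decides only which detailed traces are kept. The names `SpanStats` and `process`, the request tuple layout, and the interface name `/api/order` are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class SpanStats:
    """Pre-aggregated metrics for one interface (hypothetical structure)."""
    count: int = 0
    errors: int = 0
    total_ms: float = 0.0


def process(requests, sample_rate, rng):
    """Aggregate metrics over ALL requests, but keep only a sample of traces."""
    stats = {}
    kept_traces = []
    for name, duration_ms, is_error in requests:
        s = stats.setdefault(name, SpanStats())
        # Every request updates the counters, sampled or not.
        s.count += 1
        s.errors += int(is_error)
        s.total_ms += duration_ms
        # Sampling only decides whether the detailed trace is stored.
        if rng.random() < sample_rate:
            kept_traces.append((name, duration_ms, is_error))
    return stats, kept_traces


rng = random.Random(0)
# 1,000 synthetic requests: every tenth one is an error.
requests = [("/api/order", 10.0 + i % 5, i % 10 == 0) for i in range(1000)]
stats, traces = process(requests, sample_rate=0.1, rng=rng)
order = stats["/api/order"]
print(order.count, order.errors, order.total_ms / order.count, len(traces))
```

In this sketch the request count, error count, and average latency are exact whatever `sample_rate` is, while the number of stored traces shrinks with the rate. A setup that derives metrics from sampled traces alone would have to estimate these values instead.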
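
The "Interface name convergence" row in the table above mentions convergence based on regular expressions. As a rough illustration of what such convergence does, the following Python sketch collapses URL paths that differ only in variable segments into one interface name. The rules in `RULES` and the function `converge` are hypothetical examples and do not reflect the rule syntax used in the ARMS console.

```python
import re
from collections import Counter

# Hypothetical convergence rules: (compiled pattern, replacement).
# The lookahead (?=/|$) makes a rule match whole path segments only.
RULES = [
    (re.compile(r"/[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}(?=/|$)"), "/{uuid}"),
    (re.compile(r"/\d+(?=/|$)"), "/{id}"),
]


def converge(path):
    """Apply each rule in order and return the converged interface name."""
    for pattern, replacement in RULES:
        path = pattern.sub(replacement, path)
    return path


paths = ["/user/1/orders/10", "/user/2/orders/20", "/user/3/orders/30", "/health"]
# Count requests per converged name instead of per raw URL.
counts = Counter(converge(p) for p in paths)
print(counts)
```

Without convergence, each distinct user or order ID would appear as a separate interface, which fragments the metrics. With the two rules above, the three order URLs are counted under a single name.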