The Application Real-Time Monitoring Service (ARMS) agent provides continuous profiling for applications at runtime. Like other data collection tools, continuous profiling introduces some performance overhead. However, ARMS implements several optimizations to minimize this overhead and keep applications running stably. This test report simulates real-world scenarios to measure the overhead that continuous profiling introduces under different levels of business traffic. You can use this report to evaluate the performance impact before you enable continuous profiling.
Test scenario
Flowchart:
The Java application is developed with Spring Web model-view-controller (MVC) and accesses MySQL and Redis based on the requests sent from Alibaba Cloud Performance Testing (PTS). For each ${mall-gateway}/case/api/v1/mysql/execute request, the application accesses MySQL 1 to 4 times. For each ${mall-gateway}/case/api/v1/redis/execute request, it accesses Redis 1 to 10 times. Each request type accounts for 50% of the total queries per second (QPS).
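The backend load implied by this traffic mix can be sketched as follows. This is a minimal simulation, not the actual test application: the 50/50 request split and the per-request access ranges come from the scenario above, while the assumption that access counts are uniformly distributed within those ranges is ours (the report states only the ranges).

```java
import java.util.Random;

// Sketch of the simulated traffic mix. Assumption: the number of backend
// accesses per request is uniformly distributed over the stated ranges
// (1 to 4 for MySQL, 1 to 10 for Redis).
public class TrafficMixSketch {
    // Returns {total MySQL queries, total Redis operations} for one second
    // of traffic at the given front-end QPS, alternating request types 50/50.
    static long[] simulate(int totalQps, Random rnd) {
        long mysqlCalls = 0, redisCalls = 0;
        for (int i = 0; i < totalQps; i++) {
            if (i % 2 == 0) {
                // ${mall-gateway}/case/api/v1/mysql/execute: 1-4 MySQL accesses
                mysqlCalls += 1 + rnd.nextInt(4);
            } else {
                // ${mall-gateway}/case/api/v1/redis/execute: 1-10 Redis accesses
                redisCalls += 1 + rnd.nextInt(10);
            }
        }
        return new long[] { mysqlCalls, redisCalls };
    }

    public static void main(String[] args) {
        // Under the uniform assumption, 2,000 QPS of front-end traffic
        // averages about 2,500 MySQL queries/s and 5,500 Redis ops/s.
        long[] calls = simulate(2_000, new Random(42));
        System.out.println("MySQL queries/s: " + calls[0]);
        System.out.println("Redis ops/s: " + calls[1]);
    }
}
```

This illustrates that the backends see a multiple of the front-end QPS, which is worth keeping in mind when interpreting the CPU figures below.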
Test environment
The stress testing source is provided by PTS.
The Java application, MySQL, and Redis are all deployed in the same Alibaba Cloud Container Service for Kubernetes (ACK) cluster. The node instance type is ecs.u1-c1m2.8xlarge, running Alibaba Cloud Linux 2.1903 LTS 64-bit.
Application pods are configured with 2 cores, 4 GB of memory, and two replicas each.
The ARMS agent for Java v4.2.1 is used.
Test procedure
1. Install the ARMS agent and configure a 10% sampling rate. Perform three stress tests at 500, 1,000, and 2,000 QPS, each lasting 1 hour. Before each stress test, warm up the Java application at 50 QPS for 5 minutes, then run the test for 30 minutes. After all metrics stabilize, dynamically enable all continuous profiling features (CPU diagnostics, memory diagnostics, and code diagnostics) and run the test for another 30 minutes. Observe the differences in the application's performance metrics (CPU overhead, memory overhead, and response time) before and after continuous profiling is enabled.
2. Set the sampling rate to 100% and repeat the stress tests in step 1. Compare the CPU overhead, memory overhead, and response time of the application.
Performance metric data without continuous profiling
| Item | CPU (10% sampling) | Memory (10% sampling) | Response time (10% sampling) | CPU (100% sampling) | Memory (100% sampling) | Response time (100% sampling) |
|---|---|---|---|---|---|---|
| 500 QPS | 8.112% | 13.52% | 55.5 ms | 8.416% | 13.62% | 56.5 ms |
| 1,000 QPS | 15.247% | 14.14% | 62.9 ms | 15.614% | 14.42% | 65.3 ms |
| 2,000 QPS | 30.550% | 14.64% | 70.6 ms | 30.945% | 14.67% | 71.1 ms |
Performance metric data with continuous profiling
| Item | CPU (10% sampling) | Memory (10% sampling) | Response time (10% sampling) | CPU (100% sampling) | Memory (100% sampling) | Response time (100% sampling) |
|---|---|---|---|---|---|---|
| 500 QPS | 8.912% | 15.52% | 55.6 ms | 9.316% | 15.71% | 56.6 ms |
| 1,000 QPS | 17.140% | 16.24% | 63.0 ms | 17.710% | 16.82% | 65.4 ms |
| 2,000 QPS | 34.650% | 16.84% | 70.7 ms | 35.245% | 16.89% | 71.3 ms |
Continuous profiling performance overhead
| Item | CPU (10% sampling) | Memory (10% sampling) | Response time (10% sampling) | CPU (100% sampling) | Memory (100% sampling) | Response time (100% sampling) |
|---|---|---|---|---|---|---|
| 500 QPS | +0.80% | +2.00% | +0.1 ms | +0.90% | +2.09% | +0.1 ms |
| 1,000 QPS | +1.893% | +2.10% | +0.1 ms | +2.096% | +2.40% | +0.1 ms |
| 2,000 QPS | +4.10% | +2.20% | +0.1 ms | +4.30% | +2.22% | +0.2 ms |
Conclusion
Enabling all continuous profiling features adds at most about 5 percentage points of CPU and memory overhead. Enabling only specific features, such as code diagnostics, reduces the overhead further.
Continuous profiling has only a marginal effect on application latency: across all tested traffic levels, response time increases by no more than 0.2 ms after continuous profiling is enabled.