
Application Real-Time Monitoring Service: Performance test report of the ARMS agent for Go

Last Updated: Mar 11, 2026

The Application Real-Time Monitoring Service (ARMS) agent for Go instruments your application at compile time, providing observability without manual code changes. This report benchmarks the resulting performance overhead across three traffic levels (500, 1,000, and 2,000 QPS) and two sampling rates (10% and 100%). Review these results to assess the impact before you connect your application to Application Monitoring.

Key findings

  • CPU: Under 10% additional usage across all tested scenarios.

  • Memory: Under 1.3% increase at all traffic levels.

  • Response time: 1 to 2 ms increase, even at 2,000 queries per second (QPS).

  • Sampling rate: Switching from 10% to 100% sampling adds roughly 2% CPU overhead. Memory and response time differences are negligible.

Test setup

Architecture

Figure: Test architecture

The test application is a Go service built with the standard library's net/http package that handles two types of requests:

  • MySQL queries: 50% of total QPS

  • Redis operations: 50% of total QPS

Performance Testing (PTS) generates the load. The Go application, MySQL, and Redis all run in the same Container Service for Kubernetes (ACK) cluster.

Environment

| Component | Specification |
| --- | --- |
| Load generator | Performance Testing (PTS) |
| Kubernetes cluster | ACK with ecs.c6.2xlarge nodes |
| Node OS | Alibaba Cloud Linux 3.2104 LTS 64-bit |
| Pod resources | 1 core, 2 GB memory, 2 replicas |
| Agent version | ARMS agent for Go V1.0.0 |

Procedure

Each stress test runs for 1 hour, preceded by a 3-minute warm-up at 100 QPS.

  1. Establish baseline. Run stress tests at 500, 1,000, and 2,000 QPS without the agent. Record CPU, memory, and response time as baseline metrics.

  2. Test with 10% sampling. Install the ARMS agent for Go, set the sampling rate to 10% in the sampling policy, and repeat the same stress tests.

  3. Test with 100% sampling. Set the sampling rate to 100% and repeat the stress tests.

Note

All basic Application Monitoring features are enabled during testing: metrics, traces, and runtime monitoring. All plug-ins are also enabled. Runtime monitoring adds approximately 0.5% CPU utilization. To reduce overhead, disable runtime monitoring in application settings.

Results

Baseline (no agent)

| Traffic level | CPU | Memory | Response time |
| --- | --- | --- | --- |
| 500 QPS | 2.42% | 0.71% | 30 ms |
| 1,000 QPS | 4.21% | 0.91% | 30 ms |
| 2,000 QPS | 8.5% | 1.41% | 30 ms |
Note
  • CPU: Percentage of total CPU consumed by the application pods.

  • Memory: Percentage of total memory consumed by the application pods. Because pod memory grows naturally until it reaches the requests value, this report uses the actual memory reading at the end of each 1-hour test.

  • Response time: Average across all requests, in milliseconds.

With the ARMS agent

| Traffic level | Sampling rate | CPU | Memory | Response time |
| --- | --- | --- | --- | --- |
| 500 QPS | 10% | 5.15% | 1.25% | 30 ms |
| 1,000 QPS | 10% | 8.42% | 1.52% | 31 ms |
| 2,000 QPS | 10% | 16.2% | 2.5% | 31 ms |
| 500 QPS | 100% | 5.25% | 1.85% | 31 ms |
| 1,000 QPS | 100% | 10.48% | 2.02% | 32 ms |
| 2,000 QPS | 100% | 18.45% | 2.63% | 32 ms |

Overhead (difference from baseline)

| Traffic level | Sampling rate | CPU | Memory | Response time |
| --- | --- | --- | --- | --- |
| 500 QPS | 10% | +2.73% | +0.54% | 0 ms |
| 1,000 QPS | 10% | +4.21% | +0.61% | +1 ms |
| 2,000 QPS | 10% | +7.7% | +1.09% | +1 ms |
| 500 QPS | 100% | +2.83% | +1.14% | +1 ms |
| 1,000 QPS | 100% | +6.27% | +1.11% | +2 ms |
| 2,000 QPS | 100% | +9.95% | +1.22% | +2 ms |
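Each overhead row is simply the with-agent reading minus the baseline reading at the same traffic level. A small helper makes the arithmetic explicit (the function name and sign formatting are illustrative):

```go
package main

import "fmt"

// overhead returns the agent-minus-baseline delta, formatted with an
// explicit sign to match the convention used in the table above.
func overhead(baseline, withAgent float64) string {
	return fmt.Sprintf("%+.2f", withAgent-baseline)
}

func main() {
	// 2,000 QPS at 100% sampling: baseline CPU 8.5%, with agent 18.45%.
	fmt.Println(overhead(8.5, 18.45) + "%") // +9.95%
}
```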

Conclusions

  1. CPU overhead stays under 10%. At the highest tested load (2,000 QPS) with 100% sampling, CPU increases by 9.95%.

  2. Memory overhead is minimal. The largest observed increase is 1.22%, at 2,000 QPS with 100% sampling.

  3. Response time impact is negligible. At 2,000 QPS, latency increases by 1 ms (10% sampling) or 2 ms (100% sampling).

  4. Sampling rate trade-off is small. Moving from 10% to 100% sampling adds roughly 2% more CPU overhead, with marginal effects on memory and response time. To lower overhead further, reduce the sampling rate in the sampling policy or disable runtime monitoring in application settings (saves approximately 0.5% CPU).