Application Real-time Monitoring Service (ARMS) lets you monitor the performance and availability of the ARMS console through a performance dashboard in Grafana. You can also receive notifications when certain performance problems occur in the ARMS system itself.

Background information

System monitoring is essential for maintaining a highly available system and optimizing user experience. A monitoring service lets you track the performance of your systems so that you can prevent certain performance issues and react promptly when problems occur, which keeps the damage to a minimum. The monitoring service itself must therefore perform well too, so that the results it presents are accurate and reflect the actual state of the systems it monitors.

For this reason, ARMS lets you monitor its own performance. The performance dashboard that ARMS provides in Grafana shows the state of the ARMS console in real time. You can also receive notifications when certain problems occur in the ARMS system itself.

ARMS performance dashboard

The ARMS performance dashboard presents metrics that track the availability and performance of the ARMS console, including the console's queries per second (QPS), API success rate, average response time (RT), and slow queries per minute.
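To make the dashboard metrics concrete, the following minimal sketch shows one way such figures could be derived from a window of request records. It is an illustration only, not how ARMS computes them: the `Request` record, the `summarize` function, and the 1000 ms slow-query threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One console request (hypothetical record layout)."""
    duration_ms: float  # response time in milliseconds
    success: bool       # whether the API call succeeded


def summarize(requests, window_seconds, slow_threshold_ms=1000.0):
    """Compute QPS, success rate, average RT, and slow queries per minute
    for the requests observed in one time window.

    slow_threshold_ms is an assumed cutoff; the source does not define
    what counts as a slow query.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")

    total = len(requests)
    if total == 0:
        # No traffic: rates are zero, ratios are undefined.
        return {"qps": 0.0, "success_rate": None,
                "avg_rt_ms": None, "slow_per_minute": 0.0}

    successes = sum(1 for r in requests if r.success)
    slow = sum(1 for r in requests if r.duration_ms > slow_threshold_ms)
    return {
        "qps": total / window_seconds,
        "success_rate": successes / total,
        "avg_rt_ms": sum(r.duration_ms for r in requests) / total,
        "slow_per_minute": slow * 60.0 / window_seconds,
    }
```

For example, calling `summarize` with four requests observed in a 60-second window, one of which failed and one of which took longer than the threshold, yields a success rate of 0.75 and one slow query per minute.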

Figures: Dashboard 1, Dashboard 2

Notification types for the ARMS system

ARMS provides the following notifications to help you monitor its performance. When a notification is triggered, you can receive it by email, SMS message, and DingTalk.

| Notification type | Description |
| --- | --- |
| Data collection unavailable | The ARMS agent fails with a runtime exception or stops collecting data. |
| Massive data collection triggering rate limiting | The ARMS agent collects a volume of data that exceeds the rate limit set in the ARMS console, which triggers the rate limiting strategy. |
| Network error | A network error occurs either in the ARMS system or on the user's side. |
| ARMS system in the release process | The ARMS system is undergoing regular maintenance or a feature release. |
| Delayed message consumption | Data cannot be processed because system resources are low, so it piles up and delays message consumption. |
| Lack of data completeness for alerts | If the monitoring data that ARMS collects for an existing alert rule is incomplete, no alert notification is sent, even if the current data meets the conditions of the rule. |
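
The data completeness behavior can be pictured as a gate that runs before the alert rule itself is evaluated. The sketch below is a simplified illustration under stated assumptions, not the ARMS implementation: the `should_send_alert` function, the "average exceeds threshold" rule, and the definition of completeness as collected points divided by expected points are all hypothetical.

```python
def should_send_alert(points, expected_count, threshold, min_completeness=1.0):
    """Decide whether to send an alert for one evaluation window.

    points           -- data points actually collected in the window
    expected_count   -- points that a complete window would contain
    threshold        -- the rule fires when the average exceeds this value
                        (an assumed rule shape, for illustration only)
    min_completeness -- required fraction of expected points (assumed)
    """
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    if not points:
        # Nothing was collected, so there is nothing to evaluate.
        return False

    completeness = len(points) / expected_count
    if completeness < min_completeness:
        # Incomplete data: suppress the alert even if the points that
        # did arrive would satisfy the rule.
        return False

    return sum(points) / len(points) > threshold
```

With a threshold of 800, a window that should contain five points but holds only `[900, 950, 980]` produces no alert, although every collected value is above the threshold. The same rule fires once all five points are present.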