Application Real-Time Monitoring Service: Alert metrics for Application Monitoring

Last Updated: Mar 11, 2026

The Application Monitoring feature of Application Real-Time Monitoring Service (ARMS) collects metric data every minute. Use these metrics to configure alert rules that detect anomalies in your application's JVM, services, dependencies, infrastructure, and more.

JVM

Note

JVM metrics are for reference only. For authoritative definitions, see the official JVM documentation.

Metrics

| Metric | Unit | Commonly used | Description |
| --- | --- | --- | --- |
| Number of JVM full GCs (instantaneous value) | - | Yes | Full garbage collections in the last N minutes. Frequent full GCs indicate potential memory pressure or application errors. |
| JVM full GC duration (instantaneous value) | ms | No | Time spent on full GCs in the last N minutes. Long full GC pauses cause application stuttering and degrade user experience. |
| Number of JVM young GCs (instantaneous value) | - | Yes | Young generation garbage collections in the last N minutes. A high count suggests rapid object creation or possible memory leaks. |
| JVM young GC duration (instantaneous value) | ms | No | Time spent on young GCs in the last N minutes. Longer durations indicate declining GC efficiency and may cause application pauses. |
| Total JVM heap memory | MB | No | Total heap memory allocated to the JVM, including young and old generations. Undersized heaps lead to frequent GCs; oversized heaps waste system resources. |
| Used JVM heap memory | MB | Yes | Heap memory currently in use. Monitor this metric to detect memory leaks or excessive memory consumption before they cause OutOfMemoryErrors. |
| Committed JVM non-heap memory | MB | No | Non-heap memory committed by the JVM. Excessive non-heap memory may indicate too many loaded classes or static variables. |
| Initial JVM non-heap memory | MB | No | Initial non-heap memory size. Dynamically calculated based on JVM version, operating system, and JVM parameters. |
| Maximum JVM non-heap memory | MB | No | Maximum non-heap memory. Controlled by MaxPermSize (Java 7 and earlier) or MaxMetaspaceSize (Java 8 and later). |
| Used JVM non-heap memory | MB | Yes | Non-heap memory in use, including Metaspace and PermGen. |
| Used JVM metaspace | MB | No | Memory used for class metadata (class structures, methods, and fields). Typically stable during normal operation. |
| Number of JVM blocked threads | - | No | Threads waiting to acquire a monitor lock. A high count may indicate lock contention and can degrade system performance. |
| Total number of JVM threads | - | Yes | Threads across all states. Too many threads can exhaust memory and CPU resources, affecting application stability. |
| Number of JVM deadlocked threads | - | No | Threads involved in deadlocks. When deadlocks occur, the affected threads cannot proceed, and the application may hang. |
| Number of new JVM threads | - | No | Threads that have been created but not yet started. Excessive thread creation wastes system resources and adds scheduling overhead. |
| Number of JVM runnable threads | - | No | Threads in the runnable state, that is, executing in the JVM or ready to execute. A persistently high count relative to the number of CPU cores may indicate CPU contention. |
| Number of JVM terminated threads | - | No | Threads that have completed execution. |
| Number of JVM timed-out waiting threads | - | Yes | Threads in the timed waiting state, that is, waiting for up to a specified time for another thread or a resource. A high count may indicate resource bottlenecks. |
| Number of JVM waiting threads | - | No | Threads in the waiting state. For high-concurrency applications, a rising count of waiting threads can signal performance degradation. |
| Number of JVM GCs (cumulative value) | - | No | Total garbage collections since JVM startup. |
| JVM mark-and-sweep garbage collection cycles (cumulative value) | - | No | Total mark-and-sweep GC cycles since JVM startup. |
| JVM heap memory usage (%) | - | No | Ratio of used heap memory to total heap memory. Keep this below 70% to reduce the risk of out-of-memory errors. |
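
The table distinguishes instantaneous values (counts over the last N minutes) from cumulative values (totals since JVM startup). As a rough sketch of how the two kinds of value relate, the Python snippet below derives interval counts from periodic samples of a cumulative counter. The function names and the restart-handling rule are assumptions made for illustration; they do not describe how ARMS computes these metrics.

```python
from typing import List


def interval_counts(cumulative_samples: List[int]) -> List[int]:
    """Turn one-minute samples of a cumulative counter into per-minute deltas."""
    deltas = []
    for previous, current in zip(cumulative_samples, cumulative_samples[1:]):
        # Assumption: a drop in the counter means the JVM restarted,
        # so the counter started again from zero.
        deltas.append(current - previous if current >= previous else current)
    return deltas


def count_in_last_n_minutes(cumulative_samples: List[int], n: int) -> int:
    """Sum the deltas of the most recent n one-minute intervals."""
    if n <= 0:
        return 0
    return sum(interval_counts(cumulative_samples)[-n:])


# Hypothetical cumulative GC counts sampled once per minute (restart after the 4th sample).
samples = [10, 12, 12, 15, 2, 4]
print(interval_counts(samples))
print(count_in_last_n_minutes(samples, 3))
```

An alert rule on an instantaneous metric would then compare a value like the second result against a threshold.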

Dimensions and filters

These metrics are collected per node IP address. Filter options:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each node independently and create separate alerts per node. | - |
| Equals (=) | Alert on specific nodes only. | =172.20.XX.XX |
| No dimension | Aggregate data across all nodes into a single alert. | - |
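
To make the difference between the three filter types concrete, here is a minimal Python sketch. It assumes a simple "value greater than threshold" condition and uses a sum as the aggregation for the no-dimension case; both choices, the function names, and the sample IP addresses are illustrative assumptions rather than ARMS behavior.

```python
from typing import Dict, Iterable, List


def traversal_alerts(values_by_node: Dict[str, float], threshold: float) -> List[str]:
    """Traversal: evaluate every node on its own; return the nodes that breach the threshold."""
    return sorted(node for node, value in values_by_node.items() if value > threshold)


def equals_alerts(values_by_node: Dict[str, float],
                  selected_nodes: Iterable[str],
                  threshold: float) -> List[str]:
    """Equals (=): evaluate only the selected nodes."""
    selected = set(selected_nodes)
    return sorted(node for node, value in values_by_node.items()
                  if node in selected and value > threshold)


def no_dimension_alert(values_by_node: Dict[str, float], threshold: float) -> bool:
    """No dimension: combine all nodes into one value (assumed here: the sum) and check once."""
    return sum(values_by_node.values()) > threshold


# Hypothetical full GC counts per node in the evaluation window.
full_gcs = {"192.0.2.10": 1, "192.0.2.11": 4, "192.0.2.12": 0}
print(traversal_alerts(full_gcs, threshold=3))
print(equals_alerts(full_gcs, ["192.0.2.10"], threshold=3))
print(no_dimension_alert(full_gcs, threshold=3))
```

With traversal, one noisy node raises its own alert; with no dimension, the same data produces at most a single alert for the whole application.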

Scheduled tasks

Note

ARMS Application Monitoring supports only XXL-JOB, SchedulerX, and JDK-Timer scheduled task types.

Metrics

| Metric | Unit | Commonly used | Description |
| --- | --- | --- | --- |
| Duration | ms | No | Average execution time of the scheduled task. |
| Total number of executions | - | No | Total times the scheduled task ran. |
| Number of execution errors | - | No | Times the scheduled task failed within the specified interval. |
| Scheduling latency | ms | No | Delay between the scheduled start time and actual task execution. |
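
Scheduling latency is the gap between when a task was due and when it actually started. A minimal Python sketch of that arithmetic, with a hypothetical function name and sample timestamps:

```python
from datetime import datetime


def scheduling_latency_ms(scheduled_start: datetime, actual_start: datetime) -> float:
    """Delay, in milliseconds, between the scheduled start and the actual start."""
    return (actual_start - scheduled_start).total_seconds() * 1000


# Hypothetical task that was due at 12:00:00 and started 250 ms late.
scheduled = datetime(2026, 3, 11, 12, 0, 0)
actual = datetime(2026, 3, 11, 12, 0, 0, 250000)
print(scheduling_latency_ms(scheduled, actual))
```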

Dimensions and filters

These metrics are collected per scheduled task. Filter options:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each scheduled task independently and create separate alerts. | - |
| Equals (=) | Alert on specific scheduled tasks only. | =LoadGenerator.mockUserApiLoad |
| No dimension | Aggregate data across all scheduled tasks into a single alert. | - |

Exceptions

Metrics

| Metric | Unit | Commonly used | Description |
| --- | --- | --- | --- |
| Number of exceptions | - | Yes | Runtime exceptions such as NullPointerException, ArrayIndexOutOfBoundsException, and IOException. Detects error spikes in call stacks. |
| Response time of abnormal interface calls | ms | Yes | Response time for interface calls that returned exceptions. Helps assess the performance impact of errors on specific interfaces. |

Dimensions and filters

These metrics support two dimensions: interface name and exception.

By interface name:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each interface independently. | - |
| Equals (=) | Alert on specific interfaces. | =/tb/api/users/{userId} |
| Not Equals (!=) | Exclude specific interfaces. | !=/tb/api/users/{userId} |
| Contains | Match interfaces containing a keyword. | Contains api |
| Does Not Contain | Match interfaces not containing a keyword. | Does Not Contain api |
| Regular expression | Match interfaces by regex. | =/(api)/i |
| No dimension | Aggregate across all interfaces. | - |

By exception:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each exception type independently. | - |
| Equals (=) | Alert on specific exceptions. | =FeignException$InternalServerError |
| Not Equals (!=) | Exclude specific exceptions. | !=FeignException$InternalServerError |
| Contains | Match exceptions containing a keyword. | Contains data |
| Does Not Contain | Match exceptions not containing a keyword. | Does Not Contain data |
| Regular expression | Match exceptions by regex. | =/(data)/i |
| No dimension | Aggregate across all exceptions. | - |
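
The matching filter types (Equals, Not Equals, Contains, Does Not Contain, Regular expression) can be read as simple string predicates on the dimension value. The Python sketch below models them that way. It assumes that the regular expression example uses a /pattern/flags literal in which a trailing i means case-insensitive, that a regex matches anywhere in the value, and that keyword matching is case-sensitive; none of this is confirmed ARMS behavior, and all names are hypothetical.

```python
import re


def parse_regex_literal(literal: str) -> re.Pattern:
    """Parse a /pattern/flags literal such as /(api)/i (only the i flag is handled)."""
    parts = re.fullmatch(r"/(.*)/([a-z]*)", literal, flags=re.DOTALL)
    if parts is None:
        raise ValueError(f"not a /pattern/flags literal: {literal!r}")
    body, flags = parts.groups()
    return re.compile(body, re.IGNORECASE if "i" in flags else 0)


def matches(filter_type: str, operand: str, value: str) -> bool:
    """Return True if the dimension value passes the filter."""
    if filter_type == "equals":
        return value == operand
    if filter_type == "not_equals":
        return value != operand
    if filter_type == "contains":
        return operand in value
    if filter_type == "not_contains":
        return operand not in value
    if filter_type == "regex":
        # Assumption: the regex may match anywhere in the value.
        return parse_regex_literal(operand).search(value) is not None
    raise ValueError(f"unknown filter type: {filter_type!r}")


interface = "/tb/api/users/{userId}"
print(matches("equals", "/tb/api/users/{userId}", interface))
print(matches("contains", "api", interface))
print(matches("regex", "/(api)/i", "/tb/API/orders"))
```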

Application dependency services

Metrics

| Metric | Unit | Commonly used | Description |
| --- | --- | --- | --- |
| Number of application dependency service calls | - | No | Calls to downstream interfaces the application depends on. Monitor for unexpected changes in call volume. |
| Application dependency service call error rate (%) | - | No | Calculated as: abnormal downstream interface requests / total downstream interface requests × 100%. An increasing error rate indicates dependency issues affecting your application. |
| Response time of application dependency service calls | ms | Yes | Average response time of downstream interface calls. Rising latency from dependency services may degrade your application's performance. |
| Number of slow calls of an application dependency service | - | No | Dependency service calls that exceeded the response time threshold. A high count suggests bottlenecks in downstream services. |
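
As a small illustration of the response time and slow call metrics, the following Python sketch computes both from a list of call durations. The 500 ms threshold and the sample durations are hypothetical; the actual slow-call threshold is whatever is configured for the application.

```python
from typing import List


def average_response_time_ms(durations_ms: List[float]) -> float:
    """Mean duration of the calls in the window (0.0 if there were no calls)."""
    return sum(durations_ms) / len(durations_ms) if durations_ms else 0.0


def slow_call_count(durations_ms: List[float], threshold_ms: float) -> int:
    """Number of calls whose duration exceeded the threshold."""
    return sum(1 for duration in durations_ms if duration > threshold_ms)


# Hypothetical durations of downstream calls in one minute.
durations = [120.0, 80.0, 650.0, 900.0, 50.0]
print(average_response_time_ms(durations))
print(slow_call_count(durations, threshold_ms=500.0))
```

The average can look healthy while a few calls are very slow, which is why the slow call count is worth alerting on separately.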

Dimensions and filters

These metrics are collected per interface call type (such as HTTP, MySQL, and Redis). Filter options:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each call type independently. | - |
| Equals (=) | Alert on specific call types. | =http |
| No dimension | Aggregate across all call types. | - |

ECS instances

Metrics

| Metric | Unit | Commonly used | Description |
| --- | --- | --- | --- |
| Node CPU utilization (%) | - | No | CPU utilization of the node. High utilization can cause slow response times and service unavailability. |
| Node CPU utilization in user mode (%) | - | No | CPU time spent on user-space processes such as web services and databases, as a percentage of total CPU time. |
| Idle node disk space | MB | Yes | Unused disk space. A full disk can cause system crashes or unexpected behavior. |
| Node disk utilization (%) | - | No | Ratio of used disk space to total disk space. Higher utilization means less available storage. |
| Node system load | - | Yes | System load average. For a node with N CPU cores, the maximum recommended load is N. |
| Idle node memory | MB | Yes | Unused memory. Insufficient memory may trigger out-of-memory (OOM) errors. |
| Node memory usage (%) | - | No | Percentage of memory in use. If usage exceeds 80%, reduce memory pressure by adjusting configurations or optimizing workloads. |
| Number of error packets received on the node | - | No | Error packets received during network communication, possibly caused by transmission or application issues. |
| Number of error packets sent from the node | - | No | Error packets sent during network communication. Helps check for network anomalies. |
| Number of JVM instances | - | Yes | JVM instances running in real time. Typically used to detect service downtime. |
| Number of bytes sent from the node | - | No | Data volume sent over the network, including application data, system messages, and error messages. |
| Number of packets sent from the node | - | No | Total packets sent over the network. |
| Number of bytes received on the node | - | No | Total data volume received over the network. |
| Number of packets received on the node | - | No | Total packets received over the network. |
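
The system load guideline (a load of at most N on a node with N CPU cores) can be expressed as a short check. The Python sketch below is illustrative only; the function names are hypothetical and the inputs would come from the node system load metric and the node's core count.

```python
def load_per_core(load_average: float, cpu_cores: int) -> float:
    """Normalize the load average by the number of CPU cores."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return load_average / cpu_cores


def exceeds_recommended_load(load_average: float, cpu_cores: int) -> bool:
    """True if the load is above the recommended maximum of one per core."""
    return load_per_core(load_average, cpu_cores) > 1.0


# Hypothetical 4-core node.
print(exceeds_recommended_load(3.2, cpu_cores=4))
print(exceeds_recommended_load(6.0, cpu_cores=4))
```

On a Unix host, `os.getloadavg()` and `os.cpu_count()` from the Python standard library return comparable local values for experimentation.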

Dimensions and filters

These metrics are collected per node IP address. Filter options:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each node independently. | - |
| Equals (=) | Alert on specific nodes. | =172.20.XX.XX |
| No dimension | Aggregate across all nodes. | - |

Containers

Note

Container CPU and memory metrics require ARMS agent v4.1.0 or later.

Metrics

| Metric | Unit | Commonly used | Description |
| --- | --- | --- | --- |
| CPU utilization in user mode | - | No | CPU time spent executing code in user space, including application logic and non-kernel library functions. |
| CPU utilization in kernel mode | - | No | CPU time spent on kernel operations such as system calls, interrupt handling, and kernel services. |
| Total CPU utilization | - | Yes | Sum of user-mode and kernel-mode CPU utilization. |
| Memory usage | Bytes | Yes | Memory actively used by the container at runtime, including non-swappable memory and active cached data. |
| Number of sent network packets | - | No | Packets sent from the container over the network. |
| Number of sent bytes | Bytes | Yes | Bytes sent from the container over the network. |
| Number of sent error packets | - | No | Error packets sent during network communication. Helps detect container network issues. |
| Number of sent discarded packets | - | No | Outbound packets dropped by the system or network stack since the container network interface started. |
| Number of received packets | - | No | Packets received by the container over the network. |
| Number of received bytes | Bytes | Yes | Total data received by the container over the network. |
| Number of received error packets | - | No | Error packets received during network communication. Received error packets may prevent the container from processing network traffic correctly. |
| Number of received discarded packets | - | No | Inbound packets dropped by the system or network stack since the container network interface started. |

Dimensions and filters

These metrics are collected per node IP address. Filter options:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each container independently. | - |
| Equals (=) | Alert on specific containers. | =172.20.XX.XX |
| No dimension | Aggregate across all containers. | - |

Application providing services

Metrics

| Metric | Unit | Commonly used | Description |
| --- | --- | --- | --- |
| Number of calls | - | Yes | Entry-point calls to the application, including HTTP and Dubbo calls. Useful for analyzing traffic volume and detecting anomalies. |
| Number of slow calls | - | No | Entry-point calls (HTTP and Dubbo) that exceeded the response time threshold. |
| Number of error calls | - | Yes | Entry-point calls that returned HTTP status code 400 or were intercepted by the top layer of Dubbo. |
| Call error rate (%) | - | Yes | Calculated as: error entry-point calls / total entry-point calls × 100%. |
| Call response time | ms | Yes | Average response time of entry-point calls (HTTP and Dubbo). Helps identify slow requests and exceptions. |
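
The call error rate formula from the table can be written out directly. This Python sketch is a plain transcription of that formula; the function name and the choice to report 0.0 when there were no calls are assumptions.

```python
def call_error_rate(error_calls: int, total_calls: int) -> float:
    """Error rate in percent: error entry-point calls / total entry-point calls x 100."""
    if total_calls == 0:
        # Assumption: with no traffic in the window, report 0 rather than fail.
        return 0.0
    return 100 * error_calls / total_calls


# Hypothetical window with 5 failed calls out of 200.
print(call_error_rate(5, 200))
```

Because the rate is a ratio, it can spike on low-traffic interfaces where a single failure is a large share of calls, so it is often paired with a minimum call count in an alert rule.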

Dimensions and filters

These metrics support two dimensions: interface name and interface call type.

By interface name:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each interface independently. | - |
| Equals (=) | Alert on specific interfaces. | =/tb/api/users/{userId} |
| Not Equals (!=) | Exclude specific interfaces. | !=/tb/api/users/{userId} |
| Contains | Match interfaces containing a keyword. | Contains api |
| Does Not Contain | Match interfaces not containing a keyword. | Does Not Contain api |
| Regular expression | Match interfaces by regex. | =/(api)/i |
| No dimension | Aggregate across all interfaces. | - |

By interface call type:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each call type independently (HTTP, MySQL, Redis, etc.). | - |
| Equals (=) | Alert on specific call types. | =http |
| No dimension | Aggregate across all call types. | - |

Thread pools

Metrics

| Metric | Commonly used | Description |
| --- | --- | --- |
| Number of core threads | Yes | Always-active threads in the pool. |
| Maximum number of threads | Yes | Upper limit of concurrent threads in the pool. |
| Number of active threads | Yes | Threads currently executing tasks. Evaluates thread pool performance and utilization. |
| Queue size | Yes | Task queue capacity. A queue that is too small causes long wait times; a queue that is too large can exhaust system resources. |
| Current number of threads | Yes | Threads that are running or waiting to run. |
| Number of executed tasks | Yes | Tasks completed by the thread pool. Evaluates throughput. |
| Thread pool usage (%) | Yes | Ratio of threads in use to the total thread pool size. |
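
Thread pool usage is described as a ratio of threads in use to the total pool size. A minimal Python sketch of that ratio follows; it assumes that "threads in use" corresponds to the number of active threads and "total thread pool size" to the maximum number of threads, which the source does not state explicitly.

```python
def thread_pool_usage(active_threads: int, maximum_threads: int) -> float:
    """Usage in percent, assuming active threads / maximum threads."""
    if maximum_threads <= 0:
        return 0.0
    return 100 * active_threads / maximum_threads


# Hypothetical pool with 15 of 20 threads busy.
print(thread_pool_usage(15, 20))
```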

Dimensions and filters

These metrics support three dimensions: node IP address, thread pool name, and thread pool type.

By node IP address:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each node independently. | - |
| Equals (=) | Alert on specific nodes. | =172.20.XX.XX |
| No dimension | Aggregate across all nodes. | - |

By thread pool name:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each thread pool independently. | - |
| Equals (=) | Alert on specific thread pools. | =pool-*-thread-* |
| No dimension | Aggregate across all thread pools. | - |

By thread pool type:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each thread pool type independently. | - |
| Equals (=) | Alert on specific thread pool types. | =FixedThreadPool |
| No dimension | Aggregate across all thread pool types. | - |

HTTP status codes

Metrics

| Metric | Commonly used | Description |
| --- | --- | --- |
| Number of HTTP requests with 4xx status codes | Yes | Requests that returned a 4xx status code, indicating client errors such as missing resources or parameters. Common codes: 400, 404. |
| Number of HTTP requests with 5xx status codes | Yes | Requests that returned a 5xx status code, indicating server errors or system overload. Common codes: 500, 503. |
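
The two metrics bucket responses by status code class. As a simple illustration, this Python sketch counts 4xx and 5xx responses in a list of status codes; the function name and sample data are hypothetical.

```python
from collections import Counter
from typing import Iterable


def count_error_classes(status_codes: Iterable[int]) -> Counter:
    """Count responses in the 4xx (client error) and 5xx (server error) ranges."""
    counts = Counter({"4xx": 0, "5xx": 0})
    for code in status_codes:
        if 400 <= code <= 499:
            counts["4xx"] += 1
        elif 500 <= code <= 599:
            counts["5xx"] += 1
    return counts


# Hypothetical status codes returned by one interface in a minute.
codes = [200, 404, 500, 400, 503, 302]
result = count_error_classes(codes)
print(result["4xx"], result["5xx"])
```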

Dimensions and filters

These metrics are collected per interface name. Filter options:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each interface independently. | - |
| Equals (=) | Alert on specific interfaces. | =/tb/api/users/{userId} |
| Not Equals (!=) | Exclude specific interfaces. | !=/tb/api/users/{userId} |
| Contains | Match interfaces containing a keyword. | Contains api |
| Does Not Contain | Match interfaces not containing a keyword. | Does Not Contain api |
| Regular expression | Match interfaces by regex. | =/(api)/i |
| No dimension | Aggregate across all interfaces. | - |

Databases

Metrics

| Metric | Unit | Commonly used | Description |
| --- | --- | --- | --- |
| Number of database requests | - | Yes | Read or write requests sent to the database at runtime. High request volume affects application performance and response time. |
| Number of database request errors | - | Yes | Failed database requests, including connection failures, query errors, and permission issues. A high error count indicates problems with application-database interaction. |
| Database request response time | ms | Yes | Time between sending a database request and receiving a response. High response times cause application stuttering or slowdowns. |
| Number of slow database requests | - | No | Database requests that exceeded the response time threshold. Frequent slow requests degrade application performance. |

Dimensions and filters

These metrics are collected per database name. Filter options:

| Filter type | Description | Example |
| --- | --- | --- |
| Traversal | Evaluate each database independently. | - |
| Equals (=) | Alert on specific databases. | =mysql-pod:3306(demo_db) |
| No dimension | Aggregate across all databases. | - |