What is convergence?
In Application Real-Time Monitoring Service (ARMS), convergence refers to the process of streamlining metric data. The ARMS agent collects various types of metric data, such as the number of requests, response time, and error count. These metrics carry dimensions, such as the server IP address and API operation name, that make the monitoring information more specific and more accurate. However, some dimensions can take a very large number of distinct values (dimension divergence), which results in high cardinality. High cardinality poses significant challenges to the storage system, such as write loss and slow queries, and inflates the user's bill.
To address these challenges, ARMS employs various convergence mechanisms for two purposes. One is to control cardinality by restricting the number of dimension values. The other is to identify the dimensions that are valuable for effective monitoring.
For example, suppose that a RESTful API exposes the /api/v1/users/{ID}/info operation, in which ID is a variable. If the system records the request URLs directly, every distinct ID produces a distinct URL, causing high cardinality. In addition, users cannot easily check the performance of the operation, because its metric data is scattered across many URLs.
The following sections describe the convergence mechanisms available in ARMS to help you understand how the mechanisms work and the results of convergence.
Convergence results and triggers
The following table describes the results and triggers of convergence mechanisms in ARMS.
Convergence result | Trigger |
{ARMS_IP}:80 | The number of IP addresses accessing the same port exceeds the threshold (50 by default). |
{ARMS_STATIC_REQ} or {ARMS_S_XXX} | The URL pertains to static resources. |
{ARMS_ATTACK_REQ} | The URL contains strings that can be exploited by attackers. |
{ARMS_PARAMED_REQ} | The URL contains parameters. |
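{ARMS_OTHERS} | The number of dimension values recorded within a period of time exceeds the threshold. Note: For the default threshold values, see Cardinality space convergence. |
{ARMS_NUMBER} | A term obtained by splitting the URL by slashes (/) consists of excessively long digits, or is a divergent part composed entirely of digits. |
{ARMS_WORD} | A term obtained by splitting the URL by slashes (/) is an excessively long word, or is a divergent part composed entirely of letters. |
{ARMS_ANY} | A term obtained by splitting the URL by slashes (/) is a word that contains excessively long digits, or is a divergent part composed of digits and letters. |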
{XXX} | The URL uses annotations of SpringController. |
Convergence mechanism
All convergence mechanisms are enabled by default and can be manually disabled by users, except for cardinality space convergence.
The data types supported by different mechanisms vary. Therefore, the supported data types are specified for each mechanism.
Spring annotation convergence
For conventional Web APIs, using the request URL as a dimension value is acceptable. However, for RESTful APIs or URLs with variables, directly recording the request URLs can lead to dimension divergence. Therefore, ARMS extracts relevant annotations (such as @RequestMapping) as the API dimension value for applications using the Spring Web framework.
Convergence logic
Reads the path information from the route annotations of the Spring controller that handles the URL.
Convergence result
The value configured in the route annotation.
Supported data type
URL: Only URLs of interfaces that the application itself exposes to callers are supported. URLs of external services that the application calls are not supported.
Example
APIController defines the following API to obtain user information. The API contains a path variable. When collecting data from the API, ARMS extracts the annotation (/api/v1/user/{userId}/info) as the convergence result.
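import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/v1")
public class APIController {
    // Full route: /api/v1/user/{userId}/info
    @RequestMapping("/user/{userId}/info")
    public String getUserInfo(@PathVariable("userId") String userId) {
        return "hello " + userId;
    }
}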
Convergence result: /api/v1/user/1234/info is converged to /api/v1/user/{userId}/info.
Custom convergence
This feature allows users to define custom convergence rules to meet specific needs.
Convergence logic
Matches the custom convergence rules one by one. The matched rule is applied.
Convergence result
The result is subject to user configuration.
Supported data type
URL
Example
Custom rule: Converge all requests that match /api/v1/user/[\d]+/info to /api/v1/getUserInfo.
With this rule, all requests that match the /api/v1/user/[\d]+/info regular expression are converged to /api/v1/getUserInfo.
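The rule matching described above can be sketched as follows. This is a minimal illustration, not the ARMS agent's implementation: the class name CustomConvergence and its methods are hypothetical, and the sketch assumes that a rule must match the whole URL and that the first matching rule wins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any ARMS API.
final class CustomConvergence {
    // One user-defined rule: a regular expression and the value it converges to.
    private static final class Rule {
        final Pattern pattern;
        final String target;

        Rule(String regex, String target) {
            this.pattern = Pattern.compile(regex);
            this.target = target;
        }
    }

    // Rules are kept in the order in which they were added.
    private final List<Rule> rules = new ArrayList<>();

    void addRule(String regex, String target) {
        rules.add(new Rule(regex, target));
    }

    // Tries the rules one by one and returns the target of the first rule
    // that matches the whole URL, or the URL unchanged if none matches.
    String converge(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).matches()) {
                return rule.target;
            }
        }
        return url;
    }
}
```

With the rule from the example registered, /api/v1/user/1234/info is returned as /api/v1/getUserInfo, and a URL that matches no rule is returned unchanged.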
Static resource convergence
ARMS typically does not monitor static resource requests. However, some legacy probes still collect metrics of static resources. Convergence is enabled for these metrics by default because static resources typically lack monitoring value and their URLs change frequently.
Convergence logic
Checks whether the suffix of a URL matches the default static resource extensions. If so, converges the URL.
Default static resource extensions: .log .7z .tgz .jpg .jpeg .png .gif .css .js .ico .woff2 .xml .svg .pdf .txt .text .ppt .word .xlsx .tar.gz .tar.bz2 .sh .yml .yaml .zip .gz .ttf .woff .eot .rar .properties
Convergence result
The default result is {ARMS_STATIC_REQ}. With the advanced options enabled (requiring a ticket), the result includes the resource suffix.
Supported data type
URL
Example
By default, /api/v1/hello.jpg is converged to {ARMS_STATIC_REQ}. With the advanced options enabled, it is converged to {ARMS_S_JPG}.
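The suffix check can be sketched as follows. The class StaticResourceConvergence and the keepSuffix flag are hypothetical names introduced for illustration, the extension list is only a subset of the defaults, and matching is case-sensitive here for simplicity. How ARMS formats multi-part suffixes such as .tar.gz is not stated in this topic, so the sketch leaves them out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of any ARMS API.
final class StaticResourceConvergence {
    // A subset of the default static resource extensions.
    private static final List<String> EXTENSIONS =
            Arrays.asList(".jpg", ".jpeg", ".png", ".gif", ".css", ".js", ".ico");

    // keepSuffix stands in for the advanced option that retains the suffix.
    static String converge(String url, boolean keepSuffix) {
        for (String ext : EXTENSIONS) {
            if (url.endsWith(ext)) {
                if (!keepSuffix) {
                    return "{ARMS_STATIC_REQ}";
                }
                // For example, ".jpg" becomes "{ARMS_S_JPG}".
                return "{ARMS_S_" + ext.substring(1).toUpperCase(Locale.ROOT) + "}";
            }
        }
        // Not a static resource: leave the URL for later mechanisms.
        return url;
    }
}
```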
Attack request convergence
User services may encounter inexplicable attack requests (such as attempts to read the /etc/passwd file). These requests are crafted by attackers and change frequently. Recording all of them can significantly strain storage resources.
Convergence logic
Checks whether a URL contains characters that can be exploited for attacks. If so, converges the URL.
Default attack characters: ' $ \ ' !
Convergence result
{ARMS_ATTACK_REQ}
Supported data type
URL
Example
/app/v1/user/info?cmd='more /etc/passwd' is converged to {ARMS_ATTACK_REQ}.
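A minimal sketch of the character check, assuming the attack characters are the single quote, dollar sign, backslash, and exclamation mark taken from the default list above. The class name AttackRequestConvergence is hypothetical.

```java
// Hypothetical helper for illustration; not part of any ARMS API.
final class AttackRequestConvergence {
    // Characters taken from the default list: ' $ \ !
    private static final String ATTACK_CHARS = "'$\\!";

    static String converge(String url) {
        for (int i = 0; i < url.length(); i++) {
            if (ATTACK_CHARS.indexOf(url.charAt(i)) >= 0) {
                return "{ARMS_ATTACK_REQ}";
            }
        }
        // No attack character found: leave the URL unchanged.
        return url;
    }
}
```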
Query parameter convergence
By default, the ARMS agent does not obtain the parameter information when collecting URLs. However, in some scenarios, URLs contain query parameters which can lead to dimension divergence.
Convergence logic
Checks whether a URL contains query parameters. If so, converges the URL.
Default query parameter delimiters: ; ? &
Convergence result
The default result is {ARMS_PARAMED_REQ}. With the advanced options enabled (requiring a ticket), the convergence result retains the URL and replaces the parameter with {ARMS_REQ_PARAMS}.
Supported data type
URL
Example
By default, /api/v1/user/info?userId=12345 is converged to {ARMS_PARAMED_REQ}. With the advanced options enabled, it is converged to /api/v1/user/info?{ARMS_REQ_PARAMS}.
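The delimiter check can be sketched as follows. QueryParameterConvergence and the keepPath flag are hypothetical names. The sketch assumes that, with the advanced option, everything after the first delimiter is replaced, which reproduces the example above.

```java
// Hypothetical helper for illustration; not part of any ARMS API.
final class QueryParameterConvergence {
    // Default query parameter delimiters: ; ? &
    private static final String DELIMITERS = ";?&";

    // keepPath stands in for the advanced option that retains the URL path.
    static String converge(String url, boolean keepPath) {
        int cut = -1;
        for (int i = 0; i < url.length(); i++) {
            if (DELIMITERS.indexOf(url.charAt(i)) >= 0) {
                cut = i;
                break;
            }
        }
        if (cut < 0) {
            return url; // no query parameters
        }
        if (!keepPath) {
            return "{ARMS_PARAMED_REQ}";
        }
        // Keep the path and the delimiter, replace the parameters.
        return url.substring(0, cut + 1) + "{ARMS_REQ_PARAMS}";
    }
}
```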
Meaningless word convergence
URLs with excessively long words or digits are likely to diverge. By default, ARMS replaces words or digits that are too long.
Convergence logic
Splits a URL into an array of terms by slashes (/) and checks whether any term in the array exceeds the specified length threshold. If so, replaces the term.
Maximum length of words: 64
Maximum length of digits: 10
Maximum length of digits in words: 10
Convergence result
Excessively long digits are converged to {ARMS_NUMBER}.
Excessively long words are converged to {ARMS_WORD}.
Excessively long digits in words are converged to {ARMS_ANY}.
Supported data type
URL
Example
/api/2024040710/hello2024040710 is converged to /api/{ARMS_NUMBER}/{ARMS_ANY}.
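The term check can be sketched as follows. Several points are assumptions: the class name MeaninglessWordConvergence is hypothetical, a "word" is taken to be a term made up only of letters, and a term is replaced once its length reaches the threshold, because the 10-digit term in the example above is converged with a threshold of 10.

```java
// Hypothetical helper for illustration; not part of any ARMS API.
final class MeaninglessWordConvergence {
    static final int MAX_WORD = 64;
    static final int MAX_DIGITS = 10;
    static final int MAX_DIGITS_IN_WORD = 10;

    static String converge(String url) {
        // The -1 limit keeps empty leading and trailing terms,
        // so the slashes are restored exactly by the join below.
        String[] terms = url.split("/", -1);
        for (int i = 0; i < terms.length; i++) {
            terms[i] = convergeTerm(terms[i]);
        }
        return String.join("/", terms);
    }

    static String convergeTerm(String term) {
        if (term.isEmpty()) {
            return term;
        }
        if (term.chars().allMatch(Character::isDigit)) {
            return term.length() >= MAX_DIGITS ? "{ARMS_NUMBER}" : term;
        }
        if (term.chars().allMatch(Character::isLetter)) {
            return term.length() >= MAX_WORD ? "{ARMS_WORD}" : term;
        }
        // Mixed term: look at the longest run of consecutive digits.
        return longestDigitRun(term) >= MAX_DIGITS_IN_WORD ? "{ARMS_ANY}" : term;
    }

    static int longestDigitRun(String term) {
        int best = 0;
        int run = 0;
        for (char c : term.toCharArray()) {
            run = Character.isDigit(c) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }
}
```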
Intelligent convergence
After the preceding convergence mechanisms are applied, a large number of divergent URLs may still be recorded. ARMS periodically calculates convergence rules using algorithms to replace these URLs.
Convergence logic
The logic is complex. Only a brief description is provided here.
Groups sample URLs using algorithms.
Converges URLs based on the URL pattern of each group and generates convergence rules.
Merges convergence rules of different groups.
Convergence result
Divergent parts composed entirely of digits are replaced with {ARMS_NUMBER}.
Divergent parts composed entirely of letters are replaced with {ARMS_WORD}.
Divergent parts composed of digits and letters are replaced with {ARMS_ANY}.
Supported data type
URL
Example
/api/product/1/info
/api/product/2/info
....
/api/product/N/info
For the URLs in the example, the server calculates and generates the following convergence rule: /api/product/[\d]+/info (regular expression matching).
The convergence result is as follows: /api/product/{ARMS_NUMBER}/info.
Subsequent requests that match the preceding regular expression are converged to /api/product/{ARMS_NUMBER}/info.
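The idea of deriving a pattern from a group of sample URLs can be illustrated with a deliberately simplified sketch. This is not the algorithm that ARMS uses, which this topic does not describe in detail. The class IntelligentConvergenceSketch is hypothetical and assumes that the URLs are already grouped, that the group is not empty, and that all URLs in the group have the same number of slash-separated parts.

```java
import java.util.List;

// Hypothetical simplification for illustration; not the ARMS algorithm.
final class IntelligentConvergenceSketch {
    // Derives one converged pattern from a non-empty group of URLs
    // that all have the same number of slash-separated parts.
    static String derivePattern(List<String> urls) {
        String[] pattern = urls.get(0).split("/", -1);
        boolean[] varies = new boolean[pattern.length];
        for (String url : urls) {
            String[] parts = url.split("/", -1);
            for (int i = 0; i < pattern.length; i++) {
                if (!parts[i].equals(pattern[i])) {
                    varies[i] = true; // this position diverges
                }
            }
        }
        for (int i = 0; i < pattern.length; i++) {
            if (varies[i]) {
                pattern[i] = placeholder(urls, i);
            }
        }
        return String.join("/", pattern);
    }

    // Chooses the placeholder from the characters seen at a divergent position.
    static String placeholder(List<String> urls, int index) {
        boolean allDigits = true;
        boolean allLetters = true;
        for (String url : urls) {
            String part = url.split("/", -1)[index];
            allDigits &= !part.isEmpty() && part.chars().allMatch(Character::isDigit);
            allLetters &= !part.isEmpty() && part.chars().allMatch(Character::isLetter);
        }
        if (allDigits) {
            return "{ARMS_NUMBER}";
        }
        if (allLetters) {
            return "{ARMS_WORD}";
        }
        return "{ARMS_ANY}";
    }
}
```

For sample URLs such as /api/product/1/info and /api/product/2/info, only the fourth part diverges and it contains only digits, so the derived pattern is /api/product/{ARMS_NUMBER}/info.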
SQL normalization
The ARMS agent may collect a large number of distinct SQL statements due to factors such as database and table sharding, comments, and literal values written in plaintext. By default, ARMS processes each SQL statement by replacing the parts that are likely to diverge.
Normalization logic
The logic is complex. Only a brief description is provided here.
Removes comments.
Replaces literal values.
Replaces the names of sharded databases and tables.
...
Normalization result
The divergent parts are replaced.
Supported data type
SQL
Example
select * from cache_0 where ckey='23'
The preceding SQL statement is normalized to:
select * from cache_{NUM} where ckey=?
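A rough, regex-based sketch of these steps is shown below. It is far simpler than a real SQL normalizer and is not the ARMS implementation. The class SqlNormalizationSketch is hypothetical, and it assumes that sharded names end in an underscore followed by digits, as in the example.

```java
// Hypothetical simplification for illustration; not the ARMS implementation.
final class SqlNormalizationSketch {
    static String normalize(String sql) {
        return sql
                // Remove block comments such as /* ... */.
                .replaceAll("(?s)/\\*.*?\\*/", "")
                // Replace string literals with a placeholder.
                .replaceAll("'[^']*'", "?")
                // Replace standalone numeric literals with a placeholder.
                .replaceAll("\\b\\d+\\b", "?")
                // Replace a numeric shard suffix, for example cache_0 -> cache_{NUM}.
                .replaceAll("(?<=[A-Za-z]_)\\d+\\b", "{NUM}")
                .trim();
    }
}
```

The numeric-literal step does not touch cache_0 because the underscore and the digit are both word characters, so there is no word boundary between them. The shard suffix is then handled by the last step.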
IP address convergence
If applications monitored by ARMS depend on many external services that are accessed by using IP addresses, the ARMS agent may collect a large number of these IP addresses, which may lead to divergence.
Convergence logic
Groups IP addresses by port.
If the number of IP addresses in a group exceeds the specified threshold, applies convergence. Default threshold: 50
Convergence result
{ARMS_IP}:port
Supported data type
IP
Example
1.1.1.1:8080
...
1.1.1.255:8080
The preceding IP addresses are converged to {ARMS_IP}:8080.
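The grouping step can be sketched as a batch computation. The agent presumably works on a stream of addresses rather than a complete list, so this is only an illustration. The class IpConvergenceSketch is hypothetical and assumes addresses in the ip:port form with IPv4 addresses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of any ARMS API.
final class IpConvergenceSketch {
    static List<String> converge(List<String> addresses, int threshold) {
        // Group the distinct IP addresses by port.
        Map<String, Set<String>> ipsByPort = new HashMap<>();
        for (String address : addresses) {
            int colon = address.lastIndexOf(':');
            String ip = address.substring(0, colon);
            String port = address.substring(colon + 1);
            ipsByPort.computeIfAbsent(port, p -> new HashSet<>()).add(ip);
        }
        // Converge every address whose port group exceeds the threshold.
        List<String> result = new ArrayList<>();
        for (String address : addresses) {
            String port = address.substring(address.lastIndexOf(':') + 1);
            boolean diverged = ipsByPort.get(port).size() > threshold;
            result.add(diverged ? "{ARMS_IP}:" + port : address);
        }
        return result;
    }
}
```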
Cardinality space convergence
The preceding convergence mechanisms can manage high cardinality issues in URLs effectively. However, for SQL and other types, despite extensive divergence handling, a large number of dimension values may still be recorded. To address this issue, ARMS limits the number of dimension values that can be recorded within a period of time.
Convergence logic
The logic is complex. Only a brief description is provided here.
Periodically generates cardinality space of a fixed size.
Checks whether a dimension value exists in the cardinality space. If yes, returns the value as it is. If no, attempts to add the value to the cardinality space. If it can be added, returns the value as it is. If it cannot be added, returns {ARMS_OTHERS}.
Convergence result
Dimension values that exceed the cardinality space threshold are converged to {ARMS_OTHERS}.
Default threshold
Monitored Object | Threshold |
URL interface | 500 per hour |
Scheduling task | 1000 per hour |
RPC interface | 1000 per hour |
Upstream interface | 200 per hour |
Normal SQL query | 100 per hour |
Slow SQL query | 100 per hour |
External request URL | 200 per hour |
External request address | 100 per hour |
Supported data type
Any
Example
Assume that the cardinality space size is set to 100 records per hour and the addresses of the external services are as follows:
www.a1.com
www.a2.com
....
www.a1000.com
Only the first 100 addresses are recorded per hour. Subsequent addresses are converged to {ARMS_OTHERS}.
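The check-then-add logic described above maps naturally onto a bounded set. The following sketch uses a hypothetical CardinalitySpace class and assumes that the caller invokes reset() at the start of each period. How ARMS schedules the periods is not described in this topic.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of any ARMS API.
final class CardinalitySpace {
    private final int capacity;
    private final Set<String> values = new HashSet<>();

    CardinalitySpace(int capacity) {
        this.capacity = capacity;
    }

    // Returns the value as it is if it is already recorded or can still be
    // added; otherwise returns {ARMS_OTHERS}.
    String record(String value) {
        if (values.contains(value)) {
            return value;
        }
        if (values.size() < capacity) {
            values.add(value);
            return value;
        }
        return "{ARMS_OTHERS}";
    }

    // Starts a new period with an empty cardinality space.
    void reset() {
        values.clear();
    }
}
```

With a capacity of 100, the addresses www.a1.com through www.a100.com are returned as they are, www.a101.com and later addresses are returned as {ARMS_OTHERS}, and a repeated www.a1.com is still returned as it is because it is already in the space.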
Execution order
URL type
Spring annotation convergence > Custom convergence > Attack request convergence > Query parameter convergence > Static resource convergence > Meaningless word convergence > Intelligent convergence > Cardinality space convergence
SQL type
Custom convergence > SQL normalization > Cardinality space convergence
IP address and other types
Custom convergence > IP address convergence > Cardinality space convergence
Convergence is executed sequentially according to the preceding order. Once a dimension value is converged by any mechanism, the execution ends.
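The first-match-wins behavior can be sketched as a chain of functions. The class ConvergencePipeline is hypothetical, and the sketch assumes that a mechanism signals "not converged" by returning its input unchanged.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration; not part of any ARMS API.
final class ConvergencePipeline {
    // Applies the mechanisms in order and stops at the first one
    // that changes the dimension value.
    static String run(String value, List<UnaryOperator<String>> mechanisms) {
        for (UnaryOperator<String> mechanism : mechanisms) {
            String converged = mechanism.apply(value);
            if (!converged.equals(value)) {
                return converged;
            }
        }
        return value;
    }
}
```

For the URL type, for example, a URL that contains both an attack character and query parameters is reported as {ARMS_ATTACK_REQ}, because attack request convergence runs before query parameter convergence.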
FAQ
Where can I find the original values before convergence?
Both the original and converged values are recorded in trace data. You can view the original values using the trace explorer feature. For more information, see Trace Explorer.