Time series data grows continuously, so the cost of storing it at full granularity increases linearly over time. The rollup mechanism of Elasticsearch summarizes historical data so that you can retain it at a fraction of the storage cost. This topic demonstrates how to use the rollup mechanism to summarize Logstash traffic data.
Prerequisites
- You have the manage or manage_rollup permission.
For more information, see Security privileges.
- You have created an Alibaba Cloud Elasticsearch instance.
For more information, see Create an Elasticsearch cluster. This topic uses an Alibaba Cloud Elasticsearch V7.4 instance of the Standard Edition as an example.
Note: The rollup commands in this topic apply to Elasticsearch V7.4. For the commands of Elasticsearch V6.x, see rollup job descriptions.
Background information
This topic uses the following scenario as an example:
- Roll up the networkoutTraffic and networkinTraffic fields into hourly summaries for each instance ID. The rollup job runs every 15 minutes.
- Visualize the rolled-up networkoutTraffic and networkinTraffic data in charts in the Kibana console.
The source indexes match the monitordata-logstash-sls-* pattern. The mapping of one of these indexes is as follows:
{
  "monitordata-logstash-sls-2020-04-05" : {
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "__source__" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "disk_type" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "host" : {
          "type" : "keyword"
        },
        "instanceId" : {
          "type" : "keyword"
        },
        "metricName" : {
          "type" : "keyword"
        },
        "monitor_type" : {
          "type" : "keyword"
        },
        "networkinTraffic" : {
          "type" : "double"
        },
        "networkoutTraffic" : {
          "type" : "double"
        },
        "node_spec" : {
          "type" : "keyword"
        },
        "node_stats_node_master" : {
          "type" : "keyword"
        },
        "resource_uid" : {
          "type" : "keyword"
        }
      }
    }
  }
}
Procedure
- Step 1: Create a rollup job
- Step 2: Start the rollup job and view the job information
- Step 3: Query the data of the rollup index
- Step 4: Create a rollup index pattern
- Step 5: Create a chart for traffic monitoring in the Kibana console
- Step 6: Create a traffic monitoring dashboard in the Kibana console
Step 1: Create a rollup job
Use the PUT _rollup/job command to define a rollup job that summarizes data on an hourly basis.
PUT _rollup/job/ls-monitordata-sls-1h-job1
{
  "index_pattern": "monitordata-logstash-sls-*",
  "rollup_index": "monitordata-logstash-rollup-1h-1",
  "cron": "0 */15 * * * ?",
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "field": "@timestamp",
      "fixed_interval": "1h"
    },
    "terms": {
      "fields": ["instanceId"]
    }
  },
  "metrics": [
    {
      "field": "networkoutTraffic",
      "metrics": ["sum"]
    },
    {
      "field": "networkinTraffic",
      "metrics": ["sum"]
    }
  ]
}
Parameter | Required | Type | Description |
---|---|---|---|
index_pattern | Yes | string | The index or index pattern that the rollup job reads from. Wildcards (*) are supported. |
rollup_index | Yes | string | The index that stores the rollup summaries. Wildcards are not supported. Specify a complete index name. |
cron | Yes | string | The cron expression that defines how often the rollup job runs. It is independent of the interval at which data is rolled up. |
page_size | Yes | integer | The number of bucket results that are processed on each iteration of the rollup indexer. A larger value results in faster processing but higher memory usage during processing. |
groups | Yes | object | Defines the grouping fields and aggregation methods for the job. |
└ date_histogram | Yes | object | Rolls up the date field into time-based buckets. |
└ field | Yes | string | The date field that you want to roll up. |
└ fixed_interval | Yes | time units | The interval at which data is rolled up. For example, if this parameter is set to 1h, the date field specified by the field parameter is rolled up on an hourly basis. This value is also the minimum interval at which the rolled-up data can later be aggregated. |
└ terms | No | object | Groups the data by the values of the specified fields. |
└ fields | Yes | string | The set of fields to group by. Fields in this array can be of the keyword or numeric type. The order of the fields does not matter. |
metrics | No | object | Defines the metrics to collect for each group. |
└ field | Yes | string | The field for which metrics are collected. In the preceding code, this parameter is set to networkoutTraffic and networkinTraffic. |
└ metrics | Yes | array | The operators used for aggregation. For example, sum calculates the sum of the specified field. Valid values: min, max, sum, avg, and value_count. |
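To make the roles of fixed_interval, terms, and metrics more concrete, the following Python sketch mimics on a few in-memory records what the job above is configured to compute: one summed result per hour and instanceId. It only illustrates the grouping logic and is not how Elasticsearch implements rollups. The sample records, the instance IDs es-a and es-b, and the rollup_hourly helper are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

TRAFFIC_FIELDS = ("networkinTraffic", "networkoutTraffic")

def rollup_hourly(docs):
    """Sum the traffic fields per (hour bucket, instanceId)."""
    buckets = defaultdict(lambda: {field: 0.0 for field in TRAFFIC_FIELDS})
    for doc in docs:
        ts = datetime.fromisoformat(doc["@timestamp"])
        # Equivalent of "fixed_interval": "1h": truncate to the start of the hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        # Equivalent of the terms group on instanceId.
        key = (hour.isoformat(), doc["instanceId"])
        for field in TRAFFIC_FIELDS:
            # Equivalent of the "sum" metric; a missing field contributes nothing.
            buckets[key][field] += doc.get(field, 0.0)
    return dict(buckets)

# Hypothetical raw documents, for illustration only.
sample = [
    {"@timestamp": "2020-04-05T10:05:00+00:00", "instanceId": "es-a",
     "networkinTraffic": 1.5, "networkoutTraffic": 2.0},
    {"@timestamp": "2020-04-05T10:40:00+00:00", "instanceId": "es-a",
     "networkinTraffic": 0.5, "networkoutTraffic": 1.0},
    {"@timestamp": "2020-04-05T11:10:00+00:00", "instanceId": "es-a",
     "networkinTraffic": 4.0, "networkoutTraffic": 3.0},
    {"@timestamp": "2020-04-05T10:20:00+00:00", "instanceId": "es-b",
     "networkinTraffic": 2.0, "networkoutTraffic": 2.5},
]
summary = rollup_hourly(sample)
```

Four raw records collapse into three summaries here. The cron setting does not appear in the sketch because it only controls how often the job runs, not the size of the buckets.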
- If index_pattern is set to a wildcard pattern, make sure that the pattern does not match the value of rollup_index. Otherwise, an error is returned.
- The mapping of rollup_index is of the object type. Make sure that index_pattern is not set to the same value as rollup_index. Otherwise, an error is returned.
- The rollup job supports only date histogram aggregation, histogram aggregation, and terms aggregation. For more information, see Rollup aggregation limitations.
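The index_pattern restriction above can be checked before you submit a job. The following Python sketch is one way to do so, assuming that the pattern is a single name in which * is the only wildcard. The pattern_matches helper is hypothetical and is not part of any Elasticsearch client.

```python
import re

def pattern_matches(index_pattern, index_name):
    """Return True if index_name is matched by index_pattern.

    Assumption: '*' is the only wildcard and the pattern is a single name
    (no comma-separated lists).
    """
    regex = ".*".join(re.escape(part) for part in index_pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None

rollup_index = "monitordata-logstash-rollup-1h-1"

# The pattern used in this topic does not match the rollup index, so it is safe.
safe = not pattern_matches("monitordata-logstash-sls-*", rollup_index)

# A broader pattern would also match the rollup index and should be avoided.
conflict = pattern_matches("monitordata-logstash-*", rollup_index)
```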
Step 2: Start the rollup job and view the job information
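A newly created rollup job is in the stopped state until you start it. In Elasticsearch V7.x, the start rollup job API (POST _rollup/job/<job_id>/_start) starts a job, and the get rollup job API (GET _rollup/job/<job_id>) returns its configuration, status, and statistics. The following Python sketch only builds these two requests for the job ID from Step 1 and does not send them. The endpoint http://localhost:9200 is a placeholder assumption, and authentication is omitted.

```python
from urllib.request import Request

ES_ENDPOINT = "http://localhost:9200"  # placeholder; replace with your instance endpoint
JOB_ID = "ls-monitordata-sls-1h-job1"  # job ID from Step 1

def build_job_requests(endpoint, job_id):
    """Build the requests that start a rollup job and fetch its information."""
    start = Request(f"{endpoint}/_rollup/job/{job_id}/_start", method="POST")
    info = Request(f"{endpoint}/_rollup/job/{job_id}", method="GET")
    return start, info

start_request, info_request = build_job_requests(ES_ENDPOINT, JOB_ID)
# To send a request against a reachable cluster, pass it to
# urllib.request.urlopen(), for example urlopen(start_request).
```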
Step 3: Query the data of the rollup index
The rollup documents that a rollup job produces have a different structure from the raw data. The rollup search endpoint rewrites a standard Query DSL request into a format that matches the rollup documents, obtains the response, and then rewrites the response into the format that the client expects for the original query.
- Use match_all to obtain all data of the rollup index.
  GET monitordata-logstash-rollup-1h-1/_search
  {
    "query": {
      "match_all": {}
    }
  }
  - Only one rollup index can be specified in a query, and wildcard index names are not supported. Multiple indexes can be specified when you query real-time data.
  - The following queries are supported: term queries, terms queries, range queries, match all queries, and any compound queries. Compound queries are combinations of queries, including Boolean queries, boosting queries, and constant score queries. For more limits, see Rollup search limitations.
- Use _rollup_search to obtain the sum of networkoutTraffic.
  GET /monitordata-logstash-rollup-1h-1/_rollup_search
  {
    "size": 0,
    "aggregations": {
      "sum_networkoutTraffic": {
        "sum": {
          "field": "networkoutTraffic"
        }
      }
    }
  }
  _rollup_search supports a subset of the features of a regular search:
  - query: the Query DSL parameter, with specific limits. For more information, see Rollup search limitations and Rollup aggregation limitations.
  - aggregations: the aggregation parameter.
  _rollup_search has the following limits:
  - size: Set this parameter to 0 or do not specify it. Rollups work on aggregated data only, so search hits cannot be returned.
  - Parameters such as highlighter, suggestors, post_filter, profile, and explain are not supported.
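The limits above can be checked on the client side before a request is sent. The following Python sketch builds a _rollup_search body like the one in this step and reports violations. Both helper functions are hypothetical, the instance ID es-a is a made-up value, and the request-body keys highlight and suggest are assumed to correspond to the highlighter and suggestors features listed above.

```python
# Request-body keys assumed to correspond to the unsupported features above.
UNSUPPORTED_KEYS = {"highlight", "suggest", "post_filter", "profile", "explain"}

def build_rollup_search(field, query=None):
    """Build a _rollup_search body that sums one rolled-up field."""
    body = {
        "size": 0,  # rollup searches return aggregations only, never hits
        "aggregations": {f"sum_{field}": {"sum": {"field": field}}},
    }
    if query is not None:
        body["query"] = query
    return body

def check_rollup_search(body):
    """Return a list of problems that would make the body invalid."""
    problems = []
    if body.get("size", 0) != 0:
        problems.append("size must be 0 or omitted")
    for key in sorted(UNSUPPORTED_KEYS.intersection(body)):
        problems.append(f"{key} is not supported")
    return problems

# A term query on a hypothetical instance ID combined with the sum aggregation.
body = build_rollup_search("networkoutTraffic", {"term": {"instanceId": "es-a"}})
problems = check_rollup_search(body)
```

The term query is used because it is among the query types that the text above lists as supported. A body that sets size to a non-zero value or includes a key such as explain is reported by check_rollup_search.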
Step 4: Create a rollup index pattern
Step 5: Create a chart for traffic monitoring in the Kibana console
The following procedure demonstrates how to create networkinTraffic and networkoutTraffic charts for the rollup index in the Kibana console.