This topic describes the data types of indexes in Log Service. The supported data types include TEXT, LONG, DOUBLE, and JSON.
Data types
| Query type | Data type | Description | Example |
|---|---|---|---|
| Basic query | text | The TEXT type. You can query indexes of this type by using keywords or fuzzy match. | uri:"login*" method:"post" |
| | long | The LONG type. You can specify numeric ranges to query indexes of this type. | status>200, status in [200, 500] |
| | double | The floating-point type. | price>28.95, t in [20.0, 37] |
| Combined query | JSON | The index is a JSON field that supports nested queries. By default, the data type of the field is TEXT. You can set indexes for element b at layer a by using the path format, for example, a.b. The supported data types are TEXT, LONG, and DOUBLE. | level0.key>29.95 level0.key2:"action" |
| | text | Creates indexes for all fields in a log entry except the time field. The data type of the indexes is TEXT. | error and "login fail" |
TEXT
Logs of the TEXT type are matched by using keywords. You must configure delimiters and case sensitivity for indexes.
If you turn on the full-text index switch, indexes are established for all fields in a log entry except the time field. The data type of the indexes is TEXT. You do not need to specify keys for the indexes.
- Sample log entry
time:2018-01-02 12:00:00 level:"error" status:200 message:"some thing is error in this field"
- Implementation basics
- You do not need to enter prefixes for keywords when you use a query statement to query the log entry. For example, if the query statement is "error", the level and message fields are returned.
- Logs are matched based on the delimiters that you specify. For example, if you do not specify a delimiter, status:200 is treated as a single word. If you specify the colon (:) as the delimiter, status:200 is split into status and 200.
- Numeric data is treated as data of the TEXT type. For example, if the query statement is 200, this log entry is returned.
Note The time field is not considered as data of the TEXT type.
- If the query statement is a key, for example, status, the log entry is matched.
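The delimiter behavior described above can be pictured with a small Python sketch. This is a toy model under stated assumptions, not the tokenizer that Log Service uses: the function names `tokenize` and `matches` are hypothetical, and the sketch assumes that a keyword matches only when it equals a whole word produced by splitting on the configured delimiters.

```python
import re


def tokenize(text, delimiters, case_sensitive=False):
    """Split text into words on any character in `delimiters` (hypothetical helper)."""
    if not case_sensitive:
        text = text.lower()
    if not delimiters:
        # No delimiter configured: the whole value is a single word.
        return [text] if text else []
    pattern = "[" + re.escape(delimiters) + "]+"
    return [word for word in re.split(pattern, text) if word]


def matches(keyword, text, delimiters, case_sensitive=False):
    """Return True if `keyword` equals one of the words in `text`."""
    if not case_sensitive:
        keyword = keyword.lower()
    return keyword in tokenize(text, delimiters, case_sensitive)


# Without a delimiter, status:200 stays one word, so 200 does not match.
print(matches("200", "status:200", ""))
# With the colon as a delimiter, status:200 becomes status and 200.
print(matches("200", "status:200", ":"))
```

In this model, the same split also explains why a key such as status can match the entry once the colon is a delimiter.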
LONG and DOUBLE
- If you set the data type of the field index to LONG and the field value is a floating-point number, the field cannot be queried.
- If you set the data type of the field index to LONG or DOUBLE and the field value is a string, the field cannot be queried.
- If a field of the LONG or DOUBLE data type does not exist in a log entry, you can use the not key:* statement to query the log entry.
- If the numeric value of a field in a log entry is invalid, you can use the not key > -1000000 statement to query the log entry. In this statement, -1000000 is a placeholder that you can replace with any value smaller than the smallest valid value in your logs.
- Query statement by using numeric values
longKey > 1000 and longKey <= 2000
- Query statement by using a numeric range
longKey in (1000 2000]
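As a rough illustration of the two query forms above, the following Python sketch treats the range notation as a mathematical interval, where a parenthesis excludes the bound and a square bracket includes it. The helpers `parse_range`, `in_range`, and `to_long` are hypothetical names introduced for this example. The sketch models only the rules stated in this section and is not how Log Service evaluates queries.

```python
import re

# One bracket, two numbers separated by spaces or a comma, one bracket.
_RANGE = re.compile(
    r"\s*([\[(])\s*(-?\d+(?:\.\d+)?)[\s,]+(-?\d+(?:\.\d+)?)\s*([\])])\s*"
)


def parse_range(expr):
    """Parse "(1000 2000]" into (low, high, low_inclusive, high_inclusive)."""
    m = _RANGE.fullmatch(expr)
    if m is None:
        raise ValueError(f"not a range: {expr!r}")
    open_bracket, low, high, close_bracket = m.groups()
    return float(low), float(high), open_bracket == "[", close_bracket == "]"


def in_range(value, expr):
    """Return True if `value` lies inside the interval written as `expr`."""
    low, high, low_inclusive, high_inclusive = parse_range(expr)
    above = value >= low if low_inclusive else value > low
    below = value <= high if high_inclusive else value < high
    return above and below


def to_long(raw):
    """Model a LONG index: floats and strings yield None (not queryable)."""
    try:
        return int(raw)
    except ValueError:
        return None


# longKey in (1000 2000] behaves like longKey > 1000 and longKey <= 2000.
for candidate in (1000, 1500, 2000):
    print(candidate, in_range(candidate, "(1000 2000]"))
```

Under this reading, a field value such as 68.75 or a plain string produces no LONG value at all, which is why such entries cannot be found with numeric queries.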
JSON
JSON-formatted data can contain values of multiple types, including string, Boolean, LONG, DOUBLE, array, and map. JSON is a self-describing, flexible format that suits many scenarios. In most cases, log fields whose structure varies are recorded in the JSON format. For example, HTTP request and response parameters are recorded in a log entry in the JSON format.
- JSON objects and JSON arrays are not supported.
- Fields cannot be in JSON arrays.
- Fields of the Boolean type can be converted into the TEXT type.
- To query logs, JSON-formatted fields must be enclosed in double quotation marks (").
- If you parse a JSON-formatted field, data of the TEXT and Boolean types in the field is automatically indexed.
json_string.key_map.key_text : test_value
json_string.key_map.key_bool : true
- You can configure JSON paths for fields of the DOUBLE and LONG types to query the fields.
- The data type of the index for the key_map.key_long field is LONG.
- The query statement is json_string.key_map.key_long > 50.
- You can enable the analytics feature for fields of the DOUBLE, LONG, and TEXT types. Then you can use analytic statements to analyze the fields.
json_string.key_map.key_long > 10 | select count(*) as c , "json_string.key_map.key_text" group by "json_string.key_map.key_text"
- You can parse partially valid JSON-formatted data.
Log Service parses the data until it encounters an invalid part.
In the following example, the data is truncated partway through the key_3 field, and everything after that point is lost. Log Service can still parse the json_string.key_map.key_2 field and the content before it.
"json_string": { "key_1" : "value_1", "key_map" : { "key_2" : "value_2", "key_3" : "valu
- Sample log entry
The sample log entry includes the time field and four other fields. The message field is a JSON-formatted field. The following table describes the five fields.
| Number | Key | Type |
|---|---|---|
| 0 | time | N/A |
| 1 | class | text |
| 2 | status | long |
| 3 | latency | double |
| 4 | message | json |

0. time:2018-01-01 12:00:00
1. class:central-log
2. status:200
3. latency:68.75
4. message:
{
  "methodName": "getProjectInfo",
  "success": true,
  "remoteAddress": "1.1.1.1:11111",
  "usedTime": 48,
  "param": {
    "projectName": "ali-log-test-project",
    "requestId": "d3f0c96a-51b0-4166-a850-f4175dde7323"
  },
  "result": {
    "message": "successful",
    "code": "200",
    "data": {
      "clusterRegion": "ap-southeast-1",
      "ProjectName": "ali-log-test-project",
      "CreateTime": "2017-06-08 20:22:41"
    },
    "success": true
  }
}
- Configure indexes
- Number 1 indicates that you can query fields of the TEXT and Boolean types in the JSON-formatted field.
- Number 2 indicates that you can query data of the LONG type.
- Number 3 indicates that you can use analytic statements to analyze the fields.
- Query statements
- Query log data of the TEXT and Boolean types
Note
- You do not need to configure indexes for fields in a JSON-formatted field.
- JSON maps and JSON arrays can be nested and automatically expanded. You must use periods (.) to delimit fields when you query JSON maps or JSON arrays.
message.traceInfo.requestId : 92.137_1518139699935_5599
message.param.projectName : ali-log-test-project
message.success : true
message.result.data.ProjectStatus : Normal
- Query log data of the DOUBLE and LONG types
Note You must configure indexes for fields in JSON-formatted fields. The fields cannot be in JSON arrays.
message.usedTime > 40
- Analytic statements
Note
- Each JSON field must be separately specified and cannot be contained in a JSON array.
- You must use double quotation marks (") to enclose the fields or set aliases for the fields that you need to query.
* | select avg("message.usedTime") as avg_time , "message.methodName" group by "message.methodName"
- Combined query
class : central* and message.usedTime > 40 not message.param.projectName:ali-log-test-project
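The dotted paths used in the query statements above, such as message.usedTime and message.param.projectName, can be pictured as the result of flattening the nested JSON maps. The following Python sketch illustrates that idea on a shortened copy of the sample message field. The `flatten` helper is a hypothetical name, and skipping arrays is an assumption based on the rule that fields cannot be in JSON arrays. It is not a description of how Log Service builds its indexes.

```python
import json


def flatten(prefix, value):
    """Expand nested JSON maps into {dotted.path: leaf value} (hypothetical helper)."""
    paths = {}
    if isinstance(value, dict):
        for key, child in value.items():
            paths.update(flatten(f"{prefix}.{key}", child))
    elif not isinstance(value, list):
        # Leaf value (string, number, or Boolean). Arrays are skipped by assumption.
        paths[prefix] = value
    return paths


# Shortened copy of the message field from the sample log entry.
message = json.loads("""
{
  "methodName": "getProjectInfo",
  "success": true,
  "usedTime": 48,
  "param": {"projectName": "ali-log-test-project"},
  "result": {"code": "200", "data": {"clusterRegion": "ap-southeast-1"}}
}
""")

paths = flatten("message", message)
for path, leaf in paths.items():
    print(path, "=", leaf)

# Counterpart of the query statement: message.usedTime > 40
print(paths["message.usedTime"] > 40)
```

In this picture, string and Boolean leaves correspond to the automatically indexed TEXT data, while a numeric leaf such as message.usedTime supports range queries only after a LONG or DOUBLE index is configured for its path.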