This topic describes the data types of indexes in Log Service. The supported data types include TEXT, LONG, DOUBLE, and JSON.

Data types

The following table describes the supported data types.
Query type | Data type | Description | Example
Basic query | text | The TEXT type. You can query indexes of this type by using keywords or fuzzy match. | uri:"login*" method:"post"
Basic query | long | The LONG type. You can specify numeric ranges to query indexes of this type. | status>200, status in [200, 500]
Basic query | double | The DOUBLE type. This is a floating-point type. You can specify numeric ranges to query indexes of this type. | price>28.95, t in [20.0, 37]
Combined query | json | The field is a JSON field that supports nested queries. By default, the data type of the field is TEXT. You can create an index for element b under element a by using the path format a.b. The supported data types of such indexes are TEXT, LONG, and DOUBLE. | level0.key>29.95 level0.key2:"action"
Combined query | text | Creates a full-text index for all fields in a log entry except the time field. The data type of the index is TEXT. | error and "login fail"

TEXT

Logs of the TEXT type are matched by using keywords. You must configure delimiters and case sensitivity for indexes.

If you turn on the full-text index switch, indexes are established for all fields in a log entry except the time field. The data type of the indexes is TEXT. You do not need to specify keys for the indexes.

The following example shows how to query logs when the full-text index switch is turned on. The sample log entry includes the time, status, level, and message fields.
  • Sample log entry
    time:2018-01-02 12:00:00
    level:"error"
    status:200
    message:"some thing is error in this field"
  • Implementation basics (see the sample query statements below)
    • You do not need to prefix keywords with field names when you query the log entry. For example, if the query statement is "error", the log entry is matched because both the level and message fields contain error.
    • Logs are matched based on the delimiters that you specify. For example, if you do not specify a delimiter, status:200 is considered as a word. If you specify the colon (:) as the delimiter, status:200 is delimited into status and 200.
    • Numeric data is considered as data of the TEXT type. For example, if the query statement is 200, this log entry is returned.
      Note The time field is not considered as data of the TEXT type.
    • If the query statement is a key, for example, status, the log entry is matched.
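  • Sample query statements

    The following statements are a sketch of how the preceding rules apply to the sample log entry. They assume that the colon (:) is included in the configured delimiters.

      error
      200
      status
      error and 200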

LONG and DOUBLE

If a field is of the LONG or DOUBLE data type, you can use only numeric ranges to query the field.
Note
  • If you set the data type of the field index to LONG and the field value is a floating-point number, the field cannot be queried.
  • If you set the data type of the field index to LONG or DOUBLE and the field value is a string, the field cannot be queried.
  • If a field of the LONG or DOUBLE data type does not exist in a log entry, you can use the not key:* statement to query the log entry.
  • If the numeric value of a field in a log entry is invalid, you can use the not key > -1000000 statement to query the log entry. In this statement, -1000000 is assumed to be smaller than the smallest valid value of the field. A log entry whose field value is invalid does not match key > -1000000 and is therefore returned.
For example, to query fields whose numeric values are in the interval (1000, 2000], you can use one of the following query statements:
  • Query statement by using numeric values
      longKey > 1000 and longKey <= 2000
  • Query statement by using a numeric range
      longKey in (1000 2000]
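A similar sketch applies to a field of the DOUBLE type. The following statements use the hypothetical field name doubleKey. The first two statements query values in the interval [20.0, 37.0), and the third statement applies the preceding note and returns log entries in which the doubleKey field does not exist.
      doubleKey >= 20.0 and doubleKey < 37.0
      doubleKey in [20.0 37.0)
      not doubleKey:*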

JSON

JSON-formatted data can contain data of multiple types, such as strings, Boolean values, numbers (LONG and DOUBLE), arrays, and maps. JSON is a self-describing, flexible format that can be used in multiple scenarios. In most cases, log fields that have complex structures are recorded in the JSON format. For example, HTTP request and response parameters can be recorded in a log entry in the JSON format.

You can set the data type of a field index to JSON. Log Service allows you to query JSON-formatted data. When you query an element in a JSON-formatted field, you must add the parent path as a prefix. For more information about the query syntax for data of the TEXT, DOUBLE, and LONG types, see Query syntax.
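For example, if a log entry contains a JSON-formatted field named request that has a nested method element, you can query the element as shown in the following sketch. The request and method names are hypothetical.
      request.method : POST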
Note
  • JSON objects and JSON arrays are not supported.
  • Fields that are located in JSON arrays are not supported.
  • Fields of the Boolean type are automatically converted into the TEXT type.
  • To query logs, JSON-formatted fields must be enclosed in double quotation marks (").
  • If you parse a JSON-formatted field, data of the TEXT and Boolean types in the field is automatically indexed.
    json_string.key_map.key_text : test_value
    json_string.key_map.key_bool : true
  • You can configure JSON paths for fields of the DOUBLE and LONG types to query the fields.
    • The data type of the index for the key_map.key_long field is LONG.
    • The query statement is json_string.key_map.key_long > 50.
  • You can enable the analytics feature for fields of the DOUBLE, LONG, and TEXT types. You can then use analytic statements to analyze the fields.
    json_string.key_map.key_long > 10 | 
        select count(*) as c, "json_string.key_map.key_text" 
        group by "json_string.key_map.key_text"
  • Log Service can parse partially valid JSON-formatted data.

    Log Service parses a JSON-formatted field from the beginning and stops when it detects invalid content. The content that is parsed before that point can still be queried.

    In the following example, the data after the key_3 field is truncated and lost. Log Service can parse the json_string.key_map.key_2 field and the content before this field.

    "json_string": 
    {
         "key_1" :  "value_1",
         "key_map" : 
          {
                 "key_2" : "value_2",
                 "key_3" : "valu
  • Sample log entry

    The sample log entry includes the time field and four other fields. The message field is a JSON-formatted field. The following table describes the five fields.

    Number | Key     | Type
    0      | time    | N/A
    1      | class   | text
    2      | status  | long
    3      | latency | double
    4      | message | json

    0. time:2018-01-01 12:00:00
    1. class:central-log
    2. status:200
    3. latency:68.75
    4. message:
      {  
          "methodName": "getProjectInfo",
          "success": true,
          "remoteAddress": "1.1.1.1:11111",
          "usedTime": 48,
          "param": {
                  "projectName": "ali-log-test-project",
                  "requestId": "d3f0c96a-51b0-4166-a850-f4175dde7323"
          },
          "result": {
              "message": "successful",
              "code": "200",
              "data": {
                  "clusterRegion": "ap-southeast-1",
                  "ProjectName": "ali-log-test-project",
                  "CreateTime": "2017-06-08 20:22:41"
              },
              "success": true
          }
      }
  • Configure indexes
    • You can query data of the TEXT and Boolean types in the JSON-formatted field.
    • You can query data of the LONG type.
    • You can use analytic statements to analyze the fields.
  • Query statements
    • Query log data of the TEXT and Boolean types
      Note
      • You do not need to configure indexes for fields in a JSON-formatted field.
      • JSON maps and JSON arrays can be nested and automatically expanded. You must use periods (.) to delimit fields when you query JSON maps or JSON arrays.
      message.traceInfo.requestId : 92.137_1518139699935_5599
      message.param.projectName : ali-log-test-project
      message.success : true
      message.result.data.ProjectStatus : Normal
    • Query log data of the DOUBLE and LONG types
      Note You must configure indexes for fields in JSON-formatted fields. The fields cannot be in JSON arrays.
      message.usedTime > 40
    • Analytic statements
      Note
      • Each JSON field must be separately specified and cannot be contained in a JSON array.
      • You must use double quotation marks (") to enclose the fields or set aliases for the fields that you need to query.
      * | select avg("message.usedTime") as avg_time, 
        "message.methodName" group by "message.methodName"
    • Combined query
      class : central* and message.usedTime > 40 not message.param.projectName:ali-log-test-project
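    • Combined query and analysis

      The following statement is a sketch that combines a query with an analytic statement on the sample log entry. It assumes that the analytics feature is enabled for the message.usedTime and message.methodName fields.

        class : central* and message.usedTime > 40 | 
        select max("message.usedTime") as max_time, "message.methodName" 
        group by "message.methodName"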