Simple Log Service: Create indexes

Last Updated: Feb 29, 2024

An index is an inverted storage structure that consists of keywords and logical pointers that refer to the actual data. You can use an index to quickly locate data rows by keyword, in the same way that you use a catalog to locate data. You can query and analyze log data only after you create indexes. This topic describes the types of indexes that Simple Log Service supports and how to create them.
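The keyword-to-pointer idea can be illustrated with a minimal Python sketch. It is not how Simple Log Service is implemented: the function name `build_inverted_index` and the sample logs are hypothetical, and words are split on whitespace only to keep the example short.

```python
from collections import defaultdict

def build_inverted_index(logs):
    """Map each keyword to the set of row numbers (the "logical pointers")."""
    index = defaultdict(set)
    for row, text in enumerate(logs):
        for word in text.split():  # simplification: whitespace is the only delimiter
            index[word].add(row)
    return index

# Hypothetical sample logs, one string per data row.
logs = [
    "GET /index.html 200",
    "POST /login error timeout",
    "GET /login error denied",
]
index = build_inverted_index(logs)

# Looking up a keyword returns the matching rows without scanning every log.
matching_rows = sorted(index["error"])
print([logs[row] for row in matching_rows])
```

The lookup cost depends on the number of matching rows rather than on the total number of logs, which is why a keyword search needs an index to be fast.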

Prerequisites

Index types and index traffic

If you want to query all fields in logs, we recommend that you use full-text indexes. If you want to query only specific fields, we recommend that you use field indexes, which helps reduce index traffic. If you want to analyze fields, you must create field indexes and include a SELECT statement in your query statement.

  • Full-text indexes

    • Simple Log Service splits an entire log into multiple words based on specified delimiters to create indexes. For example, the search statement error returns the logs that contain the keyword error.

      Figure: Full-text index

    • All field names and field values are stored as text. In this case, field names and field values are included in the calculation of index traffic.

  • Field indexes

    • After you create field indexes, you can specify field names and field values in the key:value format to search for logs. For example, the search statement level:error returns the logs in which the value of the level field contains error.

      Figure: Field index

    • The method that is used to calculate index traffic varies based on the data type of a field.

      • text: Field names and field values are both included in the calculation of index traffic.

      • long and double: Field names are not included in the calculation of index traffic. Each field value is counted as 8 bytes in index traffic.

        For example, if you create an index for the status field of the long type and the field value is 400, the string status is not included in the calculation of index traffic, and the value 400 is counted as 8 bytes in index traffic.

      • json: Field names and field values are both included in the calculation of index traffic. The subfields that are not indexed are also included. For more information, see How do I calculate index traffic for a JSON field?

        • If a subfield is not indexed, index traffic is calculated by regarding the data type of the subfield as text.

        • If a subfield is indexed, index traffic is calculated based on the data type of the subfield. The data type can be text, long, or double.
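The rules for text, long, and double fields can be sketched in Python as follows. The function name `index_traffic_bytes` is hypothetical, and measuring text as UTF-8 bytes is an assumption, because this topic does not state how text length is measured. JSON fields are left out because their calculation is described in a separate topic.

```python
def index_traffic_bytes(field_name, field_value, field_type):
    """Estimate the index traffic of one indexed field, in bytes."""
    if field_type == "text":
        # text: the field name and the field value are both counted.
        # Assumption: lengths are measured in UTF-8 bytes.
        return len(field_name.encode("utf-8")) + len(str(field_value).encode("utf-8"))
    if field_type in ("long", "double"):
        # long and double: the name is not counted, the value counts as 8 bytes.
        return 8
    raise ValueError(f"type not covered by this sketch: {field_type}")

# The status example from the text: the name is ignored, 400 counts as 8 bytes.
print(index_traffic_bytes("status", 400, "long"))
# A text field counts the name "level" (5 bytes) plus the value "error" (5 bytes).
print(index_traffic_bytes("level", "error", "text"))
```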

Billing description

Logstores support the following billing modes: pay-by-ingested-data and pay-by-feature. For more information, see Manage a Logstore, Billable items of pay-by-feature, and Billable items of pay-by-ingested-data.

Logstore that uses the pay-by-ingested-data billing mode

  • Indexes occupy storage space. For more information about storage types, see Overview of tiered storage.

  • Reindexing does not generate fees.

Logstore that uses the pay-by-feature billing mode

  • Indexes occupy storage space. For more information about storage types, see Overview of tiered storage.

  • When you create indexes, traffic is generated. Index traffic is billed under the following billable items: index traffic of log data and log index traffic of Query Logstores. For more information, see Billable items of pay-by-feature. For more information about how to reduce index traffic, see the References section of this topic.

  • Reindexing generates fees. During reindexing, you are charged based on the same billable items and prices as when you create indexes.

Procedure

Step 1: Create indexes

  1. Go to the query and analysis page.

    1. Log on to the Simple Log Service console.

    2. In the Projects section, click the project that you want to manage.

    3. On the Log Storage > Logstores tab, click the Logstore that you want to manage.

    4. On the page that appears, choose Index Attributes > Attributes. If no indexes are created, click Enable first.

      Figure: Configure indexes

  2. Disable the automatic update of indexes. If the Logstore is a dedicated Logstore for a cloud service or an internal Logstore, Auto Update is turned on by default, and the built-in indexes of the Logstore are automatically updated to the latest version. To create indexes for such a Logstore, first turn off Auto Update in the Search & Analysis panel.

    Warning

    If you delete the indexes of a dedicated Logstore for a cloud service, features such as reports and alerting that are enabled for the Logstore may be affected.

    Figure: Automatic update of indexes

  3. Create indexes.

    1. Configure index parameters. If you want to analyze fields, you must create field indexes. You must include a SELECT statement in your query statement for analysis. Field indexes have a higher priority than full-text indexes. After an index is created, it takes effect within 1 minute.

      Important
      • Simple Log Service automatically creates indexes for specific reserved fields. For more information, see Reserved fields.

        Simple Log Service leaves delimiters empty when it creates indexes for the __topic__ and __source__ reserved fields. Therefore, only exact match is supported when you specify keywords to query the two fields.

      • Fields that are prefixed with __tag__ do not support full-text indexes. If you want to query and analyze fields that are prefixed with __tag__, you must create field indexes. Sample query statement: *| select "__tag__:__receive_time__".

      • Query and analysis results vary based on index configurations. You must create indexes based on your business requirements. If you create both full-text indexes and field indexes, the field indexes take precedence.

      • If a log contains two fields whose names are the same, such as request_time, Simple Log Service displays one of the fields as request_time_0. The two fields are still stored as request_time in Simple Log Service. If you want to query, analyze, ship, transform, or create indexes on the fields, you must use request_time.

      Full-text indexes

      Turn on Full Text Index and configure the following parameters.

      Parameter

      Description

      LogReduce

      If you turn on LogReduce, Simple Log Service automatically clusters highly similar text logs during collection and extracts patterns from the logs. This way, you can have a comprehensive understanding of the logs. For more information, see LogReduce.

      Case Sensitive

      Specifies whether searches are case-sensitive.

      • If you turn on Case Sensitive, searches are case-sensitive. For example, if a log contains internalError, you can search for the log by using only the keyword internalError.

      • If you turn off Case Sensitive, searches are not case-sensitive. For example, if a log contains internalError, you can search for the log by using the keyword INTERNALERROR or internalerror.

      Include Chinese

      Specifies whether to distinguish between Chinese content and English content in searches.

      • If you turn on Include Chinese and a log contains Chinese characters, the Chinese content is split based on the Chinese grammar. The English content is split based on specified delimiters.

        Important

        When Chinese content is split, the write speed is reduced. Proceed with caution.

      • If you turn off Include Chinese, all content of a log is split based on specified delimiters.

      Delimiter

      The delimiters that are used to split the content of a log into multiple words. By default, Simple Log Service uses the following delimiters: , '";=()[]{}?@&<>/:\n\t\r. If the default delimiters do not meet your business requirements, you can specify custom delimiters. All ASCII codes can be specified as delimiters.

      If you leave the Delimiter parameter empty, Simple Log Service treats the entire log as a single word. In this case, you can search for the log only by using the complete string or by performing fuzzy match.

      For example, the content of a log is /url/pic/abc.gif.

      • If you do not specify a delimiter, the content of the log is considered as a single word /url/pic/abc.gif. You can search for the log only by using the keyword /url/pic/abc.gif or by using /url/pic/* to perform fuzzy match.

      • If you set the Delimiter parameter to a forward slash (/), the content of the log is split into the following three words: url, pic, and abc.gif. You can search for the log by using the keyword url, abc.gif, or /url/pic/abc.gif, or by using pi* to perform fuzzy match.

      • If you set the Delimiter parameter to a forward slash (/) and a period (.), the content of the log is split into the following four words: url, pic, abc, and gif. You can search for the log by using one of the preceding words or by performing fuzzy match.
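The three delimiter cases above can be reproduced with a short Python sketch. The function `split_words` is a hypothetical illustration of delimiter-based splitting, not the tokenizer that Simple Log Service uses. `DEFAULT_DELIMITERS` is copied from the default list given in this section.

```python
# Default delimiters as listed in this section, written as a Python string.
DEFAULT_DELIMITERS = ", '\";=()[]{}?@&<>/:\n\t\r"

def split_words(content, delimiters):
    """Split content into words at every delimiter character."""
    if not delimiters:
        return [content]  # empty Delimiter: the whole content is a single word
    words, current = [], []
    for ch in content:
        if ch in delimiters:
            if current:  # close the current word, skip empty pieces
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words

content = "/url/pic/abc.gif"
print(split_words(content, ""))    # no delimiter
print(split_words(content, "/"))   # forward slash
print(split_words(content, "/."))  # forward slash and period
```

Because the period is not in the default list, the default delimiters split this content into url, pic, and abc.gif, the same result as the forward-slash case.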

      Field indexes

      1. Optional. Click Automatic Index Generation. Simple Log Service automatically generates field indexes based on the first log in the preview results of data collection.

        Figure: Automatic index generation

      2. Click the icon in the lower part of the Search & Analysis panel and configure the following parameters.

        Parameter

        Description

        Key Name

        The name of the log field. Example: client_ip.

        The name can contain only letters, digits, and underscores (_). It must start with a letter or an underscore (_).

        Important
        • If you want to create an index for a __tag__ field, such as a public IP address or a UNIX timestamp, you must set the Key Name parameter to a value in the __tag__:KEY format. Example: __tag__:__receive_time__. For more information, see Reserved fields.

        • __tag__ fields do not support numeric indexes. When you create an index for a __tag__ field, you must set the Type parameter to text.

        Type

        The data type of the field value. Valid values: text, long, double, and json. For more information, see Data types.

        If you set the data type for a field to long or double, you cannot configure the Case Sensitive, Include Chinese, or Delimiter parameter for the field.

        Alias

        The alias of the field. For example, you can set the alias of the client_ip field to ip.

        The alias can contain only letters, digits, and underscores (_). It must start with a letter or an underscore (_).

        Important

        You can use the alias of a field only in an analytic statement. You must use the original name of a field in a search statement. You must include a SELECT statement in your query statement for analysis. For more information, see Column aliases.

        Case Sensitive

        Specifies whether searches are case-sensitive.

        • If you turn on Case Sensitive, searches are case-sensitive. For example, if a log contains internalError, you can search for the log by using only the keyword internalError.

        • If you turn off Case Sensitive, searches are not case-sensitive. For example, if a log contains internalError, you can search for the log by using the keyword INTERNALERROR or internalerror.

        Delimiter

        The delimiters that are used to split the content of a field value into multiple words. By default, Simple Log Service uses the following delimiters: , '";=()[]{}?@&<>/:\n\t\r. If the default delimiters do not meet your business requirements, you can specify custom delimiters. All ASCII codes can be specified as delimiters.

        If you leave the Delimiter parameter empty, Simple Log Service treats the entire field value as a single word. In this case, you can search for the log only by using the complete string or by performing fuzzy match.

        For example, the value of a field is /url/pic/abc.gif.

        • If you do not specify a delimiter, the field value is considered as a single word /url/pic/abc.gif. You can search for the log only by using the keyword /url/pic/abc.gif or by using /url/pic/* to perform fuzzy match.

        • If you set the Delimiter parameter to a forward slash (/), the field value is split into the following three words: url, pic, and abc.gif. You can search for the log by using the keyword url, abc.gif, or /url/pic/abc.gif, or by using pi* to perform fuzzy match.

        • If you set the Delimiter parameter to a forward slash (/) and a period (.), the field value is split into the following four words: url, pic, abc, and gif. You can search for the log by using one of the preceding words or by performing fuzzy match.

        Include Chinese

        Specifies whether to distinguish between Chinese content and English content in searches.

        • If you turn on Include Chinese and a field value contains Chinese characters, the Chinese content is split based on the Chinese grammar. The English content is split based on specified delimiters.

          Important

          When Chinese content is split, the write speed is reduced. Proceed with caution.

        • If you turn off Include Chinese, all content of a field value is split based on specified delimiters.

        Enable Analytics

        You can perform statistical analysis on a field only if you turn on Enable Analytics for the field.
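The constraints in the preceding parameter descriptions can be collected into a small pre-check, sketched below in Python. The `FieldIndex` class and the `validate` function are hypothetical and are not part of any Simple Log Service SDK. The sketch also assumes that "letters" means ASCII letters and that the naming rule applies to the KEY part of a __tag__:KEY name.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Letters, digits, and underscores; must start with a letter or an underscore.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
VALID_TYPES = {"text", "long", "double", "json"}
TAG_PREFIX = "__tag__:"

@dataclass
class FieldIndex:
    """Hypothetical in-memory model of one row in the field index table."""
    key_name: str
    type: str = "text"
    alias: Optional[str] = None
    case_sensitive: Optional[bool] = None   # None means "not configured"
    include_chinese: Optional[bool] = None
    delimiter: Optional[str] = None
    enable_analytics: bool = False

def validate(field: FieldIndex) -> List[str]:
    """Return the list of rule violations; an empty list means no violation."""
    problems = []
    is_tag = field.key_name.startswith(TAG_PREFIX)
    base = field.key_name[len(TAG_PREFIX):] if is_tag else field.key_name
    if not NAME_RE.fullmatch(base):
        problems.append("invalid key name")
    if field.type not in VALID_TYPES:
        problems.append("type must be text, long, double, or json")
    if is_tag and field.type != "text":
        problems.append("__tag__ fields must use the text type")
    if field.type in ("long", "double"):
        text_only = (field.case_sensitive, field.include_chinese, field.delimiter)
        if any(option is not None for option in text_only):
            problems.append("long and double fields cannot set Case Sensitive, "
                            "Include Chinese, or Delimiter")
    if field.alias is not None and not NAME_RE.fullmatch(field.alias):
        problems.append("invalid alias")
    return problems

print(validate(FieldIndex("client_ip", alias="ip", enable_analytics=True)))
print(validate(FieldIndex("__tag__:__receive_time__", type="long")))
```

The first call passes every check. The second call reports the rule that __tag__ fields do not support numeric indexes.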

Step 2: Reindex logs

New or modified indexes take effect only for logs that are written after the change. To apply indexes to historical data, use the reindexing feature of Simple Log Service. You can reindex the logs of a specified time range in a Logstore based on the most recent indexing rules. For more information, see Reindex logs for a Logstore and Function overview.

What to do next

Query and analyze logs

For more information about how to query and analyze logs, see Query and analyze logs. For examples, see Query and analyze website logs, Query and analyze JSON logs, Collect and analyze NGINX monitoring logs, and Analyze Layer 7 access logs of SLB.

Specify the maximum length of a field value

The default maximum length of a field value that can be retained for analysis is 2,048 bytes, which is equivalent to 2 KB. You can change the value of the Maximum Statistics Field Length parameter. Valid values: 64 to 16384. Unit: bytes.

Important

If the length of a field value exceeds the value of this parameter, the field value is truncated, and the excess part is not involved in analysis.

Figure: Specify the maximum length of a field value
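The effect of the Maximum Statistics Field Length parameter can be sketched in Python as follows. The function `truncate_for_analysis` is hypothetical. Dropping a multi-byte character that straddles the limit is an assumption, because this topic does not describe how such a boundary is handled.

```python
MIN_LEN, MAX_LEN, DEFAULT_LEN = 64, 16384, 2048  # bytes, as stated in this section

def truncate_for_analysis(value, max_len=DEFAULT_LEN):
    """Return the part of a field value that would be involved in analysis."""
    if not MIN_LEN <= max_len <= MAX_LEN:
        raise ValueError(f"max_len must be between {MIN_LEN} and {MAX_LEN} bytes")
    data = value.encode("utf-8")
    if len(data) <= max_len:
        return value
    # Assumption: a character cut in half at the limit is dropped entirely.
    return data[:max_len].decode("utf-8", errors="ignore")

long_value = "a" * 3000
kept = truncate_for_analysis(long_value)
print(len(kept))  # bytes kept for analysis; the remaining bytes are not analyzed
```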

LogReduce

If you turn on LogReduce, Simple Log Service automatically clusters highly similar text logs during collection and extracts patterns from the logs. This way, you can have a comprehensive understanding of the logs. For more information, see LogReduce.

Disable indexing

After you disable the indexing feature for a Logstore, the storage space that is occupied by historical indexes is automatically released after the data retention period of the Logstore elapses.

References

Related operations