
Log Service: Create indexes

Last Updated: Aug 03, 2023

An index is an inverted storage structure that consists of keywords and logical pointers. The logical pointers can be used to refer to actual data. You can use an index to quickly locate data rows based on keywords. An index is similar to a data catalog. You can query and analyze log data only after you create indexes.

Prerequisites

Logs are collected. For more information, see Data collection overview.

Important Before you can analyze logs, you must store the logs in a Standard Logstore. For more information, see Manage a Logstore.

Usage notes

  • If the billing mode of the current Logstore is pay-by-ingested-data, you are not charged for creating indexes. For more information, see Pay-by-ingested-data.

  • If the billing mode of the current Logstore is pay-by-feature, you are charged for the traffic that is generated when you create indexes and the storage space that is occupied by the indexes. For more information, see Billable items of pay-by-feature.

  • You can manage the configurations of your indexes. For example, you can add, change, and delete indexes for specific fields, and modify related parameters. The index configurations take effect only for the logs that are written after you update the index configurations. If you want to query and analyze historical logs, you must use the reindexing feature. For more information, see Reindex logs for a Logstore.

  • After you disable the indexing feature for a Logstore, the storage space that is occupied by historical indexes is automatically released after the data retention period of the Logstore elapses.

  • By default, indexes are created for specific reserved fields in Simple Log Service. For more information, see Reserved fields.

    No delimiters are specified for the indexes of the __topic__ and __source__ fields. When you search for the fields, only exact match is supported.

  • Fields that are prefixed with __tag__ do not support full-text indexes. If you want to query and analyze fields that are prefixed with __tag__, you must create field indexes. Sample query statement: *| select "__tag__:__receive_time__".

  • Query and analysis results vary based on the index configurations. We recommend that you create indexes based on your business requirements. If you create both full-text indexes and field indexes, the configurations of the field indexes take precedence.

  • If a log contains two fields whose names are the same, such as request_time, Simple Log Service displays one of the fields as request_time_0. The two fields are still stored as request_time in Simple Log Service. In this case, you must use request_time to query, analyze, ship, or transform data, or configure indexes.

Index types

Important

If you want to use the analysis feature by specifying a SELECT statement, you must create indexes for the required fields and turn on Enable Analytics for the fields. If you turn on Enable Analytics, no additional index traffic is generated, and no additional storage space is occupied.

The following section describes the index types that are supported by Simple Log Service:

  • Full-text index

    Simple Log Service splits an entire log into multiple words based on specified delimiters to create indexes. In a search statement, the field names (keys) and field values (values) are both plain text. For example, the search statement error returns the logs that contain the keyword error.

  • Field index

    After you create field indexes, you can specify field names and field values in the Key:Value format to search for logs. For example, the search statement level:error returns the logs in which the value of the level field contains error.
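The difference between the two index types can be sketched with a small Python example. This is an illustration only: the delimiter set and matching logic are simplified stand-ins for what Simple Log Service performs server-side, and the `log`, `tokenize`, and `DELIMITERS` names are hypothetical.

```python
import re

# A simplified stand-in for the default Simple Log Service delimiter set.
DELIMITERS = ", '\";=()[]{}?@&<>/:\n\t\r"

def tokenize(text):
    """Split text into words on the delimiter characters, dropping empties."""
    return [t for t in re.split("[" + re.escape(DELIMITERS) + "]", text) if t]

# A hypothetical log with a level field and a message field.
log = {"level": "error", "message": "disk read error on /dev/sda"}

# Full-text search: the keyword may match anywhere in the log.
full_text_hit = any("error" in tokenize(value) for value in log.values())

# Field search (Key:Value): the keyword must match inside the named field.
field_hit = "error" in tokenize(log["level"])

print(full_text_hit, field_hit)  # True True
```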

Index traffic

After you create indexes, index traffic is generated.

The following list describes how index traffic is calculated for each index type:

  • Full-text index: All field names and field values are stored as text. Both the field names and the field values are included in index traffic.

  • Field index: The calculation varies based on the data type of the field.

    • text: Field names and field values are both included in index traffic.

    • long and double: Field names are not included in index traffic. Each field value occupies eight bytes of index traffic.

      For example, if you configure an index of the long type for the status field and the field value is 400, the string status is not included in index traffic, and the value 400 occupies eight bytes.

    • json: Field names and field values are both included in index traffic, including subfields that are not indexed. For more information, see How do I calculate index traffic for a JSON field?

      • If a subfield is not indexed, its index traffic is calculated as if the subfield were of the text type.

      • If a subfield is indexed, its index traffic is calculated based on the data type of the subfield: text, long, or double.
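The per-type rules above can be sketched in Python. This is an illustrative estimate only: actual billing is computed server-side, and the byte counts here assume UTF-8 field names and values. The `index_traffic_bytes` helper is hypothetical.

```python
def index_traffic_bytes(fields, types):
    """Estimate index traffic for one log, following the per-type rules:
    text        -> field name bytes + field value bytes
    long/double -> 8 bytes per value, field name excluded
    """
    total = 0
    for name, value in fields.items():
        if types.get(name) in ("long", "double"):
            total += 8  # fixed 8 bytes per value; the name is not counted
        else:           # text fields count both the name and the value
            total += len(name.encode("utf-8")) + len(str(value).encode("utf-8"))
    return total

fields = {"status": 400, "method": "GET"}
types = {"status": "long", "method": "text"}
print(index_traffic_bytes(fields, types))  # 8 + (6 + 3) = 17
```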

Create full-text indexes

  1. Log on to the Log Service console.
  2. In the Projects section, click the project that you want to manage.
  3. On the Log Storage > Logstores tab, click the Logstore that you want to manage.
  4. Go to the Search & Analysis page.

    • If no indexes are created, click Enable on the Search & Analysis page of the Logstore.

    • If you already created indexes, choose Index Attributes > Attributes on the Search & Analysis page of the Logstore.

  5. In the Search & Analysis panel, configure the parameters and click OK. The following list describes the parameters.

    Important

    The created indexes take effect within 1 minute.

    • LogReduce: If you turn on LogReduce, Simple Log Service automatically aggregates text logs that have the same pattern during log collection. This allows you to obtain an overview of your logs. For more information, see LogReduce.

    • Full Text Index: If you turn on Full Text Index, full-text indexes are created.

    • Case Sensitive: Specifies whether keywords are case-sensitive in searches.

      • If you turn on Case Sensitive, keywords are case-sensitive. For example, if a log contains internalError, you can search for the log only by using the keyword internalError.

      • If you turn off Case Sensitive, keywords are not case-sensitive. For example, if a log contains internalError, you can search for the log by using the keyword INTERNALERROR or internalerror.

    • Include Chinese: Specifies whether to distinguish between Chinese content and English content in searches.

      • If you turn on Include Chinese and a log contains Chinese characters, the Chinese content is split based on Chinese grammar, and the English content is split based on the specified delimiters.

        Important: When Chinese content is split, the write speed decreases. Proceed with caution.

      • If you turn off Include Chinese, all content of a log is split based on the specified delimiters.

    • Delimiter: The delimiters that are used to split the content of a log into words. The default value is , '";=()[]{}?@&<>/:\n\t\r. If the default delimiters cannot meet your requirements, you can specify custom delimiters. All ASCII characters can be used as delimiters.

      If you leave the Delimiter parameter empty, Simple Log Service treats the value of each field as a whole. In this case, you can search for a log only by using a complete string or a fuzzy search.

      For example, the content of a log is /url/pic/abc.gif.

      • If you do not specify a delimiter, the content of the log is treated as the single word /url/pic/abc.gif. You can search for the log only by using the keyword /url/pic/abc.gif or by using /url/pic/* to perform a fuzzy search.

      • If you set the Delimiter parameter to a forward slash (/), the content of the log is split into three words: url, pic, and abc.gif. You can search for the log by using the keyword url, abc.gif, or /url/pic/abc.gif, or by using pi* to perform a fuzzy search.

      • If you set the Delimiter parameter to a forward slash (/) and a period (.), the content of the log is split into four words: url, pic, abc, and gif. You can search for the log by using one of these words or by performing a fuzzy search.
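The delimiter examples above can be reproduced with a short Python sketch. This is a local approximation of the server-side splitting behavior; the `split_on` helper name is hypothetical.

```python
import re

def split_on(text, delimiters):
    """Split text on the given delimiter characters. An empty delimiter
    string means the whole value is treated as a single word."""
    if not delimiters:
        return [text]
    return [t for t in re.split("[" + re.escape(delimiters) + "]", text) if t]

log = "/url/pic/abc.gif"
print(split_on(log, ""))    # ['/url/pic/abc.gif']  (no delimiter: one word)
print(split_on(log, "/"))   # ['url', 'pic', 'abc.gif']
print(split_on(log, "/."))  # ['url', 'pic', 'abc', 'gif']
```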

Manually create field indexes

Note

When you create field indexes, you can add a maximum of 500 fields. When you configure indexes for a JSON field, you can add a maximum of 100 subfields.

  1. Log on to the Log Service console.
  2. In the Projects section, click the project that you want to manage.
  3. On the Log Storage > Logstores tab, click the Logstore that you want to manage.
  4. Go to the Search & Analysis page.

    • If no indexes are created, click Enable on the Search & Analysis page of the Logstore.

    • If you already created indexes, choose Index Attributes > Attributes on the Search & Analysis page of the Logstore.

  5. In the Search & Analysis panel, configure the parameters and click OK. The following list describes the parameters.

    Important

    The created indexes take effect within 1 minute.

    • Key Name: The name of the log field. Example: client_ip.

      The field name can contain only letters, digits, and underscores (_), and must start with a letter or an underscore (_).

      Important:

      • If you want to configure an index for a __tag__ field, such as an Internet IP address or a UNIX timestamp, you must set the Key Name parameter to a value in the __tag__:KEY format. Example: __tag__:__receive_time__. For more information, see Reserved fields.

      • __tag__ fields do not support numeric indexes. You must set the Type parameter to text for all __tag__ fields.

    • Type: The data type of the log field value. Valid values: text, long, double, and json. For more information, see Data types.

      If you set the data type of a field to long or double, you cannot configure the Case Sensitive, Include Chinese, or Delimiter parameter for the field.

    • Alias: The alias of the field. For example, you can set the alias of the client_ip field to ip.

      The alias can contain only letters, digits, and underscores (_), and must start with a letter or an underscore (_).

      Important: You can use the alias of a field only in an analytic statement that contains a SELECT statement. You must use the original field name in a search statement. For more information, see Column aliases.

    • Case Sensitive: Specifies whether keywords are case-sensitive in searches.

      • If you turn on Case Sensitive, keywords are case-sensitive. For example, if a log contains internalError, you can search for the log only by using the keyword internalError.

      • If you turn off Case Sensitive, keywords are not case-sensitive. For example, if a log contains internalError, you can search for the log by using the keyword INTERNALERROR or internalerror.

    • Delimiter: The delimiters that are used to split the content of a log into words. The default value is , '";=()[]{}?@&<>/:\n\t\r. If the default delimiters cannot meet your requirements, you can specify custom delimiters. All ASCII characters can be used as delimiters.

      If you leave the Delimiter parameter empty, Simple Log Service treats the value of each field as a whole. In this case, you can search for a log only by using a complete string or a fuzzy search.

      For example, the content of a log is /url/pic/abc.gif.

      • If you do not specify a delimiter, the content of the log is treated as the single word /url/pic/abc.gif. You can search for the log only by using the keyword /url/pic/abc.gif or by using /url/pic/* to perform a fuzzy search.

      • If you set the Delimiter parameter to a forward slash (/), the content of the log is split into three words: url, pic, and abc.gif. You can search for the log by using the keyword url, abc.gif, or /url/pic/abc.gif, or by using pi* to perform a fuzzy search.

      • If you set the Delimiter parameter to a forward slash (/) and a period (.), the content of the log is split into four words: url, pic, abc, and gif. You can search for the log by using one of these words or by performing a fuzzy search.

    • Include Chinese: Specifies whether to distinguish between Chinese content and English content in searches.

      • If you turn on Include Chinese and a log contains Chinese characters, the Chinese content is split based on Chinese grammar, and the English content is split based on the specified delimiters.

        Important: When Chinese content is split, the write speed decreases. Proceed with caution.

      • If you turn off Include Chinese, all content of a log is split based on the specified delimiters.

    • Enable Analytics: You can perform statistical analysis on a field only if you turn on Enable Analytics for the field.

Automatically generate field indexes

If you click Automatic Index Generation when you create field indexes, Simple Log Service automatically generates field indexes based on the first entry in the preview data during data collection.


Enable the automatic update of indexes

If a Logstore is a dedicated Logstore for a cloud service or an internal Logstore, Auto Update is turned on by default. The built-in indexes of the Logstore are automatically updated to the latest version.

Warning

If you delete the indexes of a dedicated Logstore for a cloud service, features such as reports and alerting that are enabled for the Logstore may be affected.

If you want to configure custom indexes, turn off Auto Update in the Search & Analysis panel.

Specify the maximum length of a field value

The default maximum length of a field value that can be retained for analysis is 2,048 bytes (2 KB). You can change the value of the Maximum Statistics Field Length parameter. Valid values: 64 to 16384. Unit: bytes.

Important

If the length of a field value exceeds the value of this parameter, the field value is truncated, and the excess part is not involved in analysis.
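The truncation behavior can be sketched as follows. This is illustrative only and assumes the limit is applied to the UTF-8 byte length of the value; the `truncate_for_analysis` helper is hypothetical.

```python
def truncate_for_analysis(value, max_bytes=2048):
    """Keep only the first max_bytes bytes of a field value for analysis;
    the excess part is dropped, mirroring the Maximum Statistics Field
    Length behavior described above."""
    encoded = value.encode("utf-8")
    return encoded[:max_bytes].decode("utf-8", errors="ignore")

long_value = "x" * 3000
print(len(truncate_for_analysis(long_value)))     # 2048
print(truncate_for_analysis("short") == "short")  # True
```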


API operations

The following list maps each action to its API operation:

  • Create indexes: CreateIndex

  • Delete indexes: DeleteIndex

  • Query indexes: GetIndex

  • Update indexes: UpdateIndex
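As a hedged illustration, the index configuration passed to CreateIndex takes roughly the following JSON shape, where `line` holds the full-text index settings, `keys` holds the field indexes, and `doc_value` corresponds to Enable Analytics. The field names `status` and `level` are examples; consult the CreateIndex API reference for the authoritative schema.

```json
{
  "line": {
    "token": [",", " ", "'", "\"", ";", "=", "(", ")", "[", "]", "{", "}", "?", "@", "&", "<", ">", "/", ":"],
    "caseSensitive": false,
    "chn": false
  },
  "keys": {
    "status": { "type": "long", "doc_value": true },
    "level": { "type": "text", "token": [",", " ", ":"], "caseSensitive": false, "doc_value": true }
  }
}
```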