Alibaba Cloud Elasticsearch allows you to query the logs of an Elasticsearch cluster in the Elasticsearch console by specifying a keyword and a time range. You can use the logs to identify cluster issues and perform cluster O&M efficiently. This topic describes how to query logs and introduces the common types of logs.

Procedure

  1. Log on to the Elasticsearch console.
  2. In the left-side navigation pane, click Elasticsearch Clusters.
  3. Navigate to the desired cluster.
    1. In the top navigation bar, select the resource group to which the cluster belongs and the region where the cluster resides.
    2. On the Elasticsearch Clusters page, find the cluster and click its ID.
  4. In the left-side navigation pane of the page that appears, click Logs. Then, you can view the logs of the cluster.
    The Logs page contains the following tabs: Cluster Log, Search Slow Log, Indexing Slow Log, GC Log, Access Log, and Asynchronous Write Log.
    Note: The Access Log tab is displayed only for Standard Edition clusters of V6.7.0 or later that use the latest kernel. For more information about how to update the kernel of a cluster, see Upgrade the version of a cluster.
  5. On a tab of the Logs page, enter a query string, select the start time and end time, and then click Search.
    You can query logs that are generated within the last seven days. By default, the logs are displayed by time in descending order. The Lucene query syntax is supported. For more information, see Query string syntax.

    In this example, the logs that meet the following conditions are queried on the Cluster Log tab: The value of the level field is info, the value of the host field is 172.16.xx.xx, and the value of the content field contains the health keyword. In this case, the query string is host:172.16.xx.xx AND content:health AND level:info.

    Notice
    • AND in the query string must be uppercase.
    • If you do not specify an end time, the current system time is used as the end time. If you do not specify a start time, the start time is one hour earlier than the end time.
    • Alibaba Cloud Elasticsearch can return a maximum of 10,000 logs for each query. If the returned logs do not contain the logs that you want to view, you can shorten the specified time range and perform another query.
    After you click Search, the logs that match your query string are displayed.
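
The query rules in step 5 can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not a console API: the helper names build_query, resolve_time_range, and split_range are hypothetical. The sketch joins conditions with an uppercase AND, applies the default time range (end time = current time, start time = one hour before the end time), and halves a time range, which is one way to narrow a query that reaches the 10,000-log limit.

```python
from datetime import datetime, timedelta


def build_query(**fields):
    """Join field:value conditions with an uppercase AND (hypothetical helper)."""
    return " AND ".join(f"{name}:{value}" for name, value in fields.items())


def resolve_time_range(start=None, end=None, now=None):
    """Apply the documented defaults for a missing start or end time."""
    if end is None:
        # No end time: use the current system time.
        end = now if now is not None else datetime.now()
    if start is None:
        # No start time: one hour earlier than the end time.
        start = end - timedelta(hours=1)
    return start, end


def split_range(start, end):
    """Halve a time range, for example after a query returns 10,000 logs."""
    middle = start + (end - start) / 2
    return [(start, middle), (middle, end)]


query = build_query(host="172.16.xx.xx", content="health", level="info")
print(query)  # host:172.16.xx.xx AND content:health AND level:info

start, end = resolve_time_range()
for window in split_range(start, end):
    print(window)
```

The resulting string can be pasted into the search box on a tab of the Logs page. Each half of a split range is then queried separately.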

Common types of logs

Operational logs

The Cluster Log tab displays the operational logs of the cluster. Each operational log contains the following information: Time, Node IP, and Content.
  • Time: the time when the log is generated.
  • Node IP: the IP address of the node that generates the log.
  • Content: consists of the level, host, time, and content fields.
    Field    Description
    level    The level of the log. Log levels include trace, debug, info, warn, and error.
             Note: Garbage collection (GC) logs do not contain the level field.
    host     The IP address of the node that generates the log.
    time     The time when the log is generated.
    content  The content of the log.

Slow logs

Slow logs include slow query logs and slow indexing logs. These logs are generated if the time that is required to complete an indexing or query operation exceeds the specified threshold. The Search Slow Log tab displays slow query logs, and the Indexing Slow Log tab displays slow indexing logs. By default, slow logs are enabled. If unbalanced loads, read or write exceptions, or slow data processing occur on your cluster, you can locate issues based on the slow logs.
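
The threshold mechanism can be pictured with a short Python sketch. It assumes the usual Elasticsearch behavior in which an operation is recorded at the most severe level whose threshold it exceeds. The function name slowlog_level and the threshold values are illustrative assumptions, not values read from a cluster.

```python
# Illustrative thresholds in milliseconds (assumed values for this sketch).
THRESHOLDS_MS = {"warn": 500, "info": 200, "debug": 100, "trace": 50}

# Levels ordered from most severe to least severe.
LEVEL_ORDER = ("warn", "info", "debug", "trace")


def slowlog_level(took_ms, thresholds=THRESHOLDS_MS):
    """Return the most severe level whose threshold took_ms exceeds.

    Returns None if the operation is faster than every threshold,
    in which case no slow log would be generated.
    """
    for level in LEVEL_ORDER:
        limit = thresholds.get(level)
        if limit is not None and took_ms > limit:
            return level
    return None


for took in (600, 250, 120, 60, 10):
    print(took, slowlog_level(took))
```

With these assumed values, an operation that takes 250 ms exceeds the info threshold but not the warn threshold, so it would be recorded at the info level. An operation that takes 10 ms would not produce a slow log.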

By default, Elasticsearch records a read or write operation as a slow log only if the operation requires 5s to 10s to complete. Thresholds this high capture few operations, which makes issues hard to identify. After you create a cluster, you can lower the related time thresholds by using one of the following methods to capture more logs:
  • Use scenario-based templates. After a cluster is created, scenario-based templates are enabled and applied to the cluster. The index template defines the configurations of slow logs. We recommend that you retain the default configurations. The following code shows the default configurations of slow logs in the General scenario:
      "settings": {
        "index": {
          "search": {
            "slowlog": {
              "level": "info",
              "threshold": {
                "fetch": {
                  "warn": "200ms",
                  "trace": "50ms",
                  "debug": "80ms",
                  "info": "100ms"
                },
                "query": {
                  "warn": "500ms",
                  "trace": "50ms",
                  "debug": "100ms",
                  "info": "200ms"
                }
              }
            }
          },
          "refresh_interval": "10s",
          "unassigned": {
            "node_left": {
              "delayed_timeout": "5m"
            }
          },
          "indexing": {
            "slowlog": {
              "level": "info",
              "threshold": {
                "index": {
                  "warn": "200ms",
                  "trace": "20ms",
                  "debug": "50ms",
                  "info": "100ms"
                }
              },
              "source": "1000"
            }
          }
        }
      }
    Note: If the value of the Scenario parameter is None in the Scenario-based Configuration section of the Cluster Configuration page, you can set the parameter based on your business requirements. Then, submit the templates to apply the default configurations of slow logs to the cluster. For more information, see Use a scenario-based template to modify the configurations of a cluster.
  • Log on to the Kibana console of the cluster and run the following command to modify the configurations of slow logs.
    PUT _settings
    {
        "index.indexing.slowlog.threshold.index.warn" : "200ms",
        "index.indexing.slowlog.threshold.index.trace" : "20ms",
        "index.indexing.slowlog.threshold.index.debug" : "50ms",
        "index.indexing.slowlog.threshold.index.info" : "100ms",
        "index.search.slowlog.threshold.fetch.warn" : "200ms",
        "index.search.slowlog.threshold.fetch.trace" : "50ms",
        "index.search.slowlog.threshold.fetch.debug" : "80ms",
        "index.search.slowlog.threshold.fetch.info" : "100ms",
        "index.search.slowlog.threshold.query.warn" : "500ms",
        "index.search.slowlog.threshold.query.trace" : "50ms",
        "index.search.slowlog.threshold.query.debug" : "100ms",
        "index.search.slowlog.threshold.query.info" : "200ms"
    }
After you modify the configurations of slow logs, any read or write operation that takes longer than the specified threshold is recorded. You can then query the related logs on the Search Slow Log or Indexing Slow Log tab of the Logs page to identify the issue.
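
The scenario-based template writes the slow log settings as nested JSON, whereas the Kibana command uses dotted keys. Elasticsearch treats the two forms as equivalent. The following Python sketch converts one form to the other, which can help when you copy thresholds from a template into a PUT _settings request body. The function name flatten_settings is hypothetical, and the sample values are a subset of the thresholds shown in this topic.

```python
import json


def flatten_settings(settings, prefix=""):
    """Convert nested settings into the dotted-key form used by PUT _settings."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects and extend the dotted path.
            flat.update(flatten_settings(value, path))
        else:
            flat[path] = value
    return flat


nested = {
    "index": {
        "indexing": {
            "slowlog": {
                "threshold": {
                    "index": {"warn": "200ms", "info": "100ms"}
                }
            }
        }
    }
}

# Print a body that can be pasted after "PUT _settings" in the Kibana console.
print(json.dumps(flatten_settings(nested), indent=4))
```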

GC logs

By default, GC logs are enabled. Each GC log contains the following information: Time, Node IP, and Content. For more information, see Operational logs.

References

ListSearchLog