Time series data accumulates over time. You can use the index lifecycle management (ILM) feature to periodically roll over the data to new indexes, which keeps queries efficient and reduces query costs. As indexes age and are queried less often, you can migrate them to less expensive disks and reduce the number of primary and replica shards. This topic describes how to use ILM to manage Heartbeat indexes.

Background information

The ILM feature allows you to create, configure, enable, disable, or delete Elasticsearch indexes throughout their lifecycle. This feature is available in Elasticsearch V6.6.0 and later. It divides the lifecycle of an index into four phases: hot, warm, cold, and delete. In the hot phase, the feature rolls over data in existing indexes to new ones. In other phases, it processes the new indexes. The following table describes these phases.
Phase | Description
--- | ---
hot | Time series data is written to the index in real time. In this phase, you can roll over to a new index based on the number of documents in the index, the size of the index, and the age of the index. Rollovers are performed by calling the rollover API.
warm | Data is no longer written to the index, but the index is still frequently queried.
cold | The index is no longer updated and is rarely queried. Queries on the index become slower.
delete | The index and its data are deleted.
You can use one of the following methods to attach an ILM policy to one or more indexes:
  • Attach an ILM policy to an index template: If you use this method, the ILM policy applies to all indexes that have the same alias. In this topic, this method is used.
  • Attach an ILM policy to a single index: If you use this method, the ILM policy applies only to the current index. New indexes generated after a rollover are not affected by the ILM policy.

Test scenario:

A large number of time series indexes whose names start with heartbeat- exist in your Elasticsearch cluster, and each daily index is about 4 MB in size. As data accumulates, the number of shards keeps growing, which may overload the cluster. To address this, you can create an ILM policy that defines different actions for the four phases. In the hot phase, the indexes whose names start with heartbeat- are rolled over to new indexes. In the warm phase, indexes are shrunk, and the segments in each index are merged. In the cold phase, indexes are migrated from hot nodes to warm nodes. In the delete phase, indexes are deleted on a regular basis.
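
To make the shard growth in this scenario concrete, the following Python sketch estimates how many shards a cluster holds when one index is created per day. It assumes the 3 primary shards that are configured later in heartbeat.yml and 1 replica per primary shard. The replica count and the function name total_shards are assumptions for illustration only.

```python
def total_shards(days: int, primaries: int = 3, replicas: int = 1) -> int:
    """Shards held by `days` daily indexes, counting primary and replica copies."""
    return days * primaries * (1 + replicas)


# One index per day, 3 primary shards each, and an assumed 1 replica per primary.
for days in (30, 90, 365):
    print(days, "days ->", total_shards(days), "shards")
```

Each index is only a few megabytes in size, yet the shard count grows linearly with the retention period. This is why the policy in this topic shrinks and eventually deletes old indexes.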

Precautions

  • An ILM policy can be attached to an index only after an index template and an alias are configured for the index.
  • If you modify an ILM policy during a rollover, the new policy takes effect from the next rollover.

Procedure

  1. Preparations
    Create an Alibaba Cloud Elasticsearch cluster. Then, enable the Auto Indexing feature for the cluster and configure a public IP address whitelist for the cluster.
  2. Step 1: Enable and configure the ILM feature in the heartbeat.yml file
    In the heartbeat.yml file, enable and configure the ILM feature for the cluster. After the configuration is completed, the system automatically generates a Heartbeat index template for the cluster.
  3. Step 2: Create an ILM policy
    You can call the ILM policy operation to create an ILM policy. This policy defines the conditions to roll over data and archive indexes.
  4. Step 3: Attach the ILM policy to an index template
    Attach the ILM policy to the Heartbeat index template.
  5. Step 4: Attach the ILM policy to an index
    Attach the ILM policy to the first index that is created by using the Heartbeat index template. This way, the policy can apply to all indexes that are created by using this template.
  6. Step 5: View indexes in different phases
    View the indexes that are archived in the hot, warm, cold, and delete phases.

Preparations

  1. Create an Elasticsearch cluster and enable the Auto Indexing feature for the cluster.
    For more information, see Create an Alibaba Cloud Elasticsearch cluster and Configure the YML file.
    Note In this topic, an Alibaba Cloud Elasticsearch V6.7.0 cluster is used. The operations and figures in this topic apply only to clusters of this version. If you use a cluster of another version, the operations may differ. In that case, the operations displayed in the Elasticsearch console prevail.
  2. Configure a public IP address whitelist for the cluster. You must add the IP address of the server on which Heartbeat is installed to the public IP address whitelist of the cluster.

Step 1: Enable and configure the ILM feature in the heartbeat.yml file

To manage Heartbeat indexes by using the ILM feature of Elasticsearch, you can configure the feature in the heartbeat.yml file. For more information, see Set up index lifecycle management.

  1. Download the Heartbeat installation package and decompress it.
  2. Specify the heartbeat.monitors, setup.template.settings, setup.kibana, and output.elasticsearch configurations in the heartbeat.yml file.
    The following configurations are used in this example:
    heartbeat.monitors:
    - type: icmp
      schedule: '*/5 * * * * * *'
      hosts: ["47.111.xx.xx"]
    
    setup.template.settings:
      index.number_of_shards: 3
      index.codec: best_compression
      index.routing.allocation.require.box_type: "hot"
    
    setup.kibana:
    
      # Kibana Host
      # Scheme and port can be left out and will be set to the default (http and 5601)
      # In case you specify an additional path, the scheme is required: http://localhost:5601/path
      # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
      host: "https://es-cn-4591jumei00xxxxxx.kibana.elasticsearch.aliyuncs.com:5601"
    
    output.elasticsearch:
      # Array of hosts to connect to.
      hosts: ["es-cn-4591jumei00xxxxxx.elasticsearch.aliyuncs.com:9200"]

      # Enable ILM (beta) to use index lifecycle management instead of daily indices.
      ilm.enabled: true
      ilm.rollover_alias: "heartbeat"
      ilm.pattern: "{now/d}-000001"

      # Optional protocol and basic auth credentials.
      #protocol: "https"
      username: "elastic"
      password: "<your_password>"

    # Top-level setting (not part of output.elasticsearch): overwrite the loaded index template.
    setup.template.overwrite: true

    The following table describes some parameters in the preceding configurations. For more information about other parameters, see the open source Heartbeat configuration documentation.

    Parameter | Description
    --- | ---
    index.number_of_shards | The number of primary shards. Default value: 1.
    index.routing.allocation.require.box_type | Restricts shard allocation to nodes whose box_type attribute matches the specified value. In this example, the value hot causes index data to be written to hot nodes.
    ilm.enabled | Specifies whether to enable the ILM feature. The value true enables the feature.
    setup.template.overwrite | Specifies whether to overwrite the existing index template. If an index template of a specific version is already loaded to Elasticsearch, you must set this parameter to true to overwrite it with the new template.
    ilm.rollover_alias | The alias of the indexes that are generated by rollovers. Default value: heartbeat-{beat.version}.
    ilm.pattern | The index name pattern that is used for rollovers. Date math is supported. Default value: {now/d}-000001. Each time a rollover condition is met, the system increments the numeric suffix of the index name by one to generate the new index name.

    For example, the first index is named heartbeat-2020.04.29-000001. When a rollover condition is met, Elasticsearch creates an index named heartbeat-2020.04.29-000002.

    Notice If you change the setting of ilm.rollover_alias or ilm.pattern after an index template is loaded, you must set setup.template.overwrite to true so that the loaded index template is overwritten with a template that contains the new settings.
  3. Start the Heartbeat service.
    sudo ./heartbeat -e
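
The naming rule described for ilm.pattern can be illustrated with a small Python sketch. It assumes that the index name ends with a hyphen followed by a zero-padded counter, as in the example in this step. The function name next_rollover_name is hypothetical. As a simplification, the sketch only increments the counter and does not re-evaluate date math such as {now/d}, so the date part of a real index name can differ if the rollover happens on a later day.

```python
import re


def next_rollover_name(index_name: str) -> str:
    """Return the index name with its trailing numeric counter incremented by one.

    Assumes the name ends with "-" followed by digits, for example "-000001".
    """
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end with -<number>")
    prefix, counter = match.groups()
    # Keep the zero padding at the width of the original counter.
    return f"{prefix}{int(counter) + 1:0{len(counter)}d}"


print(next_rollover_name("heartbeat-2020.04.29-000001"))
```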

Step 2: Create an ILM policy

Elasticsearch allows you to use API calls or the Kibana console to create an ILM policy. This step describes how to call the ILM policy operation to create an ILM policy.
Note Heartbeat allows you to run the ./heartbeat setup --ilm-policy command to load the default policy and write it to Elasticsearch. You can run the ./heartbeat export ilm-policy command to export the default policy to stdout. Then, modify the default policy to manually create an ILM policy.

Log on to the Kibana console and run the following command:

PUT /_ilm/policy/hearbeat-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "5mb",
            "max_age": "1d",
            "max_docs": 100
          }
        }
      },
      "warm": {
        "min_age": "60s",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "cold": {
        "min_age": "3m",
        "actions": {
          "allocate": {
            "require": {
              "box_type": "warm"
            }
          }
        }
      },
      "delete": {
        "min_age": "1h",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
Parameter | Description
--- | ---
hot | A rollover is triggered if an index attached to the ILM policy meets one of the following conditions: the size of the index exceeds 5 MB, the index is more than one day old, or the number of documents in the index exceeds 100. During the rollover, the system creates an index, and the new index starts the ILM policy from the hot phase. The original index enters the warm phase 60 seconds after the rollover. Notice: When the value of max_docs, max_size, or max_age is reached, Elasticsearch rolls over the index.
warm | After the index enters the warm phase, the system shrinks it to a new index that has only one primary shard and merges the segments in the index into one segment. The index enters the cold phase 3 minutes after the rollover.
cold | The system migrates the index from hot nodes to warm nodes. The index enters the delete phase 1 hour after the rollover.
delete | The index is deleted when it enters this phase, which is 1 hour after the rollover.
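
The timing of the phases can be worked through with a short Python sketch. It uses the min_age values of the preceding policy and treats each value as an offset from the rollover time. The function names phase_schedule and phase_at and the sample timestamp are assumptions for illustration. The sketch gives the earliest possible transition times only. Actual transitions also depend on the ILM check interval and on how long the actions of the previous phase take.

```python
from datetime import datetime, timedelta

# min_age values from the policy, measured from the rollover time.
MIN_AGE = {
    "warm": timedelta(seconds=60),
    "cold": timedelta(minutes=3),
    "delete": timedelta(hours=1),
}


def phase_schedule(rolled_over_at: datetime) -> dict:
    """Earliest time at which a rolled-over index may enter each phase."""
    return {phase: rolled_over_at + age for phase, age in MIN_AGE.items()}


def phase_at(rolled_over_at: datetime, now: datetime) -> str:
    """Latest phase whose min_age has elapsed, or 'hot' if none has."""
    current = "hot"
    for phase, age in MIN_AGE.items():  # dicts keep insertion order
        if now - rolled_over_at >= age:
            current = phase
    return current


rollover = datetime(2020, 4, 29, 12, 0, 0)  # hypothetical rollover time
for phase, when in phase_schedule(rollover).items():
    print(phase, when.isoformat())
```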

Step 3: Attach the ILM policy to an index template

After you start Heartbeat, the system automatically creates a Heartbeat index template in your Elasticsearch cluster. You must attach the ILM policy created in Step 2: Create an ILM policy to this index template.

  1. Log on to the Kibana console of the Elasticsearch cluster.
    For more information, see Log on to the Kibana console.
  2. In the left-side navigation pane, click Management.
  3. In the Elasticsearch section, click Index Lifecycle Policies.
  4. In the Index lifecycle policies section, find the ILM policy you created, and choose Actions > Add policy to index template.
    [Figure: Add policy]
  5. In the dialog box that appears, select heartbeat from the Index template drop-down list and specify Alias for rollover index.
    [Figure: Add policy]
  6. Click Add policy.
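
As far as the template is concerned, the console steps above should amount to storing two index settings in it: index.lifecycle.name and index.lifecycle.rollover_alias. The following Python sketch builds that settings fragment. The policy name is spelled as in Step 2, and the alias heartbeat is assumed to match ilm.rollover_alias in heartbeat.yml. The function name lifecycle_settings is hypothetical.

```python
import json


def lifecycle_settings(policy: str, rollover_alias: str) -> dict:
    """Index settings that bind a template (or an index) to an ILM policy."""
    return {
        "index.lifecycle.name": policy,
        "index.lifecycle.rollover_alias": rollover_alias,
    }


# Policy name as spelled in Step 2; alias assumed to equal ilm.rollover_alias.
fragment = {"settings": lifecycle_settings("hearbeat-policy", "heartbeat")}
print(json.dumps(fragment, indent=2))
```

If you apply these settings with the index template API instead of the console, merge the fragment into the existing template body, because a PUT request on a template replaces the whole template.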

Step 4: Attach the ILM policy to an index

After you start Heartbeat, the system automatically creates Heartbeat indexes in your Elasticsearch cluster. You must also attach the ILM policy to the first of these indexes. This is the same policy that you attached to the index template in Step 3: Attach the ILM policy to an index template.

  1. In the Elasticsearch section of the Management page, click Index Management.
  2. In the Index management section, find your index and click its name.
  3. On the Summary tab of the pane that appears, choose Manage > Remove lifecycle policy to remove the default policy of Heartbeat.
    [Figure: Remove the default policy of Heartbeat]
  4. In the dialog box that appears, click Remove policy.
  5. Choose Manage > Add lifecycle policy.
  6. In the dialog box that appears, select the ILM policy you created in Step 2: Create an ILM policy from the Lifecycle policy drop-down list and set Index rollover alias to the alias that you specify in Step 3: Attach the ILM policy to an index template. Then, click Add policy.
    [Figure: Add policy]
    If the ILM policy is attached to the index, the information shown in the following figure appears.
    [Figure: The policy is attached]
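
For reference, the remove and add actions in this step correspond roughly to two REST requests: one that removes the current policy from the index and one that updates the lifecycle settings of the index. The following Python sketch only assembles and prints these requests and does not send them. The index name is taken from the naming example in Step 1, the policy name is spelled as in Step 2, and the helper name attach_policy_requests is hypothetical.

```python
import json


def attach_policy_requests(index: str, policy: str, alias: str) -> list:
    """(method, path, body) tuples that mirror the Remove and Add policy actions."""
    return [
        # Detach the policy that is currently attached to the index.
        ("POST", f"/{index}/_ilm/remove", None),
        # Point the index at the new policy and its rollover alias.
        ("PUT", f"/{index}/_settings", {
            "index.lifecycle.name": policy,
            "index.lifecycle.rollover_alias": alias,
        }),
    ]


for method, path, body in attach_policy_requests(
    "heartbeat-2020.04.29-000001", "hearbeat-policy", "heartbeat"
):
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```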

Step 5: View indexes in different phases

To view indexes in the hot phase, select Hot from the Lifecycle phase drop-down list in the Index management section.
[Figure: View indexes in the hot phase]

You can use this method to view indexes in other phases.
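
The same information is available from the ILM explain API, for example GET heartbeat-*/_ilm/explain, which reports the current phase of each managed index. The following Python sketch groups indexes by phase from such a response. The sample response is a hypothetical, heavily trimmed example, and the function name indexes_by_phase is an assumption.

```python
from collections import defaultdict


def indexes_by_phase(explain_response: dict) -> dict:
    """Group the names of ILM-managed indexes by their current lifecycle phase."""
    grouped = defaultdict(list)
    for name, info in explain_response.get("indices", {}).items():
        if info.get("managed"):
            grouped[info.get("phase", "unknown")].append(name)
    return {phase: sorted(names) for phase, names in grouped.items()}


# Hypothetical, trimmed response for illustration only.
sample = {
    "indices": {
        "heartbeat-2020.04.29-000001": {"managed": True, "phase": "warm"},
        "heartbeat-2020.04.29-000002": {"managed": True, "phase": "hot"},
        "some-other-index": {"managed": False},
    }
}
print(indexes_by_phase(sample))
```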

FAQ

Q: How do I configure a check interval for an ILM policy?

A: The system periodically checks for indexes that meet the conditions of an ILM policy and rolls over the indexes that it finds. The default check interval is 10 minutes. For example, you set max_docs to 100 when you create an ILM policy. In this case, a rollover is triggered only when a check finds that the number of documents in an index exceeds 100, so the rollover can occur up to one interval after the threshold is actually crossed. You can change the value of the indices.lifecycle.poll_interval parameter to control the check interval. This ensures that data in indexes is rolled over in a timely manner.
Notice Set this parameter to an appropriate value. A small value may cause node overload. In this example, this parameter is set to 1m.
PUT _cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval":"1m"
  }
}
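
The effect of the check interval can be estimated with a small Python sketch. It assumes that checks run at fixed multiples of the interval, starting at time 0, and that a rollover condition becomes true 130 seconds after a check. Both the timing model and the function name rollover_detected_at are simplifying assumptions, not a description of the internal scheduler.

```python
import math


def rollover_detected_at(condition_met_s: float, poll_interval_s: float) -> float:
    """Seconds from t=0 until the first check at or after the condition becomes true."""
    return math.ceil(condition_met_s / poll_interval_s) * poll_interval_s


# Suppose the index receives its 101st document 130 seconds after a check.
for label, interval in (("10m (default)", 600), ("1m", 60)):
    print(label, "->", rollover_detected_at(130, interval), "seconds")
```

Under these assumptions, the default interval delays the rollover until the 600-second mark, and a 1m interval triggers it at the 180-second mark.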