aliyun-qos is a throttling plug-in developed by the Alibaba Cloud Elasticsearch team to improve cluster stability. It implements node-level read/write throttling and can lower the priority of a specified index when required. If you cannot implement throttling in your upstream services, especially for read requests, you can use the aliyun-qos plug-in to reduce the priorities of services based on the rules predefined by the plug-in.

Precautions

aliyun-qos is a built-in plug-in. By default, the throttling feature is disabled, and this plug-in cannot be removed. aliyun-qos is designed to improve cluster stability. It does not perform a precise measurement of the read and write traffic.
Note If your Elasticsearch cluster was created before the plug-in was released, you must install the plug-in on the plug-in configuration page. For more information, see Install and remove a built-in plug-in. The plug-in cannot be uninstalled after it is installed.

Evaluate thresholds

To keep read and write requests efficient, the aliyun-qos plug-in performs throttling on each node independently and does not precisely measure the read and write traffic across all nodes in the cluster. As a result, the threshold that you calculate may differ from the threshold that actually takes effect. Before you use the aliyun-qos plug-in, evaluate throttling thresholds as follows:

  • Query requests

    Throttling threshold for query requests = Number of limited query requests in a cluster / Number of client nodes or data nodes in the cluster

    For example, if a cluster has five client nodes and its query requests are limited to 1,000, the throttling threshold for query requests is 200.
    Notice The aliyun-qos plug-in does not synchronize query traffic between nodes, and 200 is only an approximate value. In actual situations, the query traffic is not evenly distributed between nodes, and the value needs to be adjusted.
  • Write requests

    The throttling threshold for write requests is calculated by using a similar method to that for query requests, and it needs to be adjusted based on the number of replicas.

    For example, a cluster has two data nodes and one index, the index has one shard and one replica, and each write request writes 10 MiB of data. In this case, 10 MiB of data is written to each data node for each request because the index has a replica. In addition, X-Pack Monitor, Audit, and Watcher tasks also generate write traffic. You must take these tasks into account when you set the throttling threshold.
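
The two estimates above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the plug-in: the function names are hypothetical, and the write estimate assumes that traffic is spread evenly across data nodes, which this topic notes is not guaranteed in practice.

```python
import math

MIB = 1024 * 1024


def per_node_query_threshold(cluster_query_limit: int, node_count: int) -> int:
    """Split a cluster-wide query limit evenly across the nodes that receive queries."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return cluster_query_limit // node_count


def per_node_write_bytes(payload_bytes: int, replicas: int, data_nodes: int) -> int:
    """Estimate the bytes written to each data node for one write request.

    Assumption: each replica stores a full copy of the payload and the copies
    are spread evenly over the data nodes.
    """
    if data_nodes <= 0 or replicas < 0:
        raise ValueError("data_nodes must be positive and replicas non-negative")
    return math.ceil(payload_bytes * (1 + replicas) / data_nodes)


# Five client nodes sharing a cluster-wide limit of 1,000 query requests.
print(per_node_query_threshold(1000, 5))             # 200
# Two data nodes, one replica, 10 MiB per write request.
print(per_node_write_bytes(10 * MIB, 1, 2) // MIB)   # 10
```

Treat the results as starting values and adjust them after you observe the actual traffic on each node.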

Enable throttling

  1. Log on to the Kibana console of your Elasticsearch cluster.
    For more information, see Log on to the Kibana console.
  2. In the left-side navigation pane, click Dev Tools.
  3. On the Console tab, run the following command to enable the throttling feature of aliyun-qos:
    PUT _cluster/settings
    {
       "transient" : {
          "apack.qos.ratelimit.enabled":"true"
       }
    }
    Note By default, the throttling feature of aliyun-qos is disabled.

    After you enable the throttling feature, proceed with the following operations.

Set QPS

You can use the search.index_patterns parameter to specify an index, or multiple indexes by using a wildcard, and use the search.max_times_sec parameter to set the maximum queries per second (QPS) for the matching indexes.
  • Set the QPS for a specific index
    PUT _qos/_ratelimit/<limitName>
    {
       "search.index_patterns" : "twitter",
       "search.max_times_sec" : 1000
    }
  • Set the QPS for indexes with a specified prefix
    PUT _qos/_ratelimit/<limitName>
    {
       "search.index_patterns" : "nginx-log-*",
       "search.max_times_sec" : 1000
    }
  • Set the QPS for all indexes
    PUT _qos/_ratelimit/<limitName>
    {
       "search.index_patterns" : "*",
       "search.max_times_sec" : 2000
    }
Note You can define multiple rules to trigger throttling. If a request hits one of the rules, throttling is triggered.
If the QPS exceeds the value specified by search.max_times_sec when you query data from your client or in the Kibana console, the system returns the following error message. In this case, reduce the QPS.
{
  "error": {
    "root_cause": [
      {
        "type": "rate_limited_exception",
        "reason": "request indices:data/read/search rejected, limited by [l1:t*:1.0]"
      }
    ],
    "type": "rate_limited_exception",
    "reason": "request indices:data/read/search rejected, limited by [l1:t*:1.0]"
  },
  "status": 429
}
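
A client can react to this rejection by waiting and retrying. The following Python sketch shows one way to do that with exponential backoff. It is not an official client feature: the names is_throttled and call_with_backoff are hypothetical, and send stands for any function of yours that issues the request and returns the HTTP status code and the parsed JSON body.

```python
import time

# Status codes that appear in the error responses shown in this topic.
THROTTLED_STATUSES = {429, 413}


def is_throttled(status, body):
    """Return True if the response looks like an aliyun-qos rejection."""
    error = body.get("error") if isinstance(body, dict) else None
    if not isinstance(error, dict):
        return False
    return status in THROTTLED_STATUSES and error.get("type") == "rate_limited_exception"


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it is no longer throttled or attempts run out.

    The wait doubles after each rejection: base_delay, 2 * base_delay, and so on.
    The last response is returned even if it is still a rejection.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if not is_throttled(status, body):
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return status, body
```

Retrying only delays the request. If rejections persist, lower the request rate on the client or raise the threshold in the rule.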

Set bulk.max_bytes_sec

You can set bulk.max_bytes_sec to limit the maximum number of bytes per second that client nodes receive and write for all bulk requests. For more information, see Bulk API.

PUT _qos/_ratelimit/<limitName>
{
   "bulk.max_bytes_sec" : 1000000
}
Note You can define multiple rules to trigger throttling. If a request hits one of the rules, throttling is triggered.
If the number of bytes written per second exceeds bulk.max_bytes_sec when you write data from your client or in the Kibana console, the system returns the following error message. In this case, reduce the write rate.
{
  "error": {
    "root_cause": [
      {
        "type": "rate_limited_exception",
        "reason": "request indices:data/write/bulk rejected, limited by [b2:ByteSizePreSeconds:992.0]"
      }
    ],
    "type": "rate_limited_exception",
    "reason": "request indices:data/write/bulk rejected, limited by [b2:ByteSizePreSeconds:992.0]"
  },
  "status": 413
}
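
To stay under this limit, a client can pace its own bulk traffic. The following Python sketch tracks the bytes sent in a fixed one-second window and reports how long to wait before the next request. The BytePacer class is hypothetical, and the fixed window is an assumption: this topic does not describe how the plug-in itself accounts for bytes, and the limit applies to each node, so treat the pacing as approximate.

```python
class BytePacer:
    """Track bytes sent in the current one-second window against a budget."""

    def __init__(self, max_bytes_sec):
        self.max_bytes_sec = max_bytes_sec
        self.window_start = 0.0
        self.sent_in_window = 0

    def wait_time(self, nbytes, now):
        """Record a send of nbytes at time now and return the seconds to wait first."""
        # Start a fresh window once the current one is at least a second old.
        if now - self.window_start >= 1.0:
            self.window_start = now
            self.sent_in_window = 0
        if self.sent_in_window + nbytes <= self.max_bytes_sec:
            self.sent_in_window += nbytes
            return 0.0
        # Budget exhausted: wait for the next window and charge the request to it.
        delay = self.window_start + 1.0 - now
        self.window_start += 1.0
        self.sent_in_window = nbytes
        return delay


pacer = BytePacer(max_bytes_sec=1_000_000)
print(pacer.wait_time(600_000, now=10.0))    # 0.0
print(pacer.wait_time(600_000, now=10.25))   # 0.75
```

In a real client, pass time.monotonic() as now and sleep for the returned delay before you send the request. A single request larger than the budget is still sent after the wait, so keep individual bulk requests smaller than the per-second limit.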

Set bulk.max_bytes_pre

You can set bulk.max_bytes_pre to limit the maximum number of bytes that client nodes receive and write for a single bulk request. For more information, see Bulk API.

PUT _qos/_ratelimit/<limitName>
{
   "bulk.max_bytes_pre" : 1000
}
Note You can define multiple rules to trigger throttling. If a request hits one of the rules, throttling is triggered.
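
If a bulk payload can exceed this limit, the client can split it into smaller requests before sending. The following Python sketch groups bulk items into batches that each stay within a byte budget. The function name batch_bulk_items is hypothetical, and the sketch assumes that the limit is compared against the UTF-8 size of the request body, which this topic does not state explicitly.

```python
def batch_bulk_items(items, max_bytes):
    """Yield bulk request bodies whose UTF-8 size does not exceed max_bytes.

    Each item is one complete bulk entry: the action line, the optional
    source line, and their trailing newlines.
    """
    batch, size = [], 0
    for item in items:
        item_size = len(item.encode("utf-8"))
        if item_size > max_bytes:
            raise ValueError("a single bulk item exceeds max_bytes")
        # Close the current batch if this item would push it over the budget.
        if batch and size + item_size > max_bytes:
            yield "".join(batch)
            batch, size = [], 0
        batch.append(item)
        size += item_size
    if batch:
        yield "".join(batch)


# Three 21-byte items with a 50-byte budget produce two request bodies.
items = ['{"index":{}}\n{"n":1}\n'] * 3
for body in batch_bulk_items(items, max_bytes=50):
    print(len(body.encode("utf-8")))
```

An item that is larger than the budget on its own cannot be split, so the sketch raises an error for it instead of sending a request that would be rejected.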

Obtain throttling rules

  • Obtain all throttling rules
    GET _qos/_ratelimit
  • Obtain a specified throttling rule
    GET _qos/_ratelimit/<limitName>
  • Obtain specified throttling rules
    GET _qos/_ratelimit/<limitName1,limitName2>

Delete a throttling rule

DELETE _qos/_ratelimit/<limitName>

Disable throttling

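Run either of the following commands. To explicitly disable the throttling feature, set the parameter to false:
PUT _cluster/settings
{
   "transient" : {
      "apack.qos.ratelimit.enabled":"false"
   }
}
Alternatively, set the parameter to null to remove the transient setting and restore the default value, which is disabled:
PUT _cluster/settings
{
   "transient" : {
      "apack.qos.ratelimit.enabled":null
   }
}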