aliyun-qos is a throttling plug-in developed by the Alibaba Cloud Elasticsearch team to improve cluster stability. The plug-in implements cluster-level read and write throttling and can reduce the priorities of specific indexes when required. You can use the aliyun-qos plug-in to reduce the priorities of services based on the rules predefined by the plug-in. This is helpful if you cannot implement throttling on your upstream services, especially for read requests.

Prerequisites

The aliyun-qos plug-in installed on your Elasticsearch cluster is upgraded to the latest version.

You can log on to the Kibana console of your Elasticsearch cluster and run the GET /_cat/plugins?v command to check the version of the plug-in.
The latest version of the plug-in for an Elasticsearch V7.10 cluster is 7.10.0_ali1.6.0.2. The latest version of the plug-in for an Elasticsearch cluster of another version is in the format of <Version of the Elasticsearch cluster>-rc4. If the plug-in is not of the latest version, you can use one of the following methods to upgrade the version of the plug-in:
  • Plug-in installed on an Elasticsearch V7.10 cluster: Update the kernel version to V1.6.0. For more information, see Upgrade the version of a cluster.
  • Plug-in installed on an Elasticsearch cluster of another version: Submit a ticket to contact Elasticsearch technical engineers to upgrade the version of the plug-in. After the version of the plug-in is upgraded, you must restart your Elasticsearch cluster for the change to take effect.
Note
  • If the version of the aliyun-qos plug-in is earlier than rc4, the system reports the unsupported_operation_exception error when you use the plug-in.
  • You can upgrade the version of the aliyun-qos plug-in that is installed on an Elasticsearch cluster of V6.7.0 or later. If you want to upgrade the version of the aliyun-qos plug-in that is installed on an Elasticsearch cluster of a version earlier than V6.7.0, you must upgrade the version of the cluster to V6.7.0 or later before you can upgrade the version of the aliyun-qos plug-in.

Precautions

  • aliyun-qos is a built-in plug-in and cannot be uninstalled. The throttling feature provided by this plug-in is disabled by default. This plug-in is designed to protect clusters and improve cluster stability. It does not precisely measure read or write traffic.
    Notice Before you use the plug-in, you can check whether the plug-in is installed on the Plug-ins page in the Elasticsearch console. If the plug-in is not installed, install it first. For more information, see Install and remove a built-in plug-in. The plug-in cannot be uninstalled after it is installed.
  • Before you upgrade aliyun-qos to the latest version, take note of the following items:
    • Due to differences between the implementation mechanisms of the throttling feature provided by the aliyun-qos plug-in of an earlier version and that of a later version, the throttling feature may be ineffective for a short period of time during the upgrade. The throttling feature recovers after the plug-in installed on the dedicated master node of the cluster is upgraded to the latest version.
    • When you upgrade the aliyun-qos plug-in to the latest version, some limiters configured in the plug-in may fail to be upgraded. If this occurs, run the following command to upgrade the limiters again. If the command reports an error, run it again until the value of hasError is false.
      POST /_qos/limiter/ops/upgrade
      Note If no results are returned after you run the preceding command, all limiters configured in the aliyun-qos plug-in are of the latest version.
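
The retry procedure described above can be sketched in Python as follows. This is a minimal sketch, not the plug-in's own tooling: `upgrade_limiters` and the `send_upgrade` callable are hypothetical names, and the sketch assumes that the response to POST /_qos/limiter/ops/upgrade is a JSON object with a top-level hasError field, which the source does not spell out.

```python
import time


def upgrade_limiters(send_upgrade, max_attempts=5, delay_seconds=1.0):
    """Call the upgrade command until it reports hasError == false.

    send_upgrade is a hypothetical callable that issues
    POST /_qos/limiter/ops/upgrade and returns the parsed JSON response
    (a dict), or None / an empty dict if nothing is returned.
    Returns True once the limiters are upgraded, False if every attempt failed.
    """
    for attempt in range(max_attempts):
        result = send_upgrade()
        # No result means all limiters are already of the latest version.
        if not result:
            return True
        # Assumption: hasError is a top-level boolean in the response.
        if result.get("hasError") is False:
            return True
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return False
```

In practice, `send_upgrade` would wrap whatever HTTP client you use to reach the cluster. Capping the number of attempts avoids an endless loop if the error persists.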

Evaluate thresholds

To ensure the processing efficiency of read and write requests, the aliyun-qos plug-in performs throttling on your Elasticsearch cluster but does not precisely measure the read or write traffic of all the nodes in the cluster. This may cause inconsistencies between the measured traffic and actual traffic. Before you use the aliyun-qos plug-in, you can evaluate throttling thresholds based on the following rules:

  • Query requests

    Throttling threshold for query requests = End-to-end queries per second (QPS) from a client to Elasticsearch

    Notice End-to-end QPS indicates the number of query requests that are sent to client nodes per second.
  • Write requests

    The method that is used to calculate the throttling threshold for write requests is similar to the method that is used to calculate the throttling threshold for query requests. However, you must adjust the calculated threshold for write requests based on the number of replica shards.

    For example, a cluster contains two data nodes and stores one index that has one primary shard and one replica shard, and 10 MB of data is written each time. Because the index has one replica shard, 10 MB of data is written to each data node each time, so the cluster processes 20 MB of write traffic for every 10 MB that the client sends. In addition, X-Pack Monitor, Audit, and Watcher tasks also generate write traffic. You must consider these tasks when you configure the throttling threshold.
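
The replica adjustment in the preceding example can be expressed as a small calculation. The following Python sketch is illustrative only: `cluster_write_volume_mb` is a hypothetical helper, and it deliberately ignores the extra write traffic from X-Pack Monitor, Audit, and Watcher tasks, which you still need to account for separately.

```python
def cluster_write_volume_mb(client_write_mb, replicas):
    """Estimate the write volume the cluster handles for one client write.

    Each document is written once to the primary shard and once to every
    replica shard, so the client volume is multiplied by (1 + replicas).
    """
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    return client_write_mb * (1 + replicas)


# One primary shard and one replica shard, 10 MB written by the client.
total_mb = cluster_write_volume_mb(10, 1)
print(total_mb)
```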

Enable throttling

The throttling feature provided by the aliyun-qos plug-in is disabled by default. You must enable the feature before you use the plug-in. The code used to enable the throttling feature varies based on the version of the aliyun-qos plug-in:
Note You can run all the commands provided in this topic in the Kibana console. For more information, see Log on to the Kibana console.
  • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
    PUT _cluster/settings
    {
      "persistent": {
        "apack.qos.limiter.enabled": true
      }
    }
  • Code for the plug-in installed on an Elasticsearch cluster of another version
    PUT _cluster/settings
    {
       "persistent" : {
          "apack.qos.ratelimit.enabled":"true"
       }
    }

Disable throttling

You can set the parameter related to the throttling feature to false or null to disable the feature. The code used to disable the throttling feature varies based on the version of the aliyun-qos plug-in:

  • Set the parameter related to the throttling feature to false
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      PUT _cluster/settings
      {
        "persistent": {
          "apack.qos.limiter.enabled": false
        }
      }
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      PUT _cluster/settings
      {
         "persistent" : {
            "apack.qos.ratelimit.enabled":"false"
         }
      }
  • Set the parameter related to the throttling feature to null
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      PUT _cluster/settings
      {
        "persistent": {
          "apack.qos.limiter.enabled": null
        }
      }
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      PUT _cluster/settings
      {
         "persistent" : {
            "apack.qos.ratelimit.enabled":null
         }
      }
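
If you script these settings changes, the version-specific differences in the setting name and value type are easy to get wrong. The following Python sketch builds the request body for either plug-in version. The function names `throttling_settings_body` and `to_console_request` are hypothetical. The setting names and the value shapes (a boolean for V7.10, a string for other versions, null to reset) are taken from the requests shown in this topic.

```python
import json

SETTING_V710 = "apack.qos.limiter.enabled"
SETTING_OTHER = "apack.qos.ratelimit.enabled"


def throttling_settings_body(is_v710, enabled):
    """Build the body of PUT _cluster/settings for the throttling switch.

    enabled may be True, False, or None (None resets the setting).
    """
    if is_v710:
        key, value = SETTING_V710, enabled
    else:
        # The examples for other versions pass "true"/"false" as strings.
        value = None if enabled is None else str(enabled).lower()
        key = SETTING_OTHER
    return {"persistent": {key: value}}


def to_console_request(body):
    """Render the body as a request that can be pasted into the Kibana console."""
    return "PUT _cluster/settings\n" + json.dumps(body, indent=2)


print(to_console_request(throttling_settings_body(True, False)))
```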

Configure a limiter for the aliyun-qos plug-in installed on an Elasticsearch V7.10 cluster

Note
  • This section is suitable only for the aliyun-qos plug-in that is installed on an Elasticsearch V7.10 cluster.
  • The configuration of a limiter contains two items: limiters and tags. The tags configuration item defines resources on which throttling is performed. The limiters configuration item defines the throttling type and throttling threshold.
  • Limiters are classified into common limiters and default limiters. You can configure ** in the tags item to implement the feature of a default limiter. For example, you can configure ** in the tags item to use a default limiter to perform throttling on the traffic transferred to each primary or replica shard or on queries per second (QPS) of each application.
  • For more information about the configuration example of a limiter, see Example for configuring a limiter.
PUT /_qos/limiter/<limiterName>
{
  "limiters": {
     ${action}.${limiter_type}:${threshold}
  },
  "tags": {
    ${tagName}:${tagValue}
  },
  "priority":0,               
  "params":{  
      "watchMode":true
  }
}
Parameter descriptions and valid values:
  • action: the type of operation on which throttling is performed. Valid values:
    • write: write operation for a document, including indexing or creating a document.
    • update: update operation for a document.
    • delete: deletion operation for a document.
    • search: query operation.
    • search_shards: query operation that is used to query the total number of primary and replica shards for an index.
  • limiter_type: the throttling type. Three categories of throttling types are supported: rate, number of concurrent threads, and single request. Valid values:
    • rate: the rate. If you set this parameter to rate, you can set the threshold parameter only to an integer.
    • qps: the QPS. If you set this parameter to qps, you can set the threshold parameter only to an integer.
    • tps: the transactions per second (TPS). If you set this parameter to tps, you can set the threshold parameter only to an integer.
    • throughput: the throughput. You can set this parameter to throughput only if you set the action parameter to write, update, or delete. If you set this parameter to throughput, the units supported by the threshold parameter include GB, MB, and KB, and the maximum value of the threshold parameter is 2 GB.
    • thread_count: the number of concurrent threads that can be used for a request. By default, one thread is used for one request.
    • concurrent_count: the number of concurrent threads that can be used for a request. If you set this parameter to concurrent_count, the value of the threshold parameter is calculated based on the operation specified in the request. For example, if you configure search_shards.concurrent_count:20 in the limiters configuration item, a maximum of 20 concurrent threads can be used to query the number of primary or replica shards for an index.
    • max_per_request: the maximum number of times that a type of operation is allowed to be performed in a single request. For example, if you configure update.max_per_request:10, a maximum of 10 update operations are allowed in a write request.
    • max_size_per_request: the maximum number of operations that are allowed in a single request. You can set this parameter to max_size_per_request only if you set the action parameter to write, update, or delete.
  • threshold: the throttling threshold. Valid values: an integer that is greater than or equal to -1.
    Note You can set this parameter to a string that contains a unit for some limiter types. For more information, see the description of the limiter_type parameter.
  • tagName: the tag key. Valid values:
    • node: the name of the current node.
    • is_master: specifies whether the current node is a dedicated master node. If you set the tagName parameter to this value, the value of the tagValue parameter is true or false.
    • index: the name of an index. If you want to specify multiple index names, set the tagValue parameter to an array that is composed of the index names. If you set the tagValue parameter to the alias of an index, the aliyun-qos plug-in parses the actual name of the index and performs throttling on the index based on the actual name. You can set the tagName parameter to this value only for a subrequest of the IndicesRequest type.
    • shard: the name of a primary or replica shard. If you set this parameter to shard, set the value of the tagValue parameter in the index[id] format, such as test[0]. You can set the tagName parameter to this value only for a subrequest of the ReplicationRequest type.
    • index_in_url: a string for the name of an index in a URL. If you set the tagValue parameter to the alias of an index, the aliyun-qos plug-in performs throttling on the index indicated by the alias. You can set the tagName parameter to this value only for a subrequest of the IndicesRequest type.
  • tagValue: the tag value. Valid values: a string or an array. If an array is specified, the throttling applies to the resources that match any element in the array. Exact match, fuzzy match, and all values are supported for the tag value. Examples:
    • Exact match: "abc"
    • Fuzzy match: "ab*"
    • All values: "**"
    Notice If you set the tagValue parameter to "**", a default limiter is used. This indicates that a limiter is generated for each resource that matches the tag. For example, if you configure index:"**",search.tps:1, the search speed of each index, rather than that of all indexes, is limited to 1 by default.
  • priority: the priority. Valid values: an integer. Default value: 0.
    Note A limiter with a higher priority is more likely to take effect. If multiple default limiters are hit, the limiter that has the highest priority takes effect.
  • params: the advanced parameters.
    • watchMode: specifies whether to enable the watch mode. Valid values: true and false. Default value: false. If you set the watchMode parameter to true, Elasticsearch only records the number of requests that are denied in the related metric but does not perform throttling. You can view the number of requests that are denied in a monitoring chart and check the throttling effect in advance. This prevents unexpected throttling effects caused by inappropriate configurations.
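
The parameter rules above can be checked on the client side before a limiter is submitted. The following Python sketch assembles the body of PUT /_qos/limiter/<limiterName> and validates the action and limiter_type combinations described in this section. `build_limiter_body` is a hypothetical helper, not part of the plug-in, and it checks only the rules stated above. The plug-in itself may enforce additional constraints.

```python
ACTIONS = {"write", "update", "delete", "search", "search_shards"}
LIMITER_TYPES = {
    "rate", "qps", "tps", "throughput",
    "thread_count", "concurrent_count",
    "max_per_request", "max_size_per_request",
}
# Limiter types that are allowed only for write, update, or delete actions.
WRITE_ONLY_TYPES = {"throughput", "max_size_per_request"}
WRITE_ACTIONS = {"write", "update", "delete"}


def build_limiter_body(limiters, tags=None, priority=0, watch_mode=False):
    """Validate "<action>.<limiter_type>" keys and build the limiter body."""
    for key in limiters:
        action, _, limiter_type = key.partition(".")
        if action not in ACTIONS:
            raise ValueError("unknown action: " + action)
        if limiter_type not in LIMITER_TYPES:
            raise ValueError("unknown limiter type: " + limiter_type)
        if limiter_type in WRITE_ONLY_TYPES and action not in WRITE_ACTIONS:
            raise ValueError(limiter_type + " requires write, update, or delete")
    body = {
        "limiters": dict(limiters),
        "priority": priority,
        "params": {"watchMode": watch_mode},
    }
    if tags:
        body["tags"] = dict(tags)
    return body


# Trial run: watch mode records denied requests without throttling them.
body = build_limiter_body({"search.qps": "1000"}, tags={"index": "twitter"}, watch_mode=True)
```

Starting with watch_mode=True lets you confirm in the monitoring chart that the threshold behaves as expected before you enforce it.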

Example for configuring a limiter

Configure throttling for QPS

You can specify a QPS threshold for an index to limit the number of query requests that a client node can receive per second. Complete index names and wildcards for index names are supported for the tag values that correspond to the tag keys index and index_patterns. The code varies based on the version of the aliyun-qos plug-in:

  • Configure throttling for the QPS of a specific index
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      PUT /_qos/limiter/<limiterName>
      {
        "limiters": {
          "search.qps": "1000"
        },
        "tags": {
          "index": "twitter"
        }
      }
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      PUT _qos/_ratelimit/<limiterName>
      {
         "search.index_patterns" : "twitter",
         "search.max_queries_per_sec" : 1000
      }
  • Configure throttling for the QPS of indexes whose names have a specified prefix
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      PUT /_qos/limiter/<limiterName>
      {
        "limiters": {
          "search.qps": "1000"
        },
        "tags": {
          "index": "nginx-log-*"
        }
      }
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      PUT _qos/_ratelimit/<limiterName>
      {
         "search.index_patterns" : "nginx-log-*",
         "search.max_queries_per_sec" : 1000
      }
  • Configure throttling for the QPS of any index
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      PUT /_qos/limiter/<limiterName>
      {
        "limiters": {
          "search.qps": "1000"
        },
        "tags": {
          "index": "**"
        }
      }
      Notice index:** indicates any index, and the threshold applies to each index separately. For example, an Elasticsearch cluster has three indexes: A, B, and C. The QPS threshold is 1,000 for each one of A, B, and C.
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      PUT _qos/_ratelimit/<limiterName>
      {
         "search.index_patterns" : "*",
         "search.max_queries_per_sec" : 1000
      }
  • Configure throttling for the total QPS of all indexes
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      PUT /_qos/limiter/<limiterName>
      {
        "limiters": {
          "search.qps": "1000"
        },
        "tags": {
          "index": "*"
        }
      }
      Notice index:* matches all indexes as a whole. For example, an Elasticsearch cluster has three indexes: A, B, and C. The total QPS threshold of A, B, and C is 1,000. You can omit the tags configuration item to achieve the same effect.
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      Not supported
Note You can define multiple rules to trigger throttling. If a request hits one of the rules, throttling is triggered.
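
The difference between index:"**" (one threshold per index) and index:"*" (one threshold shared by all indexes) can be illustrated with a toy model. The following Python sketch is not how the plug-in is implemented. `admitted_queries` is a hypothetical function that counts which requests would be admitted within a single one-second window under each tag value.

```python
from collections import defaultdict


def admitted_queries(requested_indexes, tag_value, threshold):
    """Count admitted queries per index within one second.

    requested_indexes lists the index that each incoming query targets.
    tag_value is "**" (a separate counter per index) or "*" (one shared counter).
    """
    counters = defaultdict(int)
    admitted = defaultdict(int)
    for index in requested_indexes:
        # "**" creates one bucket per index; "*" puts every index in one bucket.
        bucket = index if tag_value == "**" else "*"
        if counters[bucket] < threshold:
            counters[bucket] += 1
            admitted[index] += 1
    return dict(admitted)


queries = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
per_index = admitted_queries(queries, "**", threshold=2)
shared = admitted_queries(queries, "*", threshold=2)
```

With a threshold of 2, the per-index variant admits two queries for each of A, B, and C, whereas the shared variant admits only the first two queries overall.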
If your QPS exceeds a throttling threshold that you configured in the aliyun-qos plug-in when you use a client to query data or query data in the Kibana console of your Elasticsearch cluster, the system reports one of the following error messages. In this case, you must reduce your QPS. The error messages that are reported vary based on the version of the aliyun-qos plug-in.
  • Error message for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
    {
      "error": {
        "root_cause": [
          {
            "type": "status_exception",
            "reason": "search blocked, limited by [<limiterName>][search.qps](<limiterId>) threshold:[x]"
          }
        ],
        "type": "status_exception",
        "reason": "search blocked, limited by [<limiterName>][search.qps](<limiterId>) threshold:[x]"
      },
      "status": 429
    }
  • Error message for the plug-in installed on an Elasticsearch cluster of another version
    {
      "error": {
        "root_cause": [
          {
            "type": "rate_limited_exception",
            "reason": "request indices:data/read/search rejected, limited by [l1:t*:1.0]"
          }
        ],
        "type": "rate_limited_exception",
        "reason": "request indices:data/read/search rejected, limited by [l1:t*:1.0]"
      },
      "status": 429
    }
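
A client can recognize these responses and slow down instead of failing outright. The following Python sketch detects both error formats shown above and computes capped exponential backoff delays. `is_throttled` and `backoff_delays` are hypothetical helpers, and backing off is only one possible way to reduce your QPS. The check relies on the status code 429 and the phrase "limited by" that appears in both sample messages.

```python
def is_throttled(response):
    """Return True if a parsed JSON response looks like an aliyun-qos rejection."""
    error = response.get("error")
    if not isinstance(error, dict):
        return False
    # Both sample messages use HTTP status 429 and contain "limited by".
    return response.get("status") == 429 and "limited by" in error.get("reason", "")


def backoff_delays(retries, base=0.5, cap=8.0):
    """Delays in seconds before each retry: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


sample = {
    "error": {
        "type": "rate_limited_exception",
        "reason": "request indices:data/read/search rejected, limited by [l1:t*:1.0]",
    },
    "status": 429,
}
```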

Configure throttling for TPS

You can specify a TPS threshold for an index to limit the number of write requests that a client node can receive per second. Complete index names and wildcards for index names are supported for the tag values that correspond to the tag keys index and index_patterns. The code varies based on the version of the aliyun-qos plug-in:

  • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
    PUT /_qos/limiter/<limiterName>
    {
      "limiters": {
        "write.tps": "100000"
      },
      "tags": {
        "index": "nginx-log-*"
      }
    }
  • Code for the plug-in installed on an Elasticsearch cluster of another version
    Not supported

Configure throttling for the volume of data that can be written per second for all bulk requests

You can specify the total number of bytes that can be written for all bulk requests to limit the maximum number of bytes that a client node can receive per second. For more information about bulk requests, see Bulk API. Complete index names and wildcards for index names are supported for the tag values that correspond to the tag keys index and index_patterns. The code varies based on the version of the aliyun-qos plug-in:
  • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
    PUT /_qos/limiter/<limiterName>
    {
      "limiters": {
        "write.throughput": "100MB"
      },
      "tags": {
        "index": "nginx-log-*"
      }
    }
  • Code for the plug-in installed on an Elasticsearch cluster of another version
    PUT _qos/_ratelimit/<limiterName>
    {
       "bulk.index_patterns": "nginx-log-*",
       "bulk.max_throughput_in_bytes" : 104857600
    }
Note You can define multiple rules to trigger throttling. If a request hits one of the rules, throttling is triggered.
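
The two requests above express the same limit in different forms: "100MB" for V7.10 and 104857600 bytes for other versions. The following Python sketch converts a throughput string to bytes so that you can keep the two forms consistent. `throughput_to_bytes` is a hypothetical helper. It assumes binary units (1 MB = 1,048,576 bytes), which matches the pair of values above, and applies the 2 GB maximum stated for the throughput limiter type.

```python
UNIT_FACTORS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
MAX_THROUGHPUT_BYTES = 2 * 1024 ** 3  # 2 GB upper limit for throughput thresholds


def throughput_to_bytes(value):
    """Convert a threshold such as "100MB" to a number of bytes."""
    text = value.strip().upper()
    for unit, factor in UNIT_FACTORS.items():
        if text.endswith(unit):
            size = int(text[: -len(unit)]) * factor
            break
    else:
        raise ValueError("expected a value that ends with KB, MB, or GB")
    if size > MAX_THROUGHPUT_BYTES:
        raise ValueError("throughput thresholds cannot exceed 2 GB")
    return size


print(throughput_to_bytes("100MB"))
```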

Configure throttling for the volume of data that can be written per second for a bulk request

You can specify the maximum number of bytes that can be written for a bulk request to limit the maximum number of bytes that a client node can receive per second. For more information about bulk requests, see Bulk API. Complete index names and wildcards for index names are supported for the tag values that correspond to the tag keys index and index_patterns. The code varies based on the version of the aliyun-qos plug-in:
  • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
    PUT /_qos/limiter/<limiterName>
    {
      "limiters": {
        "write.max_size_per_request": "1000"
      },
      "tags": {
        "index": "nginx-log-*"
      }
    }
  • Code for the plug-in installed on an Elasticsearch cluster of another version
    PUT _qos/_ratelimit/<limiterName>
    {
       "bulk.index_patterns": "nginx-log-*",
       "bulk.max_request_size_in_bytes" : 1000
    }
Note You can define multiple rules to trigger throttling. If a request hits one of the rules, throttling is triggered.
When you use a client to write data or write data in the Kibana console of an Elasticsearch cluster, if the number of bytes to write in a bulk request exceeds the throttling threshold that you configured in the aliyun-qos plug-in installed on the cluster, the system reports one of the following error messages. In this case, you must reduce the number of bytes to write in a bulk request based on the error message. The error messages that are reported vary based on the version of the aliyun-qos plug-in.
  • Error message for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
    {
      "error" : {
        "root_cause" : [
          {
            "type" : "status_exception",
            "reason" : "write_size blocked, limited by [<limiterName>][write.max_size_per_request](<limiterId>) threshold:[x] try acquire [x]"
          }
        ],
        "type" : "status_exception",
        "reason" : "write_size blocked, limited by [<limiterName>][write.max_size_per_request](<limiterId>) threshold:[x] try acquire [x]"
      },
      "status" : 400
    }
  • Error message for the plug-in installed on an Elasticsearch cluster of another version
    {
      "error": {
        "root_cause": [
          {
            "type": "rate_limited_exception",
            "reason": "request indices:data/write/bulk rejected, limited by [b2:ByteSizePreSeconds:992.0]"
          }
        ],
        "type": "rate_limited_exception",
        "reason": "request indices:data/write/bulk rejected, limited by [b2:ByteSizePreSeconds:992.0]"
      },
      "status": 413
    }
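
One way to reduce the number of bytes in a bulk request is to split the payload into smaller requests on the client. The following Python sketch groups action/source line pairs into bulk bodies that each stay under a byte limit. `split_bulk` is a hypothetical helper, and it assumes that the threshold is compared with the UTF-8 size of the request body. The plug-in may measure the request size differently, so leave some headroom.

```python
def split_bulk(pairs, max_bytes):
    """Split (action_line, source_line) pairs into bulk bodies of at most max_bytes.

    Each pair is rendered as two newline-terminated lines, as the Bulk API expects.
    """
    chunks, current, current_size = [], [], 0
    for action_line, source_line in pairs:
        entry = action_line + "\n" + source_line + "\n"
        size = len(entry.encode("utf-8"))
        if size > max_bytes:
            raise ValueError("a single document exceeds the size limit")
        # Start a new request when adding this entry would exceed the limit.
        if current and current_size + size > max_bytes:
            chunks.append("".join(current))
            current, current_size = [], 0
        current.append(entry)
        current_size += size
    if current:
        chunks.append("".join(current))
    return chunks


docs = [('{"index":{}}', '{"a":1}')] * 5  # each pair is 21 bytes once rendered
bodies = split_bulk(docs, max_bytes=50)
```

Each element of `bodies` can then be sent as the body of a separate bulk request.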

Configure throttling for the number of concurrent threads used to query the number of primary or replica shards

You can specify a threshold for the number of concurrent threads used to query the number of primary or replica shards to reduce the load of your Elasticsearch cluster. Complete index names and wildcards for index names are supported for the tag values that correspond to the tag keys index and index_patterns. The code varies based on the version of the aliyun-qos plug-in:
  • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
    PUT /_qos/limiter/<limiterName>
    {
      "limiters": {
        "search_shards.concurrent_count": "10"
      },
      "tags": {
        "index": "nginx-log-*"
      }
    }
  • Code for the plug-in installed on an Elasticsearch cluster of another version
    Not supported
Note You can define multiple rules to trigger throttling. If a request hits one of the rules, throttling is triggered.

Configure multiple settings at a time for a limiter

You can configure multiple settings at a time for a limiter. Complete index names and wildcards for index names are supported for the tag values that correspond to the tag keys index and index_patterns. The code varies based on the version of the aliyun-qos plug-in:
  • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
    PUT /_qos/limiter/<limiterName>
    {
      "limiters": {
        "search.qps": "1000",
        "write.tps": "100000",
        "write.throughput": "1000000",
        "write.max_size_per_request": "1000",
        "search_shards.concurrent_count": "10"
      },
      "tags": {
        "index": "nginx-log-*"
      }
    }
  • Code for the plug-in installed on an Elasticsearch cluster of another version
    Not supported
Note You can define multiple rules to trigger throttling. If a request hits one of the rules, throttling is triggered.

Query limiters

The code used to query limiters varies based on the version of the aliyun-qos plug-in:

  • Query all limiters
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      GET _qos/limiter
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      GET _qos/_ratelimit
  • Query a specific limiter
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      GET _qos/limiter/<limiterName>
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      GET _qos/_ratelimit/<limiterName>
  • Query multiple limiters
    Note Separate multiple limiter names with commas (,). Wildcards are not supported.
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      GET _qos/limiter/<limiterName1,limiterName2>
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      GET _qos/_ratelimit/<limiterName1,limiterName2>

Delete limiters

The code used to delete limiters varies based on the version of the aliyun-qos plug-in:

  • Delete a specific limiter
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      DELETE _qos/limiter/<limiterName>
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      DELETE _qos/_ratelimit/<limiterName>
  • Delete multiple limiters
    Note Separate multiple limiter names with commas (,). Wildcards are not supported.
    • Code for the latest version of the plug-in installed on an Elasticsearch V7.10 cluster
      DELETE _qos/limiter/<limiterName1,limiterName2>
    • Code for the plug-in installed on an Elasticsearch cluster of another version
      DELETE _qos/_ratelimit/<limiterName1,limiterName2>
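
When you query or delete limiters from a script, the request path depends on the plug-in version and on how many limiter names you pass. The following Python sketch builds the path for both cases. `limiter_path` is a hypothetical helper. The endpoints and the comma-separated name format come from the requests shown above, and the wildcard check reflects the note that wildcards are not supported.

```python
def limiter_path(names=(), is_v710=True):
    """Build the path for GET or DELETE requests on one or more limiters."""
    base = "_qos/limiter" if is_v710 else "_qos/_ratelimit"
    for name in names:
        if not name or "*" in name or "," in name:
            raise ValueError("limiter names must be exact; wildcards are not supported")
    if not names:
        return base  # no names: addresses all limiters (query only)
    return base + "/" + ",".join(names)


print("GET " + limiter_path(["limiter1", "limiter2"]))
```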