If you cannot identify and handle large keys or hotkeys in a timely manner when you use ApsaraDB for Redis, you may encounter performance degradation, deteriorated user experience, and large-scale failures. This topic describes the causes of large keys and hotkeys, the issues that may be caused by large keys and hotkeys, and how to identify and optimize large keys and hotkeys in a timely manner.

Definitions of large key and hotkey

Term Description
large key The size of a key and the number of members in the key determine whether the key is considered a large key. The following section lists some examples:
  • A key that is large in size is considered a large key, such as a string key that is 5 MB in size.
  • A key that has a large number of members is considered a large key, such as a ZSET key that has 10,000 members.
  • A key whose member data is large in size is considered a large key. For example, if a hash key has only 1,000 members but these members have a total size of 100 MB, the key is considered a large key.
hotkey The frequency at which a key is requested determines whether the key is considered a hotkey. The following section lists some examples:
  • A key that receives a large number of queries per second (QPS) is considered a hotkey. For example, if an ApsaraDB for Redis instance has a total QPS of 10,000 and one key in the instance receives 7,000 QPS, the key is considered a hotkey.
  • A key that consumes a large amount of bandwidth is considered a hotkey. For example, if a hash key that has thousands of members and is 1 MB in size receives a large number of HGETALL requests per second, the key is considered a hotkey.
  • A key that consumes a large amount of CPU time is considered a hotkey. For example, if a ZSET key that has tens of thousands of members receives a large number of ZRANGE requests per second, the key is considered a hotkey.
Note The numbers used in the preceding examples are for reference only. You must determine whether a key is a large key or a hotkey based on actual use scenarios of ApsaraDB for Redis.
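
The definitions above can be expressed as simple threshold checks. The following is a minimal sketch, not an official rule: the thresholds reuse the reference numbers from the examples, and the 50% QPS share, the KeyStats type, and the function names are hypothetical and should be tuned for your workload.

```python
from dataclasses import dataclass

# Illustrative thresholds taken from the examples in this topic; tune per workload.
LARGE_VALUE_BYTES = 5 * 1024 * 1024   # for example, a 5 MB string key
LARGE_MEMBER_COUNT = 10_000           # for example, a ZSET key with 10,000 members
HOT_QPS_SHARE = 0.5                   # hypothetical: one key takes half of the instance QPS


@dataclass
class KeyStats:
    name: str
    size_bytes: int     # total size of the value, including all members
    member_count: int   # 1 for a string key, element count for collections
    qps: float          # requests per second observed for this key


def is_large_key(stats: KeyStats) -> bool:
    """A key is large if either its total size or its member count is high."""
    return (stats.size_bytes >= LARGE_VALUE_BYTES
            or stats.member_count >= LARGE_MEMBER_COUNT)


def is_hot_key(stats: KeyStats, instance_qps: float) -> bool:
    """A key is hot if it receives a large share of the instance QPS."""
    if instance_qps <= 0:
        return False
    return stats.qps / instance_qps >= HOT_QPS_SHARE
```

With these thresholds, the hash key from the example (1,000 members, 100 MB in total) counts as large because of its total size, and a key that receives 7,000 of 10,000 QPS counts as hot.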

Issues caused by large keys and hotkeys

Category Description
Large key
  • The amount of time it takes for a client to run a command is longer.
  • An operation may be blocked, an important key may be evicted, or an out of memory (OOM) error may occur when the memory usage of an ApsaraDB for Redis instance reaches the upper limit specified by the maxmemory parameter.
  • The memory usage of a data shard in an ApsaraDB for Redis cluster instance exceeds that of other data shards, which results in imbalanced memory usage across data shards in the instance.
  • When a read request is made for a large key, the response time may increase and other services may be affected because the bandwidth of an ApsaraDB for Redis instance to which the key belongs is completely exhausted.
  • The primary database may be blocked for an extended period of time when a large key is being deleted. This may lead to a synchronization failure or a master-replica switchover.
Hotkey
  • Hotkeys take up large amounts of resources, which slows down the response time of other requests and degrades the performance of an ApsaraDB for Redis instance.
  • Skewed requests may take place for ApsaraDB for Redis cluster instances. A skewed request is a scenario where a data shard in an instance is receiving a large number of requests while other data shards in the instance remain idle. In a skewed request, the maximum number of connections to the data shard may be reached and as a result, new connections may be rejected.
  • In flash sales, overselling occurs if more requests for the key corresponding to a commodity are received than can be handled by ApsaraDB for Redis.
  • A cache breakdown occurs if more requests are made for a hotkey than can be handled by ApsaraDB for Redis. In this case, a large number of requests are directly sent to the backend storage and may cause a backend storage breakdown, which affects other business.

Causes for large keys and hotkeys

Large keys and hotkeys may occur due to a variety of reasons, such as incorrect use of ApsaraDB for Redis, insufficient workload planning, accumulation of invalid data, code failures, and traffic spikes.

  • Large key
    • Incorrect use of ApsaraDB for Redis: If ApsaraDB for Redis is used in an improper scenario, the size of a key may be larger than necessary. For example, if a string key is used to store a binary file that is large in size, the size of the key may be larger than necessary.
    • Insufficient workload planning: Workload planning is insufficient before a feature is released. For example, members are split improperly across keys, and as a result some keys hold far more members than intended.
    • Accumulation of invalid data: Invalid data is not deleted on a regular basis, which causes the number of members for a hash key to constantly increase.
    • Code failures: A bug in a consumer application that uses a list key prevents members from being consumed, so the number of members in the key only increases.
  • Hotkey
    • Unexpected traffic spikes: Unexpected traffic spikes may occur for a variety of reasons, such as the sudden popularity of a product or a news story, a large number of "likes" triggered by a streamer in a live-streaming room, or a battle between multiple large teams in the same area of a game.

Identify large keys and hotkeys

ApsaraDB for Redis provides a variety of methods for you to identify large keys and hotkeys.

Method Benefit and drawback Description
Use the real-time key statistics feature (recommended)
  • Benefits: This method has high precision and minimal impact on performance.
  • Drawbacks: The number of keys displayed is limited, but sufficient for most common scenarios.
You can use the real-time key statistics feature to display the statistics of large keys and hotkeys in an instance in real time and to query historical statistics generated within the last four days. The statistics include the memory usage and access frequency of keys, which you can use to troubleshoot issues and optimize operations.
Offline key analysis
  • Benefits: This method allows you to analyze historical backup files without affecting online services.
  • Drawbacks: This method has poor timeliness and takes longer to analyze large Redis Database (RDB) files.
The cache analysis feature allows you to analyze RDB backup files of ApsaraDB for Redis instances in a customized manner and identify large keys in these instances. You can view the statistics of keys in an instance, such as the memory usage, distribution, and expiration time of keys. You can use these statistics to optimize operations and prevent issues such as insufficient memory and performance degradation that are caused by improper distribution of keys.
Identify large keys and hotkeys by using the --bigkeys and --hotkeys options of redis-cli
  • Benefits: This method is convenient, fast, and secure.
  • Drawbacks: This method does not support customized analysis and provides limited precision and poor timeliness.
The --bigkeys option of redis-cli traverses all keys in an ApsaraDB for Redis instance by using SCAN and returns overall key statistics and the largest key of each data type. It covers six data types: string, list, hash, set, ZSET, and stream.
Note The --bigkeys option does not support customized analysis. For example, it cannot list all large string keys or find every hash key that has more than 10 members.

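As of Redis 4.0, redis-cli provides the --hotkeys option to help you quickly identify hotkeys. This option works only when the maxmemory-policy parameter is set to an LFU policy (volatile-lfu or allkeys-lfu). For more information, see Query hotkeys in Redis 4.0.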
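
For customized checks that the bigkeys scan cannot express, such as finding every hash key that has more than 10 members, a small script can walk the keyspace with SCAN. The following is a minimal sketch. It assumes that client is a redis-py style client (for example, redis.Redis(host=..., port=...)) and uses only its scan_iter, type, and hlen methods. The function name find_big_hashes and its parameters are hypothetical.

```python
from typing import Iterator, Tuple


def find_big_hashes(client, min_fields: int = 10,
                    scan_count: int = 100) -> Iterator[Tuple[str, int]]:
    """Yield (key, field_count) for hash keys with more than min_fields fields.

    The keyspace is walked incrementally with SCAN (via scan_iter), so the
    server is never blocked by a single long-running command.
    """
    for key in client.scan_iter(count=scan_count):
        key_type = client.type(key)
        # redis-py returns bytes unless decode_responses=True is set.
        if isinstance(key_type, bytes):
            key_type = key_type.decode()
        if key_type != "hash":
            continue
        field_count = client.hlen(key)
        if field_count > min_fields:
            yield key, field_count
```

Each key costs one TYPE call and, for hash keys, one HLEN call, so it is safer to run a scan like this against a replica node or during off-peak hours.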

Analyze a specified key by using built-in commands of ApsaraDB for Redis
  • Benefits: This method is convenient and has little impact on online services.
  • Drawbacks: The returned serialized length of a key is not equal to the actual length of the key in the storage. This method has limited precision and is for reference only.
The following section lists low-risk commands for analyzing keys of various data types to determine whether a key is a large key.
  • For a string key, you can run the STRLEN command. This command returns the length (number of bytes) of a string value stored at the key.
  • For a list key, you can run the LLEN command. This command returns the length of a list value stored at the key.
  • For a hash key, you can run the HLEN command. This command returns the number of members in the key.
  • For a set key, you can run the SCARD command. This command returns the number of members in the key.
  • For a ZSET key, you can run the ZCARD command. This command returns the number of members in the key.
  • For a stream key, you can run the XLEN command. This command returns the number of members in the key.
Note The DEBUG OBJECT and MEMORY USAGE commands consume large amounts of resources when they are run. Moreover, the time complexity of these commands is O(N), which indicates that these commands may block ApsaraDB for Redis instances. Therefore, we recommend that you do not use these commands.
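
The per-type commands listed above can be wrapped in one helper that first asks for the key type and then runs the matching length command. The following is a minimal sketch that assumes a redis-py style client. The helper name key_length and the LENGTH_COMMANDS table are hypothetical.

```python
# Maps the reply of TYPE to the redis-py method for the matching length command.
LENGTH_COMMANDS = {
    "string": "strlen",  # STRLEN: length of the value in bytes
    "list": "llen",      # LLEN: number of elements
    "hash": "hlen",      # HLEN: number of fields
    "set": "scard",      # SCARD: number of members
    "zset": "zcard",     # ZCARD: number of members
    "stream": "xlen",    # XLEN: number of entries
}


def key_length(client, key):
    """Return (type, length) for a key by using only low-risk commands.

    The length is None when the key does not exist (TYPE replies "none")
    or when the key has a type that is not in LENGTH_COMMANDS.
    """
    key_type = client.type(key)
    # redis-py returns bytes unless decode_responses=True is set.
    if isinstance(key_type, bytes):
        key_type = key_type.decode()
    method = LENGTH_COMMANDS.get(key_type)
    if method is None:
        return key_type, None
    return key_type, getattr(client, method)(key)
```

The returned value is a member count for collections and a byte count for strings, so a threshold must be chosen per data type.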
Identify hotkeys at the business layer
  • Benefits: This method can identify hotkeys in a timely manner with high precision.
  • Drawbacks: To implement this method, you must write and maintain additional business code, which increases complexity. Moreover, this method may degrade performance.
This method allows you to add the corresponding code to the business layer to record requests that are sent to ApsaraDB for Redis instances and asynchronously analyze the collected statistics.
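
One possible shape for this business-layer instrumentation is an in-process counter that every read passes through and that a background job drains periodically for analysis. The following is a minimal sketch, not a production design. The KeyAccessRecorder class and the tracked_get helper are hypothetical names, and client is assumed to be a redis-py style client.

```python
import threading
from collections import Counter


class KeyAccessRecorder:
    """Counts key accesses in process memory. Safe to share across threads."""

    def __init__(self):
        self._counts = Counter()
        self._lock = threading.Lock()

    def record(self, key):
        with self._lock:
            self._counts[key] += 1

    def drain(self, top_n=10):
        """Return the top_n (key, count) pairs since the last drain, then reset."""
        with self._lock:
            counts, self._counts = self._counts, Counter()
        return counts.most_common(top_n)


def tracked_get(client, recorder, key):
    """Record the access, then forward the read to the Redis client."""
    recorder.record(key)
    return client.get(key)
```

A scheduled task could call drain() every few seconds and ship the result to a monitoring system, which keeps the analysis off the request path. The counter still adds a lock acquisition to every request, which is the performance cost mentioned in the drawbacks.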
Identify large keys in a customized manner by using the redis-rdb-tools project
  • Benefits: This method supports customized analysis without affecting online services.
  • Drawbacks: This method has poor timeliness and takes longer to analyze large RDB files.
The redis-rdb-tools project is written in the Python programming language. redis-rdb-tools is an open source tool that can analyze RDB files in a customized manner. You can analyze the memory usage of all keys in an ApsaraDB for Redis instance, and query and analyze statistics of the keys in a fine-grained manner.
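
As an illustration of customized analysis, the following sketch filters a memory report that redis-rdb-tools has exported as a CSV file. It assumes that the report was generated with a command such as rdb -c memory dump.rdb > memory.csv and that it contains the columns type, key, and size_in_bytes. Verify the column names against the version of the tool that you use. The function name largest_keys is hypothetical.

```python
import csv


def largest_keys(rows, min_bytes=0, key_type=None, top_n=10):
    """Return the top_n largest rows of a memory report, optionally filtered.

    rows: an iterable of dicts, for example a csv.DictReader over the report.
    min_bytes: skip keys smaller than this size.
    key_type: keep only this data type (for example "string"), or None for all.
    """
    selected = []
    for row in rows:
        if int(row["size_in_bytes"]) < min_bytes:
            continue
        if key_type is not None and row["type"] != key_type:
            continue
        selected.append(row)
    selected.sort(key=lambda r: int(r["size_in_bytes"]), reverse=True)
    return selected[:top_n]


# Example usage (assumes memory.csv exists in the working directory):
#   with open("memory.csv", newline="") as f:
#       for row in largest_keys(csv.DictReader(f), key_type="string"):
#           print(row["key"], row["size_in_bytes"])
```

Because the script reads an offline export, it places no load on the instance, but the result is only as fresh as the RDB file.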
Identify hotkeys by using the MONITOR command
  • Benefits: This method is convenient and secure.
  • Drawbacks: This method consumes CPU, memory, and network resources and has poor timeliness and limited precision.
The MONITOR command that is available in ApsaraDB for Redis can display the statistics of all requests related to an instance, including statistics about time, clients, commands, and keys.

In case of an emergency, you can run the MONITOR command and export the output into a file. You can then analyze and classify the requests in the output to identify hotkeys generated during the emergency period after you disable the MONITOR command.

Note The MONITOR command significantly degrades the performance of ApsaraDB for Redis instances. We recommend that you use the MONITOR command only in special cases and stop it as soon as you have collected enough output.
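
The exported MONITOR output can be classified offline with a short script. The following is a minimal sketch that assumes the standard MONITOR line format (a timestamp, a bracketed database index and client address, then the quoted command and arguments). As a simplification, it treats the first argument of every command as the key, which holds for commands such as GET, HGETALL, and ZRANGE but not for every command. The function name count_keys is hypothetical.

```python
import re
from collections import Counter

# Example line: 1339518083.107412 [0 127.0.0.1:60866] "get" "foo"
LINE_RE = re.compile(r'^\d+\.\d+ \[\d+ [^\]]*\] (.*)$')
# A double-quoted token that may contain backslash escapes.
ARG_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def count_keys(lines, top_n=10):
    """Return the top_n (key, request_count) pairs found in MONITOR output."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a MONITOR record, for example the initial "OK"
        args = ARG_RE.findall(match.group(1))
        if len(args) < 2:
            continue  # command without arguments, for example PING
        counts[args[1]] += 1  # simplification: first argument is the key
    return counts.most_common(top_n)
```

To analyze a saved file, pass the open file object as lines, for example count_keys(open("monitor.log")), where monitor.log is a placeholder file name.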

Optimize large keys and hotkeys

Category Solution
Large key
  • Split large keys

    For example, you can split a hash key that contains tens of thousands of members into multiple hash keys that have proper numbers of members. For ApsaraDB for Redis cluster instances, you can split large keys to balance the memory usage across multiple data shards.

  • Delete large keys
    You can move data that is not suitable for ApsaraDB for Redis to another storage system and delete the data from ApsaraDB for Redis.
    Note
    • Redis 4.0 and later: You can run the UNLINK command to safely delete large keys or super large keys. UNLINK removes the key from the keyspace immediately and reclaims its memory asynchronously in a background thread, which prevents ApsaraDB for Redis from being blocked.
    • Earlier than Redis 4.0: You can run a SCAN-family command that matches the key type (such as HSCAN, SSCAN, or ZSCAN) to read a small batch of members, delete that batch, and repeat until the key is empty. To prevent ApsaraDB for Redis from being blocked, we recommend that you do not delete a large amount of data at a time.
  • Monitor the memory usage of ApsaraDB for Redis

    You can specify proper memory usage thresholds in the monitoring system so that ApsaraDB for Redis sends alerts. For example, you can specify 70% as the memory usage threshold and 20% as the threshold of memory usage increase over a one-hour period. This way, you are alerted before a problem becomes critical, such as when the number of members in a list key keeps growing because a consumer application has failed. For more information, see Alert settings.

  • Delete expired data on a regular basis
    The accumulation of expired data leads to large keys. For example, you may incrementally write a large amount of data to hash keys and ignore the timeliness of the data. You can use scheduled tasks to delete invalid data.
    Note To prevent ApsaraDB for Redis from being blocked when you delete invalid hash data, we recommend that you run the HSCAN and HDEL commands.
  • Use ApsaraDB for Redis Enhanced Edition (Tair)

    If you have a large number of hash keys and need to delete a large number of invalid members from some of them, scheduled tasks may not be able to delete the invalid members in a timely manner. In this case, you can use ApsaraDB for Redis Enhanced Edition (Tair).

    ApsaraDB for Redis Enhanced Edition (Tair) provides all the features of open source Redis and a variety of new and advanced features.

    ApsaraDB for Redis Enhanced Edition (Tair) provides the TairHash data structure. TairHash is a hash data type that allows an expiration time and a version number to be specified for each field. Like Redis Hash, TairHash provides a variety of data interfaces and high processing performance. However, Redis Hash allows an expiration time to be specified only for a key, not for individual fields. TairHash removes this limitation, which makes it more flexible and simplifies business development in most scenarios. In addition, TairHash uses an active expiration algorithm to check the expiration time of fields and delete expired fields. This process does not increase the database response time.

    The preceding advanced features can be used to simplify business code and reduce the O&M and troubleshooting workloads of ApsaraDB for Redis. For more information, see TairHash commands.
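
Two of the techniques in the list above, splitting a large hash key and deleting invalid hash members with HSCAN and HDEL, can be sketched as follows. This is an illustrative sketch under stated assumptions: client is a redis-py style client, the sub-key naming scheme (base key, colon, bucket number) is hypothetical, and the should_delete callback stands in for whatever rule marks a member as invalid in your application.

```python
import zlib


def shard_key(base_key: str, field: str, shards: int = 16) -> str:
    """Return the sub-key that holds `field` when `base_key` is split.

    CRC32 is used because it is stable across processes, unlike the
    built-in hash() function for strings.
    """
    bucket = zlib.crc32(field.encode("utf-8")) % shards
    return f"{base_key}:{bucket}"


def delete_hash_fields(client, key, should_delete, batch=100):
    """Remove matching fields from a hash in small HSCAN/HDEL batches.

    should_delete(field, value) decides whether a field is invalid.
    Returns the number of fields that were removed.
    """
    removed = 0
    cursor = 0
    while True:
        cursor, chunk = client.hscan(key, cursor=cursor, count=batch)
        doomed = [f for f, v in chunk.items() if should_delete(f, v)]
        if doomed:
            removed += client.hdel(key, *doomed)
        if cursor == 0:  # a cursor of 0 means the iteration is complete
            break
    return removed
```

With the split layout, a write such as HSET on the original key becomes HSET on shard_key(key, field), so every reader and writer must use the same shard count. Passing a callback that always returns True to delete_hash_fields empties a hash gradually, after which the key itself can be removed.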

Hotkey
  • Replicate hotkeys for ApsaraDB for Redis cluster instances
    Requests made for a hotkey in a data shard cannot be redistributed to other data shards because the minimum granularity of migration in a cluster instance is a single key. This results in a constant high workload on that data shard. In this case, you can replicate the hotkey to generate new keys that hold the same data and migrate the new keys to other data shards. For example, you can replicate a hotkey named foo to generate three identical keys named foo2, foo3, and foo4. Then, you can migrate foo2, foo3, and foo4 to other data shards to reduce the pressure on the single data shard.
    Note The drawback of this method is that you must modify the corresponding code and data inconsistency may occur because you must update multiple keys instead of one key. For this reason, we recommend that you consider this method only as a temporary solution.
  • Use a read/write splitting architecture

    If the accumulation of read requests causes hotkeys, you can change your instance into a read/write splitting instance to reduce the read pressure imposed on each data shard of the instance, or increase the number of replica nodes for the instance. However, the read/write splitting architecture increases the complexity of both the business code and the instance. You must provide server load balancing tools such as proxies and Linux Virtual Server (LVS) for multiple replica nodes and prepare to deal with the increased failure rate caused by a significant increase in the number of replica nodes. If you change your instance into a cluster instance, you may encounter bigger challenges in monitoring, O&M, and troubleshooting.

    In response to these challenges, ApsaraDB for Redis provides out-of-the-box solutions. You can modify your instance architecture as your needs evolve by making a configuration change, such as changing a master-replica instance into a read/write splitting instance, a read/write splitting instance into a cluster instance, or a Community Edition instance into an Enhanced Edition (Tair) instance that provides a large number of advanced features. For more information, see Change the configurations of an instance.

    Note The read/write splitting architecture also has its drawbacks. When a large number of requests are sent, replication latency between the master node and the replica nodes is unavoidable, and stale (dirty) data may be read from the replica nodes. Therefore, the read/write splitting architecture is not the optimal solution for scenarios that have high requirements for read and write capabilities and data consistency.
  • Use the proxy query cache feature of ApsaraDB for Redis Enhanced Edition (Tair)

    ApsaraDB for Redis uses effective sorting and statistical algorithms to identify hotkeys, which are keys that receive more than 3,000 queries per second (QPS). After you enable the proxy query cache feature, proxy nodes cache request and response data of hotkeys based on the rules you set. Proxy nodes cache only the request and response data of a hotkey, instead of the entire key. If a proxy node receives a duplicate request within the validity period of the cached data, the proxy node directly returns the response to the client without interacting with backend data shards. This improves the read speed, reduces the impact of hotkeys on the performance of data shards, and prevents skewed requests.

    After this feature is enabled for an ApsaraDB for Redis instance, duplicate requests from clients are directly sent to proxy nodes instead of backend data shards. The proxy nodes then return responses to the clients. Requests made for hotkeys can be processed by multiple proxy nodes instead of a single data node, which significantly reduces the hotkey workload on data nodes. The proxy query cache feature of ApsaraDB for Redis Enhanced Edition (Tair) also provides a variety of commands for you to query and manage proxy query cache. For example, you can run the querycache keys command to query all cached hotkeys and run the querycache listall command to query all cached commands. For more information, see Use proxy query cache to address issues caused by hotkeys.
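
The hotkey replication approach described earlier in this list can be sketched as a pair of helpers: writes go to every copy, and reads pick one copy at random. This is a minimal sketch under stated assumptions: client is a redis-py style client, the function names are hypothetical, and the copy names follow the foo2, foo3, foo4 example. In a cluster instance, the copies spread the load only if their names map to different data shards, which this naming scheme does not guarantee.

```python
import random


def replica_names(key, copies=3):
    """Return the original key plus its copies, e.g. foo, foo2, foo3, foo4."""
    return [key] + [f"{key}{i}" for i in range(2, copies + 2)]


def write_all(client, key, value, copies=3):
    """Write the value to the original key and to every copy.

    The writes are not atomic: a failure midway leaves the copies
    inconsistent, which is the drawback noted for this method.
    """
    for name in replica_names(key, copies):
        client.set(name, value)


def read_any(client, key, copies=3):
    """Read from a randomly chosen copy to spread the read load."""
    return client.get(random.choice(replica_names(key, copies)))
```

Because every update now touches several keys, this pattern fits read-heavy hotkeys that change rarely, and it is best treated as a temporary measure.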