Tair instances sit in the data layer, close to the application layer, so applications read from and write to them frequently. This consumes a large amount of bandwidth. The maximum bandwidth available to a Tair instance depends on the instance type. If an instance exceeds its maximum bandwidth, applications may be unable to access the data that resides on the instance.

Step 1: Analyze traffic usage

Check the traffic usage of a Tair instance within a specific period of time. For more information, see View monitoring data.

In this example, the inbound and outbound traffic usage both stay at 100%, as shown in the following figure.

Note
  • In most cases, if the average traffic usage stays at about 80%, the instance is at risk of exhausting its bandwidth. We recommend that you monitor the instance and troubleshoot the issue.
  • You must check the Intranet In Ratio and Intranet Out Ratio metrics, which indicate the inbound and outbound traffic usage, respectively.
Figure 1. Example of traffic usage
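
The 80% guideline in the preceding note can be expressed as a simple check. The following Python sketch assumes that you have already exported samples of the Intranet In Ratio and Intranet Out Ratio metrics as percentages. The function name, the sample values, and the choice to treat "at or above 80%" as the warning condition are illustrative assumptions and are not part of any Tair API.

```python
from statistics import mean

# Warning threshold in percent, taken from the 80% guideline in the note.
WARN_THRESHOLD = 80.0


def bandwidth_at_risk(in_ratio, out_ratio, threshold=WARN_THRESHOLD):
    """Report, per direction, whether the average usage reaches the threshold.

    in_ratio and out_ratio are hypothetical lists of percentage samples
    (0 to 100) exported from the two monitoring metrics.
    """
    result = {}
    for name, samples in (("inbound", in_ratio), ("outbound", out_ratio)):
        if not samples:
            raise ValueError(f"no samples for {name} traffic")
        # Assumption: "stays at about 80%" is modeled as average >= threshold.
        result[name] = mean(samples) >= threshold
    return result


if __name__ == "__main__":
    samples_in = [72.0, 85.5, 91.0, 88.0]   # made-up sample data
    samples_out = [40.0, 42.5, 39.0, 41.0]  # made-up sample data
    print(bandwidth_at_risk(samples_in, samples_out))
```

The two directions are evaluated separately because inbound and outbound traffic can saturate independently.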

Step 2: Optimize traffic usage

  1. Adjust the bandwidth of the Tair instance to reduce the impact on your business. This also provides you with more time to troubleshoot the issue. For more information, see Manually adjust the bandwidth of a Tair instance.
  2. The amount of user traffic may not match the expected bandwidth consumption. For example, the trend of traffic usage growth and the trend of queries per second (QPS) growth are inconsistent. In this case, use the offline key analysis feature to identify large keys on the Tair instance. For more information, see Offline key analysis.
    Optimize large keys, which are typically keys whose size exceeds 10 KB. For example, you can split large keys, reduce access to them, or delete the ones that you no longer need.
    Figure 2. Example of large key analysis
  3. For Tair DRAM-based instances that use the cluster architecture, enable proxy query cache to address heavy traffic or skewed requests that are caused by hotkeys. For more information, see Use the real-time key statistics feature and Use proxy query cache to address issues caused by hotkeys.
  4. Optional: Connect to cluster instances in direct connection mode to handle heavy network traffic. For more information, see Enable the direct connection mode.
    Note In direct connection mode, the bandwidth limit of a cluster instance is equal to the bandwidth limit of each data shard multiplied by the number of data shards. For example, if a cluster instance contains 128 data shards and the bandwidth limit of each data shard is 96 Mbit/s, the bandwidth limit of the cluster instance is 12,288 Mbit/s after you enable the direct connection mode.
  5. If the traffic usage is still high after you perform the preceding optimizations, upgrade your instance to an instance type that has more memory. An upgrade improves instance performance and allows the instance to handle more traffic. For more information, see Change the configurations of an instance.
    Note Before you upgrade your Tair instance, you can purchase a pay-as-you-go instance to test whether the upgrade specifications meet your workload requirements. You can release the pay-as-you-go instance after you complete the test. For more information, see Release pay-as-you-go instances.
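
Two of the preceding steps involve simple arithmetic that can be sketched in code: the 10 KB large-key guideline in step 2 and the direct connection mode bandwidth limit in the note for step 4. The following Python sketch is a minimal illustration. The function names and the sample key sizes are hypothetical, key sizes are assumed to be already known in bytes (for example, from an offline key analysis report), and 10 KB is assumed to mean 10 × 1,024 bytes.

```python
# Large-key threshold in bytes. Assumption: 10 KB = 10 * 1024 bytes.
LARGE_KEY_BYTES = 10 * 1024


def direct_mode_bandwidth_limit(shard_count, per_shard_mbit):
    """Bandwidth limit of a cluster instance in direct connection mode.

    Per the note in step 4: per-shard limit multiplied by the shard count.
    """
    if shard_count <= 0 or per_shard_mbit <= 0:
        raise ValueError("shard_count and per_shard_mbit must be positive")
    return shard_count * per_shard_mbit


def large_keys(key_sizes, threshold=LARGE_KEY_BYTES):
    """Return (key, size) pairs whose size exceeds threshold, largest first.

    key_sizes is a hypothetical mapping of key name to size in bytes.
    """
    hits = [(key, size) for key, size in key_sizes.items() if size > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # The example from the note: 128 shards at 96 Mbit/s each.
    print(direct_mode_bandwidth_limit(128, 96))  # 12288

    sizes = {"session:1": 512, "cart:42": 20480, "feed:7": 10241}  # made-up data
    print(large_keys(sizes))
```

Sorting the results by size puts the keys that are likely to consume the most bandwidth per access at the top of the list.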