Features and routing methods of proxy nodes - Tair (Redis® OSS-Compatible)

Tair (Redis OSS-compatible) cluster instances and read/write splitting instances use proxy nodes to route commands, balance loads, and perform failovers. Proxy nodes simplify client-side logic and support advanced features such as connections to multiple databases and hotkey data caching. Understanding how proxy nodes route commands and handle specific commands helps you design more efficient business systems.

Overview of proxy nodes

A proxy node is a component that runs in the standalone architecture in a Tair instance. Proxy nodes do not occupy resources of data shards. A Tair instance uses multiple proxy nodes to balance loads and perform failovers.

Feature	Description
Architecture transformation	Proxy nodes allow you to use a cluster instance in the same manner as you use a standard instance. Proxy nodes support multi-key operations across different hash slots for commands such as DEL, EXISTS, MGET, MSET, SDIFF, and UNLINK. For more information, see Limits on commands supported by cluster instances and read/write splitting instances. If your business requirements grow beyond the capabilities of a standard instance, you can migrate the data of the standard instance to a cluster instance that has proxy nodes without modifying your code, which reduces migration costs.
Load balancing and command routing	Proxy nodes establish persistent connections with backend data shards to balance loads and route commands. For more information, see Routing methods of proxy nodes.
Management of traffic to read replicas	Proxy nodes monitor the status of each read replica in real time and stop routing traffic to a read replica that is in one of the following states. Abnormal: If a read replica is abnormal, proxy nodes reduce the weight of traffic to the read replica. If connections to the read replica fail a specified number of times, proxy nodes stop routing traffic to the read replica until it returns to normal. Full data synchronization: If proxy nodes detect that a read replica is performing full data synchronization, proxy nodes stop routing traffic to the read replica until the synchronization is complete.
Hotkey data caching	After you enable the proxy query cache feature, proxy nodes cache the request and response data of hotkeys. If a proxy node receives the same request multiple times within the validity period of the cache, the proxy node returns the cached result directly to the client without interacting with backend data shards. This prevents data access skew caused by a large number of read requests for hotkeys. For more information, see Use proxy query cache to address issues caused by hotkeys. Note This feature is available only for Tair DRAM-based and persistent memory-optimized instances.
Support for multiple databases	In cluster mode, multiple databases are not supported by open source Redis or Redis Cluster clients. In this case, only the default database `0` can be used and the `SELECT` command is not supported. You can use proxy nodes to access cluster instances. In this case, multiple databases can be used and the `SELECT` command is supported. By default, 256 databases can be specified for a single cluster instance. Note If you use the StackExchange.Redis client, use StackExchange.Redis 2.7.20 or later. Otherwise, an error occurs. For more information, see StackExchange.Redis update notice.
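
The hotkey data caching behavior described in the table can be pictured with a small model. The following is a minimal sketch, not the actual proxy implementation: the class name `QueryCacheSketch`, the `backend` callable, and the `clock` parameter are all hypothetical, and the sketch caches every request, whereas the proxy query cache feature applies only to hotkeys.

```python
import time
from typing import Callable, Dict, Tuple

Request = Tuple[str, ...]  # for example ("GET", "hot:item:1")


class QueryCacheSketch:
    """Toy model of a request/response cache with a validity period.

    ASSUMPTION: this caches every request; the real feature targets hotkeys only.
    """

    def __init__(self, backend: Callable[[Request], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._backend = backend          # stands in for a backend data shard
        self._ttl = ttl_seconds          # validity period of a cached response
        self._clock = clock              # injectable so the sketch is testable
        self._entries: Dict[Request, Tuple[float, str]] = {}
        self.backend_calls = 0           # how often the "shard" was contacted

    def execute(self, request: Request) -> str:
        now = self._clock()
        cached = self._entries.get(request)
        if cached is not None and now < cached[0]:
            # Same request within the validity period: answer from the cache.
            return cached[1]
        # Cache miss or expired entry: ask the backend and remember the answer.
        self.backend_calls += 1
        response = self._backend(request)
        self._entries[request] = (now + self._ttl, response)
        return response
```

In this model, repeated identical requests inside the validity period never reach the backend, which is why a burst of reads for one hotkey does not concentrate load on a single data shard. It also shows the trade-off: a cached response can be stale until its validity period ends.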

Note

As the proxy technology evolves, the number of proxy nodes in a cluster instance does not solely determine the processing capacity of proxy nodes. Alibaba Cloud ensures that the allocation and configuration of proxy nodes meet the requirements stated in the specifications.

Routing methods of proxy nodes

Note

For information about commands, see Overview.

Architecture	Routing method	Description
Cluster architecture	Basic routing method	When proxy nodes receive a command that manages an individual key, the proxy nodes identify the hash slot to which the key belongs and route the command to the data shard where the hash slot resides. When proxy nodes receive a command that manages two or more keys stored in different shards, the proxy nodes split the command into multiple commands and route the commands to the corresponding shards.
Cluster architecture	Routing method for specific commands	Pub/Sub commands: When proxy nodes receive Pub/Sub commands such as PUBLISH and SUBSCRIBE, the proxy nodes hash the channel names and route the commands to the corresponding data shards. Note: To view the Pub/Sub monitoring data in a data shard, you can click the Data Node tab on the Performance Monitor page in the Tair console and select Pub/Sub Monitoring Group as the custom metrics. By default, the Pub/Sub data of the first data shard is displayed. For more information, see View performance monitoring data. Self-developed commands: When proxy nodes receive commands that are developed in-house by Alibaba Cloud, such as IINFO and ISCAN, and the data shard ID is specified by the idx parameter, the proxy nodes route the commands to the specified data shard. For more information, see In-house commands for ApsaraDB for Redis instances in proxy mode.
Read/write splitting architecture	Basic routing method	Proxy nodes route write commands to the master node. The system evenly distributes read requests among the master node and read replicas. You cannot change the weights of these nodes. For example, if you purchase an instance that has three read replicas, the weights of the master node and three read replicas are all 25%. Note SLOWLOG and DBSIZE are read commands.
Read/write splitting architecture	Routing method for specific commands	SCAN commands: When proxy nodes receive the HSCAN, SSCAN, or ZSCAN command, the proxy nodes calculate the slot in which the involved key resides, perform the modulo operation on the master node and all read replicas, and then route the command to the corresponding node. Self-developed commands: When proxy nodes receive commands that are developed in-house by Alibaba Cloud, such as RIINFO and RIMONITOR, the proxy nodes route the commands to the read replica specified by the ro_slave_idx parameter and the data shard specified by the idx parameter. For more information, see In-house commands for ApsaraDB for Redis instances in proxy mode. Other commands: Proxy nodes route transaction commands (such as MULTI and EXEC), Lua script commands (such as EVAL and EVALSHA), SCAN commands, INFO commands, and Pub/Sub commands (such as PUBLISH and SUBSCRIBE) to the master node.
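
The routing rules in the table can be illustrated with a short sketch. The slot calculation follows the open source Redis Cluster rule (CRC16 of the key, or of its hash tag, modulo 16384). Everything else is an assumption made for illustration and is not documented Tair behavior: the function names, the mapping of slots to shards in equal contiguous ranges, and the reading of the SCAN rule as "slot modulo the number of nodes" with index 0 as the master node.

```python
import binascii
from collections import defaultdict
from typing import Dict, List

SLOT_COUNT = 16384  # number of hash slots in Redis Cluster


def key_slot(key: str) -> int:
    """Hash slot of a key per the Redis Cluster rule: CRC16(key) mod 16384."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # non-empty hash tag: hash only the tag content
            data = data[start + 1:end]
    # crc_hqx with initial value 0 is the CRC16-CCITT (XMODEM) variant.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


def shard_for_slot(slot: int, shard_count: int) -> int:
    """ASSUMPTION: slots are spread over shards in equal contiguous ranges."""
    return slot * shard_count // SLOT_COUNT


def split_multi_key(keys: List[str], shard_count: int) -> Dict[int, List[str]]:
    """Group the keys of a multi-key command (such as MGET) by target shard."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for key in keys:
        groups[shard_for_slot(key_slot(key), shard_count)].append(key)
    return dict(groups)


def scan_node_index(key: str, replica_count: int) -> int:
    """ASSUMPTION: HSCAN/SSCAN/ZSCAN go to node (slot mod node count),
    where index 0 is the master node and 1..n are the read replicas."""
    return key_slot(key) % (replica_count + 1)
```

For example, `key_slot("foo")` evaluates to 12182, the same slot that `CLUSTER KEYSLOT foo` reports in open source Redis. Under the assumed range mapping, `split_multi_key(["foo", "hello"], 4)` places the two keys on different shards, so a single MGET for both would be split into two sub-commands.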

Number of connections

Typically, proxy nodes establish persistent connections with data shards to process requests. When the requests include one of the following commands, proxy nodes establish additional connections with data shards as needed. In this case, the maximum number of connections and the maximum number of new connections per second for the instance are subject to the maximum number of connections for a single data shard in direct connection mode. This is because the connections cannot be aggregated in this scenario. You can refer to your instance specifications to view the maximum number of connections for a single data shard. When you run the following commands, make sure that the number of connections to each data shard does not exceed the upper limit.

Note

In proxy mode, Community Edition instances allow up to 10,000 connections to each data shard, and Enhanced Edition instances allow up to 30,000 connections to each data shard.

Blocking commands: BRPOP, BRPOPLPUSH, BLPOP, BZPOPMAX, BZPOPMIN, BLMOVE, BLMPOP, and BZMPOP.
Transaction commands: MULTI, EXEC, and WATCH.
Monitoring commands: MONITOR, IMONITOR, and RIMONITOR.
Pub/Sub commands: SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, PUNSUBSCRIBE, SSUBSCRIBE, and SUNSUBSCRIBE.
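
A client-side guard can make these limits explicit in application code. The following is a minimal sketch under stated assumptions: the helper names and the edition keys `"community"` and `"enhanced"` are hypothetical, the command set is copied from the list above, and the limits are the per-shard values from the preceding note. Check your instance specifications for the actual limit.

```python
# Commands from the list above for which proxy nodes open additional
# connections to data shards instead of reusing persistent connections.
DEDICATED_CONNECTION_COMMANDS = frozenset({
    # Blocking commands
    "BRPOP", "BRPOPLPUSH", "BLPOP", "BZPOPMAX", "BZPOPMIN",
    "BLMOVE", "BLMPOP", "BZMPOP",
    # Transaction commands
    "MULTI", "EXEC", "WATCH",
    # Monitoring commands
    "MONITOR", "IMONITOR", "RIMONITOR",
    # Pub/Sub commands
    "SUBSCRIBE", "UNSUBSCRIBE", "PSUBSCRIBE", "PUNSUBSCRIBE",
    "SSUBSCRIBE", "SUNSUBSCRIBE",
})

# Per-shard connection limits in proxy mode, taken from the note above.
# The dictionary keys are hypothetical labels, not product identifiers.
MAX_CONNECTIONS_PER_SHARD = {"community": 10_000, "enhanced": 30_000}


def needs_dedicated_connection(command: str) -> bool:
    """True if the command is one of those listed above (case-insensitive)."""
    return command.upper() in DEDICATED_CONNECTION_COMMANDS


def within_shard_limit(edition: str, connections_to_shard: int) -> bool:
    """True if the planned connection count fits the per-shard limit."""
    return connections_to_shard <= MAX_CONNECTIONS_PER_SHARD[edition]
```

A capacity check before deployment might count how many client connections are expected to issue these commands against one data shard at the same time and pass that number to `within_shard_limit`.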

FAQ

Can I forward Lua scripts that perform only read operations to read replicas?
Yes, you can forward Lua scripts that perform only read operations to read replicas. However, the following requirements must be met:
- A read-only account is used. For more information, see Create and manage database accounts.
- The readonly_lua_route_ronode_enable parameter is set to 1 for your Tair instance. A value of 1 indicates that Lua scripts that perform only read operations are routed to read replicas. For more information, see Configure instance parameters.

What is the difference between the proxy mode and the direct connection mode? Which mode is recommended?
We recommend that you use the proxy mode. The following items describe the difference between the proxy mode and the direct connection mode:
- Proxy mode: Proxy nodes forward requests from clients to data shards. This mode provides features such as load balancing, read/write splitting, failover, proxy query cache, and persistent connections.
- Direct connection mode: You can use private endpoints to bypass proxy nodes and connect directly to the backend data shards of your instance, in the same way that you connect to an open source Redis cluster. Compared with the proxy mode, the direct connection mode removes the routing step and therefore shortens response times.

How does an abnormal data shard affect data reads and writes?

Each data shard runs in a high-availability master-replica architecture. If the master node fails, the system switches the workloads to the replica node to ensure high availability. The following table describes the impacts of abnormal data shards on data reads and writes in specific scenarios and provides optimization methods for each scenario.

Scenario	Impact and optimization
Figure 2. Commands that manage multiple keys	Impact: The client sends four requests over four connections. If Data Shard 2 is abnormal, timeout errors are returned for the requests that are routed to Data Shard 2. The queried data is returned only for Request 1 (GET Key1). Optimization methods: Reduce the usage frequency of the commands that manage multiple keys such as MGET or reduce the number of keys that are managed by a request. This ensures that not all requests fail simply because a single data shard is abnormal. Reduce the usage frequency of transaction commands or reduce the transaction size. This way, when a subtransaction fails, it does not cause the entire transaction to fail.
Figure 3. Single connection	Impact: The client sends two requests over the same connection. If Data Shard 2 is abnormal, timeout errors are returned for Request 1 (GET Key1) and Request 2 (GET Key2). In this example, Request 1 fails because it uses the same connection as Request 2. Optimization methods: Minimize the use of pipelines. Do not use clients that support only a single connection such as Lettuce. We recommend that you use clients that support connection pools such as Jedis. If you use Jedis, you must configure a reasonable timeout period and connection pool size. For more information about Jedis, see Use a client to connect to an instance.
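
The two scenarios in the table can be modeled in a few lines. The following is a simplified sketch, not a description of proxy internals: the function names and the key-to-shard mapping are hypothetical, and the model assumes, as the table describes, that a request times out when any of its keys is on an abnormal data shard and that requests sharing one connection fail together.

```python
from typing import Dict, List, Sequence, Set


def request_outcome(keys: Sequence[str], shard_of: Dict[str, int],
                    abnormal_shards: Set[int]) -> str:
    """A request succeeds only if none of its keys is on an abnormal shard."""
    if any(shard_of[key] in abnormal_shards for key in keys):
        return "timeout"
    return "ok"


def shared_connection_outcomes(requests: Sequence[Sequence[str]],
                               shard_of: Dict[str, int],
                               abnormal_shards: Set[int]) -> List[str]:
    """ASSUMPTION: on a single shared connection, one timed-out request
    causes every request on that connection to time out."""
    outcomes = [request_outcome(keys, shard_of, abnormal_shards)
                for keys in requests]
    if "timeout" in outcomes:
        return ["timeout"] * len(outcomes)
    return outcomes
```

With a hypothetical mapping `{"Key1": 1, "Key2": 2}` and Data Shard 2 abnormal, a separate `GET Key1` succeeds, a request that touches both keys times out, and sending `GET Key1` and `GET Key2` over one shared connection times out both. This is why the table recommends smaller multi-key requests and connection pools.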