
Tair (Redis® OSS-Compatible): Read/write splitting

Last Updated: Jan 26, 2025

Tair (Redis OSS-compatible) allows you to dynamically enable or disable the read/write splitting feature for read-heavy scenarios. This feature provides high-availability, high-performance, and flexible read/write splitting services that can meet your business requirements for centralized and highly concurrent reads of hot data. Read/write splitting instances use the proxy component developed by the Alibaba Cloud Tair team to automatically identify read and write requests, route requests appropriately, and handle failover. You do not need to handle business logic related to read/write splitting or consider handling failovers at the application layer, which significantly reduces access complexity.
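
For illustration, the following minimal sketch shows what client access can look like after read/write splitting is enabled. It uses redis-py as an example of a standard Redis client; the endpoint, port, and password are placeholders for your own instance information.

```python
# Minimal sketch: a standard Redis client accessing a read/write splitting
# instance through its proxy endpoint. The endpoint, port, and password are
# placeholders; replace them with the connection information of your instance.
import redis

client = redis.Redis(
    host="r-example.redis.rds.aliyuncs.com",  # placeholder proxy endpoint
    port=6379,
    password="your_password",                 # placeholder credentials
    decode_responses=True,
)

# The proxy node routes this write request to the master node.
client.set("hot:item:1001", "cached-value")

# The proxy node distributes this read request among the master node and read replicas.
print(client.get("hot:item:1001"))
```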

Enable read/write splitting for standard instances

A standard read/write splitting instance consists of a master node, multiple read replicas, a proxy node, and a high availability (HA) system, as shown in the following figures.

Figure 1. Cloud-native read/write splitting architecture


Figure 2. Classic read/write splitting architecture (retired)


The following descriptions cover each component. Where the cloud-native architecture (recommended) and the classic architecture differ, both are described.

Master node

The master node processes all write requests. It also processes specific read requests together with read replicas.

Read replica

Cloud-native read/write splitting instance (recommended): read replicas handle read requests and have the following features:

  • All read replicas can be used as replica nodes to back up data and ensure disaster recovery.

  • Read replicas synchronize data from the master node by using star replication. Therefore, the synchronization latency of cloud-native instances is far lower than that of classic instances, which use chained replication.

  • The number of read replicas in a read/write splitting instance is adjustable within the range of 1 to 9.

Classic read/write splitting instance: read replicas handle read requests and have the following features:

  • Read replicas use chained replication. If your instance contains a large number of read replicas, the read replicas at the end of the chain experience higher latency.

  • The number of read replicas can be set to 1, 3, or 5.

Replica node

Cloud-native read/write splitting instance (recommended): any read replica can serve as a replica node. When an exception occurs on the master node, the HA system selects the read replica that has the most complete data as the new master node and adds a new read replica immediately after the switchover is complete. Because no dedicated replica node is required, cloud-native read/write splitting instances cost less than classic read/write splitting instances that have the same specifications.

Classic read/write splitting instance: a replica node serves as a cold standby node that backs up data and does not provide services. If the master node fails, requests are switched to the replica node.

Proxy node

When a client connects to the proxy node, the proxy node automatically identifies the type of each request and distributes traffic accordingly: write requests are forwarded to the master node, and read requests are distributed among the master node and read replicas based on predefined weights. All data nodes have equal weights, and the weights cannot be changed.

Note
  • Clients must connect only to proxy nodes.

  • The system evenly distributes read requests among the master node and read replicas. You cannot change the weights. For example, if you purchase an instance that has three read replicas, the weights of the master node and three read replicas are all 25%.
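
As a rough illustration of how the fixed, equal weights in the note above translate into read capacity, the following sketch computes the per-node weight and an estimated aggregate read throughput. The per-node QPS value is an assumed placeholder, not a published specification.

```python
# Rough illustration of the fixed, equal read weights described above.
# per_node_qps is an assumed placeholder used only for the arithmetic;
# actual throughput depends on instance specifications and workload.
def estimate_read_capacity(read_replicas: int, per_node_qps: int = 100_000):
    nodes = 1 + read_replicas                 # master node + read replicas
    weight = 100 / nodes                      # equal weight per node, in percent
    total_read_qps = nodes * per_node_qps     # aggregate read capacity estimate
    return weight, total_read_qps

# With three read replicas, each of the four nodes receives a 25% weight.
weight, capacity = estimate_read_capacity(read_replicas=3)
print(f"weight per node: {weight:.0f}%, estimated read QPS: {capacity}")
```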

HA system

  • The HA system monitors the status of each node. If the master node fails, the HA system performs a master-replica switchover. If a read replica fails, the HA system creates another read replica to process read requests. During a switchover, the HA system updates the routing and weight information.

  • The logic for selecting a new master node when a failure occurs prioritizes data integrity and completeness. When the master node fails, the HA system selects the read replica that has the most complete and up-to-date data to serve as the new master node.

Description of read/write splitting instances in dual-zone deployment mode

Cloud-native read/write splitting instance (recommended): the primary and secondary zones provide services by using the following minimum configuration:

  • Primary zone: one master node and one read replica

  • Secondary zone: one read replica

Separate endpoints are available for the primary and secondary zones. Each endpoint supports both read and write operations. Read requests are routed to the master node or read replicas within the same zone from which the requests originate. This ensures that requests are served by the geographically closest nodes. Write requests are always routed to the master node in the primary zone. The following figure shows the architecture.

Figure: Dual-zone cloud-native read/write splitting architecture

Note

We recommend that you configure at least two nodes in each of the primary and secondary zones:

  • Primary zone: one master node and one read replica

  • Secondary zone: two read replicas

Classic read/write splitting instance: the master node and read replicas are deployed in the primary zone. Only the replica node is deployed in the secondary zone. A replica node serves as a cold standby node that backs up data and does not provide services. If the master node fails, requests are switched to the replica node.

Features:

  • On-demand availability and ease of use

    You can directly enable read/write splitting for standard instances. The read and write requests from clients are intelligently identified and forwarded by the proxy node. After you enable read/write splitting, you can use any Redis-compatible client to access the read/write splitting instance. This improves read performance without requiring changes to your business logic. Read/write splitting instances are fully compatible with Redis commands. However, the usage of specific commands is restricted. For more information, see Limits on commands supported by read/write splitting instances.

  • High availability

    • Alibaba Cloud developed an HA system for read/write splitting instances. The HA system monitors the status of all nodes of an instance to ensure high availability. If the master node fails, the HA system switches workloads from the master node to the replica node and updates the instance topology. If a read replica fails, the HA system suspends the faulty read replica, creates another read replica, synchronizes data to it, and then forwards read requests to the new read replica.

    • The proxy node monitors the status of each read replica in real time. If a read replica becomes unavailable due to an exception, the proxy node reduces the weight of this read replica. If a read replica fails to be connected a specified number of times, the system suspends the read replica and forwards read requests to available read replicas. The proxy node continues to monitor the status of the unavailable read replica. After the read replica recovers, the proxy node adds it back to the list of available read replicas and resumes forwarding requests to it.

  • High performance

    Read/write splitting instances can scale out read replicas to increase the read capacity. The replication process is optimized based on the Redis source code to maximize workload stability during replication and fully utilize the physical resources of each read replica.

Scenarios:

Read/write splitting is suitable for scenarios with high queries per second (QPS). If your business requires more reads than writes, the standard architecture may be unable to meet the QPS requirements. In this case, you can deploy multiple read replicas to resolve the performance bottleneck of a single node. After you enable read/write splitting for an instance, the QPS that the instance can handle can increase by up to nine times.

Note

Due to the asynchronous replication mechanism of Redis, data synchronization latency may occur when a large amount of data is written. Read/write splitting instances are therefore suitable for applications that can tolerate a certain amount of stale data.
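
Because replication is asynchronous, a value written to the master node may not be visible on a read replica immediately. The following sketch shows one generic way an application that cannot tolerate stale reads for a specific key might retry a read after a write. The endpoint, key name, and retry parameters are illustrative assumptions, not recommended values.

```python
# Illustrative read-after-write retry to tolerate replication lag. The proxy
# may route the read to a replica that has not yet received the write, so the
# value is re-read a few times before giving up. All values are placeholders.
import time
import redis

client = redis.Redis(host="r-example.redis.rds.aliyuncs.com", port=6379,
                     password="your_password", decode_responses=True)

def write_then_read(key: str, value: str, retries: int = 5, delay: float = 0.05):
    client.set(key, value)                    # routed to the master node
    for _ in range(retries):
        current = client.get(key)             # may hit a lagging read replica
        if current == value:
            return current                    # the routed node has caught up
        time.sleep(delay)                     # brief wait for replication
    return client.get(key)                    # best effort after the retry budget

print(write_then_read("order:42:status", "paid"))
```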

Enable read/write splitting for cluster instances

The read/write splitting feature can be enabled only for cloud-native cluster instances in proxy mode. The following figure shows the service architecture.

Figure: Read/write splitting architecture for cluster instances

The following table describes the components.

Component

Description

Proxy node

After a client connects to the proxy node, the proxy node automatically identifies the type of each request and forwards it to the appropriate node in each data shard: write requests are forwarded to the master node, and read requests are forwarded to the master node and read replicas.

Data shard

Each data shard consists of one master node and up to four read replicas.

  • The master node processes all write requests. It also processes specific read requests together with read replicas. The master node is deployed in the primary zone.

  • Read replicas process read requests and synchronize data from the master node by using star replication. The number of read replicas is adjustable within the range of 1 to 4. You can also deploy read replicas in the secondary zone to provide zone-level disaster recovery.

HA system

  • The HA system monitors the status of each node. If the master node fails, the HA system performs a master-replica switchover. If a read replica fails, the HA system creates another read replica to process read requests. During a switchover, the HA system updates the routing and weight information.

  • The logic for selecting a new master node when a failure occurs prioritizes data integrity and completeness. When the master node fails, the HA system selects the read replica that has the most complete and up-to-date data to serve as the new master node.

Note
  • If the instance is deployed in single-zone mode, the master node and read replicas are all deployed in the primary zone. Endpoints are available only for the primary zone.

  • If the instance is deployed in dual-zone mode, separate endpoints are available for the primary and secondary zones. Each endpoint supports both read and write operations. Write requests are routed to the master node in the primary zone. Read requests are routed to the master node or read replicas within the same zone from which the requests originated. This ensures that the requests are handled by the geographically closest nodes. If all read replicas in the secondary zone become unavailable, read requests from the secondary zone are routed to the master node without business interruption.
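
For dual-zone deployments, the following sketch illustrates zone-local access. Both endpoints are placeholders, and the routing described in the note above (zone-local reads, writes forwarded to the primary-zone master) is performed by the proxy node, not by the client.

```python
# Sketch of zone-local access in dual-zone mode. The endpoint and password
# are placeholders. An application deployed in the secondary zone connects to
# the secondary-zone endpoint; the proxy node serves reads from nodes in that
# zone and forwards writes to the master node in the primary zone.
import redis

secondary_zone_client = redis.Redis(
    host="r-example-secondary.redis.rds.aliyuncs.com",  # placeholder endpoint
    port=6379,
    password="your_password",
    decode_responses=True,
)

secondary_zone_client.set("session:abc", "active")  # forwarded to the primary-zone master
print(secondary_zone_client.get("session:abc"))     # served by nodes in the local zone
```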

Usage notes

  • If a read replica fails, requests are forwarded to other available read replicas. If all read replicas are unavailable, requests are forwarded to the master node. Read replica failures may result in increased workloads on the master node and prolonged response time. To process a large number of read requests, we recommend that you use multiple read replicas.

  • If an error occurs on a read replica, the HA system suspends the read replica and creates another read replica. This process involves resource allocation, data synchronization, and service loading. The amount of time required for a switchover depends on the system workloads and data volume. ApsaraDB for Redis does not guarantee a specific amount of time for data restoration on read replicas.

  • Full data synchronization on read replicas is triggered in specific scenarios, for example, when a switchover occurs on the master node. During full data synchronization, read replicas are unavailable. If your requests are forwarded to the read replicas, the following error message is returned: -LOADING Redis is loading the dataset in memory\r\n. A generic retry sketch is provided after this list.

  • For more information about routing methods, see Features of proxy nodes.
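
As referenced in the usage notes above, a client may transiently receive the -LOADING error while a read replica reloads its dataset, and may briefly lose its connection during a master-replica switchover. The following generic retry wrapper is a sketch of how an application might absorb such transient errors; it is not an official SDK feature, and the endpoint and retry parameters are illustrative.

```python
# Generic retry wrapper for the transient errors mentioned in the usage notes:
# -LOADING responses while a read replica reloads data, and brief connection
# errors during a master-replica switchover. All parameters are placeholders.
import time
import redis

client = redis.Redis(host="r-example.redis.rds.aliyuncs.com", port=6379,
                     password="your_password", decode_responses=True)

def with_retries(operation, retries: int = 3, delay: float = 0.2):
    for _ in range(retries):
        try:
            return operation()
        except redis.exceptions.BusyLoadingError:
            time.sleep(delay)   # the replica is still loading its dataset
        except redis.exceptions.ConnectionError:
            time.sleep(delay)   # e.g. a transient disconnect during a switchover
    return operation()          # final attempt; let any error propagate

print(with_retries(lambda: client.get("hot:item:1001")))
```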

References

  • If you have not created an instance, you can enable read/write splitting when you create one. For more information, see Step 1: Create an instance.

  • If you have already created a cloud-native instance, you can enable read/write splitting for the instance. For more information, see Enable read/write splitting.