Tair: Read/write splitting architecture

Last Updated: Mar 04, 2024

Tair provides the read/write splitting architecture for read-heavy workloads. The architecture delivers read/write splitting services with high availability, high performance, and flexibility. It lets a large number of clients concurrently read hot data from read replicas and minimizes O&M costs.

Components

A read/write splitting instance contains a master node, multiple read replicas, multiple proxy nodes, and a high availability (HA) system.

Figure 1. Cloud-native (cloud disk-based) read/write splitting instance

Figure 2. Classic (local disk-based) read/write splitting instance

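The following entries describe each component for cloud-native read/write splitting instances (recommended) and for classic read/write splitting instances. Unless an entry states otherwise, the description applies to both instance types.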

Master node

The master node processes all write requests. It also processes specific read requests together with read replicas.

Read replica

Cloud-native instances: Read replicas handle read requests and have the following characteristics:

  • All read replicas can be used as replica nodes to back up data and ensure disaster recovery.

  • Read replicas synchronize data directly from the master node by using star replication. As a result, the synchronization latency of cloud-native instances is far lower than that of classic instances, which use chained replication.

  • The number of read replicas can be specified. Valid values: 1 to 5.

  • Optimized binlog files are used to replicate data. This way, you do not need to perform full data synchronization.

Classic instances: Read replicas handle read requests and have the following characteristics:

  • Read replicas use chained replication. If your instance contains a large number of read replicas, the read replicas located at the end of the chain experience higher latency.

  • The number of read replicas can be set to 1, 3, or 5.

  • Optimized binlog files are used to replicate data. This way, you do not need to perform full data synchronization.

Replica node

Cloud-native instances: No replica nodes are provided. Read replicas are used as replica nodes. If the master node fails, requests are switched to a random read replica.

Because they do not include replica nodes, cloud-native read/write splitting instances cost less than classic read/write splitting instances that have the same specifications.

Classic instances: A replica node serves as a cold standby node to back up data and does not provide services. If the master node fails, requests are switched to the replica node.

Proxy node

When a client connects to a proxy node, the proxy node automatically identifies the type of each request and forwards the request to a node based on the node weights, which you cannot change. For example, write requests are forwarded to the master node, and read requests are forwarded to the master node and read replicas.

Note
  • Clients must connect to proxy nodes instead of other nodes.

  • The system evenly distributes read requests among the master node and read replicas. You cannot change the weights. For example, if you purchase an instance that has three read replicas, the master node and the three read replicas each have a weight of 25%.

HA system

The HA system monitors the status of each node. If the master node fails, the HA system performs a switchover between the master node and the replica node. If a read replica fails, the HA system creates another read replica to process read requests. During a switchover, the HA system updates the routing and weight information.
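The proxy routing rule described above can be illustrated with a minimal sketch. This is not Tair's proxy implementation: the node names, the `WRITE_COMMANDS` set, and the `read_weights` and `route` functions are assumptions made for illustration. The sketch only assumes what the note states, namely that writes go to the master node and reads are spread evenly across the master node and all read replicas.

```python
import random

# Hypothetical and deliberately incomplete set of write commands (illustration only).
WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE", "LPUSH", "HSET"}


def read_weights(replica_count):
    """Return the even read weight of each node (master plus read replicas)."""
    nodes = ["master"] + [f"replica-{i}" for i in range(1, replica_count + 1)]
    share = 1.0 / len(nodes)
    return {node: share for node in nodes}


def route(command, weights, rng=random):
    """Pick a target node: writes go to the master, reads follow the weights."""
    if command.upper() in WRITE_COMMANDS:
        return "master"
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


weights = read_weights(3)
print(weights)
# {'master': 0.25, 'replica-1': 0.25, 'replica-2': 0.25, 'replica-3': 0.25}
print(route("SET", weights))  # master
```

With three read replicas, each of the four nodes receives a read weight of 25%, which matches the example in the note.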

Benefits

  • Compatibility

    You can upgrade standard instances to read/write splitting instances that use proxy nodes to forward requests. After the upgrade, you can connect to the instances from any Redis-compatible client without modifying your application. Read/write splitting instances are fully compatible with Redis commands. For information about the limits on commands supported by read/write splitting instances, see Limits on commands supported by read/write splitting instances.

  • HA

    • Alibaba Cloud has developed an HA system for read/write splitting instances. The HA system monitors the status of all nodes of an instance to ensure HA. If the master node fails, the HA system switches the workloads from the master node to the replica node and updates the instance topology. If a read replica fails, the HA system creates another read replica. The HA system synchronizes data, forwards read requests to the new read replica, and suspends the failed read replica.

    • A proxy node monitors the status of each read replica in real time. If a read replica is unavailable due to an exception, the proxy node reduces the weight of this read replica. If connection attempts to a read replica fail a specified number of times, the system suspends the read replica and forwards read requests to the remaining available read replicas. The proxy node continues to monitor the status of the suspended read replica. After the read replica recovers, the proxy node adds it back to the list of available read replicas and resumes forwarding requests to it.

  • High performance

    The read/write splitting architecture supports chained replication. This allows you to scale out read replicas to increase the read capacity. The replication process is optimized based on the Redis source code to maximize workload stability during replication and make full use of the physical resources for each read replica.
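The proxy-side health monitoring described under HA above can be sketched as a small failure counter. This is an illustrative model, not the actual proxy logic: the `ReplicaHealth` class, its method names, and the `max_failures=3` threshold are assumptions (the documentation only says "a specified number of times"), and the weight reduction step is omitted for brevity.

```python
class ReplicaHealth:
    """Tracks consecutive connection failures for read replicas (illustrative)."""

    def __init__(self, replicas, max_failures=3):
        # max_failures is an assumed value, not a documented Tair setting.
        self.max_failures = max_failures
        self.failures = {name: 0 for name in replicas}
        self.suspended = set()

    def record_probe(self, name, ok):
        """Update the state of one replica after a health probe."""
        if ok:
            # A successful probe returns the replica to the available list.
            self.failures[name] = 0
            self.suspended.discard(name)
            return
        self.failures[name] += 1
        if self.failures[name] >= self.max_failures:
            self.suspended.add(name)

    def available(self):
        """Return the replicas that may currently receive read requests."""
        return [n for n in self.failures if n not in self.suspended]


health = ReplicaHealth(["replica-1", "replica-2"])
for _ in range(3):
    health.record_probe("replica-1", ok=False)
print(health.available())  # ['replica-2']
health.record_probe("replica-1", ok=True)
print(health.available())  # ['replica-1', 'replica-2']
```

The suspended replica keeps being probed, so a single successful probe is enough in this sketch to make it eligible for reads again.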

Scenarios

High QPS

Standard instances of Tair are not designed for high queries per second (QPS) scenarios. If your application is read-heavy, you can select a read/write splitting instance and deploy multiple read replicas to resolve performance bottlenecks caused by the single-node standard architecture. A read/write splitting instance can handle QPS that is up to five times that of a standard instance.

Note

Data is synchronized to read replicas with some latency, so a read replica may briefly return outdated values. Read/write splitting instances are therefore suitable for businesses that can tolerate a certain amount of stale data. In scenarios that require high data consistency, we recommend that you choose the cluster architecture.

Usage notes

  • If a read replica fails, requests are forwarded to other available read replicas. If all read replicas are unavailable, requests are forwarded to the master node. Read replica failures may result in increased workloads on the master node and an increased response time. To process a large number of read requests, we recommend that you use multiple read replicas.

  • If an error occurs on a read replica, the HA system suspends the read replica and creates another read replica. This process involves resource allocation, data synchronization, and service loading. The time required depends on the system workloads and data volume. Tair does not guarantee how long it takes to restore service on a replacement read replica.

  • Full data synchronization among read replicas is triggered in specific scenarios. For example, it can be triggered when a switchover occurs on the master node. During full data synchronization, read replicas are unavailable. If your requests are forwarded to the read replicas, the following error message is returned: -LOADING Redis is loading the dataset in memory\r\n.

  • For more information about routing methods, see Features of proxy nodes.
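Because read replicas can return the `-LOADING` error during full data synchronization, clients may want to retry such requests. The following is a minimal sketch of a retry wrapper with exponential backoff. The `LoadingError` class, the `call_with_retry` function, and the default attempt count and delay are assumptions for illustration. In a real application you would catch the exception that your client library raises for this reply (redis-py, for example, exposes `redis.exceptions.BusyLoadingError`).

```python
import time


class LoadingError(Exception):
    """Stand-in for the client library error raised on a -LOADING reply."""


def call_with_retry(operation, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying with exponential backoff while the node is loading."""
    for attempt in range(attempts):
        try:
            return operation()
        except LoadingError:
            if attempt == attempts - 1:
                raise  # Give up after the final attempt.
            sleep(base_delay * (2 ** attempt))


# Simulated read that fails twice with LOADING before succeeding.
replies = iter([LoadingError(), LoadingError(), "value"])


def fake_get():
    reply = next(replies)
    if isinstance(reply, Exception):
        raise reply
    return reply


print(call_with_retry(fake_get, sleep=lambda seconds: None))  # value
```

Keeping the number of attempts bounded matters here, because the duration of a full data synchronization depends on the data volume and is not guaranteed.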

Purchase methods

If you have created a cloud-native standard instance that uses cloud disks, you can enable the read/write splitting feature for the instance. For more information, see Enable read/write splitting.