Standard architecture - Tair - Alibaba Cloud Documentation Center

The standard architecture provides master-replica and standalone deployment models for different use cases. The master-replica deployment model provides high-performance caching services with high data reliability. The standalone deployment model is designed for cache-only scenarios and costs less.

Master-replica

Overview

In the master-replica deployment model, the master node handles your daily workloads and the replica node stays in hot standby mode to ensure high availability. If the master node fails, the high availability (HA) system switches the workloads to the replica node within 30 seconds of the failure. This mechanism keeps your workloads stable.
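
During a failover window of up to 30 seconds, a client may see connection errors. The following is a minimal sketch of how an application could retry through such a window. The helper name `call_with_retry`, its parameters, and the 45-second default budget are illustrative assumptions and are not part of Tair or of any client library. Set `retry_on` to the connection error types that your Redis client raises.

```python
import time


def call_with_retry(operation, max_wait_seconds=45.0, base_delay=0.5,
                    max_delay=5.0, retry_on=(ConnectionError, TimeoutError),
                    sleep=time.sleep, clock=time.monotonic):
    """Run operation() and retry on connection errors with exponential backoff.

    Hypothetical helper: the default budget of 45 seconds is an assumption
    chosen to be longer than the 30-second failover window.
    """
    deadline = clock() + max_wait_seconds
    delay = base_delay
    while True:
        try:
            return operation()
        except retry_on:
            # Give up if the next wait would exceed the retry budget.
            if clock() + delay > deadline:
                raise
            sleep(delay)
            # Double the wait after each failure, capped at max_delay.
            delay = min(delay * 2, max_delay)
```

With a real client, `operation` would be a small function that issues one command, so that each retry opens or reuses a connection to whichever node currently serves as the master.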

Architecture of a standard master-replica instance

Benefits

Reliability
- Service reliability
  The master and replica nodes are deployed on different physical machines. The master node serves your workloads. You can use the Redis CLI and common clients to add, delete, modify, and query data on the master node. If the master node fails, the in-house HA system performs a failover to ensure high availability for your workloads.
- Data reliability
  By default, data persistence is enabled for standard master-replica instances. Instances support data backup. You can clone or roll back an instance based on a specified backup set to restore data after accidental operations. Instances created in zones that provide disaster recovery, such as Hangzhou Zone H and Zone I, also support zone disaster recovery.
Compatibility
Standard instances are fully compatible with the Redis protocol. You can migrate your workloads from a self-managed Redis database to a standard master-replica instance of Tair without service disruption. You can also use Data Transmission Service (DTS) to migrate incremental data without service disruption.
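
Protocol compatibility means that any client that sends commands in the Redis serialization protocol (RESP) should work with a standard instance. As an illustration only, the following sketch shows how a command is framed on the wire as a RESP array of bulk strings. The function name `encode_command` and the key names are hypothetical. In practice, a Redis client library performs this encoding for you.

```python
def encode_command(*args):
    """Encode one command as a RESP array of bulk strings (illustrative)."""
    # Array header: "*<number of elements>\r\n"
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Bulk string: "$<length in bytes>\r\n<payload>\r\n"
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


# Commands that add, query, and delete a key.
set_cmd = encode_command("SET", "greeting", "hello")
get_cmd = encode_command("GET", "greeting")
del_cmd = encode_command("DEL", "greeting")
```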
Proprietary systems
- HA system
  Tair uses the HA system to detect failures on the master node, such as disk I/O failures and CPU failures, and performs failovers to ensure high availability.
- Master-replica replication mechanism
  Alibaba Cloud has customized the master-replica replication mechanism of Tair. Data is replicated between the master node and the replica node by using incremental logs. If the replication is interrupted, system performance and stability are not affected. This avoids the issues caused by the master-replica replication mechanism of native Redis databases.
  The master-replica replication mechanism of native Redis databases has the following issues:
  - If the replication is interrupted, the replica node runs the PSYNC command to resynchronize partial data. If the resynchronization fails, the master node synchronizes full data to the replica node as a Redis Database (RDB) file.
  - Because native Redis runs in single-threaded mode, the master node cannot serve requests while it starts a full replication. As a result, latency on the master node increases by several milliseconds to several seconds.
  - The master node creates a child process to generate the RDB file. Copy-on-write (COW) in the child process consumes memory on the master node. The master node may run out of memory and cause the application to exit abnormally.
  - The files that the master node generates for replication consume disk I/O and CPU resources.
  - The replication of GB-level files may lead to outbound traffic bursts on the server and increase the sequential I/O throughput of disks. This delays responses and causes more issues.
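
The difference between a partial and a full resynchronization in native Redis can be illustrated with a toy model. The sketch below is a simplification and is not Redis or Tair code: the class `ToyMaster` is hypothetical, offsets count writes instead of bytes, and the snapshot is a plain dictionary copy instead of an RDB file. It does not describe the customized replication mechanism of Tair, whose internals are not covered here.

```python
from collections import deque


class ToyMaster:
    """Toy master with a bounded replication backlog (illustrative only)."""

    def __init__(self, backlog_size):
        self.data = {}
        self.offset = 0  # number of writes applied so far
        # Keeps only the most recent writes; older entries are dropped.
        self.backlog = deque(maxlen=backlog_size)

    def write(self, key, value):
        self.offset += 1
        self.data[key] = value
        self.backlog.append((self.offset, key, value))

    def sync(self, replica_offset):
        """Return ("partial", missing_writes) or ("full", snapshot)."""
        oldest = self.backlog[0][0] if self.backlog else self.offset + 1
        if replica_offset + 1 >= oldest:
            # Every write the replica missed is still in the backlog.
            missing = [e for e in self.backlog if e[0] > replica_offset]
            return "partial", missing
        # The backlog no longer covers the gap, so the whole dataset is sent.
        return "full", dict(self.data)


master = ToyMaster(backlog_size=3)
for i in range(1, 6):
    master.write("k%d" % i, i)

short_gap = master.sync(replica_offset=3)  # replica missed writes 4 and 5
long_gap = master.sync(replica_offset=1)   # replica missed writes 2 to 5
```

In this model, the replica with the short gap receives only the two missing writes, and the replica with the long gap receives a copy of the entire dataset. The cost of the second case grows with the size of the dataset, which is the source of the issues listed above.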

Scenarios

High compatibility with the Redis protocol
Standard instances are fully compatible with the Redis protocol. You can migrate your workloads to standard instances without service disruption.
Persistent data storage on Tair
Standard instances support data persistence, backup, and recovery features to ensure data reliability.
Foreseeable performance pressure on a single Tair instance
Native Redis databases use the single-threading mode. Therefore, if your workloads require fewer than 100,000 queries per second (QPS), we recommend that you use a standard instance. If you require higher performance, use cluster instances.
Use of simple Tair commands and less use of sorting and computing commands
CPU performance is the main bottleneck due to the single-threading mode of Tair. We recommend that you use cluster instances if you want to process a large number of sorting and computing workloads.

Standalone

Overview

The standalone deployment model contains only a single data node. No replica nodes are provided to synchronize data in real time. Standard standalone instances are suitable for cache-only scenarios that do not require high data reliability. Standard standalone instances are provided at a relatively low price.

Warning

Before you choose standard standalone instances, you must understand that standard standalone instances do not ensure data reliability or service continuity. We recommend that you do not use standard standalone instances in the production environment.
A standard standalone instance has only a single data node and does not provide data persistence or backup and restoration. If the data node fails, the system creates a new Tair instance and migrates data and requests to the new instance. In this case, data loss may occur, and your application must warm up the cache again.
The following features are not supported for standard standalone instances: automatic or manual backup, offline key analysis, and the recycle bin. If your application requires data reliability, we recommend that you choose standard master-replica instances.
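
Because cached data can be lost when the data node is replaced, an application that uses a standalone instance should treat every read as a possible cache miss. The following is a minimal cache-aside sketch under that assumption. `InMemoryCache` is a hypothetical stand-in for a cache client that exposes `get(key)` and `set(key, value)`, and `load_from_source` represents your own function that reads from the system of record.

```python
class InMemoryCache:
    """Stand-in for a cache client with get(key) and set(key, value)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)  # None signals a cache miss

    def set(self, key, value):
        self._items[key] = value

    def clear(self):
        # Simulates the data node being replaced by an empty one.
        self._items.clear()


def get_with_cache(cache, key, load_from_source):
    """Return the cached value, or load it from the source and cache it."""
    value = cache.get(key)
    if value is None:
        value = load_from_source(key)
        cache.set(key, value)
    return value
```

With this pattern, the cache refills gradually as keys are requested after a node replacement, at the cost of extra load on the system of record until the cache is warm.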

Architecture of a standard standalone instance

Scenarios

Cache-only workloads
Standard standalone instances cannot ensure data reliability. Your application must re-cache data after the data node fails. If you want to handle workloads that require data reliability, use master-replica instances.
High compatibility with the Redis protocol
Standard standalone instances are fully compatible with the Redis protocol. You can migrate your workloads to standard standalone instances without service disruption.
Stable query rate on a single data node
Standard standalone instances run in the single-threaded model and have only a single CPU core. We recommend that you use standard standalone instances in scenarios with fewer than 80,000 QPS. If you require higher performance, use cluster instances.
Use of simple Redis commands and less use of sorting and computing commands
CPU performance is a main bottleneck of standard standalone instances due to their single-threaded model. If you want to process a large number of sorting and computing workloads, we recommend that you use cluster instances.
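
The guidance in the Scenarios sections can be summarized as a small decision helper. The sketch below encodes only the QPS thresholds and recommendations stated in this topic. The function name `suggest_instance_type` and its parameters are hypothetical, and the result is a starting point for evaluation, not a sizing guarantee.

```python
STANDALONE_QPS_LIMIT = 80_000        # threshold stated for standalone instances
MASTER_REPLICA_QPS_LIMIT = 100_000   # threshold stated for master-replica instances


def suggest_instance_type(peak_qps, needs_data_reliability,
                          heavy_sorting_or_computing=False):
    """Suggest an instance type based on the guidance in this topic."""
    # Many sorting and computing commands make the CPU the bottleneck.
    if heavy_sorting_or_computing:
        return "cluster"
    if needs_data_reliability:
        if peak_qps < MASTER_REPLICA_QPS_LIMIT:
            return "standard master-replica"
        return "cluster"
    # Cache-only workloads that can tolerate data loss.
    if peak_qps < STANDALONE_QPS_LIMIT:
        return "standard standalone"
    return "cluster"
```

For example, a cache-only workload with a peak of 50,000 QPS maps to a standard standalone instance, and the same workload with a data reliability requirement maps to a standard master-replica instance.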