
Cluster comparison among Redis 4.0, Codis, and ApsaraDB for Redis

Last Updated: Jan 30, 2018

Architecture comparison

Redis 4.0 cluster

The Redis 4.0 cluster adopts a decentralized structure. The metadata of the cluster is distributed across the nodes, and master-slave switchover relies on master election negotiated among multiple nodes. Redis provides the redis-trib tool for cluster deployment and O&M operations.

Access from the client to the sharded database nodes depends on a smart client, that is, the client must compute and select the route based on the node information returned by Redis. For example, if a client sends a request to a node and the requested key is not located on that node, the client must evaluate the returned MOVED or ASK reply and redirect the request to the corresponding node.
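For reference, a minimal sketch of what this looks like from the application side, using redis-py's cluster client (redis-py 4.1 or later; the node address is a placeholder). The library computes each key's slot, routes the command to the owning node, and follows MOVED/ASK replies internally:

```python
from redis.cluster import RedisCluster

# Any reachable node of the cluster works as the startup node (placeholder address).
rc = RedisCluster(host="10.0.0.1", port=6379)
rc.set("user:1001", "alice")      # routed to the master that owns the key's slot
print(rc.get("user:1001"))        # a resharded slot is followed via MOVED internally
```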

Codis cluster

  • Codis is composed of three major components:

    • Codis-server: a Redis database with the source code modified. It supports slots, resizing, and migration.

    • Codis-proxy: multi-threaded, with a kernel written in Go.

    • Codis Dashboard: the cluster manager.

  • Codis provides a web-based graphical interface for managing the cluster.

  • The metadata of the cluster is stored in ZooKeeper or etcd.

  • An independent component codis-ha is provided to take charge of the master-slave switchover of Redis nodes.

  • Because Codis is proxy-based, the client is not aware of any route table changes. The client needs to run the list proxy command on the Codis dashboard to get the list of all proxies, and then decides which proxy node to access according to its own round-robin policy to achieve load balancing.
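A minimal client-side sketch of such a round-robin policy, in Python with redis-py; the proxy addresses are placeholders and would in practice come from the dashboard's list proxy output:

```python
import itertools
import redis

# Assumed proxy endpoints; in practice they come from the dashboard's "list proxy" output.
proxy_addrs = ["10.0.0.11:19000", "10.0.0.12:19000"]
proxies = itertools.cycle(
    [redis.Redis(host=h, port=int(p)) for h, p in (a.split(":") for a in proxy_addrs)]
)

def execute(command, *args):
    # Each call goes to the next proxy in turn; any proxy can serve any key,
    # so this is pure load balancing, not routing.
    return next(proxies).execute_command(command, *args)

execute("SET", "user:1001", "alice")
print(execute("GET", "user:1001"))
```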

ApsaraDB for Redis

The architecture diagram of the ApsaraDB for Redis cluster is as follows:

Structural diagram

  • The cluster version of ApsaraDB for Redis is composed of three major components:

    • Redis-config: a cluster management tool with a dual-node structure that supports disaster recovery.

    • Redis-server: a Redis database with optimized source code. It supports slots, resizing, and migration.

    • Redis-proxy: single-threaded and stateless, with a kernel written in C++14. A cluster can mount multiple proxy nodes depending on the cluster type.

  • The metadata of the cluster is stored in meta databases.

  • An independent HA component is provided to take charge of the master-slave switchover of the cluster.

  • ApsaraDB for Redis clusters are also proxy-based. Users are unaware of the route information, and a VIP is provided for client access. The client needs only one connection address and does not need to care about load balancing across proxies.
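A minimal sketch of client access under this model, assuming redis-py; the endpoint and password are placeholders:

```python
import redis

# Single connection address (the instance's endpoint/VIP); the proxy layer
# behind it handles routing and load balancing.
r = redis.Redis(host="r-example.redis.rds.aliyuncs.com", port=6379,
                password="your-password")
r.set("user:1001", "alice")
print(r.get("user:1001"))
```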

Performance comparison

Stress testing environment

The three Redis clusters above are each set up on three physical machines. Each physical machine is configured with a Gigabit NIC, a 24-core CPU, and 189 GB of memory. The three physical machines run the stress testing tool memtier_benchmark, the Codis proxy/Alibaba Cloud proxy, and the Redis server, respectively. The Redis server uses the Redis kernel shipped with each cluster.

The key size is fixed at 32 bytes and the ratio of set to get operations at 1:10. Each thread has 16 clients. The stress testing ran for five minutes in each of the 8-thread, 16-thread, 32-thread, 48-thread, and 64-thread scenarios.

The Redis 4.0 cluster requires the client to connect to additional nodes as needed, which memtier_benchmark does not support. As a result, a hashtag is used for the stress testing of Redis 4.0 so that all requests fall on a single node.

Every cluster has eight master databases and eight slave databases, with AOF enabled. The minimum buffer for AOF rewrite is 64 MB.

The subjects of the stress testing are a single Redis 4.0 node, a single Alibaba Cloud Redis-proxy, a single-core Codis-proxy, and an 8-core Codis-proxy. Codis uses Go 1.7.4.
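For reference, a sketch of how one such run could be launched from Python, assuming memtier_benchmark is installed on the client machine and that the 32-byte size maps to the --data-size payload option; the server address is a placeholder:

```python
import subprocess

for threads in (8, 16, 32, 48, 64):
    subprocess.run([
        "memtier_benchmark",
        "-s", "10.0.0.1", "-p", "6379",   # proxy or node address (placeholder)
        "--threads", str(threads),
        "--clients", "16",                # 16 clients per thread
        "--ratio", "1:10",                # set:get ratio
        "--data-size", "32",              # 32-byte payload
        "--test-time", "300",             # five minutes per scenario
    ], check=True)
```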

The stress testing result is as follows.

Stress testing result

We can see that the single-core Codis-proxy delivers the poorest performance. The stress testing for the 8-core Codis-proxy did not use a hashtag for the keys, which is equivalent to scattering the requests across the eight backend database nodes, or to running eight Alibaba Cloud Redis-proxies, so its performance figures are naturally higher.

The performance of the single-core Alibaba Cloud Redis-proxy approaches that of a native Redis database node when the load is high enough.

In a practical production environment, the client must implement the cluster protocol to use the native Redis cluster, that is, parse MOVED, ASK, and other replies and redirect requests to the corresponding node. Random accesses to a key may require two operations, so the performance is not the same as that of a single node.

Comparison of supported features

Comparison of supported protocols

Feature       Redis 4.0 cluster                Codis cluster    Alibaba Cloud Redis cluster
Transaction   Supported within the same slot   Not supported    Supported within the same slot
sub/pub       Supported within the same slot   Not supported    Supported
flushall      Supported                        Not supported    Supported
select        Not supported                    Not supported    Supported
mset/mget     Supported within the same slot   Supported        Supported

Comparison of horizontal scaling

Redis 4.0, Codis, and ApsaraDB for Redis distributed clusters all implement slot-oriented management. The smallest scaling unit is the slot.

The essence of horizontal scaling in distributed clusters is the management of the cluster nodes' routing information and the migration of data. The smallest unit of data migration for all three clusters is the key.
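To illustrate slot-oriented management, the following sketch shows the slot mapping used by the Redis cluster: CRC16 over the key (or its {hashtag}, if present) modulo 16384:

```python
def crc16(data: bytes) -> int:
    # CRC-16/XMODEM, the variant the Redis cluster uses for key slots.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]   # hash only the {tag}, so tagged keys share a slot
    return crc16(key.encode()) % 16384

print(key_slot("user:1001"), key_slot("{user:1001}.following"))
```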

Principle of horizontal scaling of Redis cluster

The Redis 4.0 cluster supports moving a specified slot between nodes and automatically redistributing the existing slots after an empty node is added. Taking redis-trib's move_slot as an example, the slot moving process works as follows:

  1. Call the setslot command to modify the slot status on the source and target nodes.

  2. Get the slot key list on the source node.

  3. Call the migrate command to migrate the key. During the migration, Redis remains blocked; a result is returned only after the key is restored successfully on the target node.

  4. Call the setslot command again to update the slot status on the source and target nodes.
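A simplified sketch of these four steps issued as raw commands through redis-py; the node addresses, node IDs, and slot number are placeholders, and redis-trib adds more error handling around the same commands:

```python
import redis

SLOT = 866                                      # slot being moved (placeholder)
src = redis.Redis(host="10.0.0.1", port=6379)   # current owner of the slot
dst = redis.Redis(host="10.0.0.2", port=6379)   # new owner of the slot
src_id, dst_id = "<source-node-id>", "<target-node-id>"

# 1. Mark the slot as importing on the target and migrating on the source.
dst.execute_command("CLUSTER", "SETSLOT", SLOT, "IMPORTING", src_id)
src.execute_command("CLUSTER", "SETSLOT", SLOT, "MIGRATING", dst_id)

# 2./3. List the keys in the slot and migrate them; MIGRATE blocks until the
# target node has restored each batch of keys.
while True:
    keys = src.execute_command("CLUSTER", "GETKEYSINSLOT", SLOT, 100)
    if not keys:
        break
    src.execute_command("MIGRATE", "10.0.0.2", 6379, "", 0, 60000, "KEYS", *keys)

# 4. Assign the slot to the target node on both sides.
for node in (src, dst):
    node.execute_command("CLUSTER", "SETSLOT", SLOT, "NODE", dst_id)
```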

How can we ensure data consistency during the migration process?

Redis cluster provides a redirection mechanism for slots in the migrating state. It returns ASK to the client; upon receiving ASK, the client must send the ASKING command to the target node and then issue the request to the target node. The redirection occurs when the accessed key meets all of the following conditions:

  • The slot for the key is located on this node; if not, MOVED is returned.

  • The slot is in the migration status.

  • The key does not exist.

As mentioned, migrate is a synchronous, blocking operation. A key that still exists on the source node can therefore be read or written normally even while the slot is in the migrating state, which ensures data consistency.
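A minimal sketch of the client side of this ASK flow, using redis-py against two explicit nodes (addresses are placeholders; the error handling is simplified and assumes the reply is an ASK redirection):

```python
import redis

source = redis.Redis(host="10.0.0.1", port=6379)   # node the slot is migrating from
target = redis.Redis(host="10.0.0.2", port=6379)   # node the slot is migrating to

def get_during_migration(key):
    try:
        return source.get(key)          # key still on the source: answered normally
    except redis.exceptions.ResponseError:
        # Assume the source answered with ASK (key already moved, slot still
        # migrating). ASKING applies to the next command on the same connection,
        # so both commands are sent through one pipeline.
        pipe = target.pipeline(transaction=False)
        pipe.execute_command("ASKING")
        pipe.get(key)
        return pipe.execute()[-1]
```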

Principle of horizontal scaling of Codis

Codis implements the same slot re-distribution policy as the Redis cluster. The Codis-server kernel stores no slot routing information and does not parse which slot a key belongs to; during dbadd and similar operations it only records the key in a dict keyed by the slot. If the key has a tag, it computes crc32 over the tag and inserts the key into a skiplist keyed by the crc32 value.
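A sketch of this slot mapping, assuming Codis hashes the {tag} (or, failing that, the whole key) with crc32 modulo its 1024 slots:

```python
import zlib

def codis_slot(key: str, num_slots: int = 1024) -> int:
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]    # keys sharing a {tag} land in the same slot
    return zlib.crc32(key.encode()) % num_slots

print(codis_slot("user:1001"), codis_slot("{user:1001}.following"))
```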

The Codis Dashboard initiates the migration state machine program in the background. It ensures that all the proxies are notified to start the migration; this is the prepare stage. If any proxy fails, the migration fails. The migration steps are similar to those in the Redis cluster, except for the following:

  • The slot state information is stored in ZooKeeper/etcd.

  • The slotsmgrttagslot command is sent instead of the migrate command. When executed, slotsmgrttagslot picks a key for migration at random; if the key has a tag, it fetches all the keys from the skiplist mentioned above and migrates them in bulk.

How can we ensure data consistency during the migration process?

Codis migration is also a synchronous, blocking operation. In terms of data consistency, the Codis-server kernel does not maintain the slot state, so the consistency assurance falls on the proxy component. When Codis-proxy processes a request, it first determines the state of the slot where the key is located. If the slot is in the migrating state, it issues a migration command for that specific key to the Codis-server; after the key is migrated, Codis-proxy forwards the request to the target Codis-server. This approach is simple and requires few changes to the Redis kernel, but it also makes migration slow, and the client may be stuck for a long time.
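A simplified sketch of this proxy-side behavior; the per-key command name SLOTSMGRTTAGONE and its arguments are assumptions based on the slotsmgrt command family mentioned above, and the slot-state lookup is reduced to a plain argument:

```python
import redis

source = redis.Redis(host="10.0.0.1", port=6379)   # codis-server currently owning the slot
target = redis.Redis(host="10.0.0.2", port=6379)   # codis-server receiving the slot

def proxy_get(key, slot_is_migrating):
    if slot_is_migrating:
        # Hypothetical per-key migration command: force this key (and any keys
        # sharing its tag) onto the target node before serving the request.
        source.execute_command("SLOTSMGRTTAGONE", "10.0.0.2", 6379, 30000, key)
        return target.get(key)
    return source.get(key)

print(proxy_get("user:1001", slot_is_migrating=True))
```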

Principle of horizontal scaling of ApsaraDB for Redis

Apart from supporting specification of the source node, target node, or slot, ApsaraDB for Redis also allocates slots dynamically based on factors such as node capacity and slot size; its allocation principle is to minimize the impact on cluster availability. The migration roughly follows the steps below:

  1. The Redis-config calculates the source and target nodes and slots.

  2. The Redis-config sends the slot migration command to the Redis-server.

  3. The Redis-server starts the state machine and migrates keys in batches.

  4. The Redis-config checks the Redis-server on a regular basis and updates the slot status.

How can we ensure data consistency during the migration process?

Unlike Codis, ApsaraDB for Redis maintains the slot information in the kernel. It departs from both the Codis practice of migrating an entire slot and the Redis cluster practice of migrating a single key, and instead supports bulk migration in the kernel to accelerate migration.

The data migration process in ApsaraDB for Redis is asynchronous: it does not wait for the target node to finish restoring, but verifies successful restoration through notifications from the target node and regular checks by the source node. In this way, the impact of synchronous blocking on access to other slots is reduced.

Meanwhile, because the migration is asynchronous, ApsaraDB for Redis ensures data consistency by following the normal write path when a write request targets a key that is not in the list of keys being migrated. The other data consistency mechanisms in ApsaraDB for Redis are the same as those in the Redis 4.0 cluster.

Alibaba Cloud Redis-server optimizes the large key migration process.

Others

Feature              Redis 4.0                                                Codis                                                         ApsaraDB for Redis
Kernel hot upgrade   Not supported                                            Not supported                                                 Supported
Proxy hot upgrade    No proxy                                                 Not supported                                                 Supported
Number of slots      16384                                                    1024                                                          16384
Password             Not supported, and the redis-trib script needs to be modified   Supported, and passwords for all components must be consistent   Supported

Hot upgrades of the ApsaraDB for Redis kernel and proxy do not interrupt connections during the process and have no impact on the client.
