Distributed locks are one of the most widely adopted features in large applications. You can implement distributed locks based on Redis by using various methods. This topic describes and analyzes common methods of implementing distributed locks. This topic also describes the best practices on how ApsaraDB for Redis Enhanced Edition (Tair) implements high-performance distributed locks. The best practices are based on the extensive experience of Alibaba Group in using ApsaraDB for Redis Enhanced Edition (Tair) and distributed locks.

Distributed locks and scenarios

If multiple threads in a process need concurrent access to a resource, you can use mutexes (also known as mutual exclusion locks) and read/write locks to synchronize access. If multiple processes on a host need concurrent access to a resource, you can use interprocess synchronization primitives, such as semaphores, pipes, and shared memory. However, if multiple hosts need concurrent access to a resource, you must use distributed locks. Distributed locks are mutual exclusion locks that take effect globally. You can apply distributed locks to resources in distributed systems to prevent race conditions and the logical failures that they cause.

Features of distributed locks

  • Mutual exclusion

    Only one client can hold a lock at a given moment.

  • Deadlock free

    Distributed locks use a lease-based locking mechanism. If a client acquires a lock and encounters an exception, the lock is automatically released after a certain period. This prevents resource deadlocks.

  • Consistency

    Failovers may be triggered by external errors or internal errors. External errors include hardware failures and network exceptions, and internal errors include slow queries and system defects. After a failover is triggered, a replica is promoted to master to ensure high availability for ApsaraDB for Redis. In this scenario, if your services have high requirements for mutual exclusion, the lock that a client holds must remain intact after the failover.

Implement distributed locks based on native Redis databases

Note The methods described in this section also apply to ApsaraDB for Redis Community Edition.
  • Acquire a lock

    Redis provides a simple way to acquire a lock: run the SET command. The following example shows the command, and the table describes its key parameters and options:

    SET resource_1 random_value NX EX 5

    Table 1. Description of key parameters or options

    Parameter or option    Description
    resource_1             Specifies the key of the distributed lock. If the key exists, the corresponding resource is locked and cannot be accessed by the other clients.
    random_value           Specifies a random string. The value must be unique across clients.
    EX                     Specifies an expiration time for the key. The unit is seconds. You can also use the PX option to set an expiration time that is measured in milliseconds.
    NX                     Sets the key only if the key does not exist in Redis.

    In the sample code, the expiration time for the resource_1 key is set to five seconds. Therefore, if the client does not release the key, the key expires in five seconds, and the system reclaims the lock. Then, the other clients can lock and access the resource.

  • Release a lock

    In most cases, you can run the DEL command to release a lock. However, this may cause the following issue.

    1. At the t1 time point, application 1 acquires the distributed lock by setting the resource_1 key with an expiration time of three seconds.
    2. Application 1 remains blocked for more than three seconds, for example, due to a long response time. The resource_1 key expires, and the distributed lock is automatically released at the t2 time point.
    3. At the t3 time point, application 2 acquires the distributed lock.
    4. At the t4 time point, application 1 resumes and runs the DEL resource_1 command, which releases the distributed lock that is now held by application 2.

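    This example shows that a lock must be released only by the client that set the lock. Therefore, before a client runs the DEL command to release a lock, the client must run the GET command to check whether the stored value is its own random value. The check and the deletion must be performed atomically. Otherwise, the key may expire and be acquired by another client between the two commands. In most cases, a client uses the following Lua script, which Redis runs atomically, to release the lock that the client set: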

    if redis.call("get",KEYS[1]) == ARGV[1] then
        return redis.call("del",KEYS[1])
    else
        return 0
    end
  • Renew a lock

    If a client cannot complete the required operations within the lock validity time, the client must renew the lock. A lock can be released only by the client that set the lock. Similarly, a lock can be renewed only by the client that set the lock. In Redis, a client can use the following Lua script to renew a lock:

    if redis.call("get",KEYS[1]) == ARGV[1] then
        return redis.call("expire",KEYS[1], ARGV[2])
    else
        return 0
    end
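
The acquire, release, and renew rules above can be traced without a Redis server. The following is a minimal sketch, assuming a hypothetical in-memory class named LeaseLockSketch that stands in for a single Redis key space and takes the current time as a parameter. It is not a Redis client and is not how Redis stores keys. It only mirrors the semantics of SET with NX and EX and of the two Lua scripts, so that the t1 to t4 scenario can be replayed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a Redis key space; not a real Redis client. */
class LeaseLockSketch {
    private static final class Entry {
        final String value;
        long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    // Returns the entry for the key, or null if it is absent or its lease has expired.
    private Entry live(String key, long nowMillis) {
        Entry e = store.get(key);
        if (e != null && e.expiresAtMillis <= nowMillis) {
            store.remove(key);
            return null;
        }
        return e;
    }

    /** Mirrors SET key value NX EX ttlSeconds: succeeds only if no live key exists. */
    synchronized boolean acquire(String key, String value, int ttlSeconds, long nowMillis) {
        if (live(key, nowMillis) != null) {
            return false;
        }
        store.put(key, new Entry(value, nowMillis + ttlSeconds * 1000L));
        return true;
    }

    /** Mirrors the release script: deletes the key only if the stored value matches. */
    synchronized boolean release(String key, String value, long nowMillis) {
        Entry e = live(key, nowMillis);
        if (e == null || !e.value.equals(value)) {
            return false;
        }
        store.remove(key);
        return true;
    }

    /** Mirrors the renewal script: resets the expiration only if the stored value matches. */
    synchronized boolean renew(String key, String value, int ttlSeconds, long nowMillis) {
        Entry e = live(key, nowMillis);
        if (e == null || !e.value.equals(value)) {
            return false;
        }
        e.expiresAtMillis = nowMillis + ttlSeconds * 1000L;
        return true;
    }

    public static void main(String[] args) {
        LeaseLockSketch redis = new LeaseLockSketch();
        long t1 = 0, t3 = 3500, t4 = 4000; // illustrative timestamps in milliseconds
        System.out.println(redis.acquire("resource_1", "token-app1", 3, t1)); // application 1 acquires
        System.out.println(redis.acquire("resource_1", "token-app2", 3, t3)); // lease expired, application 2 acquires
        System.out.println(redis.release("resource_1", "token-app1", t4));    // stale release by application 1
        System.out.println(redis.release("resource_1", "token-app2", t4));    // release by the current holder
    }
}
```

In this sketch, the stale release by application 1 is rejected because the stored value no longer matches its token, which is the behavior that a plain DEL command cannot provide.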

Implement distributed locks based on ApsaraDB for Redis Enhanced Edition (Tair)

You can implement distributed locks by using the enhanced string commands of the performance-enhanced instances provided by ApsaraDB for Redis Enhanced Edition (Tair). These enhanced commands are an alternative to Lua scripts.

  • Acquire a lock

    The method of acquiring a lock based on ApsaraDB for Redis Enhanced Edition (Tair) is the same as that based on native Redis databases. The method is to run the SET command:

    SET resource_1 random_value NX EX 5
  • Release a lock

    The CAD command supported by ApsaraDB for Redis Enhanced Edition (Tair) provides an efficient way for you to release a lock:

    /* if (GET(resource_1) == my_random_value) DEL(resource_1) */
    CAD resource_1 my_random_value
  • Renew a lock

    You can run the following CAS command to renew a lock:

    CAS resource_1 my_random_value my_random_value EX 10
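    Note The CAS command does not require the new value to differ from the original value. If you pass the same value for both, as in this example, the value is unchanged and only the expiration time is reset.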

Sample code based on Jedis

  • Define the CAS and CAD commands
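    // Package names as of Jedis 3.x; other major versions may place these classes elsewhere.
    import redis.clients.jedis.commands.ProtocolCommand;
    import redis.clients.jedis.util.SafeEncoder;

    // Registers the Tair-specific CAD and CAS commands so that Jedis can send them.
    enum TairCommand implements ProtocolCommand {
        CAD("CAD"), CAS("CAS");

        // Command name encoded as bytes for the Redis protocol.
        private final byte[] raw;

        TairCommand(String alt) {
            raw = SafeEncoder.encode(alt);
        }

        @Override
        public byte[] getRaw() {
            return raw;
        }
    }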
  • Acquire a lock
    public boolean acquireDistributedLock(Jedis jedis,String resourceKey, String randomValue, int expireTime) {
        SetParams setParams = new SetParams();
        setParams.nx().ex(expireTime);
        String result = jedis.set(resourceKey,randomValue,setParams);
        return "OK".equals(result);
    }
  • Release a lock
    public boolean releaseDistributedLock(Jedis jedis,String resourceKey, String randomValue) {
        jedis.getClient().sendCommand(TairCommand.CAD,resourceKey,randomValue);
        Long ret = jedis.getClient().getIntegerReply();
        return 1 == ret;
    }
  • Renew a lock
    public boolean renewDistributedLock(Jedis jedis,String resourceKey, String randomValue, int expireTime) {
        jedis.getClient().sendCommand(TairCommand.CAS,resourceKey,randomValue,randomValue,"EX",String.valueOf(expireTime));
        Long ret = jedis.getClient().getIntegerReply();
        return 1 == ret;
    }
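
The three methods above are usually combined into one pattern: generate a random value, acquire the lock, run the protected work, and release the lock even if the work throws an exception. The following is a minimal sketch of that pattern. The class name LockTemplateSketch and the method name withLock are hypothetical, and the acquire and release steps are passed in as functions so that the sketch does not depend on a specific client library.

```java
import java.util.UUID;
import java.util.function.BiPredicate;
import java.util.function.Supplier;

/** Hypothetical helper; acquire and release receive (resourceKey, randomValue). */
class LockTemplateSketch {
    static <T> T withLock(String resourceKey,
                          BiPredicate<String, String> acquire,
                          BiPredicate<String, String> release,
                          Supplier<T> task) {
        // A fresh UUID per attempt keeps the value unique across clients.
        String randomValue = UUID.randomUUID().toString();
        if (!acquire.test(resourceKey, randomValue)) {
            throw new IllegalStateException("lock not acquired: " + resourceKey);
        }
        try {
            return task.get();
        } finally {
            // Release with the same value so that only this holder's lock is deleted.
            release.test(resourceKey, randomValue);
        }
    }
}
```

With the Jedis methods in this section, a call might look like withLock("resource_1", (k, v) -> acquireDistributedLock(jedis, k, v, 5), (k, v) -> releaseDistributedLock(jedis, k, v), () -> doWork()), where doWork is a placeholder for your own logic. If the work can outlast the expiration time, the lock must also be renewed while the work runs, which this sketch does not do.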

Methods of ensuring lock consistency

Replication between a master and its replicas is asynchronous. If a master crashes before data changes are transferred to the replicas and a failover is triggered, the changes that remain in the buffer may not reach the new master, which is the original replica. This results in data inconsistency. If the lost data is related to a distributed lock, the locking mechanism becomes faulty and service exceptions occur. This section introduces three methods that you can use to ensure lock consistency.

  • Redlock

    Redlock was proposed by the creator of the open source Redis project to ensure consistency. Redlock is based on the calculation of probabilities. Assume that a single master-replica Redis instance loses a lock during a failover with a probability of k%. If you use Redlock to implement distributed locks, you can calculate the probability that N independent Redis instances lose locks at the same time based on the following formula: Probability of losing locks = (k%)^N. Due to the high stability of Redis, k% is small, so the combined probability is small enough to meet your service requirements.

    Note When you implement Redlock, you do not have to ensure that the locks in all N Redis instances take effect at the same time. In most cases, Redlock can meet your business requirements if the locks in M Redis nodes take effect at the same time. Note that M is greater than 1 and less than or equal to N. The Redlock algorithm as originally described requires a majority, that is, at least N/2 + 1 nodes, with N/2 rounded down.

    Redlock has the following disadvantages:

    • A client takes a long time to acquire and release a lock.
    • Redlock is difficult to implement on the cluster or standard master-replica instances of ApsaraDB for Redis.
    • Redlock consumes a large number of resources. To implement Redlock, you must create multiple independent ApsaraDB for Redis instances or user-created Redis instances.
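
To make the Redlock formula concrete, the following minimal sketch evaluates (k%)^N for given inputs. The class and method names are hypothetical. The majority helper reflects the quorum rule of the published Redlock algorithm, which is an addition to the formula in this topic.

```java
/** Hypothetical helper for the Redlock probability formula in this section. */
class RedlockOddsSketch {
    /** (k%)^N: probability that all n independent instances lose the lock at the same time. */
    static double allInstancesLoseLock(double kPercent, int n) {
        return Math.pow(kPercent / 100.0, n);
    }

    /** Majority quorum of the published Redlock algorithm: N/2 rounded down, plus 1. */
    static int majority(int n) {
        return n / 2 + 1;
    }

    public static void main(String[] args) {
        // Illustrative inputs only: k = 1 (that is, 1%) and N = 3 give roughly one in a million.
        System.out.println(allInstancesLoseLock(1.0, 3));
        System.out.println(majority(5));
    }
}
```

The value of k in this sketch is an illustrative input, not a measured failure rate.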
  • WAIT command

    The WAIT command of Redis blocks the current client until all the previous write commands are transferred from a master to a specified number of replicas. In the WAIT command, you can specify a time-out period that is measured in milliseconds. The following WAIT command example is used in ApsaraDB for Redis to ensure the consistency of distributed locks.

    SET resource_1 random_value NX EX 5
    WAIT 1 5000

    After a client acquires a lock and runs the WAIT command, the client continues to perform other operations only when the data is transferred to the specified number of replicas or when the time-out period is reached. In this example, the time-out period is 5,000 milliseconds. The WAIT command returns the number of replicas that acknowledged the preceding write commands. If the output is 1 in this example, the lock is replicated from the master to one replica. In this case, data consistency is ensured. The WAIT command is more cost-effective than Redlock.

    The considerations of the WAIT command are described as follows:

    • The WAIT command only blocks the client that sends the WAIT command, and does not affect the other clients.
    • If the WAIT command returns an integer that is at least the number of replicas that you specified, the lock is transferred from the master to those replicas. However, if a failover is triggered to implement high availability before the command returns a successful response, data may be lost. In this case, the output of the WAIT command only indicates a possible replication failure, and data integrity cannot be ensured. After the WAIT command returns an error or a smaller number, you can acquire the lock again or verify that the lock still holds your value.
    • You do not have to run the WAIT command to release a lock. This is because the distributed locks are mutually exclusive. If the deletion is lost during a failover, the lock only remains held until it expires, so logical failures do not occur.
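
The considerations above suggest a simple decision rule on the client: treat the lock as held only if the lock was acquired and the WAIT command reports enough replicas, and release the lock otherwise. The following is a minimal sketch of that rule. The class and method names are hypothetical, and the Redis calls are passed in as functions so that the sketch stays independent of a client library.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/** Hypothetical helper that combines lock acquisition with a WAIT acknowledgement check. */
class WaitAckSketch {
    static boolean acquireReplicated(BooleanSupplier acquire,
                                     LongSupplier waitForReplicas,
                                     Runnable release,
                                     int requiredReplicas) {
        if (!acquire.getAsBoolean()) {
            return false; // another client holds the lock
        }
        // WAIT returns the number of replicas that acknowledged the preceding writes.
        long acked = waitForReplicas.getAsLong();
        if (acked >= requiredReplicas) {
            return true;
        }
        // Too few acknowledgements before the time-out: give the lock up and let the caller retry.
        release.run();
        return false;
    }
}
```

With Jedis, waitForReplicas might be written as () -> jedis.waitReplicas(1, 5000). That method name is an assumption based on Jedis 3.x, so check it against the client version that you use.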
  • ApsaraDB for Redis Enhanced Edition (Tair)

    Redlock and the WAIT command offer the following benefits:

    • Redlock improves data consistency as the number of Redis nodes increases.
    • The WAIT command is cost-effective.

    ApsaraDB for Redis Enhanced Edition (Tair) offers the following benefits:

    • The unique high availability (HA) and data persistence mechanisms of ApsaraDB for Redis Enhanced Edition (Tair) help you ensure data security and service stability. ApsaraDB for Redis Enhanced Edition (Tair) allows you to ensure high data consistency even if you do not deploy multiple Redis nodes or use the WAIT command.
    • The CAS and CAD commands supported by the performance-enhanced instances help you reduce the costs of developing and managing distributed locks and improve lock performance.
    • Performance-enhanced instances of ApsaraDB for Redis Enhanced Edition (Tair) use a multi-threading model. For more information, see Enhanced multi-threading performance. These instances provide three times the performance of native Redis databases with the same specifications. Therefore, the ApsaraDB for Redis service remains available when distributed locks are used under high concurrency.