Distributed locks are widely used in large applications, and Redis offers several ways to implement them. This topic describes the common methods to implement distributed locks and the best practices for implementing them by using ApsaraDB for Redis Enhanced Edition (Tair). These best practices are based on the experience that Alibaba Group has accumulated in using ApsaraDB for Redis Enhanced Edition (Tair) and distributed locks.

Distributed locks and scenarios

If multiple threads in a process need to access a resource concurrently during application development, you can use mutexes (also known as mutual exclusion locks) and read/write locks. If multiple processes on a host need to access a resource concurrently, you can use interprocess synchronization primitives, such as semaphores, pipes, and shared memory. However, if multiple hosts need to access a resource concurrently, you must use distributed locks. A distributed lock is a mutual exclusion lock that is visible to all hosts in the system. You can apply distributed locks to resources in distributed systems to prevent logical failures that may be caused by resource contention.

Features of distributed locks

  • Mutually exclusive

    Only one client can hold a lock at a given moment.

  • Deadlock-free

    Distributed locks use a lease-based locking mechanism. If a client acquires a lock and encounters an exception, the lock is automatically released after a period of time. This prevents resource deadlocks.

  • Consistent

    Switchovers in ApsaraDB for Redis may be triggered by external or internal errors. External errors include hardware failures and network exceptions, and internal errors include slow queries and system defects. After a switchover is triggered, a replica node is promoted to master node to implement high availability (HA) for ApsaraDB for Redis. In this scenario, if your business has high requirements for mutual exclusion, the lock state must remain unchanged after the switchover.

Implement distributed locks based on open source Redis

Note: The methods described in this section also apply to ApsaraDB for Redis Community Edition.
  • Acquire a lock

    In Redis, you only need to run the SET command to acquire a lock. The following section provides a command example and describes the key parameters or options of the command.

    SET resource_1 random_value NX EX 5

    Table 1. Description of key parameters or options

    Parameter or option   Description
    -------------------   -----------
    resource_1            Specifies the key of the distributed lock. If the key exists, the corresponding resource is locked and cannot be accessed by the other clients.
    random_value          Specifies a random string. The value must be unique across clients.
    EX                    Specifies a validity period for the key. Unit: seconds. You can also use the PX option to set a validity period accurate to the millisecond.
    NX                    Sets the key only if the key does not exist in Redis.

    In the sample code, the validity period for the resource_1 key is set to five seconds. If the client does not release the key, the key expires after five seconds, and the system reclaims the lock. Then, the other clients can lock and access the resource.

  • Release a lock

    In most cases, you can run the DEL command to release a lock. However, this may cause the following issue.

    1. At time t1, application 1 acquires the distributed lock by setting the resource_1 key with a validity period of three seconds.
    2. Application 1 is blocked for more than three seconds, for example, because of a long response time. The resource_1 key expires, and the distributed lock is automatically released at time t2.
    3. At time t3, application 2 acquires the distributed lock.
    4. At time t4, application 1 resumes and runs the DEL resource_1 command. This releases the distributed lock that is now held by application 2.

    This example shows that a lock must be released only by the client that set it. Therefore, before a client runs the DEL command to release a lock, the client must first run the GET command to check whether the lock was set by itself. The check and the deletion must be performed as one atomic operation. Otherwise, the lock may expire and be acquired by another client between the two commands. Redis executes a Lua script atomically, so in most cases a client uses the following Lua script to release the lock that it set:

    if redis.call("get",KEYS[1]) == ARGV[1] then
        return redis.call("del",KEYS[1])
    else
        return 0
    end
  • Renew a lock

    If a client cannot complete the required operations within the lease term of the lock, the client must renew the lock. A lock can be renewed only by the client that sets the lock. In Redis, a client can use the following Lua script to renew a lock:

    if redis.call("get",KEYS[1]) == ARGV[1] then
        return redis.call("expire",KEYS[1], ARGV[2])
    else
        return 0
    end
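
The acquire, release, and renew operations above can be sketched as a small in-memory model. This is not a Redis client: the LeaseLockModel class, its method names, and the manual clock are hypothetical and exist only to illustrate why the release and renewal scripts compare the stored value before they act.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a Redis keyspace, driven by a manual clock in seconds. */
final class LeaseLockModel {
    private static final class Entry {
        final String value;
        final long expiresAt;

        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> keys = new HashMap<>();
    private long now = 0; // logical time in seconds, advanced only by advance()

    void advance(long seconds) {
        now += seconds;
    }

    /** Returns the entry if it has not expired; drops it otherwise. */
    private Entry live(String key) {
        Entry e = keys.get(key);
        if (e != null && e.expiresAt <= now) {
            keys.remove(key);
            return null;
        }
        return e;
    }

    /** Models SET key value NX EX ttl. */
    boolean setNxEx(String key, String value, long ttl) {
        if (live(key) != null) {
            return false;
        }
        keys.put(key, new Entry(value, now + ttl));
        return true;
    }

    /** Models a plain DEL: removes the key no matter which client set it. */
    boolean del(String key) {
        boolean existed = live(key) != null;
        keys.remove(key);
        return existed;
    }

    /** Models the release script: delete only if the stored value matches. */
    boolean compareAndDelete(String key, String value) {
        Entry e = live(key);
        if (e == null || !e.value.equals(value)) {
            return false;
        }
        keys.remove(key);
        return true;
    }

    /** Models the renewal script: reset the validity period only if the stored value matches. */
    boolean compareAndExpire(String key, String value, long ttl) {
        Entry e = live(key);
        if (e == null || !e.value.equals(value)) {
            return false;
        }
        keys.put(key, new Entry(value, now + ttl));
        return true;
    }

    /** Returns the value of the current lock holder, or null if the key is free. */
    String holder(String key) {
        Entry e = live(key);
        return e == null ? null : e.value;
    }
}
```

To replay the t1 to t4 timeline with this model, call setNxEx for application 1 with a validity period of 3, advance the clock by 4, and call setNxEx for application 2. A subsequent compareAndDelete with the value of application 1 returns false and leaves the lock of application 2 in place, whereas del would remove it.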

Implement distributed locks based on ApsaraDB for Redis Enhanced Edition (Tair)

You can implement distributed locks by using the enhanced string commands available for performance-enhanced instances of ApsaraDB for Redis Enhanced Edition (Tair). These enhanced commands are an alternative to Lua scripts.

  • Acquire a lock

    You acquire a lock in ApsaraDB for Redis Enhanced Edition (Tair) in the same way as in open source Redis: by running the SET command. Sample command:

    SET resource_1 random_value NX EX 5
  • Release a lock

    The CAD command of ApsaraDB for Redis Enhanced Edition (Tair) lets you release a lock with a single command instead of a Lua script. For more information about the CAD command, see CAD. Sample command:

    /* if (GET(resource_1) == my_random_value) DEL(resource_1) */
    CAD resource_1 my_random_value
  • Renew a lock

    You can run the CAS command to renew a lock. For more information, see CAS. Sample command:

    CAS resource_1 my_random_value my_random_value EX 10
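    Note: The CAS command does not check whether the new value is the same as the original value. Therefore, you can pass the same value for both parameters to keep the value unchanged and reset only the validity period.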

Sample code based on Jedis

  • Define the CAS and CAD commands
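    // Assumes Jedis 3.x package names. Adjust the imports for other versions.
    import redis.clients.jedis.commands.ProtocolCommand;
    import redis.clients.jedis.util.SafeEncoder;

    // Tair-specific commands that are sent to the server as raw protocol commands.
    enum TairCommand implements ProtocolCommand {
        CAD("CAD"), CAS("CAS");

        private final byte[] raw;

        TairCommand(String alt) {
            raw = SafeEncoder.encode(alt);
        }

        @Override
        public byte[] getRaw() {
            return raw;
        }
    }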
  • Acquire a lock
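    // Requires redis.clients.jedis.Jedis and redis.clients.jedis.params.SetParams (Jedis 3.x package names assumed).
    // Returns true only if the key did not exist and was set with a validity period of expireTime seconds.
    public boolean acquireDistributedLock(Jedis jedis, String resourceKey, String randomValue, int expireTime) {
        SetParams setParams = new SetParams();
        setParams.nx().ex(expireTime);
        String result = jedis.set(resourceKey, randomValue, setParams);
        return "OK".equals(result);
    }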
  • Release a lock
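    // Deletes the key only if its current value equals randomValue. A reply of 1 indicates success.
    public boolean releaseDistributedLock(Jedis jedis, String resourceKey, String randomValue) {
        jedis.getClient().sendCommand(TairCommand.CAD, resourceKey, randomValue);
        Long ret = jedis.getClient().getIntegerReply();
        return ret != null && ret == 1L;
    }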
  • Renew a lock
    // Resets the validity period to expireTime seconds only if the current value equals randomValue.
    public boolean renewDistributedLock(Jedis jedis, String resourceKey, String randomValue, int expireTime) {
        jedis.getClient().sendCommand(TairCommand.CAS, resourceKey, randomValue, randomValue, "EX", String.valueOf(expireTime));
        Long ret = jedis.getClient().getIntegerReply();
        return ret != null && ret == 1L;
    }
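
The acquire and release methods above are typically called as a pair around a critical section. The following is a minimal sketch of that pattern, not part of the Jedis or Tair API: LockOps and LockTemplate are hypothetical names, and a real LockOps implementation is assumed to delegate to acquireDistributedLock and releaseDistributedLock with a Jedis connection. The sketch generates a unique value per attempt with UUID and releases the lock in a finally block so that an exception in the critical section does not leave the lock held until it expires.

```java
import java.util.UUID;

/** Hypothetical abstraction over the acquire and release methods shown above. */
interface LockOps {
    boolean acquire(String resourceKey, String randomValue, int expireSeconds);

    boolean release(String resourceKey, String randomValue);
}

final class LockTemplate {
    private LockTemplate() {
    }

    /**
     * Runs the critical section only if the lock is acquired.
     * Returns false without running it if another client holds the lock.
     */
    static boolean runWithLock(LockOps ops, String resourceKey, int expireSeconds, Runnable criticalSection) {
        // A fresh random value per attempt identifies this client as the lock owner.
        String randomValue = UUID.randomUUID().toString();
        if (!ops.acquire(resourceKey, randomValue, expireSeconds)) {
            return false;
        }
        try {
            criticalSection.run();
        } finally {
            // Compare-and-delete semantics: only the owner's value releases the lock.
            ops.release(resourceKey, randomValue);
        }
        return true;
    }
}
```

This sketch does not renew the lock, so it assumes that the critical section finishes within the validity period. For longer operations, the caller would also need to invoke the renewal method before the lock expires.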

Methods to ensure lock consistency

Replication between master nodes and replica nodes is asynchronous. If a master node crashes before data changes are synchronized to a replica node and an HA switchover is triggered, the replica node is promoted to be the new master node, and the data changes that were still in the buffer may be missing from it. This results in data inconsistency. If the lost data is related to a distributed lock, the locking mechanism becomes faulty and service exceptions occur. This section describes three methods that you can use to ensure lock consistency.

  • Use the Redlock algorithm

    The Redlock algorithm was proposed by the creator of the open source Redis project to ensure consistency. The algorithm is based on probability. Assume that a single master-replica Redis instance loses a lock during an HA switchover with a probability of k%. If you use the Redlock algorithm with N independent master-replica Redis instances, the probability that all N instances lose the lock at the same time is calculated based on the following formula: Probability of losing locks = (k%)^N. For example, if k% is 1% and N is 3, the probability is 0.0001%. Because Redis is highly stable, k% is small, and the resulting probability is low enough to meet most service requirements.

    Note: When you implement the Redlock algorithm, you do not need to ensure that the locks in all N Redis instances take effect at the same time. In most cases, the Redlock algorithm can meet your business requirements if the locks in M Redis instances take effect at the same time, where M is greater than 1 and less than or equal to N. The original Redlock description requires a majority of the instances, that is, at least floor(N/2) + 1.

    The Redlock algorithm has the following issues:

    • A client takes a long time to acquire or release a lock.
    • The Redlock algorithm is difficult to implement on cluster instances or standard master-replica instances of ApsaraDB for Redis.
    • The Redlock algorithm consumes large amounts of resources. To implement the Redlock algorithm, you must create multiple independent ApsaraDB for Redis instances or self-managed Redis instances.
  • Use the WAIT command

    The WAIT command of Redis blocks the current client until all the previous write commands are synchronized from a master node to a specified number of replica nodes, or until a timeout period elapses. You specify the timeout period in milliseconds. You can use the WAIT command in ApsaraDB for Redis to ensure the consistency of distributed locks. Sample command:

    SET resource_1 random_value NX EX 5
    WAIT 1 5000

    After the client acquires the lock and runs the WAIT command, the client continues with other operations only when one of the following conditions is met: the data is synchronized to the specified number of replica nodes, or the timeout period elapses. In this example, the timeout period is 5,000 milliseconds. The WAIT command returns the number of replica nodes that acknowledged the write commands. If the command returns 1 in this example, the lock is replicated from the master node to a replica node, and data consistency is ensured. The WAIT command is more cost-effective than the Redlock algorithm.

    Before you use the WAIT command, take note of the following items:

    • The WAIT command only blocks the client that sends the WAIT command, and does not affect the other clients.
    • If the WAIT command returns the expected number of replica nodes, the lock is synchronized from the master node to the replica nodes. However, if an HA switchover is triggered before the command returns a successful response, data may be lost. In this case, the output of the WAIT command indicates only that replication may have failed, and data integrity cannot be ensured. If the WAIT command returns an error or an unexpected value, you can acquire the lock again or verify the data.
    • You do not need to run the WAIT command when you release a lock. Because distributed locks are mutually exclusive, no logical failure occurs even if the lock is released later than expected.
  • Use ApsaraDB for Redis Enhanced Edition (Tair)

    The Redlock algorithm and the WAIT command offer the following benefits:

    • The Redlock algorithm improves data consistency as the number of Redis nodes increases.
    • The WAIT command is cost-effective.

    ApsaraDB for Redis Enhanced Edition (Tair) offers the following benefits:

    • The unique HA and data persistence mechanisms of ApsaraDB for Redis Enhanced Edition (Tair) help you ensure data security and service stability. ApsaraDB for Redis Enhanced Edition (Tair) allows you to ensure high data consistency even if you do not deploy multiple nodes or use the WAIT command.
    • The CAS and CAD commands available for performance-enhanced instances help you reduce the costs of developing and managing distributed locks and improve lock performance.
    • Performance-enhanced instances of ApsaraDB for Redis Enhanced Edition (Tair) use a multi-threading model and provide three times the performance of open source Redis with the same specifications. Therefore, the ApsaraDB for Redis service remains available even when distributed locks are used at high concurrency. For more information, see Performance-enhanced instances.
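
The Redlock estimate described earlier in this section, Probability of losing locks = (k%)^N, can be evaluated with a few lines of code. The following is a minimal sketch under the same assumption as the formula, namely that the N instances fail independently. The RedlockEstimate class and its method names are hypothetical. The majorityQuorum method reflects the majority rule of the original Redlock description, not a requirement stated by ApsaraDB for Redis.

```java
final class RedlockEstimate {
    private RedlockEstimate() {
    }

    /**
     * Probability that all n independent instances lose the lock at the same time.
     * k is the per-instance loss probability as a fraction, for example 0.01 for 1%.
     */
    static double lossProbability(double k, int n) {
        if (k < 0.0 || k > 1.0) {
            throw new IllegalArgumentException("k must be in [0, 1]");
        }
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        return Math.pow(k, n);
    }

    /** Majority quorum of n instances: floor(n / 2) + 1. */
    static int majorityQuorum(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        return n / 2 + 1;
    }
}
```

For example, lossProbability(0.01, 3) evaluates (1%)^3, which is approximately 0.000001, and majorityQuorum(5) returns 3.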