ApsaraDB for Redis monitors the health of nodes. If a master node in an instance fails, ApsaraDB for Redis automatically triggers a master-replica switchover: the roles of the master and replica nodes are swapped to ensure the high availability (HA) of the instance. You can also trigger a master-replica switchover manually, for example to run disaster recovery drills or, when nodes are deployed in different zones, to let your application access the nearest node.
- Manual switchover
A master-replica switchover is manually performed by you or an authorized Alibaba Cloud technical expert. For more information, see Manually switch workloads from a master node to a replica node.
- Risk mitigation
Alibaba Cloud detects vulnerabilities in an ApsaraDB for Redis instance that may prevent the instance from running as expected. In this case, ApsaraDB for Redis fixes the vulnerabilities and performs a master-replica switchover during the specified maintenance window. Fixes for high-risk vulnerabilities are applied automatically at the earliest opportunity and also trigger a master-replica switchover.
You can find the events that were triggered under the preceding conditions in history events. For more information, see Query history events. You can also manage pending events of master-replica switchovers. For more information, see Query and manage pending events.
- Instance failure
Alibaba Cloud detects failures in an ApsaraDB for Redis instance that prevent the instance from running as expected. In this case, ApsaraDB for Redis performs a master-replica switchover to switch your workloads over to the replica node. This minimizes the impact of the failures.
You are notified of such events with internal messages in the following format:
[Alibaba Cloud] Dear ******: Your ApsaraDB for Redis instance r-bp1zxszhcgatnx**** (name: ****) has an error. A switchover is triggered to ensure that your instance runs as expected. We recommend that you check whether your application is still connected to your instance and configure your application to automatically reconnect to the instance.
Make sure that your applications are configured to automatically reconnect to the instance or handle exceptions. Otherwise, your applications may receive error messages during a switchover.
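The advice to reconnect automatically or handle exceptions can be sketched as a small retry wrapper. This is a minimal illustration, not an ApsaraDB for Redis API: `call_with_retry` and its parameters are hypothetical names, and the exception types to retry on are an assumption that depends on your client library.

```python
import random
import time


def call_with_retry(operation, retries=5, base_delay=0.1, max_delay=2.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run operation(); on a connection-type error, back off and try again.

    Hypothetical helper for illustration. `operation` is any zero-argument
    callable that issues one command against the instance.
    """
    for attempt in range(retries + 1):
        try:
            return operation()
        except retry_on:
            if attempt == retries:
                raise  # retry budget exhausted: surface the error to the caller
            # Exponential backoff with jitter, capped at max_delay seconds
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
```

With a real client, `operation` would wrap a single command, and `retry_on` would list the connection errors that the client raises (for example, redis-py raises `redis.exceptions.ConnectionError`). The backoff keeps the application from flooding the instance with reconnect attempts while the switchover completes.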
- Q: What is the principle behind the master-replica switchover triggered by an instance failure?
A: The HA system relies on its detection mechanism to detect failures. The following table describes the HA mechanism.

| Event | Description |
| --- | --- |
| Health check | The HA system checks whether master and replica nodes are healthy. |
| Master node failure | When a master node is determined to be unavailable, a replica node acts as the master node. At the same time, the virtual IP address (VIP) of the master node is switched to the replica node. Another replica node is created to ensure data synchronization. |
| Replica node failure | When a replica node is determined to be unavailable, another replica node is created to ensure data synchronization and maintain the data persistence of the master-replica architecture. |

Note: Some data that was recently written to a master node may be lost because the synchronization between the master and replica nodes is implemented asynchronously.
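The mechanism described above, including why asynchronous synchronization can lose recent writes, can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `Node`, `ToyInstance`, and all method names are hypothetical, and the model does not reflect how the actual HA system is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)


class ToyInstance:
    """Illustrative master-replica pair behind a VIP; not the real HA system."""

    def __init__(self):
        self.master = Node("node-a")
        self.replica = Node("node-b")
        self.vip_target = self.master.name  # clients connect through the VIP
        self._pending = []                  # writes not yet sent to the replica
        self._created = 0

    def write(self, key, value):
        # The master acknowledges the write before the replica has it
        self.master.data[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Asynchronous synchronization: runs some time after write()
        for key, value in self._pending:
            self.replica.data[key] = value
        self._pending.clear()

    def health_check(self):
        if not self.master.healthy:
            # Master failure: the replica takes over and the VIP follows it;
            # writes that were never replicated are lost
            self.master = self.replica
            self.vip_target = self.master.name
            self._pending.clear()
            self.replica = self._new_replica()
        elif not self.replica.healthy:
            # Replica failure: create another replica from the master's data
            self.replica = self._new_replica()

    def _new_replica(self):
        self._created += 1
        return Node(f"node-new-{self._created}", data=dict(self.master.data))
```

In this model, a key written after the last `replicate()` call and before the master fails is absent from the new master, which mirrors the data-loss caveat above.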
- Q: Does a master-replica switchover affect the use of read replicas in read/write splitting instances? For more information about read/write splitting instances, see Read/write splitting instances.
A: A master-replica switchover does not affect the use of read replicas in read/write splitting instances.
- Q: Does a master-replica switchover triggered for a specific data shard in an instance affect the instance as a whole if the instance is a cluster master-replica instance? For more information about cluster master-replica instances, see Cluster master-replica instances.
A: The instance as a whole is not affected. Only the data shard is affected. For more information, see Impacts.