ApsaraDB for MongoDB responds to node faults differently depending on the instance type. This page explains what happens during a fault, how long disruptions last, and what you need to configure on the application side.
How each instance type handles a fault
| Instance type | Fault response | Service impact | Connection error duration |
|---|---|---|---|
| Standalone | System repairs the faulty node in place | Unavailable during repair | Until repair completes |
| Replica set | Automatic failover to a secondary or hidden node | Transparent to the application | Less than 30 seconds |
| Sharded cluster (shard and ConfigServer nodes) | Automatic failover to the hidden node | Transparent to the application | Less than 30 seconds |
| Sharded cluster (mongos node) | No automatic failover | Unavailable until the node recovers | Until the node recovers |
Standalone instances
A standalone instance has only one node. When the node is faulty, the system repairs it in place. Services are unavailable during the repair.
Standalone instances are designed for testing, training, and non-critical workloads. For production environments, use replica set or sharded cluster instances to ensure high availability (HA).
Replica set instances

When a node in a replica set instance is faulty, the system automatically fails over to a secondary or hidden node. The failover requires no manual intervention and is transparent to your application, but a transient connection error lasting less than 30 seconds may occur during the switchover.
Configure your application to automatically reconnect after a transient connection error.
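The reconnect behavior described above can be approximated with a small retry wrapper. The following is a minimal sketch, not an official client setting: the helper name `call_with_retry`, its parameters, and the backoff values are assumptions chosen for illustration, and the 30-second budget simply mirrors the failover window stated on this page.

```python
import random
import time


def call_with_retry(operation, retry_on, max_wait_seconds=30.0,
                    base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying on the given exception types.

    Hypothetical helper: retries until the total time spent sleeping
    would exceed max_wait_seconds, then re-raises the last error.
    """
    waited = 0.0
    attempt = 0
    while True:
        try:
            return operation()
        except retry_on:
            # Exponential backoff with jitter, capped at 5 seconds per pause.
            delay = min(base_delay * (2 ** attempt), 5.0) * random.uniform(0.5, 1.0)
            if waited + delay > max_wait_seconds:
                raise  # Budget exhausted: surface the error to the caller.
            sleep(delay)
            waited += delay
            attempt += 1
```

With PyMongo, one would typically pass `pymongo.errors.AutoReconnect` as `retry_on`. Only idempotent operations are safe to retry this way, because a write that failed mid-switchover may already have been applied.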
In production environments, connect using a connection string URI instead of the connection string of the primary node. A connection string URI lists the nodes of the replica set, so the client can locate the new primary after a failover and read/write operations remain available even if a node is faulty. For more information, see Overview of replica set instance connections.
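As an illustration of what such a URI looks like, the sketch below assembles one in the standard MongoDB format. The function name `build_replica_set_uri`, the host names, the replica set name, and the credentials are all placeholders; copy the real values from the instance's connection information.

```python
from urllib.parse import quote_plus


def build_replica_set_uri(hosts, replica_set, username, password, database="admin"):
    """Build a mongodb:// URI that lists every node rather than only the primary."""
    # Reserved characters in credentials (such as '@' or '/') must be percent-encoded.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    host_list = ",".join(hosts)
    return f"mongodb://{credentials}@{host_list}/{database}?replicaSet={replica_set}"


uri = build_replica_set_uri(
    ["node1.example.com:3717", "node2.example.com:3717"],  # placeholder addresses
    replica_set="rs0",       # placeholder replica set name
    username="app_user",     # placeholder account
    password="p@ss/word",    # placeholder password with reserved characters
)
print(uri)
```

Because the URI names more than one node and includes `replicaSet`, a driver can discover the current primary on its own instead of depending on a single address.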
Sharded cluster instances

In a sharded cluster instance, both shard and ConfigServer nodes use a three-node replica set architecture. When one of these nodes is faulty, the system automatically fails over to the hidden node. The failover requires no manual intervention and is transparent to your application, but a transient connection error lasting less than 30 seconds may occur during the switchover.
Configure your application to automatically reconnect after a transient connection error.
mongos nodes behave differently. A mongos node uses a single-node architecture with no automatic failover. If a mongos node is faulty, traffic routed to that node becomes unavailable until the node recovers.
In production environments, connect using a connection string URI instead of the connection string of a single mongos node. With a connection string URI, your client automatically redirects requests to a healthy mongos node if the connected mongos node is faulty. For more information, see Overview of sharded cluster instance connections.
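One way to guard against accidentally configuring a single mongos address is to check the connection string at application startup. The sketch below is a hypothetical sanity check, not part of any driver: `seed_hosts` and `has_mongos_redundancy` are illustrative names, and the parsing assumes a plain `mongodb://` URI whose credentials are percent-encoded.

```python
def seed_hosts(uri):
    """Return the host:port entries listed in a mongodb:// connection string URI."""
    prefix = "mongodb://"
    if not uri.startswith(prefix):
        raise ValueError("expected a mongodb:// URI")
    rest = uri[len(prefix):]
    # The host list ends at the first '/' or '?', whichever comes first.
    for separator in "/?":
        rest = rest.split(separator, 1)[0]
    # Drop "user:password@" if present.
    hosts_part = rest.rsplit("@", 1)[-1]
    return [host for host in hosts_part.split(",") if host]


def has_mongos_redundancy(uri):
    """True if the URI lists at least two mongos addresses."""
    return len(seed_hosts(uri)) >= 2
```

An application could log a warning or refuse to start when `has_mongos_redundancy` returns `False`, since a URI with only one mongos address gives the client no healthy node to fall back on.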