MongoDB writeConcern principles - Alibaba Cloud Developer Forums: Cloud Discussion Forums

[Others]MongoDB writeConcern principles

Posted time: Oct 25, 2016 16:08 PM
MongoDB lets clients configure the write policy (writeConcern) flexibly to meet the needs of different scenarios.
db.collection.insert({x: 1}, {writeConcern: {w: 1}})
The writeConcern option
MongoDB supports the following writeConcern options:
1. w: <number>: Send confirmation to the client after the data has been written to <number> nodes.
o {w: 0}: The client requires no confirmation of the write. Suitable for performance-critical scenarios that do not need to know whether each write succeeded.
o {w: 1}: The default writeConcern option. Send confirmation to the client as soon as the data is written to the primary node.
o {w: "majority"}: Send confirmation to the client after the data has been written to a majority of the members of the replica set. Suitable for scenarios with strict data safety requirements, at the cost of lower write performance.
2. j: <boolean>: Send confirmation to the client after the write operation has been persisted to the journal.
o The default is {j: false}. If you want the client to be confirmed only after the data written to the primary node has been persisted to the journal, set this option to true.
3. wtimeout: <milliseconds>: The write timeout. It applies only when the value of w is greater than 1.
o When {w: <number>} is specified, the write is regarded as successful only after the data has been written to <number> nodes. If a node fails during the write, this condition may never be met, and the confirmation can never be sent to the client. To handle this case, the client can set wtimeout to specify a timeout. If the write takes longer than the timeout, it is reported as failed.
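To make the w values above concrete, here is a minimal sketch of how many acknowledgements each one implies. The function name requiredAcks is hypothetical (it is not a MongoDB API), and the sketch assumes every member counted is a voting, data-bearing node. It only handles numeric w values and "majority".

```javascript
// Hypothetical helper (not part of MongoDB): how many members must have the
// write before the client receives its confirmation.
// Assumption: every member in `votingMembers` is a voting, data-bearing node.
function requiredAcks(w, votingMembers) {
  if (w === "majority") {
    // A majority is strictly more than half of the members.
    return Math.floor(votingMembers / 2) + 1;
  }
  if (Number.isInteger(w) && w >= 0) {
    // {w: 0} needs no acknowledgement, {w: 1} only the primary.
    return w;
  }
  throw new TypeError("unsupported w value in this sketch: " + String(w));
}

console.log(requiredAcks(0, 3));          // 0
console.log(requiredAcks(1, 3));          // 1
console.log(requiredAcks("majority", 3)); // 2
console.log(requiredAcks("majority", 5)); // 3
```

With a three-member replica set, {w: "majority"} therefore waits for the primary plus one secondary, which is why a single failed secondary does not block such writes.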
How {w: "majority"} works
Options such as {w: 1} and {j: true} are easy to understand: the primary node sends the confirmation as soon as its own condition is met. {w: "majority"} is more complicated, because the write is deemed successful only after the data is confirmed to have been written to a majority of the nodes. MongoDB replication works by the secondary nodes continuously pulling and replaying the oplog; the primary node does not actively push written data to the secondary nodes. So how does the primary node confirm that the data has been written to a majority of the nodes?

1. The client sends a write request to the primary node with the writeConcern option set to {w: "majority"}. The primary node receives the request, writes the data locally, records the write in its oplog, and waits for a majority of the nodes to finish synchronizing this oplog entry or batch of entries (each secondary node reports its latest progress to the primary node after it applies the oplog).
2. The secondary node pulls the newly written oplog entries from the primary node, replays them locally, and records them in its own oplog. To let the secondary node pull the oplog from the primary node promptly, the find command supports an awaitData option. When the find command does not locate any matching documents, it does not return immediately, but waits up to maxTimeMS (2s by default) to see whether new matching data arrives. If it does, the command returns the result. So when a new oplog entry is written, the secondary node can obtain it immediately.
3. The secondary node runs a separate thread for reporting progress. When the latest timestamp of its oplog is updated, the secondary node sends the replSetUpdatePosition command to the primary node to update the oplog timestamp that the primary node has on record for it.
4. When the primary node finds that enough nodes have oplog timestamps that meet the condition, it sends the confirmation to the client.
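The four steps above can be sketched as a toy model in plain JavaScript. This is only an illustration of the bookkeeping, not MongoDB's implementation: the class PrimaryModel, its methods, and the integer timestamps are all assumptions made for the example, and networking, awaitData polling, and wtimeout are left out.

```javascript
// Toy model of the primary's bookkeeping. All names here are illustrative
// assumptions, not MongoDB internals.
class PrimaryModel {
  constructor(memberIds) {
    // Last oplog position known for each member; the first id is the primary.
    this.positions = new Map(memberIds.map((id) => [id, 0]));
    this.primaryId = memberIds[0];
    this.oplog = [];
  }

  // Step 1: apply the write locally and record it in the oplog.
  write(doc) {
    const ts = this.oplog.length + 1; // stand-in for an oplog timestamp
    this.oplog.push({ ts, doc });
    this.positions.set(this.primaryId, ts);
    return ts;
  }

  // Step 2: a secondary pulls the entries newer than the position it has.
  entriesAfter(ts) {
    return this.oplog.filter((entry) => entry.ts > ts);
  }

  // Step 3: a secondary reports its progress (the role that
  // replSetUpdatePosition plays in the description above).
  updatePosition(memberId, ts) {
    this.positions.set(memberId, ts);
  }

  // Step 4: confirm once a majority of members have reached ts.
  isMajorityAcknowledged(ts) {
    const caughtUp = [...this.positions.values()].filter((p) => p >= ts).length;
    return caughtUp >= Math.floor(this.positions.size / 2) + 1;
  }
}

const primary = new PrimaryModel(["p", "s1", "s2"]);
const ts = primary.write({ x: 1 });
console.log(primary.isMajorityAcknowledged(ts)); // false: only the primary has it

// Secondary s1 pulls the new entry, replays it, then reports its position.
const pulled = primary.entriesAfter(0);
primary.updatePosition("s1", pulled[pulled.length - 1].ts);
console.log(primary.isMajorityAcknowledged(ts)); // true: 2 of 3 members
```

The point of the model is that the primary never pushes data. It only compares the positions that the secondaries report against the timestamp of the write it is waiting on.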