This topic describes how to test and analyze the impact of chained replication on instance performance by connecting an Elastic Compute Service (ECS) instance to an ApsaraDB for MongoDB replica set instance.
Chained replication
Concept
ApsaraDB for MongoDB supports chained replication. Chained replication occurs when a secondary node in a replica set instance synchronizes data from another secondary node. Chained replication reduces the load of the primary node in the instance. However, depending on the network topology, it may increase the replication latency between the primary and secondary nodes. For more information, see Self-Managed Chained Replication.
In chained replication, the nodes do not need to form a single chain. A secondary node can select a node other than the primary node as its synchronization source based on information such as round-trip time (RTT). The following figure shows multiple five-node replica set topologies, all of which use chained replication.
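If you want to check which node each member currently replicates from, you can view the syncSourceHost field that rs.status() reports for each member. The following command is a minimal sketch. It assumes that the mongosh shell is installed on the ECS instance and that your database account has the privileges required to run the command; the endpoint and account are the masked placeholders used later in this topic.

# Print the synchronization source of each member in the replica set.
# An empty syncSourceHost usually indicates the primary node, which has no source.
mongosh "mongodb://test:****@dds-bp13e84d11****.mongodb.rds.aliyuncs.com:3717/admin" \
  --eval 'rs.status().members.forEach(m => print(m.name, "<-", m.syncSourceHost || "(no source)"))'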
Configure chained replication
You can adjust the settings.chainingAllowed parameter on the Parameters page in the ApsaraDB for MongoDB console to enable or disable chained replication. For more information about how to modify required parameters, see Standalone and replica set instance.
In some cases, chained replication may cause synchronization latency between the primary and secondary nodes. You can disable chained replication to optimize synchronization performance.
For security reasons, you do not have permissions to run the replSetReconfig command on your instance. You must modify the required parameters in the ApsaraDB for MongoDB console.
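If you want to confirm the current value of the parameter from a client, you can read the replica set configuration. The following command is a minimal sketch that assumes your database account is allowed to run rs.conf(); if it is not, check the settings.chainingAllowed value on the Parameters page in the console instead.

# Read-only check: print whether chained replication is currently allowed.
mongosh "mongodb://test:****@dds-bp13e84d11****.mongodb.rds.aliyuncs.com:3717/admin" \
  --eval 'printjson(rs.conf().settings.chainingAllowed)'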
Instance performance impact test
Test environment
Create an ECS instance and an ApsaraDB for MongoDB instance. For more information, see Create a replica set instance and Create and manage an ECS instance in the console (express version).
Architecture of the ApsaraDB for MongoDB instance: a standard replica set topology that consists of a primary node, a secondary node, and a hidden node. You can add secondary and read-only nodes to increase the number of nodes in the instance.
RTT between the ECS instance and the ApsaraDB for MongoDB instance: The average RTT is 0.103 ms when the two instances are in the same region and zone.
The following table describes the configurations of the ECS instance and the ApsaraDB for MongoDB instance used in the test.
| Configuration item | ECS instance | ApsaraDB for MongoDB instance that uses cloud disks |
| --- | --- | --- |
| Region and zone | Beijing Zone H | Beijing Zone H |
| Network type | Virtual Private Cloud (VPC) | VPC |
| Instance category | Compute-optimized c6 | Dedicated |
| Instance type | ecs.c6.xlarge (4 cores, 8 GB) | ecs.c7.xlarge (4 cores, 8 GB) |
| Storage type | ESSD PL0 | ESSD PL1 |
| Image | Alibaba Cloud Linux 3.2104 LTS 64-bit | N/A |
| Kernel or engine version | 4.19.91-26.al7.x86_64 | Major version: MongoDB 5.0; minor version: MongoDB 5.0.30 |
Test tool
The open-source Yahoo Cloud Serving Benchmark (YCSB) 0.17.0 tool is used in the test.
YCSB is a Java tool that can be used to benchmark the performance of multiple types of databases. For more information about how to install and use YCSB, see YCSB.
Procedure
Add the IP address of the ECS instance to a whitelist of the ApsaraDB for MongoDB instance. Log on to the ECS console, view the primary private IP address of the ECS instance in the network information section of the instance details page, and then add the IP address to the whitelist of the ApsaraDB for MongoDB instance.
Use the YCSB tool to load data for testing.
./bin/ycsb.sh load mongodb -s -p workload=site.ycsb.workloads.CoreWorkload -p recordcount=5000000 -p mongodb.url="mongodb://test:****@dds-bp13e84d11****.mongodb.rds.aliyuncs.com:3717/admin" -p table=test -threads 8
Parameters
recordcount: the amount of data loaded to the ApsaraDB for MongoDB instance.
mongodb.url: the endpoint of the ApsaraDB for MongoDB instance. In the test, the database account is test and the database is admin. You can obtain the endpoint in the Internal Connections - VPC section of the Database Connections page in the ApsaraDB for MongoDB console.
threads: the number of concurrent threads on the client.
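The two writeConcern settings compared in the following test results can be specified on the client side. The following command is a sketch only; it assumes that the write concern is passed through the standard w option of the MongoDB connection string and that the YCSB MongoDB binding honors the options in mongodb.url.

# Load data with an explicit write concern in the connection string.
# Use w=majority for {w:"majority"} and w=1 for {w:1}.
./bin/ycsb.sh load mongodb -s \
  -p workload=site.ycsb.workloads.CoreWorkload \
  -p recordcount=5000000 \
  -p mongodb.url="mongodb://test:****@dds-bp13e84d11****.mongodb.rds.aliyuncs.com:3717/admin?w=majority" \
  -p table=test -threads 8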
View the test results and the monitoring information of the tested ApsaraDB for MongoDB instance. On the Node Monitoring tab of the Monitoring Data page, you can select the test time range to view the CPU utilization, QPS, and average response time (RT) for the primary node. For more information, see Node monitoring (previously basic monitoring).
Test results
Parameters
Write Concern: the data persistence assurance level, which determines the conditions that must be met before a write operation is considered successful. Valid values:
{w:"majority"}
: the default value, which indicates that a write operation is not considered successful until the operation is replicated to most nodes in the replica set instance.{w: 1}
: indicates that a write operation is considered successful only by the primary node.
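The following commands illustrate the difference between the two values. This is a sketch that uses a hypothetical database named test and a hypothetical collection named demo, and it assumes that the mongosh shell is installed on the ECS instance.

# A write that waits until a majority of the nodes have replicated the operation.
mongosh "mongodb://test:****@dds-bp13e84d11****.mongodb.rds.aliyuncs.com:3717/admin" \
  --eval 'db.getSiblingDB("test").demo.insertOne({k: 1}, {writeConcern: {w: "majority"}})'
# A write that is acknowledged by the primary node only.
mongosh "mongodb://test:****@dds-bp13e84d11****.mongodb.rds.aliyuncs.com:3717/admin" \
  --eval 'db.getSiblingDB("test").demo.insertOne({k: 2}, {writeConcern: {w: 1}})'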
Test result details
3-node topology
The replica set topology consists of a primary node, a secondary node, and a hidden node.
writeConcern = {w:"majority"}
| Item | Chained replication enabled | Chained replication disabled |
| --- | --- | --- |
| Throughput (OPS) | 5277 | 5241 |
| CPU utilization | 65% | 65% |
| QPS | | |
| Average RT (ms) | | |
writeConcern = {w:1}
| Item | Chained replication enabled | Chained replication disabled |
| --- | --- | --- |
| Throughput (OPS) | 15075 | 14785 |
| CPU utilization | 93% | 93% |
| QPS | | |
| Average RT (ms) | | |
7-node topology
The replica set topology consists of a primary node, five secondary nodes, and a hidden node.
writeConcern = {w:"majority"}
| Item | Chained replication enabled | Chained replication disabled |
| --- | --- | --- |
| Throughput (OPS) | 3005 | 4312 |
| CPU utilization | 56% | 85% |
| QPS | | |
| Average RT (ms) | | |
writeConcern = {w:1}
| Item | Chained replication enabled | Chained replication disabled |
| --- | --- | --- |
| Throughput (OPS) | 14414 | 11492 |
| CPU utilization | 91% | 93% |
| QPS | | |
| Average RT (ms) | | |
15-node topology
The replica set topology consists of a primary node, five secondary nodes, a hidden node, and eight read-only nodes.
Seven of the nodes (the primary node, the five secondary nodes, and the hidden node) can vote in primary node elections. The eight read-only nodes do not participate in elections.
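If you want to confirm which members can vote, you can inspect the votes, priority, and hidden fields in the replica set configuration. This is a sketch only; it assumes your database account is allowed to run rs.conf(), and you can also check the node roles in the ApsaraDB for MongoDB console.

# Print the voting configuration of each member in the replica set.
mongosh "mongodb://test:****@dds-bp13e84d11****.mongodb.rds.aliyuncs.com:3717/admin" \
  --eval 'rs.conf().members.forEach(m => print(m.host, "votes:", m.votes, "priority:", m.priority, "hidden:", !!m.hidden))'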
writeConcern = {w:"majority"}
| Item | Chained replication enabled | Chained replication disabled |
| --- | --- | --- |
| Throughput (OPS) | 2932 | 3123 |
| CPU utilization | 58% | 91% |
| QPS | | |
| Average RT (ms) | | |
writeConcern = {w:1}
| Item | Chained replication enabled | Chained replication disabled |
| --- | --- | --- |
| Throughput (OPS) | 14093 | 7500 |
| CPU utilization | 90% | 94% |
| QPS | | |
| Average RT (ms) | | |
Performance comparison and summary
When the number of nodes is fixed, the writeConcern setting determines whether write performance is degraded after chained replication is disabled.
writeConcern = {w:1}
For the 3-node instance, performance degradation caused by disabling chained replication is negligible.
For the 7-node instance, performance degradation caused by disabling chained replication is about 20.3%.
For the 15-node instance, performance degradation caused by disabling chained replication reaches 46.8%. In addition, the CPU utilization of the primary node significantly increases after chained replication is disabled.
writeConcern = {w:"majority"}
For the 3-node instance, performance degradation caused by disabling chained replication is negligible.
For the 7-node and 15-node instances, performance is improved by about 6.5% to 43.5% after chained replication is disabled.
Cause of the performance improvement: After chained replication is disabled, the replication chains of all nodes in the replica set instance are shortened. This way, the majority condition is met sooner and the latency of a single write is reduced.
Impact of the number of non-voting nodes: As the number of non-voting nodes increases (a replica set instance can contain at most seven voting nodes), the performance improvement from disabling chained replication gradually falls short of expectations because the primary node bears a higher synchronization load.
When the chainingAllowed and writeConcern settings are the same, write performance decreases as the number of nodes increases. The degradation trend is more significant after chained replication is disabled.
By default, chainingAllowed is set to true and writeConcern is set to {w:"majority"}. If you use the default settings, write performance is not significantly degraded when you change the number of nodes from 7 to 15. This is because a replica set instance can contain at most seven voting nodes, so the majority condition remains unchanged.
Regardless of whether chained replication is disabled, performance with writeConcern set to {w:1} is significantly better than performance with writeConcern set to {w:"majority"}, which is consistent with the design of writeConcern.
When the writeConcern setting is fixed, the CPU utilization of the primary node increases more significantly with the number of nodes when chained replication is disabled than when it is enabled.
Best practices
When the instance contains a small number of nodes, you can enable or disable chained replication based on your business requirements. The overall instance performance is not affected, and the CPU utilization does not change significantly.
When the instance contains a large number of nodes:
If writeConcern is set to {w:1}, we recommend that you enable chained replication.
If writeConcern is set to {w:"majority"}, balance the load of the primary node (such as CPU utilization) against instance performance. Disabling chained replication improves write performance but significantly increases the load of the primary node.