This topic describes the MongoDB versions and storage engines supported by ApsaraDB for MongoDB and how the versions map to storage engines, so that you can select the instance that best suits your needs.
Supported MongoDB versions
- MongoDB 6.0
- MongoDB 5.0
- MongoDB 4.4
- MongoDB 4.2
- MongoDB 4.0
- MongoDB 3.4
- MongoDB 3.2
This version is deprecated. For more information, see [Notice] ApsaraDB for MongoDB has phased MongoDB 3.2 out and released MongoDB 4.2 since February 4.
Storage engines
Storage engine | Scenario | Description |
---|---|---|
WiredTiger | It is the default storage engine and is applicable to most business scenarios. | Data is stored in a B-tree structure. Compared with the earlier MMAPv1 storage engine, WiredTiger significantly improves performance, supports data compression, and reduces storage costs. |
Relationship between MongoDB versions and storage engines
Storage engine | MongoDB 4.4 or later | MongoDB 4.2 | MongoDB 4.0 | MongoDB 3.4 |
---|---|---|---|---|
WiredTiger | Supported | Supported | Supported | Supported |
MongoDB 6.0
- Queryable encryption
Queryable encryption allows users to encrypt sensitive data from the client side, store the data as fully randomized encrypted data on the database server side, and then run expressive queries on the encrypted data.
Queryable encryption allows only the client to obtain the plaintext of sensitive data. After a query that contains an encryption key obtained from Key Management Service (KMS) is sent to the server, the server processes the query and returns a response in ciphertext. Then, the client decrypts the response by using the encryption key and displays the response in plaintext.
- Cluster-to-Cluster Sync
The mongosync tool is introduced in MongoDB 6.0 to provide you with continuous, uni-directional data synchronization of MongoDB clusters in the same or different environments (Atlas, hybrid, on-premises, and edge clusters). The mongosync tool allows you to manage and monitor the synchronization process in real time. In the process, you can start, stop, resume, or reverse the synchronization task.
- Time series collections
Time series collections are enhanced in terms of indexing, query, and sorting.
- Secondary and compound indexes can be used on time series collections to improve read performance.
- Geospatial indexes can be added for time series data to support scenarios that involve distance and location. For example, you can track temperature changes in refrigerated trucks and monitor fuel consumption of cargo ships on a specific route.
- Last point queries on time series data are optimized. You no longer need to scan the entire collection to obtain the last data point.
- Time series data can be sorted in a more efficient manner by using timestamps and creating clustered and secondary indexes on metadata fields.
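The following sketch illustrates these capabilities in the MongoDB shell. The collection, field, and index names are hypothetical examples, not part of ApsaraDB for MongoDB documentation:

```javascript
// Create a time series collection with a metadata field (MongoDB 5.0 and later).
db.createCollection("truckReadings", {
  timeseries: { timeField: "ts", metaField: "truck", granularity: "minutes" }
})

// MongoDB 6.0: secondary and compound indexes on time series collections.
db.truckReadings.createIndex({ "truck.id": 1, ts: -1 })

// MongoDB 6.0: geospatial index for distance- and location-based queries.
db.truckReadings.createIndex({ "truck.location": "2dsphere" })
```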
- Change streams
- Pre-images can be viewed.
Note Prior to MongoDB 6.0, only post-images can be viewed. Starting from MongoDB 6.0, pre-images can also be viewed. For more information about pre-images and post-images, see Change Streams with Document Pre- and Post-Images.
- DDL statements such as `create`, `createIndexes`, `modify`, and `shardCollection` are supported. For more information, see Change Events.
- The `wallTime` field is added to change events. The timestamp used by the field supports multiple conversion operators (including `$toDate`, `$tsSeconds`, and `$tsIncrement`) to facilitate consumption.
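A minimal sketch of pre-images in a change stream, using a hypothetical `orders` collection:

```javascript
// Enable pre- and post-images when creating the collection (MongoDB 6.0).
db.createCollection("orders", { changeStreamPreAndPostImages: { enabled: true } })

// Open a change stream that also returns the document state before each update.
const stream = db.orders.watch([], {
  fullDocument: "updateLookup",
  fullDocumentBeforeChange: "whenAvailable"
})
// Each returned event then contains a fullDocumentBeforeChange field (the pre-image).
```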
- Aggregations
- Supports `$lookup` and `$graphLookup` for sharded cluster instances.
- Improves the support of `$lookup` for joins.
- Improves the support of `$graphLookup` for graph traversal.
- Achieves up to a hundredfold improvement in the performance of `$lookup`.
Note For more information about `$lookup` and `$graphLookup`, see $lookup (aggregation) and $graphLookup (aggregation).
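A basic `$lookup` join, shown with hypothetical `orders` and `customers` collections:

```javascript
// Attach the matching customer document to each order.
db.orders.aggregate([
  { $lookup: {
      from: "customers",
      localField: "customerId",
      foreignField: "_id",
      as: "customer"
  } }
])
```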
- Queries
Operators such as `$maxN`, `$topN`, `$minN`, `$bottomN`, `$lastN`, and `$sortArray` are added. These operators allow you to offload computation to the database and reduce the burden on business applications.
Note For more information about the operators, see Aggregation Pipeline Operators.
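A short sketch of two of these operators, using hypothetical `scores` and `teams` collections:

```javascript
// $topN as a $group accumulator: top 3 scores per team.
db.scores.aggregate([
  { $group: {
      _id: "$team",
      top3: { $topN: { n: 3, sortBy: { score: -1 }, output: "$player" } }
  } }
])

// $sortArray: sort an embedded array without application-side code.
db.teams.aggregate([
  { $project: { members: { $sortArray: { input: "$members", sortBy: { joined: 1 } } } } }
])
```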
- Elasticity
- Updates the default size of a data chunk from 64 MB to 128 MB. This reduces data migration frequency and decreases networking and routing overheads.
- Supports the `configureCollectionBalancing` command that provides the following features:
- Supports different chunk sizes for different sharded tables. For example, you can set the chunk size to 256 MB for ultra-large sharded tables and set the chunk size to 64 MB or 32 MB for small sharded tables that require even data distribution.
- Supports active collection defragmentation. Compared with the `compact` command, the `configureCollectionBalancing` command provides better defragmentation services to reduce disk space usage.
Note For more information about the `configureCollectionBalancing` command, see configureCollectionBalancing.
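A sketch of the command for a hypothetical `shop.orders` namespace; the values are illustrative only:

```javascript
// Use a larger chunk size for a very large sharded collection and trigger defragmentation.
db.adminCommand({
  configureCollectionBalancing: "shop.orders",
  chunkSize: 256,             // chunk size in MB
  defragmentCollection: true  // actively defragment the collection
})
```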
- Security protection
MongoDB 6.0 optimizes the Client-Side Field Level Encryption (CSFLE) feature. After the optimization, CSFLE is available for Key Management Interoperability Protocol (KMIP)-compliant key providers. In other words, in addition to local keyfile-based key management, MongoDB also supports third-party key management appliances by using KMIP. This provides enhanced security.
Note CSFLE is widely used in the management of sensitive data, especially in data migration scenarios.
MongoDB 5.0
- Native time series platform
MongoDB 5.0 natively supports the entire lifecycle of time series data, from ingestion, storage, query, real-time analysis, and visualization to online archival or automatic expiration as data ages. This streamlines the building and running of time series applications and lowers costs. In version 5.0, MongoDB has expanded the universal application data platform to make it easier for developers to process time series data. This further extends the application scenarios of MongoDB to areas such as IoT, financial analysis, and logistics.
- Live resharding
You can change the shard key for your collection on demand as your workload grows or evolves. No database downtime or complex migration within the dataset is required in this process. You can run the reshardCollection command in the MongoDB Shell to select the database and collection that you want to reshard and specify the new shard key.
db.adminCommand({ reshardCollection: "<database>.<collection>", key: <shardkey> })
Note- <database>: the name of the database that you want to reshard.
- <collection>: the name of the collection that you want to reshard.
- <shardkey>: the name of the shard key.
- When you run the reshardCollection command on MongoDB, it clones an existing collection and then applies all oplogs in the existing collection to a new one. After applying all oplogs, MongoDB switches to the new collection and deletes the existing collection.
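A concrete sketch of the command, using a hypothetical `shop.orders` collection and shard key:

```javascript
// Reshard the "orders" collection in the "shop" database onto a new compound shard key.
db.adminCommand({
  reshardCollection: "shop.orders",
  key: { customerId: 1, orderId: 1 }
})
```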
- Versioned API
Versioned API allows you to add new features to the database of each version with full backward compatibility. When you change an API, you can run a new version of the API on the same server at the same time as the existing version of the API. As new MongoDB versions are released at a faster pace, the Versioned API feature provides easier access to the features of the latest versions.
The Versioned API feature defines a set of commands and parameters that are most commonly used in applications. These commands remain unchanged for all database releases, including annual major releases and quarterly rapid releases. As a result, the application lifecycle is decoupled from the database lifecycle, which allows you to pin the driver to a specific version of the MongoDB API. This way, even after your database is upgraded, your application can continue to run for several years without the need to modify any code.
- Default majority write concern
Starting from MongoDB 5.0, the default write concern level is majority. A write operation is committed, and a write success response is returned to the application, only after the write operation has been applied on the primary node and persisted to the logs of a majority of replica set members. This ensures that MongoDB 5.0 provides stronger data durability guarantees out of the box.
- Long-running snapshot queries
Long-running snapshot queries improve the versatility and flexibility of applications. By default, snapshot queries executed by this feature have a duration of 5 minutes. The duration can be customized. In addition, this feature maintains strong consistency with snapshot isolation guarantees without affecting the performance of your live, transactional workloads, and allows you to execute snapshot queries on secondary nodes. This allows you to run different workloads in a single cluster and scale them to different shards.
- New MongoDB Shell
For better user experience, the MongoDB Shell has been redesigned from the ground up to provide a modern command-line experience, enhanced usability features, and a powerful scripting environment. The new MongoDB Shell has become the default shell for MongoDB. The new MongoDB Shell introduces syntax highlighting, intelligent auto-complete, contextual help, and useful error messages to create a visualized and interactive experience.
- Version release adjustment
Starting with the 5.0 release, MongoDB is released as two different release series: Rapid Releases and Major Releases. Rapid Releases are available for evaluation and development purposes. We recommend that you do not use Rapid Releases in the production environment.
MongoDB 4.4
- Hidden indexes
You can hide an existing index from the query planner so that it is not used in subsequent queries. This lets you check whether removing a seemingly inefficient index compromises performance. If performance is not affected, you can safely delete the index.
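A minimal sketch with a hypothetical collection and index name:

```javascript
// Hide the index from the query planner; it is still maintained on writes.
db.orders.hideIndex("customerId_1")
// ...observe workload performance, then either drop it or restore it:
db.orders.unhideIndex("customerId_1")
```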
- Refinable shard keys
You can add one or more suffix fields to an existing shard key to improve the distribution of documents across chunks. This prevents traffic from concentrating on a single shard and decreases pressure on servers.
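A sketch of refining a shard key, with hypothetical names; the new key must keep the existing key as its prefix:

```javascript
// Existing shard key: { customerId: 1 }. Add orderId as a suffix field.
db.adminCommand({
  refineCollectionShardKey: "shop.orders",
  key: { customerId: 1, orderId: 1 }
})
```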
- Compound hashed shard keys
A single hashed field can be specified in a compound shard key to simplify the business logic.
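A sketch with hypothetical field names:

```javascript
// Compound shard key in which one field is hashed.
sh.shardCollection("shop.orders", { region: 1, customerId: "hashed" })
```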
- Hedged reads
For sharded cluster instances, a read request can be sent to two replicas of the same shard simultaneously. The response that arrives first is returned to the client. This reduces request latency.
- Streaming replication
Oplogs of the primary database are actively streamed to the secondary database. Compared with earlier versions, in which the secondary database polls for oplogs, this method saves nearly half of the round-trip time and improves replication performance between the primary and secondary databases.
- Simultaneous indexing
Index builds on the primary and secondary databases are performed simultaneously. This greatly reduces the replication lag caused by index builds and ensures that the secondary database can serve the latest data in a timely manner.
- Mirrored reads
The primary node synchronizes a portion of read traffic to the secondary node for processing. The secondary node processes read traffic, which can reduce the access latency.
- Resumable initial sync
ApsaraDB for MongoDB supports resumable initial sync during full synchronization between primary and secondary nodes. This prevents the synchronization from restarting from the beginning after a network interruption.
- Time-based oplog retention
You can customize the retention period of oplogs. When oplogs are cleared, full synchronization from the primary node is required.
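A sketch of the underlying MongoDB 4.4 command, with illustrative values; on a managed instance this may be configured through the console instead:

```javascript
// Resize the oplog to 16384 MB and keep at least 24 hours of oplog entries.
db.adminCommand({ replSetResizeOplog: 1, size: 16384, minRetentionHours: 24 })
```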
- Union
The $unionWith stage is added to improve the query capabilities of MongoDB. This stage is similar to the `UNION ALL` statement in SQL.
- Custom aggregation expressions
The $accumulator and $function operators are added to implement custom aggregation expressions and improve interface consistency and user experience.
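A sketch of both features, using hypothetical collections and fields:

```javascript
// $unionWith: combine current and archived orders, similar to UNION ALL in SQL.
db.orders.aggregate([
  { $match: { status: "shipped" } },
  { $unionWith: { coll: "orders_archive", pipeline: [ { $match: { status: "shipped" } } ] } }
])

// $function: a custom JavaScript expression evaluated per document.
db.orders.aggregate([
  { $addFields: {
      total: { $function: {
        body: function(items) { return items.reduce((s, i) => s + i.price * i.qty, 0); },
        args: ["$items"],
        lang: "js"
      } }
  } }
])
```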
MongoDB 4.2
- Distributed transactions
The two-phase commit method ensures the ACID properties of sharded cluster transactions, expands business scenarios, and achieves a leap from NoSQL to NewSQL.
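A sketch of a multi-document transaction in the MongoDB shell; the database and collections are hypothetical:

```javascript
const session = db.getMongo().startSession()
session.startTransaction({ readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } })
try {
  session.getDatabase("shop").orders.insertOne({ item: "book", qty: 1 })
  session.getDatabase("shop").inventory.updateOne({ item: "book" }, { $inc: { stock: -1 } })
  session.commitTransaction()   // both writes become visible atomically
} catch (e) {
  session.abortTransaction()    // neither write is applied
  throw e
} finally {
  session.endSession()
}
```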
- Repeatable reads
The repeatable read feature provides the automatic retry capability in a poor-quality network environment. This reduces logic complexity at the service side and ensures continuity of your business.
- Wildcard indexes
You can create a wildcard index on nondeterministic fields to cover multiple fields in a document for flexible management and usage.
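A sketch with a hypothetical field whose sub-fields vary from document to document:

```javascript
// Index every path under "attributes", whatever sub-fields each document has.
db.products.createIndex({ "attributes.$**": 1 })
```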
- Field-level encryption
Field-level encryption is supported at the driver layer and can be used to separately encrypt specified sensitive information such as accounts, passwords, prices, and mobile phone numbers. You can use field-level encryption to make business more flexible and secure without full-database encryption.
- Materialized views
On-demand materialized views can cache computation results to make computing more efficient and application logic less complex.
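A sketch of refreshing such a view with the $merge stage; the collection and field names are hypothetical:

```javascript
// Aggregate orders and persist the results into a separate collection that acts as the view.
db.orders.aggregate([
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  { $merge: { into: "customerTotals", whenMatched: "replace", whenNotMatched: "insert" } }
])
```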
MongoDB 4.0
- Cross-document transactions
As the first NoSQL database that supports cross-document transactions, MongoDB 4.0 combines the speed, flexibility, features, and ACID guarantee of document models.
- Migration speed increased by 40%
Concurrent read and write operations allow new shard nodes to migrate data faster and take on service load sooner.
- Read performance significantly improved
The transaction feature ensures that secondary nodes no longer block read requests due to log synchronization. The multi-node scaling feature is supported in all versions to significantly improve reading capabilities.
MongoDB 3.4
- Faster synchronization between the primary and secondary nodes
All indexes are created when data is synchronized (only the _id index is created in earlier versions). During data synchronization, the secondary node continuously reads new oplog information to ensure that the local database of the secondary node has enough space to store temporary data.
- More efficient load balancing
In earlier versions, mongos nodes are responsible for load balancing of sharded cluster instances. Multiple mongos nodes contest a distributed lock. The node that obtains the lock performs load balancing tasks and migrates chunks between shard nodes. In MongoDB 3.4, the primary Configserver node is responsible for load balancing. This greatly improves the concurrency and efficiency of load balancing.
- More aggregation operations
Many aggregation operators are added in MongoDB 3.4 to provide more powerful data analysis capabilities. For example, `$bucket` can conveniently classify data, `$graphLookup` supports more complex relational operations than the `$lookup` operator in MongoDB 3.2, and `$addFields` allows richer document operations such as summing several fields into a new field.
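A sketch of `$bucket` with a hypothetical collection; boundaries and fields are illustrative:

```javascript
// Group products into price bands and count the documents in each band.
db.products.aggregate([
  { $bucket: {
      groupBy: "$price",
      boundaries: [0, 100, 500, 1000],
      default: "other",
      output: { count: { $sum: 1 } }
  } }
])
```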
- Sharding zones supported
The zone concept is introduced for sharded cluster instances to replace the current tag-aware sharding mechanism. The zone feature can allocate data to one or more specified shard nodes. This feature allows you to conveniently deploy sharded cluster instances across data centers.
- Collation supported
In earlier versions, strings stored in documents are always compared byte by byte, regardless of language or letter case. After collation is introduced, string content can be interpreted and compared based on the specified locale. Case-insensitive comparison is also supported.
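A sketch of a case-insensitive query with collation, using a hypothetical collection:

```javascript
// strength: 2 ignores letter case while keeping diacritics significant.
db.users.find({ name: "alice" }).collation({ locale: "en", strength: 2 })
```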
- Read-only views
MongoDB 3.4 supports read-only views. A view presents the data that matches a query condition as a special collection, on which you can run further queries.
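A sketch of creating and querying a read-only view; the names are hypothetical:

```javascript
// Define a view over the "users" collection that exposes only adult users.
db.createView("adultUsers", "users", [ { $match: { age: { $gte: 18 } } } ])
// Query the view like a normal (read-only) collection.
db.adultUsers.find({ age: { $gte: 65 } })
```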