
ApsaraDB for MongoDB: What do I do if my instance is in the stuttering state or encounters an exception due to a large number of databases and collections?

Last Updated: May 14, 2025

If a large number of databases and collections exist in your ApsaraDB for MongoDB instance, the performance of databases in the instance may be degraded and other issues may occur.

Databases and tables in traditional relational databases correspond to databases and collections in ApsaraDB for MongoDB.

In ApsaraDB for MongoDB, the WiredTiger storage engine creates a separate disk file for each collection, and each index also becomes its own disk file. For each opened resource, such as a file system object, WiredTiger maintains a unique data structure known as a data handle (dhandle) that stores information about checkpoints, the number of session references, pointers to the in-memory B-tree structure, and data statistics.

Therefore, the more collections an ApsaraDB for MongoDB instance contains, the more operating system files WiredTiger opens and the more dhandles it creates. If a large number of dhandles exist in memory, lock contention occurs, which degrades instance performance.

Potential issues

  • Slow queries and increased request latency caused by handle locks or schema locks

    If a large number of collections exist, the following slow query log is generated:

    2024-03-07T15:59:16.856+0800 I  COMMAND  [conn4175155] command db.collections command: count { count: "xxxxxx", query: { A: 1, B: 1 }, 
    $readPreference: { mode: "secondaryPreferred" }, $db: "db" } planSummary: COLLSCAN keysExamined:0 keysExaminedBySizeInBytes:0 
    docsExamined:1 docsExaminedBySizeInBytes:208 numYields:1 queryHash:916BD9E3 planCacheKey:916BD9E3 reslen:185 
    locks:{ ReplicationStateTransition: { acquireCount: { w: 2 } }, Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 2 } }, 
    Collection: { acquireCount: { r: 2 } }, Mutex: { acquireCount: { r: 1 } } } storage:{ data: { bytesRead: 304, timeReadingMicros: 4 }, 
    timeWaitingMicros: { handleLock: 40, schemaLock: 134101710 } } protocol:op_query 134268ms

    The log indicates that a count operation on a collection that contains only a single document took a long time to complete. The timeWaitingMicros: { handleLock: 40, schemaLock: 134101710 } field indicates that the read request waited a long time to acquire the underlying handle lock and schema lock because of the large number of collections.

  • An out-of-memory (OOM) error occurs during synchronization initialization when you add nodes.

  • A longer time is required to restart an instance.

  • It takes a longer time to synchronize data.

  • It takes a longer time to back up and restore data.

  • The failure rate of physical backups is increased.

  • It takes a longer time to recover an instance from failures.
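The slow query log shown above exposes these waits in the timeWaitingMicros field. As an illustration, the following Python sketch (not an official tool; the regular expression is an assumption based on the sample log format) extracts the per-lock wait times from such a line:

```python
import re

def lock_wait_micros(log_line: str) -> dict:
    """Extract per-lock wait times (in microseconds) from the
    timeWaitingMicros section of a MongoDB slow-query log line."""
    match = re.search(r"timeWaitingMicros:\s*\{([^}]*)\}", log_line)
    if not match:
        return {}
    return {name: int(value)
            for name, value in re.findall(r"(\w+):\s*(\d+)", match.group(1))}

line = ('storage:{ data: { bytesRead: 304, timeReadingMicros: 4 }, '
        'timeWaitingMicros: { handleLock: 40, schemaLock: 134101710 } } '
        'protocol:op_query 134268ms')
waits = lock_wait_micros(line)
# The ~134 s schemaLock wait accounts for nearly all of the 134268 ms runtime.
```

A schemaLock wait that dominates the total runtime, as in the sample, points to dhandle or schema lock contention rather than a slow query plan.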

Note

A large number of collections does not necessarily cause issues. Whether issues occur depends on factors such as the business model and workload. For example, in both of the following scenarios, the databases use the same specifications and contain 10,000 collections and 100,000 documents, but the scenarios behave differently:

  • Accounting software system: Access shows obvious locality. Most collections hold only cold data, and only a small portion of collections are frequently accessed.

  • Multi-tenant management system: Tenants are isolated by collection. Most collections are actively accessed.

Optimization methods

Remove unnecessary collections

Query collections that can be removed from databases, such as collections that have expired or are no longer required. Then, run the dropCollection command to remove the collections. For more information about the command, see dropCollection().

Warning

Before you run the command, make sure that a full backup file is available.

Run the following commands to query the information of a database and a collection in the database:

  • Run the following command to query the number of collections in a database:

    db.getSiblingDB(<dbName>).getCollectionNames().length
  • Run the following command to query the information of a database, such as the number of collections and indexes, the number of document entries, and the total data volume.

    // Query the statistics on a database.
    db.getSiblingDB(<dbName>).stats()
  • Run the following command to query the information of a collection:

    // Query the statistics of a collection.
    db.getSiblingDB(<dbName>).<collectionName>.stats()

Remove unnecessary indexes

To mitigate the preceding issues, reduce the number of indexes. This reduces the number of disk files stored by the WiredTiger storage engine and the number of corresponding dhandles.

Comply with the following rules when you optimize indexes:

  • Avoid invalid indexes

    Indexes on fields that are never accessed by queries are never hit. Such indexes are invalid and can be removed.

  • Prefix matching rules for indexes

    For example, the {a:1} and {a:1,b:1} indexes both exist. The {a:1} index is redundant because it is a prefix of the {a:1,b:1} index. You can remove the {a:1} index.

  • Order of index fields in an equality query

    For example, the {a:1,b:1} and {b:1,a:1} indexes both exist. In equality matching, the order of the index fields does not affect the matching results. You can remove the index that has fewer hits.

  • Apply the ESR rule to range queries

    Construct optimal compound indexes based on your actual query patterns and in the Equality, Sort, Range order. For more information, see The ESR (Equality, Sort, Range) Rule.

  • Review indexes that have a small number of hits

    In most cases, an index that has a small number of hits overlaps with another index that has a large number of hits. Determine whether to remove such indexes based on all of your query patterns.
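The prefix matching rule above can be checked mechanically. The following Python sketch (an illustration, not part of MongoDB; the helper name is an assumption) flags indexes that are pure prefixes of another index and are therefore candidates for removal:

```python
def redundant_prefix_indexes(indexes):
    """Return index key specs that are pure prefixes of another index.

    Each index is given as an ordered list of (field, direction) pairs,
    for example [("a", 1), ("b", 1)] for the {a:1, b:1} index.
    """
    redundant = []
    for idx in indexes:
        for other in indexes:
            # idx is redundant if a longer (or equal-length, distinct)
            # index starts with exactly the same fields and directions.
            if other is not idx and other[:len(idx)] == idx:
                redundant.append(idx)
                break
    return redundant

indexes = [
    [("a", 1)],             # prefix of {a:1, b:1}, so redundant
    [("a", 1), ("b", 1)],
    [("c", -1)],
]
print(redundant_prefix_indexes(indexes))  # [[('a', 1)]]
```

Field order and sort direction both matter: [("a", 1)] is a prefix of [("a", 1), ("b", 1)] but not of [("a", -1), ("b", 1)].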

You can use the $indexStats aggregation stage of ApsaraDB for MongoDB to view the statistics on all indexes in a collection. For more information about the aggregation stage, see $indexStats (aggregation). Run the following command to query the statistics on all indexes in the collection. Before you run the command, make sure that you have the required permissions.

// Query the statistics of all indexes in a collection.
db.getSiblingDB(<dbName>).<collectionName>.aggregate([{ "$indexStats": {} }])

Sample result:

{
   "name" : "item_1_quantity_1",
   "key" : { "item" : 1, "quantity" : 1 },
   "host" : "examplehost.local:27018",
   "accesses" : {
      "ops" : NumberLong(1),
      "since" : ISODate("2020-02-10T21:11:23.059Z")
   }
}

The following describes the parameters returned in the preceding result:

  • name: The name of the index.

  • key: The key specification of the index.

  • accesses.ops: The number of operations that used the index, which indicates the number of hits of the index.

  • accesses.since: The time from which the statistics have been collected. If the instance is restarted or the index is rebuilt, the accesses.since and accesses.ops values are reset.

If an index has only a small number of hits, such as when accesses.ops is 1, the index may be redundant or invalid, and you can remove it. If your instance runs MongoDB 4.4 or later, you can first hide the index and confirm that your workload runs without exceptions for a period of time before you remove it. This reduces the risks caused by index removal. For more information, see db.collection.hideIndex().
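The decision above can be scripted against $indexStats output. The following Python sketch is illustrative only: the thresholds and the helper name low_hit_indexes are assumptions, and the counters must have accumulated long enough to be meaningful.

```python
from datetime import datetime, timezone

def low_hit_indexes(index_stats, max_ops=10, min_days=30, now=None):
    """From $indexStats-style documents, list indexes with at most
    max_ops hits whose counters have accumulated for at least min_days.
    These are candidates to hide (MongoDB 4.4+) and later remove."""
    now = now or datetime.now(timezone.utc)
    candidates = []
    for stat in index_stats:
        if stat["name"] == "_id_":  # never remove the default _id index
            continue
        accesses = stat["accesses"]
        if accesses["ops"] <= max_ops and (now - accesses["since"]).days >= min_days:
            candidates.append(stat["name"])
    return candidates
```

Hiding a candidate first with db.collection.hideIndex() lets you confirm that no query regresses before the index is actually dropped.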

Examples

For example, a collection named players exists. The game is governed by the rule that whenever a player collects 20 coins, the coins are automatically converted into a star. The collection contains documents in the following format:

// players collection
{
  "_id": "ObjectId(123)",
  "first_name": "John",
  "last_name": "Doe",
  "coins": 11,
  "stars": 2
}

The collection contains the following indexes:

  • _id (default index)

  • { last_name: 1 }

  • { last_name: 1, first_name: 1 }

  • { coins: -1 }

  • { stars: -1 }

The following index optimization logic applies:

  • Queries do not access the coins field. Therefore, the { coins: -1 } index is invalid.

  • The { last_name: 1, first_name: 1 } index contains the { last_name: 1 } index based on the preceding index prefix matching rule. Therefore, you can remove the { last_name: 1 } index.

  • The $indexStats output shows that the { stars: -1 } index has a small number of hits, which indicates that the index is not frequently used. However, you must retain the { stars: -1 } index because the players must be sorted in reverse order by number of stars at the end of the last round of the game. Removing the index would cause full collection scans for that query.

After index optimization, the following indexes remain in the collection:

  • _id

  • { last_name: 1, first_name: 1 }

  • { stars: -1 }

Index optimization provides the following benefits:

  • The storage capacity of indexes is reduced.

  • Data writing performance is improved.

If you have further questions about index optimization, submit a ticket.

Integrate data in multiple collections

To reduce the number of collections, integrate data from multiple collections into a single collection.

For example, a database named temperatures exists. The database is used to store all temperature data obtained from a sensor. The sensor works from 10:00 to 22:00. During the working period, the sensor reads temperature data every half hour and saves the data to the database. The temperature data of each day is stored in a collection whose name contains a date.

The following data is stored in two collections named temperatures.march-09-2020 and temperatures.march-10-2020:

  • Collection named temperatures.march-09-2020

    {
      "_id": 1,
      "timestamp": "2020-03-09T10:00:00Z",
      "temperature": 29
    }
    {
      "_id": 2,
      "timestamp": "2020-03-09T10:30:00Z",
      "temperature": 30
    }
    ...
    {
      "_id": 25,
      "timestamp": "2020-03-09T22:00:00Z",
      "temperature": 26
    }
  • Collection named temperatures.march-10-2020

    {
      "_id": 1,
      "timestamp": "2020-03-10T10:00:00Z",
      "temperature": 30
    }
    {
      "_id": 2,
      "timestamp": "2020-03-10T10:30:00Z",
      "temperature": 32
    }
    ...
    {
      "_id": 25,
      "timestamp": "2020-03-10T22:00:00Z",
      "temperature": 28
    }

The number of collections in the database increases over time. Although ApsaraDB for MongoDB does not impose a hard upper limit on the number of collections, this model has no clear data lifecycle, so the number of collections in the database and the number of corresponding indexes keep increasing.

This data model also does not facilitate cross-day queries. If you want to query data across multiple days to obtain long-term temperature trends, you must join collections by using $lookup, which performs worse than queries on a single collection.

An optimized data model integrates the temperature data from the two collections into a single collection and stores each day's temperature data in a single document. The following sample code provides an example of the optimized data model:

// temperatures.readings
{
  "_id": ISODate("2020-03-09"),
  "readings": [
    {
      "timestamp": "2020-03-09T10:00:00Z",
      "temperature": 29
    },
    {
      "timestamp": "2020-03-09T10:30:00Z",
      "temperature": 30
    },
    ...
    {
      "timestamp": "2020-03-09T22:00:00Z",
      "temperature": 26
    }
  ]
}
{
  "_id": ISODate("2020-03-10"),
  "readings": [
    {
      "timestamp": "2020-03-10T10:00:00Z",
      "temperature": 30
    },
    {
      "timestamp": "2020-03-10T10:30:00Z",
      "temperature": 32
    },
    ...
    {
      "timestamp": "2020-03-10T22:00:00Z",
      "temperature": 28
    }
  ]
}

The optimized data model consumes far fewer resources than the original model. You do not need to create per-day indexes on the reading time, because the default _id index of the temperatures.readings collection supports queries by date. The optimized model also resolves the issue of an ever-increasing number of collections.
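The transformation from the per-reading documents to the bucketed model can be sketched in Python as follows. This is an illustration of the modeling idea, not a migration tool, and the helper name bucket_by_day is an assumption:

```python
from collections import defaultdict

def bucket_by_day(readings):
    """Group individual sensor readings into one document per day,
    mirroring the optimized temperatures.readings model."""
    buckets = defaultdict(list)
    for r in readings:
        day = r["timestamp"][:10]  # "2020-03-09" from the ISO timestamp
        buckets[day].append(
            {"timestamp": r["timestamp"], "temperature": r["temperature"]}
        )
    return [{"_id": day, "readings": rs} for day, rs in sorted(buckets.items())]

rows = [
    {"_id": 1, "timestamp": "2020-03-09T10:00:00Z", "temperature": 29},
    {"_id": 2, "timestamp": "2020-03-09T10:30:00Z", "temperature": 30},
    {"_id": 1, "timestamp": "2020-03-10T10:00:00Z", "temperature": 30},
]
docs = bucket_by_day(rows)
# Two documents: one for 2020-03-09 (2 readings), one for 2020-03-10 (1 reading)
```

With at most 25 readings per day, each bucketed document stays far below the 16 MB document limit, so this grouping granularity is safe.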

Note

For time series data, you can use time series collections to resolve the preceding issue. For more information, see Time Series.

Time series collections are supported only by instances that run MongoDB 5.0 or later.

Instance splitting

If the total number of collections in an ApsaraDB for MongoDB standalone instance cannot be reduced, split the instance and modify the configurations of the instance.

The following describes the splitting solutions for two scenarios.

Scenario 1: Collections are distributed across multiple databases

If the business across the databases is not closely associated, such as when multiple applications or services share the same instance, migrate the data of some databases to a new ApsaraDB for MongoDB instance. For more information, see Migrate data from an ApsaraDB for MongoDB replica set instance to an ApsaraDB for MongoDB replica set or sharded cluster instance. The business logic and access methods must be split accordingly.

If the business across the databases is closely associated, refer to the splitting solution for the scenario in which collections are distributed in a single database.

Take note of the following points:

  • Select the required source databases when you create a Data Transmission Service (DTS) task.

  • Retain the original collection names or modify the names during the migration.

  • Run the dropDatabase command to remove the source databases from the source instance after the migration is complete.

Scenario 2: Collections are distributed in a single database

First, determine whether your application can split the collections by a dimension, such as region, city, priority, or another business-related dimension. Then, use DTS to migrate the data of some collections to one or more new ApsaraDB for MongoDB instances so that the collections are distributed across multiple instances. The business logic and access methods must be split accordingly.

Take note of the following points:

  • Select the required source database when you create a DTS task.

  • Retain the original collection names or modify the names during the migration.

  • Run the drop command to remove the migrated collections from the source instance after the migration is complete.

  • If you want to perform aggregate queries after the splitting, additional business logic is required.

Examples

A multi-tenant management platform uses a MongoDB database. In the initial data model, each tenant has a separate collection. As the business grows, the number of tenants exceeds 100,000 and the total data volume reaches the terabyte level. As a result, database access slows down and request latency increases.

The applications split tenants in China by region. The tenants are categorized into the following regions: North China, Northeast China, East China, Central China, South China, Southwest China, and Northwest China. A separate ApsaraDB for MongoDB instance is created for each region, and multiple rounds of DTS migrations are performed. To meet the applications' requirements for data aggregation and analysis, data is synchronized from the instances to data warehouses.

After the splitting, the number of collections in each instance is significantly reduced, and the instance specifications can be downgraded. The applications access the nearest instance by region. This shortens request latency to milliseconds, significantly improves user experience, and simplifies subsequent instance O&M.

Migrate data from all collections in a replica set instance to a sharded cluster instance and use shard tags to manage the collections

If all collections reside in a single database but must still be managed as one logical instance, you can migrate the data of the collections to a sharded cluster instance and use shard tags to manage the collections. To do so, run the sh.addShardTag and sh.addTagRange commands. The collections remain managed by a single sharded cluster instance, and you do not need to modify business configurations; you need only to replace the original connection string with that of the sharded cluster instance. For more information about the commands, see sh.addShardTag() and sh.addTagRange().

For example, if your instance contains 100,000 active collections, you can purchase a sharded cluster instance that contains 10 shard nodes and perform the following steps to configure the instance and migrate data. After the migration is complete, each shard node contains 10,000 active collections.

  1. Purchase a sharded cluster instance. In this example, a sharded cluster instance that contains two shards is used. For more information about how to create a sharded cluster instance, see Create a sharded cluster instance.

  2. Connect to the mongos node of the sharded cluster instance. For more information, see Connect to an ApsaraDB for MongoDB sharded cluster instance by using the mongo shell.

  3. Run the following commands to add shard tags to all shards:

    sh.addShardTag("d-xxxxxxxxx1", "shard_tag1")
    sh.addShardTag("d-xxxxxxxxx2", "shard_tag2")
    Note
    • Before you run the preceding commands, make sure that the used account has the required permissions.

    • Data Management (DMS) does not support the sh.addShardTag command. We recommend that you use the mongo shell or mongosh to connect to the instance and run the commands.

  4. Pre-configure range-based tag distribution rules for all collections in the shards.

    use <dbName>
    sh.enableSharding("<dbName>")
    
    sh.addTagRange("<dbName>.test", {"_id":MinKey}, {"_id":MaxKey}, "shard_tag1")
    sh.addTagRange("<dbName>.test1", {"_id":MinKey}, {"_id":MaxKey}, "shard_tag2")

    In this example, _id is used as the shard key. Replace it with your actual shard key, and make sure that all queries include the shard key field. The shard key must be the same as the one used in the shardCollection operation in the next step. The [MinKey, MaxKey] boundary values ensure that all data of a single collection is stored on a single shard.

  5. Perform shardCollection operations on all collections that you want to migrate.

    sh.shardCollection("<dbName>.test", {"_id":1})
    sh.shardCollection("<dbName>.test1", {"_id":1})
  6. Run the sh.status() command to verify that the specified rules have taken effect.

  7. Migrate data from all collections to the sharded cluster instance. For more information, see Migrate data from an ApsaraDB for MongoDB replica set instance to an ApsaraDB for MongoDB replica set or sharded cluster instance.

    Note

    Because sharding operations have already been performed on the sharded cluster instance, the collection metadata already exists. Therefore, set the Processing Mode of Conflicting Tables parameter to Ignore Errors and Proceed.

  8. After the system checks data consistency, switch your business to the sharded cluster instance.

Note
  • If you want to add shards to the sharded cluster instance, perform Step 3 to add tags to new shards.

  • If the number of collections in your databases keeps increasing, repeat Step 4 and Step 5 for the new collections. Otherwise, the new collections reside only in the primary shard, which increases the number of collections in that shard. In this case, your instance may remain in the stuttering state or encounter exceptions.

Migrate data from all collections in a replica set instance to a sharded cluster instance and use zones to manage the collections

You can also use zones to manage the migrated collections. This method is similar to using shard tags, except that you run the sh.addShardToZone() and sh.updateZoneKeyRange() commands instead. For more information about zones, see Manage Shard Zones. For more information about the commands, see sh.addShardToZone() and sh.updateZoneKeyRange().

To use zones to manage the collections, perform the following steps:

  1. Purchase a sharded cluster instance. In this example, a sharded cluster instance that contains two shards is used. For more information about how to create a sharded cluster instance, see Create a sharded cluster instance.

  2. Connect to the mongos node of the sharded cluster instance. For more information, see Connect to an ApsaraDB for MongoDB sharded cluster instance by using the mongo shell.

  3. Run the following commands to specify zones for all shards in the sharded cluster instance:

    sh.addShardToZone("d-xxxxxxxxx1", "ZoneA")
    sh.addShardToZone("d-xxxxxxxxx2", "ZoneB")
    Note
    • Before you run the preceding commands, make sure that the used account has the required permissions.

    • DMS does not support the sh.addShardToZone command. We recommend that you use the mongo shell or mongosh to connect to the instance and run the commands.

  4. Pre-configure range-based zone distribution rules for all collections in the shards.

    use <dbName>
    sh.enableSharding("<dbName>")
    
    sh.updateZoneKeyRange("<dbName>.test", { "_id": MinKey }, { "_id": MaxKey }, "ZoneA")
    sh.updateZoneKeyRange("<dbName>.test1", { "_id": MinKey }, { "_id": MaxKey }, "ZoneB")

    In this example, _id is used as the shard key. Replace it with your actual shard key, and make sure that all queries include the shard key field. The shard key must be the same as the one used in the shardCollection operation in the next step. The [MinKey, MaxKey] boundary values ensure that all data of a single collection is stored on a single shard.

  5. Perform shardCollection operations on all collections that you want to migrate.

    sh.shardCollection("<dbName>.test", { _id: "hashed" })
    sh.shardCollection("<dbName>.test1", { _id: "hashed" })
  6. Run the sh.status() command. The command output shows that the specified rules have taken effect.

  7. Migrate data from all collections to the sharded cluster instance. For more information, see Migrate data from an ApsaraDB for MongoDB replica set instance to an ApsaraDB for MongoDB replica set or sharded cluster instance.

    Note

    Because sharding operations have already been performed on the sharded cluster instance, the collection metadata already exists. Therefore, set the Processing Mode of Conflicting Tables parameter to Ignore Errors and Proceed.

  8. After the system checks data consistency, switch your business to the sharded cluster instance.

Note
  • If you want to add shards to the sharded cluster instance, perform Step 3 to specify zones for new shards.

  • If the number of collections in your databases keeps increasing, repeat Step 4 and Step 5 for the new collections. Otherwise, the new collections reside only in the primary shard, which increases the number of collections in that shard. In this case, your instance may remain in the stuttering state or encounter exceptions.

Risks

We recommend that you do not run the dropDatabase command on a database that contains a large number of collections.

After the dropDatabase command is run, the WiredTiger storage engine asynchronously removes the metadata and physical files of all dropped collections. This can slow down replication on secondary nodes and cause replication latency to increase continuously. In addition, the flowControl mechanism may then throttle writes, and all writes that use the majority write concern are affected.

To avoid the preceding issue, you can use one of the following methods:

  • Remove the collections in the database one by one at proper intervals, and run the dropDatabase command only after all collections have been removed.

  • Use DTS or another migration tool to migrate the databases and collections that you want to retain to a new instance, and remove the original instance after the business cutover is complete.

We recommend that you configure appropriate alert rules for primary/secondary replication latency on your instance. If your instance encounters this issue, submit a ticket to contact technical support.
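The staggered-removal approach in the first method can be sketched as follows. The helper is hypothetical: in practice, drop_fn would wrap a real driver call such as db[name].drop(), and the interval should be tuned based on your observed replication lag.

```python
import time

def drop_collections_staggered(collection_names, drop_fn, interval_secs=5.0):
    """Drop collections one at a time with a pause between drops, so the
    storage engine's asynchronous file removal does not pile up and
    inflate secondary replication lag. drop_fn(name) performs one drop."""
    for i, name in enumerate(collection_names):
        drop_fn(name)
        if i < len(collection_names) - 1:
            time.sleep(interval_secs)

# Usage sketch with a stand-in drop function:
dropped = []
drop_collections_staggered(["logs_2023", "logs_2024"], dropped.append, interval_secs=0)
# dropped == ["logs_2023", "logs_2024"]
```

Only after every collection has been removed this way should the final dropDatabase be issued.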

Summary

  • We recommend that the total number of collections in a replica set instance not exceed 10,000. If a single collection contains more than 15 indexes, lower this limit accordingly.

  • If your business requires a large number of collections, such as when a multi-tenant system needs to be isolated by collection, split the collections and use sharded cluster instances.

  • If your databases already contain a large number of collections, reduce the number of collections and adjust your business logic by using the preceding optimization methods. If you need assistance, submit a ticket.
