Restore data from a downloaded local disk backup to a self-managed MongoDB database - ApsaraDB for MongoDB

This topic describes how to restore data from a downloaded backup of a MongoDB instance that uses local disks to a self-managed MongoDB database that is configured as a single node or a replica set.

Restore data from a logical backup of a local disk

Notes

MongoDB is constantly updated, and older versions of mongorestore may not be compatible with newer versions of MongoDB. You must select a mongorestore version that is compatible with your MongoDB version. For more information, see mongorestore.
If the self-managed database is a sharded cluster, set the <hostname> parameter in the data import command to the address of the Mongos component in the self-managed database.
If your self-managed database is a sharded cluster, you must add --nsExclude="config.*" to the data import command. Otherwise, an error may occur during data restoration.
When you restore data from a sharded cluster instance to a self-managed database, you must download the backup data of each shard component and import each backup into the self-managed database. If the sharded cluster instance contains orphaned documents, dirty data may be introduced into the self-managed database. When you restore backups of multiple shards to the same sharded cluster, use the --drop parameter only when you restore the backup of the first shard. Otherwise, each later import deletes the collections that an earlier import restored.
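
The per-shard rule above can be sketched as a small loop. This is a minimal illustration, not an official tool: the helper name print_restore_cmds, the HOST and PORT values, and the backup file names are hypothetical, and the script only prints the commands so that you can review them before you run anything. The flags are the ones described in this topic.

```shell
#!/bin/sh
# Print (not run) one mongorestore command per shard backup.
# --drop is added only for the first backup, as the note above requires.
# HOST, PORT, and the backup file names are hypothetical placeholders.
HOST="127.0.0.1"   # address of the Mongos component in the self-managed cluster
PORT="27017"

print_restore_cmds() (
    first=1
    for backup in "$@"; do
        drop=""
        if [ "$first" -eq 1 ]; then
            drop=" --drop"
            first=0
        fi
        echo "mongorestore -h $HOST --port $PORT -u <username> -p <password>$drop --gzip --archive=$backup --nsExclude=\"config.*\" --stopOnError"
    done
)

print_restore_cmds shard1.ar shard2.ar shard3.ar
```

Only the first printed command contains --drop. Replace <username> and <password> before you run the printed commands.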

Preparations

Download and install a version of MongoDB on your client (a local server or an ECS instance) that is the same as the source ApsaraDB for MongoDB instance. For more information, see Install MongoDB.
Download the logical backup file. For more information, see Download a backup file.

Procedure

Copy the downloaded backup file to the client where your self-managed MongoDB database is located. The mongorestore tool must be installed on this client.
Run the following command to import data from the backup file to your self-managed MongoDB database.
```
mongorestore -h <hostname> --port <server port> -u <username> -p <password> --drop --gzip --archive=<backupfile> -vvvv --stopOnError
```
Modify the following parameters:
- <hostname>: The server address of the self-managed MongoDB database. For a local server, you can enter 127.0.0.1.
  If the self-managed database is a sharded cluster, set this parameter to the address of the Mongos component in the self-managed database.
- <server port>: The port of the self-managed MongoDB database.
- <username>: The username used to log on to the self-managed MongoDB database. Make sure that the user has permissions on all databases. We recommend that you use the root account.
- <password>: The password for the database account.
- <backupfile>: The name of the downloaded logical backup file.
The following parameters do not need to be modified:
- --drop: Deletes a collection before restoring the backup file.
  Note
  When you restore backups of multiple shards to the same sharded cluster, use this parameter only when you restore the backup of the first shard.
- --gzip: Specifies that the data in the backup file is compressed in gzip format and must be decompressed.
  Note
  This parameter is supported in MongoDB 3.1.4 and later. For more information, see mongo-tools.
- -vvvv: The level of detail in the output. The more 'v' characters, the more detailed the output.
- --stopOnError: Stops the import process if an error occurs.
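- --nsExclude: Excludes matching collections from the restoration. For example, --nsExclude="config.*". This parameter is not part of the preceding command. Add it only if the self-managed database is a sharded cluster.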
Example:
```
mongorestore -h 127.0.0.1 --port 27017 -u root -p ******** --drop --gzip --archive=hins1111_data_20190710.ar -vvvv --stopOnError
```

Restore data from a physical backup of a local disk

Prerequisites

The instance is a replica set instance.
The TDE feature is disabled for the instance.
The storage engine of the instance is WiredTiger or RocksDB. If the storage engine of the instance is TerarkDB, see restore data from a disk backup.
Note
- You can view the storage engine of the instance on the Basic Information page in the ApsaraDB for MongoDB console.
- If the storage engine of the instance is RocksDB, you must compile and install a MongoDB application that is built with the RocksDB storage engine.

Database version requirements

The version of the ApsaraDB for MongoDB instance must be compatible with the version of the self-managed MongoDB database. The following table describes the version mappings.

MongoDB instance	Self-managed MongoDB database
Version 3.2	Version 3.2 or 3.4
Version 3.4	Version 3.4
Version 4.0	Version 4.0
Version 4.2	Version 4.2
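
The version mapping above can be expressed as a small check that you run before you start a restoration. This is a sketch under assumptions: is_compatible is a hypothetical helper name, and you supply both versions yourself (for example, the major.minor version that mongod --version reports on the self-managed host).

```shell
#!/bin/sh
# Return success (0) if a physical backup from an instance of version $1
# can be restored to a self-managed database of version $2,
# according to the version mapping table above.
# is_compatible is a hypothetical helper name.
is_compatible() {
    case "$1:$2" in
        3.2:3.2|3.2:3.4) return 0 ;;
        3.4:3.4)         return 0 ;;
        4.0:4.0)         return 0 ;;
        4.2:4.2)         return 0 ;;
        *)               return 1 ;;
    esac
}

# Example: an instance that runs version 3.2 and a self-managed 3.4 database.
if is_compatible "3.2" "3.4"; then
    echo "compatible"
else
    echo "not compatible"
fi
```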

Physical backup file formats

Physical backup file format	File extension	Description
tar format	.tar.gz	For instances created before March 26, 2019, the physical backup file is in tar format.
xbstream format	_qp.xb	For instances created on or after March 26, 2019, the physical backup file is in xbstream format. Note Because Windows does not support the percona-xtrabackup tool required to decompress this file, the xbstream format can be decompressed only on Linux.

Environment

The following procedure uses an Alibaba Cloud ECS instance that runs a 64-bit Ubuntu 16.04 image. For more information about how to create an ECS instance, see Create an ECS instance.
The required version of MongoDB is installed on the ECS instance. For more information, see the official MongoDB documentation.
Environment variables for MongoDB are configured on the ECS instance. This lets you run commands without entering the full paths of executable files.
The /test/mongo/data directory is used as the database directory for the MongoDB physical restoration.
The /test/mongo/data1 and /test/mongo/data2 directories are used as the database directories for the replica set nodes.

Step 1: Configure environment variables

Configure environment variables for MongoDB in your self-managed database environment. This lets you run commands without entering full paths. Before you perform this step, make sure that you have installed MongoDB.

If you have already configured environment variables for MongoDB, you can skip this step and proceed to Step 2: Download and decompress the physical backup file.

Run the following command to open the /etc/profile file, which stores system-wide environment variables on Linux.
```
sudo vi /etc/profile
```
Press the i key to enter edit mode. Then, add the following content to the last line:
```
export PATH=$PATH:/<Path to the MongoDB server>/bin
```
Note
In this example, the path to the MongoDB server is /test/mongo/bin. You can change the path based on your requirements.
Example:
```
export PATH=$PATH:/test/mongo/bin
```
Press the Esc key to exit edit mode. Then, enter :wq to save the file and exit.
Run the following command to apply the changes to the environment variable file:
```
source /etc/profile
```
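If you prefer not to edit the file by hand, the same change can be sketched as a script that appends the export line only when it is missing. This is an illustration under assumptions: add_to_profile is a hypothetical helper, the demonstration writes to a scratch file rather than to /etc/profile, and writing to /etc/profile itself requires root permissions.

```shell
#!/bin/sh
# Append the MongoDB bin directory to a profile file only if it is not
# already there, so repeated runs do not add duplicate lines.
# add_to_profile is a hypothetical helper; /test/mongo/bin is the example
# path used in this topic.
add_to_profile() (
    profile="$1"
    bin_dir="$2"
    line="export PATH=\$PATH:$bin_dir"
    if grep -qxF "$line" "$profile" 2>/dev/null; then
        echo "already configured"
    else
        echo "$line" >> "$profile"
        echo "added"
    fi
)

# Demonstration against a scratch file instead of /etc/profile.
demo_profile="$(mktemp)"
add_to_profile "$demo_profile" /test/mongo/bin
add_to_profile "$demo_profile" /test/mongo/bin
rm -f "$demo_profile"
```

The first call adds the line and the second call leaves the file unchanged.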

Step 2: Download and decompress the physical backup file

Download the physical backup file of the MongoDB instance. You can run the following command to download the file.
```
wget -c '<External download URL of the data backup file>' -O <Custom_file_name>.<Extension>
```
Example:
```
wget -c 'http://rdsbak-hz-v3.oss-cn-hangzhou-internal.aliyuncs.com/custins5475****/hins1907****_data_20210906103710_qp.xb?Expires=......' -O backupfile._qp.xb
```
Note
- Based on the type of the downloaded file, make sure that the file extension is .tar.gz or _qp.xb.
- Enclose the download URL in single quotation marks (') to ensure that the URL is parsed correctly.
Run the following command to create a data directory in the /test/mongo/ directory and move the downloaded physical backup file to the /test/mongo/data/ directory.
```
mkdir -p /test/mongo/data && mv <Physical_backup_file_name.Extension> /test/mongo/data
```
Decompress the physical backup file.
- If the downloaded physical backup file has the .tar.gz extension, such as hins20190412.tar.gz, use the following method to decompress it.
```
cd /test/mongo/data/ && tar xzvf hins20190412.tar.gz 
```
- If the downloaded physical backup file has the _qp.xb extension, such as hins20190412_qp.xb, use the following method to decompress it.
  1. Install the percona-xtrabackup tool and the qpress package. For more information, see the installation steps on the official Percona XtraBackup website.
  2. Decompress the physical backup file. For example, if the database backup file is named hins20190412_qp.xb:
```
# Go to the directory where the file is located.
cd /test/mongo/data/
# Unpack the file.
cat hins20190412_qp.xb | xbstream -x -v
# Decompress the physical backup file.
innobackupex --decompress --remove-original /test/mongo/data
```
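The choice between the two decompression methods depends only on the file extension, which can be sketched as follows. The helper names backup_format and plan_decompress are hypothetical, and the script prints the commands from this step instead of running them.

```shell
#!/bin/sh
# Map a backup file name to the decompression method described above.
# backup_format and plan_decompress are hypothetical helper names.
backup_format() {
    case "$1" in
        *.tar.gz) echo "tar" ;;
        *_qp.xb)  echo "xbstream" ;;
        *)        echo "unknown" ;;
    esac
}

# Print the commands that would be run for a given file (nothing is executed).
plan_decompress() (
    file="$1"
    case "$(backup_format "$file")" in
        tar)
            echo "tar xzvf $file" ;;
        xbstream)
            echo "cat $file | xbstream -x -v"
            echo "innobackupex --decompress --remove-original /test/mongo/data" ;;
        *)
            echo "unsupported backup file: $file" >&2
            exit 1 ;;
    esac
)

plan_decompress hins20190412_qp.xb
```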

Step 3: Restore data from the physical backup in single-node mode

Run the following command to create a configuration file named mongod.conf in the /test/mongo folder.
```
touch /test/mongo/mongod.conf
```

On the command line, run vi /test/mongo/mongod.conf to open the mongod.conf file. Press the i key to enter edit mode.

Select a startup configuration template based on the storage engine of the ApsaraDB for MongoDB instance and copy it to the mongod.conf file.

Note

This configuration file sets the startup mode to single-node mode and enables authentication.

WiredTiger storage engine

```
systemLog:
    destination: file
    path: /test/mongo/mongod.log
    logAppend: true
security:
    authorization: enabled
storage:
    dbPath: /test/mongo/data
    directoryPerDB: true
net:
    port: 27017
    unixDomainSocket:
        enabled: false
processManagement:
    fork: true
    pidFilePath: /test/mongo/mongod.pid
```

Note

By default, ApsaraDB for MongoDB uses the WiredTiger storage engine and enables the directoryPerDB option. Therefore, this option is specified in the configuration.

RocksDB storage engine

```
systemLog:
    destination: file
    path: /test/mongo/logs/mongod.log
    logAppend: true
security:
    authorization: enabled
storage:
    dbPath: /test/mongo/data
    engine: rocksdb
net:
    port: 27017
    unixDomainSocket:
        enabled: false
processManagement:
    fork: true
    pidFilePath: /test/mongo/mongod.pid
```

Press the Esc key to exit edit mode. Then, enter :wq to save the file and exit.
Start MongoDB using the new mongod.conf configuration file.
```
mongod -f /test/mongo/mongod.conf
```
After the startup is complete, run the following command to log on to the MongoDB database and enter the mongo shell.
```
mongo --host 127.0.0.1 -u <username> -p <password> --authenticationDatabase admin
```
- <username>: The database account for the MongoDB instance. The default is root.
- <password>: The password for the database account.
  Note
  If your password contains special characters, enclose the password in single quotation marks ('). For example: 'test123!@#'. Otherwise, the logon may fail.
In the mongo shell, run show dbs to query all databases in the local MongoDB instance and verify that the restoration was successful.
The restoration is now complete. You can run the exit command in the mongo shell to exit.

After you complete these steps, the MongoDB database is started in single-node mode. To start the database in replica set mode, proceed to Step 4.

Step 4: Start the MongoDB database in replica set mode

By default, a physical backup of an ApsaraDB for MongoDB instance contains the replica set configuration of the original instance. You must remove this configuration to start the database in replica set mode. To do so, perform the following steps:

On the command line, use the mongo shell to log on to the MongoDB database as the test user.
```
mongo --host 127.0.0.1 -u test -p <password_of_the_test_user> --authenticationDatabase admin
```
Note
If your password contains special characters, enclose the password in single quotation marks ('). For example: 'test123!@#'. Otherwise, the logon may fail.

After you log on, run the commands in the following code block to perform the following actions:

Create a temporary role and a temporary user in the admin database. The role grants permission to remove documents from the system.replset collection in the local database.
Switch to the temporary user and remove the original replica set configuration from the local database.
Switch back to the test user and delete the temporary user and its permissions.
Note
Replace <password_of_the_test_user> in the following code with the password of your test user before you run the commands.

```
use admin
db.runCommand({
    createRole: "tmprole",
    roles: [
        {
            role: "test",
            db: "admin"
        }
    ],
    privileges: [
        {
            resource: {
                db: 'local',
                collection: 'system.replset'
            },
            actions: [
                'remove'
            ]
        }
    ]
})
db.runCommand({
    createUser: "tmpuser",
    pwd: "tmppwd",
    roles: [
        'tmprole'
    ]
})
db.auth('tmpuser','tmppwd')
use local
db.system.replset.remove({})
use admin
db.auth('test','<password_of_the_test_user>')
db.dropRole('tmprole')
db.dropUser('tmpuser')
```

Run the following commands to shut down the MongoDB service and exit the mongo shell.
```
use admin
db.shutdownServer()
exit
```
Create a replica set authentication file.
To start MongoDB in replica set mode, you must create a key file for authentication between the replica set nodes.
1. Run the following command to create a keyFile folder in the mongo directory. This folder will serve as the directory for the authentication file. Then, create a key file in that directory.
```
mkdir -p /test/mongo/keyFile && touch /test/mongo/keyFile/mongodb.key
```
2. Run vi /test/mongo/keyFile/mongodb.key to open the mongodb.key file. Press the i key to enter edit mode and enter the encryption content. For example:
```
MongoDB Encrypting File
```
  Note
  The encryption content is subject to the following restrictions:
  - The length must be between 6 and 1,024 characters.
  - It can contain only characters from the Base64 character set.
  - It cannot contain the equal sign (=).
3. Press the Esc key to exit edit mode. Then, enter :wq to save and exit the file.
4. On the command line, run the following command to change the permissions of the authentication file to 400. This ensures that the file content is visible only to the file owner.
```
sudo chmod 400 /test/mongo/keyFile/mongodb.key
```
Note
This authentication file is applied to all replica set nodes.
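As an alternative to typing the key content by hand, the content can be generated randomly. The following is a sketch under assumptions: make_keyfile is a hypothetical helper, the base64 utility is assumed to be available (as it is on typical Linux distributions), and the demonstration writes to a scratch directory instead of /test/mongo/keyFile/mongodb.key.

```shell
#!/bin/sh
# Generate key file content that meets the restrictions above:
# Base64 characters only, no "=", length between 6 and 1,024.
# make_keyfile is a hypothetical helper name.
make_keyfile() (
    keyfile="$1"
    # 48 random bytes encode to exactly 64 Base64 characters with no "=" padding.
    # tr also strips the trailing newline and any "=" as a safeguard.
    head -c 48 /dev/urandom | base64 | tr -d '=\n' > "$keyfile"
    # Make the file readable only by its owner.
    chmod 400 "$keyfile"
)

# Demonstration in a scratch directory.
demo_dir="$(mktemp -d)"
make_keyfile "$demo_dir/mongodb.key"
ls -l "$demo_dir/mongodb.key"
rm -rf "$demo_dir"
```

Because the helper already sets the permissions to 400, the separate chmod step is not needed for a file that is created this way.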
Prepare two empty nodes for the replica set.
1. Run the following command to copy the mongod.conf file twice. These copies will serve as the startup configuration files for the other two nodes.
```
cp /test/mongo/mongod.conf /test/mongo/mongod1.conf && cp /test/mongo/mongod.conf /test/mongo/mongod2.conf
```
2. Run the following command to create data directories for the other two nodes.
```
mkdir -p /test/mongo/data1 && mkdir -p /test/mongo/data2
```
Modify the configuration file of each node as described in the following steps:
- Run vi /test/mongo/mongod.conf to open the configuration file for Node 1. Modify the file to include the following content, and then save and exit the file.
```
systemLog:
    destination: file
    path: /test/mongo/mongod.log
    logAppend: true
security:
    authorization: enabled
    keyFile: /test/mongo/keyFile/mongodb.key
storage:
    dbPath: /test/mongo/data
    directoryPerDB: true
net:
    bindIp: 127.0.0.1
    port: 27017
    unixDomainSocket:
        enabled: false
processManagement:
    fork: true
    pidFilePath: /test/mongo/mongod.pid
replication:
    replSetName: "rs0"
```
- Run vi /test/mongo/mongod1.conf to open the configuration file for Node 2. Modify the file to include the following content, and then save and exit the file.
```
systemLog:
    destination: file
    path: /test/mongo/mongod1.log
    logAppend: true
security:
    authorization: enabled
    keyFile: /test/mongo/keyFile/mongodb.key
storage:
    dbPath: /test/mongo/data1
    directoryPerDB: true
net:
    bindIp: 127.0.0.1
    port: 27018
    unixDomainSocket:
        enabled: false
processManagement:
    fork: true
    pidFilePath: /test/mongo/mongod1.pid
replication:
    replSetName: "rs0"
```
- Run vi /test/mongo/mongod2.conf to open the configuration file for Node 3. Modify the file to include the following content, and then save and exit the file.
```
systemLog:
    destination: file
    path: /test/mongo/mongod2.log
    logAppend: true
security:
    authorization: enabled
    keyFile: /test/mongo/keyFile/mongodb.key
storage:
    dbPath: /test/mongo/data2
    directoryPerDB: true
net:
    bindIp: 127.0.0.1
    port: 27019
    unixDomainSocket:
        enabled: false
processManagement:
    fork: true
    pidFilePath: /test/mongo/mongod2.pid
replication:
    replSetName: "rs0"
```
The following describes the key parameters:
- path under systemLog: The path of the MongoDB log file for the current node.
- dbPath: The path of the MongoDB data directory for the current node.
- pidFilePath: The path of the MongoDB PID file (a file that records the process ID) for the current node.
- keyFile: The path of the replica set authentication file. All nodes must use the same authentication file.
- bindIp: The IP address of the current node. If the replica set is deployed on the same server, all nodes can use the same IP address.
- port: The port number of the current node. If the replica set is deployed on the same server, all nodes must use different port numbers.
- replication: The replica set configuration.
- replSetName: The name of the replica set.
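
The three configuration files differ only in the log path, dbPath, port, and PID file path, so they can also be generated from one template. The following is a minimal sketch, not part of the documented procedure: write_node_conf and its arguments are hypothetical, and the demonstration writes to a scratch directory. To match this topic, you would pass /test/mongo as the base directory.

```shell
#!/bin/sh
# Write one replica set member configuration per call, varying only the
# values that differ between nodes (log path, dbPath, port, PID file).
# write_node_conf and its arguments are hypothetical; the YAML mirrors
# the three files shown above.
write_node_conf() (
    base="$1"; suffix="$2"; port="$3"
    cat > "$base/mongod$suffix.conf" <<EOF
systemLog:
    destination: file
    path: $base/mongod$suffix.log
    logAppend: true
security:
    authorization: enabled
    keyFile: $base/keyFile/mongodb.key
storage:
    dbPath: $base/data$suffix
    directoryPerDB: true
net:
    bindIp: 127.0.0.1
    port: $port
    unixDomainSocket:
        enabled: false
processManagement:
    fork: true
    pidFilePath: $base/mongod$suffix.pid
replication:
    replSetName: "rs0"
EOF
)

# Demonstration in a scratch directory.
base_dir="$(mktemp -d)"
write_node_conf "$base_dir" ""  27017
write_node_conf "$base_dir" "1" 27018
write_node_conf "$base_dir" "2" 27019
grep "port:" "$base_dir"/mongod*.conf
rm -rf "$base_dir"
```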

Run the following command to start the three nodes.

```
mongod -f /test/mongo/mongod.conf && mongod -f /test/mongo/mongod1.conf && mongod -f /test/mongo/mongod2.conf
```

After the startup is complete, log on to the MongoDB database using the test account.
```
mongo --host 127.0.0.1 -u test -p <password_of_the_test_account> --authenticationDatabase admin
```
Note
If your password contains special characters, enclose the password in single quotation marks ('). For example: 'test123!@#'. Otherwise, the logon may fail.
In the mongo shell, run the following command to add the replica set member nodes that you created in the previous steps to the replica set and initialize the replica set.
```
rs.initiate( {
   _id : "rs0",
   version : 1,
   members: [
      { _id: 0, host: "127.0.0.1:27017" , priority : 1},
      { _id: 1, host: "127.0.0.1:27018" , priority : 0},
      { _id: 2, host: "127.0.0.1:27019" , priority : 0}
   ]
})
```
Note
This step uses the rs.initiate() command. For more information about this command, see the official MongoDB documentation for rs.initiate().
After the command is successfully run, the two new nodes start to synchronize data with the primary node. The time required for this process varies based on the size of the backup file. After the data synchronization is complete, the startup in replica set mode is finished.
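If the node list changes, the rs.initiate() document can be assembled from a list of ports instead of being edited by hand. This is a sketch under assumptions: rs_initiate_doc is a hypothetical helper, all nodes are assumed to run on 127.0.0.1 as in this topic, and the script only prints the command text.

```shell
#!/bin/sh
# Build the rs.initiate() call for nodes on one host from a list of ports.
# The first port gets priority 1 and the rest get priority 0,
# matching the example above. rs_initiate_doc is a hypothetical helper.
rs_initiate_doc() (
    name="$1"; shift
    id=0
    members=""
    for port in "$@"; do
        prio=0
        if [ "$id" -eq 0 ]; then
            prio=1
        fi
        if [ -n "$members" ]; then
            members="$members, "
        fi
        members="$members{ _id: $id, host: \"127.0.0.1:$port\", priority: $prio }"
        id=$((id + 1))
    done
    echo "rs.initiate({ _id: \"$name\", version: 1, members: [ $members ] })"
)

rs_initiate_doc rs0 27017 27018 27019
```

You can paste the printed line into the mongo shell after you log on.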
Verify that the startup was successful.
1. Run exit to exit the mongo shell.
2. Run the following command to log on to the MongoDB database again.
```
mongo -u <username> -p <password> --authenticationDatabase admin
```
  - <username>: The database account for the MongoDB instance. The default is root.
  - <password>: The password for the database account.
    Note
    If your password contains special characters, enclose the password in single quotation marks ('). For example: 'test123!@#'. Otherwise, the logon may fail.
3. Observe the left side of the mongo shell command line. If <Replica_set_name>:PRIMARY> is displayed, the startup in replica set mode was successful.

FAQ

Why does an error occur when I use the specified mongod.conf configuration file to start the self-managed database?

The following are common reasons:

You may have started the database before you specified the mongod.conf configuration file. This causes the storage.bson file to be automatically generated in the data directory. To resolve this issue, move this file and restart the database by specifying the mongod.conf configuration file.
A mongod process may already be running in the current system. You can find the mongod process ID by running the ps -e | grep mongod command. Then, stop the mongod process by running the kill <Process_ID> command and restart the database by specifying the mongod.conf configuration file.
You may not have specified the correct systemLog.path log path in the mongod.conf configuration file. Make sure that the specified path exists and that you have specified the log file name. For example: path: /<Log_file_path>/<Log_file_name>.log.
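The three checks above can be sketched as a script that you run before you start mongod. This is an illustration under assumptions: preflight is a hypothetical helper, it only reports and changes nothing, and a reported storage.bson file is only a hint because the script cannot tell how the file was created.

```shell
#!/bin/sh
# Check the three common causes listed above before starting mongod.
# preflight is a hypothetical helper; it only reports, it changes nothing.
preflight() (
    data_dir="$1"
    log_path="$2"
    status=0
    if [ -f "$data_dir/storage.bson" ]; then
        echo "found $data_dir/storage.bson (see the first reason above)"
        status=1
    fi
    if ps -e | grep -v grep | grep -q mongod; then
        echo "a mongod process is already running"
        status=1
    fi
    if [ ! -d "$(dirname "$log_path")" ]; then
        echo "log directory does not exist: $(dirname "$log_path")"
        status=1
    fi
    exit "$status"
)

# Paths are the examples used in this topic.
preflight /test/mongo/data /test/mongo/mongod.log || echo "resolve the reported issues, then retry"
```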

Why does an error occur when I use the specified mongod.conf configuration file to start the database in replica set mode?

You may not have restricted the permissions of the specified keyFile authentication file. mongod does not start if the key file can be accessed by the group or by other users. On the command line, run sudo chmod 400 <Path_to_keyFile> as described in Step 4 (600 also works) and then try again.

Why does the system slow down after I start the MongoDB database in replica set mode?

After the startup is complete, the system automatically starts to synchronize data from the primary node to the other nodes. You must wait for the data synchronization to complete.

How do I restore the instance data to a self-managed database if the instance architecture does not allow me to download backup files?

You can use DTS to migrate the instance data to a self-managed database. For more information, see Migrate data from a self-managed MongoDB database or an ApsaraDB for MongoDB instance.
You can also use the mongodump and mongorestore tools to back up the instance data and restore it to the self-managed database.