ApsaraDB for MongoDB: Use NimoFullCheck to check data consistency after migration

Last Updated: Jan 26, 2024

NimoFullCheck is a tool developed by Alibaba Cloud to check data consistency between an Amazon DynamoDB database and an ApsaraDB for MongoDB database. This topic describes how to use NimoFullCheck to check data consistency after you migrate data from an Amazon DynamoDB database to an ApsaraDB for MongoDB database.

Prerequisites

Data is migrated from an Amazon DynamoDB database to an ApsaraDB for MongoDB database by using NimoShake. For more information, see Migrate an Amazon DynamoDB database to ApsaraDB for MongoDB by using NimoShake.

Background information

After you migrate data from an Amazon DynamoDB database to an ApsaraDB for MongoDB database, you can use NimoFullCheck to check data consistency between the two databases.

The check consists of the following two steps:

  • Brief check: checks whether the number of items in a table in the Amazon DynamoDB database is equal to the number of documents in the corresponding collection in the ApsaraDB for MongoDB database. If the numbers are different, the check terminates and an error message is returned. You can locate issues based on the returned error message.

  • Precise check: compares the data in the two databases after the brief check is passed. NimoFullCheck fetches data from the Amazon DynamoDB database and parses the data. If the data contains unique indexes, NimoFullCheck compares the data with that in the destination ApsaraDB for MongoDB database based on the unique indexes. If the data does not contain unique indexes, NimoFullCheck compares all data entries in the two databases one by one, which is slow.
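The two-step flow described above can be illustrated with a small shell sketch. This is not how NimoFullCheck is implemented. It only mimics the logic on two hypothetical text exports that hold one record per line; the function names and file layout are assumptions made for illustration.

```shell
#!/bin/sh
# Toy illustration of a brief check followed by a precise check.
# Assumption: each file holds one record per line; nothing here calls NimoFullCheck.

# Brief check: succeed only if both files contain the same number of records.
brief_check() {
    src_count=$(wc -l < "$1")
    dst_count=$(wc -l < "$2")
    [ "$src_count" -eq "$dst_count" ]
}

# Precise check: compare the records themselves, ignoring their order.
precise_check() {
    tmp_src=$(mktemp) && tmp_dst=$(mktemp) || return 2
    sort "$1" > "$tmp_src"
    sort "$2" > "$tmp_dst"
    cmp -s "$tmp_src" "$tmp_dst"
    result=$?
    rm -f "$tmp_src" "$tmp_dst"
    return "$result"
}

# Run the brief check first and stop early if the counts differ.
two_step_check() {
    if ! brief_check "$1" "$2"; then
        echo "brief check failed: record counts differ"
        return 1
    fi
    if ! precise_check "$1" "$2"; then
        echo "precise check failed: records differ"
        return 1
    fi
    echo "check passed"
}
```

Running the cheap count comparison first means the expensive record-by-record comparison is skipped whenever the counts already disagree.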

Usage notes

  • NimoFullCheck supports consistency checks only for full data migration. If you run a check after incremental data synchronization, the check reports inconsistent data.

  • NimoFullCheck uses data in the ApsaraDB for MongoDB database as the baseline for check. In other words, NimoFullCheck checks whether data in the Amazon DynamoDB database is consistent with that in the ApsaraDB for MongoDB database.

Procedure

The following procedure assumes that you run NimoFullCheck on the Ubuntu operating system.

  1. Run the following command to download the NimoShake package:

    wget https://github.com/alibaba/NimoShake/releases/download/release-v1.0.0-20191015/nimo.tar.gz

    Note

    We recommend that you download the latest NimoShake package. For more information, see NimoShake.

  2. Run the following command to decompress the NimoShake package:

    tar zxvf nimo.tar.gz

  3. After you decompress the package, run the cd nimo command to go to the nimo directory.

  4. Run the following command to start NimoFullCheck with required parameters:

    ./nimo-full-check.linux --<Parameter 1>=<Value 1> --<Parameter 2>=<Value 2>

    The following table describes the parameters of NimoFullCheck.

    NimoFullCheck parameters

    Parameter

    Description

    Example

    id

    The ID of the migration task. Set the value to the task ID that you specified when you used NimoShake to migrate data. For more information, see Migrate an Amazon DynamoDB database to ApsaraDB for MongoDB by using NimoShake.

    --id=nimo-shake

    logLevel

    The level of the logs to be generated. Valid values:

    • none: does not generate logs.

    • error: generates logs that contain error messages.

    • warn: generates logs that contain warnings.

    • info: generates logs that indicate system status.

    • debug: generates logs that contain debugging information.

    Default value: info.

    --logLevel=info

    sourceAccessKeyID

    The access key ID used to connect to the source Amazon DynamoDB database.

    --sourceAccessKeyID=xxxxxxxxxx

    sourceSecretAccessKey

    The secret access key used to connect to the source Amazon DynamoDB database.

    --sourceSecretAccessKey=xxxxxxxxxx

    sourceSessionToken

    Optional. The session token used to access the source Amazon DynamoDB database.

    --sourceSessionToken=xxxxxxxxxx

    sourceRegion

    Optional. The region where the source Amazon DynamoDB database resides.

    --sourceRegion=us-east-2

    qpsFull

    The number of times that the Scan command is run on tables per second. Default value: 10000.

    --qpsFull=10000

    qpsFullBatchNum

    The number of data entries to fetch in each Scan command. Default value: 128.

    --qpsFullBatchNum=128

    targetAddress

    The endpoint of the destination ApsaraDB for MongoDB database. For more information about how to view the endpoint, see Connect to a replica set instance or Connect to a sharded cluster instance.

    Example: mongodb://username:password@s-*****-pub.mongodb.rds.aliyuncs.com:3717.

    --targetAddress=mongodb://username:password@s-*****-pub.mongodb.rds.aliyuncs.com:3717

    diffOutputFile

    The name of the file that stores the information about inconsistent data. If you do not specify this parameter, the default file name nimo-full-check-diff is used.

    --diffOutputFile=nimo-full-check-diff

    parallel

    The number of threads to be used for consistency check. Default value: 16.

    --parallel=16

    sample

    The maximum number of documents to be checked in each collection. Default value: 1000, which indicates that a maximum of 1,000 documents are checked in each collection. A value of 0 indicates that all documents in a collection are checked.

    --sample=1000

    filterCollectionWhite

    The collection whitelist for consistency check. Set the value to the names of the collections that must be checked. Separate multiple names with semicolons (;). Example: --filterCollectionWhite=c1;c2, indicating that only collections c1 and c2 are checked.

    --filterCollectionWhite=c1;c2

    filterCollectionBlack

    The collection blacklist for consistency check. Set the value to the names of the collections that do not need to be checked. Separate multiple names with semicolons (;). Example: --filterCollectionBlack=c1;c2, indicating that all collections other than c1 and c2 are checked.

    --filterCollectionBlack=c1;c2

    convertType

    Specifies whether the source data, which uses the DynamoDB protocol, was converted during the migration. Valid values:

    • raw: Data was written to the destination database without conversion.

    • change: Data was converted before it was written to the destination database. For example, {"hello":"1"} was converted to {"hello": 1}.

    Note

    The value of the parameter must be the same as that specified for migration. If the values are different, the check fails.

    --convertType=change

    version

    Displays the version number of NimoFullCheck.

    Note

    You do not need to specify a value for the parameter. If you need to display the version number, add the --version parameter to the command.

    --version

    help

    Displays all the parameters that are supported by NimoFullCheck.

    --help
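As an illustration of how the parameters in the table combine, the following sketch assembles a command line and, by default, only prints it. All values are placeholders. The environment variable names (DYNAMO_ACCESS_KEY_ID, DYNAMO_SECRET_ACCESS_KEY, MONGO_URI, DRY_RUN) and the function name are assumptions made for this example and are not part of NimoFullCheck.

```shell
#!/bin/sh
# Assemble a NimoFullCheck command line from the documented parameters.
# Assumptions: credentials and the endpoint come from hypothetical environment
# variables, and the values of --id, --sourceRegion, and --convertType are
# placeholders that must match your own migration task.
run_full_check() {
    set -- \
        "--id=nimo-shake" \
        "--sourceAccessKeyID=${DYNAMO_ACCESS_KEY_ID}" \
        "--sourceSecretAccessKey=${DYNAMO_SECRET_ACCESS_KEY}" \
        "--sourceRegion=us-east-2" \
        "--targetAddress=${MONGO_URI}" \
        "--convertType=change" \
        "--sample=0" \
        "--filterCollectionWhite=c1;c2"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (default): print the command instead of executing it.
        echo "./nimo-full-check.linux $*"
    else
        ./nimo-full-check.linux "$@"
    fi
}
```

Each argument is quoted because the semicolon in a collection list such as c1;c2 would otherwise be read by the shell as a command separator. Here --sample=0 requests a check of all documents, which follows the description of the sample parameter.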

    Note

    If the check is successful, the message full check done! is returned. If the check terminates due to an error, NimoFullCheck exits and an error message is returned. You can locate issues based on the returned error message.
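If you run the check from a script, one way to act on the result is to search the captured output for the success message. The following is a minimal sketch that assumes you have saved the tool's output to a file of your choosing; where NimoFullCheck actually writes the message (console or log file) is not stated here, and the function names are hypothetical.

```shell
#!/bin/sh
# Look for the documented success message in captured output.
# Assumption: "$1" is a file that contains the output of a NimoFullCheck run.
check_succeeded() {
    grep -qF 'full check done!' "$1"
}

# Print a short summary based on whether the success message was found.
report_result() {
    if check_succeeded "$1"; then
        echo "check completed"
    else
        echo "check did not complete; review the error message and the diff output file"
    fi
}
```

The diff output file mentioned in the summary is the file set by the diffOutputFile parameter, which defaults to nimo-full-check-diff.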