This topic describes the data types and parameters that are supported by MongoDB Writer and how to configure MongoDB Writer by using the codeless user interface (UI) and code editor.

Background information

MongoDB Writer connects to a remote MongoDB database by using the Java client MongoClient and writes data to the database. Recent versions of MongoDB use document-level locking instead of database-level locking, which allows MongoDB Writer to write data to MongoDB databases efficiently. To update existing data, specify the primary key.
Note
  • Before you configure MongoDB Writer, you must configure a MongoDB data source. For more information, see Configure a MongoDB connection.
  • If you use ApsaraDB for MongoDB, a root account is provided for the MongoDB database by default.
  • For security purposes, Data Integration connects to a MongoDB database only by using an account that belongs to that database. When you add a MongoDB data source, do not use the root account.

MongoDB Writer obtains data from a reader and converts the data from data types supported by Data Integration to data types supported by MongoDB. Data Integration does not support arrays, whereas MongoDB supports arrays and can index them.

You can configure parameters to convert strings to MongoDB arrays. Then, MongoDB Writer uses parallel threads to write the arrays to a MongoDB database.
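
The string-to-array conversion can be pictured with a short Python sketch. It is an illustration only, not the actual implementation of MongoDB Writer: the function name to_document and the sample record are hypothetical, and the column definitions borrow the name, type, and splitter keys described in the Parameters section.

```python
def to_document(record, columns):
    """Build a MongoDB-style document from a flat record.

    record  -- list of values in column order, as a reader might deliver them
    columns -- list of dicts with "name", "type", and an optional "splitter"
    """
    doc = {}
    for value, col in zip(record, columns):
        if col["type"] == "array":
            # Split the incoming string on the configured delimiter.
            doc[col["name"]] = value.split(col["splitter"])
        else:
            # Other types are passed through unchanged in this sketch.
            doc[col["name"]] = value
    return doc


columns = [
    {"name": "age", "type": "int"},
    {"name": "hobby", "type": "array", "splitter": " "},
]
doc = to_document([30, "chess hiking"], columns)
print(doc)
```

In this sketch, the hobby string is split on a single space, so the resulting document holds a two-element list in the hobby field.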

Data types

MongoDB Writer supports most MongoDB data types. Make sure that the data types of your database are supported.

The following table lists the data types that are supported by MongoDB Writer.
Category | MongoDB data type
Integer | INT and LONG
Floating point | DOUBLE
String | STRING and ARRAY
Date and time | DATE
Boolean | BOOL
Binary | BYTES

Note: When MongoDB Writer writes data of the DATE data type to a MongoDB database, MongoDB Writer converts the data to the DATETIME data type.

Parameters

Parameter | Description | Required | Default value
datasource | The name of the data source. It must be the same as the name of the added data source. You can add data sources by using the code editor. | Yes | No default value
collectionName | The name of the collection in MongoDB. | Yes | No default value
column | The names of the document fields to which you want to write data. Specify the fields in an array. Each element supports the following fields: name (the name of a field), type (the data type of a field), and splitter (the delimiter; specify it only if you want to convert strings to arrays). | Yes | No default value
writeMode | The write mode. It includes the following parameters: isReplace (if you set isReplace to true, MongoDB Writer overwrites the data that has the same primary key in the destination collection; if you set isReplace to false, MongoDB Writer does not overwrite the data) and replaceKey (the primary key for each data record; data is overwritten based on the primary key, which must be unique). | No | No default value
preSql | The SQL statement that you want to execute before the synchronization node is run. For example, you can set this parameter to the SQL statement that is used to delete outdated data. If the preSql parameter is left empty, no SQL statement is executed before the synchronization node is run. The value of the preSql parameter must follow the JSON syntax. | No | No default value
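
The effect of the writeMode parameter can be sketched in Python with an in-memory list standing in for a MongoDB collection. This is a simplified model under stated assumptions, not the behavior of the actual writer: the function name write is hypothetical, and when is_replace is false the sketch simply appends the new records and leaves existing ones untouched.

```python
def write(collection, docs, is_replace, replace_key):
    """Write docs into an in-memory list that stands in for a collection."""
    for doc in docs:
        if is_replace:
            # Look for an existing record with the same primary key value.
            for i, existing in enumerate(collection):
                if existing.get(replace_key) == doc.get(replace_key):
                    collection[i] = doc  # overwrite the matching record
                    break
            else:
                collection.append(doc)  # no match found: insert
        else:
            # Assumption of this sketch: existing records are never touched.
            collection.append(doc)


collection = [{"_id": 1, "age": 20}]
write(collection, [{"_id": 1, "age": 21}, {"_id": 2, "age": 30}],
      is_replace=True, replace_key="_id")
print(collection)
```

With is_replace enabled, the record whose _id is 1 is overwritten and the record whose _id is 2 is added, which is why the primary key must be unique.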

Before the synchronization node is run, Data Integration executes the SQL statement specified by the preSql parameter. Then, Data Integration starts to write data. The preSql parameter does not affect the data that is written. You can specify the preSql parameter to ensure that the write operation is idempotent. For example, you can specify the preSql parameter to delete outdated data before a synchronization node is run based on your business requirements. If the synchronization node fails, you only need to rerun the synchronization node.
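
The idempotence argument can be illustrated with a small Python model. The names run_job and day are hypothetical, and a plain list stands in for the collection. The point is only that deleting the data of a batch before writing it makes a rerun produce the same final state.

```python
def run_job(collection, batch_day, new_docs):
    """Model one run: a preSql-style cleanup followed by the write."""
    # Pre-step: remove everything previously written for this batch.
    collection[:] = [d for d in collection if d["day"] != batch_day]
    # Write step: add the records of the batch.
    collection.extend(new_docs)


store = [{"day": "2024-01-01", "pv": 5}]
batch = [{"day": "2024-01-02", "pv": 7}, {"day": "2024-01-02", "pv": 9}]

run_job(store, "2024-01-02", batch)
after_first_run = list(store)

run_job(store, "2024-01-02", batch)  # rerun, for example after a failure
print(store == after_first_run)
```

Without the cleanup step, the rerun would append the same two records again and the batch would be duplicated.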

Requirements on the format of the preSql parameter:
  • Configure the type parameter to specify the action type. Valid values: drop and remove. Example: "preSql":{"type":"remove"}.
    • drop: deletes the collection specified by the collectionName parameter and the data in the collection.
    • remove: deletes data based on specified conditions.
    • json: the conditions used to delete data. Example: "preSql":{"type":"remove", "json":"{'operationTime':{'$gte':ISODate('${last_day}T00:00:00.424+0800')}}"}. ${last_day} is a scheduling parameter of DataWorks. You can specify this parameter in the format of $[yyyy-mm-dd]. Other operators and functions are also supported, such as comparison operators $gt, $lt, $gte, and $lte, logical operators $and and $or, and functions max, min, sum, avg, and ISODate. You can use them based on your business requirements.
      Data Integration uses the following standard MongoDB API to query and delete the specified data:
      BasicDBObject query = (BasicDBObject) com.mongodb.util.JSON.parse(json);
      col.deleteMany(query);
      Note: If you want to delete data based on conditions, we recommend that you specify the conditions in the JSON format.
    • item: the name, condition, and value for filtering data. Example: "preSql":{"type":"remove","item":[{"name":"pv","value":"100","condition":"$gt"},{"name":"pid","value":"10"}]}.

      Data Integration builds the query conditions from the value of the item parameter and deletes the matching data by using the standard MongoDB API, for example, col.deleteMany(query).

  • If the value of the preSql parameter cannot be recognized, no SQL statement is executed.
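
One way to picture how an item list could map to a MongoDB query document is the following Python sketch. It rests on assumptions that the text above does not confirm: the function name item_to_query is hypothetical, an entry without a condition is treated as an equality match, and values are kept as the strings given in the configuration.

```python
def item_to_query(items):
    """Translate a preSql "item" list into a MongoDB-style query dict."""
    query = {}
    for item in items:
        if "condition" in item:
            # For example, {"pv": {"$gt": "100"}}
            query[item["name"]] = {item["condition"]: item["value"]}
        else:
            # Assumption: no condition means an equality match.
            query[item["name"]] = item["value"]
    return query


pre_sql = {
    "type": "remove",
    "item": [
        {"name": "pv", "value": "100", "condition": "$gt"},
        {"name": "pid", "value": "10"},
    ],
}
query = item_to_query(pre_sql["item"])
print(query)
```

A query document of this shape is what a call such as col.deleteMany(query) would receive.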

Configure MongoDB Writer by using the codeless UI

MongoDB Writer cannot be configured by using the codeless UI.

Configure MongoDB Writer by using the code editor

For more information about how to configure a synchronization node by using the code editor, see Create a sync node by using the code editor.

In the following code, a synchronization node is configured to write data to MongoDB. For more information about the parameters, see the preceding parameter description.
{
    "type": "job",
    "version": "2.0",// The version number. 
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "mongodb",// The writer type. 
            "parameter": {
                "datasource": "",// The name of the data source. 
                "column": [
                    {
                        "name": "_id",// The name of the field. 
                        "type": "ObjectId"// The data type of the field. If you set the replaceKey parameter to _id, you must set the type parameter to ObjectId. If you set the type parameter to string, the data cannot be overwritten. 
                    },
                    {
                        "name": "age",
                        "type": "int"
                    },
                    {
                        "name": "id",
                        "type": "long"
                    },
                    {
                        "name": "wealth",
                        "type": "double"
                    },
                    {
                        "name": "hobby",
                        "type": "array",
                        "splitter": " "
                    },
                    {
                        "name": "valid",
                        "type": "boolean"
                    },
                    {
                        "name": "date_of_join",
                        "format": "yyyy-MM-dd HH:mm:ss",
                        "type": "date"
                    }
                ],
                "writeMode": {// The write mode. 
                    "isReplace": "true",
                    "replaceKey": "_id"
                },
                "collectionName": "datax_test"// The name of the collection. 
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {// The maximum number of dirty data records allowed. 
            "record": "0"
        },
        "speed": {
            "jvmOption": "-Xms1024m -Xmx1024m",
            "throttle": true,// Specifies whether to enable bandwidth throttling. The value false indicates that bandwidth throttling is disabled, and the value true indicates that bandwidth throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true. 
            "concurrent": 1,// The maximum number of parallel threads. 
            "mbps": "1"// The maximum transmission rate. 
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}
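
Before you submit a configuration, a quick local check of the writer parameters can catch simple mistakes. The following Python sketch is hypothetical and is not part of Data Integration: the function name check_writer_parameter and the data source name my_mongodb_source are assumptions, and the only rules it encodes are the required parameters from the parameter table and the ObjectId rule stated in the comment on the _id column above. The input must be plain JSON without the explanatory comments.

```python
import json

REQUIRED_KEYS = ("datasource", "collectionName", "column")


def check_writer_parameter(parameter):
    """Return a list of problems found in a MongoDB Writer "parameter" object."""
    problems = []
    for key in REQUIRED_KEYS:
        if not parameter.get(key):
            problems.append("missing required parameter: " + key)

    write_mode = parameter.get("writeMode") or {}
    column_types = {c["name"]: c["type"] for c in parameter.get("column") or []}
    # Rule taken from the comment on the _id column in the sample above.
    if (
        write_mode.get("replaceKey") == "_id"
        and "_id" in column_types
        and column_types["_id"] != "ObjectId"
    ):
        problems.append("replaceKey _id requires the _id column type ObjectId")
    return problems


good = json.loads("""
{
    "datasource": "my_mongodb_source",
    "collectionName": "datax_test",
    "column": [
        {"name": "_id", "type": "ObjectId"},
        {"name": "hobby", "type": "array", "splitter": " "}
    ],
    "writeMode": {"isReplace": "true", "replaceKey": "_id"}
}
""")
bad = dict(good, column=[{"name": "_id", "type": "string"}])

print(check_writer_parameter(good))
print(check_writer_parameter(bad))
```

The first call reports no problems, and the second reports the mismatch between replaceKey and the type of the _id column.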