
Configure MongoDB reader

Last Updated: Apr 12, 2018

The MongoDB Reader plug-in uses MongoClient, the Java client of MongoDB, to read data from MongoDB. In recent versions of MongoDB, the lock granularity has been reduced from the database level to the document level. Combined with MongoDB's indexing capabilities, this allows high-performance reads from MongoDB.

Note:

If you are using ApsaraDB for MongoDB, a root account is provided by default. For security reasons, Data Integration only supports connections that use the relevant MongoDB account. Do not use the root account as the access account when you add and use a MongoDB data source.

MongoDB Reader reads data from MongoDB in parallel through the Data Integration framework. Based on the specified rules, the controlling Job program partitions the data in MongoDB into multiple fragments and reads them in parallel. It then converts each MongoDB data type to the corresponding type supported by Data Integration.
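The split-and-read flow described above can be sketched roughly as follows. This is only an illustration, not the plug-in's implementation: the page does not state the actual split rules, so the sketch assumes a simple contiguous-range split, uses an in-memory list in place of a collection, and the names `split_ranges`, `read_fragment`, and `read_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total, parts):
    """Split range(total) into contiguous (start, end) fragments.

    Assumption: an even contiguous split; the real split rules are not documented here.
    """
    parts = max(1, min(parts, total)) if total else 1
    base, extra = divmod(total, parts)
    ranges, start = [], 0
    for i in range(parts):
        # The first `extra` fragments get one additional item.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def read_fragment(docs, start, end):
    """Stand-in for reading one data fragment (here: slicing a list)."""
    return docs[start:end]


def read_parallel(docs, parts):
    """Read all fragments concurrently and return the documents in original order."""
    ranges = split_ranges(len(docs), parts)
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # pool.map yields results in the order of `ranges`, so order is preserved.
        chunks = pool.map(lambda r: read_fragment(docs, *r), ranges)
        return [doc for chunk in chunks for doc in chunk]
```

For example, `split_ranges(10, 3)` yields three fragments of sizes 4, 3, and 3, and `read_parallel` returns the same documents as a sequential read.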

MongoDB Reader supports most data types in MongoDB. Check whether your data type is supported before using it.

MongoDB Reader converts MongoDB data types as follows:

Category MongoDB data type
Integer int, long
Floating point double
String string, array
Date and time date
Boolean bool
Binary bytes
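As a quick way to check a type against the table above before configuring a column, the mapping can be written as a lookup. This is a minimal sketch: the dictionary simply restates the table, and `category_for` is a hypothetical helper, not part of Data Integration.

```python
# MongoDB data type -> category, restating the conversion table above.
CATEGORY_BY_MONGO_TYPE = {
    "int": "Integer",
    "long": "Integer",
    "double": "Floating point",
    "string": "String",
    "array": "String",
    "date": "Date and time",
    "bool": "Boolean",
    "bytes": "Binary",
}


def category_for(mongo_type):
    """Return the category for a MongoDB type name, or raise if it is not in the table."""
    key = mongo_type.strip().lower()
    if key not in CATEGORY_BY_MONGO_TYPE:
        raise ValueError(f"type not listed in the conversion table: {mongo_type!r}")
    return CATEGORY_BY_MONGO_TYPE[key]
```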

Parameter description

Parameter Description Required Default Value
datasource Data source name. It must be identical to the name of the data source that you added. Data sources can be added in script mode. Yes None
collectionName Collection name in MongoDB. Yes None
column An array of the column names of a document in MongoDB.
- name: The column name.
- type: The column type.
- splitter: MongoDB supports arrays, but the CDP framework does not. Therefore, the items of an array read from MongoDB are joined into a single string using this delimiter.
Yes None
query Used to limit the range of MongoDB data returned. For example, if you set it to "query":"{'operationTime':{'$gte':ISODate('${last_day}T00:00:00.424+0800')}}", only data whose operationTime is greater than or equal to 00:00 on ${last_day} is returned. ${last_day} is a DataWorks scheduling parameter in the format $[yyyy-mm-dd]. You can use the conditional operators ($gt, $lt, $gte, $lte), logical operators (and, or), and functions (max, min, sum, avg, ISODate) supported by MongoDB as needed. For more information, see the query syntax of MongoDB. No None
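The splitter and query parameters can be illustrated with a short sketch. It is not how the plug-in or the DataWorks scheduler is implemented: `join_array` and `render_query` are hypothetical helpers that only show what the resulting strings look like, and the date value is an example.

```python
QUERY_TEMPLATE = "{'operationTime':{'$gte':ISODate('${last_day}T00:00:00.424+0800')}}"


def join_array(values, splitter):
    """Join the items of an array into one string, as the splitter parameter describes."""
    return splitter.join(str(v) for v in values)


def render_query(template, last_day):
    """Substitute the ${last_day} placeholder with a yyyy-mm-dd date string.

    Assumption: plain text replacement, standing in for the scheduler's substitution.
    """
    return template.replace("${last_day}", last_day)


print(join_array(["red", "green", 3], ","))
print(render_query(QUERY_TEMPLATE, "2018-04-11"))
```

With a splitter of `,`, the array `["red", "green", 3]` becomes the string `red,green,3`, and the rendered query contains `ISODate('2018-04-11T00:00:00.424+0800')`.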

Development in wizard mode

Development in wizard mode is currently unavailable.

Development in script mode

Configure the data synchronization job to read data from MongoDB:

{
    "type": "job",
    "version": "1.0",
    "configuration": {
        "reader": {
            "plugin": "mongodb",
            "parameter": {
                "datasource": "datasourceName",
                "collectionName": "tag_data",
                "column": [
                    {
                        "name": "unique_id",
                        "type": "string"
                    },
                    {
                        "name": "frontcat_id",
                        "type": "string"
                    },
                    {
                        "name": "property",
                        "type": "string"
                    },
                    {
                        "name": "scorea",
                        "type": "int"
                    },
                    {
                        "name": "online",
                        "type": "bool"
                    },
                    {
                        "name": "percentage",
                        "type": "double"
                    }
                ]
            }
        },
        "writer": {}
    }
}
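Before submitting a script-mode job, it can help to check that the required reader parameters from the parameter table (datasource, collectionName, column) are present. The following is a minimal sketch under that assumption: `missing_reader_params` is a hypothetical helper, not a DataWorks API, and the embedded job is a shortened version of the configuration above.

```python
import json

# Parameters marked "Required: Yes" in the parameter table.
REQUIRED_READER_PARAMS = ("datasource", "collectionName", "column")


def missing_reader_params(job):
    """Return the required reader parameters that are absent or empty."""
    params = job.get("configuration", {}).get("reader", {}).get("parameter", {})
    return [name for name in REQUIRED_READER_PARAMS if not params.get(name)]


job = json.loads("""
{
    "type": "job",
    "version": "1.0",
    "configuration": {
        "reader": {
            "plugin": "mongodb",
            "parameter": {
                "datasource": "datasourceName",
                "collectionName": "tag_data",
                "column": [{"name": "unique_id", "type": "string"}]
            }
        },
        "writer": {}
    }
}
""")

print(missing_reader_params(job))  # prints []
```

An empty list means nothing required is missing; the optional query parameter is deliberately not checked.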