MongoDB is a document-oriented database that ranks behind only Oracle and MySQL among widely used databases. DataWorks provides MongoDB Reader and MongoDB Writer for you to read data from and write data to MongoDB data sources. You can use the codeless user interface (UI) or the code editor to configure synchronization nodes for MongoDB data sources.

Background information

Workspaces in standard mode support the data source isolation feature. You can add data sources separately for the development and production environments to isolate the data sources. This helps keep your data secure. For more information, see Isolate connections between the development and production environments.

Procedure

  1. Go to the Data Source page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. Select the region where the required workspace resides, find the workspace, and click Data Integration.
    4. In the left-side navigation pane, choose Data Source > Data Sources.
  2. On the Data Source page, click Add data source in the upper-right corner.
  3. In the Add data source dialog box, click MongoDB in the NoSQL section.
  4. In the Add MongoDB data source dialog box, configure the parameters.
    You can use one of the following modes to add a MongoDB data source: Alibaba Cloud Instance Mode and Connection String Mode.
    Note: If the MongoDB data source that you want to add to a DataWorks workspace does not belong to the same Alibaba Cloud account as the workspace, you can add the MongoDB data source only in connection string mode.
    • Alibaba Cloud Instance Mode: In most cases, this mode is used to add a MongoDB data source that is deployed on the classic network. If the workspace and the MongoDB data source that you want to add reside in the same region, the MongoDB data source can be connected to the workspace over the classic network. If they reside in different regions, the connectivity between them over the classic network cannot be ensured.
      Parameter | Description
      Data Source Type | The mode in which the data source is added. Set this parameter to Alibaba Cloud Instance Mode. Note: If you have not assigned the default role to Data Integration, log on to the Resource Access Management (RAM) console by using your Alibaba Cloud account and perform authorization. Then, refresh the configuration page.
      Data Source Name | The name of the data source. The name can contain letters, digits, and underscores (_) and must start with a letter.
      Data Source Description | The description of the data source. The description can be a maximum of 80 characters in length.
      Environment | The environment in which the data source is used. Valid values: Development and Production. Note: This parameter is displayed only when the workspace is in standard mode.
      Region | The region where the ApsaraDB for MongoDB instance resides.
      Instance ID | The ID of the ApsaraDB for MongoDB instance. You can view the ID in the ApsaraDB for MongoDB console.
      Database name | The name of the database that you created in the ApsaraDB for MongoDB console. You can create a database and specify a username and a password for the database in this console.
      Username | The username that is used to connect to the database.
      Password | The password that is used to connect to the database.
    • Connection String Mode: In most cases, this mode is used to add a MongoDB data source that is deployed on the Internet. Access to a MongoDB data source over the Internet may generate fees.
      Parameter | Description
      Data Source Type | The mode in which the data source is added. Set this parameter to Connection String Mode.
      Data Source Name | The name of the data source. The name can contain letters, digits, and underscores (_) and must start with a letter.
      Data Source Description | The description of the data source. The description can be a maximum of 80 characters in length.
      Environment | The environment in which the data source is used. Valid values: Development and Production. Note: This parameter is displayed only when the workspace is in standard mode.
      Address | The server address. Specify this parameter in the format of Host address:Port number. You can click Add access address to specify multiple addresses. Note: If you specify multiple addresses, make sure that the host IP addresses in all of them are either all public IP addresses or all private IP addresses.
      Database name | The name of the database that you created in the ApsaraDB for MongoDB console.
      Username | The username that is used to connect to the database.
      Password | The password that is used to connect to the database.
      To add a MongoDB data source in connection string mode, perform the following steps:
      1. Set Data Source Type to Connection String Mode.
      2. In the Add MongoDB data source dialog box, configure the parameters. In the Address parameter, use the private IP address of the data source as the host address.
      3. Click Complete without testing the connectivity of the data source.
      4. Create a custom resource group and use the resource group to run a synchronization node. For more information, see Create a custom resource group for Data Integration.
      Notice
      • ApsaraDB for MongoDB data sources can be connected only over the classic network.
      • If an ApsaraDB for MongoDB instance is deployed in a virtual private cloud (VPC), you must set Data Source Type to Connection String Mode when you add the instance to DataWorks as a data source.
      • If a MongoDB data source is deployed in a VPC, you cannot test the connectivity of the data source.
  5. Set Resource Group connectivity to Data Integration.
  6. Find the desired resource group in the resource group list in the lower part of the dialog box and click Test connectivity in the Actions column.
    A synchronization node can use only one type of resource group. To make sure that your synchronization nodes run as expected, test the connectivity of every resource group for Data Integration on which the nodes will run. To test the connectivity of multiple resource groups for Data Integration at the same time, select the resource groups and click Batch test connectivity. For more information, see Select a network connectivity solution.
    Note
    • By default, the resource group list displays only exclusive resource groups for Data Integration. To ensure the stability and performance of data synchronization, we recommend that you use exclusive resource groups for Data Integration.
    • If you want to test the network connectivity between the shared resource group or a custom resource group and the data source, click Advanced below the resource group list. In the Warning message, click Confirm. Then, all available shared and custom resource groups appear in the resource group list.
  7. After the data source passes the connectivity test, click Complete.
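
The parameter rules in step 4 can be checked before you open the dialog box. The following Python sketch is illustrative only and is not part of DataWorks: the function names (check_name, check_description, split_address, addresses_are_consistent, build_uri) are hypothetical, and DataWorks performs its own validation. The sketch applies the documented rules for Data Source Name, Data Source Description, and Address, and shows how the Connection String Mode parameters would map to a standard MongoDB connection URI, assuming that you want to try the same values with a MongoDB client on your own machine.

```python
import ipaddress
import re
from typing import List, Tuple
from urllib.parse import quote_plus

# Documented rule: letters, digits, and underscores; must start with a letter.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Documented rule: the description can be at most 80 characters long.
MAX_DESCRIPTION_LENGTH = 80


def check_name(name: str) -> bool:
    """Return True if the data source name follows the documented rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def check_description(description: str) -> bool:
    """Return True if the description is within the documented length limit."""
    return len(description) <= MAX_DESCRIPTION_LENGTH


def split_address(address: str) -> Tuple[str, int]:
    """Split a 'host:port' string and validate the port range."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdecimal():
        raise ValueError(f"expected 'host:port', got {address!r}")
    port_number = int(port)
    if not 1 <= port_number <= 65535:
        raise ValueError(f"port out of range in {address!r}")
    return host, port_number


def addresses_are_consistent(addresses: List[str]) -> bool:
    """Return True if the IP hosts are all private or all public.

    Hosts that are not IP literals (for example, domain names) are skipped,
    because they cannot be classified without a DNS lookup.
    """
    kinds = set()
    for address in addresses:
        host, _ = split_address(address)
        try:
            kinds.add(ipaddress.ip_address(host).is_private)
        except ValueError:
            continue
    return len(kinds) <= 1


def build_uri(addresses: List[str], database: str,
              username: str, password: str) -> str:
    """Build a standard MongoDB URI from the Connection String Mode fields.

    The username and password are percent-encoded so that characters such
    as '@' and ':' do not break the URI.
    """
    for address in addresses:
        split_address(address)  # raises ValueError on a malformed address
    hosts = ",".join(addresses)
    return (f"mongodb://{quote_plus(username)}:{quote_plus(password)}"
            f"@{hosts}/{database}")
```

For example, build_uri(["10.0.0.1:27017", "10.0.0.2:27017"], "orders", "sync_user", "p@ss:word") returns mongodb://sync_user:p%40ss%3Aword@10.0.0.1:27017,10.0.0.2:27017/orders. The addresses, database name, and credentials in this example are placeholders.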

What to do next

You have learned how to add a MongoDB data source. The subsequent tutorials describe how to configure MongoDB Reader and MongoDB Writer. For more information, see MongoDB Reader and MongoDB Writer.