The MongoDB output widget writes data from external databases to MongoDB. It can also copy and push data from storage systems connected to the big data platform to MongoDB for data integration and reprocessing. This topic describes how to configure the MongoDB output widget.
Prerequisites
A MongoDB data source is created. For more information, see create a MongoDB data source.
To configure the properties of the MongoDB output widget, your account must have write-through permission for the data source. If you lack this permission, you can request access to the data source. For more information, see requesting data source permission.
Procedure
Select Development > Data Integration from the top menu bar on the Dataphin home page.
On the integration page's top menu bar, select Project (in Dev-Prod mode, select Environment).
In the navigation pane on the left, click Batch Pipeline. From the Batch Pipeline list, select the offline pipeline you want to develop to access its configuration page.
To open the Component Library panel, click on Component Library located in the upper-right corner of the page.
In the Component Library panel's left-side navigation pane, select Output. Then, find the MongoDB widget in the output widget list on the right and drag it onto the canvas.
Connect the target input, transform, or flow widget to the MongoDB output widget by clicking and dragging the icon.
On the MongoDB output widget, click the icon to open the MongoDB Output Configuration dialog box.
In the MongoDB Output Configuration dialog box, set the necessary parameters.
Parameter
Description
Basic Settings
Step Name
This is the name of the MongoDB output widget. The naming convention is as follows:
Only Chinese characters, letters, numbers, and underscores (_) are supported.
Up to 64 characters can be entered.
Datasource
The data source drop-down list displays all MongoDB data sources, including those for which you have write-through permission and those for which you do not. Click the icon to copy the current data source name.
For data sources without write-through permission, you can click Request next to the data source to request write-through permission. For more information, see request, renew, and return data source permissions.
If you do not have a MongoDB-type data source, click Create Data Source to create a data source. For more information, see create a MongoDB data source.
Table
Select the target table for the output data. You can search by entering table name keywords, or enter the exact table name and click Accurate Search. After you select a table, the system automatically checks the table status. Click the icon to copy the name of the currently selected table.
Update Information (optional)
Specify the update information. For example, {"isUpsert":"true","upsertkey":"unique_id"}.
Field Separator (optional)
Enter the separator between fields. If left blank, the default is a comma (,).
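To make the Update Information setting more concrete, the following Python sketch simulates upsert behavior on an in-memory list of documents. It assumes that isUpsert turns on update-or-insert behavior and that the upsertkey value names the field used to match existing documents. The source only shows the example value, so this interpretation and the apply_upsert helper are hypothetical and do not represent Dataphin's actual implementation.

```python
import json


def apply_upsert(collection, record, update_info):
    """Write one record into an in-memory 'collection' (a list of dicts).

    Assumption: when isUpsert is "true", a document whose key field matches
    the incoming record is updated in place; otherwise the record is inserted.
    """
    is_upsert = str(update_info.get("isUpsert", "false")).lower() == "true"
    key = update_info.get("upsertkey")
    if is_upsert and key is not None and key in record:
        for doc in collection:
            if doc.get(key) == record[key]:
                doc.update(record)  # merge new values into the matched document
                return "updated"
    collection.append(dict(record))  # no match found (or upsert disabled): insert
    return "inserted"


# The example value shown for Update Information.
update_info = json.loads('{"isUpsert":"true","upsertkey":"unique_id"}')

collection = [{"unique_id": 1, "user_name": "alice"}]
first = apply_upsert(collection, {"unique_id": 1, "user_name": "alicia"}, update_info)
second = apply_upsert(collection, {"unique_id": 2, "user_name": "bob"}, update_info)
```

Under these assumptions, the first call updates the existing document because its unique_id matches, and the second call inserts a new document.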
Field Mapping
Input Field
Displays the output fields of the upstream widget.
Output Field
Displays the output fields. Dataphin supports configuring output fields by Batch Add and Create New Output Field:
Batch Add: Click Batch Add. JSON and TEXT formats are supported for batch configuration.
Batch configuration in JSON format, for example:
// Example:
[{"name": "user_id","type": "String"}, {"name": "user_name","type": "String"}]
Note: name specifies the name of the field to import. type specifies the data type of the field after it is imported. For example, "name":"user_id","type":"String" imports the field named user_id and sets its data type to String.
Batch configuration in TEXT format, for example:
// Example:
user_id,String
user_name,String
The row delimiter is used to separate each field's information. The default is a line feed (\n). Line feed (\n), semicolon (;), and period (.) are supported.
The column delimiter is used to separate the field name and field type. The default is a comma (,).
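As an illustration of how the two batch formats relate, the Python sketch below parses a JSON configuration and a TEXT configuration into the same list of field definitions. The function names parse_json_fields and parse_text_fields are hypothetical helpers written for this example and are not part of Dataphin; the delimiter defaults follow the description above.

```python
import json


def parse_json_fields(text):
    """Parse a JSON batch configuration into a list of name/type dicts."""
    fields = json.loads(text)
    return [{"name": f["name"], "type": f["type"]} for f in fields]


def parse_text_fields(text, row_delimiter="\n", column_delimiter=","):
    """Parse a TEXT batch configuration.

    Rows are split on row_delimiter (default: line feed); each row is split
    once on column_delimiter (default: comma) into field name and field type.
    """
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # skip blank rows, such as a trailing line feed
        name, field_type = row.split(column_delimiter, 1)
        fields.append({"name": name.strip(), "type": field_type.strip()})
    return fields


json_config = '[{"name": "user_id","type": "String"}, {"name": "user_name","type": "String"}]'
text_config = "user_id,String\nuser_name,String"

json_fields = parse_json_fields(json_config)
text_fields = parse_text_fields(text_config)
# The same fields, using a semicolon as the row delimiter instead.
semicolon_fields = parse_text_fields("user_id,String;user_name,String", row_delimiter=";")
```

All three calls describe the same two String fields, user_id and user_name.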
Create New Output Field.
Click + Create New Output Field. Fill in Column and select Type according to the page prompts. After completing the configuration of the current row, click the icon to save.
Copy Ancestor Table Field.
Click Copy Ancestor Table Field. The system will automatically generate output fields based on the field names of the ancestor table.
Manage Output Fields.
You can also perform the following operations on the added fields:
Click the icon in the Actions column to edit an existing field.
Click the icon in the Actions column to delete an existing field.
Mapping
The mapping relationship is used to map the input fields of the source table to the output fields of the target table, facilitating subsequent data synchronization. The mapping relationship includes Same Name Mapping and Same Row Mapping. The applicable scenarios are described as follows:
Same Name Mapping: Maps fields with the same field name.
Same Row Mapping: Use this when the field names of the source table and target table differ, but the fields in corresponding rows need to be mapped. Only fields in the same row are mapped.
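The difference between the two mapping modes can be sketched in Python as follows. The functions same_name_mapping and same_row_mapping and the sample field names are hypothetical and only illustrate the behavior described above. How the product handles field lists of unequal length is not stated in the source, so pairing stops at the shorter list here as an assumption.

```python
def same_name_mapping(input_fields, output_fields):
    """Pair each input field with the output field that has the same name."""
    output_names = set(output_fields)
    return [(name, name) for name in input_fields if name in output_names]


def same_row_mapping(input_fields, output_fields):
    """Pair fields by row position, regardless of their names.

    Assumption: if the lists differ in length, the extra fields stay unmapped.
    """
    return list(zip(input_fields, output_fields))


input_fields = ["id", "name", "created"]
output_fields = ["name", "user_id", "created"]

by_name = same_name_mapping(input_fields, output_fields)
by_row = same_row_mapping(input_fields, output_fields)
```

With these sample fields, same-name mapping pairs only name and created, while same-row mapping pairs id with name, name with user_id, and created with created.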
Click Confirm to complete the property configuration of the MongoDB output widget.