By configuring the DM (DaMeng) output component, you can write data read from external databases to DM (DaMeng), or copy and push data from storage systems connected to the big data platform to DM (DaMeng) for data integration and reprocessing. This topic describes how to configure the DM (DaMeng) output component.
Prerequisites
You have created a DM (DaMeng) data source. For more information, see Create a DaMeng (DM) data source.
The account used to configure the DM (DaMeng) output component properties must have write-through permission for the data source. If you do not have the permission, you need to apply for the data source permission. For more information, see Apply for, renew, and return data source permissions.
Procedure
In the top navigation bar of the Dataphin homepage, choose Develop > Data Integration.
In the top navigation bar of the integration page, select Project (In Dev-Prod mode, you need to select Environment).
In the navigation pane on the left, click Batch Pipeline, and then click the offline pipeline that you want to develop in the Batch Pipeline list to open the configuration page of the offline pipeline.
Click Component Library in the upper-right corner of the page to open the Component Library panel.
In the navigation pane on the left of the Component Library panel, select Outputs, find the DM component in the output component list on the right, and drag the component to the canvas.
Click and drag the connection point icon of the target input, transform, or flow component to connect it to the DM output component.
On the DM output component, click the configuration icon to open the DaMeng Output Configuration dialog box.
In the DaMeng Output Configuration dialog box, configure the parameters as described in the following table.
Parameter
Description
Basic Settings
Step Name
The name of the DM output component. Dataphin automatically generates a step name, which you can modify based on your business scenario. The name must meet the following requirements:
It can contain only Chinese characters, letters, underscores (_), and digits.
It cannot exceed 64 characters in length.
Datasource
The data source drop-down list displays all DM (DaMeng) data sources, including those for which you have write-through permissions and those for which you do not. Click the copy icon to copy the name of the current data source.
For data sources without write-through permissions, you can click Request next to the data source to request write-through permissions. For more information, see Apply for, renew, and return data source permissions.
If you do not have a DM (DaMeng) type data source, click Create Data Source to create one. For more information, see Create a DaMeng (DM) data source.
Table
Select the destination table for the output data. You can enter a keyword to search for a table, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks the table status. Click the copy icon to copy the name of the selected table.
If the destination table does not exist in the DM data source, you can use the One-click Table Creation feature to quickly generate it. The detailed steps are as follows:
Click One-click Table Creation. Dataphin automatically matches the code to create the destination table, including the destination table name (default is the source table name), field types (initially converted based on Dataphin fields), and other information.
You can modify the SQL script for creating the destination table as needed, and then click Create.
After the destination table is created, Dataphin automatically sets it as the destination table for the output data. One-click Table Creation creates destination tables for data synchronization in both the development and production environments, and Dataphin selects Create Table In Production Environment by default. If a table with the same name and structure already exists in the production environment, you can clear this option.
Note: If a table with the same name already exists in the development or production environment, Dataphin reports an error indicating that the table already exists when you click Create.
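For reference, a table-creation script generated by One-click Table Creation might look like the following sketch. The table and field names here are hypothetical; the actual names default to the source table name, and the field types are converted from the Dataphin source fields.

```sql
-- Illustrative destination table; names and types are assumptions,
-- not the exact script Dataphin generates for your table.
CREATE TABLE ods_user_info (
    user_id    VARCHAR(64),
    user_name  VARCHAR(256),
    gmt_create TIMESTAMP
);
```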
Loading Policy
Select the strategy for writing data to the destination table. Valid values:
Append Data (INSERT INTO): If a primary key or constraint conflict occurs, the conflicting record is reported as dirty data.
Update on Primary Key Conflict (MERGE INTO): If a primary key or constraint conflict occurs, the mapped fields of the existing record are updated with the new data.
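To illustrate the difference between the two policies, here is a hedged SQL sketch using a hypothetical table tgt with primary key id. DM is largely Oracle-compatible, so Oracle-style MERGE INTO syntax is shown; the exact syntax may vary by DM version.

```sql
-- Append Data: a plain insert. A second insert with the same primary
-- key raises a constraint violation, which the pipeline reports as
-- dirty data.
INSERT INTO tgt (id, name) VALUES (1, 'alice');

-- Update on Primary Key Conflict: a matching row is updated in place;
-- a non-matching row is inserted.
MERGE INTO tgt t
USING (SELECT 1 AS id, 'alice_v2' AS name FROM dual) s
ON (t.id = s.id)
WHEN MATCHED THEN
    UPDATE SET t.name = s.name
WHEN NOT MATCHED THEN
    INSERT (id, name) VALUES (s.id, s.name);
```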
Synchronous Write
The primary key update syntax is not an atomic operation. If the data to be written contains duplicate primary keys, enable synchronous write; otherwise, data is written in parallel. Synchronous write has lower performance than parallel write.
Note: This option is available only when Loading Policy is set to Update on Primary Key Conflict.
Batch Write Data Volume (optional)
The maximum amount of data to write in a single batch. You can also set Batch Write Count; the system writes a batch when either limit is reached. The default value is 32 MB.
Batch Write Count (optional)
The default value is 2,048 records. During data synchronization, a batch write strategy is used, controlled by Batch Write Count and Batch Write Data Volume: when the accumulated data reaches either limit, the system considers the batch full and immediately writes it to the destination.
A batch write data volume of 32 MB is recommended. Adjust the batch write count based on the actual size of a single record, and set it high enough that the data-volume limit, rather than the count limit, triggers the write. For example, if a single record is about 1 KB and the batch write data volume is set to 16 MB, set the batch write count to more than 16 MB ÷ 1 KB = 16,384 records, such as 20,000. With this configuration, the system triggers a write whenever the accumulated data reaches 16 MB.
Preparation Statement (optional)
The SQL script to be executed on the database before data import.
For example, to keep a service continuously available, you can create a staging table Target_A in the preparation statement and have the current step write data to Target_A. After the step finishes writing, the completion statement renames the currently serving table Service_B to Temp_C, renames Target_A to Service_B, and finally drops Temp_C.
Completion Statement (optional)
The SQL script to be executed on the database after data import.
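A minimal sketch of the table-swap pattern described above, using the table names Target_A, Service_B, and Temp_C from the example. This is illustrative only; the exact rename syntax may vary by DM version.

```sql
-- Preparation statement: create an empty staging table with the same
-- structure as the serving table, before data is written.
CREATE TABLE Target_A AS SELECT * FROM Service_B WHERE 1 = 0;

-- Completion statement: swap the freshly loaded table into service.
ALTER TABLE Service_B RENAME TO Temp_C;
ALTER TABLE Target_A RENAME TO Service_B;
DROP TABLE Temp_C;
```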
Field Mapping
Input Fields
Displays the input fields based on the upstream output.
Output Fields
Displays the output fields. You can perform the following operations:
Field management: Click Field Management to select output fields.
Click the corresponding move icon to move fields from Selected Input Fields to Unselected Input Fields.
Click the corresponding move icon to move fields from Unselected Input Fields to Selected Input Fields.
Batch add: Click Batch Add to configure in JSON, TEXT, or DDL format.
Batch configuration in JSON format, for example:
[{ "name": "user_id", "type": "String" }, { "name": "user_name", "type": "String" }]
Note: name indicates the name of the imported field, and type indicates the field type after import. For example, "name":"user_id","type":"String" imports a field named user_id and sets its type to String.
Batch configuration in TEXT format, for example:
user_id,String
user_name,String
The row delimiter separates the information of each field. The default is a line feed (\n); line feed (\n), semicolon (;), and period (.) are supported.
The column delimiter separates the field name from the field type. The default is a comma (,).
Batch configuration in DDL format, for example:
CREATE TABLE tablename ( id INT PRIMARY KEY, name VARCHAR(50), age INT );
Create new output field: Click +Create New Output Field, enter the Column name, and select the Type as prompted. After you complete the configuration for the current row, click the save icon to save it.
Mapping
Based on the upstream input and the fields of the target table, you can manually select field mappings. Mapping methods include Same Name Mapping and Same Row Mapping.
Same Name Mapping: maps fields that have the same name.
Same Row Mapping: maps fields in the same row when the field names in the source and target tables differ. Only fields in the same row are mapped.
Click OK to complete the property configuration of the DaMeng Output Component.