
Dataphin:Configure the TDH Inceptor output component

Last Updated:Mar 05, 2026

The TDH Inceptor output component writes data to a TDH Inceptor data source. When you sync data from other data sources to a TDH Inceptor data source, you must configure this output component after you configure the source component. This topic describes how to configure the TDH Inceptor output component.

Limits

The TDH Inceptor output component supports writing data to TDH Inceptor tables in ORC, Parquet, and text file formats. Data integration for transactional tables in ORC format is not supported.

Prerequisites

  • A TDH Inceptor data source is created. For more information, see Create a TDH Inceptor data source.

  • The account used to configure the properties of the TDH Inceptor output component must have write permission on the data source. If you do not have this permission, request it first. For more information, see Request data source permissions.

Procedure

  1. On the Dataphin home page, choose Developer > Data Integration from the top menu bar.

  2. In the top menu bar on the integration page, select a Project. If you are in Dev-Prod mode, you must also select an Environment.

  3. In the navigation pane on the left, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to configure to open its configuration page.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the navigation pane on the left of the Component Library panel, select Outputs. From the list of output components on the right, find the TDH Inceptor component and drag it to the canvas.

  6. Drag a connection from the target input, transform, or flow component to the TDH Inceptor output component to link them.

  7. Click the configuration icon on the TDH Inceptor output component card to open the TDH Inceptor Output Configuration dialog box.

  8. In the TDH Inceptor Output Configuration dialog box, configure the parameters.


    Basic Settings

    Step Name

    The name of the TDH Inceptor output component. Dataphin automatically generates a step name. You can also change the name as needed. The naming conventions are as follows:

    • Can contain only Chinese characters, letters, underscores (_), and digits.

    • Cannot exceed 64 characters in length.
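
The naming rules above can be modeled as a simple check. The following Python sketch is purely illustrative (Dataphin enforces these rules itself; the helper name and the CJK character range are assumptions):

```python
import re

# Illustrative only: Dataphin performs this validation in the UI.
# \u4e00-\u9fff approximates the Chinese-character range allowed by the rules.
STEP_NAME_RE = re.compile(r"^[\u4e00-\u9fffA-Za-z0-9_]+$")

def is_valid_step_name(name: str) -> bool:
    """Return True if the name uses only Chinese characters, letters,
    underscores, and digits, and is at most 64 characters long."""
    return len(name) <= 64 and bool(STEP_NAME_RE.match(name))
```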

    Datasource

    The drop-down list displays all TDH Inceptor data sources, including those for which you have write permission and those for which you do not. Click the copy icon to copy the current data source name.

    • For data sources where you do not have write permission, click Request next to the data source to request it. For more information, see Request, renew, or release permissions on a data source.

    • If you do not have a TDH Inceptor data source, click Create Data Source to create one. For more information, see Create a TDH Inceptor data source.

    Table

    Select the destination table for the data output. You can enter a keyword to search for the table, and click the copy icon to copy the name of the selected table. If the destination table does not exist in the TDH Inceptor data source, you can use the One-Click Table Creation feature to quickly generate it. The steps are as follows:

    1. Click One-Click Table Creation. Dataphin automatically generates the code to create the destination table. This includes the destination table name, which defaults to the source table name, and field types, which are based on an initial conversion from Dataphin fields.

    2. You can modify the SQL script for creating the destination table as needed. Then, click Create. After the destination table is created, Dataphin automatically uses it as the destination table for data output.

    Note

    If a table with the same name exists in the development environment, Dataphin reports an error when you click Create.
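
The generated creation code described above can be pictured with a small sketch. This is not Dataphin's actual generator: the function, the type map, and the ORC storage clause are hypothetical stand-ins that only illustrate the idea of deriving a DDL statement from the source fields.

```python
# Hypothetical, simplified stand-in for One-Click Table Creation.
# Dataphin's real field-type conversion is more complete than this map.
TYPE_MAP = {"string": "STRING", "bigint": "BIGINT", "double": "DOUBLE"}

def generate_create_table(table_name: str, fields: list[tuple[str, str]]) -> str:
    """Build a CREATE TABLE statement from (name, source_type) pairs,
    defaulting unknown types to STRING."""
    cols = ",\n  ".join(
        f"{name} {TYPE_MAP.get(dtype.lower(), 'STRING')}" for name, dtype in fields
    )
    return f"CREATE TABLE {table_name} (\n  {cols}\n) STORED AS ORC;"

ddl = generate_create_table("ods_orders", [("order_id", "bigint"), ("buyer", "string")])
```

As in the procedure above, you would review and adjust the generated SQL before creating the table.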

    Policy for missing production table

    The policy to apply when the production table does not exist at publishing time. You can select Do not handle or Automatic creation. The default is Automatic creation.

    • Do not handle: If the destination table does not exist, a message is displayed when you submit the node, but you can still publish it. In this case, you must manually create the destination table in the production environment before you can run the node.

    • Automatic creation: A table creation statement is required. The creation statement for the selected table is filled in by default, and you can adjust it. The table name in the creation statement must use the placeholder ${table_name}; only this placeholder is supported, and it is replaced with the actual table name during execution.

      If the destination table does not exist, it is first created according to the table creation statement. If the creation fails, the check fails during publishing. You can modify the creation statement based on the error message and then publish again. If the destination table already exists, the creation statement is not executed.

    Note

    This parameter is supported only in projects in Dev-Prod mode.
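
The ${table_name} placeholder behaves like simple template substitution, which the following Python sketch illustrates. This is only a model of the behavior: the substitution is performed by Dataphin at execution time, not by user code, and the statement and table name here are invented examples.

```python
from string import Template

# Illustrative model of how Dataphin resolves the ${table_name}
# placeholder in the creation statement at execution time.
creation_stmt = Template(
    "CREATE TABLE IF NOT EXISTS ${table_name} (id BIGINT, name STRING)"
)
resolved = creation_stmt.substitute(table_name="ods_orders_prod")
```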

    File Encoding

    Select the file encoding method. The system supports UTF-8 and GBK.

    Compression Format

    Select the file compression format. Supported formats include zlib and hadoop-snappy.

    Loading Policy

    The policy for writing data to the destination table in the TDH Inceptor data source. You can select Append data or Overwrite data. The scenarios are described as follows:

    • Append data: If a primary key or constraint violation occurs, the system reports a dirty data error.

    • Overwrite data: If a primary key or constraint violation occurs, the system first deletes the original data and then inserts the entire new row.

    Field Separator

    Enter the field separator for file storage. If you leave this field empty, a comma (,) is used by default.

    Partition

    If you selected a partitioned table, you must select a partition for the data table. The default partition is ds=${bizdate}.

    Hadoop Parameter Configuration (Optional)

    Used to adjust write parameters. Specify the parameters in JSON format, for example {"key1":"value1", "key2":"value2"}, and separate multiple parameters with commas (,). Different table types support different parameters:

    • If the output table is in ORC format and has many fields, you can tune the hive.exec.orc.default.buffer.size parameter based on the available memory. If memory is sufficient, try increasing the value to improve write performance; if memory is limited, try decreasing it to reduce garbage collection (GC) time. The default value is 16384 bytes (16 KB), and the value should not exceed 262144 bytes (256 KB).

    • When writing to a Hudi table, you can set {"hoodie.parquet.compression.codec":"snappy"} to change the compression format to snappy.
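
A quick way to sanity-check a Hadoop parameter string before pasting it in is to parse it as JSON. The helper below is a hypothetical sketch (Dataphin does not expose such a check); it only illustrates the expected JSON format and the documented buffer-size upper bound.

```python
import json

# Illustrative validation sketch for the Hadoop parameter JSON.
# The 262144-byte cap comes from the documented limit above.
def parse_hadoop_params(raw: str) -> dict:
    params = json.loads(raw)  # raises ValueError if the JSON is malformed
    size = params.get("hive.exec.orc.default.buffer.size")
    if size is not None and not (0 < int(size) <= 262144):
        raise ValueError("buffer size must be between 1 and 262144 bytes")
    return params

params = parse_hadoop_params('{"hive.exec.orc.default.buffer.size": "65536"}')
```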

    Prepared Statement

    The SQL script to execute on the database before data import. The script can be up to 128 characters long.

    Completion Statement

    The SQL script to execute on the database after data import. The script can be up to 128 characters long.

    Field Mapping

    Input Fields

    Displays the input fields.

    Output Fields

    The output fields area displays all fields of the selected table. If you do not want to output certain fields to downstream components, you can delete them:

    • To delete a small number of fields, click the delete icon in the Actions column for each unnecessary field.

    • To delete many fields, click Field Management. On the Field Management page, select multiple fields, and then click the move icon to move them from the Selected Input Fields list to the Unselected Input Fields list.

    Quick Mapping

    Quick mapping establishes the relationship between the input fields of the source table and the output fields of the destination table. You can map by name or by row index. The scenarios are described as follows:

    • Map by name: Maps fields that have the same name.

    • Map by row index: Use this when the field names in the source and destination tables differ but fields in the same row position correspond to each other. Only fields in the same row are mapped.
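
The two quick-mapping strategies can be sketched as follows. This is only a model of the behavior described above; Dataphin computes the mappings in the UI, and the function names are invented for illustration.

```python
# Illustrative model of the two quick-mapping strategies.
def map_by_name(inputs: list[str], outputs: list[str]) -> list[tuple[str, str]]:
    """Pair input and output fields that share the same name."""
    out = set(outputs)
    return [(f, f) for f in inputs if f in out]

def map_by_row(inputs: list[str], outputs: list[str]) -> list[tuple[str, str]]:
    """Pair fields that sit in the same row position, regardless of name."""
    return list(zip(inputs, outputs))
```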

  9. Click Confirm to complete the property configuration for the TDH Inceptor output component.