The Amazon RDS for SQL Server output component writes data to an Amazon RDS for SQL Server data source. When synchronizing data from other data sources to an Amazon RDS for SQL Server data source, you need to configure the target data source for the Amazon RDS for SQL Server output component after configuring the source data source information. This topic describes how to configure the Amazon RDS for SQL Server output component.
Prerequisites
An Amazon RDS for SQL Server data source is created. For more information, see Create an Amazon RDS for SQL Server data source.
The account used to configure the Amazon RDS for SQL Server output component properties must have write-through permission for the data source. If you do not have the permission, you need to request the data source permission. For more information, see Request, renew, and return data source permissions.
Procedure
On the Dataphin homepage, click Develop > Data Integration in the top navigation bar.
In the top navigation bar of the Integration page, select a project (in Dev-Prod mode, you must also select an environment).
In the navigation pane on the left, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop to open its configuration page.
Click Component Library in the upper-right corner of the page to open the Component Library panel.
In the navigation pane on the left of the Component Library panel, select Outputs. In the output component list on the right, find the Amazon RDS for SQL Server component and drag it to the canvas.
Click and drag the icon of the target input, transform, or flow component to connect it to the current Amazon RDS for SQL Server output component.
Click the icon on the Amazon RDS for SQL Server output component to open the Amazon RDS for SQL Server Output Configuration dialog box.
In the Amazon RDS for SQL Server Output Configuration dialog box, configure the parameters.
Parameter
Description
Basic Settings
Step Name
The name of the Amazon RDS for SQL Server output component. Dataphin automatically generates a step name, which you can modify based on your business scenario. The name must meet the following requirements:
It can contain only Chinese characters, letters, underscores (_), and digits.
It can be up to 64 characters in length.
Datasource
The data source drop-down list displays all Amazon RDS for SQL Server data sources, including those for which you have write-through permission and those for which you do not. Click the icon to copy the name of the current data source.
For data sources for which you do not have write-through permission, click Request next to the data source to request write-through permission. For more information, see Request, renew, and return data source permissions.
If you do not have an Amazon RDS for SQL Server data source, click Create Data Source to create one. For more information, see Create an Amazon RDS for SQL Server data source.
Schema (optional)
Select the schema where the target table is located. You can select tables across schemas. If you do not specify a schema, the schema configured in the data source is used by default.
Table
Select the target table for the output data. You can enter a keyword to search for tables, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks the table status. Click the icon to copy the name of the selected table.
Loading Policy
Select the strategy for writing data to the target table. The following loading policies are supported:
Append Data (INSERT INTO): If a primary key or constraint conflict occurs, the conflicting record is reported as dirty data.
Update On Primary Key Conflict (MERGE INTO): If a primary key or constraint conflict occurs, the mapped fields of the existing record are updated.
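The difference between the two policies can be sketched with an in-memory table keyed by primary key. This is illustrative only; the real behavior is implemented by the database via INSERT INTO and MERGE INTO.

```python
# Illustrative sketch of the two loading policies' semantics.
table = {1: {"id": 1, "name": "old"}}

def append_data(table, row):
    # INSERT INTO semantics: a primary key conflict is reported as dirty data.
    if row["id"] in table:
        raise ValueError("dirty data: primary key conflict")
    table[row["id"]] = row

def update_on_conflict(table, row):
    # MERGE INTO semantics: on conflict, the mapped fields of the existing
    # record are updated; otherwise the row is inserted.
    table.setdefault(row["id"], {}).update(row)

update_on_conflict(table, {"id": 1, "name": "new"})
print(table[1]["name"])  # → new
```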
Synchronous Write
The primary key update syntax is not an atomic operation. If the data to be written contains duplicate primary keys, enable synchronous write; otherwise, parallel write is used. Synchronous write performance is lower than that of parallel write.
Note: This option can be configured only when Loading Policy is set to Update On Primary Key Conflict.
Batch Write Size (optional)
The maximum amount of data to write in one batch. You can also set Batch Write Records; the system writes a batch when either limit is reached. The default value is 32 MB.
Batch Write Records (optional)
The default value is 2048 records. Data is synchronized and written to the destination in batches, controlled by two parameters: Batch Write Records and Batch Write Size.
When the accumulated data reaches either limit (the batch write size or the batch write records), the system considers the batch full and immediately writes it to the destination in a single operation.
A batch write size of 32 MB is recommended. Adjust the batch write records limit based on the actual size of a single record to fully utilize batch writing. For example, if a single record is about 1 KB and the batch write size is set to 16 MB, set the batch write records to a value greater than 16 MB divided by 1 KB (that is, greater than 16,384 records), such as 20,000 records. With this configuration, batch writing is triggered by the batch write size: a write operation is performed whenever the accumulated data reaches 16 MB.
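The trigger logic in the example above can be sketched as follows. The limits and per-record size are the illustrative values from the example, not defaults.

```python
# A batch is flushed when either the accumulated size or the record count
# reaches its limit. Values below match the 16 MB / 1 KB example above.
BATCH_SIZE_LIMIT = 16 * 1024 * 1024   # 16 MB
BATCH_RECORD_LIMIT = 20000            # records
RECORD_SIZE = 1024                    # ~1 KB per record

records, size, flushes = 0, 0, 0
for _ in range(40000):                # simulate 40,000 incoming records
    records += 1
    size += RECORD_SIZE
    if size >= BATCH_SIZE_LIMIT or records >= BATCH_RECORD_LIMIT:
        flushes += 1                  # write this batch to the destination
        records, size = 0, 0

# 16 MB / 1 KB = 16,384 records fills the size limit before the
# 20,000-record limit, so the size limit triggers each flush.
print(flushes)  # → 2
```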
Pre-statement (optional)
The SQL script to be executed on the database before data import.
For example, to keep a service continuously available, a pre-statement can create a staging table Target_A before the current step writes data, so that data is written to Target_A instead of the serving table. After the current step finishes writing, post-statements rename the continuously serving table Service_B to Temp_C, rename Target_A to Service_B, and finally delete Temp_C.
Post-statement (optional)
The SQL script to be executed on the database after data import.
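The table-swap pattern described above might be expressed as T-SQL statements pasted into the Pre-statement and Post-statement fields. The table names Target_A, Service_B, and Temp_C are hypothetical, as in the example.

```python
# Hypothetical T-SQL for the swap pattern, grouped by configuration field.
pre_statements = [
    # Create an empty staging table with the serving table's structure.
    "SELECT * INTO Target_A FROM Service_B WHERE 1 = 0;",
]
post_statements = [
    # Swap the freshly loaded table into service, then drop the old copy.
    "EXEC sp_rename 'Service_B', 'Temp_C';",
    "EXEC sp_rename 'Target_A', 'Service_B';",
    "DROP TABLE Temp_C;",
]
print(len(pre_statements) + len(post_statements))  # → 4
```

The order of the post-statements matters: the serving table must be moved aside before the staging table can take its name.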
Field Mapping
Input Fields
Displays the input fields based on the output of the upstream component.
Output Fields
Displays the output fields. You can perform the following operations:
Field management: Click Field Management to select output fields.

Click the icon to move fields from Selected Input Fields to Unselected Input Fields.
Click the icon to move fields from Unselected Input Fields to Selected Input Fields.
Batch add: Click Batch Add to configure in JSON, TEXT, or DDL format.
Batch configuration in JSON format, for example:
// Example: [{"name":"id","type":"String"}, {"name":"aaasa","type":"String"}, {"name":"creator","type":"String"}, {"name":"modifier","type":"String"}, {"name":"creator_nickname","type":"String"}, {"name":"modifier_nickname","type":"String"}, {"name":"create_time","type":"Date"}, {"name":"modify_time","type":"Date"}, {"name":"qbi_system_upload_id","type":"Long"}]
Note: name indicates the name of the imported field, and type indicates the field type after import. For example, "name":"user_id","type":"String" means importing a field named user_id and setting its type to String.
Batch configuration in TEXT format, for example:
// Example: id,String aaasa,String creator,String modifier,String creator_nickname,String modifier_nickname,String create_time,Date modify_time,Date qbi_system_upload_id,Long
The row delimiter separates the information of each field. The default is a line feed (\n). Supported delimiters include the line feed (\n), semicolon (;), and period (.).
The column delimiter separates the field name and the field type. The default is a comma (,).
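As a minimal sketch (not Dataphin's actual parser), the TEXT format can be split into (field name, field type) pairs using the default row delimiter and column delimiter:

```python
# Split a TEXT-format field list into (name, type) pairs.
def parse_text_fields(text, row_delim="\n", col_delim=","):
    fields = []
    for row in text.split(row_delim):
        row = row.strip()
        if not row:
            continue  # skip blank rows
        name, type_ = row.split(col_delim, 1)
        fields.append((name.strip(), type_.strip()))
    return fields

sample = "id,String\ncreate_time,Date\nqbi_system_upload_id,Long"
print(parse_text_fields(sample))
# → [('id', 'String'), ('create_time', 'Date'), ('qbi_system_upload_id', 'Long')]
```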
Batch configuration in DDL format, for example:
CREATE TABLE tablename ( id INT PRIMARY KEY, name VARCHAR(50), age INT );
Create a new output field: Click +Create Output Field, enter the Column name and select the Type as prompted. After completing the configuration for the current row, click the icon to save.
Quick Mapping
Based on the upstream input fields and the target table fields, you can quickly configure field mappings. Quick Mapping includes Same Name Mapping and Same Row Mapping.
Same Name Mapping: maps fields that have the same name.
Same Row Mapping: maps fields in the same row. Use this when the field names in the source and target tables differ but the data in corresponding rows needs to be mapped.
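The two strategies can be sketched as follows (assumed behavior for illustration, not Dataphin internals):

```python
# Same Name Mapping: pair fields whose names match, regardless of position.
def same_name_mapping(input_fields, output_fields):
    targets = set(output_fields)
    return [(f, f) for f in input_fields if f in targets]

# Same Row Mapping: pair fields positionally, row by row.
def same_row_mapping(input_fields, output_fields):
    return list(zip(input_fields, output_fields))

inputs = ["id", "user_name", "gmt_create"]
outputs = ["id", "name", "gmt_create"]
print(same_name_mapping(inputs, outputs))
# → [('id', 'id'), ('gmt_create', 'gmt_create')]
print(same_row_mapping(inputs, outputs))
# → [('id', 'id'), ('user_name', 'name'), ('gmt_create', 'gmt_create')]
```

Note that same-row mapping silently pairs `user_name` with `name`, which is exactly the case it is designed for.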
Click OK to complete the property configuration of the Amazon RDS for SQL Server output component.