The Microsoft SQL Server input component retrieves data from a Microsoft SQL Server data source. To sync data from a Microsoft SQL Server data source to another data source, first configure the Microsoft SQL Server input component to read from the source. Then configure the destination data source for the sync. This topic describes how to configure the Microsoft SQL Server input component.
Prerequisites
You have created a Microsoft SQL Server data source. For more information, see Create a Microsoft SQL Server Data Source.
The account used to configure the Microsoft SQL Server input component must have sync-read permission on the data source. If you do not have this permission, request it. For more information, see Request, Renew, or Release Data Source Permissions.
Procedure
On the Dataphin homepage, in the top menu bar, click Develop, and then click Data Integration.
On the Integration page, select a Project from the top menu bar. In Dev-Prod mode, also select an environment.
In the left navigation pane, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop. The offline pipeline configuration page opens.
In the upper-right corner of the page, click Component Library to open the Component Library panel.
In the left navigation pane of the Component Library panel, click Input. In the input component list on the right, locate the Microsoft SQL Server component and drag it onto the canvas.
You can click the
icon in the Microsoft SQL Server input component card to open the Microsoft SQL Server Input Configuration dialog box.In the Microsoft SQL Server Input Configuration dialog box, configure the following parameters.
Parameter
Description
Step Name
The name of the Microsoft SQL Server input component. Dataphin generates a step name automatically. You can change it based on your business scenario. Naming rules:
Use only Chinese characters, letters, underscores (_), and digits.
Keep the name up to 64 characters long.
Datasource
The drop-down list shows all Microsoft SQL Server data sources. This includes data sources for which you have sync-read permission and those for which you do not. Click the
icon to copy the current data source name.If you do not have sync-read permission for a data source, click Request next to the data source to request sync-read permission. For more information, see Request, Renew, or Release Data Source Permissions.
If you do not have a Microsoft SQL Server data source, click Create Data Source to create one. For more information, see Create a Microsoft SQL Server Data Source.
Time Zone
Dataphin processes time-formatted data based on the current time zone. By default, this matches the time zone configured for the selected data source. You cannot change this setting.
NoteFor tasks created before version V5.1.2, you can choose Data Source Default Configuration or Channel Configuration Time Zone. The default is Channel Configuration Time Zone.
Data Source Default Configuration: The default time zone of the selected data source.
Channel Configuration Time Zone: The time zone configured for the current integration task under Properties > Channel Configuration.
Schema (Optional)
Select the schema where the table resides. This supports cross-schema table selection. If you do not specify a schema, Dataphin uses the schema configured for the data source.
Source Table Count
Select the number of source tables. Options are Single Table and Multiple Tables:
Single Table: Use this when syncing business data from one source table to one destination table.
Multiple Tables: Use this when syncing business data from multiple source tables to one destination table. When writing data from multiple tables into one destination table, Dataphin uses the union algorithm.
Table Matching Method
Choose Generic Rule or Database Regex.
NoteThis option is available only when Source Table Count is set to Multiple Tables.
Table
Select the source table:
If Source Table Count is set to Single Table, enter a keyword to search for the table name or enter the exact table name and click Exact Search. After selecting a table, Dataphin automatically checks its status. Click the
icon to copy the selected table name.If Source Table Count is set to Multiple Tables, enter an expression based on the selected table matching method.
If you chose Generic Rule: Enter an expression in the field to filter tables with the same structure. Supported formats include enumeration, regex-like patterns, and combinations. Example:
table_[001-100];table_102;.If you chose Database Regex: Enter a regular expression supported by your database. Dataphin matches tables in the destination database using this expression. At runtime, Dataphin dynamically matches new tables based on the regex.
After entering the expression, click Exact Search. In the Confirm Match Details dialog box, review the list of matched tables.
Shard Key (Optional)
Dataphin partitions data based on the shard key column you specify. Use this with concurrency settings to enable concurrent reads. You can use any column from the source table as the shard key. For best performance, use a primary key or an indexed column.
ImportantIf you select a date-time type, Dataphin performs a brute-force split based on the full time range and concurrency setting. This does not guarantee even distribution.
Batch Read Size (Optional)
The number of records to read at once. Instead of reading records one by one, configure a batch size such as 1024. This reduces interactions with the data source, improves I/O efficiency, and lowers network latency.
Input Filter (Optional)
Enter filter conditions for input fields. For example:
ds=${bizdate}. Use Input Filter for these scenarios:A fixed subset of data.
Parameter-based filtering.
Hint (Optional)
Table hints influence how the query optimizer accesses tables or views. By adding specific hints to a query statement, you force the database management system to use a particular execution plan instead of relying on default optimization strategies.
Add hints after the table name in the FROM clause, inside parentheses, and separate multiple hints with commas. Example:
SELECT * FROM MyTable WITH (INDEX(IX_MyIndex), NOLOCK)
Output Fields
This section lists all fields from the selected table and filtered by your conditions. You can perform the following actions:
Field Management: Remove fields you do not need to send downstream:
Remove individual fields: Click the
icon in the Actions column to delete extra fields.Deleting fields in batches: Click Field Management. In the Field Management dialog box, select multiple fields and click the
left-shift icon to move the selected input fields to the unselected input fields list. Then, click OK.
Batch Add: Click Batch Add to add fields in JSON, TEXT, or DDL format.
NoteAfter clicking OK, the batch-added fields overwrite existing field configurations.
JSON format example:
// Example: [ { "index": 1, "name": "Id", "type": "int(10)", "mapType": "Long", "comment": "" }, { "index": 2, "name": "Data", "type": "nvarchar(100)", "mapType": "String", "comment": "" } ]Noteindex represents the column number of the specified object, name represents the field name of the imported field, and type represents the field type of the imported field. For example,
"index":3,"name":"user_id","type":"String"means importing the fourth column from the file, with the field name user_id and field type String.Configure multiple items at once in TEXT format, for example:
// Example: 1,Id,int(10),Long, 2,Data,nvarchar(100),String, 3,RowVersion,timestamp,Bytes,The row delimiter separates field entries. The default is a line feed (\n). You can also use semicolons (;) or periods (.).
The column delimiter separates field names and types. The default is a comma (,). Supported delimiters include
','. Field types are optional and default to','.
You can configure in batch using DDL format, for example:
CREATE TABLE tablename ( user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR (255), created_on TIMESTAMP, );
Create a New Output Field: Click + Create Output Field. Enter values for Column, Type, and Comment. Select a Mapping Type. Click the
icon to save the row.
Click OK to complete the configuration of the Microsoft SQL Server input component.