How to configure the GoldenDB input component - Dataphin

The GoldenDB input component reads data from a GoldenDB data source. To sync data from GoldenDB to another data source, configure the GoldenDB input component first, and then configure the target data source.

Prerequisites

You have created a GoldenDB data source. For instructions, see Create a GoldenDB data source.
The account used to configure the GoldenDB input component must have read-through permission on the data source. If the account lacks this permission, request it. For details, see Request, renew, or release data source permissions.

Procedure

On the Dataphin homepage, in the top menu bar, click Develop, and then click Data Integration.
On the Integration page, select a Project in the top menu bar. In Dev-Prod mode, also select an environment.
In the navigation pane on the left, click Offline Integration, and then click the offline pipeline that you want to develop in the Offline Integration list to open its configuration page.
In the upper-right corner of the page, click Component Library to open the Component Library panel.
In the left navigation pane of the Component Library panel, click Input. In the input component list on the right, locate the GoldenDB component and drag it onto the canvas.
Click the icon in the GoldenDB input component card to open the GoldenDB Input Configuration dialog box.

In the GoldenDB Input Configuration dialog box, configure the parameters.

Parameter	Description
Step Name	The name of the GoldenDB input component. Dataphin generates a default name that you can change. Naming rules: Use only Chinese characters, letters, underscores (_), and digits. Keep the name to 64 characters or fewer.
Datasource	Lists all GoldenDB data sources in Dataphin, including those you have read-through permission for and those you do not. Click the icon to copy the data source name. If you lack read-through permission for a data source, click Request next to the data source to request permission. For steps, see Request data source permissions. If you do not have a GoldenDB data source, click Create Data Source to create one. For details, see Create a GoldenDB data source.
Source Table Count	Select the number of source tables. Single Table: syncs business data from one source table to one target table. Multiple Tables: syncs business data from multiple source tables to one target table; the system merges the data from the source tables with a union.
Table Matching Method	Only Generic Rule can be selected. Note: This setting is available only when you select Multiple Tables for Source Table Count.
Table	Select the source table. If you selected Single Table for Source Table Count, enter a keyword in the table name field to search, or enter the exact table name and click Exact Match. After you select a table, the system detects its status automatically. Click the icon to copy the selected table name. If you selected Multiple Tables for Source Table Count, add tables as follows: (1) In the input field, enter a table expression to filter for tables with the same structure. The system supports enumerated, regex-like, and mixed expressions, for example `table_[001-100];table_102`. (2) Click Exact Match, and review the list of matched tables in the Confirm Match Details dialog box. (3) Click Confirm.
Split Key (Optional)	Splits data based on the column you specify. Use this with the concurrency settings to enable parallel reads. Any column from the source table can serve as the split key. For best performance, use a primary key or an indexed column. Important: If you select a column of a date-time type, the system performs brute-force splitting across the full time range between the minimum and maximum values. This method does not guarantee even distribution.
Batch Read Size (Optional)	The number of records to read per batch. Setting a batch size (for example, 1024) reduces the number of round trips to the data source, which improves I/O efficiency and reduces the overhead of network latency.
Input Filter (Optional)	A condition that filters the data to extract. You can use either of the following: Static value: extracts the data that matches a fixed value, for example `ds=20210101`. Variable parameter: extracts the subset of data that matches a parameter value, for example `ds=${bizdate}`.
Output Fields	Lists all fields from the selected table and filtered results. You can do the following. Manage fields: remove fields that you do not need downstream. To remove one field, click the icon in the Actions column of that field. To remove multiple fields, click Field Management, select the fields in the Field Management dialog box, click the left-shift icon to move them from the Selected list to the Unselected list, and then click OK. Batch add: click Batch Add to add fields in JSON, TEXT, or DDL format. Note: After you batch-add fields and click OK, the new configuration overwrites any existing field configuration. JSON format example: `[{ "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" }, { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }]`. Here `index` specifies the column number, `name` specifies the field name after import, and `type` specifies the field type after import. For example, `"index":3,"name":"user_id","type":"String"` means that the fourth column of the file is imported with the field name `user_id` and the field type `String`. TEXT format example, with one field per row: `1,id,int(10),Long,comment1` and `2,user_name,varchar(255),String,comment2`. The row delimiter separates each field's information; the default is a line feed (\n), and a semicolon (;) or period (.) is also supported. The column delimiter separates field names from field types; the default is a comma (,), and `','` is also supported. Field types are optional. If omitted, the default is `','`. DDL format example: `CREATE TABLE tablename ( user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR(255), created_on TIMESTAMP );`. Add a new output field: click + Add Output Field, fill in Column, Type, and Comment, select a Mapping Type, and then click the icon to save the row.
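
The Batch Add text described in the Output Fields row can be generated rather than typed by hand. The following is a minimal Python sketch, not a Dataphin API: the `Field` record and the helper names `to_batch_json` and `to_batch_text` are hypothetical, and the script only produces text in the JSON and TEXT layouts of the examples above so that it can be pasted into the Batch Add dialog box.

```python
import json
from dataclasses import dataclass


@dataclass
class Field:
    """One output field (hypothetical record; attributes mirror the JSON keys)."""
    index: int
    name: str
    type: str
    map_type: str
    comment: str = ""


def to_batch_json(fields):
    """Render the fields as a JSON array in the layout of the JSON example."""
    rows = [
        {"index": f.index, "name": f.name, "type": f.type,
         "mapType": f.map_type, "comment": f.comment}
        for f in fields
    ]
    return json.dumps(rows, indent=2)


def to_batch_text(fields, row_delimiter="\n", column_delimiter=","):
    """Render the fields as TEXT: one field per row, columns in the order
    index, name, type, mapping type, comment (as in the TEXT example)."""
    return row_delimiter.join(
        column_delimiter.join([str(f.index), f.name, f.type, f.map_type, f.comment])
        for f in fields
    )


fields = [
    Field(1, "id", "int(10)", "Long", "comment1"),
    Field(2, "user_name", "varchar(255)", "String", "comment2"),
]
print(to_batch_json(fields))
print(to_batch_text(fields))
```

Passing `row_delimiter=";"` to `to_batch_text` produces the semicolon-delimited variant that the row delimiter description mentions. Because a batch add overwrites the existing field configuration, the list passed in should contain every field you want to keep.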

Click OK to finish configuring the GoldenDB input component.
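
When you use Multiple Tables, it can help to preview which table names an expression such as `table_[001-100];table_102` is expected to cover before you compare them with the Confirm Match Details dialog box. The following Python sketch is an illustration only. It assumes that `;` separates alternatives and that `[start-end]` enumerates an inclusive, zero-padded numeric range; these semantics and the function name `expand_table_expression` are assumptions, not the documented behavior of Dataphin's matcher.

```python
import re

# Matches a numeric range such as [001-100].
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_table_expression(expression):
    """Expand an expression like 'table_[001-100];table_102' into table names.

    Assumed semantics (not confirmed by Dataphin): ';' separates alternatives,
    and [start-end] is an inclusive range padded to the width of 'start'.
    Only the first range in each alternative is expanded.
    """
    names = []
    for part in expression.split(";"):
        part = part.strip()
        if not part:
            continue
        match = _RANGE.search(part)
        if match is None:
            names.append(part)  # plain enumerated table name
            continue
        start, end = match.group(1), match.group(2)
        width = len(start)
        prefix, suffix = part[:match.start()], part[match.end():]
        for n in range(int(start), int(end) + 1):
            names.append(prefix + str(n).zfill(width) + suffix)
    return names


tables = expand_table_expression("table_[001-100];table_102")
print(len(tables), tables[0], tables[-1])
```

Under these assumptions the example expression covers `table_001` through `table_100` plus `table_102`. The list of matched tables shown by Dataphin remains the authoritative result.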