The GBase 8a input component retrieves data from a GBase 8a data source. To sync data from a GBase 8a data source to another data source, first configure the GBase 8a input component to read from the source. Then configure the target data source for the sync. This topic explains how to configure the GBase 8a input component.
Prerequisites
Create a GBase 8a data source. For instructions, see Create a GBase 8a Data Source.
The account used to configure the GBase 8a input component must have sync-read permission on the data source. If you do not have this permission, request it. For details, see Request Data Source Permissions.
Procedure
On the Dataphin homepage, in the top menu bar, click Develop, and then click Data Integration.
On the Integration page, in the top menu bar, select Project. In Dev-Prod mode, also select an environment.
In the left navigation pane, click Offline Integration. In the Offline Integration list, click the offline pipeline that you want to develop to open its configuration page.
In the upper-right corner of the page, click Component Library to open the Component Library panel.
In the left navigation pane of the Component Library panel, select Input. In the list of input components on the right, locate the GBase 8a component and drag it onto the canvas.
Click the icon in the GBase 8a Input widget card to open the GBase 8a Input Configuration dialog box.
In the GBase 8a Input Configuration dialog box, configure the parameters.
Parameter
Description
Step Name
The name of the GBase 8a input component. Dataphin generates a step name automatically. You can change it based on your business scenario. Use these naming rules:
Use only Chinese characters, letters, underscores (_), and digits.
Keep the name under 64 characters.
Datasource
The drop-down list shows all GBase 8a data sources in Dataphin, including those for which you do not have sync-read permission. Click the icon to copy the current data source name.
If you do not have sync-read permission for a data source, request it. For steps, see Request Data Source Permissions.
If you do not have a GBase 8a data source yet, click Create Data Source to create one. For details, see Create a GBase 8a Data Source.
Source Table Count
Select the number of source tables. Options are Single Table and Multiple Tables:
Single Table: Use this option when syncing business data from one table to one target table.
Multiple Tables: Use this option when syncing business data from multiple tables to one target table. When writing data from multiple tables into one table, the system uses the union algorithm.
Table Matching Method
You can only select Generic Rule.
Note: This setting is available only when you select Multiple Tables for Source Table Count.
Table
Select the source table:
If you selected Single Table for Source Table Count, search for the table by entering a keyword from its name, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks its status. Click the icon to copy the name of the selected table.
If you selected Multiple Tables for Source Table Count, add tables as follows:
In the input box, enter an expression to filter tables with the same structure. The system supports enumerated, regex-like, and mixed formats. For example: table_[001-100];table_102.
Click Exact Match. In the Confirm Match Details dialog box, review the list of matched tables.
Click Confirm.
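To make the expression format concrete, the following Python sketch expands an expression such as table_[001-100];table_102 into table names. It assumes that a semicolon separates entries and that [001-100] denotes an inclusive, zero-padded numeric range. Dataphin's actual matching rules may differ, and expand_table_expression is a hypothetical helper, not a Dataphin API. The sketch handles at most one range per entry.

```python
import re

# Hypothetical helper, not part of Dataphin.
# Assumptions: ';' separates entries, and '[start-end]' is an inclusive
# numeric range zero-padded to the width of 'start'.
RANGE = re.compile(r"\[(\d+)-(\d+)\]")

def expand_table_expression(expr: str) -> list[str]:
    names = []
    for part in filter(None, (p.strip() for p in expr.split(";"))):
        m = RANGE.search(part)
        if m is None:
            names.append(part)  # enumerated name, used as-is
            continue
        start, end = m.group(1), m.group(2)
        width = len(start)
        for i in range(int(start), int(end) + 1):
            # Replace the bracketed range with the padded number.
            names.append(part[:m.start()] + str(i).zfill(width) + part[m.end():])
    return names

tables = expand_table_expression("table_[001-100];table_102")
print(len(tables), tables[0], tables[99], tables[-1])
# prints: 101 table_001 table_100 table_102
```

Under these assumptions the example expression matches 101 tables: table_001 through table_100, plus table_102.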
Shard Key (Optional)
The system splits data using the configured shard key field. Use this with concurrency settings to enable concurrent reads. You can use any column from the source table as the shard key. For best performance, use a primary key or an indexed column.
Important: If you select a date-time column as the shard key, the system splits the full time range by brute force, based on the maximum and minimum values and the concurrency setting. The resulting splits are not guaranteed to be even.
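The range-based splitting described above can be sketched as follows. This is an illustration of dividing a [min, max] interval into one slice per concurrent reader, not Dataphin's implementation. The split_range function and the sample values are hypothetical. The sketch also shows why a split can be uneven: the slices have equal width, but the rows need not be spread evenly across them.

```python
def split_range(lo: int, hi: int, concurrency: int) -> list[tuple[int, int]]:
    """Split the inclusive range [lo, hi] into at most `concurrency` inclusive slices."""
    if concurrency < 1 or lo > hi:
        raise ValueError("need concurrency >= 1 and lo <= hi")
    total = hi - lo + 1
    parts = min(concurrency, total)     # never create empty slices
    base, extra = divmod(total, parts)  # the first `extra` slices get one more value
    slices, start = [], lo
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        slices.append((start, start + size - 1))
        start += size
    return slices

# Hypothetical shard key values, clustered near the low end of the range.
ids = [1, 2, 3, 4, 5, 6, 7, 8, 900, 1000]
slices = split_range(min(ids), max(ids), 4)
counts = [sum(lo <= v <= hi for v in ids) for lo, hi in slices]
print(slices)  # [(1, 250), (251, 500), (501, 750), (751, 1000)]
print(counts)  # [8, 0, 0, 2]
```

Here two of the four readers receive no rows at all, which is the kind of skew the note warns about. A densely populated key, such as an auto-increment primary key, tends to split more evenly.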
Batch Read Size (Optional)
The number of records to read at a time. Configure a batch size, such as 1024 records, to reduce the number of round trips to the source database. This improves I/O efficiency and reduces the impact of network latency.
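The effect of a batch size can be illustrated with Python's DB-API, where fetchmany(size) returns up to size rows per call. The sketch below uses the standard-library sqlite3 module with an in-memory table as a stand-in for the source database. It does not connect to GBase 8a, and the table name and row count are made up. Because the database is in memory, it shows only how the batch size determines the number of fetch calls, not any network effect.

```python
import sqlite3

def read_in_batches(cursor, batch_size):
    """Yield lists of up to `batch_size` rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows

# In-memory stand-in for a source table (hypothetical name and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, ds TEXT)")
conn.executemany(
    "INSERT INTO src (id, ds) VALUES (?, ?)",
    [(i, "20210101") for i in range(1, 2501)],
)

cur = conn.execute("SELECT id, ds FROM src ORDER BY id")
batch_sizes = [len(batch) for batch in read_in_batches(cur, 1024)]
print(batch_sizes)  # [1024, 1024, 452]
conn.close()
```

With 2,500 rows and a batch size of 1024, the reader makes three fetches that return rows instead of 2,500 single-row fetches.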
Input Filter (Optional)
Set conditions to filter the data to extract. Examples:
Use a static value. For example: ds=20210101.
Use a variable parameter. For example: ds=${bizdate}.
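As a rough illustration of how a variable parameter such as ${bizdate} might resolve, the sketch below substitutes a value into the filter with Python's string.Template, which happens to use the same ${name} syntax. It assumes that bizdate is the day before the run date in yyyyMMdd format. That convention and the render_filter helper are assumptions for this example. Dataphin performs its own substitution at run time.

```python
from datetime import date, timedelta
from string import Template

def render_filter(filter_expr: str, run_date: date) -> str:
    """Hypothetical helper: fill ${bizdate} in a filter expression."""
    # Assumption: bizdate = run date minus one day, formatted as yyyyMMdd.
    bizdate = (run_date - timedelta(days=1)).strftime("%Y%m%d")
    return Template(filter_expr).substitute(bizdate=bizdate)

rendered = render_filter("ds=${bizdate}", date(2021, 1, 2))
print(rendered)  # ds=20210101
```

A static filter such as ds=20210101 contains no placeholder and passes through unchanged.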
Output Fields
This section lists all fields from the selected table and matching filter conditions. You can:
Manage fields: Remove fields you do not need downstream:
Remove one field: Click the icon in the Actions column to delete a single field.
Remove multiple fields: Click Field Management. In the Field Management dialog box, select multiple fields, click the left-shift icon to move them from the Selected list to the Unselected list, and then click OK to complete the bulk removal.
Batch add: Click Batch Add to configure output fields in JSON, TEXT, or DDL format.
Note: After you click OK, the batch-added fields overwrite any existing field configurations.
JSON format example:
// Example:
[
  { "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" },
  { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }
]
Note: index indicates the column number of the specified object, name indicates the field name after import, and type indicates the field type after import. For example, "index":3,"name":"user_id","type":"String" imports the fourth column in the file, with the field name user_id and the field type String.
TEXT format example:
// Example:
1,id,int(10),Long,comment1
2,user_name,varchar(255),String,comment2
The row delimiter separates field entries. The default is a line feed (\n). A semicolon (;) or period (.) is also supported.
The column delimiter separates the field name from the field type. The default is a comma (,). You can also use ','. The field type is optional and defaults to ','.
DDL format example:
CREATE TABLE tablename (
  user_id serial,
  username VARCHAR(50),
  password VARCHAR(50),
  email VARCHAR(255),
  created_on TIMESTAMP
);
Add a new output field: Click + Add Output Field. Enter the Column, Type, and Comment, and select a Mapping Type. Click the icon to save the row.
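The JSON and TEXT formats for Batch Add carry the same information, which the following sketch makes explicit by converting TEXT-style rows into the JSON structure. It assumes each row holds index, name, type, mapType, and comment in that order, separated by commas, with one row per line. The text_to_fields function is a hypothetical helper, not a Dataphin API, and the sample rows are illustrative.

```python
import json

# Assumed column order for a TEXT row; the key names follow the JSON example.
KEYS = ("index", "name", "type", "mapType", "comment")

def text_to_fields(text: str, row_sep: str = "\n", col_sep: str = ",") -> list[dict]:
    fields = []
    for row in text.split(row_sep):
        row = row.strip()
        if not row:
            continue  # skip blank lines
        # Split into at most five parts so that commas in the comment survive.
        values = [v.strip() for v in row.split(col_sep, len(KEYS) - 1)]
        field = dict(zip(KEYS, values))
        field["index"] = int(field["index"])
        fields.append(field)
    return fields

sample = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
fields = text_to_fields(sample)
print(json.dumps(fields, indent=2))
```

A type that itself contains the column delimiter, such as decimal(10,2), would be split incorrectly by this simple approach, so a different column delimiter would be needed in that case.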
Click OK to finish configuring the GBase 8a Input Component.