The KingbaseES input component is designed to read data from a KingbaseES data source. When synchronizing data from KingbaseES to other data sources, it's necessary to configure the KingbaseES input component to read the data source, followed by setting up the target data source for synchronization. This topic describes the steps to configure the KingbaseES input component.
Prerequisites
A KingbaseES data source has been created. For more information, see Create KingbaseES Data Source.
To configure the properties of the KingbaseES input component, the account must have read-through permission for the data source. If permission is lacking, you need to request access to the data source. For more information, see Request, Renew, and Return Data Source Permissions.
Procedure
On the Dataphin home page, navigate to the top menu bar and select Development > Data Integration.
In the top menu bar of the Data Integration page, select a project. In Dev-Prod mode, also select the environment.
In the left-side navigation pane, click Batch Pipeline. From the Batch Pipeline list, select the offline pipeline you want to develop to access its configuration page.
Click Component Library in the upper-right corner of the page to open the Component Library panel.
In the Component Library panel's left-side navigation pane, select Input. Locate the KingbaseES component in the list on the right and drag it onto the canvas.
To configure the KingbaseES input component, click the icon on the component card to open the KingbaseES Input Configuration dialog box.
In the KingbaseES Input Configuration dialog box, set the necessary parameters.
Parameter
Description
Step Name
This is the name of the KingbaseES input component. Dataphin automatically generates the step name, and you can also modify it according to the business scenario. The naming convention is as follows:
Can only contain Chinese characters, letters, underscores (_), and numbers.
Cannot exceed 64 characters.
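The naming convention above can be checked before a step name is entered. The following is a minimal sketch, not Dataphin's own validation: the function name is hypothetical, and "Chinese characters" is assumed to mean the CJK Unified Ideographs range U+4E00 to U+9FFF.

```python
import re

# Assumption: "Chinese characters" is approximated by U+4E00-U+9FFF.
# Allowed: Chinese characters, ASCII letters, digits, underscores; 1 to 64 characters.
_STEP_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if `name` satisfies the documented step-name rules."""
    return _STEP_NAME_RE.fullmatch(name) is not None
```

For example, is_valid_step_name("kingbase_input_1") passes, while a name containing a hyphen or longer than 64 characters is rejected.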
Datasource
The data source drop-down list displays all KingbaseES data sources in the current Dataphin instance, including data sources for which you have read-through permission and those for which you do not. Click the icon to copy the current data source name.
For data sources for which you do not have read-through permission, you can request read permission for the corresponding data source. For specific operations to request data source read permission, see Request, Renew, and Return Data Source Permissions.
If you do not have a KingbaseES type data source, click Create Data Source to create a data source. For more information, see Create KingbaseES Data Source.
Source Table Quantity
Select the source table quantity. The source table quantity includes Single Table and Multiple Tables:
Single Table: Suitable for scenarios where business data from one table is synchronized to one target table.
Multiple Tables: Suitable for scenarios where business data from multiple tables is synchronized to the same target table. When data from multiple tables is written to the same data table, the union algorithm is used.
Table
Select the source table:
If Source Table Quantity is set to Single Table, you can enter a table name keyword to search or enter the exact table name and then click Precise Search. After selecting the table, the system will automatically detect the table status. Click the icon to copy the name of the currently selected table.
If Source Table Quantity is set to Multiple Tables, perform the following operations to add tables.
In the input box, enter the expression of the table to filter tables with the same structure.
The system supports the enumeration form, the regex-like form, and a mix of the two. For example, table_[001-100];table_102.
Click Precise Search to view the list of matched tables in the Confirm Match Details dialog box.
Click Confirm.
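To make the example expression table_[001-100];table_102 concrete, the sketch below expands such an expression into table names. It is only an illustration of how the expression might be read, not Dataphin's matching logic. Assumptions: semicolons separate entries, a bracketed pair [start-end] is an inclusive numeric range, zero padding follows the width of the start bound, and each entry has at most one range. The function name is hypothetical.

```python
import re
from typing import List

# Matches a numeric range such as [001-100].
_RANGE_RE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_table_expression(expr: str) -> List[str]:
    """Expand an expression like 'table_[001-100];table_102' into table names."""
    names: List[str] = []
    for part in expr.split(";"):
        part = part.strip()
        if not part:
            continue
        match = _RANGE_RE.search(part)
        if match is None:
            # Plain enumeration entry: use the name as-is.
            names.append(part)
            continue
        start_text, end_text = match.group(1), match.group(2)
        width = len(start_text)  # assumed: padding width comes from the start bound
        prefix, suffix = part[: match.start()], part[match.end():]
        for number in range(int(start_text), int(end_text) + 1):
            names.append(prefix + str(number).zfill(width) + suffix)
    return names


tables = expand_table_expression("table_[001-100];table_102")
```

Under these assumptions the example expression yields 101 names: table_001 through table_100, plus table_102.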
Shard Key (Optional)
The system performs data sharding based on the configured shard key field, which can be used in conjunction with concurrency configuration to achieve concurrent reading. It supports using a column in the source data table as the shard key. Additionally, it is recommended to use a primary key or an indexed column as the shard key to ensure transmission performance.
Important: When a date-time column is selected as the shard key, the system identifies the maximum and minimum values and performs brute-force sharding based on the total time range and the concurrency. An even distribution of records across shards is not guaranteed.
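The sketch below illustrates why range-based sharding on a minimum and maximum value does not guarantee evenly sized shards. It is a simplified model and not Dataphin's actual algorithm: the function name is hypothetical, and shard key values (for example, timestamps converted to epoch seconds) are treated as integers.

```python
from typing import List, Tuple


def split_range(lo: int, hi: int, concurrency: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [lo, hi] into contiguous, near-equal-width slices."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")
    total = hi - lo + 1
    slices = min(concurrency, total)  # never create empty slices
    base, extra = divmod(total, slices)
    ranges = []
    start = lo
    for i in range(slices):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Skewed key values: four rows are clustered near the minimum, one sits at the maximum.
keys = [1, 2, 3, 4, 100]
shards = split_range(min(keys), max(keys), 2)
rows_per_shard = [sum(1 for k in keys if a <= k <= b) for a, b in shards]
```

Here the two slices have equal width, (1, 50) and (51, 100), but hold 4 rows and 1 row respectively, which is the kind of imbalance the note above warns about.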
Batch Read Count (Optional)
The number of data records read at one time. When reading data from the source database, you can configure a specific batch read count (such as 1024 records) instead of reading one by one to reduce the number of interactions with the data source, improve I/O efficiency, and reduce network latency.
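The effect of a batch read count can be sketched with the Python DB-API fetchmany call, which returns up to a given number of rows per call. This is an illustration only: an in-memory SQLite database stands in for KingbaseES, and the table name src and the helper read_in_batches are hypothetical.

```python
import sqlite3


def read_in_batches(cursor, batch_size):
    """Yield lists of up to `batch_size` rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# SQLite stands in for the source database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany(
    "INSERT INTO src (id, v) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(2500)],
)

cur = conn.execute("SELECT id, v FROM src ORDER BY id")
batch_sizes = [len(batch) for batch in read_in_batches(cur, 1024)]
conn.close()
```

With 2,500 rows and a batch size of 1024, the reader makes three fetches instead of 2,500 single-row reads.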
Input Filter (Optional)
Enter the filter condition for the input fields, such as ds=${bizdate}. Input Filter is applicable to the following two scenarios:
A fixed part of the data.
Parameter filtering.
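The two scenarios differ only in whether the filter contains a parameter placeholder. The sketch below shows one way a ${name} placeholder could be resolved before the filter is applied. It uses Python's string.Template, whose placeholder syntax happens to match. How Dataphin resolves parameters internally is not described here, so the helper name and the sample date value are assumptions.

```python
from string import Template


def render_filter(filter_text: str, params: dict) -> str:
    """Replace ${name} placeholders in a filter; raises KeyError if one is missing."""
    return Template(filter_text).substitute(params)


# Parameter filtering: the placeholder is resolved at run time.
# The value "20240101" is a made-up sample, not a documented format.
dynamic_filter = render_filter("ds=${bizdate}", {"bizdate": "20240101"})

# Fixed part of the data: no placeholder, so the text is returned unchanged.
fixed_filter = render_filter("id < 100", {})
```

The first call produces ds=20240101, and the second returns id < 100 unchanged.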
Output Fields
The output fields area displays all fields hit by the selected table and filter criteria. If you do not need to output certain fields to downstream components, you can delete the corresponding fields:
Single Field Deletion Scenario: If you need to delete a small number of fields, click the icon in the Operation column to delete the redundant fields.
Batch Field Deletion Scenario: If you need to delete many fields, click Field Management, select multiple fields in the Field Management dialog box, click the left shift icon to move the selected input fields to the unselected input fields, and click Confirm to complete the batch deletion.
Click Confirm to finalize the property settings for the KingbaseES input component.