The Lindorm input component reads data from a Lindorm data source. To synchronize data from a Lindorm data source to another data source, first configure the source data source for the Lindorm input component, and then configure the target data source. This topic describes how to configure the Lindorm input component.
Prerequisites
A Lindorm data source has been created. For more information, see Create a Lindorm Data Source.
To configure the Lindorm input component properties, your account must have read-through permission for the data source. If you do not have this permission, request it before you continue. For more information, see Request Data Source Permission.
Procedure
On the Dataphin home page, select Development > Data Integration from the top menu bar.
In the top menu bar of the Data Integration page, select Project. In Dev-Prod mode, you must also select an environment.
In the left-side navigation pane, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline you want to develop to open its configuration page.
Click Component Library in the upper right corner of the page to open the Component Library panel.
In the Component Library panel's left-side navigation pane, select Input. Find the Lindorm component in the list of input components on the right and drag it to the canvas.
Click the icon on the Lindorm input component card to open the Lindorm Input Configuration dialog box.
In the Lindorm Input Configuration dialog box, configure the parameters.
Parameter
Description
Step Name
This is the name of the Lindorm input component. Dataphin automatically generates the step name, and you can modify it according to the business scenario. The naming convention is as follows:
Can only contain Chinese characters, letters, underscores (_), and numbers.
Cannot exceed 64 characters.
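The naming rules above can be checked before a name is entered. The following is a minimal sketch, not part of Dataphin: the function name is hypothetical, "Chinese characters" is approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), "letters" is assumed to mean ASCII letters, and an empty name is assumed to be invalid.

```python
import re

# Assumptions (not confirmed by the documentation):
# - "Chinese characters" means U+4E00 to U+9FFF.
# - "letters" means ASCII letters A-Z and a-z.
# - An empty name is rejected.
_STEP_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and is 1 to 64 characters long."""
    return _STEP_NAME_RE.fullmatch(name) is not None


print(is_valid_step_name("lindorm_input_1"))  # allowed characters only
print(is_valid_step_name("bad-name"))         # contains a hyphen
print(is_valid_step_name("a" * 65))           # longer than 64 characters
```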
Datasource
The data source drop-down list displays all Lindorm-type data sources in the current Dataphin, including data sources for which you have read-through permission and those for which you do not. Click the icon to copy the current data source name.
For data sources without read-through permission, you can click Request after the data source to request read-through permission. For more information, see Request Data Source Permission.
If you do not have a Lindorm-type data source yet, click Create Data Source to create a data source. For more information, see Create a Lindorm Data Source.
Table
Select the source table for data synchronization. Click the icon to copy the name of the currently selected table.
Important: Tables in iceberg format with a primary key are not supported.
Input Filter
Filtering is supported only for the iceberg storage format. Supported logical operators are: and, or, not. Supported relational operators are: >, >=, <, <=, =, !=, like, not like, is null, is not null. The like operator only supports matching records that start with a certain string, such as name like 'abc%'. The not like operator only supports matching records that do not start with a certain string.
Partition
If the source table is a partitioned table, you need to configure partition information. Single or multiple partitions are supported, such as ds=20230101 or /*query*/ds>=20230101 and ds<=20230107. Parameters are supported, such as ds=${bizdate}.
File Encoding
Select the file encoding of the source table. UTF-8 and GBK are supported.
Compression Format
If the file is compressed, select the corresponding compression format so that Dataphin can decompress it. This parameter is optional. Supported formats are gzip, bzip2, lzo, lzo_deflate, hadoop-snappy, framing-snappy, zip, and zlib. The default format for orc tables is zlib; to use a different format for an orc table, specify it explicitly. Tables in other formats have no default compression format.
Field Separator
Use this separator to write to the target. If not specified, the default is \u0001.
Output Fields
The output fields area displays all fields matched by the selected table and filter criteria. If certain fields do not need to be output to downstream components, you can delete them:
Single Field Deletion Scenario: To delete a small number of fields, click the icon in the operation column of each field that you want to remove.
Batch Field Deletion Scenario: To delete many fields, click Field Management, select multiple fields in the Field Management dialog box, click the left shift icon to move the selected input fields to the unselected input fields, and then click Confirm to complete the batch deletion.
Click Confirm to finalize the property configuration for the Lindorm input component.
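To illustrate how a parameterized value in the Partition field resolves, the following sketch substitutes values into the partition expressions shown in the table. Dataphin performs this substitution itself at run time; the code only mimics the result using Python's standard string.Template, which happens to use the same ${name} placeholder syntax. The function name render_partition and the parameter names start and end are hypothetical, and the date values are example data only.

```python
from string import Template


def render_partition(expression: str, **params: str) -> str:
    """Replace ${name} placeholders in a partition expression with the given values."""
    # Template.substitute raises KeyError if a placeholder has no value,
    # so a missing parameter is reported rather than silently left in place.
    return Template(expression).substitute(**params)


# Single partition, using the ${bizdate} parameter from the Partition description.
single = render_partition("ds=${bizdate}", bizdate="20230101")

# Partition range; "start" and "end" are hypothetical parameter names.
ranged = render_partition(
    "/*query*/ds>=${start} and ds<=${end}",
    start="20230101",
    end="20230107",
)

print(single)
print(ranged)
```

With these example values, the two expressions resolve to the single-partition and multi-partition forms given in the Partition description.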