Field editing and value assignment component - DataWorks

Real-time extract, transform, and load (ETL) tasks of DataWorks Data Integration support field editing and value assignment components. You can add a field editing and value assignment component between a source component and a destination component to modify and assign values to new fields based on configured filtering rules.

Prerequisites

Basic settings and network resources for data synchronization are configured.
The data synchronization type is set to real-time synchronization of data from a single table.

Step 1: Configure a synchronization task in Data Integration

Add the required data sources to DataWorks. For more information, see Add and manage data sources.
Create a synchronization task in Data Integration. For more information, see the topics in the Data Integration-side synchronization task directory.
Note
If you create a synchronization task that is used to synchronize data from a single table in real time, you can add data processing components between the source component and the destination component. For more information, see Supported data source types and synchronization operations.

Step 2: Add a field editing and value assignment component

In the wizard in the upper part of the configuration page of the synchronization task, click the icon between the source type and the destination type and select Edit Field and Assign Value. The Edit Field and Assign Value section appears.

Configure the field editing and value assignment component.

View field details.
- Field Name: The name of an output field of the ancestor component.
- Type: The data type of an output field of the ancestor component.
- Value: specifies how a value is assigned to an existing field of the ancestor component or a newly added field. Valid values: Assignment, Variable, and Function.
- Actions: You can determine whether to pass an output field of the ancestor component to its descendant component.

Add fields: Click Add field in the lower-left corner of the Edit Field and Assign Value section to add fields. The following information describes the methods to assign values to the new fields.

Assignment: Convert the manually entered text content into a specific data type and assign the converted data to new fields as values.

Variable: Assign values of built-in variables to new fields when a synchronization task is run. The following table describes the supported built-in variables.

Note

The variables that are supported by different sources and destinations vary based on system features.

Supported variables

Variable		Description
Source	PROCESS_TIME	The time when the synchronization task consumes the data record. The value is a 13-digit timestamp in milliseconds.
	DATASOURCE_NAME_SRC	The name of the source.
	DB_NAME_SRC	The name of the source database.
	SCHEMA_NAME_SRC	The name of the source schema.
	TABLE_NAME_SRC	The name of the source table.
	TOPIC_NAME_SRC	The name of the source topic.
Destination	TABLE_NAME_DEST	The name of the destination table.
	TOPIC_NAME_DEST	The name of the destination topic.
	SCHEMA_NAME_DEST	The name of the destination schema.
	DB_NAME_DEST	The name of the destination database.
	DATASOURCE_NAME_DEST	The name of the destination.

Function: Perform function computing when a synchronization task is run to assign computing results to new fields as values. If an error occurs during function processing, the corresponding record is regarded as dirty data and is included in the statistics that are collected on dirty data in the synchronization task. Then, determine whether to stop the synchronization task based on the tolerance configuration of dirty data.

Supported functions

Function	Parameter	Description
DATE_FORMAT	Parameter1: an ancestor field of a numeric type. The value of this field is a 10-digit UNIX timestamp or a 13-digit timestamp in milliseconds. Parameter2: the type of the timestamp. Valid values: unix and millis. The value unix indicates a 10-digit UNIX timestamp. The value millis indicates a 13-digit timestamp in milliseconds. Parameter3: the time zone used for conversion. Parameter4: the time string format that is used to format a timestamp. Example: `yyyy-MM-dd HH:mm:ss:SSS ZZ`. `yyyy` indicates the year, `MM` indicates the month, `dd` indicates the day, `HH` indicates the hour that is expressed in the 24-hour clock, `mm` indicates the minute, `ss` indicates the second, `SSS` indicates the millisecond, and `ZZ` indicates the time zone.	Convert a timestamp into a time string in a specific format.
TO_TIMESTAMP	Parameter1: an ancestor field of the text type. The value of this field is a time string in a specific format. Parameter2: the type of the timestamp. Valid values: unix and millis. The value unix indicates a 10-digit UNIX timestamp. The value millis indicates a 13-digit timestamp in milliseconds. Parameter3: the time zone used for conversion. Parameter4: the format of the time string. Example: `yyyy-MM-dd HH:mm:ss:SSS ZZ`. `yyyy` indicates the year, `MM` indicates the month, `dd` indicates the day, `HH` indicates the hour that is expressed in the 24-hour clock, `mm` indicates the minute, `ss` indicates the second, `SSS` indicates the millisecond, and `ZZ` indicates the time zone.	Convert a time string into a 10-digit or 13-digit timestamp.
SUBSTRING	Parameter1: an ancestor field of the text type. Parameter2: the start position of a substring. The substring contains the character at the start position. The minimum start position is 0. Parameter3: the substring length. If the length is less than or equal to 0, a substring from the start position to the end is returned.	Obtain a substring from a string.

What to do next

After you configure information about the source, field editing and value assignment component, and destination for the synchronization task, you can click Perform Simulated Running in the upper-right corner of the configuration page to test the synchronization task and check whether the output results meet your business requirements.