Assign values to destination table columns using function expressions - DataWorks

DataWorks Data Integration lets you use function expressions to assign values to target table fields in real-time full-database sync tasks.

Limitations

You can use function expressions to assign values to target table fields only in real-time full-database sync tasks that use MySQL or PolarDB as a source and MaxCompute or Hologres as a destination.

Supported functions and syntax

Supported functions

Function name	Description	Parameters	Example expression
CONCAT	Concatenates strings.	Parameter 1...n: The value of another field (`#{col}`), a built-in variable, or a string literal (`"string"`). This parameter is optional. You can specify an unlimited number of parameters.	`concat(#{name},${table_name_src},"_01")`
DATE_FORMAT	Converts a timestamp to a time string in a specific format.	Parameter 1: An ancestor table field of a numeric type. The value can be a 10-digit UNIX timestamp or a 13-digit millisecond timestamp. Parameter 2: The timestamp type. Valid values: unix (10-digit UNIX timestamp) and millis (13-digit millisecond timestamp). Parameter 3: The time zone used for the conversion. Parameter 4: The format of the destination date string. Example: `yyyy-MM-dd HH:mm:ss:SSS ZZ`. `yyyy` indicates the year, `MM` indicates the month, `dd` indicates the day, `HH` indicates the hour in 24-hour format, `mm` indicates the minute, `ss` indicates the second, `SSS` indicates the millisecond, and `ZZ` indicates the time zone.	`date_format(#{timestamp},"millis","Asia/Shanghai","yyyy-MM-dd HH:mm:ss.SSS ZZ")`
TO_TIMESTAMP	Converts a time string to a 10-digit or 13-digit timestamp.	Parameter 1: An ancestor table field of a text type. The value is a time string in a specific format. Parameter 2: The format of the time string. Example: `yyyy-MM-dd HH:mm:ss:SSS ZZ`. `yyyy` indicates the year, `MM` indicates the month, `dd` indicates the day, `HH` indicates the hour in 24-hour format, `mm` indicates the minute, `ss` indicates the second, `SSS` indicates the millisecond, and `ZZ` indicates the time zone. Parameter 3: The time zone used for the conversion. Parameter 4: The type of the destination timestamp. Valid values: unix (10-digit UNIX timestamp) and millis (13-digit millisecond timestamp).	`to_timestamp(#{create_time},"yyyy-MM-dd HH:mm:ss","Asia/Shanghai","millis")`
DATE_CONVERT	Converts a time string of a specific format in a specific time zone to a time string of another format in another time zone.	Parameter 1: An ancestor table field of a text type. The value is a time string in a specific format. Parameter 2: The format of the time string in the ancestor table field. Example: `yyyy-MM-dd HH:mm:ss:SSS ZZ`. `yyyy` indicates the year, `MM` indicates the month, `dd` indicates the day, `HH` indicates the hour in 24-hour format, `mm` indicates the minute, `ss` indicates the second, `SSS` indicates the millisecond, and `ZZ` indicates the time zone. Parameter 3: The source time zone for the conversion. Parameter 4: The format of the destination date string. Example: `yyyy-MM-dd HH:mm:ss:SSS ZZ`. `yyyy` indicates the year, `MM` indicates the month, `dd` indicates the day, `HH` indicates the hour in 24-hour format, `mm` indicates the minute, `ss` indicates the second, `SSS` indicates the millisecond, and `ZZ` indicates the time zone. Parameter 5: The destination time zone for the conversion.	`date_convert(#{create_time},"yyyy-MM-dd HH:mm:ss","Asia/Shanghai","yyyyMMdd","UTC")`
SUBSTRING	Returns a substring of a string.	Parameter 1: An ancestor table field of a text type. Parameter 2: The start position of the substring. The substring includes the character at the start position. The minimum start position is 0. Parameter 3: (Optional) The length of the substring. If the length is less than or equal to 0, the substring from the start position to the end of the string is returned.	`substring(#{name},0,2)`
LOWER	Converts a string to lowercase.	Parameter 1: An ancestor table field of a text type or a built-in variable.	`lower(#{name})`
UPPER	Converts a string to uppercase.	Parameter 1: An ancestor table field of a text type or a built-in variable.	`upper(#{name})`

Supported syntax

Only function call expressions are supported. An expression consists of a function name, parentheses, and parameters. Other types of expressions are not supported. For example: DATE_FORMAT(#{create_time},"millis","Asia/Shanghai","yyyyMMdd").
Parameters of function expressions can reference string constants using double quotation marks (""). For example: CONCAT("hello","world").
Parameters of function expressions can reference numeric constants using numbers. For example: SUBSTRING(#{col1},0,10).
Parameters of function expressions support nested function calls. For example: CONCAT(SUBSTRING(#{col1},0,10),"test").
Parameters of function expressions can reference the output column values of an ancestor node using #{}.
Parameters of function expressions can reference built-in variables using ${}. The supported built-in variables are the same as those available for additional columns in each channel. For example: CONCAT(SUBSTRING(#{col1},0,10),${DATASOURCE_NAME_SRC}).
Function names and built-in variable names are case-insensitive.

Prerequisites

You have completed the data source configuration, where MySQL or PolarDB is the source and MaxCompute or Hologres is the destination.
A real-time full-database sync task has been created.

Procedure

Step 1: Go to the real-time task configuration page

Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Integration > Data Integration. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.
In the navigation pane on the left, click to open the sync task page.
Find the created real-time full-database sync task, and click More > Edit in the Actions column to open the sync task configuration page.

Step 2: Add a function processing column to the target table

The result of the function expression is written to a column in the target table. You must first add a field to the target table schema.

On the real-time task configuration page, click Next. In the Target Table Mapping section, click Refresh Mapping.
Find the target table to which you want to add a function column. In the Target Table Name column, click the button to open the target table schema configuration page.
In the Fields section, click Add Field. After you configure the field, click Apply And Refresh Mapping.
You can also use the Target Table Schema - Batch Modify And Add Fields option in Batch Operations to add function columns to multiple target tables.

Step 3: Define a function expression

In the Target Table Field Assignment column, click Configure to open the field assignment page.
In the Target Table Fields section, find the function column that you added. Set Assignment Method to Function. Then, enter the function in the Assignment column. For more information about the supported functions and syntax, see Supported functions and syntax.