This topic describes how to build a development environment, write business logic code, and publish a user-defined function (UDF) in Realtime Compute.

Note: Only Realtime Compute in exclusive mode currently supports user-defined extensions (UDXs).

Definition

A UDF maps zero, one, or multiple scalar values to a new scalar value.

Build a development environment

For more information, see Build the development environment.

Write business logic code

To define a UDF, extend the ScalarFunction class and implement one or more eval methods. The open and close methods are optional.
Notice: By default, a UDF is assumed to return the same output for the same input. However, a UDF that calls an external service may return different results even if the input values are the same. If a UDF cannot guarantee the same output for the same input, we recommend that you override the isDeterministic method to return false. Otherwise, the output may not meet your expectations in certain cases, for example, when the UDF operator is moved forward.
An example of Java code is as follows:
package com.hjc.test.blink.sql.udx;

import org.apache.flink.table.functions.FunctionContext;
import org.apache.flink.table.functions.ScalarFunction;

public class StringLengthUdf extends ScalarFunction {
    // The open method is optional.
    // To write the open method, you must add import org.apache.flink.table.functions.FunctionContext; to the code.
    @Override
    public void open(FunctionContext context) {
    }

    // Returns the length of a string, or 0 if the string is null.
    public long eval(String a) {
        return a == null ? 0 : a.length();
    }

    // Returns the combined length of two strings.
    public long eval(String b, String c) {
        return eval(b) + eval(c);
    }

    // The close method is optional.
    @Override
    public void close() {
    }
}
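
To check the eval logic before you publish the UDF, you can call the overloads directly from plain Java. The following is a minimal sketch, not part of the Realtime Compute API: the nested ScalarFunction class is a hypothetical stand-in so that the code compiles without the Flink dependency, and EvalOverloadDemo is an illustrative name. In a real project, you would import org.apache.flink.table.functions.ScalarFunction instead.

```java
class EvalOverloadDemo {
    // Stand-in for org.apache.flink.table.functions.ScalarFunction (assumption:
    // used only so that this sketch compiles without Flink on the classpath).
    static abstract class ScalarFunction {
        public boolean isDeterministic() {
            return true;
        }
    }

    // Same eval logic as the StringLengthUdf example above.
    static class StringLengthUdf extends ScalarFunction {
        public long eval(String a) {
            return a == null ? 0 : a.length();
        }

        public long eval(String b, String c) {
            return eval(b) + eval(c);
        }
    }

    public static void main(String[] args) {
        StringLengthUdf udf = new StringLengthUdf();
        System.out.println(udf.eval("hello"));      // prints 5
        System.out.println(udf.eval(null));         // prints 0 (null is treated as length 0)
        System.out.println(udf.eval("ab", "cde"));  // prints 5 (2 + 3)
    }
}
```

Each eval overload corresponds to one way of calling the function in SQL, so the one-argument and two-argument forms can be tested independently.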

Publish a UDF

To publish a UDF, follow these steps: In your SQL statements, specify the class of the UDF. Then, on the Development page, click Publish. On the Administration page, find the target job and click Start in the Actions column. The following example shows SQL statements that use the UDF.
-- udf str.length()
CREATE FUNCTION stringLengthUdf AS 'com.hjc.test.blink.sql.udx.StringLengthUdf';
create table sls_stream(
    a int,
    b int,
    c varchar
) with (
    type='sls',
    endPoint='<yourEndpoint>',
    accessKeyId='<yourAccessId>',
    accessKeySecret='<yourAccessSecret>',
    startTime = '2017-07-04 00:00:00',
    project='<yourProjectName>',
    logStore='<yourLogStoreName>',
    consumerGroup='consumerGroupTest1'
);
create table rds_output(
    id int,
    len bigint,
    content VARCHAR
) with (
    type='rds',
    url='<yourDatabaseURL>',
    tableName='<yourDatabaseTableName>',
    userName='<yourDatabaseUserName>',
    password='<yourDatabasePassword>'
);
insert into rds_output
select
    a,
    stringLengthUdf(c),
    c as content
from sls_stream;

FAQ

Q: Why does a random number generator always generate the same value at runtime?

A: If no input parameters are passed to a UDF and you do not declare it as nondeterministic, the UDF may be optimized during compilation to return a constant value. To avoid this, override the isDeterministic method so that it returns false.
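
The following is a minimal sketch of such a UDF. The names RandomIntUdf and NonDeterministicUdfDemo and the upper bound of 100 are illustrative assumptions. The nested ScalarFunction class is a hypothetical stand-in so that the code compiles without the Flink dependency. In a real project, you would extend org.apache.flink.table.functions.ScalarFunction.

```java
import java.util.concurrent.ThreadLocalRandom;

class NonDeterministicUdfDemo {
    // Stand-in for org.apache.flink.table.functions.ScalarFunction (assumption:
    // used only so that this sketch compiles without Flink on the classpath).
    static abstract class ScalarFunction {
        public boolean isDeterministic() {
            return true;
        }
    }

    // Hypothetical UDF that takes no input parameters and returns a random integer.
    static class RandomIntUdf extends ScalarFunction {
        public int eval() {
            // Returns a value in the range [0, 100).
            return ThreadLocalRandom.current().nextInt(100);
        }

        // Declare the UDF as nondeterministic so that a call is not
        // replaced with a constant value during compilation.
        @Override
        public boolean isDeterministic() {
            return false;
        }
    }

    public static void main(String[] args) {
        RandomIntUdf udf = new RandomIntUdf();
        System.out.println("isDeterministic: " + udf.isDeterministic());
        System.out.println("sample value: " + udf.eval());
    }
}
```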