This topic describes how to build a development environment and use user-defined extensions (UDXs) in Realtime Compute.

  • UDXs are supported only when Realtime Compute works in exclusive mode.
  • Blink is a version of open source Flink SQL that is developed by Alibaba Cloud Realtime Compute to improve computing performance. UDXs can be used only in Blink.

UDX types

Realtime Compute supports the following types of UDXs:
  • User-defined function (UDF): a user-defined scalar function. There is a one-to-one mapping between the input and output: each time a UDF reads a row of data, it writes one output record.
  • User-defined aggregation function (UDAF): there is a many-to-one mapping between the input and output. A UDAF aggregates multiple rows of data into one output record. A UDAF can be used with the GROUP BY clause of an SQL statement. For more information, see Aggregate functions.
  • User-defined table-valued function (UDTF): there is a one-to-many mapping between the input and output. Each time a UDTF is called, it can output multiple rows or columns of data.
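As an illustration of the one-to-one UDF contract described above, the following is a minimal sketch of the eval logic of a string-length scalar function. It reuses the StringLengthUdf class name that appears later in this topic; in a real Realtime Compute job, the class would also extend the ScalarFunction base class provided by the Blink/Flink table API, which is omitted here so the snippet stays self-contained.

```java
// Minimal sketch of a scalar UDF's eval logic.
// Assumption: in a real job this class would extend the ScalarFunction
// base class provided by Blink/Flink; that dependency is omitted here.
public class StringLengthUdf {
    // One input row in, one output value out: the one-to-one UDF contract.
    public long eval(String s) {
        return s == null ? 0 : s.length();
    }

    public static void main(String[] args) {
        StringLengthUdf udf = new StringLengthUdf();
        System.out.println(udf.eval("hello")); // prints 5
        System.out.println(udf.eval(null));    // prints 0
    }
}
```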

UDX demo

Realtime Compute provides a UDX demo to help you quickly learn how to use UDFs, UDAFs, and UDTFs.
  • In the demo, a development environment of the required version is configured. You do not need to build another development environment.
  • The demo provides Maven projects. You can use IntelliJ IDEA for development. For more information, see Develop a UDX by using IntelliJ IDEA.

Build a development environment

The development of UDXs depends on the following JAR packages. You can download the packages as needed.

Register and use a UDX

  1. Log on to the Realtime Compute development platform.
  2. In the top navigation bar, click Development.
  3. Click Resources on the left.
  4. In the upper-right corner of the Resources pane, click Create Resource.
  5. In the Upload Resource dialog box, configure resource parameters.
    • Location: You can upload only local JAR packages.
      Note: The maximum size of a JAR package that you can upload is 300 MB. If the JAR package exceeds 300 MB, you must upload it to the Object Storage Service (OSS) bucket that is bound to your cluster, or use OpenAPI to upload it.
    • Resource: Click Upload Resource and select the resource that you want to upload.
    • Resource Name: Enter a name for the resource.
    • Resource Description: Enter a description of the resource.
    • Resource Type: Select the type of the resource: JAR, DICTIONARY, or PYTHON.
  6. On the Resources pane, locate the new resource, and move the pointer over More in the Actions column.
  7. Select Reference from the drop-down list. The resource is then referenced in the code editor.
  8. In the code editor, declare the UDX at the beginning. An example is as follows:
    CREATE FUNCTION stringLengthUdf AS 'com.hjc.test.blink.sql.udx.StringLengthUdf';
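After the function is declared, you can call it in SQL statements in the same way as a built-in function. The following is a hedged usage sketch: input_table and its column a are hypothetical placeholder names, not part of the original example.

```sql
-- Hypothetical usage of the declared UDX. The table input_table and
-- its STRING column a are placeholder names for illustration only.
CREATE FUNCTION stringLengthUdf AS 'com.hjc.test.blink.sql.udx.StringLengthUdf';

SELECT stringLengthUdf(a) AS len
FROM input_table;
```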

Types of parameters and return values

When you define Java UDXs in Realtime Compute, you can use Java data types for parameters and return values. The following table lists the mappings between Realtime Compute data types and Java data types.
Realtime Compute data type    Java data type
TINYINT                       java.lang.Byte
SMALLINT                      java.lang.Short
INT                           java.lang.Integer
BIGINT                        java.lang.Long
FLOAT                         java.lang.Float
DOUBLE                        java.lang.Double
DECIMAL                       java.math.BigDecimal
BOOLEAN                       java.lang.Boolean
DATE                          java.sql.Date
TIME                          java.sql.Time
TIMESTAMP                     java.sql.Timestamp
CHAR                          java.lang.Character
STRING                        java.lang.String
VARBINARY                     byte[]
ARRAY                         Not supported
MAP                           Not supported
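The mapping above determines the Java signatures of eval methods. The following is a minimal sketch, assuming a hypothetical UDF that takes a BIGINT argument and a STRING argument, which therefore arrive as java.lang.Long and java.lang.String.

```java
// Hypothetical UDF illustrating the type mapping above: a BIGINT argument
// arrives as java.lang.Long and a STRING argument as java.lang.String,
// and the STRING return type maps back from java.lang.String.
// Assumption: in a real job this class would extend ScalarFunction.
public class ConcatUdf {
    public String eval(Long id, String name) {
        return id + ":" + name;
    }

    public static void main(String[] args) {
        System.out.println(new ConcatUdf().eval(42L, "a")); // prints 42:a
    }
}
```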

Obtain parameters in UDXs

UDXs support an optional open(FunctionContext context) method. You can use the FunctionContext parameter to obtain the values of custom configuration items that are set for the job.

For example, assume that you add two parameters, testKey1 and test.key2, to a job. Take a UDTF as an example: you can call context.getJobParameter in the open method to obtain the parameter values. An example is as follows:
public void open(FunctionContext context) throws Exception {
    String key1 = context.getJobParameter("testKey1", "empty");
    String key2 = context.getJobParameter("test.key2", "empty");
    System.err.println(String.format("end open: key1:%s, key2:%s", key1, key2));
}
Note For more information, see job parameters.
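The second argument of getJobParameter is a default value that is returned when the key is not configured for the job. The following is a minimal sketch of that fallback behavior, simulated with a plain Map in place of the real FunctionContext (an assumption for illustration only; the real context reads the job's configuration).

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of FunctionContext.getJobParameter's default-value semantics,
// simulated with a plain Map. Assumption: only testKey1 is configured,
// so looking up test.key2 falls back to the supplied default.
public class JobParameterDemo {
    private final Map<String, String> params = new HashMap<>();

    public JobParameterDemo() {
        params.put("testKey1", "1"); // hypothetical configured value
    }

    public String getJobParameter(String key, String defaultValue) {
        return params.getOrDefault(key, defaultValue);
    }

    public static void main(String[] args) {
        JobParameterDemo ctx = new JobParameterDemo();
        System.out.println(ctx.getJobParameter("testKey1", "empty"));  // prints 1
        System.out.println(ctx.getJobParameter("test.key2", "empty")); // prints empty
    }
}
```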