MaxCompute: Java UDTFs

Last Updated: Mar 26, 2026

A user-defined table-valued function (UDTF) returns a table: for each input row, it can emit zero or more output rows, each with one or more columns. This makes it fundamentally different from a scalar UDF, which returns one value per call. In SQL, call a UDTF in the SELECT list and name its output columns with the AS clause.

MaxCompute supports writing UDTFs in Java. This topic covers the code structure, required methods, type mappings, SQL usage constraints, and an end-to-end example.

Code structure

Use Maven in IntelliJ IDEA or MaxCompute Studio to write UDTF code in Java. A Java UDTF consists of the following components:

| Component              | Required | Description |
| Java package           | No       | Packages Java classes into a JAR file for reuse. |
| Base UDTF classes      | Yes      | com.aliyun.odps.udf.UDTF, com.aliyun.odps.udf.annotation.Resolve, and com.aliyun.odps.udf.UDFException. For additional classes or complex data types, see Overview. |
| Custom Java class      | Yes      | The organizational unit of UDTF code. Defines the variables and methods for your business logic. |
| @Resolve annotation    | Yes      | Declares the input and output types of the UDTF. The format is @Resolve(<signature>). |
| Implementation methods | Yes      | setup, process, close, and forward. See Methods below. |

Methods

| Method | Description |
| public void setup(ExecutionContext ctx) throws UDFException | Initialization method. Called once per worker before the UDTF processes any input data. |
| public void process(Object[] args) throws UDFException | Called once per input SQL record. Input parameters are passed as Object[]. Call forward inside this method to emit output rows. |
| public void close() throws UDFException | Termination method. Called only once, after the last record has been processed. |
| forward(Object... args) | Emits one output row per call. Use the AS clause in SQL to name the output columns. |
Note

Call forward only from within the process or close method. Rows emitted from anywhere else may be lost. If a background thread executes the forward call, process must not return until forward has finished.
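
The call order described above can be sketched in plain Java. This is only an illustration under stated assumptions: SketchUDTF is a hypothetical stand-in for com.aliyun.odps.udf.UDTF so that the code runs without the MaxCompute SDK, its setup method takes no ExecutionContext, and TokenCountUDTF and the "__total__" marker are invented for the example. It shows forward being called from process (one row per token) and from close (one final summary row).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for com.aliyun.odps.udf.UDTF (assumption: simplified signatures).
abstract class SketchUDTF {
    final List<Object[]> emitted = new ArrayList<>();

    void setup() {}
    abstract void process(Object[] args);
    void close() {}

    // Each call emits exactly one output row.
    protected void forward(Object... row) {
        emitted.add(row);
    }
}

// Emits one row per token in process and one summary row in close.
class TokenCountUDTF extends SketchUDTF {
    private long tokens;

    @Override
    void setup() {
        tokens = 0;
    }

    @Override
    void process(Object[] args) {
        String s = (String) args[0];
        if (s == null) {
            return; // A NULL input produces no output rows.
        }
        for (String t : s.split("\\s+")) {
            if (t.isEmpty()) {
                continue;
            }
            forward(t, 1L);
            tokens++;
        }
    }

    @Override
    void close() {
        forward("__total__", tokens);
    }
}

class LifecycleDemo {
    // Drives the lifecycle in order: setup once, process per record, close once.
    static List<Object[]> run(SketchUDTF udtf, List<Object[]> records) {
        udtf.setup();
        for (Object[] record : records) {
            udtf.process(record);
        }
        udtf.close();
        return udtf.emitted;
    }

    public static void main(String[] args) {
        List<Object[]> input = new ArrayList<>();
        input.add(new Object[]{"A B"});
        input.add(new Object[]{"C"});
        for (Object[] row : run(new TokenCountUDTF(), input)) {
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```

With the two sample records, the sketch emits three token rows followed by the summary row from close.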

You can use Java data types or Java writable types in a Java UDTF. For type mappings, see Data types.

The following example shows the complete code structure of a Java UDTF:

// Package declaration. The compiled classes are packaged into a JAR file for upload.
package org.alidata.odps.udtf.examples;
// Base UDTF classes.
import com.aliyun.odps.udf.UDTF;
import com.aliyun.odps.udf.UDTFCollector;
import com.aliyun.odps.udf.annotation.Resolve;
import com.aliyun.odps.udf.UDFException;

// Splits a string by whitespace and emits one row per token, paired with the original bigint value.
@Resolve("string,bigint->string,bigint")
public class MyUDTF extends UDTF {
    @Override
    public void process(Object[] args) throws UDFException {
        String a = (String) args[0];
        Long b = (Long) args[1];
        for (String t : a.split("\\s+")) {
            forward(t, b);
        }
    }
}
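
The example above calls split directly on the cast argument. In plain Java, that throws a NullPointerException when the argument is null, and a string with leading whitespace yields an empty first token. The following sketch shows one possible way to guard against both cases. TokenizerSketch and tokens are hypothetical names introduced for illustration and are not part of the MaxCompute SDK.

```java
import java.util.ArrayList;
import java.util.List;

class TokenizerSketch {
    // Hypothetical helper: returns the non-empty whitespace-separated tokens,
    // or an empty list when the input is null.
    static List<String> tokens(String s) {
        List<String> out = new ArrayList<>();
        if (s == null) {
            return out;
        }
        for (String t : s.split("\\s+")) {
            // split keeps an empty leading token when the string starts with whitespace.
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("A B"));
        System.out.println(tokens("  A B"));
        System.out.println(tokens(null));
    }
}
```

Inside process, a UDTF could loop over the result of such a helper and call forward once per token, so that a NULL input simply produces no output rows.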

@Resolve annotation

The @Resolve annotation declares the UDTF's type contract. MaxCompute checks type consistency at semantic parsing time. If the actual types do not match the declared signature, an error is returned.

Format:

@Resolve("<arg_type_list> -> <type_list>")

| Parameter     | Description |
| arg_type_list | Data types of input parameters, separated by commas. Supported types: BIGINT, STRING, DOUBLE, BOOLEAN, DATETIME, DECIMAL, FLOAT, BINARY, DATE, DECIMAL(precision, scale), CHAR, VARCHAR, ARRAY, MAP, STRUCT, and nested complex types. Use * to accept any number of parameters of any type. Leave it empty, as in @Resolve("->double,bigint,string"), to accept no input parameters. |
| type_list     | Data types of return values, separated by commas. Supported types: BIGINT, STRING, DOUBLE, BOOLEAN, DATETIME, DECIMAL, FLOAT, BINARY, DATE, DECIMAL(precision, scale), ARRAY, MAP, STRUCT, and nested complex types. |

Examples:

| Annotation | Input types | Return types |
| @Resolve("bigint,boolean->string,datetime") | BIGINT, BOOLEAN | STRING, DATETIME |
| @Resolve("*->string,datetime") | Any | STRING, DATETIME |
| @Resolve("->double,bigint,string") | None | DOUBLE, BIGINT, STRING |
| @Resolve("array<string>,struct<a1:bigint,b1:string>,string->map<string,bigint>,struct<b1:bigint>") | ARRAY, STRUCT, STRING | MAP, STRUCT |

For dynamic parameter syntax extensions, see Dynamic parameters of UDAFs and UDTFs.
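
To make the signature format concrete, the following sketch splits a signature string into its input and return type lists. It is not how MaxCompute parses the annotation. ResolveSignatureSketch, splitTopLevel, and parse are hypothetical names, and the only rule assumed is that commas nested inside <...> or (...) belong to a single type.

```java
import java.util.ArrayList;
import java.util.List;

class ResolveSignatureSketch {
    // Splits on commas that are not nested inside <...> or (...).
    static List<String> splitTopLevel(String list) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : list.toCharArray()) {
            if (c == '<' || c == '(') {
                depth++;
            } else if (c == '>' || c == ')') {
                depth--;
            }
            if (c == ',' && depth == 0) {
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        String last = current.toString().trim();
        if (!last.isEmpty()) {
            parts.add(last);
        }
        return parts;
    }

    // Returns two lists: input types at index 0, return types at index 1.
    static List<List<String>> parse(String signature) {
        int arrow = signature.indexOf("->");
        if (arrow < 0) {
            throw new IllegalArgumentException("missing '->' in signature: " + signature);
        }
        List<List<String>> result = new ArrayList<>();
        result.add(splitTopLevel(signature.substring(0, arrow)));
        result.add(splitTopLevel(signature.substring(arrow + 2)));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("bigint,boolean->string,datetime"));
        System.out.println(parse("->double,bigint,string"));
        System.out.println(parse(
            "array<string>,struct<a1:bigint,b1:string>,string->map<string,bigint>,struct<b1:bigint>"));
    }
}
```

For the last signature, the sketch yields three input types and two return types, which matches the row for that annotation in the examples above.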

Data types

MaxCompute data types, Java types, and Java writable types are mapped as follows. Write Java UDTFs based on these mappings to ensure type consistency.

| MaxCompute type     | Java type                    | Java writable type        |
| TINYINT             | java.lang.Byte               | ByteWritable              |
| SMALLINT            | java.lang.Short              | ShortWritable             |
| INT                 | java.lang.Integer            | IntWritable               |
| BIGINT              | java.lang.Long               | LongWritable              |
| FLOAT               | java.lang.Float              | FloatWritable             |
| DOUBLE              | java.lang.Double             | DoubleWritable            |
| DECIMAL             | java.math.BigDecimal         | BigDecimalWritable        |
| BOOLEAN             | java.lang.Boolean            | BooleanWritable           |
| STRING              | java.lang.String             | Text                      |
| VARCHAR             | com.aliyun.odps.data.Varchar | VarcharWritable           |
| BINARY              | com.aliyun.odps.data.Binary  | BytesWritable             |
| DATE                | java.sql.Date                | DateWritable              |
| DATETIME            | java.util.Date               | DatetimeWritable          |
| TIMESTAMP           | java.sql.Timestamp           | TimestampWritable         |
| INTERVAL_YEAR_MONTH | N/A                          | IntervalYearMonthWritable |
| INTERVAL_DAY_TIME   | N/A                          | IntervalDayTimeWritable   |
| ARRAY               | java.util.List               | N/A                       |
| MAP                 | java.util.Map                | N/A                       |
| STRUCT              | com.aliyun.odps.data.Struct  | N/A                       |
Note

Java writable types are supported as input or return types only when your MaxCompute project uses the MaxCompute V2.0 data type edition. For more information, see Data type editions.

Three additional rules apply to data types in Java UDTFs:

  • Input and return value types are always objects. Use object types such as Long and String, not primitive types such as long.

  • Primitive Java types (such as int, long, and boolean) cannot represent SQL NULL. Do not use them.

  • NULL values in MaxCompute SQL are represented by null in Java.
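
The difference between boxed and primitive types can be seen in plain Java, independent of the MaxCompute SDK. In this sketch, NullHandlingSketch and its methods are hypothetical names, and the Object[] argument mimics what process receives. A null element cast to Long can be tested explicitly, whereas assigning it to a primitive long forces unboxing and throws a NullPointerException.

```java
class NullHandlingSketch {
    // Boxed type: a SQL NULL arrives as Java null and can be checked explicitly.
    static String describeBoxed(Object[] args) {
        Long value = (Long) args[0];
        return value == null ? "NULL" : Long.toString(value);
    }

    // Primitive type: unboxing a null reference throws NullPointerException.
    static String describePrimitive(Object[] args) {
        long value = (Long) args[0];
        return Long.toString(value);
    }

    public static void main(String[] args) {
        Object[] nullRow = new Object[]{null};
        System.out.println(describeBoxed(nullRow));
        try {
            describePrimitive(nullRow);
        } catch (NullPointerException e) {
            System.out.println("primitive unboxing failed on NULL");
        }
    }
}
```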

Limitations

The following SQL usage restrictions apply to UDTFs.

No other columns or expressions in the same SELECT

A UDTF must be the only item in the SELECT clause. The following statement is invalid:

-- Invalid: mixing a UDTF with another column.
SELECT value, user_udtf(key) AS mycol ...

No nesting

UDTFs cannot be used as input to other UDTFs:

-- Invalid: nesting UDTFs.
SELECT user_udtf1(user_udtf2(key)) AS mycol ...;

No GROUP BY, DISTRIBUTE BY, or SORT BY in the same SELECT

-- Invalid: combining a UDTF with GROUP BY.
SELECT user_udtf(key) AS mycol ... GROUP BY mycol;

No Internet access

UDFs cannot access the Internet by default. To enable Internet access, submit the network connection application form. After approval, the MaxCompute technical support team will help you establish the connection. For instructions, see Network connection process.

Usage notes

  • Do not package classes with the same name but different logic into the JAR files of different UDTFs. For example, if udtf1.jar and udtf2.jar both contain com.aliyun.UserFunction.class but with different logic, calling both UDTFs in the same SQL statement causes MaxCompute to load only one version. This leads to incorrect behavior and may cause a compilation error.

Call a Java UDTF

After developing a Java UDTF following the Development process, you can call it in MaxCompute SQL.

Example

This example walks through building and calling a Java UDTF with MaxCompute Studio. The UDTF splits a string column by whitespace and emits one row per token.

Prerequisites

Before you begin, ensure that you have:

Write UDTF code

  1. In the Project tab, navigate to src > main > java, right-click java, and choose New > MaxCompute Java.

    Figure: Create a Java class

  2. In the Create new MaxCompute java class dialog box, click UDTF, enter a name in the Name field, and press Enter. This example uses MyUDTF as the class name. If you have not created a package yet, specify the name in packagename.classname format. MaxCompute Studio generates the package automatically.

    Figure: Select the type and enter a name

  3. Write the following code in the editor:

    package org.alidata.odps.udtf.examples;
    import com.aliyun.odps.udf.UDTF;
    import com.aliyun.odps.udf.UDTFCollector;
    import com.aliyun.odps.udf.annotation.Resolve;
    import com.aliyun.odps.udf.UDFException;
    
    // Splits the first string argument by whitespace and emits one row per token,
    // pairing each token with the original bigint value.
    @Resolve("string,bigint->string,bigint")
    public class MyUDTF extends UDTF {
        @Override
        public void process(Object[] args) throws UDFException {
            String a = (String) args[0];
            Long b = (Long) args[1];
            for (String t : a.split("\\s+")) {
                forward(t, b);
            }
        }
    }

    Figure: Write the UDTF code

Debug locally

Run the UDTF on your local machine to verify that the code works before uploading it.

For debug instructions, see Perform a local run to debug the UDF.

Figure: Debug the UDTF locally
Note

The parameter settings in the preceding figure are for reference only.

Register the UDTF

Package the UDTF into a JAR file, upload it to your MaxCompute project, and register it as a function. This example registers the function as user_udtf.

For packaging instructions, see Procedure.

Figure: Register the function

Call the UDTF in SQL

In Project Explorer, right-click your MaxCompute project and select Open in Console to open the MaxCompute client.

The input table my_table has the following structure:

+------------+------------+
| col0       | col1       |
+------------+------------+
| A B        | 1          |
| C D        | 2          |
+------------+------------+

Run the following statement to call the UDTF:

SELECT user_udtf(col0, col1) AS (c0, c1) FROM my_table;

Expected output:

+----+------------+
| c0 | c1         |
+----+------------+
| A  | 1          |
| B  | 1          |
| C  | 2          |
| D  | 2          |
+----+------------+
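
The expected output can be cross-checked with a plain-Java dry run that needs neither the MaxCompute SDK nor a project. This is a sketch, not the debugging workflow of MaxCompute Studio: MyUdtfDryRun is a hypothetical class whose process method mirrors the loop in MyUDTF.process and collects rows in a list instead of calling forward.

```java
import java.util.ArrayList;
import java.util.List;

class MyUdtfDryRun {
    // Mirrors MyUDTF.process: one output row per whitespace-separated token,
    // each paired with the original bigint value.
    static List<Object[]> process(String col0, Long col1) {
        List<Object[]> rows = new ArrayList<>();
        for (String t : col0.split("\\s+")) {
            rows.add(new Object[]{t, col1});
        }
        return rows;
    }

    public static void main(String[] args) {
        // The two rows of my_table from the example.
        Object[][] myTable = {{"A B", 1L}, {"C D", 2L}};
        for (Object[] in : myTable) {
            for (Object[] out : process((String) in[0], (Long) in[1])) {
                System.out.println(out[0] + " | " + out[1]);
            }
        }
    }
}
```

Running the sketch prints the four (c0, c1) pairs shown in the expected output: A with 1, B with 1, C with 2, and D with 2.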
