MaxCompute: Code-embedded UDFs

Last Updated: Mar 26, 2026

Code-embedded UDFs let you define Java or Python logic directly inside a SQL script, eliminating the separate compile-upload-register workflow required by standard UDFs. SQL statements and function code live in the same file, making the logic immediately visible and easier to maintain.

All examples on this page must be submitted in script mode.

When to use code-embedded UDFs

Use code-embedded UDFs when:

  • The function is used only in the current script and does not need to be reused across multiple SQL jobs.

  • You want to keep the implementation and the query together in one file for readability.

  • You are prototyping or iterating quickly and want to skip the JAR packaging step.

Use a standard UDF instead when:

  • Multiple SQL scripts share the same function logic.

  • The Java code relies on features not supported by Janino-compiler (see Limitations).

  • The codebase is too large to embed inline.

How it works

When MaxCompute compiles a script that contains an embedded code block, Janino-compiler identifies and extracts the embedded code, compiles it, dynamically generates resources, and creates a temporary function. The temporary function is not stored as a persistent entry in the MaxCompute metadata system.

Syntax quick reference

Parameter | Values | Description
lang | JAVA, PYTHON | Language of the embedded code.
filename | Any string | Virtual file name. Required for Python UDFs (used in the AS clause). Also used in Java UDTFs (in @Resolve). Optional for Java UDFs.

Embedded code block syntax:

#CODE ('lang'='<JAVA or PYTHON>')
  <your Java or Python code>
#END CODE;

Code block placement:

Placement | Scope
After USING in CREATE TEMPORARY FUNCTION | Applies only to that CREATE TEMPORARY FUNCTION statement
At the end of the script | Applies to the entire script

Limitations

Embedded Java code is compiled by Janino-compiler, which supports only a subset of standard JDK syntax. The following restrictions apply:

Limitation | Workaround
Lambda expressions are not supported. | -
Multiple exception types in a single catch block are not supported. catch(Exception1 | Exception2 e) is not allowed. | -
Generic argument inference is not supported. Map<String, String> map = new HashMap<>() is not allowed. | Specify the type explicitly: Map<String, String> map = new HashMap<String, String>().
Expressions for type argument inference are ignored. | Use cast expressions: (String) myMap.get(key).
Assertions are always enabled, regardless of whether the -ea JVM option is set. | -
Java versions later than Java 8 are not supported. | -
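
The limitations above can be illustrated with a short sketch written in the restricted style. It is plain Java that runs on a standard JDK; it assumes (not confirmed by this page) that Janino-compiler accepts anonymous inner classes in place of lambdas and separate catch blocks in place of multi-catch. The class name JaninoFriendly and its methods are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the name and methods are illustrative only.
class JaninoFriendly {

  public static void main(String[] args) {
    System.out.println(sortByLength(Arrays.asList("ccc", "a", "bb"))); // [a, bb, ccc]
    System.out.println(parseOrDefault("oops", -1));                    // -1
    Map<String, String> colors = new HashMap<String, String>();
    colors.put("sky", "blue");
    System.out.println(lookup(colors, "sky"));                         // blue
  }

  // No lambda: the comparator is an anonymous inner class.
  static List<String> sortByLength(List<String> words) {
    // Explicit type arguments instead of the diamond operator.
    List<String> sorted = new ArrayList<String>(words);
    Collections.sort(sorted, new Comparator<String>() {
      @Override
      public int compare(String a, String b) {
        return a.length() - b.length();
      }
    });
    return sorted;
  }

  // No multi-catch: one catch block per exception type.
  static int parseOrDefault(String text, int fallback) {
    try {
      return Integer.parseInt(text.trim());
    } catch (NumberFormatException e) {
      return fallback; // text is not a number
    } catch (NullPointerException e) {
      return fallback; // text is null
    }
  }

  // Explicit cast on the value read from the map, as the table suggests.
  static String lookup(Map<String, String> map, String key) {
    return (String) map.get(key);
  }
}
```

Code written this way also compiles with a regular Java 8 compiler, so the same source can later move into a standard UDF without changes to these constructs.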

Reference embedded code in a user-defined type (UDT)

Place the embedded code block at the end of the script. It applies to the entire script.

SELECT
  s,
  com.mypackage.Foo.extractNumber(s)
FROM VALUES ('abc123def'), ('apple') AS t(s);

#CODE ('lang'='JAVA')
package com.mypackage;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Foo {
  final static Pattern compile = Pattern.compile(".*?([0-9]+).*");
  public static String extractNumber(String input) {
    final Matcher m = compile.matcher(input);
    if (m.find()) {
      return m.group(1);
    }
    return null;
  }
}
#END CODE;

Call the embedded class using its fully qualified name (com.mypackage.Foo.extractNumber) directly in the SQL query.
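
Before embedding a class like Foo, it can help to check its logic on a local JDK. The sketch below copies the extractNumber logic into a package-less stand-in class (ExtractNumberCheck is a hypothetical name, not part of MaxCompute) and feeds it the two values from the query above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Local stand-in for com.mypackage.Foo, kept package-less so it runs as a single file.
class ExtractNumberCheck {
  static final Pattern DIGITS = Pattern.compile(".*?([0-9]+).*");

  // Returns the first run of digits in the input, or null when there is none.
  static String extractNumber(String input) {
    Matcher m = DIGITS.matcher(input);
    if (m.find()) {
      return m.group(1);
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(extractNumber("abc123def")); // 123
    System.out.println(extractNumber("apple"));     // null
  }
}
```

Note that the method does not guard against a null argument, so a NULL value in column s would need separate handling.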

Define and call a Java UDF

The embedded code block can be placed immediately after USING, which scopes it to the CREATE TEMPORARY FUNCTION statement, or at the end of the script.

CREATE TEMPORARY FUNCTION foo AS 'com.mypackage.Reverse' USING
#CODE ('lang'='JAVA')
package com.mypackage;
import com.aliyun.odps.udf.UDF;
public class Reverse extends UDF {
  public String evaluate(String input) {
    if (input == null) return null;
    StringBuilder ret = new StringBuilder();
    // Walk the string backwards; charAt avoids copying the array on every iteration.
    for (int i = input.length() - 1; i >= 0; i--) {
      ret.append(input.charAt(i));
    }
    return ret.toString();
  }
}
#END CODE;

SELECT foo('abdc');

CREATE TEMPORARY FUNCTION creates a temporary function that is valid only for the current execution. It is not stored in the MaxCompute metadata system. To create a permanent function instead, see CREATE SQL FUNCTION.

Define and call a Java user-defined table-valued function (UDTF)

CREATE TEMPORARY FUNCTION foo AS 'com.mypackage.Reverse' USING
#CODE ('lang'='JAVA', 'filename'='embedded.jar')
package com.mypackage;

import com.aliyun.odps.udf.UDTF;
import com.aliyun.odps.udf.UDFException;
import com.aliyun.odps.udf.annotation.Resolve;

@Resolve({"string->string,string"})
public class Reverse extends UDTF {
  @Override
  public void process(Object[] objects) throws UDFException {
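    String str = (String) objects[0];
    // Emit no row for NULL input.
    if (str == null) {
      return;
    }
    String[] split = str.split(",");
    // Emit no row when the input has fewer than two comma-separated fields.
    if (split.length < 2) {
      return;
    }
    forward(split[0], split[1]);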
  }
}

#END CODE;

SELECT foo('ab,dc') AS (a,b);

In embedded code, Janino-compiler does not treat the bare string "string->string,string" as the string array that @Resolve expects. Wrap the @Resolve argument in braces {}, for example @Resolve({"string->string,string"}). The braces are required only in code-embedded UDTFs; standard UDTFs do not need them.
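
The two annotation forms can be compared on a standard JDK with the sketch below. ResolveLike is a hypothetical stand-in annotation, declared here so the example runs without the MaxCompute SDK; it assumes the real @Resolve takes a string array as its value. With javac, the shorthand without braces is wrapped into a one-element array, so both forms yield the same value. Per the note above, the embedded compiler needs the braced form.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Arrays;

class ResolveForms {
  // Reads the signature array back from the annotation via reflection.
  static String[] signatureOf(Class<?> type) {
    return type.getAnnotation(ResolveLike.class).value();
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(signatureOf(BracedSignature.class)));
    System.out.println(Arrays.toString(signatureOf(ShorthandSignature.class)));
  }
}

// Hypothetical stand-in for the SDK annotation; not the real @Resolve.
@Retention(RetentionPolicy.RUNTIME)
@interface ResolveLike {
  String[] value();
}

// Array-initializer form: the form the embedded UDTF example uses.
@ResolveLike({"string->string,string"})
class BracedSignature {}

// Single-element shorthand: javac wraps it into a one-element array.
@ResolveLike("string->string,string")
class ShorthandSignature {}
```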

Define and call a Python UDF

CREATE TEMPORARY FUNCTION foo AS 'embedded.UDFTest' USING
#CODE ('lang'='PYTHON', 'filename'='embedded')
from odps.udf import annotate
@annotate("bigint->bigint")
class UDFTest(object):
  def evaluate(self, a):
    return a * a
#END CODE;

SELECT foo(4);

Two details are specific to Python UDFs:

  • Virtual file name: The AS clause takes the form '<file name>.<class name>', where the file name is the virtual file name set by the 'filename' option. In the example, 'filename'='embedded' and the class UDFTest combine to give 'embedded.UDFTest'.

  • Indentation: Python code inside the embedded block must follow standard Python indentation rules.

For Python version-specific development guides, see Develop a UDF in Python 2 and Develop a UDF in Python 3.