MaxCompute provides a variety of built-in functions. If the built-in functions do not meet your business requirements, you can write code to create user-defined functions (UDFs). This topic describes the types, scenarios, development process, and usage notes of the UDFs that MaxCompute supports.

Background information

In the broad sense, UDFs include user-defined scalar functions, user-defined aggregation functions (UDAFs), and user-defined table-valued functions (UDTFs). In the narrow sense, UDFs refer only to user-defined scalar functions. The following table describes the types of MaxCompute UDFs.

UDF type | Scenario
UDF | A one-to-one mapping is established between the input and output data of a UDF. Each time a UDF reads a row of data, it returns an output value.
UDTF | A one-to-many mapping is established between the input and output data of a UDTF. Each time a UDTF reads a row of data, it returns multiple values, which are considered a table.
UDAF | A many-to-one mapping is established between the input and output data of a UDAF. Multiple input records are aggregated to generate one output value.
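
The three mapping shapes in the preceding table can be illustrated with a minimal sketch in plain Python. This sketch does not use the MaxCompute SDK. The function names (scalar_upper, table_split, aggregate_total_length) and the sample rows are hypothetical and only show how input rows relate to output values for each UDF type.

```python
from typing import Iterable, Iterator, List


def scalar_upper(value: str) -> str:
    """UDF shape: one input row produces exactly one output value."""
    return value.upper()


def table_split(value: str, sep: str = ",") -> Iterator[str]:
    """UDTF shape: one input row produces multiple output rows."""
    for part in value.split(sep):
        yield part


def aggregate_total_length(values: Iterable[str]) -> int:
    """UDAF shape: many input rows are aggregated into one output value."""
    total = 0
    for value in values:
        total += len(value)
    return total


# Hypothetical input: a single string column with three rows.
rows: List[str] = ["a,b", "c", "d,e,f"]

udf_out = [scalar_upper(r) for r in rows]                   # one value per row
udtf_out = [part for r in rows for part in table_split(r)]  # several rows per row
udaf_out = aggregate_total_length(rows)                     # one value for all rows
```

In this sketch, udf_out has the same number of entries as rows, udtf_out has more entries than rows, and udaf_out is a single value. A real MaxCompute UDF, UDTF, or UDAF is implemented against the SDK classes and registered in a project before it is called in SQL.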

In addition to the preceding UDFs, MaxCompute offers the following UDFs for special scenarios.

UDF type | Scenario
Code-embedded UDFs | If you want to simplify the development process of MaxCompute UDFs and view the code logic alongside the SQL that uses it, you can embed Java or Python code into SQL scripts.
SQL functions | If your SQL code contains repeated logic, you can use SQL UDFs to improve code reuse and simplify the development process.
Open source geospatial UDFs | You can use Hive geospatial functions to analyze spatial data in MaxCompute.


By default, UDFs cannot access the Internet. To enable Internet access from UDFs, fill in the network connection application form based on your business requirements and submit the application. After the application is approved, the MaxCompute technical support team contacts you and helps you establish network connections. For more information about how to fill in the form, see Network connection process.

Usage notes

Before you use UDFs, take note of the following items:
  • UDFs do not perform as well as built-in functions. We recommend that you use built-in functions to implement your business logic whenever possible.
  • If a SQL statement that uses a UDF processes a large amount of data and data skew occurs, the memory usage of the computing job may exceed the default allocated memory size. In this case, you can run the set odps.sql.udf.joiner.jvm.memory=xxxx; command at the session level to resolve the issue. For more information, see FAQ about MaxCompute UDFs.
  • If a UDF has the same name as a built-in function, the UDF takes precedence. For example, if UDF CONCAT and built-in function CONCAT both exist in MaxCompute, the system calls UDF CONCAT instead of the built-in function CONCAT. To call the built-in function, prefix the function name with ::, for example, select ::concat('ab', 'c');.
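
The name precedence rule in the last item can be modeled with a small sketch in plain Python. This is only a toy model and not how MaxCompute resolves functions internally. The registries BUILTINS and USER_DEFINED, the resolve helper, and the "-" separator used by the shadowing UDF are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registries. The real resolver is internal to MaxCompute.
BUILTINS: Dict[str, Callable[..., str]] = {
    "concat": lambda *parts: "".join(parts),
}
USER_DEFINED: Dict[str, Callable[..., str]] = {
    # A UDF that deliberately reuses the name of a built-in function.
    "concat": lambda *parts: "-".join(parts),
}


def resolve(name: str) -> Callable[..., str]:
    """Return the function that a call to `name` runs in this toy model."""
    if name.startswith("::"):
        # The :: prefix skips UDFs and selects the built-in function.
        return BUILTINS[name[2:]]
    # Without the prefix, a UDF with the same name takes precedence.
    if name in USER_DEFINED:
        return USER_DEFINED[name]
    return BUILTINS[name]


shadowed = resolve("concat")("ab", "c")   # runs the UDF
builtin = resolve("::concat")("ab", "c")  # runs the built-in function
```

Here shadowed is produced by the UDF and builtin by the built-in function, which mirrors the difference between select concat('ab', 'c'); and select ::concat('ab', 'c'); when a UDF named CONCAT exists.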

Development process

This section describes how to develop a UDF, UDTF, or UDAF. The development process for code-embedded UDFs, SQL functions, and open source geospatial UDFs is different. For more information, see the related documentation.


You can call UDFs in the following ways:
  • Within a MaxCompute project: Call the UDF in the same way as you call a built-in function.
  • Across projects: To use a UDF of Project B in Project A, prefix the function name with the name of the project that owns the UDF. The following statement shows an example: select B:udf_in_other_project(arg0, arg1) as res from table_t;. For more information about resource sharing across projects, see Package-based resource sharing across projects.

MaxCompute SDK

The following table describes the SDKs provided by MaxCompute. For more information about the packages included in each SDK and the classes in the packages, see MaxCompute SDK.

SDK name | Description
odps-sdk-core | Provides classes for managing basic resources of MaxCompute.
odps-sdk-commons | Provides common utilities for Java.
odps-sdk-udf | Provides the UDF API.
odps-sdk-mapred | Provides the MapReduce API.
odps-sdk-graph | Provides the Graph API.