MaxCompute provides a wide range of built-in functions that cover most data processing requirements. This page describes the available function types and the usage notes that apply across all built-in functions.
Function types
MaxCompute built-in functions fall into three broad categories:
-
Scalar functions — operate on a single row and return one value per invocation. This includes date and time, mathematical, string, encryption, network, ARRAY, MAP, STRUCT, JSON, and unstructured data processing functions.
-
Aggregate functions — take multiple input rows and return a single output value.
-
Window functions — operate on a subset of rows within a defined window column and return one value per row.
For a mapping between MaxCompute built-in functions and equivalent open source functions, see Mapping between built-in functions of MaxCompute, Hive, MySQL, and Oracle.
| Function type | Description |
|---|---|
| Date and time functions | Manipulate DATE, DATETIME, and TIMESTAMP data — add or subtract dates, calculate differences, extract fields, and convert formats. |
| Mathematical functions | Perform numeric operations on BIGINT, DOUBLE, DECIMAL, and FLOAT data — convert bases, round numbers, and generate random values. |
| Window functions | Calculate aggregations, rankings, offsets, and samples within a specified window column. |
| Aggregate functions | Aggregate multiple input records into a single output value — sums, averages, min/max, aggregate parameters, and string concatenation. |
| HyperLogLog++ functions | Approximate aggregate functions that remove duplicates and accelerate queries on large datasets using minimal memory. |
| String functions | Manipulate STRING data — truncate, replace, search, change case, and convert formats. See Limits of string functions. |
| ARRAY functions | Manipulate ARRAY data — construct, deduplicate, aggregate, sort, and merge arrays. |
| MAP functions | Manipulate MAP data — extract key-value pairs, construct maps, and merge maps. |
| STRUCT functions | Manipulate STRUCT data — expand STRUCT arrays and construct structs. |
| JSON functions | Manipulate JSON data — extract field values, generate objects or arrays, insert or update data, and handle complex data structures. See Limits of JSON functions. |
| Encryption functions | Encrypt and decrypt STRING and BINARY table data. |
| Network functions | Process network-related STRING and BINARY data — convert IP address formats, parse URLs, and get network masks. |
| Unstructured data processing functions | Connect to unstructured data and its metadata stored in data warehouses or data lakes in multiple ways. |
| Other functions | Additional functions for miscellaneous business scenarios. |
Usage notes
Note the following when using built-in functions:
-
Input parameters: The type, number, and format of input parameters must conform to the specified syntax. If they don't, MaxCompute cannot parse the function and the SQL statement fails.
-
Data types 2.0: If a built-in function uses data types 2.0 — TINYINT, SMALLINT, INT, FLOAT, VARCHAR, TIMESTAMP, or BINARY — enable data types 2.0 before running the function. Otherwise, an error is reported. Enable data types 2.0 at one of the following levels:
-
Session level(applies to the current SQL statement only): Add the following statement before your SQL statement and submit them together.
SET odps.sql.type.system.odps2=true; -
Project level(applies to all subsequent SQL statements, takes effect in 10 to 15 minutes): The project owner runs the following command.
SETPROJECT odps.sql.type.system.odps2=true;
-
-
Disabled implicit type conversions: When data types 2.0 is enabled, the following implicit type conversions are disabled to prevent precision loss or errors: STRING to BIGINT, STRING to DATETIME, DOUBLE to BIGINT, DECIMAL to DOUBLE, and DECIMAL to BIGINT. Use the CAST function for explicit conversions, or disable data types 2.0.
-
UDF and built-in function name conflicts: If a user-defined function (UDF) has the same name as a built-in function, the UDF takes precedence. For example, if a UDF named CONCAT exists in a project, the system calls the UDF instead of the built-in CONCAT function. To call the built-in function, prefix the function name with
:::SELECT ::CONCAT('ab', 'c'); -
Global properties: Built-in function results may vary depending on the global properties configured for the MaxCompute project. Run
SETPROJECT;to view the current global properties.
Limits of JSON functions
SDK version requirements
-
Java SDK V0.44.0 or later
-
PyODPS V0.11.4.1 or later
Table operation limits
-
Adding a JSON column to a table is not supported.
-
Clustered tables are not supported.
-
Tables of the Delta Table type are not supported.
SQL operation limits
-
Comparison operations on the JSON type are not supported.
-
ORDER BYandGROUP BYclauses cannot be used on the JSON type. -
A JSON type column cannot be used as a
JOINkey.
Data precision
-
The integer part of a JSON NUMBER is stored as BIGINT. An overflow occurs if the value is outside the BIGINT range.
-
The decimal part of a JSON NUMBER is stored as DOUBLE. Precision loss may occur during conversion.
Character limits
The Unicode character \u0000 is not supported in strings used to generate JSON data.
Engine compatibility
Other engines — for example, Hologres — cannot read the JSON data type from a MaxCompute table.
UDF support
Java UDFs and Python UDFs do not support the JSON type.
Nesting depth
The JSON data type can be nested up to 20 levels deep.
Supported development tools
Supported tools include the MaxCompute client (odpscmd), MaxCompute Studio, and DataWorks. External ecosystems such as Dataphin are not supported. Confirm compatibility before using the JSON data type with any external system.
When using odpscmd, note the following:
-
Upgrade the client to V0.46.5 or later. Earlier versions cannot run the
DESC json_tablecommand or download JSON data using Tunnel. -
In
conf\odps_config.iniin the client installation path, setuse_instance_tunneltofalse. Otherwise, queries will fail.
Limits of string functions
The following string functions support only English characters:
-
TRIM / RTRIM / LTRIM: The
trimCharsparameter supports only English characters. -
REVERSE: Supports only English characters in Hive mode.
-
SOUNDEX: Converts only English characters.
-
TOLOWER: Converts English uppercase letters to lowercase.
-
TOUPPER: Converts English lowercase letters to uppercase.
-
INITCAP: Converts the first English letter of each word to uppercase and the rest to lowercase.