MaxCompute provides a large number of built-in functions to meet data processing requirements in most business scenarios. This topic describes the types of built-in functions that are provided by MaxCompute. This topic also describes how to use the built-in functions.
Background information
The following table describes the types of built-in functions that are provided by MaxCompute.
Type | Description |
---|---|
Date functions | Used to process data of a date type, such as DATE, DATETIME, or TIMESTAMP. For example, you can use these functions to add and subtract date values, calculate date value differences, extract date fields, obtain the current time, and convert date formats. |
Mathematical functions | Used to process data of a numeric type, such as BIGINT, DOUBLE, DECIMAL, or FLOAT. For example, you can use these functions to convert numeral systems, perform mathematical operations, round values, and obtain random numbers. |
Window functions | Used to process the data of columns in a window. For example, you can use these functions to calculate the sum, maximum value, minimum value, average value, and median value of column data, sort column data, obtain the data of columns at a given offset, and sample column data. |
Aggregate functions | Used to aggregate multiple input records into an output value. For example, you can use these functions to calculate the sum, maximum value, minimum value, and average value of data, aggregate parameters, and concatenate strings. |
String functions | Used to process data of the STRING type. For example, you can use these functions to truncate strings, replace strings, search for strings, convert uppercase and lowercase letters, and convert string formats. |
Complex type functions | Used to process data of the MAP, ARRAY, STRUCT, or JSON type. For example, you can use these functions to deduplicate, aggregate, sort, and merge elements. |
Other functions | Used to process data in other business scenarios. |
For more information about the mappings between the built-in functions of MaxCompute and the built-in functions of open source systems, see Mappings between built-in functions of MaxCompute and built-in functions of Hive, MySQL, and Oracle.
Usage notes
- For a built-in function, the types and number of input parameters and function format must meet the function syntax requirements. If the function syntax requirements are not met, MaxCompute cannot parse the built-in function and an error may occur when you execute the SQL statement in which the built-in function is called.
- If the input parameters of a built-in function are of a type that is supported by
the MaxCompute V2.0 data type edition, you must enable the MaxCompute V2.0 data type
edition. The data types supported by the MaxCompute V2.0 data type edition include
TINYINT, SMALLINT, INT, FLOAT, VARCHAR, TIMESTAMP, and BINARY. If you do not enable
the MaxCompute V2.0 data type edition, an error may occur when you execute the SQL
statement in which the built-in function is called. You can enable the MaxCompute
V2.0 data type edition at the session or project level.
- Session level: Add
set odps.sql.type.system.odps2=true;
before the SQL statement in which a built-in function is called. Then, commit and execute them together. This configuration is valid for only the current SQL statement. - Project level: The owner of a project can enable the MaxCompute V2.0 data type edition
for the project based on the project requirements. The configuration takes effect
after 10 to 15 minutes. This configuration is valid for all the subsequent SQL statements.
setproject odps.sql.type.system.odps2=true;
- Session level: Add
- If you enable the MaxCompute V2.0 data type edition for a project, some implicit conversions are disabled, such as the conversions from STRING to BIGINT, STRING to DATETIME, DOUBLE to BIGINT, DECIMAL to DOUBLE, and DECIMAL to BIGINT. This may cause a loss of precision or errors. In this case, you can use the CAST function to forcefully convert the data types to resolve these issues. You can also disable the MaxCompute V2.0 data type edition.
- If the name of a UDF is the same as that of a built-in function, the UDF is preferentially
called. For example, if UDF CONCAT and built-in function CONCAT both exist in MaxCompute,
the system automatically calls UDF CONCAT instead of the built-in function CONCAT.
If you want to call the built-in function, you must add the symbol
::
before the built-in function, for example,select ::concat('ab', 'c');
. - If the settings of global properties of MaxCompute projects are different, the execution
results of built-in functions may be different. You can run the
setproject;
command to configure the global properties of a MaxCompute project.