Search
Document Center

# MaxCompute:Overview

Last Updated:Jul 28, 2023

MaxCompute provides a large number of built-in functions to meet data processing requirements in most business scenarios. This topic describes the types of built-in functions that are provided by MaxCompute. This topic also describes how to use the built-in functions.

## Background information

For more information about the precautions you must take into account when you use built-in functions provided by MaxCompute, see Precautions.

The following table describes the types of built-in functions that are provided by MaxCompute.

 Type Description Date functions Used to process data of a date type, such as DATE, DATETIME, or TIMESTAMP. For example, you can use these functions to add and subtract date values, calculate date value differences, extract date fields, obtain the current time, and convert date formats. Mathematical functions Used to process data of a numeric type, such as BIGINT, DOUBLE, DECIMAL, or FLOAT. For example, you can use these functions to convert numeral systems, perform mathematical operations, round values, and obtain random numbers. Window functions Used to process the data of columns in a window. For example, you can use these functions to calculate the sum, maximum value, minimum value, average value, and median value of column data, sort column data, obtain the data of columns at a given offset, and sample column data. Aggregate functions Used to aggregate multiple input records into an output value. For example, you can use these functions to calculate the sum, maximum value, minimum value, and average value of data, aggregate parameters, and concatenate strings. String functions Used to process data of the STRING type. For example, you can use these functions to truncate strings, replace strings, search for strings, convert uppercase and lowercase letters, and convert string formats. Complex type functions Used to process data of the MAP, ARRAY, STRUCT, or JSON type. For example, you can use these functions to deduplicate, aggregate, sort, and merge elements. Encryption and decryption functions Used to encrypt and decrypt data of the STRING or BINARY type in a table. Other functions Used to process data in other business scenarios.

For more information about typical cases, error codes, and FAQs of built-in functions that are provided by MaxCompute, see Fix the precision issue of the ROUND function, Implement capabilities provided by the GROUP_CONCAT function, Common errors for built-in functions, and FAQ about built-in functions.

## Precautions

When you use built-in functions that are provided by MaxCompute, take note of the following items:

• For a built-in function, the types and number of input parameters and function format must meet the function syntax requirements. If the function syntax requirements are not met, MaxCompute cannot parse the built-in function and an error may occur when you execute the SQL statement in which the built-in function is called.

• If the input parameters of a built-in function are of a type that is supported by the MaxCompute V2.0 data type edition, you must enable the MaxCompute V2.0 data type edition. The data types supported by the MaxCompute V2.0 data type edition include TINYINT, SMALLINT, INT, FLOAT, VARCHAR, TIMESTAMP, and BINARY. If you do not enable the MaxCompute V2.0 data type edition, an error may occur when you execute the SQL statement in which the built-in function is called. You can enable the MaxCompute V2.0 data type edition at the session or project level.

• Session level: Add `set odps.sql.type.system.odps2=true;` before the SQL statement in which a built-in function is called. Then, commit and execute them together. This configuration is valid for only the current SQL statement.

• Project level: The owner of a project can enable the MaxCompute V2.0 data type edition for the project based on the project requirements. The configuration takes effect after 10 to 15 minutes. This configuration is valid for all the subsequent SQL statements.

``setproject odps.sql.type.system.odps2=true;``
• If you enable the MaxCompute V2.0 data type edition for a project, some implicit conversions are disabled, such as the conversions from STRING to BIGINT, STRING to DATETIME, DOUBLE to BIGINT, DECIMAL to DOUBLE, and DECIMAL to BIGINT. This may cause a loss of precision or errors. In this case, you can use the CAST function to forcefully convert the data types to resolve these issues. You can also disable the MaxCompute V2.0 data type edition.

• If the name of a UDF is the same as that of a built-in function, the UDF is preferentially called. For example, if UDF CONCAT and built-in function CONCAT both exist in MaxCompute, the system automatically calls UDF CONCAT instead of the built-in function CONCAT. If you want to call the built-in function, you must add the symbol `::` before the built-in function. For example, you can use `select ::concat('ab', 'c');`.

• If the settings of global properties of MaxCompute projects are different, the execution results of built-in functions may be different. You can run the `setproject;` command to configure the global properties of a MaxCompute project.

For more information about the mappings between the built-in functions of MaxCompute and the built-in functions of open source systems, see Mappings between built-in functions of MaxCompute and built-in functions of Hive, MySQL, and Oracle.

## Date functions

The following table describes date functions provided by MaxCompute SQL. You can select an appropriate date function based on your business requirements to complete date calculations and conversions.

 Function Description DATEADD Changes a date value based on the time unit specified by datepart and the interval specified by delta. DATE_ADD Adds or subtracts a number of days to or from a date value based on the interval specified by delta. The DATE_ADD function is the inverse of the `DATE_SUB` function. DATE_FORMAT Converts a date value into a string in a specified format. DATE_SUB Adds or subtracts a number of days to or from a date value based on the interval specified by delta. The DATE_SUB function is the inverse of the `DATE_ADD` function. DATEDIFF Calculates the difference between two date values based on the time unit specified by datepart. DATEPART Returns a specified component of a date value based on the time unit specified by datepart. DATETRUNC Truncates a date value based on the time unit specified by datepart. FROM_UNIXTIME Converts a UNIX timestamp of the BIGINT type into a date value of the DATETIME type. GETDATE Returns the current system time as a date value. ISDATE Determines whether a date string can be converted into a date value in a specified format. LASTDAY Returns the last day of the month in which a date value falls. TO_DATE Converts a string into a date value in a specified format. TO_CHAR Converts a date value into a string in a specified format. UNIX_TIMESTAMP Converts a date value into a UNIX timestamp that is an integer. WEEKDAY Returns a number that represents the day of the week in which a date value falls. WEEKOFYEAR Returns a number that represents the week of the year in which a date value falls. ADD_MONTHS Returns a date value that is obtained after a number of months are added to a specified date. CURRENT_TIMESTAMP Returns the current timestamp. CURRENT_TIMEZONE Returns the time zone of the current system. DAY Returns the day in which a date value falls. DAYOFMONTH Returns the day component of a date value. DAYOFWEEK Returns the day of the week in which a date value falls. DAYOFYEAR Returns the sequence number of the day in the year. EXTRACT Returns a specified component of a timestamp. FROM_UTC_TIMESTAMP Converts a UTC timestamp into a timestamp for a specified time zone. HOUR Returns the hour component of a date value. LAST_DAY Returns the last day of the month in which a date value falls. MINUTE Returns the minute component of a date value. MONTH Returns the month in which a date value falls. MONTHS_BETWEEN Returns the number of months between specified date values. NEXT_DAY Returns the date of the first weekday that is later than a date value and matches the specified week. QUARTER Returns the quarter in which a date value falls. SECOND Returns the second component of a date value. TO_MILLIS Converts a date value into a UNIX timestamp that is accurate to the millisecond. YEAR Returns the year in which a date value falls.

## Mathematical functions

The following table describes mathematical functions that are provided by MaxCompute SQL for you to use during development. You can select mathematical functions based on your business requirements to compute data or convert data types.

Note

For more information about operators, such as the operator that is used to calculate remainders, see Arithmetic operators.

 Function Description ABS Calculates the absolute value. ACOS Calculates the arccosine. ASIN Calculates the arcsine. ATAN Calculates the arctangent. ATAN2 Calculates the arctangent of expr1/expr2. CEIL Rounds up a number and returns the nearest integer. CONV Converts a number from one number system to another. COS Calculates the cosine. COSH Calculates the hyperbolic cosine. COT Calculates the cotangent. EXP Calculates the exponential value. FLOOR Rounds down a number and returns the nearest integer. ISNAN Checks whether the value of an expression is NaN. LN Calculates the natural logarithm. LOG Calculates the logarithm. NEGATIVE Returns the negative value of an expression. POSITIVE Returns the value of an expression. POW Calculates the nth power of a value. RAND Returns a random number. ROUND Returns a value rounded to the specified decimal place. SIN Calculates the sine. SINH Calculates the hyperbolic sine. SQRT Calculates the square root. TAN Calculates the tangent. TANH Calculates the hyperbolic tangent. TRUNC Truncates the input value to the specified decimal place. BIN Calculates the binary code. CBRT Calculates the cube root. CORR Calculates the Pearson correlation coefficient. DEGREES Converts a radian value into a degree. E Calculates the value of e. FACTORIAL Calculates the factorial. FORMAT_NUMBER Converts a number into a string in the specified format. HEX Converts an integer or a string into a hexadecimal number. LOG2 Calculates the logarithm of a number with the base number of 2. LOG10 Calculates the logarithm of a number with the base number of 10. PI Calculates the value of π. RADIANS Converts a degree into a radian value. SIGN Returns the sign of the input value. SHIFTLEFT Shifts a value left by a specific number of places. SHIFTRIGHT Shifts a value right by a specific number of places. SHIFTRIGHTUNSIGNED Shifts an unsigned value right by a specific number of places. UNHEX Converts a hexadecimal string into a string. WIDTH_BUCKET Returns the ID of the bucket into which the value of a specific expression falls.

## Window functions

The following table describes window functions that are provided by MaxCompute SQL for you to flexibly analyze and process data of specific columns in a window.

 Function Description ROW_NUMBER Calculates the sequence number of a row. The row number starts from 1. RANK Calculates the rank of a row in an ordered group of rows. The ranks may not be consecutive integers. DENSE_RANK Calculates the rank of a row in an ordered group of rows. The ranks are consecutive integers. PERCENT_RANK Calculates the percentile rank of a row in an ordered group of rows. CUME_DIST Calculates the cumulative distribution of data in a partition. NTILE Splits rows of data in a partition into N groups of equal size and returns the number of the group to which the current row belongs. The group number ranges from 1 to N. LAG Obtains the calculated result of the Nth row of data that precedes the current row at a given offset in a window. LEAD Obtains the calculated result of the Nth row of data that follows the current row at a given offset in a window. FIRST_VALUE Obtains the calculated result of the first row of data in the window to which the current row belongs. LAST_VALUE Obtains the calculated result of the last row of data in the window to which the current row belongs. NTH_VALUE Obtains the calculated result of the Nth row of data in a window to which the current row belongs. CLUSTER_SAMPLE Samples random rows of data. If true is returned, the specified row of data is sampled. COUNT Calculates the number of rows in a window. MIN Calculates the minimum value in a window. MAX Calculates the maximum value in a window. AVG Calculates the average value of data in a window. SUM Calculates the sum of data in a window. MEDIAN Calculates the median in a window. STDDEV Returns the population standard deviation of all input values. This function is also called STDDEV_POP. STDDEV_SAMP Returns the sample standard deviation of all input values.
• Syntax

Syntax of window functions:

``````<function_name>([distinct][<expression> [, ...]]) over (<window_definition>)
<function_name>([distinct][<expression> [, ...]]) over <window_name>``````
• function_name: the name of a built-in window function, aggregate function, or user-defined aggregate function (UDAF).

• expression: the format of a window function. The format is subject to the function syntax.

• window_name: the name of a window. You can use the `window` keyword to configure a window and use windowing_definition to specify the name of the window. Syntax of named_window_def:

``window <window_name> as (<window_definition>)``

Position of named_window_def in an SQL statement:

``select ... from ... [where ...] [group by ...] [having ...] named_window_def [order by ...] [limit ...]``
• windowing_definition

Syntax

``````--partition_clause:
[partition by <expression> [, ...]]
--orderby_clause:
[order by <expression> [asc|desc][nulls {first|last}] [, ...]]
[<frame_clause>]``````

If you use a window function in a SELECT statement, data is partitioned and sorted based on `PARTITION BY` and `ORDER BY` in windowing_definition when the window function is executed. If the SELECT statement does not include `PARTITION BY`, only one partition exists and the partition contains all data. If the SELECT statement does not include `ORDER BY`, data in a partition is arranged in a random order, and a data stream is generated. After the data stream is generated, a group of rows is extracted from the data stream based on `frame_clause` in windowing_definition to create a window for the current row. The window function calculates the data included in the window to which the current row belongs.

• partition by <expression> [, ...]: optional. This parameter specifies the partition information. If the values of partition key columns are the same for a group of rows, these rows are included in the same window. For more information about the format of PARTITION BY, see Table operations.

• order by <expression> [asc|desc][nulls {first|last}] [, ...]: optional. This parameter specifies how to sort rows of data in a window.

Note

If the values of the column that is specified in `order by` are the same, the sorting result may not be accurate. To reduce the random ordering of data, make sure that the values of the column that is specified in `order by` are unique.

• frame_clause: optional. This parameter is used to determine the data boundaries of a window. The frame_clause section in this topic provides details about this parameter.

• frame_clause

Syntax

``````-- Syntax 1
{ROWS|RANGE|GROUPS} <frame_start> [<frame_exclusion>]
-- Syntax 2
{ROWS|RANGE|GROUPS} between <frame_start> and <frame_end> [<frame_exclusion>]``````

frame_clause is a closed interval that is used to determine the data boundaries of a window. The data boundaries are determined based on the rows that are specified by frame_start and frame_end.

• ROWS|RANGE|GROUPS: required. ROWS, RANGE, and GROUPS indicate the types of frame_clause. The implementation rules of frame_start and frame_end vary based on the type of frame_clause. Take note of the following points:

• ROWS: The data boundaries of a window are determined based on the number of rows.

• RANGE: The data boundaries of a window are determined based on the comparison results of the values of the column that is specified in `order by`. In most cases, `order by` is specified in windowing_definition. If `order by` is not specified in windowing_definition, the values of the column that is specified in `order by` are the same for all rows in a partition. NULL values are considered equivalent.

• GROUPS: In a partition, rows that have the same value of the column specified in `order by` form a group. If `order by` is not specified, all rows in the partition form a group. NULL values are considered equivalent.

• frame_start and frame_end: the start and end rows of a window. frame_start is required. frame_end is optional. If frame_end is not specified, the default value CURRENT ROW is used.

The row specified by frame_start must precede or be the same as the row specified by frame_end. Compared with the row specified by frame_end, the row specified by frame_start is closer to the first row in a window after all data in the window is sorted based on the column that is specified in `order by` of windowing_definition. The following table describes the valid values and logic of frame_start and frame_end when the type of frame_clause is ROWS, RANGE, or GROUPS.

 frame_clause type Value of frame_start or frame_end Description ROWS, RANGE, and GROUPS UNBOUNDED PRECEDING Indicates the first row of a partition. Rows are counted from 1. UNBOUNDED FOLLOWING Indicates the last row of a partition. ROWS CURRENT ROW Indicates the current row. Each row of data corresponds to a result calculated by a window function. The current row indicates the row whose data is calculated by using a window function. offset PRECEDING Indicates the Nth row that precedes the current row at a given `offset`. For example, if `0 PRECEDING` indicates the current row, `1 PRECEDING` indicates the previous row. `offset` must be a non-negative integer. offset FOLLOWING Indicates the Nth row that follows the current row at a given `offset`. For example, if `0 FOLLOWING` indicates the current row, `1 FOLLOWING` indicates the next row. `offset` must be a non-negative integer. RANGE CURRENT ROW If frame_start is set to CURRENT ROW, it indicates the first row that has the same value of the column specified in `order by` as the current row. If frame_end is set to CURRENT ROW, it indicates the last row that has the same value of the column specified in `order by` as the current row. offset PRECEDING The rows that are specified by frame_start and frame_end are determined based on the sorting order that is specified by `order by`. For example, if data in a window is sorted by X, Xi indicates the X value that corresponds to the ith row, and Xc indicates the X value that corresponds to the current row. Positions of rows specified by frame_start and frame_end:`order by` is set to asc:frame_start indicates the first row that meets the following requirement: `Xc - Xi ≤ offset`. frame_end indicates the last row that meets the following requirement: `Xc - Xi ≥ offset`. `order by` is set to desc:frame_start indicates the first row that meets the following requirement: `Xi - Xc ≤ offset`. frame_end indicates the last row that meets the following requirement: `Xi - Xc ≥ offset`. The column that is specified in `order by` can be of the following data types: TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATETIME, DATE, and TIMESTAMP. Syntax for `offset` of the DATE type:`N`: indicates N days or N seconds. It must be a non-negative integer. For an offset of the DATETIME or TIMESTAMP type, it indicates N seconds. For an offset of the DATE type, it indicates N days. `interval 'N' {YEAR\MONTH\DAY\HOUR\MINUTE\SECOND}`: indicates N years, months, days, hours, minutes, or seconds. For example, `INTERVAL '3' YEAR` indicates 3 years. `INTERVAL 'N-M' YEAR TO MONTH`: indicates N years and M months. For example, `INTERVAL '1-3' YEAR TO MONTH` indicates 1 year and 3 months. `INTERVAL 'D[ H[:M[:S[:N]]]]' DAY TO SECOND`: indicates D days, H hours, M minutes, S seconds, and N nanoseconds. For example, `INTERVAL '1 2:3:4:5' DAY TO SECOND` indicates 1 day, 2 hours, 3 minutes, 4 seconds, and 5 nanoseconds. offset FOLLOWING The rows that are specified by frame_start and frame_end are determined based on the sorting order that is specified by `order by`. For example, if data in a window is sorted by X, Xi indicates the X value that corresponds to the ith row, and Xc indicates the X value that corresponds to the current row. Positions of rows specified by frame_start and frame_end:`order by` is set to asc:frame_start indicates the first row that meets the following requirement: `Xi - Xc ≥ offset`. frame_end indicates the last row that meets the following requirement: `Xi - Xc ≤ offset`. `order by` is set to desc:frame_start indicates the first row that meets the following requirement: `Xc - Xi >= offset`. frame_end indicates the last row that meets the following requirement: `Xc - Xi <= offset`. GROUPS CURRENT ROW If frame_start is set to CURRENT ROW, it indicates the first row of the group to which the current row belongs. If frame_end is set to CURRENT ROW, it indicates the last row of the group to which the current row belongs. offset PRECEDING If frame_start is set to offset PRECEDING, it indicates the first row of the Nth group that precedes the group of the current row at a given `offset`. If frame_end is set to offset PRECEDING, it indicates the last row of the Nth group that precedes the group of the current row at a given `offset`. Note You cannot set frame_start to UNBOUNDED FOLLOWING, and you cannot set frame_end to UNBOUNED PRECEDING. offset FOLLOWING If frame_start is set to offset FOLLOWING, it indicates the first row of the Nth group that follows the group of the current row at a given `offset`. If frame_end is set to offset FOLLOWING, it indicates the last row of the Nth group that follows the group of the current row at a given `offset`. Note You cannot set frame_start to UNBOUNDED FOLLOWING, and you cannot set frame_end to UNBOUNED PRECEDING.
• frame_exclusion: optional. This parameter is used to remove specific rows from a window. Valid values:

• EXCLUDE NO OTHERS: No rows are excluded from the window.

• EXCLUDE CURRENT ROW: The current row is excluded from the window.

• EXCLUDE GROUP: indicates that an entire group of rows in a partition is excluded from the window. In the group, all rows have the same value of the column that is specified in `order by` as the current row.

• EXCLUDE TIES: An entire group of rows, except for the current row, are excluded from the window.

Default frame_clause

If you do not specify frame_clause, MaxCompute uses the default frame_clause to determine the data boundaries of a window. Values of the default frame_clause:

• If `odps.sql.hive.compatible` is set to true, the following default frame_clause is used. This rule applies to most SQL systems.

``RANGE between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE NO OTHERS``
• If `odps.sql.hive.compatible` is set to false, `order by` is specified in windowing_definition, and one of the following window functions is used, the default frame_clause in ROWS mode is used: AVG, COUNT, MAX, MIN, STDDEV, STEDEV_POP, STDDEV_SAMP, and SUM.

``ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE NO OTHERS``

Example of data boundaries of a window

In this example, a table named tbl contains three columns that are of the BIGINT type: `pid, oid, and rid`. The tbl table contains the following data:

``````+------------+------------+------------+
| pid        | oid        | rid        |
+------------+------------+------------+
| 1          | NULL       | 1          |
| 1          | NULL       | 2          |
| 1          | 1          | 3          |
| 1          | 1          | 4          |
| 1          | 2          | 5          |
| 1          | 4          | 6          |
| 1          | 7          | 7          |
| 1          | 11         | 8          |
| 2          | NULL       | 9          |
| 2          | NULL       | 10         |
+------------+------------+------------+``````

You can replace ellipses (`...`) in the following SQL statements with windowing_definition to display the data in the windows in which each row of data is included.

Note

If a value in the window column in the returned result is NULL, no data is contained in the window.

• Windows in ROWS mode

• windowing_definition 1

``````partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1]    |
| 1          | NULL       | 2          | [1, 2] |
| 1          | 1          | 3          | [1, 2, 3] |
| 1          | 1          | 4          | [1, 2, 3, 4] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5, 6] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6, 7] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 2          | NULL       | 9          | [9]    |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+``````
• windowing_definition 2

``````partition by pid order by oid ROWS between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid ROWS between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | NULL       | 2          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 1          | 3          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 1          | 4          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+``````
• windowing_definition 3

``````partition by pid order by oid ROWS between 1 FOLLOWING and 3 FOLLOWING
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid ROWS between 1 FOLLOWING and 3 FOLLOWING) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [2, 3, 4] |
| 1          | NULL       | 2          | [3, 4, 5] |
| 1          | 1          | 3          | [4, 5, 6] |
| 1          | 1          | 4          | [5, 6, 7] |
| 1          | 2          | 5          | [6, 7, 8] |
| 1          | 4          | 6          | [7, 8] |
| 1          | 7          | 7          | [8]    |
| 1          | 11         | 8          | NULL   |
| 2          | NULL       | 9          | [10]   |
| 2          | NULL       | 10         | NULL   |
+------------+------------+------------+--------+``````
• windowing_definition 4

``````partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE CURRENT ROW
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE CURRENT ROW) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | NULL   |
| 1          | NULL       | 2          | [1]    |
| 1          | 1          | 3          | [1, 2] |
| 1          | 1          | 4          | [1, 2, 3] |
| 1          | 2          | 5          | [1, 2, 3, 4] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7] |
| 2          | NULL       | 9          | NULL   |
| 2          | NULL       | 10         | [9]    |
+------------+------------+------------+--------+``````
• windowing_definition 5

``````partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE GROUP
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE GROUP) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | NULL   |
| 1          | NULL       | 2          | NULL   |
| 1          | 1          | 3          | [1, 2] |
| 1          | 1          | 4          | [1, 2] |
| 1          | 2          | 5          | [1, 2, 3, 4] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7] |
| 2          | NULL       | 9          | NULL   |
| 2          | NULL       | 10         | NULL   |
+------------+------------+------------+--------+``````
• windowing_definition 6

``````partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE TIES
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE TIES) as window from tbl;                            ``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1]    |
| 1          | NULL       | 2          | [2]    |
| 1          | 1          | 3          | [1, 2, 3] |
| 1          | 1          | 4          | [1, 2, 4] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5, 6] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6, 7] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 2          | NULL       | 9          | [9]    |
| 2          | NULL       | 10         | [10]   |
+------------+------------+------------+--------+``````

The differences between EXCLUDE CURRENT ROW and EXCLUDE GROUP can be obtained based on the comparison between the `window` column values of rows with the `rid` column values of 2, 4, and 10 in windowing_definition 5 and windowing_definition 6. If frame_exclusion is set to EXCLUDE GROUP, the rows that have the same `pid` column value in a partition are extracted when the rows have the same `oid` column value as the current row.

• Windows in RANGE mode

• windowing_definition 1

``````partition by pid order by oid RANGE between UNBOUNDED PRECEDING and CURRENT ROW
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid RANGE between UNBOUNDED PRECEDING and CURRENT ROW) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2] |
| 1          | NULL       | 2          | [1, 2] |
| 1          | 1          | 3          | [1, 2, 3, 4] |
| 1          | 1          | 4          | [1, 2, 3, 4] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5] |
| 1          | 4          | 6          | [1, 2, 3, 4, 5, 6] |
| 1          | 7          | 7          | [1, 2, 3, 4, 5, 6, 7] |
| 1          | 11         | 8          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+``````

If frame_end is set to CURRENT ROW, the last row that has the same value of the `oid` column in `order by` as the current row is obtained. Therefore, the `window` column value of the row whose `rid` column value is 1 is [1, 2].

• windowing_definition 2

``````partition by pid order by oid RANGE between CURRENT ROW and UNBOUNDED FOLLOWING
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid RANGE between CURRENT ROW and UNBOUNDED FOLLOWING) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | NULL       | 2          | [1, 2, 3, 4, 5, 6, 7, 8] |
| 1          | 1          | 3          | [3, 4, 5, 6, 7, 8] |
| 1          | 1          | 4          | [3, 4, 5, 6, 7, 8] |
| 1          | 2          | 5          | [5, 6, 7, 8] |
| 1          | 4          | 6          | [6, 7, 8] |
| 1          | 7          | 7          | [7, 8] |
| 1          | 11         | 8          | [8]    |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+``````
• windowing_definition 3

``````partition by pid order by oid RANGE between 3 PRECEDING and 1 PRECEDING
-- Sample SQL statement:

select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid RANGE between 3 PRECEDING and 1 PRECEDING) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2] |
| 1          | NULL       | 2          | [1, 2] |
| 1          | 1          | 3          | NULL   |
| 1          | 1          | 4          | NULL   |
| 1          | 2          | 5          | [3, 4] |
| 1          | 4          | 6          | [3, 4, 5] |
| 1          | 7          | 7          | [6]    |
| 1          | 11         | 8          | NULL   |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+``````

For the row whose value of `oid` in `order by` is NULL, if frame_start is set to `offset PRECEDING` or offset FOLLOWING, the row is the first row whose value of oid in `order by` is NULL only if the `offset` is not unbounded. If frame_end is set to offset PRECEDING or offset FOLLOWING, the row is the last row whose value of oid in `order by` is NULL only if the offset is not unbounded.

• Windows in GROUPS mode

windowing_definition

``````partition by pid order by oid GROUPS between 2 PRECEDING and CURRENT ROW
-- Sample SQL statement:
select pid,
oid,
rid,
collect_list(rid) over(partition by pid order by
oid GROUPS between 2 PRECEDING and CURRENT ROW) as window from tbl;``````

The following result is returned:

``````+------------+------------+------------+--------+
| pid        | oid        | rid        | window |
+------------+------------+------------+--------+
| 1          | NULL       | 1          | [1, 2] |
| 1          | NULL       | 2          | [1, 2] |
| 1          | 1          | 3          | [1, 2, 3, 4] |
| 1          | 1          | 4          | [1, 2, 3, 4] |
| 1          | 2          | 5          | [1, 2, 3, 4, 5] |
| 1          | 4          | 6          | [3, 4, 5, 6] |
| 1          | 7          | 7          | [5, 6, 7] |
| 1          | 11         | 8          | [6, 7, 8] |
| 2          | NULL       | 9          | [9, 10] |
| 2          | NULL       | 10         | [9, 10] |
+------------+------------+------------+--------+``````

## Aggregate functions

The following table describes the aggregate functions that are supported by MaxCompute SQL.

 Function Description AVG Returns the average value of a column. COUNT Returns the number of records that match the specified criteria. COUNT_IF Returns the number of records whose expr value is True. MAX Returns the maximum value of a column. MIN Returns the minimum value of a column. MEDIAN Returns the median value of a column. STDDEV Returns the population standard deviation of all input values. STDDEV_SAMP Returns the sample standard deviation of all input values. SUM Returns the sum of a column. WM_CONCAT Concatenates strings with a specified delimiter. ANY_VALUE Returns a random value from a specified column. APPROX_DISTINCT Returns the approximate number of distinct input values in a specified column. ARG_MAX Returns the column value of the row that corresponds to the maximum value of a specified column. ARG_MIN Returns the column value of the row that corresponds to the minimum value of a specified column. MAX_BY Returns the column value of the row that corresponds to the maximum value of a specified column. MIN_BY Returns the column value of the row that corresponds to the minimum value of a specified column. COLLECT_LIST Aggregates values from a specified column into an array. COLLECT_SET Aggregates only distinct values from a specified column into an array. NUMERIC_HISTOGRAM Returns the approximate histogram of a specified column. PERCENTILE_APPROX Returns approximate percentiles. This function applies to scenarios in which a large amount of data is calculated. BITWISE_OR_AGG Aggregates input values based on the bitwise OR operation. BITWISE_AND_AGG Aggregates input values based on the bitwise AND operation. MAP_AGG Returns a map that is created by using a and b. a is the key in the map. b is the value of the key in the map. MULTIMAP_AGG Returns a map that is created by using a and b. a is the key in the map. b is used to create an array, which is used as the value of the key in the map. MAP_UNION Returns a new map that is the union of all input maps. MAP_UNION_SUM Returns a new map that is the union of all input maps. The output map sums the values of the matching keys in all input maps. HISTOGRAM Returns a map that contains the number of times each input value appears.
• Syntax

Syntax of an aggregate function:

``<aggregate_name>(<expression>[,..]) [within group (order by <col1>[,<col2>…])] [filter (where <where_condition>)]``
• `<aggregate_name>(<expression>[,..])`: a built-in aggregate function or a UDAF. The format of an aggregate function is based on its syntax.

• `within group (order by <col1>[,<col2>…])`: If the syntax of an aggregate function includes this expression, the system automatically sorts the input data of `<col1>[,<col2>…]` in ascending order. To sort the input data in descending order, use the expression `within group (order by <col1>[,<col2>…] [desc])`.

Before you use this expression, take note of the following points:

• You can use this expression only for WM_CONCAT, COLLECT_LIST, COLLECT_SET, and UDAFs.

• If multiple aggregate functions of a SELECT statement include the expression `within group (order by <col1>[,<col2>…])`, `order by <col1>[,<col2>…]` must be the same for these functions.

• If the parameters of an aggregate function include the DISTINCT keyword, columns with distinct values must be specified in the expression `order by <col1>[,<col2>…]`.

Examples:

``````-- Example 1: Sort the input data in ascending order and return the output data.
select
x,
wm_concat(',', y) within group (order by y)
from values('k', 1),('k', 3),('k', 2) as t(x, y)
group by x;
-- The following result is returned:
+------------+------------+
| x          | _c1        |
+------------+------------+
| k          | 1,2,3      |
+------------+------------+

-- Example 2: Sort the input data in descending order and return the output data.
select
x,
wm_concat(',', y) within group (order by y desc)
from values('k', 1),('k', 3),('k', 2) as t(x, y)
group by x;
-- The following result is returned:
+------------+------------+
| x          | _c1        |
+------------+------------+
| k          | 3,2,1      |
+------------+------------+``````
• `[filter (where <where_condition>)]`: If an aggregate function includes this expression, the aggregate function processes only the data that meets the condition specified by `<where_condition>`. For more information about `<where_condition>`, see WHERE clause (where_condition).

Before you use this expression, take note of the following points:

• Only built-in aggregate functions support this expression. UDAFs do not support this expression.

• `count(*)` does not support the `[filter (where <where_condition>)]` expression. To add filter conditions to `count(*)`, you can use the COUNT_IF function.

• The COUNT_IF function does not support this expression.

Examples:

``````-- Example 1: Filter and aggregate data.
select
sum(x),
sum(x) filter (where y > 1),
sum(x) filter (where y > 2)
from values(null, 1),(1, 2),(2, 3),(3, null) as t(x, y);
-- The following result is returned:
+------------+------------+------------+
| _c0        | _c1        | _c2        |
+------------+------------+------------+
| 6          | 3          | 2          |
+------------+------------+------------+

-- Example 2: Use multiple aggregate functions to filter and aggregate data.
select
count_if(x > 2),
sum(x) filter (where y > 1),
sum(x) filter (where y > 2)
from values(null, 1),(1, 2),(2, 3),(3, null) as t(x, y);
-- The following result is returned:
+------------+------------+------------+
| _c0        | _c1        | _c2        |
+------------+------------+------------+
| 1          | 3          | 2          |
+------------+------------+------------+``````
• Filter expressions

• Limits

• Only built-in aggregate functions of MaxCompute support filter expressions. UDAFs do not support filter expressions.

• `COUNT(*)` cannot be used with filter expressions. Use the COUNT_IF function with filter expressions.

• Syntax

``<aggregate_name>(<expression>[,...]) [filter (where <where_condition>)]``
• Description

All aggregate functions support filter expressions. If you specify a filter condition, only the row data that meets the filter condition can be passed to the related aggregate function for data processing.

• Parameters

• aggregate_name: required. The name of the aggregate function. Select an aggregate function that is described in this topic based on your business requirements.

• expression: required. The parameters of the aggregate function that you select. Specify this parameter based on the description of the aggregate function that you select.

• Return value

For more information, see the description of the return value for each aggregate function.

• Sample statement:

``select sum(sal) filter (where deptno=10), sum(sal) filter (where deptno=20), sum(sal) filter (where deptno=30) from emp;``

The following result is returned:

``````+------------+------------+------------+
| _c0        | _c1        | _c2        |
+------------+------------+------------+
| 17500      | 10875      | 9400       |
+------------+------------+------------+``````

## String functions

The following table describes the string functions that are supported by MaxCompute SQL. For more information about the limits on string functions, see Limits on string functions.

 Function Description ASCII Returns the ASCII code of the first character in a specified string. CHAR_MATCHCOUNT Calculates the number of characters of String A that appear in String B. CHR Converts an ASCII code into a character. CONCAT Concatenates all the specified strings and returns the final string. CONCAT_WS Concatenates all input strings in an array by using a specified delimiter. DECODE Decodes a string in the specified encoding format. ENCODE Encodes a string in the specified encoding format. FIND_IN_SET Returns the position of the specified string among multiple strings that are separated by commas (,). FORMAT_NUMBER Converts a number into a string in the specified format. FROM_JSON Returns data of the ARRAY, MAP, or STRUCT type based on a given JSON string and a given output format. GET_JSON_OBJECT Extracts a single string from a standard JSON string by using a specific method. INITCAP Converts a string into a string with words in title caps and separated by spaces. In title caps, the first letter of each word is capitalized, and the other letters are in lowercase. INSTR Returns the position of String A in String B. IS_ENCODING Determines whether a string can be converted from one character set to another character set. KEYVALUE Splits a string into key-value pairs, separates the key-value pairs, and then returns the value that corresponds to a specified key. KEYVALUE_TUPLE Splits a string into key-value pairs, separates the key-value pairs, and then returns the values that correspond to the keys. LENGTH Calculates the length of a string. LENGTHB Calculates the length of a string in bytes. LOCATE Returns the position of a specified string in another string. LTRIM Removes the characters from the left side of a string. MD5 Returns the MD5 value of a string. PARSE_URL Parses a URL and returns the specified part of the URL. PARSE_URL_TUPLE Parses a URL and returns multiple parts of the URL. REGEXP_COUNT Returns the number of substrings that match a specified pattern from a specified position. REGEXP_EXTRACT Splits a string into groups based on a specified pattern and returns the string in a specified group. REGEXP_INSTR Returns the start or end position of a substring that starts at a specified position and matches a specified pattern for a specified number of times. REGEXP_REPLACE Uses a string to replace a substring of another string if the substring matches a specified pattern for a specified number of times. REGEXP_SUBSTR Returns a substring in a string that matches a specified pattern for a specified number of times from a specified position. REPEAT Returns a string that repeats a specified string for a specified number of times. REVERSE Returns a string in reverse order. RTRIM Removes the characters from the right side of a string. SPACE Generates a space string. SPLIT_PART Uses a delimiter to split a string into substrings and returns a substring of the specified part of the string. SUBSTR Returns a substring that has a specified length from a specified position of a string. The string is of the STRING type. SUBSTRING Returns a substring that has a specified length from a specified position of a string. The string is of the STRING or BINARY type. TO_CHAR Converts data of the BOOLEAN, BIGINT, DECIMAL, or DOUBLE type into the STRING type. TO_JSON Converts data of a complex data type into a JSON string. TOLOWER Converts uppercase letters in a string into lowercase letters. TOUPPER Converts lowercase letters in a string into uppercase letters. TRIM Removes the characters from both of the left and right sides of a string. URL_DECODE Converts an input string that is in the `application/x-www-form-urlencoded MIME` format into a standard string. URL_ENCODE Encodes the input string in the `application/x-www-form-urlencoded MIME` format and returns the encoded string. JSON_TUPLE Extracts strings from a standard JSON string based on a set of input keys. LPAD Left pads a string to a specified length. RPAD Right pads a string to a specified length. REPLACE Replaces a substring in String A that matches String B with another substring. SOUNDEX Converts a string into a string of the SOUNDEX type. SUBSTRING_INDEX Truncates a string from a specified delimiter. TRANSLATE Replaces the part of String A that appears in String B with String C. REGEXP_EXTRACT_ALL Finds all substrings that match the pattern of a regular expression in a string and returns the substrings as an array.

## Complex type functions

The following table describes the complex type functions that are supported by MaxCompute SQL. For more information about the limits on JSON functions, see Limits on JSON functions.

 Function type Function Description ARRAY functions ALL_MATCH Checks whether all elements in an array meet a specific condition. ANY_MATCH Checks whether an element in an array meets a specific condition. ARRAY Creates an array based on given values. ARRAY_CONTAINS Checks whether an array contains a given value. ARRAY_DISTINCT Removes duplicate elements from an array. ARRAY_EXCEPT Finds the elements that exist in Array A but do not exist in Array B and returns the elements as a new array without duplicates. ARRAY_INTERSECT Calculates the intersection of two arrays. ARRAY_JOIN Concatenates the elements in an array by using a delimiter. ARRAY_MAX Returns the largest element in an array. ARRAY_MIN Returns the smallest element in an array. ARRAY_POSITION Returns the position of the first occurrence of a given element in an array. ARRAY_REDUCE Aggregates the elements in an array. ARRAY_REMOVE Removes a given element from an array. ARRAY_REPEAT Returns a new array in which a given element is repeated for a specified number of times. ARRAY_SORT Sorts the elements in an array based on a comparator. ARRAY_UNION Calculates the union of two arrays and returns the union as a new array without duplicates. ARRAYS_OVERLAP Checks whether two arrays contain the same element. ARRAYS_ZIP Merges multiple arrays. CONCAT Concatenates multiple arrays or strings. EXPLODE Transposes one row of data into multiple rows. This function is a user-defined table-valued function (UDTF). FILTER Filters the elements in an array. INDEX Returns the element at a specific position in an array. POSEXPLODE Converts an array into a table that has two columns. The first column lists the position of each element in the array, starting from 0. The second column lists the elements. SIZE Returns the number of elements in an array. SLICE Copies the elements in an array from a specific position based on a specific length and returns the elements as a new array. SORT_ARRAY Sorts the elements in an array. TRANSFORM Transforms the elements in an array. ZIP_WITH Merges two arrays at the element level based on element positions and returns a new array. MAP functions EXPLODE Transposes one row of data into multiple rows. This function is a UDTF. INDEX Returns the value that meets a specific condition in a map. MAP Creates a map based on given key-value pairs. MAP_CONCAT Returns the union of multiple maps. MAP_ENTRIES Converts key-value pairs in a map into a struct array. MAP_FILTER Filters the elements in a map. MAP_FROM_ARRAYS Creates a map based on given arrays. MAP_FROM_ENTRIES Creates a map based on given struct arrays. MAP_KEYS Returns all keys in a map as an array. MAP_VALUES Returns all values in a map as an array. MAP_ZIP_WITH Merges two given maps into a single map. SIZE Returns the number of key-value pairs in a map. TRANSFORM_KEYS Transforms the keys in a map by using a given function. The values in the map are not changed. TRANSFORM_VALUES Transforms the values in a map by using a given function. The keys in the map are not changed. STRUCT functions FIELD Obtains the value of a member variable in a struct. INLINE Expands a given struct array. Each array element corresponds to a row and each struct element corresponds to a column in each row. STRUCT Creates a struct based on a given value list. NAMED_STRUCT Creates a struct based on given name-value pairs. JSON functions FROM_JSON Returns data of the ARRAY, MAP, or STRUCT type based on a given JSON string and a given output format. GET_JSON_OBJECT Extracts a single string from a standard JSON string by using a specific method. JSON_TUPLE Extracts strings from a standard JSON string based on a set of input keys. TO_JSON Converts data of a complex data type into a JSON string. JSON_OBJECT Returns a JSON object that contains key-value pairs. JSON_ARRAY Evaluates a possibly empty list of values and returns a JSON array that contains these values. JSON_EXTRACT Parses the value of json_path in a JSON expression. Note that an error is returned if the value of json_path is invalid. JSON_EXISTS Determines whether the JSON value of json_path exists. JSON_PRETTY Returns a JSON value in a format that is easy to read by adding line breaks and spaces. JSON_TYPE Returns the data type of a JSON value. JSON_FORMAT Converts a value of the JSON data type into a value of the STRING data type. By default, JSON data is not automatically prettified. JSON_PARSE Converts a value of the STRING data type into a value of the JSON data type. If a non-JSON-formatted value is converted into a value of the STRING data type, an error is returned. JSON_VALID Determines whether a string is in a valid JSON format. CAST Supports conversion between basic data types and JSON data types.

## Encryption and decryption functions

The following table describes the encryption function and decryption function that are supported by MaxCompute SQL.

 Function Description SYM_ENCRYPT Encrypts data in specified columns of a table by using a random key and returns the ciphertext of the BINARY type. SYM_DECRYPT Decrypts the encrypted data in the specified columns and returns the plaintext of the BINARY type.

## Other functions

The following table describes other types of functions that are supported by MaxCompute SQL.

 Function Description BASE64 Converts a binary value into a Base64-encoded string. BETWEEN AND Returns the values that fall in or fall out of the specified range. CASE WHEN Returns values based on the computing result of an expression. CAST Converts the result of an expression into the specified data type. COALESCE Returns the first non-null value in the parameter list. COMPRESS Uses the GZIP algorithm to compress input parameters of the STRING or BINARY type. CRC32 Calculates the cyclic redundancy check (CRC) value of a value that is of the STRING or BINARY type. DECODE Implements the `IF-THEN-ELSE` logic. DECOMPRESS Uses the GZIP algorithm to decompress input parameters of the BINARY type. GET_IDCARD_AGE Returns an age in years based on the ID card number. GET_IDCARD_BIRTHDAY Returns the date of birth based on the ID card number. GET_IDCARD_SEX Returns the gender based on the ID card number. GET_USER_ID Obtains the ID of the current account. GREATEST Returns the maximum value of the input parameters. HASH Calculates a hash value based on the input parameters. IF Checks whether a specified condition is true. LEAST Returns the minimum value of the input parameters. MAX_PT Returns the name of the largest level-1 partition in a partitioned table. NULLIF Checks whether the values of two input parameters are the same. NVL Specifies the return values of the parameters whose values are null. ORDINAL Sorts the values of the input variables in ascending order and returns the value that is ranked at a specified position. PARTITION_EXISTS Checks whether a specified partition exists in a table. SAMPLE Samples all column values that are read and filters out the rows that do not meet sampling conditions. SHA Calculates the SHA-1 hash value of a value that is of the STRING or BINARY type. SHA1 Calculates the SHA-1 hash value of a value that is of the STRING or BINARY type. SHA2 Calculates the SHA-2 hash value of a value that is of the STRING or BINARY type. SIGN Determines the sign of a value. The sign indicates whether a value is positive or negative. SPLIT Splits a string with a specified delimiter and returns an array. STACK Splits a specified parameter group into a specified number of rows. STR_TO_MAP Splits a string with a specified delimiter and returns a key-value pair. TABLE_EXISTS Checks whether a specified table exists. TRANS_ARRAY Transposes one row of data into multiple rows. This function is a UDTF that transposes an array separated by fixed delimiters in a column into multiple rows. TRANS_COLS Transposes one row of data into multiple rows. This function is a UDTF that transposes columns into rows. UNBASE64 Converts a Base64-encoded string into a binary value. UNIQUE_ID Returns a unique ID. This function is more efficient than the UUID function. UUID Returns a random ID.

## Limits on JSON functions

• The development tools that are supported for the JSON type include odpscmd and MaxCompute Studio. External ecosystems such as DataWorks and Dataphin are not supported. If you need to use the JSON type together with an external ecosystem, you must check whether the external ecosystem is supported for the JSON type. The following table describes the items that you must take note of when you use odpscmd and MaxCompute Studio as development tools for the JSON type.

 odpscmd MaxCompute Studio You must upgrade the client to the latest version. Otherwise, the `desc json_table` command cannot be used. You must set the `se_instance_tunnel` parameter in the conf\odps_config.ini file in the installation path of the client to false. Otherwise, an error is reported when you perform a query. MaxCompute Studio allows you only to query JSON data. It does not allow you to upload or download JSON data.
• If you want to use another type of engine, such as Hologres, to read data from a MaxCompute table, JSON data in the table cannot be read.

• Columns of the JSON type cannot be added to a MaxCompute table.

• You are not allowed to compare data of the JSON type with data of other types, execute SQL statements that contain the `ORDER BY` or `GROUP BY` clause on data of the JSON type, or use columns of the JSON type as `join` keys.

• The integer part and decimal part of a value of the JSON NUMBER type are stored by using the BIGINT type and the DOUBLE type separately. If the integer part exceeds the range that is supported by the BIGINT type, an integer overflow error occurs. Conversion of the decimal part to the DOUBLE type leads to precision loss.

• Strings that can be used to generate JSON data do not include `\u0000`, which is the Unicode character for NULL.

• Java UDFs and Python UDFs do not support the JSON type.

## Limits on string functions

The following functions support only English characters:

• TRIM, RTRIM, and LTRIM: The value of the trimChars parameter can contain only English characters.

• REVERSE: This function supports only English characters in the Hive-compatible data type edition.

• SOUNDEX: This function supports only English characters.

• TOLOWER: This function is used to convert English characters in a string into lowercase characters.

• TOUPPER: This function is used to convert English characters in a string into uppercase characters.

• INITCAP: This function is used to convert the first letter of each word in English in a string into an uppercase letter and the other letters of each word into lowercase characters.