All Products
Search
Document Center

MaxCompute:Other functions

Last Updated:Jun 26, 2026

MaxCompute SQL provides other common functions for development. You can select the appropriate function as needed. This topic describes the command format, parameters, and examples for functions such as CAST, FAILIF, and HASH that are supported by MaxCompute SQL.

Function

Features

BETWEEN AND expression

Filters data that meets the specified range condition.

CASE WHEN expression

Returns different values based on the result of an expression.

CAST

Converts the result of an expression to a target data type.

COALESCE

Returns the first non-NULL value in a parameter list.

COMPRESS

Compresses a STRING or BINARY input parameter using the GZIP algorithm.

CRC32

Calculates the cyclic redundancy check (CRC) value of a string or binary data.

DECOMPRESS

Decompresses a BINARY input parameter using the GZIP algorithm.

FAILIF

Returns true or a custom error message based on the result of an expression.

GET_IDCARD_AGE

Returns the current age based on a Chinese ID card number.

GET_IDCARD_BIRTHDAY

Returns the date of birth based on a Chinese ID card number.

GET_IDCARD_SEX

Returns the sex based on a Chinese ID card number.

GET_USER_ID

Obtains the ID of the current account.

HASH

Calculates a hash value based on input parameters.

IF

Checks whether a specified condition is true.

MAX_PT

Returns the maximum value of the level-1 partition in a partitioned table.

NULLIF

Determines whether the two input parameters are equal.

NVL

Specifies the return value for a parameter that is NULL.

ORDINAL

Sorts input variables in ascending order and returns the value at a specified position.

PARTITION_EXISTS

Queries whether a specified partition exists.

SAMPLE

Samples all read column values and filters out rows that do not meet the sampling conditions.

SHA

Calculates the SHA-1 hash value of a string or binary data.

SHA1

Calculates the SHA-1 hash value of a string or binary data.

SHA2

Calculates the SHA-2 hash value of a string or binary data.

STACK

Splits a specified group of parameters into a specified number of rows.

STR_TO_MAP

Splits a string into keys and values based on specified separators.

TABLE_EXISTS

Queries whether a specified table exists.

TRANS_ARRAY

A user-defined table-valued function (UDTF) that converts one row of data into multiple rows. It converts an array stored in a column with a fixed separator into multiple rows.

TRANS_COLS

A UDTF that converts one row of data into multiple rows. It splits different columns into different rows.

UNIQUE_ID

Returns a random ID. This function is more efficient than the UUID function.

UUID

Returns a random ID.

BETWEEN AND expression

  • Command format

    <a> [NOT] BETWEEN <b> AND <c>
  • Description

    Filters data where the value of a is between b and c, or is not between b and c.

  • Parameters

    • a: Required. The field to be filtered.

    • b and c: Required. The specified range. The data types of b and c must be the same as the data type of a.

  • Return value

    Returns the data that meets the condition.

    If a, b, or c is null, the result is null.

  • Examples

    The following data is in the emp table.

    | empno | ename | job | mgr | hiredate| sal| comm | deptno |
    7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
    7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
    7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
    7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
    7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
    7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
    7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
    7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
    7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
    7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
    7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
    7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
    7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
    7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
    7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
    7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
    7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10

    The following command queries for data where the value of sal is greater than or equal to 1000 and less than or equal to 1500.

    select * from emp where sal between 1000 and 1500;

    The following result is returned.

    +-------+-------+-----+------------+------------+------------+------------+------------+
    | empno | ename | job | mgr        | hiredate   | sal        | comm       | deptno     |
    +-------+-------+-----+------------+------------+------------+------------+------------+
    | 7521  | WARD  | SALESMAN | 7698  | 1981-02-22 00:00:00 | 1250.0     | 500.0      | 30  |
    | 7654  | MARTIN | SALESMAN | 7698 | 1981-09-28 00:00:00 | 1250.0     | 1400.0     | 30 |
    | 7844  | TURNER | SALESMAN | 7698 | 1981-09-08 00:00:00 | 1500.0     | 0.0        | 30 |
    | 7876  | ADAMS | CLERK | 7788  | 1987-05-23 00:00:00 | 1100.0     | NULL     | 20   |
    | 7934  | MILLER | CLERK | 7782  | 1982-01-23 00:00:00 | 1300.0     | NULL      | 10  |
    | 7956  | TEBAGE | CLERK | 7748  | 1982-12-30 00:00:00 | 1300.0     | NULL      | 10  |
    +-------+-------+-----+------------+------------+------------+------------+------------+

CASE WHEN expression

  • Command format

    MaxCompute provides the following two CASE WHEN formats:

    • CASE <value>
      WHEN <value1> THEN <result1>
      WHEN <value2> THEN <result2>
      ...
      ELSE <resultn>
      END
    • CASE
      WHEN (<_condition1>) THEN <result1>
      WHEN (<_condition2>) THEN <result2>
      WHEN (<_condition3>) THEN <result3>
      ...
      ELSE <resultn>
      END
  • Description

    Returns different result values based on the result of value or _condition.

  • Parameters

    • value: Required. The value to compare.

    • _condition: Required. The condition to evaluate.

    • result: Required. The return value.

  • Return value

    • If result contains only BIGINT and DOUBLE types, all values are converted to DOUBLE before the result is returned.

    • If result includes the STRING type, all values are converted to the STRING type before the result is returned. If a data type conversion is not supported, an error is returned. For example, data of the BOOLEAN type cannot be converted to the STRING type.

    • Other type conversions are not allowed.

  • Examples

    The sale_detail table has the fields shop_name string, customer_id string, total_price double and contains the following data.

    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    | null       | c5          | NULL        | 2014       | shanghai   |
    | s6         | c6          | 100.4       | 2014       | shanghai   |
    | s7         | c7          | 100.5       | 2014       | shanghai   |
    +------------+-------------+-------------+------------+------------+

    The following is a command example.

    select 
    case  
    when region='china' then 'default_region'
    when region like 'shang%' then 'sh_region'
    end as region 
    from sale_detail;

    The following result is returned.

    +------------+
    | region     |
    +------------+
    | default_region |
    | default_region |
    | default_region |
    | sh_region  |
    | sh_region  |
    | sh_region  |
    +------------+

CAST

  • Command format

    CAST(<expr> AS <type>)
  • Description

    Converts the result of expr to the target data type type.

  • Parameters

    • expr: Required. The source data to convert.

    • type: Required. The target data type. The usage is as follows:

      • cast(double as bigint): Converts a value of the DOUBLE data type to the BIGINT data type.

      • cast(string as bigint): When you convert a string to the BIGINT data type, if the string contains a number expressed as an integer, it is directly converted to the BIGINT type. If the string contains a number expressed in floating-point or exponential form, it is first converted to the DOUBLE data type and then to the BIGINT data type.

      • cast(string as datetime) or cast(datetime as string): The default date format yyyy-mm-dd hh:mi:ss is used.

  • Return value

    • The return value is of the target data type.

    • If you run setproject odps.function.strictmode=false, the function returns the digits before the letters.

    • If you run setproject odps.function.strictmode=true, an error is returned.

    • When you convert a value to the DECIMAL type, if you set odps.sql.decimal.tostring.trimzero=true, trailing zeros after the decimal point are removed. If you set odps.sql.decimal.tostring.trimzero=false, trailing zeros after the decimal point are retained.

      Important

      The odps.sql.decimal.tostring.trimzero parameter takes effect for both data retrieved from tables and static values.

  • Examples

    • Example 1: Common usage.

      --Returns 1.
      select cast('1' as bigint);
    • Example 2: Convert a STRING value to a BOOLEAN value. If the STRING is empty, false is returned. Otherwise, true is returned.

      • The STRING is empty.

        select cast("" as boolean);
        --Returns
        +------+
        | _c0  |
        +------+
        | false |
        +------+
      • The STRING is not empty.

        select cast("false" as boolean);
        --Returns true
        +------+
        | _c0  |
        +------+
        | true |
        +------+
    • Example 3: Convert a string to a date.

      --Convert a string to a date.
      select cast("2022-12-20" as date);
      --Returns
      +------------+
      | _c0        |
      +------------+
      | 2022-12-20 |
      +------------+
      
      --Convert a date string with a time part to a date.
      select cast("2022-12-20 00:01:01" as date);
      --Returns
      +------------+
      | _c0        |
      +------------+
      | NULL       |
      +------------+
      --To ensure the value is displayed correctly, set the following parameter:
      set odps.sql.executionengine.enable.string.to.date.full.format= true;
      select cast("2022-12-20 00:01:01" as date);
      --Returns
      +------------+
      | _c0        |
      +------------+
      | 2022-12-20 |
      +------------+
      Note

      By default, the odps.sql.executionengine.enable.string.to.date.full.format parameter is false. To convert a date string that includes a time part, set this parameter to true.

    • Example 4 (Incorrect command example): Invalid usage. An exception is thrown if a conversion fails or is not supported. The following command is an example of incorrect usage.

      select cast('abc' as bigint);
    • Example 5: Example of a scenario in which setproject odps.function.strictmode=false is set.

      setprojectodps.function.strictmode=false;
      select cast('123abc'as bigint);
      --Returns
      +------------+
      |_c0|
      +------------+
      |123|
      +------------+
    • Example 6: Example of a scenario in which setproject odps.function.strictmode=true is set.

      setprojectodps.function.strictmode=true;
      select cast('123abc' as bigint);
      --Returns
      FAILED:ODPS-0130071:[0,0]Semanticanalysisexception-physicalplangenerationfailed:java.lang.NumberFormatException:ODPS-0123091:Illegaltypecast-Infunctioncast,value'123abc'cannotbecastedfromStringtoBigint.
    • Example 7: Example of a scenario in which odps.sql.decimal.tostring.trimzero is set.

      --Create a table.
      create table mf_dot (dcm1 decimal(38,18),
                           dcm2 decimal(38,18));
      --Insert data.
      insert into table mf_dot values (12.45500BD,12.3400BD);
      
      --When the flag is true or not set.
      set odps.sql.decimal.tostring.trimzero=true;
      --Remove trailing zeros after the decimal point.
      select cast(round(dcm1,3) as string),cast(round(dcm2,3) as string) from mf_dot;
      --Return value
      +------------+------------+
      | _c0        | _c1        |
      +------------+------------+
      | 12.455     | 12.34      |
      +------------+------------+
      
      --When the flag is false.
      set odps.sql.decimal.tostring.trimzero=false;
      --Keep trailing zeros after the decimal point.
      select cast(round(dcm1,3) as string),cast(round(dcm2,3) as string) from mf_dot;
      --Return value
      +------------+------------+
      | _c0        | _c1        |
      +------------+------------+
      | 12.455     | 12.340     |
      +------------+------------+
      
      --This parameter also applies to static values.
      set odps.sql.decimal.tostring.trimzero=false;
      select cast(round(12345.120BD,3) as string);
      --Returns:
      +------------+
      | _c0        |
      +------------+
      | 12345.120  |
      +------------+

COALESCE

  • Command format

    COALESCE(<expr1>, <expr2>, ...)
  • Description

    Returns the first non-NULL value in the list of <expr1>, <expr2>, ....

  • Parameters

    expr: Required. The value to check.

  • Return value

    The data type of the return value is the same as the data types of the parameters.

  • Examples

    • Example 1: Common usage. The following is a command example.

      --Returns 1.
      select coalesce(null,null,1,null,3,5,7);
    • Example 2: An error is returned if the data type of a parameter value is not defined.

      • Invalid command

        -- The data type of the parameter abc is not defined. The system engine cannot recognize it and returns an error.
        select coalesce(null,null,1,null,abc,5,7);
      • Valid command

        select coalesce(null,null,1,null,'abc',5,7);
    • Example 3: If all parameter values are null when you do not read data from a table, an error is returned. The following is an invalid command.

      -- An error is returned, indicating that at least one parameter value must be non-NULL.
      select coalesce(null,null,null,null);
    • Example 4: If all parameter values are null when you read data from a table, NULL is returned.

      Source table:

      +-----------+-------------+------------+
      | shop_name | customer_id | toal_price |
      +-----------+-------------+------------+
      | ad        | 10001       | 100.0      |
      | jk        | 10002       | 300.0      |
      | ad        | 10003       | 500.0      |
      | tt        | NULL        | NULL       |
      +-----------+-------------+------------+

      As shown in the source table, all values for tt are null. After you run the following statement, NULL is returned.

      select coalesce(customer_id,total_price) from sale_detail where shop_name='tt';

COMPRESS

  • Command format

    BINARY COMPRESS(STRING <str>)
    BINARY COMPRESS(BINARY <bin>)
  • Description

    Compresses str or bin using the GZIP algorithm.

  • Parameters

    • str: Required. A value of the STRING type.

    • bin: Required. A value of the BINARY type.

  • Return value

    Returns a value of the BINARY type. If the input parameter is NULL, NULL is returned.

  • Examples

    • Example 1: Compresses the string hello using the GZIP algorithm. The following is a command example.

      -- The return value is =1F=8B=08=00=00=00=00=00=00=03=CBH=CD=C9=C9=07=00=86=A6=106=05=00=00=00.
      select compress('hello');
    • Example 2: The input parameter is an empty string. The following is a command example.

      -- The return value is =1F=8B=08=00=00=00=00=00=00=03=03=00=00=00=00=00=00=00=00=00.
      select compress('');
    • Example 3: The input parameter is NULL. The following is a command example.

      -- Returns NULL.
      select compress(null);

CRC32

  • Command format

    BIGINT CRC32(STRING|BINARY <expr>)
  • Description

    Calculates the cyclic redundancy check (CRC) value of the STRING or BINARY expression expr.

  • Parameters

    expr: Required. A value of the STRING or BINARY type.

  • Return value

    Returns a value of the BIGINT type. The following rules apply:

    • If the input parameter is NULL, NULL is returned.

    • If the input parameter is an empty string, 0 is returned.

  • Examples

    • Example 1: Calculates the CRC value of the string ABC. The following is a command example.

      -- Returns 2743272264.
      select crc32('ABC');
    • Example 2: The input parameter is NULL. The following is a command example.

      -- Returns NULL.
      select crc32(null);

DECOMPRESS

  • Command format

    BINARY DECOMPRESS(BINARY <bin>)
  • Description

    Decompresses bin using the GZIP algorithm.

  • Parameters

    bin: Required. A value of the BINARY type.

  • Return value

    Returns a value of the BINARY type. If the input parameter is NULL, NULL is returned.

  • Examples

    • Example 1: Decompresses the compressed result of the string hello, world and converts it to the STRING format. The following is a command example.

      -- Returns hello, world.
      select cast(decompress(compress('hello, world')) as string);
    • Example 2: The input parameter is NULL. The following is a command example.

      -- Returns NULL.
      select decompress(null);

GET_IDCARD_AGE

  • Command format

    get_idcard_age(<idcardno>)
  • Description

    Returns the current age based on a Chinese ID card number. The age is calculated by subtracting the birth year from the ID card number from the current year.

  • Parameters

    idcardno: Required. A 15-digit or 18-digit Chinese ID card number of the STRING type. The function checks the validity of the ID card number based on the province code and the last check digit. If the check fails, NULL is returned.

  • Return value

    Returns a value of the BIGINT type. If the input is NULL, NULL is returned.

GET_IDCARD_BIRTHDAY

  • Command format

    get_idcard_birthday(<idcardno>)
  • Description

    Returns the date of birth based on a Chinese ID card number.

  • Parameters

    idcardno: Required. A 15-digit or 18-digit Chinese ID card number of the STRING type. The function checks the validity of the ID card number based on the province code and the last check digit. If the check fails, NULL is returned.

  • Return value

    Returns a value of the DATETIME type. If the input is NULL, NULL is returned.

GET_IDCARD_SEX

  • Command format

    get_idcard_sex(<idcardno>)
  • Description

    Returns the sex based on a Chinese ID card number. The value is M (male) or F (female).

  • Parameters

    idcardno: Required. A 15-digit or 18-digit Chinese ID card number of the STRING type. The function checks the validity of the ID card number based on the province code and the last check digit. If the check fails, NULL is returned.

  • Return value

    Returns a value of the STRING type. If the input is NULL, NULL is returned.

GET_USER_ID

  • Command format

    get_user_id()
  • Description

    Retrieves the ID of the current account. This is also known as the user ID or UID.

  • Parameters

    No parameters are required.

  • Return value

    Returns the ID of the current account.

  • Examples

    select get_user_id();
    -- The following result is returned.
    +------------+
    | _c0        |
    +------------+
    | 1117xxxxxxxx8519 |
    +------------+

HASH

  • Command format

    • If the MaxCompute project is in Hive-compatible mode, the command format is as follows.

      INT HASH(<value1>, <value2>[, ...]);
    • If the MaxCompute project is not in Hive-compatible mode, the command format is as follows.

      BIGINT HASH(<value1>, <value2>[, ...]);
  • Description

    Calculates the hash value of value1 and value2.

  • Parameters

    value1 and value2: Required. The parameters for which to calculate the hash value. The parameters can be of different data types. The supported data types vary between Hive-compatible and non-Hive-compatible modes:

    • Hive-compatible mode: TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, BOOLEAN, STRING, CHAR, VARCHAR, DATETIME, and DATE.

    • Non-Hive-compatible mode: BIGINT, DOUBLE, BOOLEAN, STRING, and DATETIME.

    Note

    For the same inputs, the returned hash values are always the same. However, if two hash values are the same, it does not guarantee that the input values are the same. A hash collision may occur.

  • Return value

    Returns a value of the INT or BIGINT type. If an input parameter is an empty string or NULL, 0 is returned.

  • Examples

    • Example 1: Calculate the hash value of input parameters of the same data type. The following is a command example.

      -- Returns 66.
      SELECT HASH(0L, 2L, 4L);
    • Example 2: Calculate the hash value of input parameters of different data types. The following is a command example.

      -- Returns 97.
      SELECT HASH(0L, 'a');
    • Example 3: An input parameter is an empty string or NULL. The following is a command example.

      -- Returns 0.
      SELECT HASH(0L, null);
      -- Returns 0.
      SELECT HASH(0L, '');

IF

  • Command format

    IF(<testCondition>, <valueTrue>, <valueFalseOrNull>)
  • Description

    Checks whether testCondition is true. If it is true, the function returns the value of valueTrue. Otherwise, it returns the value of valueFalseOrNull.

  • Parameters

    • testCondition: Required. The expression to evaluate. It must be of the BOOLEAN type.

    • valueTrue: Required. The value to return if the testCondition expression is true.

    • valueFalseOrNull: The value to return if the testCondition expression is false. This can be set to NULL.

  • Return value

    The data type of the return value is the same as the data type of the valueTrue or valueFalseOrNull parameter.

  • Examples

    -- Returns 200.
    select if(1=2, 100, 200); 

MAX_PT

  • Command format

    MAX_PT(<table_full_name>)
  • Description

    Returns the maximum value of a level-1 partition that contains data in a partitioned table. The values are sorted alphabetically. The function then reads the data from that partition.

  • Notes

    • The MAX_PT function can also be implemented using standard SQL. SELECT * FROM table WHERE pt=MAX_PT("table"); can be rewritten as SELECT * FROM table WHERE pt = (SELECT MAX(pt) FROM table);.

      Note

      MaxCompute does not provide a MIN_PT function. You cannot use the SQL statement SELECT * FROM table WHERE pt=MIN_PT("table"); to achieve a similar function as MAX_PT to retrieve the minimum partition that contains data. However, you can use the standard SQL statement SELECT * FROM table WHERE pt= (SELECT MIN(pt) FROM table); to achieve the same effect.

    • If all partitions in the table are empty, the MAX_PT function fails. Make sure that at least one partition is not empty.

    • OSS foreign tables also support the MAX_PT function. The behavior is the same as for internal tables.

  • Parameters

    table_full_name: Required. A value of the STRING type. The name of the table. You must have read permissions on the table.

  • Return value

    Returns the value of the largest level-1 partition.

    Note

    If you create a partition only using ALTER TABLE and the partition does not contain any data, the partition is not returned.

  • Examples

    • Example 1: The table tbl is a partitioned table. Its partitions are 20120901 and 20120902, and both contain data. In the following statement, MAX_PT returns '20120902'. The MaxCompute SQL statement reads the data from the pt='20120902' partition. The following is a command example.

      SELECT * FROM tbl WHERE pt= MAX_PT('tbl');
      -- This is equivalent to the following statement.
      SELECT * FROM tbl WHERE pt= (SELECT MAX(pt) FROM tbl);
    • Example 2: In a multi-level partitioning scenario, use standard SQL to get data from the largest partition. The following is a command example.

      SELECT * FROM table WHERE pt1 = (SELECT MAX(pt1) FROM table) AND pt2 = (SELECT MAX(pt2) FROM table WHERE pt1= (SELECT MAX(pt1) FROM table));

NULLIF

  • Command format

    T NULLIF(T <expr1>, T <expr2>)
  • Description

    Compares the values of expr1 and expr2. If they are equal, the function returns NULL. Otherwise, it returns expr1.

  • Parameters

    expr1 and expr2: Required. Expressions of any type. T indicates the input data type. It can be any data type that MaxCompute supports.

  • Return value

    Returns NULL or expr1.

  • Examples

    -- Returns 2.
    select nullif(2, 3);
    -- Returns NULL.
    select nullif(2, 2);
    -- Returns 3.
    select nullif(3, null);

NVL

  • Command format

    nvl(T <value>, T <default_value>)
  • Description

    If the value of value is NULL, the function returns default_value. Otherwise, it returns value. The two parameters must be of the same data type.

  • Parameters

    • value: Required. The input parameter. T indicates the input data type. It can be any data type that MaxCompute supports.

    • default_value: Required. The replacement value. It must be of the same data type as value.

  • Examples

    The table t_data has three columns: c1 string, c2 bigint, and c3 datetime. The table contains the following data.

    +----+------------+------------+
    | c1 | c2 | c3 |
    +----+------------+------------+
    | NULL | 20 | 2017-11-13 05:00:00 |
    | ddd | 25 | NULL |
    | bbb | NULL | 2017-11-12 08:00:00 |
    | aaa | 23 | 2017-11-11 00:00:00 |
    +----+------------+------------+

    You can use the nvl function to output 00000 for NULL values in c1, 0 for NULL values in c2, and - for NULL values in c3. The following is a command example.

    select nvl(c1,'00000'),nvl(c2,0),nvl(c3,'-') from nvl_test;
    -- The following result is returned.
    +-----+------------+-----+
    | _c0 | _c1 | _c2 |
    +-----+------------+-----+
    | 00000 | 20 | 2017-11-13 05:00:00 |
    | ddd | 25 | - |
    | bbb | 0 | 2017-11-12 08:00:00 |
    | aaa | 23 | 2017-11-11 00:00:00 |
    +-----+------------+-----+

ORDINAL

  • Command format

    ORDINAL(BIGINT <nth>, <var1>, <var2>[,...])
  • Description

    Sorts the input variables in ascending order and returns the value at the nth position.

  • Parameters

    • nth: Required. The position number, starting from 1. A value of the BIGINT type. If the value at the specified position is NULL, NULL is returned.

    • var: Required. The values to be sorted. They can be of the BIGINT, DOUBLE, DATETIME, or STRING type.

  • Return value

    • The value at the nth position. If no implicit conversion occurs, the data type of the return value is the same as the data type of the input parameters.

    • If type conversion occurs, a conversion between DOUBLE, BIGINT, and STRING returns a DOUBLE type. A conversion between STRING and DATETIME returns a DATETIME type. Other implicit conversions are not allowed.

    • NULL is treated as the minimum value.

  • Examples

    -- Returns 3. 
    SELECT ORDINAL(CAST(3 AS BIGINT), CAST(1 AS BIGINT), cast(3 AS BIGINT), cast(7 AS BIGINT), cast(5 AS BIGINT), cast(2 AS BIGINT), cast(4 AS BIGINT), cast(6 AS BIGINT));

PARTITION_EXISTS

  • Command format

    boolean partition_exists(string <table_name>, string... <partitions>)
  • Description

    Checks whether a specified partition exists.

  • Parameters

    • table_name: Required. The table name. A value of the STRING type. The table name can include the project name, such as my_proj.my_table. If you do not specify a project name, the current project is used by default.

    • partitions: Required. The partition name. A value of the STRING type. Specify the partition values in the order of the partition key columns. The number of partition values must be the same as the number of partition key columns.

  • Return value

    Returns a value of the BOOLEAN type. The function returns True if the specified partition exists. Otherwise, it returns False.

  • Examples

    -- Create a partitioned table named foo.
    create table foo (id bigint) partitioned by (ds string, hr string);
    -- Add a partition to the foo table.
    alter table foo add partition (ds='20190101', hr='1');
    -- Query whether the partition ds='20190101' and hr='1' exists. The result is True.
    select partition_exists('foo', '20190101', '1');

SAMPLE

  • Command format

    boolean sample(<x>, <y>, [<column_name1>, <column_name2>[,...]])
  • Description

    The system samples all read values of column_name based on the settings of x and y, and filters out rows that do not meet the sampling conditions.

  • Parameters

    • x and y: x is required. A BIGINT constant greater than 0. This indicates that the data is hashed into x parts, and the yth part is selected.

      y is optional. If omitted, the first part is selected by default. If you omit the y parameter, you must also omit column_name.

      If x or y is of another type or is less than or equal to 0, an exception is thrown. If y is greater than x, an exception is also returned. If either x or y is NULL, NULL is returned.

    • column_name: Optional. The target column for sampling. If this parameter is omitted, random sampling is performed based on the values of x and y. This parameter can be of any data type, and its value can be NULL. No implicit type conversion is performed. If column_name is a constant NULL, an error is returned.

      Note
      • To prevent data skew caused by NULL values, NULL values in column_name are evenly hashed into x parts. If you do not specify column_name, the output may not be uniform when the data volume is small. In this case, you can specify column_name to obtain a better output result.

      • Currently, random sampling is supported only for columns of the following data types: bigint, datetime, boolean, double, string, binary, char, and varchar.

  • Return value

    Returns a value of the BOOLEAN type.

  • Examples

    Assume that the table tbla contains a column named cola.

    -- The values are hashed into 4 parts based on the cola column, and the 1st part is taken. The return value is True.
    select * from tbla where sample (4, 1 , cola);
    -- Each row of data is randomly hashed into 4 parts, and the 2nd part is taken. The return value is True.
    select * from tbla where sample (4, 2);

SHA

  • Command format

    STRING SHA(STRING|BINARY <expr>)
  • Description

    Calculates the SHA-1 hash value of the STRING or BINARY expression expr and returns it as a hexadecimal string.

  • Parameters

    expr: Required. A value of the STRING or BINARY type.

  • Return value

    Returns a value of the STRING type. If the input parameter is NULL, NULL is returned.

  • Examples

    • Example 1: Calculates the SHA hash value of the string ABC. The following is a command example.

      -- Returns 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8.
      select sha('ABC');
    • Example 2: The input parameter is NULL. The following is a command example.

      -- Returns NULL.
      select sha(null);

SHA1

  • Command format

    string sha1(string|binary <expr>)
  • Description

    Calculates the SHA-1 hash value of the STRING or BINARY expression expr and returns it as a hexadecimal string.

  • Parameters

    expr: Required. A value of the STRING or BINARY type.

  • Return value

    Returns a value of the STRING type. If the input parameter is NULL, NULL is returned.

  • Examples

    • Example 1: Calculates the SHA-1 hash value of the string ABC. The following is a command example.

      -- Returns 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8.
      select sha1('ABC');
    • Example 2: The input parameter is NULL. The following is a command example.

      -- Returns NULL.
      select sha1(null);

SHA2

  • Command format

    string sha2(string|binary <expr>, bigint <number>)
  • Description

    Calculates the SHA-2 hash value of the STRING or BINARY expression expr and returns it in the format specified by number.

  • Parameters

    • expr: Required. A value of the STRING or BINARY type.

    • number: Required. A value of the BIGINT type. The hash bit length. The value must be 224, 256, 384, 512, or 0 (which is the same as 256).

  • Return value

    Returns a value of the STRING type. The following rules apply:

    • If any input parameter is NULL, NULL is returned.

    • If the value of number is not within the allowed range, NULL is returned.

  • Examples

    • Example 1: Calculates the SHA-2 hash value of the string ABC. The following is a command example.

      -- Returns b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78.
      select sha2('ABC', 256);
    • Example 2: An input parameter is NULL. The following is a command example.

      -- Returns NULL.
      select sha2('ABC', null);

STACK

  • Command format

    stack(n, expr1, ..., exprk) 
  • Description

    Splits expr1, ..., exprk into n rows. Unless otherwise specified, the output uses the default column names col0, col1, ....

  • Parameters

    • n: Required. The number of rows to split into.

    • expr: Required. The parameters to be split, expr1, ..., exprk, must be integers. The number of parameters must be an integer multiple of n to be split into n complete rows. Otherwise, an error is returned.

  • Return value

    Returns a dataset with n rows. The number of columns is the quotient of the number of parameters divided by n.

  • Examples

    -- Arrange 1, 2, 3, 4, 5, 6 into 3 rows.
    select stack(3, 1, 2, 3, 4, 5, 6);
    -- The following result is returned.
    +------+------+
    | col0 | col1 |
    +------+------+
    | 1    | 2    |
    | 3    | 4    |
    | 5    | 6    |
    +------+------+
    
    -- Arrange 'A',10,date '2015-01-01','B',20,date '2016-01-01' into two rows.
    select stack(2,'A',10,date '2015-01-01','B',20,date '2016-01-01') as (col0,col1,col2);
    -- The following result is returned.
    +------+------+------+
    | col0 | col1 | col2 |
    +------+------+------+
    | A    | 10   | 2015-01-01 |
    | B    | 20   | 2016-01-01 |
    +------+------+------+
    
    -- Arrange a, b, c, d into two rows. If the source table has multiple rows, the stack operation is performed row by row.
    select stack(2,a,b,c,d) as (col,value)
    from values 
        (1,1,2,3,4),
        (2,5,6,7,8),
        (3,9,10,11,12),
        (4,13,14,15,null)
    as t(key,a,b,c,d);
    -- The following result is returned.
    +------+-------+
    | col  | value |
    +------+-------+
    | 1    | 2     |
    | 3    | 4     |
    | 5    | 6     |
    | 7    | 8     |
    | 9    | 10    |
    | 11   | 12    |
    | 13   | 14    |
    | 15   | NULL  |
    +------+-------+
    
    -- Use with LATERAL VIEW.
    select tf.* from (select 0) t lateral view stack(2,'A',10,date '2015-01-01','B',20, date '2016-01-01') tf as col0,col1,col2;
    -- The following result is returned.
    +------+------+------+
    | col0 | col1 | col2 |
    +------+------+------+
    | A    | 10   | 2015-01-01 |
    | B    | 20   | 2016-01-01 |
    +------+------+------+

STR_TO_MAP

  • Command format

    STR_TO_MAP([STRING <mapDupKeyPolicy>,] <text> [, <delimiter1> [, <delimiter2>]])
  • Description

    Uses delimiter1 to split text into key-value pairs, and then uses delimiter2 to split each key-value pair into a key and a value.

  • Parameters

    • mapDupKeyPolicy: optional. A value of the STRING type. This parameter specifies the method that is used to process duplicate keys. Valid values:

      • exception: An error is returned.

      • last_win: The latter key overwrites the former key.

      You can also specify the odps.sql.map.key.dedup.policy parameter at the session level to configure the method that is used to process duplicate keys. For example, you can set odps.sql.map.key.dedup.policy to exception. If you do not specify this parameter, the default value last_win is used.

      Note

      The behavior implementation of MaxCompute is determined based on mapDupKeyPolicy. If you do not specify mapDupKeyPolicy, the value of odps.sql.map.key.dedup.policy is used.

    • text: Required. A value of the STRING type. The string to be split.

    • delimiter1: Optional. A value of the STRING type. The separator. If not specified, the default value is a comma (,).

    • delimiter2: Optional. A value of the STRING type. The separator. If not specified, the default value is an equal sign (=).

      Note

      If the separator is a regular expression or a special character, you must escape it with two backslashes (\\). Special characters include the colon (:), period (.), question mark (?), plus sign (+), and asterisk (*).

  • Return value

    The return value is of the map<string, string> type. The return value is the result of splitting text by delimiter1 and delimiter2.

  • Examples

    -- Returns {test1:1, test2:2}.
    select str_to_map('test1&1-test2&2','-','&');
    -- Returns {test1:1, test2:2}.
    select str_to_map("test1.1,test2.2", ",", "\\.");
    -- Returns {test1:1, test2:3}.
    select str_to_map("test1.1,test2.2,test2.3", ",", "\\.");

TABLE_EXISTS

  • Command format

    BOOLEAN TABLE_EXISTS(STRING <table_name>)
  • Description

    Checks whether a specified table exists.

  • Parameters

    table_name: Required. The table name. A value of the STRING type. The table name can include the project name, such as my_proj.my_table. If you do not specify a project name, the current project is used by default.

  • Return value

    Returns a value of the BOOLEAN type. The function returns True if the specified table exists. Otherwise, it returns False.

  • Examples

    -- Use in a SELECT list.
    select if(table_exists('abd'), col1, col2) from src;

TRANS_ARRAY

  • Limits

    • All columns that are used as keys must be placed first, and the columns to be transposed must be placed after them.

    • A SELECT statement can contain only one UDTF. No other columns can be included.

    • Cannot be used with GROUP BY, CLUSTER BY, DISTRIBUTE BY, or SORT BY.

  • Command format

    TRANS_ARRAY (<num_keys>, <separator>, <key1>,<key2>,…,<col1>,<col2>,<col3>) AS (<key1>,<key2>,...,<col1>, <col2>)
  • Description

    A user-defined table-valued function (UDTF) that converts one row of data into multiple rows. It converts an array that is stored in a column and delimited by a fixed separator into multiple rows.

  • Parameters

    • num_keys: Required. A BIGINT constant. The value must be >=0. The number of columns to be used as transpose keys when converting to multiple rows.

    • separator: Required. A STRING constant. The separator used to split the string into multiple elements. If this parameter is empty, an error is returned.

    • keys: Required. The columns to be used as keys during transposition. The number of columns is specified by num_keys. If num_keys specifies that all columns are used as keys (that is, num_keys is equal to the total number of columns), only one row is returned.

    • cols: Required. The arrays to be converted into rows. All columns after the keys are considered arrays to be transposed. They must be of the STRING type and store arrays in string format, such as Hangzhou;Beijing;Shanghai, which is an array delimited by semicolons (;).

  • Return value

    Returns the transposed rows. The new column names are specified by AS. The data types of the columns used as keys remain unchanged. All other columns are of the STRING type. The number of resulting rows is determined by the array with the most elements. Shorter arrays are padded with NULL.

  • Examples

    • Example 1: The table t_table contains the following data.

      +----------+----------+------------+
      | login_id | login_ip | login_time |
      +----------+----------+------------+
      | wangwangA | 192.168.0.1,192.168.0.2 | 20120101010000,20120102010000 |
      | wangwangB | 192.168.45.10,192.168.67.22,192.168.6.3 | 20120111010000,20120112010000,20120223080000 |
      +----------+----------+------------+
      -- Run the SQL statement.
      select trans_array(1, ",", login_id, login_ip, login_time) as (login_id,login_ip,login_time) from t_table;
      -- The following result is returned.
      +----------+----------+------------+
      | login_id | login_ip | login_time |
      +----------+----------+------------+
      | wangwangB | 192.168.45.10 | 20120111010000 |
      | wangwangB | 192.168.67.22 | 20120112010000 |
      | wangwangB | 192.168.6.3 | 20120223080000 |
      | wangwangA | 192.168.0.1 | 20120101010000 |
      | wangwangA | 192.168.0.2 | 20120102010000 |
      +----------+----------+------------+
      
      -- If the table contains the following data.
      Login_id LOGIN_IP LOGIN_TIME 
      wangwangA 192.168.0.1,192.168.0.2 20120101010000
      -- The insufficient data in the array is padded with NULL. 
      Login_id Login_ip Login_time 
      wangwangA 192.168.0.1 20120101010000
      wangwangA 192.168.0.2 NULL
    • Example 2: The table mf_fun_array_test_t contains the following data.

      +------------+------------+------------+------------+
      | id         | name       | login_ip   | login_time |
      +------------+------------+------------+------------+
      | 1          | Tom        | 192.168.100.1,192.168.100.2 | 20211101010101,20211101010102 |
      | 2          | Jerry      | 192.168.100.3,192.168.100.4 | 20211101010103,20211101010104 |
      +------------+------------+------------+------------+
      
      -- Use two keys, id and name, to convert to an array. Run the SQL statement.
      select trans_array(2, ",", Id,Name, login_ip, login_time) as (Id,Name,login_ip,login_time) from mf_fun_array_test_t;
      -- The following result is returned. The data is split and grouped by the keys id and name.
      +------------+------------+------------+------------+
      | id         | name       | login_ip   | login_time |
      +------------+------------+------------+------------+
      | 1          | Tom        | 192.168.100.1 | 20211101010101 |
      | 1          | Tom        | 192.168.100.2 | 20211101010102 |
      | 2          | Jerry      | 192.168.100.3 | 20211101010103 |
      | 2          | Jerry      | 192.168.100.4 | 20211101010104 |
      +------------+------------+------------+------------+

TRANS_COLS

  • Limits

    • All columns that are used as keys must be placed first, and the columns to be transposed must be placed after them.

    • A SELECT statement can contain only one UDTF. No other columns can be included.

  • Command format

    TRANS_COLS (<num_keys>, <key1>,<key2>,…,<col1>, <col2>,<col3>) AS (<idx>, <key1>,<key2>,…,<col1>, <col2>)
  • Description

    A UDTF that converts one row of data into multiple rows. It splits different columns into different rows.

  • Parameters

    • num_keys: Required. A BIGINT constant. The value must be >=0. The number of columns to be used as transpose keys when converting to multiple rows.

    • keys: Required. The columns to be used as keys during transposition. The number of columns is specified by num_keys. If num_keys specifies that all columns are used as keys (that is, num_keys is equal to the total number of columns), only one row is returned.

    • idx: Required. The row number after conversion.

    • cols: Required. The columns to be converted into rows.

  • Return value

    Returns the transposed rows. The new column names are specified by AS. The first column of the output is the transpose index, which starts from 1. The data types of the columns used as keys remain unchanged. All other columns retain their original data types.

  • Examples

    The table t_table contains the following data.

    +----------+----------+------------+
    | Login_id | Login_ip1 | Login_ip2 |
    +----------+----------+------------+
    | wangwangA | 192.168.0.1 | 192.168.0.2 |
    +----------+----------+------------+
    -- Run the SQL statement.
    select trans_cols(1, login_id, login_ip1, login_ip2) as (idx, login_id, login_ip) from t_table;
    -- The following result is returned.
    idx    login_id    login_ip
    1    wangwangA    192.168.0.1
    2    wangwangA    192.168.0.2

UNIQUE_ID

  • Command format

    string unique_id()
  • Description

    Returns a random unique ID, such as 29347a88-1e57-41ae-bb68-a9edbdd9****_1. This function is more efficient than the UUID function and returns a longer ID. Compared with a UUID, this ID includes an additional underscore (_) and a number, such as _1.

UUID

  • Command format

    string uuid()
  • Description

    Returns a random ID, such as 29347a88-1e57-41ae-bb68-a9edbdd9****.

    Note

    UUID returns a random global ID that has a very low probability of being repeated.