All Products
Search
Document Center

MaxCompute:Other functions

Last Updated:Mar 20, 2024

MaxCompute SQL provides other functions that are commonly used in the development process. You can use these functions based on your business requirements. This topic describes the command syntax and parameters of these functions, such as CAST, DECODE, and LEAST. This topic also provides examples on how to use these functions.

Function

Description

BASE64

Converts a binary value into a Base64-encoded string.

BETWEEN AND expression

Returns the values that fall in or fall out of the specified range.

CASE WHEN expression

Returns values based on the computing result of an expression.

CAST

Converts the result of an expression into the specified data type.

COALESCE

Returns the first non-null value in the parameter list.

COMPRESS

Uses the GZIP algorithm to compress input parameters of the STRING or BINARY type.

CRC32

Calculates the cyclic redundancy check (CRC) value of a value that is of the STRING or BINARY type.

DECODE

Implements the IF-THEN-ELSE logic.

DECOMPRESS

Uses the GZIP algorithm to decompress input parameters of the BINARY type.

GET_IDCARD_AGE

Returns an age in years based on the ID card number.

GET_IDCARD_BIRTHDAY

Returns the date of birth based on the ID card number.

GET_IDCARD_SEX

Returns the gender based on the ID card number.

GET_USER_ID

Obtains the ID of the current account.

GREATEST

Returns the maximum value of the input parameters.

HASH

Calculates a hash value based on the input parameters.

IF

Checks whether a specified condition is true.

LEAST

Returns the minimum value of the input parameters.

MAX_PT

Returns the name of the largest level-1 partition in a partitioned table.

NULLIF

Checks whether the values of two input parameters are the same.

NVL

Specifies the return values of the parameters whose values are null.

ORDINAL

Sorts the values of the input variables in ascending order and returns the value that is ranked at a specified position.

PARTITION_EXISTS

Checks whether a specified partition exists in a table.

SAMPLE

Samples all column values that are read and filters out the rows that do not meet sampling conditions.

SHA

Calculates the SHA-1 hash value of a value that is of the STRING or BINARY type.

SHA1

Calculates the SHA-1 hash value of a value that is of the STRING or BINARY type.

SHA2

Calculates the SHA-2 hash value of a value that is of the STRING or BINARY type.

STACK

Splits a specified parameter group into a specified number of rows.

STR_TO_MAP

Splits a string with a specified delimiter and returns a key-value pair.

TABLE_EXISTS

Checks whether a specified table exists.

TRANS_ARRAY

Transposes one row of data into multiple rows. This function is a user-defined table-valued function (UDTF) that transposes an array separated by fixed delimiters in a column into multiple rows.

TRANS_COLS

Transposes one row of data into multiple rows. This function is a UDTF that transposes columns into rows.

UNBASE64

Converts a Base64-encoded string into a binary value.

UNIQUE_ID

Returns a unique ID. This function is more efficient than the UUID function.

UUID

Returns a random ID.

BASE64

  • Syntax

    string base64(binary <value>)
  • Description

    Converts a binary value into a Base64-encoded string.

  • Parameters

    value: required. A value of the BINARY type, which is the value that you want to convert.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to null, null is returned.

  • Examples

    • Example 1: Convert the binary result of cast ('alibaba' as binary) into a Base64-encoded string. Sample statement:

      -- The return value is YWxpYmFiYQ==. 
      select base64(cast ('alibaba' as binary));
    • Example 2: The input parameter is set to null. Sample statement:

      -- The return value is null. 
      select base64(null);

BETWEEN AND expression

  • Syntax

    <a> [NOT] between <b> and <c>
  • Description

    Returns the values of field a. The values must fall in or fall out of the range between b and c.

  • Parameters

    • a: required. The field whose values you want to obtain.

    • b and c: required. The two parameters specify a value range. The data types of the two parameters must be the same as the data type of the a parameter.

  • Return value

    The values that fall in or fall out of the specified range are returned.

    If the a, b, or c parameter is empty, null is returned.

  • Examples

    The emp table contains the following data:

    | empno | ename | job | mgr | hiredate| sal| comm | deptno |
    7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
    7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
    7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
    7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
    7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
    7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
    7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
    7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
    7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
    7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
    7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
    7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
    7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
    7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
    7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
    7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
    7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10

    Query the data whose sal is from 1000 to 1500. Sample statement:

    select * from emp where sal between 1000 and 1500;

    The following result is returned:

    +-------+-------+-----+------------+------------+------------+------------+------------+
    | empno | ename | job | mgr        | hiredate   | sal        | comm       | deptno     |
    +-------+-------+-----+------------+------------+------------+------------+------------+
    | 7521  | WARD  | SALESMAN | 7698  | 1981-02-22 00:00:00 | 1250.0     | 500.0      | 30  |
    | 7654  | MARTIN | SALESMAN | 7698 | 1981-09-28 00:00:00 | 1250.0     | 1400.0     | 30 |
    | 7844  | TURNER | SALESMAN | 7698 | 1981-09-08 00:00:00 | 1500.0     | 0.0        | 30 |
    | 7876  | ADAMS | CLERK | 7788  | 1987-05-23 00:00:00 | 1100.0     | NULL     | 20   |
    | 7934  | MILLER | CLERK | 7782  | 1982-01-23 00:00:00 | 1300.0     | NULL      | 10  |
    | 7956  | TEBAGE | CLERK | 7748  | 1982-12-30 00:00:00 | 1300.0     | NULL      | 10  |
    +-------+-------+-----+------------+------------+------------+------------+------------+

CASE WHEN expression

  • Syntax

    MaxCompute provides the following CASE WHEN syntax:

    • case <value>
      when <value1> then <result1>
      when <value2> then <result2>
      ...
      else <resultn>
      end
    • case
      when (<_condition1>) then <result1>
      when (<_condition2>) then <result2>
      when (<_condition3>) then <result3>
      ...
      else <resultn>
      end
  • Description

    Returns the value of result based on the calculation result of value or _condition.

  • Parameters

    • value: required. The value that is used for comparison.

    • _condition: required. The condition that is used for comparison.

    • result: required. The return value.

  • Return value

    • If the data types of all result values are only BIGINT and DOUBLE, the values are returned after their data types are converted into the DOUBLE type.

    • If result values of the STRING type exist, the values are converted into the STRING type before they are returned. If a data type conversion is not supported, an error is returned. For example, data of the BOOLEAN type cannot be converted into the STRING type.

    • Conversions between other data types are not allowed.

  • Examples

    The sale_detail table contains the shop_name (STRING), customer_id (STRING), and total_price (DOUBLE) columns. This table contains the following data:

    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    | null       | c5          | NULL        | 2014       | shanghai   |
    | s6         | c6          | 100.4       | 2014       | shanghai   |
    | s7         | c7          | 100.5       | 2014       | shanghai   |
    +------------+-------------+-------------+------------+------------+

    Sample statement:

    select 
    case  
    when region='china' then 'default_region'
    when region like 'shang%' then 'sh_region'
    end as region 
    from sale_detail;

    The following result is returned:

    +------------+
    | region     |
    +------------+
    | default_region |
    | default_region |
    | default_region |
    | sh_region  |
    | sh_region  |
    | sh_region  |
    +------------+

CAST

  • Syntax

    cast(<expr> as <type>)
  • Description

    Converts the data type of the data source specified by expr into the data type specified by type.

  • Parameters

    • expr: required. The data source whose data type you want to convert.

    • type: required. The data type into which you want to convert the data. Usage:

      • cast(double as bigint): converts a value of the DOUBLE type into the BIGINT type.

      • cast(string as bigint): converts a value of the STRING type into the BIGINT type. If the string consists of numerals expressed in the INTEGER form, the string is converted into the BIGINT type. If the string consists of numerals expressed in the FLOAT or EXPONENTIAL form, the string is converted into the DOUBLE type and then into the BIGINT type.

      • The default date format, yyyy-mm-dd hh:mi:ss, is used for cast(string as datetime) and cast(datetime as string).

  • Return value

    • A value of the specified data type is returned.

    • If you specify setproject odps.function.strictmode=false, the number that precedes the letter is returned.

    • If you specify setproject odps.function.strictmode=true, an error is returned.

    • If you convert a value into the DECIMAL type and set odps.sql.decimal.tostring.trimzero to true, a value with 0s after the decimal point removed is returned. If you convert a value into the DECIMAL type and set odps.sql.decimal.tostring.trimzero to false, a value with 0s after the decimal point retained is returned.

      Important

      The odps.sql.decimal.tostring.trimzero parameter takes effect only when data is read from a table. This parameter does not take effect on static values.

  • Examples

    • Example 1: common usage. Sample statement:

      -- The return value is 1. 
      select cast('1' as bigint);
    • Example 2: Convert a value of the STRING type into the BOOLEAN type. If the value of the STRING type is an empty string, false is returned. Otherwise, true is returned. Sample statement:

      • The value of the STRING type is an empty string.

        select cast("" as boolean);
        -- The return value is false.
        +------+
        | _c0  |
        +------+
        | false |
        +------+
      • The value of the STRING type is a non-empty string.

        select cast("false" as boolean);
        -- The return value is true.
        +------+
        | _c0  |
        +------+
        | true |
        +------+
    • Example 3: Convert a string into a date.

      -- Convert a string into a date.
      select cast("2022-12-20" as date);
      -- The following result is returned:
      +------------+
      | _c0        |
      +------------+
      | 2022-12-20 |
      +------------+
      
      -- Convert a date string that contains the hour, minute, and second parts into a date.
      select cast("2022-12-20 00:01:01" as date);
      -- The following result is returned:
      +------------+
      | _c0        |
      +------------+
      | NULL       |
      +------------+
      -- To ensure that a valid date is returned, run the following commands:
      set odps.sql.executionengine.enable.string.to.date.full.format= true;
      select cast("2022-12-20 00:01:01" as date);
      -- The following result is returned:
      +------------+
      | _c0        |
      +------------+
      | 2022-12-20 |
      +------------+
      Note

      By default, the odps.sql.executionengine.enable.string.to.date.full.format parameter is set to false. If you want to convert a date string that contains the hour, minute, and second parts, you must set this parameter to true.

    • Example 4: (Incorrect usage) If a type conversion fails or is not supported, an error is returned. Incorrect sample statement:

      select cast('abc' as bigint);
    • Example 5: setproject odps.function.strictmode=false is specified.

      setprojectodps.function.strictmode=false;
      select cast('123abc'as bigint);
      -- The following result is returned:
      +------------+
      |_c0|
      +------------+
      |123|
      +------------+
    • Example 6: setproject odps.function.strictmode=true is specified.

      setprojectodps.function.strictmode=true;
      select cast('123abc' as bigint);
      -- The following result is returned:
      FAILED:ODPS-0130071:[0,0]Semanticanalysisexception-physicalplangenerationfailed:java.lang.NumberFormatException:ODPS-0123091:Illegaltypecast-Infunctioncast,value'123abc'cannotbecastedfromStringtoBigint.
    • Example 7: The odps.sql.decimal.tostring.trimzero parameter is specified.

      -- Create a table.
      create table mf_dot (dcm1 decimal(38,18),
                           dcm2 decimal(38,18));
      -- Insert data into the table.
      insert into table mf_dot values (12.45500BD,12.3400BD);
      
      -- Set the odps.sql.decimal.tostring.trimzero parameter to true, or do not configure the odps.sql.decimal.tostring.trimzero parameter.
      set odps.sql.decimal.tostring.trimzero=true;
      -- Remove the 0s at the end of the decimal.
      select cast(round(dcm1,3) as string),cast(round(dcm2,3) as string) from mf_dot;
      -- The following result is returned:
      +------------+------------+
      | _c0        | _c1        |
      +------------+------------+
      | 12.455     | 12.34      |
      +------------+------------+
      
      -- Set the odps.sql.decimal.tostring.trimzero parameter to false.
      set odps.sql.decimal.tostring.trimzero=false;
      -- Retain the 0s at the end of the decimal.
      select cast(round(dcm1,3) as string),cast(round(dcm2,3) as string) from mf_dot;
      -- The following result is returned:
      +------------+------------+
      | _c0        | _c1        |
      +------------+------------+
      | 12.455     | 12.340     |
      +------------+------------+
      
      -- The odps.sql.decimal.tostring.trimzero parameter does not take effect for static values.
      set odps.sql.decimal.tostring.trimzero=false;
      select cast(round(12345.120BD,3) as string);
      -- The following result is returned:
      +------------+
      | _c0        |
      +------------+
      | 12345.12   |
      +------------+

COALESCE

  • Syntax

    coalesce(<expr1>, <expr2>, ...)
  • Description

    Returns the first non-null value in <expr1>, <expr2>, ....

  • Parameters

    expr: required. The values that you want to check.

  • Return value

    The data type of the return value is the same as the data type of the input parameter.

  • Examples

    • Example 1: common usage. Sample statement:

      -- The return value is 1. 
      select coalesce(null,null,1,null,3,5,7);
    • Example 2: If the data types of parameter values are not defined, an error is returned.

      • Incorrect sample statement:

        -- The value abc cannot be identified because the data type of the value abc is not defined. An error is returned. 
        select coalesce(null,null,1,null,abc,5,7);
      • Correct sample statement:

        select coalesce(null,null,1,null,'abc',5,7);
    • Example 3: If data is not read from a table and all the values of the input parameters are null, an error is returned. Incorrect sample statement:

      -- An error is returned because non-null values do not exist. 
      select coalesce(null,null,null,null);
    • Example 4: If data is read from a table and all the values of the input parameters are null, null is returned.

      Original data table:

      +-----------+-------------+------------+
      | shop_name | customer_id | toal_price |
      +-----------+-------------+------------+
      | ad        | 10001       | 100.0      |
      | jk        | 10002       | 300.0      |
      | ad        | 10003       | 500.0      |
      | tt        | NULL        | NULL       |
      +-----------+-------------+------------+

      The field values for the tt shop in the original table are all null. After the following statement is executed, the value null is returned.

      select coalesce(customer_id,total_price) from sale_detail where shop_name='tt';

COMPRESS

  • Syntax

    binary compress(string <str>)
    binary compress(binary <bin>)
  • Description

    Uses the GZIP algorithm to compress str or bin.

  • Parameters

    • str: required. A value of the STRING type.

    • bin: required. A value of the BINARY type,

  • Return value

    A value of the BINARY type is returned. If the input parameter is set to null, null is returned.

  • Examples

    • Example 1: Use the GZIP algorithm to compress string hello. Sample statement:

      -- The return value is =1F=8B=08=00=00=00=00=00=00=03=CBH=CD=C9=C9=07=00=86=A6=106=05=00=00=00. 
      select compress('hello');
    • Example 2: The input parameter is empty. Sample statement:

      -- The return value is =1F=8B=08=00=00=00=00=00=00=03=03=00=00=00=00=00=00=00=00=00. 
      select compress('');
    • Example 3: The input parameter is set to null. Sample statement:

      -- The return value is null. 
      select compress(null);

CRC32

  • Syntax

    bigint crc32(string|binary <expr>)
  • Description

    Calculates the CRC value of a value specified by expr. The value is of the STRING or BINARY type.

  • Parameters

    expr: required. A value of the STRING or BINARY type.

  • Return value

    A value of the BIGINT type is returned. The return value varies based on the following rules:

    • If the input parameter is set to null, null is returned.

    • If an input parameter is left empty, 0 is returned.

  • Examples

    • Example 1: Calculate the CRC value of string ABC. Sample statement:

      -- The return value is 2743272264. 
      select crc32('ABC');
    • Example 2: The input parameter is set to null. Sample statement:

      -- The return value is null. 
      select crc32(null);

DECODE

  • Syntax

    decode(<expression>, <search>, <result>[, <search>, <result>]...[, <default>])
  • Description

    Implements the IF-THEN-ELSE logic.

  • Parameters

    • expression: required. The expression that you want to compare.

    • search: required. The search item that is used to compare with expression.

    • result: required. The value that is returned when the value of search is the same as the value of expression.

    • default: optional. If no search items match the expression, the value of default is returned. If no value is specified for this parameter, null is returned.

    Note
    • Except for the NULL values, all other values of the result parameter must be of the same data type. If the values are of different data types, an error is returned.

    • The values of search and expression must be of the same data type. Otherwise, an error is returned.

  • Return value

    • If a search item matches the expression, result is returned.

    • If no search item matches the expression, default is returned.

    • If no value is specified for the default parameter, null is returned.

    • If duplicate search items match the expression, the value of the first search item is returned.

    • In most cases, null is returned when MaxCompute SQL calculates NULL=NULL. However, the DECODE function considers that the two null values are the same.

  • Examples

    The sale_detail table contains the shop_name (STRING), customer_id (STRING), and total_price (DOUBLE) columns. This table contains the following data:

    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    | null       | c5          | NULL        | 2014       | shanghai   |
    | s6         | c6          | 100.4       | 2014       | shanghai   |
    | s7         | c7          | 100.5       | 2014       | shanghai   |
    +------------+-------------+-------------+------------+------------+

    Sample statement:

    -- If the value of customer_id is c1, Taobao is returned. If the value is c2, Alipay is returned. If the value is c3, Aliyun is returned. If the value is null, N/A is returned. In other cases, Others is returned. 
    select
    decode(customer_id,
    'c1', 'Taobao',
    'c2', 'Alipay',
    'c3', 'Aliyun',
    Null, 'N/A',
    'Others') as result
    from sale_detail;
    -- The preceding statement is equivalent to the following statement: 
    if customer_id = c1 then
    result := 'Taobao';
    elsif customer_id = c2 then
    result := 'Alipay';
    elsif customer_id = c3 then
    result := 'Aliyun';
    ...
    else
    result := 'Others';
    end if;

    The following result is returned:

    +------------+
    | result     |
    +------------+
    | Others     |
    | Others     |
    | Others     |
    | Taobao     |
    | Alipay     |
    | Aliyun     |
    +------------+

DECOMPRESS

  • Syntax

    binary decompress(binary <bin>)
  • Description

    Uses the GZIP algorithm to decompress bin.

  • Parameters

    bin: required. A value of the BINARY type.

  • Return value

    A value of the BINARY type is returned. If the input parameter is set to null, null is returned.

  • Examples

    • Example 1: Decompress the compression result of string hello, world, and convert the decompression result into a string. Sample statement:

      -- The return value is hello, world. 
      select cast(decompress(compress('hello, world')) as string);
    • Example 2: The input parameter is set to null. Sample statement:

      -- The return value is null. 
      select decompress(null);

GET_IDCARD_AGE

  • Syntax

    get_idcard_age(<idcardno>)
  • Description

    Returns the current age based on the ID card number. The current age is the current year minus the birth year in the ID card number.

  • Parameters

    idcardno: required. A 15-digit or 18-digit ID card number of the STRING type. During calculation, the validity of the ID card is checked based on the province code and the last digit of the ID card number. If the check fails, null is returned.

  • Return value

    A value of the BIGINT type is returned. If the input parameter is set to null, null is returned.

GET_IDCARD_BIRTHDAY

  • Syntax

    get_idcard_birthday(<idcardno>)
  • Description

    Returns the date of birth based on the ID card number.

  • Parameters

    idcardno: required. A 15-digit or 18-digit ID card number of the STRING type. During calculation, the validity of the ID card is checked based on the province code and the last digit of the ID card number. If the check fails, null is returned.

  • Return value

    A value of the DATETIME type is returned. If the input parameter is set to null, null is returned.

GET_IDCARD_SEX

  • Syntax

    get_idcard_sex(<idcardno>)
  • Description

    Returns the gender based on the ID card number. Valid values: M and F. M indicates male and F indicates female.

  • Parameters

    idcardno: required. The 15-digit or 18-digit ID card number of the STRING type. During calculation, the validity of the ID card is checked based on the province code and the last digit of the ID card number. If the check fails, null is returned.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to null, null is returned.

GET_USER_ID

  • Syntax

    get_user_id()
  • Description

    Obtains the ID of the current account, which is the user ID or user identifier (UID).

  • Parameters

    No parameters are required.

  • Return value

    The ID of the current account is returned.

  • Examples

    select get_user_id();
    -- The following result is returned: 
    +------------+
    | _c0        |
    +------------+
    | 1117xxxxxxxx8519 |
    +------------+

GREATEST

  • Syntax

    greatest(<var1>, <var2>[,...])
  • Description

    Returns the maximum value of the input parameters.

  • Parameters

    var1 and var2: required. The parameters are of the BIGINT, DOUBLE, DECIMAL, DATETIME, or STRING type.

  • Return value

    • The maximum value of the input parameters is returned. If no implicit conversion is performed, the return value is of the same data type as the input parameter.

    • The value null is interpreted as the minimum value.

    • If the input parameters are of different data types, the input parameters of the DOUBLE, BIGINT, DECIMAL, and STRING types are converted into the DOUBLE type for comparison, and the input parameters of the STRING and DATETIME types are converted into the DATETIME type for comparison. Implicit conversions of other data types are not allowed.

    • If odps.sql.hive.compatible is set to true and an input parameter is set to null, null is returned.

HASH

  • Syntax

    • Syntax for MaxCompute projects that use the Hive-compatible data type edition:

      int hash(<value1>, <value2>[, ...]);
    • Syntax for MaxCompute projects that do not use the Hive-compatible data type edition:

      bigint hash(<value1>, <value2>[, ...]);
  • Description

    Returns a hash value that is obtained after a hash operation is performed on value1 and value2.

  • Parameters

    value1 and value2: required. These parameters specify the parameters on which you want to perform a hash operation. The parameters can be of different data types. The data types supported in the Hive-compatible data type edition and non-Hive-compatible data type editions are different.

    • Data types supported in the Hive-compatible data type edition: TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, BOOLEAN, STRING, CHAR, VARCHAR, DATETIME, and DATE.

    • Data types supported in non-Hive-compatible data type editions: BIGINT, DOUBLE, BOOLEAN, STRING, and DATETIME.

    Note

    If the two input parameters are set to the same value, the returned hash values are the same. However, if the two returned hash values are the same, the values of the two input parameters may not be the same, and a hash collision may occur.

  • Return value

    A value of the INT or BIGINT type is returned. If an input parameter is left empty or set to null, 0 is returned.

  • Examples

    • Example 1: Calculate the hash value of the input parameters that are of the same data type. Sample statement:

      -- The return value is 66. 
      select hash(0, 2, 4);
    • Example 2: Calculate the hash value of the input parameters that are of different data types. Sample statement:

      -- The return value is 97. 
      select hash(0, 'a');
    • Example 3: An input parameter is left empty or is set to null. Sample statement:

      -- The return value is 0. 
      select hash(0, null);
      -- The return value is 0. 
      select hash(0, '');

IF

  • Syntax

    if(<testCondition>, <valueTrue>, <valueFalseOrNull>)
  • Description

    Checks whether testCondition is true. If testCondition is true, the value of valueTrue is returned. Otherwise, the value of valueFalseOrNull is returned.

  • Parameters

    • testCondition: required. The expression that you want to evaluate. The value is of the BOOLEAN type.

    • valueTrue: required. The value that is returned when testCondition is true.

    • valueFalseOrNull: the value that is returned when testCondition is false. You can set this parameter to null.

  • Return value

    The data type of the return value is the same as the data type of valueTrue or valueFalseOrNull.

  • Examples

    -- The return value is 200. 
    select if(1=2, 100, 200); 

LEAST

  • Syntax

    least(<var1>, <var2>[,...])
  • Description

    Returns the minimum value of the input parameters.

  • Parameters

    var: required. The values of the input parameters. The parameters are of the BIGINT, DOUBLE, DECIMAL, DATETIME, or STRING type.

  • Return value

    • The minimum value of the input parameters is returned. If no implicit conversion is performed, the return value is of the same data type as the input parameter.

    • If a data type conversion is performed among the DOUBLE, BIGINT, and STRING types, a value of the DOUBLE type is returned. If a data type conversion is performed between the STRING and DATETIME types, a value of the DATETIME type is returned. If a data type conversion is performed among the DECIMAL, DOUBLE, BIGINT, and STRING types, a value of the DECIMAL type is returned. Implicit conversions of other data types are not allowed.

    • The value null is interpreted as the minimum value.

    • If the values of all input parameters are null, null is returned.

  • Examples

    -- The return value is 2. 
    select least(5, 2, 7);

MAX_PT

  • Syntax

    max_pt(<table_full_name>)
  • Description

    Returns the name of the largest level-1 partition that contains data in a partitioned table and reads the data of this partition. This function determines the largest partition by sorting partitions in alphabetical order.

    You can also use a standard SQL statement instead of the statement in which the MAX_PT function is used. For example, you can use select * from table where pt = (select max(pt) from table); instead of select * from table where pt = max_pt("table");.

    Note

    MaxCompute does not provide the MIN_PT function. If you need to obtain the smallest partition in which data is stored in a partitioned table, you cannot use the SQL statement select * from table where pt = min_pt("table");. Instead, you can use the standard SQL statement select * from table where pt = (select min(pt) from table); to achieve the same effect as the MAX_PT function.

  • Parameters

    table_full_name: required. A value of the STRING type. This parameter specifies the name of the table. You must have read permissions on the table.

  • Return value

    The name of the largest partition is returned.

    Note

    If a partition is added by using the ALTER TABLE statement and the partition contains no data, the name of this partition is not returned.

  • Examples

    • Example 1: The tbl table is a partitioned table. The partitions in the table are 20120901 and 20120902, both of which contain data. If you execute the following statement, this function returns '20120902', and the MaxCompute SQL statement reads data from the 20120902 partition. Sample statement:

      select * from tbl where pt=max_pt('myproject.tbl');
      -- The preceding statement is equivalent to the following statement: 
      select * from tbl where pt = (select max(pt) from myproject.tbl);
    • Example 2: If a partitioned table contains multiple levels of partitions, use the standard SQL statement to obtain data from the largest partition. Sample statement:

      select * from table where pt1 = (select max(pt1) from table) and pt2 = (select max(pt2) from table where pt1 = (select max(pt1) from table));

NULLIF

  • Syntax

    T nullif(T <expr1>, T <expr2>)
  • Description

    Compares the values of expr1 and expr2. If the values are the same, null is returned. If the values are different, the value of expr1 is returned.

  • Parameters

    expr1 and expr2: required. Expressions of any data type. T specifies the type of input data. The type can be a data type supported by MaxCompute.

  • Return value

    The value of expr1 or null is returned.

  • Examples

    -- The return value is 2. 
    select nullif(2, 3);
    -- The return value is null. 
    select nullif(2, 2);
    -- The return value is 3. 
    select nullif(3, null);

NVL

  • Syntax

    nvl(T <value>, T <default_value>)
  • Description

    Returns default_value if value is null. Otherwise, value is returned. The value and default_value parameters must be of the same data type.

  • Parameters

    • value: required. The input parameter. T specifies the type of input data. The type can be a data type supported by MaxCompute.

    • default_value: required. The value that is used to replace null. The data type of default_value must be the same as the data type of value.

  • Examples

    A table named t_data contains three columns: c1 string, c2 bigint, and c3 datetime. The c1 column is of the STRING type. The c2 column is of the BIGINT type. The c3 column is of the DATETIME type. This table contains the following data:

    +----+------------+------------+
    | c1 | c2 | c3 |
    +----+------------+------------+
    | NULL | 20 | 2017-11-13 05:00:00 |
    | ddd | 25 | NULL |
    | bbb | NULL | 2017-11-12 08:00:00 |
    | aaa | 23 | 2017-11-11 00:00:00 |
    +----+------------+------------+

    After the NVL function is called, the null value in c1 is returned as 00000, the null value in c2 is returned as 0, and the null value in c3 is returned as a hyphen (-). Sample statement:

    select nvl(c1,'00000'),nvl(c2,0),nvl(c3,'-') from nvl_test;
    -- The following result is returned: 
    +-----+------------+-----+
    | _c0 | _c1 | _c2 |
    +-----+------------+-----+
    | 00000 | 20 | 2017-11-13 05:00:00 |
    | ddd | 25 | - |
    | bbb | 0 | 2017-11-12 08:00:00 |
    | aaa | 23 | 2017-11-11 00:00:00 |
    +-----+------------+-----+

ORDINAL

  • Syntax

    ordinal(bigint <nth>, <var1>, <var2>[,...])
  • Description

    Sorts the values of the input variables in ascending order and returns the value that is ranked nth.

  • Parameters

    • nth: required. The position of the value, which starts from 1. A value of the BIGINT type. This parameter specifies the position of the value that you want to return. If the value is null, null is returned.

    • var: required. The values that you want to sort. A value of the BIGINT, DOUBLE, DATETIME, or STRING type.

  • Return value

    • The value that is ranked nth is returned. If no implicit conversion is performed, the return value is of the same data type as the input parameter.

    • If a data type conversion is performed between the DOUBLE, BIGINT, and STRING types, a value of the DOUBLE type is returned. If a data type conversion is performed between the STRING and DATETIME types, a value of the DATETIME type is returned. Implicit conversions of other data types are not allowed.

    • The value null is interpreted as the minimum value.

  • Examples

    -- The return value is 3. 
    SELECT ordinal(3, 1, 3, 7, 5, 2, 4, 6); 

PARTITION_EXISTS

  • Syntax

    boolean partition_exists(string <table_name>, string... <partitions>)
  • Description

    Checks whether a specified partition exists in a table.

  • Parameters

    • table_name: required. The table name, which is of the STRING type. You can specify a project name in the table name. The name of a table can be my_proj.my_table. If you do not specify a project name, the current project name is used.

    • partitions: required. The names of partitions, which are of the STRING type. In this parameter, you must specify the values of partition key columns in a table based on the sequence of the columns. The number of values must be the same as the number of partition key columns.

  • Return value

    A value of the BOOLEAN type is returned. If the specified partition exists, True is returned. Otherwise, False is returned.

  • Examples

    -- Create a partitioned table named foo. 
    create table foo (id bigint) partitioned by (ds string, hr string);
    -- Add partitions to the partitioned table foo. 
    alter table foo add partition (ds='20190101', hr='1');
    -- Check whether partitions 20190101 and 1 exist. True is returned. 
    select partition_exists('foo', '20190101', '1');

SAMPLE

  • Syntax

    boolean sample(<x>, <y>, [<column_name1>, <column_name2>[,...]])
  • Description

    Samples all values that are read from column_name based on x and y, and filters out the rows that do not meet sampling conditions.

  • Parameters

    • x and y: x is required. x and y are integer constants that are greater than 0. Their values are of the BIGINT type. The two parameters indicate that the values fall into x portions based on the hash function and the yth portion is used.

      y is optional. If no value is specified for the y parameter, the first option is used, and you do not need to specify column_name.

      If x or y is of another data type, the value of x or y is less than or equal to 0, or y is greater than x, an error is returned. If the value of x or y is null, null is returned.

    • column_name: optional. The name of the column on which sampling is performed. If no value is specified for this parameter, random sampling is performed based on the values of x and y. The name of the column can be of any data type, and the column value can be null. Implicit conversions are not performed. If column_name is set to null, an error is returned.

      Note
      • To prevent data skew due to the null value, uniform hashing is performed on the null values in column_name in x portions. If no value is specified for the column_name parameter and the amount of data is small, the output is not necessarily uniform. In this case, we recommend that you specify a value for column_name to obtain a uniform output.

      • Random sampling can be performed only on columns of the following data types: BIGINT, DATETIME, BOOLEAN, DOUBLE, STRING, BINARY, CHAR, and VARCHAR.

  • Return value

    A value of the BOOLEAN type is returned.

  • Examples

    The tbla table contains the cola column.

    -- The values in the cola column fall into four portions based on the hash function, and the first portion is used. True is returned. 
    select * from tbla where sample (4, 1 , cola);
    -- The values in each row are randomly hashed to four portions, and the second portion is used. True is returned. 
    select * from tbla where sample (4, 2);

SHA

  • Syntax

    string sha(string|binary <expr>)
  • Description

    Calculates the SHA-1 hash value of expr that is of the STRING or BINARY type and returns the SHA-1 hash value in hexadecimal string format.

  • Parameters

    expr: required. A value of the STRING or BINARY type.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to null, null is returned.

  • Examples

    • Example 1: Calculate the SHA hash value of string ABC. Sample statement:

      -- The return value is 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8. 
      select sha('ABC');
    • Example 2: The input parameter is set to null. Sample statement:

      -- The return value is null. 
      select sha(null);

SHA1

  • Syntax

    string sha1(string|binary <expr>)
  • Description

    Calculates the SHA-1 hash value of expr that is of the STRING or BINARY type and returns the SHA-1 hash value in hexadecimal string format.

  • Parameters

    expr: required. A value of the STRING or BINARY type.

  • Return value

    A value of the STRING type is returned. If the input parameter is set to null, null is returned.

  • Examples

    • Example 1: Calculate the SHA-1 hash value of string ABC. Sample statement:

      -- The return value is 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8. 
      select sha1('ABC');
    • Example 2: The input parameter is set to null. Sample statement:

      -- The return value is null. 
      select sha1(null);

SHA2

  • Syntax

    string sha2(string|binary <expr>, bigint <number>)
  • Description

    Calculates the SHA-2 hash value of expr that is of the STRING or BINARY type and returns the SHA-2 hash value in the format specified by number.

  • Parameters

    • expr: required. A value of the STRING or BINARY type.

    • number: required. A value of the BIGINT type. This parameter specifies the hash bit length, which must be 224, 256, 384, 512, or 0. The value that is returned when this parameter is set to 256 is the same as the value that is returned when this parameter is set to 0.

  • Return value

    A value of the STRING type is returned. The return value varies based on the following rules:

    • An input parameter is set to null.

    • The value of number does not fall in the valid value range.

  • Examples

    • Example 1: Calculate the SHA-2 hash value of string ABC. Sample statement:

      -- The return value is b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78. 
      select sha2('ABC', 256);
    • Example 2: An input parameter is set to null. Sample statement:

      -- The return value is null. 
      select sha2('ABC', null);

STACK

  • Syntax

    stack(n, expr1, ..., exprk) 
  • Description

    Splits expr1, ..., exprk into n rows. Unless otherwise specified, the output result uses the default column names col0, col1....

  • Parameters

    • n: required. The number of rows obtained after splitting.

    • expr: required. The parameter that you want to split. expr1,... exprk must be of the INTEGER type, and the number of parameters must be an integer multiple of n. The parameter must be able to be split into n complete rows. Otherwise, an error is returned.

  • Return value

    n rows with a specific number of columns are returned. The number of columns is equal to the number of parameters divided by n.

  • Examples

    -- Split the parameter group of 1, 2, 3, 4, 5, 6 into three rows. 
    select stack(3, 1, 2, 3, 4, 5, 6);
    -- The following result is returned: 
    +------+------+
    | col0 | col1 |
    +------+------+
    | 1    | 2    |
    | 3    | 4    |
    | 5    | 6    |
    +------+------+
    
    -- Split 'A',10,date '2015-01-01','B',20,date '2016-01-01' into two rows.
    select stack(2,'A',10,date '2015-01-01','B',20,date '2016-01-01') as (col0,col1,col2);
    -- The following result is returned: 
    +------+------+------+
    | col0 | col1 | col2 |
    +------+------+------+
    | A    | 10   | 2015-01-01 |
    | B    | 20   | 2016-01-01 |
    +------+------+------+
    
    -- Split the parameter group of a, b, c, and d into two rows. If the source table contains multiple rows, this function is called for each row. 
    select stack(2,a,b,c,d) as (col,value)
    from values 
        (1,1,2,3,4),
        (2,5,6,7,8),
        (3,9,10,11,12),
        (4,13,14,15,null)
    as t(key,a,b,c,d);
    -- The following result is returned: 
    +------+-------+
    | col  | value |
    +------+-------+
    | 1    | 2     |
    | 3    | 4     |
    | 5    | 6     |
    | 7    | 8     |
    | 9    | 10    |
    | 11   | 12    |
    | 13   | 14    |
    | 15   | NULL  |
    +------+-------+
    
    -- Use this function with the LATERAL VIEW clause. 
    select tf.* from (select 0) t lateral view stack(2,'A',10,date '2015-01-01','B',20, date '2016-01-01') tf as col0,col1,col2;
    -- The following result is returned: 
    +------+------+------+
    | col0 | col1 | col2 |
    +------+------+------+
    | A    | 10   | 2015-01-01 |
    | B    | 20   | 2016-01-01 |
    +------+------+------+

STR_TO_MAP

  • Syntax

    str_to_map([string <mapDupKeyPolicy>,] <text> [, <delimiter1> [, <delimiter2>]])
  • Description

    Splits text into key-value pairs by using delimiter1 and then separates keys from values in the key-value pairs by using delimiter2.

  • Parameters

    • mapDupKeyPolicy: optional. A value of the STRING type. This parameter specifies the method that is used to process duplicate keys. Valid values:

      • exception: An error is returned.

      • last_win: The latter key overwrites the former key.

      You can also specify the odps.sql.map.key.dedup.policy parameter at the session level to configure the method that is used to process duplicate keys. For example, you can set odps.sql.map.key.dedup.policy to exception. If you do not specify this parameter, the default value last_win is used.

      Note

      The behavior implementation of MaxCompute is determined based on mapDupKeyPolicy. If you do not specify mapDupKeyPolicy, the value of odps.sql.map.key.dedup.policy is used.

    • text: required. The string that you want to split, which is of the STRING type.

    • delimiter1: optional. The delimiter of the STRING type. If no value is specified for this parameter, commas (,) are used.

    • delimiter2: optional. The delimiter of the STRING type. If no value is specified for this parameter, equal signs (=) are used.

      Note

      If the delimiter is a regular expression or special character, you must add two backslashes (\\) before the delimiter for escaping. The following special characters can be used as a delimiter: . ? + * :.

  • Return value

    A value of the MAP<STRING, STRING> type is returned. The return value indicates that the string specified by text is split by using delimiter1 and delimiter2.

  • Examples

    -- The return value is {test1:1, test2:2}. 
    select str_to_map('test1&1-test2&2','-','&');
    -- The return value is {test1:1, test2:2}. 
    select str_to_map("test1.1,test2.2", ",", "\\.");
    -- The return value is {test1:1, test2:3}. 
    select str_to_map("test1.1,test2.2,test2.3", ",", "\\.");

TABLE_EXISTS

  • Syntax

    boolean table_exists(string <table_name>)
  • Description

    Checks whether a specified table exists.

  • Parameters

    table_name: required. The name of the table. A value of the STRING type. You can specify a project name in the table name. The name of a table can be my_proj.my_table. If you do not specify a project name, the current project name is used.

  • Return value

    A value of the BOOLEAN type is returned. If the specified table exists, True is returned. Otherwise, False is returned.

  • Examples

    -- Use this function for the list in the SELECT statement. 
    select if(table_exists('abd'), col1, col2) from src;

TRANS_ARRAY

  • Limits

    • All columns that are used as keys must be placed before the columns that are to be transposed.

    • Only one UDTF is allowed in a SELECT statement.

    • This function cannot be used with the GROUP BY, CLUSTER BY, DISTRIBUTE BY, or SORT BY clause.

  • Syntax

    trans_array (<num_keys>, <separator>, <key1>,<key2>,...,<col1>,<col2>,<col3>) as (<key1>,<key2>,...,<col1>, <col2>)
  • Description

    Transposes one row of data into multiple rows. This function is a UDTF that transposes an array separated by fixed delimiters in a column into multiple rows.

  • Parameters

    • num_keys: required. The value is a constant of the BIGINT type and must be greater than or equal to 0. This parameter specifies the number of columns that can be used as keys when you transpose one row into multiple rows.

    • separator: required. The value is a constant of the STRING type. This parameter is used to split a string into multiple elements. If this parameter is left empty, an error is returned.

    • keys: required. The columns that are used as keys when you transpose one row into multiple rows. The number of keys is specified by num_keys. If all columns are used as keys and num_keys is equal to the total number of all columns, only one row is returned.

    • cols: required. This parameter specifies the array that you want to transpose into rows. All columns that follow keys are considered arrays that you want to transpose. The value of this parameter must be of the STRING type to store arrays in the STRING format, such as Hangzhou;Beijing;Shanghai. The values in this array are separated by semicolons (;).

  • Return value

    Transposed rows are returned. The new column name is specified by as. The data types of columns that are used as keys remain unchanged. All other columns are of the STRING type. The number of transposed rows is based on the array with the maximum number of elements. If the number of rows is insufficient, the value null is added.

  • Examples

    • Example 1: The t_table table contains the following data:

      +----------+----------+------------+
      | login_id | login_ip | login_time |
      +----------+----------+------------+
      | wangwangA | 192.168.0.1,192.168.0.2 | 20120101010000,20120102010000 |
      | wangwangB | 192.168.45.10,192.168.67.22,192.168.6.3 | 20120111010000,20120112010000,20120223080000 |
      +----------+----------+------------+
      -- Execute the following SQL statement: 
      select trans_array(1, ",", login_id, login_ip, login_time) as (login_id,login_ip,login_time) from t_table;
      -- The following result is returned: 
      +----------+----------+------------+
      | login_id | login_ip | login_time |
      +----------+----------+------------+
      | wangwangB | 192.168.45.10 | 20120111010000 |
      | wangwangB | 192.168.67.22 | 20120112010000 |
      | wangwangB | 192.168.6.3 | 20120223080000 |
      | wangwangA | 192.168.0.1 | 20120101010000 |
      | wangwangA | 192.168.0.2 | 20120102010000 |
      +----------+----------+------------+
      
      -- The table contains the following data: 
      Login_id LOGIN_IP LOGIN_TIME 
      wangwangA 192.168.0.1,192.168.0.2 20120101010000
      -- The value null is added to supplement the array in which data is insufficient.  
      Login_id Login_ip Login_time 
      wangwangA 192.168.0.1 20120101010000
      wangwangA 192.168.0.2 NULL
    • Example 2: The mf_fun_array_test_t table contains the following data:

      +------------+------------+------------+------------+
      | id         | name       | login_ip   | login_time |
      +------------+------------+------------+------------+
      | 1          | Tom        | 192.168.100.1,192.168.100.2 | 20211101010101,20211101010102 |
      | 2          | Jerry      | 192.168.100.3,192.168.100.4 | 20211101010103,20211101010104 |
      +------------+------------+------------+------------+
      
      -- Use the id and name columns as keys to transpose data in the table. Execute the following SQL statement: 
      select trans_array(2, ",", Id,Name, login_ip, login_time) as (Id,Name,login_ip,login_time) from mf_fun_array_test_t;
      -- The following result is returned: 
      +------------+------------+------------+------------+
      | id         | name       | login_ip   | login_time |
      +------------+------------+------------+------------+
      | 1          | Tom        | 192.168.100.1 | 20211101010101 |
      | 1          | Tom        | 192.168.100.2 | 20211101010102 |
      | 2          | Jerry      | 192.168.100.3 | 20211101010103 |
      | 2          | Jerry      | 192.168.100.4 | 20211101010104 |
      +------------+------------+------------+------------+

TRANS_COLS

  • Limits

    • All columns that are used as keys must be placed before the columns that are to be transposed.

    • Only one UDTF is allowed in a SELECT statement.

  • Syntax

    trans_cols (<num_keys>, <key1>,<key2>,...,<col1>, <col2>,<col3>) as (<idx>, <key1>,<key2>,...,<col1>, <col2>)
  • Description

    Transposes one row of data into multiple rows. This function is a UDTF that transposes columns into rows.

  • Parameters

    • num_keys: required. The value is a constant of the BIGINT type and must be greater than or equal to 0. This parameter specifies the number of columns that can be used as keys when you transpose one row into multiple rows.

    • keys: required. This parameter specifies the columns that are used as keys when you transpose one row into multiple rows. The number of keys is specified by num_keys. If all columns are used as keys and the value of num_keys is equal to the total number of all columns, only one row is returned.

    • idx: required. The ID of a row after the row is transposed.

    • cols: required. The columns that you want to transpose into rows.

  • Return value

    Transposed rows are returned. The new column name is specified by as. The first output column is the transposed subscript, which starts from 1. The data types of the columns that are used as keys remain unchanged, and the data types of other columns remain unchanged.

  • Examples

    The t_table table contains the following data:

    +----------+----------+------------+
    | Login_id | Login_ip1 | Login_ip2 |
    +----------+----------+------------+
    | wangwangA | 192.168.0.1 | 192.168.0.2 |
    +----------+----------+------------+
    -- Execute the following SQL statement: 
    select trans_cols(1, login_id, login_ip1, login_ip2) as (idx, login_id, login_ip) from t_table;
    -- The following result is returned: 
    idx    login_id    login_ip
    1    wangwangA    192.168.0.1
    2    wangwangA    192.168.0.2

UNBASE64

  • Syntax

    binary unbase64(string <str>)
  • Description

    Converts a Base64-encoded string specified by str into a binary value.

  • Parameters

    str: required. A value of the STRING type. It is a Base64-encoded string that you want to convert.

  • Return value

    A value of the BINARY type is returned. If the input parameter is set to null, null is returned.

  • Examples

    • Example 1: Convert the string YWxpYmFiYQ== into a binary value. Sample statement:

      -- The return value is alibaba. 
      select unbase64('YWxpYmFiYQ==');
    • Example 2: The input parameter is set to null. Sample statement:

      -- The return value is null. 
      select unbase64(null);

UNIQUE_ID

  • Syntax

    string unique_id()
  • Description

    Returns a unique ID, such as 29347a88-1e57-41ae-bb68-a9edbdd9****_1. This function is more efficient than the UUID function, and the returned ID is longer. Compared with the UUID function, this function returns a unique ID that ends with the suffix "_Digit", such as _1.

UUID

  • Syntax

    string uuid()
  • Description

    Returns a random ID, such as 29347a88-1e57-41ae-bb68-a9edbdd9****.

    Note

    The return value is a random global ID, which is unique in most cases.