SQL functions allow you to reference SQL user-defined functions (UDFs) in SQL scripts. This topic describes how to manage SQL functions.

Background information

SQL functions of MaxCompute can be used to resolve the following issues:
  • In most cases, a large amount of similar code exists, which is inconvenient to maintain and prone to errors. To use UDFs, you must develop code, compile the code in Java, and then create resources and functions. The process is complicated. In addition, UDFs cannot catch up with built-in functions in terms of performance. The following code provides an example:
    SELECT
        NVL(STR_TO_MAP(GET_JSON_OBJECT(col, '$.key1')), 'default') AS key1,
        NVL(STR_TO_MAP(GET_JSON_OBJECT(col, '$.key2')), 'default') AS key2,
        ...
        NVL(STR_TO_MAP(GET_JSON_OBJECT(col, '$.keyN')), 'default') AS keyN
    FROM t;
  • One function can be passed to another function as a parameter to another parameter by using code similar to lambda expressions in Java.

Features

SQL functions are a type of UDFs. They allow you to create UDFs in SQL and use function type parameters and anonymous functions. This way, you can define your business logic more flexibly. You can use SQL functions to simplify feature implementation and improve code reuse. SQL functions provide the following features:
  • SQL functions allow you to reference and call SQL UDFs in SQL scripts.
  • You can specify built-in functions, UDFs, or SQL functions in function type parameters when you call SQL functions.
  • You can specify anonymous functions in function type parameters when you call SQL functions.

Create a permanent SQL function

After you create a permanent SQL function and store it in the metadata system, all query statements can reference this function. To reference an SQL function in a query statement, you must be granted relevant permissions on the function. For more information, see Authorize users and Grant access to a specific UDF to a specified user. Use the following syntax to create a permanent SQL function:
CREATE SQL FUNCTION function_name(@parameter_in1 datatype[, @parameter_in2 datatype...]) 
[RETURNS @parameter_out datatype] 
AS [BEGIN] 
function_expression 
[END];
  • function_name: the name of the SQL function that you want to create. Each function name must be unique and can be registered only once. The names of SQL functions must differ from those of built-in functions.
  • parameter_in: the input parameters of the SQL function.
  • RETURNS: the variable to be returned by the SQL function. If you do not specify RETURNS, the value of function_name is returned by default.
  • parameter_out: the output parameters of the SQL function.
  • function_expression: the expression of the SQL function.
The following code provides an example:
CREATE SQL FUNCTION MY_ADD(@a BIGINT) AS @a + 1;
In the preceding example, @a + 1 indicates the implementation logic of the SQL function. You can write it as an expression to specify a built-in operator, built-in function, or UDF.
If the implementation logic is complex, you can write multiple SQL statements and enclose them with BEGIN and END. RETURNS specifies the variable to be returned by the SQL function. If you do not specify RETURNS, the value of function_name is returned by default. The following code provides an example:
CREATE SQL FUNCTION MY_SUM(@a BIGINT, @b BIGINT, @c BIGINT) RETURNS @my_sum BIGINT
AS BEGIN 
    @temp := @a + @b;
    @my_sum := @temp + @c;
END;
Note To write multiple SQL statements, you must use Script Mode SQL. For more information, see Script Mode SQL.

Query an SQL function

You can query an SQL function the same way you query a Java or Python UDF. The following code provides an example:
DESC FUNCTION my_add;
Note To query an SQL function on a client, make sure that the client version is later than 0.34.0.

Delete an SQL function

You can delete an SQL function the same way you delete a Java or Python UDF. The following code provides an example:
DROP FUNCTION my_add;

Call an SQL function

You can call an SQL function the same way you call a built-in function. The following code provides an example:
SELECT my_sum(col1, col2 ,col3) from t;

Define a temporary SQL function

If you need only to call an SQL function in a specific SQL script, you can define a temporary SQL function in the script. The temporary SQL function is not stored in the MaxCompute metadata system, and applies only to the current SQL script. The following code provides an example:
FUNCTION MY_ADD(@a BIGINT) AS @a + 1;
SELECT MY_ADD(key), MY_ADD(value) FROM src;
Note To write multiple SQL statements, you must use Script Mode SQL. For more information, see Script Mode SQL.

Use function type parameters

You can specify built-in functions, UDFs, or SQL functions in function type parameters when you call SQL functions. The following code provides an example:
FUNCTION ADD(@a BIGINT) AS @a + 1;
FUNCTION OP(@a BIGINT, @fun FUNCTION (BIGINT) RETURNS BIGINT) AS @fun(@a);
SELECT OP(key, ADD), OP(key, ABS) FROM VALUES (1),(2) AS t (key);

-- Output
+------------+------------+
| _c0        | _c1        |
+------------+------------+
| 2          | 1          |
| 3          | 2          |
+------------+------------+
In the example, two input parameters are specified for the OP function. The @a parameter specifies a value of the BIGINT type. The @fun parameter specifies a function whose input and output parameters are both of the BIGINT type. The OP function transfers @a to the function that is specified by @fun and then to the ADD and ABS functions for processing. For more information about the ABS function, see Mathematical functions.

Use anonymous functions

You can specify anonymous functions in function type parameters when you call SQL functions. The following code provides an example:
FUNCTION OP(@a BIGINT, @fun FUNCTION (BIGINT) RETURNS BIGINT) AS @fun(@a);
SELECT OP(key, FUNCTION (@a) AS @a + 1) FROM VALUES (1),(2) AS t (key);
In the example, FUNCTION (@a) AS @a + 1 is an anonymous function. You do not need to specify a data type for the input parameter @a. The compiler infers the data type of @a based on the parameter definition of the OP function.

Examples

Scenario: Convert the format of a date from yyyy-mm-dd to yyyymmdd.

For example, the formats of the following dates need to be converted: 2020-11-21, 2020-1-01, 2019-5-1, and 2019-12-1.

Solutions:
  • Solution 1: Repeatedly call a function, which causes low code reuse rate. Therefore, we recommend that you do not use this solution. The following code provides an example:
    SELECT concat(lpad(split_part(d, '-', 1), 4, '0'), lpad(split_part(d, '-', 2), 2, '0'), lpad(split_part(d, '-', 3), 2, '0')) from values('2020-11-21'),('2020-1-01'), ('2019-5-1'), ('19-12-1') t (d);
  • Solution 2: Use an SQL function. We recommend that you use this solution. The following code provides an example:
    CREATE SQL FUNCTION y_m_d2yyyymmdd(@y_m_d string) RETURNS @yyyymmdd string
    AS BEGIN
        @yyyymmdd := concat(lpad(split_part(@y_m_d, '-', 1), 4, '0'), lpad(split_part(@y_m_d, '-', 2), 2, '0'), lpad(split_part(@y_m_d, '-', 3), 2, '0')) ;
    END;
    
    SELECT y_m_d2yyyymmdd(d) from values('2020-11-21'),('2020-1-01'), ('2019-5-1'), ('19-12-1') t (d);
    The following result is returned:
    +------------+
    | _c0        |
    +------------+
    | 20201121   |
    | 20200101   |
    | 20190501   |
    | 00191201   |
    +------------+