Creates a user-defined function (UDF) in a MaxCompute project.
Prerequisites
Before you begin, ensure that you have:
- Added the required resources to your MaxCompute project using the ADD JAR command. For more information, see ADD JAR.
Limitations
- Function names must be unique within a project. You cannot create a function with the same name as an existing function.
- By default, a UDF cannot overwrite a built-in MaxCompute function. Only the project owner can create a UDF that overwrites a built-in function. If a UDF overwrites a built-in function, a warning appears in the Summary of Logview for the job after the SQL statement runs.
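Because a function name can be used only once per project, reusing a name means removing the existing function first. The following is a minimal sketch of that sequence, assuming a function named my_lower already exists and that you have permission to drop it. The function, class, and JAR names are borrowed from the Java example in this topic for illustration only.

```sql
-- Inspect the existing function (name, owner, class name, resource list).
DESC FUNCTION my_lower;

-- Delete the existing function so that its name becomes available again.
DROP FUNCTION my_lower;

-- Create the function again under the same name.
CREATE FUNCTION my_lower AS 'org.alidata.odps.udf.examples.Lower' USING 'my_lower.jar';
```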
Syntax
CREATE FUNCTION <function_name> AS <'package_to_class'> USING <'resource_list'>;
Parameters
| Parameter | Required | Description |
|---|---|---|
| function_name | Yes | The name of the UDF to create. Must be unique within the project. |
| package_to_class | Yes | The class of the UDF. Case-sensitive. Must be enclosed in single quotation marks ('). For a Java UDF, specify the fully qualified class name from the top-level package to the UDF class. For a Python UDF, use the <Python_script_name>.<Class_name> format. |
| resource_list | Yes | The resources used by the UDF. Must be enclosed in single quotation marks ('). Separate multiple resources with commas (,). To reference a resource from another project, use the <project_name>/resources/<resource_name> format. |
Notes on package_to_class for Python UDFs
The Python script name refers to the underlying resource name that uniquely identifies the resource. For example, if you upload a resource as pyudf_test.py and then rename it to PYUDF_TEST.py in DataStudio or overwrite it using the MaxCompute client, the underlying resource name remains pyudf_test.py. When you register the UDF, the class name must therefore be pyudf_test.SampleUDF. Run LIST RESOURCES; to view the underlying names of all resources.
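The renaming scenario above can be sketched as follows. This assumes that the SampleUDF class is defined in the resource whose underlying name is pyudf_test.py. The function name sample_udf is a hypothetical placeholder and does not come from the product documentation.

```sql
-- View the underlying names of all resources in the project.
LIST RESOURCES;

-- Register the class using the underlying resource name (pyudf_test.py),
-- not the renamed display name (PYUDF_TEST.py).
-- sample_udf is a hypothetical function name used only for illustration.
CREATE FUNCTION sample_udf AS 'pyudf_test.SampleUDF' USING 'pyudf_test.py';
```

Because package_to_class is case-sensitive, a value such as 'PYUDF_TEST.SampleUDF' would not match the underlying resource name in this scenario.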
Notes on resource_list
- The list must include the resources that contain the UDF code. Make sure those resources are already uploaded to MaxCompute.
- If the UDF calls the Distributed Cache API to read resource files, include those resource files in the list as well.
- If schema is enabled and you need to use resources from other projects, see Work with objects in a schema.
Examples
Create a Java UDF
The following example creates the my_lower function. The Java UDF class org.alidata.odps.udf.examples.Lower is in my_lower.jar.
CREATE FUNCTION my_lower AS 'org.alidata.odps.udf.examples.Lower' USING 'my_lower.jar';
Create a Python UDF from another project
The following example creates the my_lower function. The Python UDF class MyLower is defined in pyudf_test.py within the test_project project.
CREATE FUNCTION my_lower AS 'pyudf_test.MyLower' USING 'test_project/resources/pyudf_test.py';
Create a UDF with multiple resource dependencies
The following example creates the test_udtf function. The Java UDF class com.aliyun.odps.examples.udf.UDTFResource is in udtfexample1.jar. The function also depends on a file resource (file_resource.txt), a table resource (table_resource1), and an archive resource (test_archive.zip).
CREATE FUNCTION test_udtf AS 'com.aliyun.odps.examples.udf.UDTFResource' USING 'udtfexample1.jar, file_resource.txt, table_resource1, test_archive.zip';
What's next
- FUNCTION: Create a temporary SQL function that applies only to the current SQL script and is not stored in the MaxCompute metadata system.
- DROP FUNCTION: Delete an existing UDF from a MaxCompute project.
- DELETE FUNCTION: Delete a function using the delete_function() method of a MaxCompute entry object.
- DESC FUNCTION: View the details of a UDF, including its name, owner, creation time, class name, and resource list.
- LIST FUNCTIONS: View all UDFs in a MaxCompute project.
- UPDATE FUNCTION: Update a UDF using the update method of MaxCompute.