MaxCompute: CREATE FUNCTION

Last Updated: Jul 17, 2023

Creates a user-defined function (UDF) in a MaxCompute project.

Prerequisites

Before you create a UDF, add the resources that it requires to the target MaxCompute project by executing the add jar <localfile> [comment '<comment>'][-f]; statement. For more information, see ADD JAR.
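As a rough illustration of the statement shape above, the following Python sketch assembles an add jar statement from its parts. The helper name build_add_jar and its parameters are hypothetical and not part of MaxCompute. The helper only formats text and does not connect to a project, and it assumes that the comment contains no single quotation marks.

```python
def build_add_jar(local_file, comment=None, force=False):
    """Format an `add jar` statement following the syntax shown above.

    Hypothetical helper for illustration only: it builds a string and
    does not run anything against MaxCompute.
    """
    if not local_file:
        raise ValueError("local_file is required")
    parts = ["add jar", local_file]
    if comment is not None:
        # This sketch does not attempt to escape quotation marks.
        if "'" in comment:
            raise ValueError("comment must not contain single quotation marks")
        parts.append("comment '%s'" % comment)
    if force:
        # Appends the optional -f flag from the syntax line.
        parts.append("-f")
    return " ".join(parts) + ";"


if __name__ == "__main__":
    print(build_add_jar("my_lower.jar"))
    print(build_add_jar("my_lower.jar", comment="lowercase UDF", force=True))
```

The generated text can then be pasted into the MaxCompute client. See ADD JAR for the exact meaning of each option.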

Limits

  • Function names must be unique in a project. You cannot create a function that has the same name as an existing function in the project.

  • By default, UDFs cannot overwrite built-in functions of MaxCompute. Only the project owner can create a UDF that overwrites a built-in function. If you use a UDF that overwrites a built-in function, a warning is displayed in the Summary section of the Logview of your job after the SQL statement is executed.

Syntax

create function <function_name> as <'package_to_class'> using <'resource_list'>;

Parameters

  • function_name: required. The name of the UDF that you want to create.

  • package_to_class: required. The class of the UDF that you want to create. This parameter is case-sensitive and must be enclosed in single quotation marks (').

    • For a Java UDF, specify this name as a fully qualified class name from the top-level package name to the UDF class name.

    • For a Python UDF, specify this name in the <Python script name>.<class name> format, where the script name is given without the .py extension.

      Note

      The Python script name is the underlying resource name that uniquely identifies a resource. MaxCompute resource names are not case-sensitive, so the underlying name is fixed when the resource is first uploaded. For example, suppose that you first upload a resource as pyudf_test.py. If you later rename the resource to PYUDF_TEST.py in DataStudio, or use PYUDF_TEST.py to overwrite pyudf_test.py on the MaxCompute client, the underlying resource name is still pyudf_test.py. When you create a UDF based on this resource, the class name must therefore be pyudf_test.SampleUDF. To view the underlying names of all resources, execute the list resource; statement.

  • resource_list: required. The list of resources used by the UDF.

    • The resource list must include the resources that contain the UDF code. Make sure that the resources are uploaded to MaxCompute.

    • If the code calls the Distributed Cache API to read resource files, this resource list must also contain the list of resource files that are read by the UDF.

    • The resource list consists of one or more resource names, and the list as a whole must be enclosed in single quotation marks ('). Separate multiple resource names with commas (,).

    • To specify the project that contains the resource, configure the parameter in the <project_name>/resources/<resource_name> format.
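The parameter rules above can be sketched as a small Python helper that assembles a create function statement. The names qualify_resource and build_create_function are hypothetical and are used only for illustration. The helper formats a string and does not validate that the resources exist in any project.

```python
def qualify_resource(resource_name, project_name=None):
    """Return the resource name, prefixed with its project if one is given."""
    if project_name:
        return "%s/resources/%s" % (project_name, resource_name)
    return resource_name


def build_create_function(function_name, package_to_class, resources):
    """Format a `create function` statement from the three required parameters.

    `resources` is a list of resource names. The names are joined with
    commas and the whole list is wrapped in one pair of single quotes.
    """
    if not function_name or not package_to_class or not resources:
        raise ValueError("function_name, package_to_class, and resources are required")
    for value in [package_to_class] + list(resources):
        # This sketch rejects embedded quotation marks instead of escaping them.
        if "'" in value:
            raise ValueError("values must not contain single quotation marks")
    resource_list = ", ".join(resources)
    return "create function %s as '%s' using '%s';" % (
        function_name, package_to_class, resource_list)


if __name__ == "__main__":
    # A Python UDF whose script lives in another project.
    script = qualify_resource("pyudf_test.py", "test_project")
    print(build_create_function("my_lower", "pyudf_test.MyLower", [script]))
```

Passing the class path and the resource list as separate arguments keeps the two quoted parts of the statement distinct, which is where quoting mistakes tend to occur.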

Examples

  • Example 1: Create the my_lower function. In this example, the Java UDF class org.alidata.odps.udf.examples.Lower is in my_lower.jar.

    create function my_lower as 'org.alidata.odps.udf.examples.Lower' using 'my_lower.jar';

  • Example 2: Create the my_lower function. In this example, the Python UDF class MyLower is in the pyudf_test.py script of the test_project project.

    create function my_lower as 'pyudf_test.MyLower' using 'test_project/resources/pyudf_test.py';

  • Example 3: Create the test_udtf function. In this example, the Java UDF class com.aliyun.odps.examples.udf.UDTFResource is in udtfexample1.jar. The function depends on the file resource file_resource.txt, the table resource table_resource1, and the archive resource test_archive.zip.

    create function test_udtf as 'com.aliyun.odps.examples.udf.UDTFResource' using 'udtfexample1.jar, file_resource.txt, table_resource1, test_archive.zip';
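For context on Example 2, the sketch below shows what a pyudf_test.py script that contains a MyLower class might look like. The page does not show the script itself, so the function body and the string->string signature are assumptions. The annotate decorator is imported from odps.udf, which is provided by the MaxCompute Python UDF runtime. The fallback definition is a local stand-in so that the class can be exercised outside MaxCompute.

```python
try:
    # Provided by the MaxCompute Python UDF runtime.
    from odps.udf import annotate
except ImportError:
    # Local stand-in (assumption) used only when odps.udf is unavailable.
    def annotate(signature):
        def wrap(cls):
            return cls
        return wrap


@annotate("string->string")
class MyLower(object):
    """Hypothetical UDF body: returns the input string in lowercase."""

    def evaluate(self, value):
        # Pass NULL (None) through unchanged.
        if value is None:
            return None
        return value.lower()


if __name__ == "__main__":
    print(MyLower().evaluate("MaxCompute"))
```

With a script like this uploaded as pyudf_test.py, the class path in the create function statement is pyudf_test.MyLower, which matches the <Python script name>.<class name> rule described under Parameters.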

Related statements

  • FUNCTION: If you do not need to store SQL functions in the metadata system of MaxCompute, you can create temporary SQL functions. These functions apply only to the current SQL script.

  • DELETE FUNCTION: Deletes a function. You can write a UDF and call the delete_function() method of a MaxCompute entry object to delete the UDF.

  • DROP FUNCTION: Deletes an existing UDF from a MaxCompute project.

  • DESC FUNCTION: Views the information of a specified UDF in a MaxCompute project. The information includes the name, owner, creation time, class name, and resource list of the UDF.

  • LIST FUNCTIONS: Views the information of all UDFs in a MaxCompute project.

  • UPDATE FUNCTION: Updates a function. You can write a UDF and call the update method of MaxCompute to update the UDF.