DataWorks lets you register MaxCompute user-defined functions (UDFs) that are implemented in Python or Java. This topic describes how to register a function in DataStudio.

Prerequisites

The required resources are uploaded.

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region in which the workspace that you want to manage resides. Find the workspace and click DataStudio in the Actions column.
  2. Create a workflow. For more information, see Create a workflow.
  3. Create a Java Archive (JAR) or Python resource, and then commit and deploy the resource. For more information, see Create a MaxCompute resource.
  4. Create a function.
    1. Expand the desired workflow, right-click MaxCompute, and then choose Create > Function.
    2. In the Create Function dialog box, configure the Function Name and Location parameters.
    3. Click Create.
    4. In the Register Function section of the configuration tab that appears, configure the parameters that are described in the following table.
      Register Function parameters
      Parameter | Description
      Function Type | The type of the function. Valid values: Mathematical Operation Functions, Aggregate Functions, String Processing Functions, Date Functions, Window Functions, and Other Functions.
      Engine Instance MaxCompute | The MaxCompute compute engine instance. The value of this parameter cannot be changed.
      Function Name | The name of the user-defined function (UDF). You can use this name to reference the UDF in SQL statements. The UDF name must be globally unique and cannot be changed after the UDF is registered.
      Owner | The owner of the UDF. The account that is used to log on to the DataWorks console is automatically displayed. You can change the value of this parameter.
      Class Name | The name of the class that implements the UDF. Specify the value in the Resource name.Class name format, where the resource name is the name of a Java package or a Python resource.
        In DataWorks, you can reference MaxCompute resources, including JAR and Python packages, to create UDFs. The value format of this parameter varies based on the resource type:
        • If the resource type is JAR, set the Class Name parameter in the Java package name.Actual class name format. You can obtain the fully qualified class name by using the Copy Reference command in IntelliJ IDEA.
          For example, if com.aliyun.odps.examples.udf is the Java package name and UDAFExample is the class name, the value of the Class Name parameter is com.aliyun.odps.examples.udf.UDAFExample.
        • If the resource type is Python, set the Class Name parameter in the Python resource name.Actual class name format.
          For example, if LcLognormDist_sh is the Python resource name and LcLognormDist_sh is the class name, the value of the Class Name parameter is LcLognormDist_sh.LcLognormDist_sh.
        Note
        • You do not need to include the .jar or .py suffix in the resource name.
        • You can use a resource only after the resource is committed and deployed. For more information about how to create a MaxCompute resource, see Create a MaxCompute resource.
      Resources | Required. The resources that the UDF references. You can use fuzzy match to search for existing resources in the current workspace.
        Note
        • You do not need to specify the path of the added resources.
        • If the UDF references multiple resources, separate them with commas (,).
      Description | The description of the UDF.
      Expression Syntax | The syntax of the UDF. Example: test.
      Parameter Description | The description of the input and output parameters that are supported.
      Return Value | Optional. The return value of the UDF. Example: 1.
      Example | Optional. An example of how to use the UDF.
  5. Click the Save icon in the top toolbar.
  6. Commit the UDF.
    1. Click the Submit icon in the top toolbar.
    2. In the Commit Node dialog box, enter your comments in the Change description field.
    3. Click OK.
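
Example

The following sketch shows what a Python resource for this procedure might look like and how the Class Name and Resources values described in the table could be derived from it. It is an illustration under stated assumptions, not part of the product: the file name my_plus.py, the class MyPlus, and the helpers python_class_name_value and resources_value are hypothetical. The annotate decorator comes from the odps.udf module of the MaxCompute Python UDF runtime; the fallback definition exists only so that the sketch also runs on a machine where that module is not installed.

```python
# my_plus.py: hypothetical Python resource; file and class names are illustrative.
import os

try:
    # Provided by the MaxCompute Python UDF runtime.
    from odps.udf import annotate
except ImportError:
    # Local stand-in (assumption) so the sketch runs outside MaxCompute.
    def annotate(signature):
        def wrap(cls):
            return cls
        return wrap


@annotate("bigint,bigint->bigint")
class MyPlus(object):
    """Scalar UDF that adds two BIGINT values and passes NULL through."""

    def evaluate(self, a, b):
        if a is None or b is None:
            return None
        return a + b


def python_class_name_value(resource_file, class_name):
    """Build the Class Name value for a Python resource.

    Drops any path and the .py suffix, then appends the class name.
    """
    stem, _ext = os.path.splitext(os.path.basename(resource_file))
    return "{0}.{1}".format(stem, class_name)


def resources_value(resource_files):
    """Build the Resources value: names without paths, separated by commas."""
    return ",".join(os.path.basename(name) for name in resource_files)


if __name__ == "__main__":
    print(python_class_name_value("my_plus.py", "MyPlus"))
    print(resources_value(["my_plus.py"]))
```

With this resource, the Class Name parameter would be set to my_plus.MyPlus. If the function were registered under the hypothetical Function Name my_plus, an SQL statement could then reference it by that name.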