DataWorks allows you to create functions in Python and Java. This topic describes how to create a MaxCompute function.

Prerequisites

The required resources are uploaded.

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides, find the workspace, and then click Data Analytics in the Actions column.
  2. Create a workflow. For more information, see Create a workflow.
  3. Create a JAR or Python resource, and commit and deploy the resource. For more information, see Create a MaxCompute resource.
  4. Create a function.
    1. Click the desired workflow, right-click MaxCompute, and then choose Create > Function.
    2. In the Create Function dialog box, set the Function Name and Location parameters.
    3. Click Create.
    4. In the Register Function section of the configuration tab that appears, set the parameters that are described in the following table.
      Register Function
      Parameter: Description
      Function Type: The type of the function. Valid values: Mathematical Operation Functions, Aggregate Functions, String Processing Functions, Date Functions, Window Functions, and Other Functions.
      Engine Instance MaxCompute: By default, the value of this parameter cannot be changed.
      Function Name: The name of the function. You can use this name to reference the function in SQL statements. The function name must be globally unique and cannot be changed after the function is created.
      Owner: By default, the account that is used to log on to the DataWorks console is selected. You can change the value of this parameter.
      Class Name: The name of the main class that implements the function, in the Resource name.Class name format. The resource name is a Java package name or a Python resource name.
      In DataWorks, you can reference MaxCompute resources, including JAR and Python packages, to create user-defined functions (UDFs). The value format of this parameter varies based on the resource type:
      • If the resource type is JAR, set the Class Name parameter in the Java package name.Class name format. You can use the Copy Reference command in IntelliJ IDEA to obtain the fully qualified class name.

        For example, if you set the Class Name parameter to com.aliyun.odps.examples.udf.UDAFExample, com.aliyun.odps.examples.udf is the Java package name, and UDAFExample is the class name.

      • If the resource type is Python, set the Class Name parameter in the Python resource name.Class name format.
        For example, if you set the Class Name parameter to LcLognormDist_sh.LcLognormDist_sh, the first LcLognormDist_sh is the Python resource name, and the second LcLognormDist_sh is the class name.
        Note
        • You do not need to include the .jar or .py suffix in the resource name.
        • You can use a resource only after the resource is committed and deployed. For more information about how to create a MaxCompute resource, see Create a MaxCompute resource.
      Resources: Required. The resources that the function references. You can use fuzzy match to search for existing resources in the current workspace.
      Note You do not need to specify the path of the added resources.
      Description: The description of the function.
      Expression Syntax: The syntax of the function. Example: test.
      Parameter Description: The description of the input and output parameters that are supported.
      Return Value: Optional. The return value. Example: 1.
      Example: Optional. An example of how to use the function.
  5. Click the Save icon in the top toolbar.
  6. Commit the function.
    1. Click the Submit icon in the top toolbar.
    2. In the Commit Node dialog box, enter your comments in the Change description field.
    3. Click OK.
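
To make the Class Name format for Python resources more concrete, the following is a minimal sketch of a Python UDF and of how its Class Name value could be derived. The file name my_udf.py, the class MyPlus, and the helper python_class_name are hypothetical names used only for illustration. The sketch assumes the odps.udf annotate decorator that MaxCompute Python UDFs use, and falls back to a no-op decorator so that the code can also be run locally without MaxCompute.

```python
# Assumed file name: my_udf.py, uploaded as a MaxCompute Python resource.
try:
    # Available in the MaxCompute runtime (or locally if PyODPS is installed).
    from odps.udf import annotate
except ImportError:
    # Local fallback (assumption): a no-op decorator so the sketch still runs.
    def annotate(signature):
        def decorator(cls):
            return cls
        return decorator


@annotate("bigint,bigint->bigint")
class MyPlus(object):
    """Hypothetical scalar UDF that adds two BIGINT values."""

    def evaluate(self, a, b):
        # SQL NULL values arrive as None; propagate them.
        if a is None or b is None:
            return None
        return a + b


def python_class_name(resource_file, class_name):
    """Build a Class Name value in the Python resource name.Class name format.

    Hypothetical helper: drops the .py suffix from the resource file name,
    because the suffix is not part of the Class Name value.
    """
    if resource_file.endswith(".py"):
        resource_file = resource_file[: -len(".py")]
    return resource_file + "." + class_name


if __name__ == "__main__":
    print(python_class_name("my_udf.py", "MyPlus"))
    print(MyPlus().evaluate(1, 2))
```

Under these assumptions, the Class Name parameter for this resource would be set to my_udf.MyPlus, and the resource my_udf.py would be selected in the Resources parameter.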