This topic describes how to develop a Java-based user-defined function (UDF) by using the Eclipse-integrated ODPS plug-in.

Preparations

Before you develop a Java-based UDF in Eclipse, complete the following preparations:

  1. Install the ODPS plug-in in Eclipse.
  2. Create an ODPS project.
    In Eclipse, choose File > New > ODPS Project, enter the project name, and click Config ODPS console installation path to configure the installation path of the odpscmd client.

    Enter the installation package path and click Apply. The ODPS plug-in automatically parses the version of the odpscmd client.

    Click Finish.

Procedure

  1. Create a Java-based UDF in the ODPS project.
    In the Package Explorer pane, right-click the ODPS project you created, and choose New > UDF.

    Set the package of the UDF to com.aliyun.example.udf and its name to Upper2Lower, and then click Finish.

    Java code is automatically generated after you create the UDF. Do not change the name of the evaluate() function.

  2. Implement the evaluate() function contained in the UDF file.
    Write the logic of the UDF in the body of the evaluate() function. The following example converts uppercase letters to lowercase letters.

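    package com.aliyun.example.udf;

    import com.aliyun.odps.udf.UDF;

    // Converts the uppercase letters in a string to lowercase letters.
    public class Upper2Lower extends UDF {
        // MaxCompute calls evaluate() once for each input row.
        public String evaluate(String s) {
            // Return NULL for NULL input.
            if (s == null) { return null; }
            return s.toLowerCase();
        }
    }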
    Save the code.
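
Before running the UDF against a table, it can help to check the conversion logic on its own. The following is a minimal sketch, not part of the generated UDF code: the class name Upper2LowerCheck is hypothetical, the class deliberately does not extend com.aliyun.odps.udf.UDF so that it compiles without the ODPS SDK, and it assumes Locale.ROOT for the conversion, which differs from the no-argument toLowerCase() call in the example above.

```java
import java.util.Locale;

// Hypothetical stand-alone copy of the conversion logic. It does not extend
// com.aliyun.odps.udf.UDF, so it compiles without the ODPS SDK on the classpath.
final class Upper2LowerCheck {

    // Same contract as Upper2Lower.evaluate(): null in, null out.
    static String evaluate(String s) {
        if (s == null) {
            return null;
        }
        // Assumption: Locale.ROOT is used so that the result does not depend
        // on the default locale of the machine that runs the check.
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("ALIYUN"));
        System.out.println(evaluate("MaxCompute 2.0"));
        System.out.println(evaluate(null));
    }
}
```

Running main should print the lowercase form of each input, and the last call shows that a NULL input is passed through instead of causing a NullPointerException.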

Test the Java-based UDF code

Before you test the Java-based UDF, store test data that contains uppercase letters in MaxCompute. On the odpscmd client, run the create table upperABC(upper string); statement to create a test table named upperABC.

Run the insert into upperABC values('ALIYUN'); statement to insert the uppercase string 'ALIYUN' into the table.
In Eclipse, choose Run > Run Configurations to open the test configuration.

Set Project to the name of the Java ODPS project you created, and set Select ODPS project to the name of the MaxCompute project. The MaxCompute project name must match the project that the odpscmd client is connected to. Set Table to upperABC. After you complete the settings, click Run.

You can view the test result in the Console pane, as shown in the following figure.
Note: Eclipse reads the uppercase string from the table and converts it to the lowercase string 'aliyun'. The data stored in MaxCompute is not changed.
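
The behavior described in the note can be illustrated with a small sketch. It is a simulation only, under the assumption that the test run reads each row, applies evaluate() to it, and prints the result. The class LocalRunSketch and the in-memory list that stands in for the upperABC table are hypothetical and do not reflect how the plug-in is implemented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical simulation of a local test run; it does not use the ODPS SDK.
final class LocalRunSketch {

    // Stand-in for the rows of the upperABC test table (read-only).
    static final List<String> UPPER_ABC =
            Collections.unmodifiableList(Arrays.asList("ALIYUN"));

    // Applies the conversion to each row and returns a new list.
    // The source rows are never modified.
    static List<String> run(List<String> rows) {
        return rows.stream()
                .map(s -> s == null ? null : s.toLowerCase())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("output: " + run(UPPER_ABC));
        System.out.println("table:  " + UPPER_ABC);
    }
}
```

The converted values exist only in the returned list, which mirrors the fact that the test run shows lowercase output while the table still holds 'ALIYUN'.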


Use the Java-based UDF

You can use the Java-based UDF after the test is successful. The procedure is as follows:
  1. Export the JAR package.
    Right-click the ODPS project you have created and select Export.

    On the displayed page, select JAR file and click Next.

    Enter the JAR package name and click Finish. Then, the JAR package is exported to your workspace directory.

  2. Upload the JAR package to DataWorks.
    Log on to the DataWorks console, find the MaxCompute_DOC project, and go to the Data Studio page. Choose Business Flow > Resource > Create Resource > JAR and create a JAR resource.

    On the displayed page, upload the JAR package you have exported.

    At this point, the JAR package is uploaded only to DataWorks. To upload it to MaxCompute as well, click the JAR package and then click Submit and Unlock.

    You can run the list resources command on the odpscmd client to view the uploaded JAR package.
  3. Create a function from the resource.
    After you upload the JAR package to your MaxCompute project, choose Business Flow > Function > Create Function and create a function named upperlower_Java. After you complete the settings, click Save, and then click Submit and Unlock.

    You can run the list functions command on the odpscmd client to view the registered function. The upperlower_Java function is now registered and ready to use.

Check the Java-based UDF

On the odpscmd client, run the select upperlower_Java('ABCD') from dual; statement. As shown in the following figure, the output is abcd, which indicates that the function converts uppercase letters to lowercase letters.

Additional information

For more information about how to develop Java-based UDFs, see Java UDF.

To use IntelliJ IDEA to develop a Java-based UDF, see Use IntelliJ IDEA to develop a Java-based UDF.