This topic describes how to develop Java user-defined functions (UDFs). It walks through the complete process of developing a UDF and provides sample code for UDFs, user-defined aggregate functions (UDAFs), and user-defined table-valued functions (UDTFs).

Background information

MaxCompute user-defined functions fall into three types: scalar UDFs, UDAFs, and UDTFs. For more information, see Overview.

You can use MaxCompute Studio to develop Java UDFs. For more information, see Develop UDFs.
Note
  • For more information about the commands that are used to register, deregister, and view UDFs, see Function operations.
  • For more information about the mapping between Java and MaxCompute data types, see Parameter types and return value types.
  • If you use Maven to develop a Java UDF, search for odps-sdk-udf in the Maven repository to find the latest version of the SDK for Java. The following example shows the dependency for version 0.20.7. Replace the version number with the version that you want to use.
    <dependency>
        <groupId>com.aliyun.odps</groupId>
        <artifactId>odps-sdk-udf</artifactId>
        <version>0.20.7</version>
    </dependency>

UDF development example

To use MaxCompute Studio to develop a UDF that converts uppercase letters to lowercase letters, perform the following steps:
  1. Prepare the tool environment and create a Java module.

    In this step, you must install MaxCompute Studio, create a MaxCompute project connection in MaxCompute Studio, and create a MaxCompute Java module.

  2. Write code.
    1. In the navigation tree of the Project pane, choose Module > src > main > java. Then, right-click java and choose New > MaxCompute Java.
    2. In the Create new MaxCompute java class dialog box, configure Name and Kind, and click OK.
      • Name: the name of the MaxCompute Java class. If the package does not exist yet, enter the name in the packagename.classname format. The system then creates the package automatically.
      • Kind: the category of the MaxCompute Java class. Select UDF. Supported categories include UDFs (UDF, UDAF, and UDTF), MapReduce (Driver, Mapper, and Reducer), and non-structural development frameworks (StorageHandler, Extractor, and Outputer).
    3. After the class is created, enter the following code in the editor:
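      package <Package name>;
      import com.aliyun.odps.udf.UDF;
      // Scalar UDF that converts a string to lowercase.
      public final class Lower extends UDF {
          // Returns NULL for NULL input; otherwise returns the lowercase string.
          public String evaluate(String s) {
              if (s == null) {
                  return null;
              }
              return s.toLowerCase();
          }
      }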
      Note To debug Java UDFs on your local machine, see Develop and debug UDFs.
  3. Register a MaxCompute UDF.
    Right-click the Java file of the UDF and select Deploy to server. In the Package a jar and submit resource dialog box, configure the parameters as required. Click OK.
    • MaxCompute project: the name of the MaxCompute project to which the UDF belongs.
    • Resource name: the name of the resource that you want to register.
    • Function name: the name of the function that you want to register.
  4. Use the UDF.
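    Open the SQL script and run the test code. In this example, run SELECT Lower_test('ABC');, where Lower_test is the function name that was specified for the Function name parameter in Step 3. The statement is expected to return abc.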
    Note For more information about how to write an SQL script in MaxCompute Studio, see Write SQL scripts.
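
The evaluate logic from Step 2 can be sanity-checked in plain Java before the UDF is deployed. The following is a minimal sketch, not the MaxCompute Studio debugging workflow: the class name LowerLocalCheck is hypothetical, and the class deliberately does not extend com.aliyun.odps.udf.UDF so that it runs without the MaxCompute SDK on the classpath.

```java
// Local stand-in for the Lower UDF. It copies the evaluate logic but does not
// depend on the MaxCompute SDK. The name LowerLocalCheck is hypothetical.
final class LowerLocalCheck {
    // Mirrors Lower.evaluate: null input yields null output.
    static String evaluate(String s) {
        if (s == null) {
            return null;
        }
        return s.toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("ABC")); // prints abc
        System.out.println(evaluate(null));  // prints null
    }
}
```

A check of this kind only covers the Java logic. Type mapping and registration still need to be verified in a MaxCompute project.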

UDAF sample code

UDAFs are registered in the same way as UDFs and are used in the same way as built-in aggregate functions. The following sample UDAF accumulates the lengths of the input strings and returns the total:
package org.alidata.odps.udf.examples;
import com.aliyun.odps.io.LongWritable;
import com.aliyun.odps.io.Text;
import com.aliyun.odps.io.Writable;
import com.aliyun.odps.udf.Aggregator;
import com.aliyun.odps.udf.UDFException;
/**
 * project: example_project
 * table: wc_in2
 * partitions: p2=1,p1=2
 * columns: colc,colb,cola
 */
public class UDAFExample extends Aggregator {
  // Folds one input row into the buffer by adding the length of each non-NULL string.
  @Override
  public void iterate(Writable arg0, Writable[] arg1) throws UDFException {
    LongWritable result = (LongWritable) arg0;
    for (Writable item : arg1) {
      Text txt = (Text) item;
      if (txt != null) {
        result.set(result.get() + txt.getLength());
      }
    }
  }
  // Merges another partial buffer into this buffer.
  @Override
  public void merge(Writable arg0, Writable arg1) throws UDFException {
    LongWritable result = (LongWritable) arg0;
    LongWritable partial = (LongWritable) arg1;
    result.set(result.get() + partial.get());
  }
  // Creates an empty aggregation buffer.
  @Override
  public Writable newBuffer() {
    return new LongWritable(0L);
  }
  // Returns the final result, which is the accumulated total length.
  @Override
  public Writable terminate(Writable arg0) throws UDFException {
    return arg0;
  }
}
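
The order in which newBuffer, iterate, merge, and terminate cooperate can be illustrated with a plain-Java simulation. The following is a sketch under stated assumptions: it uses no MaxCompute SDK classes, a long[1] array stands in for LongWritable, the input is assumed to be split into slices that are aggregated separately and then merged, and all names are hypothetical. String.length() is used in place of Text.getLength(), which may give a different value for non-ASCII data.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of the aggregation lifecycle. Hypothetical names;
// long[1] stands in for the LongWritable buffer of UDAFExample.
final class AggregatorLifecycleSketch {
    // newBuffer: start every partial aggregation from zero.
    static long[] newBuffer() {
        return new long[] {0L};
    }

    // iterate: fold the values of one input row into a partial buffer.
    static void iterate(long[] buffer, String... row) {
        for (String s : row) {
            if (s != null) {
                buffer[0] += s.length();
            }
        }
    }

    // merge: combine another partial buffer into this buffer.
    static void merge(long[] buffer, long[] partial) {
        buffer[0] += partial[0];
    }

    // terminate: turn the final buffer into the result value.
    static long terminate(long[] buffer) {
        return buffer[0];
    }

    // Aggregates each slice into its own buffer, then merges the partial results.
    static long run(List<List<String>> slices) {
        long[] total = newBuffer();
        for (List<String> slice : slices) {
            long[] partial = newBuffer();
            for (String value : slice) {
                iterate(partial, value);
            }
            merge(total, partial);
        }
        return terminate(total);
    }

    public static void main(String[] args) {
        List<List<String>> slices = Arrays.asList(
            Arrays.asList("ab", "cde"),
            Arrays.asList("f"));
        System.out.println(run(slices)); // prints 6
    }
}
```

Because merge only adds partial sums, the result does not depend on how the rows are divided into slices, which is the property an aggregate buffer needs.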

UDTF sample code

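UDTFs are registered in the same way as UDFs. The following sample UDTF splits the input string on whitespace and outputs one row for each token, together with the input bigint value: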
package org.alidata.odps.udtf.examples;
import com.aliyun.odps.udf.UDTF;
import com.aliyun.odps.udf.UDTFCollector;
import com.aliyun.odps.udf.annotation.Resolve;
import com.aliyun.odps.udf.UDFException;
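// The @Resolve annotation declares the input and output types in the
// "input types->output types" format, for example "string,string->string,bigint".
@Resolve({"string,bigint->string,bigint"})
public class MyUDTF extends UDTF {
  @Override
  public void process(Object[] args) throws UDFException {
    String a = (String) args[0];
    Long b = (Long) args[1];
    // Skip NULL strings to avoid a NullPointerException in split.
    if (a == null) {
      return;
    }
    // Output one row for each whitespace-separated token.
    for (String t: a.split("\\s+")) {
      forward(t, b);
    }
  }
}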
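
To see which rows a call to process produces, the splitting loop can be reproduced in plain Java. The following is a minimal sketch that does not use the MaxCompute SDK: the class name UdtfForwardSketch is hypothetical, and adding a row to a list stands in for forward(t, b).

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the loop in MyUDTF.process. Hypothetical names;
// each Object[] added to the list represents one row passed to forward.
final class UdtfForwardSketch {
    static List<Object[]> process(String a, Long b) {
        List<Object[]> rows = new ArrayList<>();
        for (String t : a.split("\\s+")) {
            rows.add(new Object[] {t, b}); // stands in for forward(t, b)
        }
        return rows;
    }

    public static void main(String[] args) {
        // One input row with three whitespace-separated tokens yields three output rows.
        for (Object[] row : process("hello world  again", 7L)) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```

Each output row matches the string,bigint output signature declared in the @Resolve annotation: the token comes first, followed by the unchanged bigint value.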