
Java UDF Development

Last Updated: May 07, 2018

MaxCompute user-defined functions (UDFs) come in three types: user-defined scalar functions (UDF), user-defined aggregate functions (UDAF), and user-defined table-valued functions (UDTF). Maven users can search for “odps-sdk-udf” in the Maven repository to find the required Java SDK, which is available in several versions. Declare the dependency as follows:

    <dependency>
        <groupId>com.aliyun.odps</groupId>
        <artifactId>odps-sdk-udf</artifactId>
        <version>0.20.7-public</version>
    </dependency>

Note:

  • Currently, UDFs can be written only in Java. To deploy a UDF, upload the compiled code to the project with Add Resource, and then register the function with Create Function.

  • To run a UDF with Eclipse, see UDF Eclipse Plug-in Introduction.

  • To use UDFs, you must first submit an application through the ticket system, stating the MaxCompute project name and the application scenario. You can use UDFs once the application is approved.

UDF Example

The following example develops a UDF that converts a string to lowercase.

  • Coding: Implement the UDF according to the MaxCompute UDF framework and compile the code. For example:

    package org.alidata.odps.udf.examples;

    import com.aliyun.odps.udf.UDF;

    public final class Lower extends UDF {
        // Called once per input row; a NULL input yields a NULL result.
        public String evaluate(String s) {
            if (s == null) { return null; }
            return s.toLowerCase();
        }
    }

  • Package the compiled class into a JAR named ‘my_lower.jar’.

Note:

For more information about the SDK, see UDF SDK.

  • Add resource: Before a UDF can run, MaxCompute must have access to the code it refers to. User code is added to MaxCompute as a resource: a Java UDF must be compiled, packaged as a JAR, and added to MaxCompute as a JAR resource. The UDF framework then loads the JAR automatically when the UDF runs. MaxCompute MapReduce also describes how resources are used.

  • Run the command:

    add jar my_lower.jar;
    -- If a resource with this name already exists, rename the JAR package
    -- and update the JAR name in the commands that follow.
    -- Alternatively, use the -f option to overwrite the existing JAR resource.

  • Register UDF: Once the JAR has been uploaded, MaxCompute has access to the user’s code, but the UDF is not yet usable. You must register a unique function name in MaxCompute and specify which class in the JAR resource that name corresponds to. For details on registering a UDF, see Create Function. Run the following command:

    CREATE FUNCTION test_lower AS org.alidata.odps.udf.examples.Lower USING my_lower.jar;

Use this function in SQL:

    select test_lower('A') from my_test_table;
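
Because evaluate is an ordinary Java method, its logic can be checked locally before the JAR is uploaded. The following is a minimal sketch, assuming a stand-alone class (the hypothetical LowerCheck) that copies the method body instead of extending the SDK's UDF class, so it compiles without odps-sdk-udf on the classpath. The use of Locale.ROOT is an addition that is not in the original example; it keeps the conversion independent of the JVM's default locale.

```java
import java.util.Locale;

// Hypothetical stand-alone copy of the Lower.evaluate logic. It does not
// extend com.aliyun.odps.udf.UDF, so it runs without the MaxCompute SDK.
public class LowerCheck {

    // Mirrors Lower.evaluate: a null input yields a null result.
    public static String evaluate(String s) {
        if (s == null) {
            return null;
        }
        // Locale.ROOT is an assumption added here to avoid locale-specific
        // case mappings (for example, the Turkish dotless i).
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("A"));          // a
        System.out.println(evaluate("MaxCompute")); // maxcompute
        System.out.println(evaluate(null));         // null
    }
}
```

A check like this only exercises the method body. It says nothing about how MaxCompute maps SQL types to Java types, which still has to be verified against a registered function.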

UDAF Example

A UDAF is registered in the same way as a UDF and is used like the aggregate functions among the built-in functions. The following example shows a UDAF that sums the lengths of its string arguments over all input rows:

    package org.alidata.odps.udf.examples;

    import com.aliyun.odps.io.LongWritable;
    import com.aliyun.odps.io.Text;
    import com.aliyun.odps.io.Writable;
    import com.aliyun.odps.udf.Aggregator;
    import com.aliyun.odps.udf.UDFException;

    /**
     * project: example_project
     * table: wc_in2
     * partitions: p2=1,p1=2
     * columns: colc,colb,cola
     */
    public class UDAFExample extends Aggregator {

        // Adds the length of every argument of one input row to the buffer.
        @Override
        public void iterate(Writable arg0, Writable[] arg1) throws UDFException {
            LongWritable result = (LongWritable) arg0;
            for (Writable item : arg1) {
                Text txt = (Text) item;
                result.set(result.get() + txt.getLength());
            }
        }

        // Folds a partial result (arg1) into the buffer (arg0).
        @Override
        public void merge(Writable arg0, Writable arg1) throws UDFException {
            LongWritable result = (LongWritable) arg0;
            LongWritable partial = (LongWritable) arg1;
            result.set(result.get() + partial.get());
        }

        // Creates an empty buffer that starts at zero.
        @Override
        public Writable newBuffer() {
            return new LongWritable(0L);
        }

        // Returns the final buffer as the aggregate result.
        @Override
        public Writable terminate(Writable arg0) throws UDFException {
            return arg0;
        }
    }
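
The four methods of UDAFExample are easier to follow when the order of calls is written out. The following is a rough local simulation, assuming that input rows are split across workers and that each partial buffer is merged into a final one. The class AggregatorLifecycleSketch, its LongBuffer type, and the aggregate driver are hypothetical and use plain Java types instead of the SDK's Writable classes, so this is not the MaxCompute runtime, only an illustration of the same sum-of-lengths logic. Note that String.length() counts UTF-16 code units, which may differ from what Text.getLength() reports for non-ASCII text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical local simulation of the Aggregator call sequence.
// Plain Java types stand in for the SDK's Writable classes.
public class AggregatorLifecycleSketch {

    // Stand-in for a LongWritable buffer: a mutable long.
    static final class LongBuffer {
        long value;
    }

    // newBuffer: every partial aggregation starts from an empty buffer.
    public static LongBuffer newBuffer() {
        return new LongBuffer();
    }

    // iterate: fold all string arguments of one input row into the buffer.
    public static void iterate(LongBuffer buffer, String[] row) {
        for (String item : row) {
            buffer.value += item.length();
        }
    }

    // merge: combine a partial result into another buffer.
    public static void merge(LongBuffer buffer, LongBuffer partial) {
        buffer.value += partial.value;
    }

    // terminate: turn the final buffer into the returned value.
    public static long terminate(LongBuffer buffer) {
        return buffer.value;
    }

    // Assumed driver: one partial buffer per worker, merged at the end.
    public static long aggregate(List<List<String[]>> rowsPerWorker) {
        LongBuffer total = newBuffer();
        for (List<String[]> rows : rowsPerWorker) {
            LongBuffer partial = newBuffer();
            for (String[] row : rows) {
                iterate(partial, row);
            }
            merge(total, partial);
        }
        return terminate(total);
    }

    public static void main(String[] args) {
        List<String[]> worker1 = Arrays.asList(new String[] {"ab"}, new String[] {"cde"});
        List<String[]> worker2 = Arrays.asList(new String[] {"f"}, new String[] {"ghij"});
        System.out.println(aggregate(Arrays.asList(worker1, worker2))); // 10
    }
}
```

Because merge only adds two partial sums, the result does not depend on how rows are distributed across workers. An aggregate such as an average would need a buffer that carries both a sum and a count for the same property to hold.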

UDTF Example

A UDTF is registered and used in much the same way as a UDF. For example:

    package org.alidata.odps.udtf.examples;

    import com.aliyun.odps.udf.UDTF;
    import com.aliyun.odps.udf.UDTFCollector;
    import com.aliyun.odps.udf.annotation.Resolve;
    import com.aliyun.odps.udf.UDFException;

    // Declares the input and output types: (string, bigint) in, (string, bigint) out.
    @Resolve({"string,bigint->string,bigint"})
    public class MyUDTF extends UDTF {

        // Emits one output row per whitespace-separated token of the first argument.
        @Override
        public void process(Object[] args) throws UDFException {
            String a = (String) args[0];
            Long b = (Long) args[1];
            for (String t : a.split("\\s+")) {
                forward(t, b);
            }
        }
    }
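
To see what MyUDTF emits for a given input row, the process logic can be traced locally. The following is a minimal sketch, assuming a hypothetical stand-in class (UdtfSketch) whose forward method appends rows to a list instead of handing them to MaxCompute. It does not use the SDK, and the sample input is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical local stand-in for MyUDTF: forward() appends to a list
// instead of emitting rows to MaxCompute.
public class UdtfSketch {

    private final List<Object[]> output = new ArrayList<>();

    // Stand-in for UDTF.forward: records one output row.
    private void forward(Object... row) {
        output.add(row);
    }

    // Same logic as MyUDTF.process: one output row per
    // whitespace-separated token of the first argument.
    public void process(Object[] args) {
        String a = (String) args[0];
        Long b = (Long) args[1];
        for (String t : a.split("\\s+")) {
            forward(t, b);
        }
    }

    // Rows collected so far, in the order they were forwarded.
    public List<Object[]> rows() {
        return output;
    }

    public static void main(String[] args) {
        UdtfSketch udtf = new UdtfSketch();
        udtf.process(new Object[] {"hello big data", 7L});
        for (Object[] row : udtf.rows()) {
            System.out.println(row[0] + "," + row[1]);
        }
        // prints:
        // hello,7
        // big,7
        // data,7
    }
}
```

One input row therefore produces as many output rows as there are tokens, each paired with the unchanged second argument. As in the original example, a null first argument is not handled and would raise a NullPointerException.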