This topic describes how to use complex data types in Java user-defined functions (UDFs) by using MaxCompute Studio.

Prerequisites

MaxCompute Studio is installed and connected to a MaxCompute project. A MaxCompute Java module is created.

For more information, see Install MaxCompute Studio, Manage project connections, and Create a MaxCompute Java module.

Sample code

In the following code, three evaluate methods are defined for function overloading.
  • Method 1: Use an array as a parameter. The array corresponds to the java.util.List class.
  • Method 2: Use a map as a parameter. The map corresponds to the java.util.Map class.
  • Method 3: Use a struct as a parameter. The struct corresponds to the com.aliyun.odps.data.Struct class.
    Note The com.aliyun.odps.data.Struct class does not expose the names and types of its fields through reflection. To use the STRUCT data type in a UDF, you must add the @Resolve annotation to the UDF class to declare the function signature. This annotation affects only the overloads of the UDF whose input parameters or return value contain the com.aliyun.odps.data.Struct class.
import com.aliyun.odps.data.Struct;
import com.aliyun.odps.udf.UDF;
import com.aliyun.odps.udf.annotation.Resolve;
import java.util.List;
import java.util.Map;

@Resolve("struct<a:string>,string->string")
public class UdfArray extends UDF {
    // Takes an ARRAY value and the index of the element to return.
    // Returns the element at that index.
    public String evaluate(List<String> vals, Long index) {
        return vals.get(index.intValue());
    }
    // Takes a MAP value and a key. Returns the value mapped to that key.
    public String evaluate(Map<String, String> map, String key) {
        return map.get(key);
    }
    // Takes a STRUCT value and a string. Reads field a from the struct,
    // appends the string to it, and returns the result as a STRING.
    public String evaluate(Struct struct, String key) {
        return struct.getFieldValue("a") + key;
    }
}
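
The lookups in the sample code throw an exception if the array or map is null or the index is out of range. The following is a minimal plain-Java sketch of more defensive variants, assuming that SQL NULL arguments reach the evaluate methods as Java null. The class name SafeLookupSketch is hypothetical, and the methods are static only so that the sketch runs without the MaxCompute SDK. In a real UDF, the class would extend com.aliyun.odps.udf.UDF and the methods would be instance methods.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, SDK-free sketch of null-safe and bounds-safe lookups.
public class SafeLookupSketch {
    // ARRAY lookup: returns null instead of throwing when the array or index
    // is null, or when the index is outside the bounds of the array.
    public static String evaluate(List<String> vals, Long index) {
        if (vals == null || index == null) {
            return null;
        }
        if (index < 0 || index >= vals.size()) {
            return null;
        }
        return vals.get(index.intValue());
    }

    // MAP lookup: returns null for a null map. A missing key already yields null.
    public static String evaluate(Map<String, String> map, String key) {
        return map == null ? null : map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(evaluate(List.of("a", "b", "c"), 0L));        // prints a
        System.out.println(evaluate(List.of("a", "b", "c"), 5L));        // prints null
        System.out.println(evaluate(Map.of("key_a", "val_a"), "key_a")); // prints val_a
    }
}
```

Returning null for invalid input follows the usual SQL convention of propagating NULL. Whether that or a failed job is preferable depends on how the UDF is used.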

Procedure

  1. Write a Java UDF in MaxCompute Studio. In this example, the name of the Java class is UdfArray, and the code in the Sample code section is used.
    For more information about how to write a UDF, see Write a UDF.
    [Figure: Write a UDF]
  2. Run and debug the UDF on your on-premises machine to check whether the UDF code runs as expected.
    For more information about how to debug UDFs, see Perform a local run to debug the UDF.
    [Figure: Debug the UDF]
    Note The parameter settings in the preceding figure are provided for reference.
  3. Package the UDF code into a JAR file, upload the file to the MaxCompute project, and then register the UDF. In this example, the UDF is registered under the function name my_index.
    For more information about how to package UDFs, see Package the code.
    [Figure: Packaging]
  4. In the left-side navigation pane of MaxCompute Studio, click the Project Explorer tab. Right-click your MaxCompute project and select Open in Console to start the MaxCompute client. Then, execute an SQL statement to call the UDF that you created.
    [Figure: Call the created UDF]
    Sample statements:
    select my_index(array('a', 'b', 'c'), 0); -- The return value is a. 
    select my_index(map('key_a','val_a', 'key_b', 'val_b'), 'key_b'); -- The return value is val_b. 
    select my_index(named_struct('a', 'hello'), 'world'); -- The return value is helloworld, because the two strings are concatenated without a separator.
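
Before you package and register the UDF, the three calls above can be mirrored in plain Java to sanity-check the expected return values. The following is a rough sketch rather than the MaxCompute Studio local run: the class LocalCheckSketch and the FieldSource interface are hypothetical, and FieldSource is only a stand-in for the single getFieldValue call that the sample code makes on com.aliyun.odps.data.Struct.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, SDK-free harness that mirrors the three sample statements.
public class LocalCheckSketch {
    // Stand-in for com.aliyun.odps.data.Struct, reduced to the one method
    // that the sample code calls. It is not the MaxCompute SDK interface.
    public interface FieldSource {
        Object getFieldValue(String name);
    }

    public static String evaluate(List<String> vals, Long index) {
        return vals.get(index.intValue());
    }

    public static String evaluate(Map<String, String> map, String key) {
        return map.get(key);
    }

    public static String evaluate(FieldSource struct, String key) {
        // String concatenation inserts no separator between the two parts.
        return struct.getFieldValue("a") + key;
    }

    public static void main(String[] args) {
        // Mirrors named_struct('a', 'hello').
        FieldSource struct = name -> "a".equals(name) ? "hello" : null;

        System.out.println(evaluate(List.of("a", "b", "c"), 0L)); // prints a
        System.out.println(evaluate(
                Map.of("key_a", "val_a", "key_b", "val_b"), "key_b")); // prints val_b
        System.out.println(evaluate(struct, "world")); // prints helloworld
    }
}
```

The compiler selects the overload from the static type of the first argument, which corresponds to how the ARRAY, MAP, and STRUCT arguments select among the three evaluate methods of the UDF.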