This topic describes how to use complex data types in Java user-defined functions (UDFs) by using MaxCompute Studio.
Prerequisites
MaxCompute Studio is installed and connected to a MaxCompute project. A MaxCompute Java module is created.
For more information, see Install MaxCompute Studio, Manage project connections, and Create a MaxCompute Java module.
Sample code
The following code defines three overloaded evaluate methods, one for each complex data type.
- Method 1: Use an array as a parameter. The array corresponds to the java.util.List class.
- Method 2: Use a map as a parameter. The map corresponds to the java.util.Map class.
- Method 3: Use a struct as a parameter. The struct corresponds to the com.aliyun.odps.data.Struct class.
Note: The com.aliyun.odps.data.Struct class does not expose the names and types of its fields through reflection. To use the STRUCT data type in a UDF, you must add the @Resolve annotation to the UDF class and declare the struct signature in it. The annotation affects only the overloads whose input parameters or return value contain com.aliyun.odps.data.Struct.
import com.aliyun.odps.data.Struct;
import com.aliyun.odps.udf.UDF;
import com.aliyun.odps.udf.annotation.Resolve;
import java.util.List;
import java.util.Map;

// The @Resolve signature applies only to the overload that takes a Struct.
@Resolve("struct<a:string>,string->string")
public class UdfArray extends UDF {
    // ARRAY overload. The first parameter is the array and the second is the
    // index of the element to return.
    public String evaluate(List<String> vals, Long index) {
        return vals.get(index.intValue());
    }

    // MAP overload. The first parameter is the map and the second is the key
    // whose value is returned.
    public String evaluate(Map<String, String> map, String key) {
        return map.get(key);
    }

    // STRUCT overload. The first parameter is the struct and the second is a
    // string. The method reads field a of the struct, appends the string to
    // it, and returns the result as a STRING value.
    public String evaluate(Struct struct, String key) {
        return struct.getFieldValue("a") + key;
    }
}
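The ARRAY and MAP overloads above call get directly on their inputs, so a null argument or an out-of-range index would throw an exception. The following is a minimal standalone sketch of the same lookups with defensive checks. It assumes that SQL NULL inputs reach the UDF as Java null. The class name SafeLookups and the method names elementAt and valueFor are hypothetical and are not part of the MaxCompute SDK. The sketch uses only java.util so that it can be compiled and tried outside a MaxCompute project.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper class (not part of the MaxCompute SDK). It mirrors the
// ARRAY and MAP lookups of the sample UDF, but returns null instead of throwing.
final class SafeLookups {
    private SafeLookups() {
    }

    // Returns the element at the given index, or null if the list or index is
    // null or the index is outside the bounds of the list.
    static String elementAt(List<String> vals, Long index) {
        if (vals == null || index == null) {
            return null;
        }
        long i = index;
        if (i < 0 || i >= vals.size()) {
            return null;
        }
        return vals.get((int) i);
    }

    // Returns the value mapped to the key, or null if the map or key is null
    // or the key is not present.
    static String valueFor(Map<String, String> map, String key) {
        if (map == null || key == null) {
            return null;
        }
        return map.get(key);
    }
}
```

If this behavior suits your data, the bodies of the first two evaluate methods could delegate to these helpers. Returning null on invalid input is a design choice in this sketch, not a requirement of MaxCompute.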