Use a Java UDTF to read resources from MaxCompute

This topic describes how to use a Java user-defined table-valued function (UDTF) to read MaxCompute resources in MaxCompute Studio.

Prerequisites

MaxCompute Studio is installed and connected to a MaxCompute project. A MaxCompute Java module is created.

For more information about related operations, see Install MaxCompute Studio, Manage project connections, and Create a MaxCompute Java module.

For more information about MaxCompute resources, see Resource.

Sample code

package com.aliyun.odps.examples.udf;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

import com.aliyun.odps.udf.ExecutionContext;
import com.aliyun.odps.udf.UDFException;
import com.aliyun.odps.udf.UDTF;
import com.aliyun.odps.udf.annotation.Resolve;

/**
 * project: example_project
 * table: wc_in2
 * partitions: p2=1,p1=2
 * columns: colc,colb
 */
@Resolve("string,string->string,bigint,string")
public class UDTFResource extends UDTF {
  ExecutionContext ctx;
  long fileResourceLineCount;
  long tableResource1RecordCount;
  long tableResource2RecordCount;

  // Reads all three resources once, before any input row is processed.
  @Override
  public void setup(ExecutionContext ctx) throws UDFException {
    this.ctx = ctx;
    try {
      // Count the lines in the file resource.
      // try-with-resources closes the reader even if reading fails.
      fileResourceLineCount = 0;
      try (BufferedReader br = new BufferedReader(new InputStreamReader(
          ctx.readResourceFileAsStream("file_resource.txt"), StandardCharsets.UTF_8))) {
        while (br.readLine() != null) {
          fileResourceLineCount++;
        }
      }

      // Count the records in the first table resource.
      Iterator<Object[]> iterator = ctx.readResourceTable("table_resource1").iterator();
      tableResource1RecordCount = 0;
      while (iterator.hasNext()) {
        tableResource1RecordCount++;
        iterator.next();
      }

      // Count the records in the second table resource.
      iterator = ctx.readResourceTable("table_resource2").iterator();
      tableResource2RecordCount = 0;
      while (iterator.hasNext()) {
        tableResource2RecordCount++;
        iterator.next();
      }
    } catch (IOException e) {
      throw new UDFException(e);
    }
  }

  // Emits one output row per input row: the first argument, the length of the
  // second argument (0 if it is null), and a summary of the resource counts.
  @Override
  public void process(Object[] args) throws UDFException {
    String a = (String) args[0];
    long b = args[1] == null ? 0 : ((String) args[1]).length();
    forward(a, b, "fileResourceLineCount=" + fileResourceLineCount
        + "|tableResource1RecordCount=" + tableResource1RecordCount
        + "|tableResource2RecordCount=" + tableResource2RecordCount);
  }
}
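
The counting logic in setup() can be tried without the MaxCompute SDK. The following is a minimal sketch under stated assumptions: the class name ResourceCounts and its two methods are hypothetical helpers, not part of the SDK. An in-memory stream stands in for file_resource.txt, and a plain list of Object[] rows stands in for a table resource.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration; not part of the MaxCompute SDK.
class ResourceCounts {

    // Counts the lines in a stream and closes it, even if reading fails.
    static long countLines(InputStream in) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            long count = 0;
            while (br.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    // Counts the records of any iterable source, such as the rows of a table resource.
    static long countRecords(Iterable<Object[]> records) {
        long count = 0;
        for (Object[] ignored : records) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for file_resource.txt (three lines).
        InputStream file = new ByteArrayInputStream(
                "line1\nline2\nline3\n".getBytes(StandardCharsets.UTF_8));

        // In-memory stand-in for a table resource (two records).
        List<Object[]> table = new ArrayList<>();
        table.add(new Object[] {"a", 1L});
        table.add(new Object[] {"b", 2L});

        System.out.println("lines=" + countLines(file));      // prints lines=3
        System.out.println("records=" + countRecords(table)); // prints records=2
    }
}
```

Keeping the counting in small static methods makes it possible to check the logic with ordinary Java tooling before the UDTF is packaged and registered.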

Procedure

In the top navigation bar of MaxCompute Studio, choose MaxCompute > Add Resource. In the Add Resource dialog box, add the resource files listed in the following table.


Resource file	Resource type	Example
file_resource.txt	file	You can obtain sample resource files from the location shown in the following figure on the MaxCompute client.
table_resource1	table
table_resource2	table

Create a Java UDTF in MaxCompute Studio. In this example, the Java class is named UDTFResource and uses the code in the Sample code section.
For more information about how to create a UDTF, see Write a UDF.
Debug the UDTF on your local machine and make sure that the code runs as expected.
For more information about how to debug a UDTF, see Perform a local run to debug the UDF.

Note: You can configure the runtime parameters based on the example shown in the preceding figure.
Package the UDTF as a JAR file, submit the file to your MaxCompute project, and then register the UDTF under a function name such as my_udtf.
For more information about how to package a UDTF, see Procedure. For Extra resources, you must select the three resource files that you added in the first step.

In the left-side navigation pane of MaxCompute Studio, click Project Explorer. On the page that appears, right-click the MaxCompute project for which the UDTF is created and select Open in Console to start the MaxCompute client. Then, execute SQL statements in the code editor to call the created UDTF.

SQL sample code:

select my_udtf("10","20") as (a, b, fileResourceLineCount) from tnp1;

The following result is returned:

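+----+---+---------------------------------------------------------------------------------+
| a  | b | fileResourceLineCount                                                           |
+----+---+---------------------------------------------------------------------------------+
| 10 | 2 | fileResourceLineCount=3|tableResource1RecordCount=0|tableResource2RecordCount=0 |
| 10 | 2 | fileResourceLineCount=3|tableResource1RecordCount=0|tableResource2RecordCount=0 |
+----+---+---------------------------------------------------------------------------------+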
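
Each returned row follows directly from process(): column a is the first argument, column b is the length of the second argument ("20" has two characters), and the third column combines the counts collected in setup(). The following is a minimal sketch of that derivation in plain Java. The class ProcessRowSketch and the method buildRow are hypothetical names used only for illustration, and the counts 3, 0, and 0 are taken from the sample result above.

```java
import java.util.Arrays;

// Hypothetical stand-in for the process() logic; not part of the MaxCompute SDK.
class ProcessRowSketch {

    // Builds one output row the same way process() does.
    static Object[] buildRow(String a, String b,
                             long fileLines, long table1Records, long table2Records) {
        // A null second argument yields a length of 0.
        long bLength = (b == null) ? 0 : b.length();
        String counts = "fileResourceLineCount=" + fileLines
                + "|tableResource1RecordCount=" + table1Records
                + "|tableResource2RecordCount=" + table2Records;
        return new Object[] {a, bLength, counts};
    }

    public static void main(String[] args) {
        // Same arguments as the SQL call my_udtf("10","20").
        Object[] row = buildRow("10", "20", 3, 0, 0);
        System.out.println(Arrays.toString(row));
    }
}
```

The sample result contains two identical rows, which is consistent with the UDTF being called once for each of two rows in tnp1 with the same constant arguments.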