MaxCompute 2.0 allows you to use external tables to access Object Storage Service (OSS) and Tablestore. MaxCompute Studio provides code templates to help you develop unstructured data queries. This topic describes how to use MaxCompute Studio to query unstructured data.
Write StorageHandlers, Extractors, or Outputters
- In the Project tool window, click your MaxCompute Java module and choose > java. Then, right-click java and choose .
- Create an Extractor class. Specify Name and Kind, and click OK.
- Name: the name of the MaxCompute Java class. If the package does not exist yet, enter the name in the packagename.classname format. The system then automatically creates the package.
- Kind: the category of the MaxCompute Java class. Select Extractor. Supported categories include custom functions (UDF, UDAF, and UDTF), MapReduce (Driver, Mapper, and Reducer), and unstructured data development frameworks (StorageHandler, Extractor, and Outputter).
Note: If you create a StorageHandler or Outputter class, set Kind to StorageHandler or Outputter.
- After you create an Extractor class, you can develop the Java program in the editor. The Java template is automatically populated with framework code, so you only need to write the logic code based on your requirements.
- Use the same method to create a StorageHandler and an Outputter.
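The logic code that you add to the generated templates usually turns raw text into column values (Extractor) or column values back into text (Outputter). The following is a minimal sketch of that logic only, assuming the data is plain text with one record per line and a single-character delimiter. The class name DelimitedTextSketch and its methods are hypothetical and do not use the MaxCompute SDK. In a real module, this logic would be placed inside the methods of the classes that MaxCompute Studio generates.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical, SDK-free sketch of the logic code for an Extractor and an Outputter.
 * Nothing here is the MaxCompute API; it only illustrates the parsing and formatting steps.
 */
class DelimitedTextSketch {

    /** Extractor side: split one raw line into column values, keeping empty columns. */
    static String[] extractLine(String line, char delimiter) {
        List<String> columns = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == delimiter) {
                columns.add(line.substring(start, i));
                start = i + 1;
            }
        }
        columns.add(line.substring(start)); // last column, may be empty
        return columns.toArray(new String[0]);
    }

    /** Extractor side: read every non-empty line and turn it into a row. */
    static List<String[]> extractAll(BufferedReader reader, char delimiter) throws IOException {
        List<String[]> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (!line.isEmpty()) {
                rows.add(extractLine(line, delimiter));
            }
        }
        return rows;
    }

    /** Outputter side: join column values back into one delimited line. */
    static String outputLine(String[] columns, char delimiter) {
        return String.join(String.valueOf(delimiter), columns);
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: two records with three pipe-delimited columns each.
        String sample = "1|ambulance|46.08\n2|truck|46.07\n";
        List<String[]> rows = extractAll(new BufferedReader(new StringReader(sample)), '|');
        for (String[] row : rows) {
            System.out.println(outputLine(row, ','));
        }
    }
}
```

Running main parses the two sample records and prints each one re-joined with commas. Keeping the parsing and formatting in small static methods like these makes the logic easy to check locally before you move it into the generated Extractor and Outputter classes.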
Debug Extractors or Outputters
Package and upload Java programs
After you debug a Java program, package it as a JAR file and upload the file to the MaxCompute server as a resource. For more information, see Package, upload, and register.
Query unstructured data
- In the Project tool window, right-click scripts under your MaxCompute project, and choose .
- Enter the name of the SQL script in the Script Name field, select a MaxCompute project from the MaxCompute Project drop-down list, and then click OK.
- In the editor, enter the SQL statements that create an external table.
- Enter the query statement and click the Run MaxCompute SQL Script icon to query data.