MaxCompute V2.0 allows you to use external tables to access Object Storage Service (OSS) and Tablestore. MaxCompute Studio provides code templates to help you query unstructured data. This topic describes how to use MaxCompute Studio to query unstructured data.

Prerequisites

Write a StorageHandler, Extractor, or Outputer program

  1. In the Project tool window, expand your MaxCompute Java module and choose src > main > java. Then, right-click java and choose New > MaxCompute Java.
  2. Specify Name, select Extractor, StorageHandler, or Outputer, and then press Enter.
    • Name: the name of the MaxCompute Java class that you want to create. If the package does not exist yet, enter the name in the packagename.classname format. The package is then created automatically.
    • Select Extractor, StorageHandler, or Outputer as the class type.
  3. After the class is created, develop the Java program in the editor. The template is automatically filled with framework code, so you only need to write the logic code that your requirements call for.

Debug the Extractor or Outputer program

Write test cases to debug your Extractor or Outputer program. You can base them on the unit test examples in the examples directory.

Package and upload the program

After you debug the program, package it as a JAR file and upload the JAR file to the MaxCompute server as a resource. For more information, see Package, upload, and register.

Query unstructured data

  1. In the Project tool window, right-click scripts under your MaxCompute project and choose New > MaxCompute SQL Script.
  2. Enter the name of an SQL script in the Script Name field, select a MaxCompute project from the MaxCompute Project drop-down list, and then click OK.
  3. In the editor, enter the SQL statement that is used to create an external table and click the Run icon.
  4. Create another MaxCompute SQL script, enter a query statement that reads from the external table, and then click the Run icon to query data.
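
As a rough illustration of steps 3 and 4, the two scripts might contain statements like the ones below. This is a minimal sketch, not the statements from the original screenshots. The table name, the columns, the handler class com.example.TextStorageHandler, the JAR name my_extractor.jar, the delimiter property, and the OSS endpoint and path are all hypothetical placeholders. The sketch also assumes that MaxCompute has already been authorized to access the OSS directory.

```sql
-- Step 3 (sketch): create an external table backed by a custom StorageHandler.
-- All names, the delimiter property, and the OSS location are placeholders.
CREATE EXTERNAL TABLE IF NOT EXISTS demo_oss_external
(
    id   BIGINT,
    name STRING
)
-- Fully qualified name of the StorageHandler class that you wrote (hypothetical).
STORED BY 'com.example.TextStorageHandler'
-- Optional key-value pairs made available to the Extractor (assumed property name).
WITH SERDEPROPERTIES ('delimiter' = '|')
-- OSS directory that holds the data files (placeholder endpoint, bucket, and path).
LOCATION 'oss://oss-cn-hangzhou-internal.aliyuncs.com/my-bucket/demo-dir/'
-- JAR resource uploaded in the "Package and upload the program" section (hypothetical name).
USING 'my_extractor.jar';

-- Step 4 (sketch): query the external table like a regular table.
SELECT id, name
FROM demo_oss_external
LIMIT 10;
```

If the data is in a format that a built-in handler covers, such as CSV, the STORED BY clause can name the built-in handler (for example, 'com.aliyun.odps.CsvStorageHandler') and the USING clause can be dropped, because no custom JAR is involved.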