MaxCompute: Query unstructured data

Last Updated: Mar 26, 2026

MaxCompute V2.0 supports external tables that connect to Object Storage Service (OSS) and Tablestore. MaxCompute Studio provides code templates that speed up writing custom handler classes. This topic shows how to use MaxCompute Studio to write, debug, package, and run queries against unstructured data stored in OSS or Tablestore.

Prerequisites

Before you begin, ensure that you have:

How it works

Querying unstructured data in MaxCompute requires four stages:

  1. Write a handler class — Implement an Extractor, StorageHandler, or Outputer class that defines how MaxCompute reads from or writes to your external storage.

  2. Debug — Run unit tests against your handler using the examples in the examples directory.

  3. Package and upload — Compress the handler into a JAR package and upload it to MaxCompute as a resource.

  4. Query — Create an external table that references your handler and JAR resource, then run SQL queries.

Write a StorageHandler, Extractor, or Outputer class

Choose the handler class type based on your business requirements:

Class | Role
Extractor | Defines custom logic for reading unstructured data from OSS or Tablestore
Outputer | Defines custom logic for writing unstructured data to external storage
StorageHandler | Implements the logic defined in the Extractor or Outputer class

To create a class:

  1. In the left-side navigation pane of the Project tab, choose src > main > java, right-click java, and then choose New > MaxCompute Java.

  2. In the Name field, enter the class name. If no package exists yet, use the format PackageName.ClassName — MaxCompute Studio automatically creates the package. Select Extractor, StorageHandler, or Outputer as the class type, and then press Enter.

  3. In the code editor, implement your logic. MaxCompute Studio pre-fills the file with framework code — fill in the logic sections for your use case.
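The logic you fill in for an Extractor typically comes down to turning raw bytes from external storage into column values. The following is a minimal sketch in plain Java, assuming the data is delimited text. `DelimitedLineParser` and its methods are hypothetical names chosen for illustration and are not part of the MaxCompute SDK or the generated template; in a real handler, logic like this would sit inside the methods that MaxCompute Studio pre-fills.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not an SDK class): splits delimited text into
 * fixed-width rows, the kind of logic an Extractor might delegate to.
 */
final class DelimitedLineParser {
    private final String delimiterRegex;
    private final int columnCount;

    DelimitedLineParser(String delimiter, int columnCount) {
        if (delimiter == null || delimiter.isEmpty() || columnCount <= 0) {
            throw new IllegalArgumentException("delimiter and columnCount are required");
        }
        // Quote the delimiter so characters such as '|' are matched literally.
        this.delimiterRegex = Pattern.quote(delimiter);
        this.columnCount = columnCount;
    }

    /** Splits one line into exactly columnCount fields, or throws on a mismatch. */
    String[] parseLine(String line) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] parts = line.split(delimiterRegex, -1);
        if (parts.length != columnCount) {
            throw new IllegalArgumentException(
                "expected " + columnCount + " fields but found " + parts.length);
        }
        return parts;
    }

    /** Reads every non-empty line from the stream as UTF-8 and parses it. */
    List<String[]> parseAll(InputStream in) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    rows.add(parseLine(line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

Keeping the parsing in a small class of its own means it can be exercised locally without a MaxCompute connection, which helps in the debugging stage.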

Debug the Extractor or Outputer class

Write unit tests based on the examples in the examples directory. Each example shows how to construct test inputs and verify the handler's output. Run the tests to confirm that your reading or writing logic is correct before you package the handler.
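A local check of this kind can be as simple as running the writing logic against an in-memory stream and comparing the bytes with what you expect. The sketch below is an assumption-laden illustration, not one of the bundled examples: `DelimitedRowWriterCheck`, `writeRow`, and `capture` are hypothetical names, and the choice to write NULL columns as empty fields is an assumption of this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical local check for Outputer-style writing logic (no SDK classes used). */
final class DelimitedRowWriterCheck {

    /** Stand-in for the writing logic: one row becomes one delimited UTF-8 line. */
    static void writeRow(OutputStream out, String delimiter, Object... values) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                line.append(delimiter);
            }
            // Assumption of this sketch: NULL columns are written as empty fields.
            line.append(values[i] == null ? "" : values[i].toString());
        }
        line.append('\n');
        try {
            out.write(line.toString().getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the writer against an in-memory stream and returns what it produced. */
    static String capture(String delimiter, Object[]... rows) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        for (Object[] row : rows) {
            writeRow(buffer, delimiter, row);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String actual = capture("|",
            new Object[] {1L, "a", null},
            new Object[] {2L, "b", 3.5});
        String expected = "1|a|\n2|b|3.5\n";
        if (!expected.equals(actual)) {
            throw new AssertionError("unexpected output: " + actual);
        }
        System.out.println("writer check passed");
    }
}
```

The same pattern applies to reading logic: supply a `ByteArrayInputStream` with known content and compare the extracted rows with the expected values.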

Package and upload the program

Compress your handler into a JAR package and upload it to the MaxCompute server as a resource. The external table you create in the next step references this resource to locate your handler at query time.

For detailed steps, see Package a Java program, upload the package, and create a MaxCompute UDF.

Query unstructured data

  1. In the Project tool window, right-click scripts under your MaxCompute project and choose New > MaxCompute SQL Script.

  2. Enter a name in the Script Name field, select your MaxCompute project from the MaxCompute Project drop-down list, and then click OK.

  3. Enter the SQL statement that creates an external table referencing your JAR resource and handler class, and then click the Run icon to run it.

  4. Create another MaxCompute SQL script, enter your query statement, and then click the Run icon to run the query.
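To make the shape of the table-creation statement in step 3 concrete, the sketch below assembles one as a Java string. It assumes the usual MaxCompute external table clauses (`STORED BY` for the handler class, `LOCATION` for the external storage path, `USING` for the JAR resource); check the exact syntax and any additional clauses against the MaxCompute SQL reference. `ExternalTableDdl`, the table name, the columns, the handler class name, the OSS path placeholder, and the JAR name are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that renders a CREATE EXTERNAL TABLE statement for a custom handler. */
final class ExternalTableDdl {

    static String build(String table,
                        Map<String, String> columns,
                        String handlerClass,
                        String location,
                        String jarResource) {
        // Render "name TYPE" pairs, one per line, wrapped in parentheses.
        StringJoiner cols = new StringJoiner(",\n  ", "(\n  ", "\n)");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            cols.add(column.getKey() + " " + column.getValue());
        }
        return "CREATE EXTERNAL TABLE IF NOT EXISTS " + table + "\n"
            + cols + "\n"
            + "STORED BY '" + handlerClass + "'\n"   // fully qualified StorageHandler class
            + "LOCATION '" + location + "'\n"        // path to the data in external storage
            + "USING '" + jarResource + "';";        // JAR uploaded as a MaxCompute resource
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the columns in declaration order.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "BIGINT");
        columns.put("name", "STRING");
        // Every value below is a placeholder; substitute your own names and path.
        System.out.println(build(
            "my_external_table",
            columns,
            "com.example.MyStorageHandler",
            "oss://<endpoint>/<bucket>/<path>/",
            "my-handler.jar"));
    }
}
```

After the table exists, the query script in step 4 can treat it like any other table, for example `SELECT * FROM my_external_table LIMIT 10;` with the same placeholder name.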

What's next

Example: Create an OSS external table by using a custom extractor