
Object Storage Service:Use Trino on an EMR cluster to query data stored in OSS-HDFS

Last Updated:Feb 27, 2026

OSS-HDFS provides an HDFS-compatible interface that allows big data engines such as Trino to access OSS data directly. Set up Trino on an E-MapReduce (EMR) cluster to run interactive SQL queries against data stored in OSS-HDFS.

Prerequisites

Before you begin, make sure that OSS-HDFS is enabled for a bucket. The EMR cluster that you create in Step 1 uses this bucket as its root storage directory.

Step 1: Create an EMR cluster

  1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

  2. Create an EMR cluster with the following settings:

    Parameter | Required value
    Product Version | EMR-3.46.2 or later, or EMR-5.12.2 or later
    Root Storage Directory of Cluster | A bucket with OSS-HDFS enabled
    Other parameters | Default values

    For more information, see Create a cluster.

Step 2: Connect to the Trino server

  1. Get the Trino server address and port:

    1. On the EMR on ECS page, click the name of your cluster.

    2. Go to Services > Trino > Configure to find the server address and port.

  2. Run the following command to connect to the Trino server:

       trino --server <Trino_server_address>:<Trino_server_port> --catalog <catalog_name>

    The following table describes the parameters:

    Parameter | Description
    <Trino_server_address> | The IP address or hostname of the Trino server. Located on the Configure tab.
    <Trino_server_port> | The port number of the Trino server. Located on the Configure tab.
    <catalog_name> | The catalog to connect to, such as hive.

    Note: The --catalog flag requires a value. Specify the catalog that maps to your data source.
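As a concrete sketch, the connection command can be assembled from its parts. The server address master-1-1 and port 9090 below are hypothetical placeholders; replace them with the values shown on the Configure tab of your cluster.

```shell
# Hypothetical values: replace with the address and port from the Configure tab.
TRINO_SERVER="master-1-1"
TRINO_PORT="9090"
CATALOG="hive"

# Assemble the connection command; run it on a cluster node to open the Trino CLI.
CMD="trino --server ${TRINO_SERVER}:${TRINO_PORT} --catalog ${CATALOG}"
echo "$CMD"
# prints: trino --server master-1-1:9090 --catalog hive
```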

Step 3: Query data stored in OSS-HDFS

After you connect to the Trino server, run the following SQL statements to create a schema, load data, and run a query.

  1. Create a schema that points to an OSS-HDFS location.

       create schema testDB with (location='oss://<Bucket>.<Endpoint>/<schema_dir>');

    Replace the placeholders with your values:

    Placeholder | Description | Example
    <Bucket> | The name of the OSS bucket with OSS-HDFS enabled | my-data-bucket
    <Endpoint> | The OSS-HDFS endpoint for your region | cn-hangzhou.oss-dls.aliyuncs.com
    <schema_dir> | The directory path for the schema | trino/testDB

    The full URI follows this format:

       oss://<Bucket>.<Endpoint>/<schema_dir>
  2. Switch to the new schema.

       use testDB;
  3. Create a table.

       create table tbl (key int, val int);
  4. Insert data into the table.

       insert into tbl values (1,666);
  5. Query data in the table.

       select * from tbl;

    Expected output:

        key | val
       -----+-----
          1 | 666
       (1 row)
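The steps above can also be scripted non-interactively. The sketch below assembles the schema location URI from the example placeholder values and prints a batch command that passes the statements to the Trino CLI via its --execute option, which runs the given statements and exits. The bucket, endpoint, directory, server address, and port are the hypothetical examples from this tutorial; substitute your own values, and note that the command is only printed here, not executed.

```shell
# Hypothetical values from the examples above; replace with your own.
BUCKET="my-data-bucket"
ENDPOINT="cn-hangzhou.oss-dls.aliyuncs.com"
SCHEMA_DIR="trino/testDB"

# Assemble the OSS-HDFS location URI: oss://<Bucket>.<Endpoint>/<schema_dir>
LOCATION="oss://${BUCKET}.${ENDPOINT}/${SCHEMA_DIR}"

# The SQL statements from steps 1-5, joined for batch execution.
SQL="create schema testDB with (location='${LOCATION}'); use testDB; create table tbl (key int, val int); insert into tbl values (1, 666); select * from tbl;"

# Print the batch command; run it on a cluster node to execute the statements.
echo "trino --server master-1-1:9090 --catalog hive --execute \"${SQL}\""
```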

Result

The query returns the row you inserted. The data is stored in the OSS-HDFS location specified in the schema.
