OSS-HDFS provides an HDFS-compatible interface that allows big data engines such as Trino to access OSS data directly. Set up Trino on an E-MapReduce (EMR) cluster to run interactive SQL queries against data stored in OSS-HDFS.
Prerequisites
Before you begin, make sure that you have:
An EMR cluster of EMR V3.42.0 or later, or EMR V5.8.0 or later, with the Trino service selected. For more information, see Create a cluster.
A bucket with OSS-HDFS enabled and the required access permissions granted. For more information, see Enable OSS-HDFS and grant access permissions.
Step 1: Create an EMR cluster
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
Create an EMR cluster with the following settings:
- Product Version: EMR-3.46.2 or later, or EMR-5.12.2 or later
- Root Storage Directory of Cluster: a bucket with OSS-HDFS enabled
- Other parameters: default values

For more information, see Create a cluster.
Step 2: Connect to the Trino server
Get the Trino server address and port:
On the EMR on ECS page, click the name of your cluster.
Go to Services > Trino > Configure to find the server address and port.
Run the following command to connect to the Trino server:

```shell
trino --server <Trino_server_address>:<Trino_server_port> --catalog <catalog_name>
```

Replace the placeholders with your values:

- <Trino_server_address>: the IP address or hostname of the Trino server, shown on the Configure tab.
- <Trino_server_port>: the port number of the Trino server, shown on the Configure tab.
- <catalog_name>: the catalog to connect to, such as hive.

Note: The --catalog flag requires a value. Specify the catalog that maps to your data source.
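Before launching the client, you can sanity-check the <host>:<port> value you pass to --server. A minimal sketch; the variable names and example server value are illustrative, not part of the Trino CLI:

```shell
# Split and check a "<host>:<port>" value before passing it to --server.
# SERVER below is an example value; substitute your own from the Configure tab.
SERVER="master-1-1.cluster:9090"
HOST="${SERVER%:*}"     # everything before the last colon
PORT="${SERVER##*:}"    # everything after the last colon
case "$PORT" in
  ''|*[!0-9]*) echo "invalid port: $PORT" >&2; exit 1 ;;
esac
echo "host=$HOST port=$PORT"
```

If the port is missing or non-numeric, the check fails before the client is invoked, which gives a clearer error than a failed connection attempt.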
Step 3: Query data stored in OSS-HDFS
After you connect to the Trino server, run the following SQL statements to create a schema, load data, and run a query.
Create a schema that points to an OSS-HDFS location.
```sql
create schema testDB with (location='oss://<Bucket>.<Endpoint>/<schema_dir>');
```

Replace the placeholders with your values:

- <Bucket>: the name of the OSS bucket with OSS-HDFS enabled. Example: my-data-bucket
- <Endpoint>: the OSS-HDFS endpoint for your region. Example: cn-hangzhou.oss-dls.aliyuncs.com
- <schema_dir>: the directory path for the schema. Example: trino/testDB

The full URI follows this format:

oss://<Bucket>.<Endpoint>/<schema_dir>

Switch to the new schema.
```sql
use testDB;
```

Create a table.

```sql
create table tbl (key int, val int);
```

Insert data into the table.

```sql
insert into tbl values (1,666);
```

Query data in the table.

```sql
select * from tbl;
```

Expected output:

```
 key | val
-----+-----
   1 | 666
(1 row)
```
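The location URI in the create schema statement above can be assembled from its three parts before you paste it into the statement. A minimal sketch; the variable names are illustrative, and the example values come from the placeholder table in this step:

```shell
# Assemble oss://<Bucket>.<Endpoint>/<schema_dir> for the CREATE SCHEMA statement.
# Example values only; substitute your own bucket, region endpoint, and path.
BUCKET="my-data-bucket"
ENDPOINT="cn-hangzhou.oss-dls.aliyuncs.com"
SCHEMA_DIR="trino/testDB"
LOCATION="oss://${BUCKET}.${ENDPOINT}/${SCHEMA_DIR}"
echo "$LOCATION"
# oss://my-data-bucket.cn-hangzhou.oss-dls.aliyuncs.com/trino/testDB
```

Building the URI this way avoids the most common mistake in this step: joining the bucket and endpoint with a slash instead of a dot.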
Result
The query returns the row you inserted. The data is stored in the OSS-HDFS location specified in the schema.