All Products
Search
Document Center

Object Storage Service:Use Hive on an EMR cluster to process data stored in OSS-HDFS

Last Updated:Mar 20, 2026

This topic describes how to use Hive on an E-MapReduce (EMR) cluster to process data stored in OSS-HDFS.

Prerequisites

Before you begin, make sure that you have:

Add partitions to a Hive table backed by OSS-HDFS

This procedure walks you through logging on to an EMR cluster, connecting to the Hive CLI, and adding partitions that map to OSS-HDFS paths:

  1. Log on to the EMR cluster: Access the cluster node through the EMR console and open a terminal session.

  2. Connect to Hive: Launch the Hive CLI on the cluster terminal.

  3. Add partitions to an existing table: Use the ALTER TABLE ... ADD PARTITION command to map Hive table partitions to OSS-HDFS paths.

Step 1: Log on to the EMR cluster

  1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

  2. Click the EMR cluster that you created.

  3. Click the Nodes tab, and then click p480359.jpg on the left side of the node group.

  4. Click the ID of the ECS instance. On the Instances page, click Connect next to the instance ID.

For more information about logging on from Windows or Linux using an SSH key pair or SSH password, see Log on to a cluster.

Step 2: Connect to Hive

On the terminal, run the following command to start the Hive CLI:

hive

For other connection methods, see Connect to Hive.

Step 3: Add partitions to an existing table

Adding partitions stores table data in smaller units. Queries that specify partition conditions scan only matching partitions, which improves query performance.

Syntax

ALTER TABLE <table_name> ADD [IF NOT EXISTS] PARTITION <pt_spec> [PARTITION <pt_spec> ...] LOCATION 'location';

Parameters

ParameterRequiredDescription
table_nameYesThe name of the table to which you want to add partitions.
IF NOT EXISTSNoSkips the operation without returning an error if a partition with the same name already exists. If omitted, adding a duplicate partition name causes the operation to fail.
pt_specYesThe partition specification in the format (partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...). Partition key column names are case-insensitive; values are case-sensitive.
locationYesThe OSS path used to store data in the partition.

Example

The following example adds a partition to the sale_detail table for the China (Hangzhou) region in December 2021:

ALTER TABLE sale_detail ADD IF NOT EXISTS PARTITION (sale_date='202112', region='hangzhou') LOCATION 'oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/path/2021/';