E-MapReduce: Enable and configure OSS storage analysis

Last Updated: Mar 26, 2026

EMR Doctor supports storage analysis for Object Storage Service (OSS). The analysis helps you understand the usage and health of your OSS resources, including data usage, health status, and association with Hive storage.

How it works

OSS provides an inventory feature that periodically generates manifest files for a bucket. Each manifest file contains metadata about the objects in the bucket, such as the number of objects and their size and storage class. EMR Doctor reads the latest manifest file from your bucket to run storage analysis.

To use OSS storage analysis, enable the OSS inventory feature for each bucket you want to analyze, then configure EMR Doctor to locate the manifest files.
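The idea of picking the latest manifest file can be sketched in a few lines of Python. This is only an illustration, not EMR Doctor's actual logic. It assumes that each inventory run writes its manifest.json under a directory named with a zero-padded UTC timestamp, so that string order matches chronological order. The helper latest_manifest_key and the object keys are hypothetical; check the real layout in your bucket before relying on it.

```python
from typing import Iterable, Optional


def latest_manifest_key(keys: Iterable[str]) -> Optional[str]:
    """Return the newest manifest.json key, or None if there is none.

    Assumption (not confirmed by this document): keys look like
    <manifest_dir>/<timestamp>/manifest.json, where <timestamp> is a
    zero-padded UTC string that sorts chronologically.
    """
    manifests = [k for k in keys if k.endswith("/manifest.json")]
    if not manifests:
        return None
    # Compare by the timestamp directory, i.e. the segment before the file name.
    return max(manifests, key=lambda k: k.rsplit("/", 2)[-2])


# Hypothetical object keys, for illustration only.
example_keys = [
    "reports/my-bucket/my-inventory/2026-03-24T16-00Z/manifest.json",
    "reports/my-bucket/my-inventory/2026-03-25T16-00Z/manifest.json",
    "reports/my-bucket/my-inventory/data/part-0001.csv.gz",
]
print(latest_manifest_key(example_keys))
```

Comparing only the timestamp segment, rather than the whole key, keeps the selection correct even if the keys are listed in arbitrary order.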

Prerequisites

Before you begin, ensure that you have:

  • An EMR cluster

  • One or more OSS buckets containing data you want to analyze

  • Access to the OSS console

Enable the OSS inventory feature

Enable the inventory feature for each bucket that you want to analyze. OSS uses the inventory configuration to generate manifest files, which EMR Doctor reads to perform storage analysis.

Note

The OSS inventory feature incurs charges. For pricing details, see Bucket inventory.

  1. Log on to the OSS console.

  2. In the left navigation pane, click Buckets. On the Buckets page, find and click the bucket you want to analyze.

  3. In the left navigation pane, choose Data Management > Bucket Inventory.

  4. On the Bucket Inventory page, click Create Inventory.

  5. In the Create Inventory panel, configure the inventory parameters. For parameter details, see Bucket inventory. Note the following constraints:

    • Set Inventory Storage Bucket to the bucket for which you are enabling inventory, so that the inventory is stored in the bucket being analyzed.

    • Under Optional Fields, select Object Size and Storage Class. EMR Doctor requires these two fields to calculate storage usage and analyze storage class distribution.

    • If the bucket contains more than 10 billion objects, set Frequency to Weekly. Otherwise, set it to Daily.

  6. Select I understand the terms and agree to authorize Alibaba Cloud OSS to access the resources in my buckets, then click OK.

Repeat these steps for each bucket you want to include in storage analysis.
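When several buckets are involved, it can help to write down the intended console settings per bucket before clicking through the steps. The following Python sketch encodes only the constraints listed above (same storage bucket, the two optional fields, and the 10 billion object threshold for the frequency). It does not call any OSS API; the names InventoryPlan and plan_inventory and the object counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# Threshold taken from the constraint above: more than 10 billion objects.
WEEKLY_THRESHOLD = 10_000_000_000


@dataclass(frozen=True)
class InventoryPlan:
    source_bucket: str
    storage_bucket: str
    optional_fields: Tuple[str, ...]
    frequency: str


def plan_inventory(bucket: str, object_count: int) -> InventoryPlan:
    """Return the console settings this section calls for, for one bucket."""
    if object_count < 0:
        raise ValueError("object_count must be non-negative")
    frequency = "Weekly" if object_count > WEEKLY_THRESHOLD else "Daily"
    return InventoryPlan(
        source_bucket=bucket,
        storage_bucket=bucket,  # inventory is stored in the bucket itself
        optional_fields=("Object Size", "Storage Class"),
        frequency=frequency,
    )


# Made-up bucket names and object counts, for illustration only.
for name, count in [("my-bucket1", 2_500_000), ("my-bucket2", 12_000_000_000)]:
    print(plan_inventory(name, count))
```

The result is a checklist to compare against the Create Inventory panel, not a replacement for the console steps.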

Configure OSS storage analysis

After enabling inventory for your buckets, configure the OSS storage analysis parameters in the EMR console. These parameters tell EMR Doctor which buckets to analyze and where to find their manifest files.

Set the following parameters on the TAIHAODOCTOR service configuration page in the EMR console. For other configuration options, see the EMR Doctor configuration guide.

Parameter | Description | Required
collect.oss.bucket | The name of the OSS bucket to analyze. | Yes
collect.oss.manifest.dir | The directory of the manifest file, in the format inventory_path/inventory_bucket/inventory_name. | Yes

The collect.oss.manifest.dir value is composed of three parts from your inventory configuration:

  • inventory_path: The storage path for the inventory report that you set when creating the inventory.

  • inventory_bucket: The name of the OSS bucket being analyzed (same as collect.oss.bucket).

  • inventory_name: The inventory name that you set when creating the inventory.

For example, if you set the storage path to reports, the bucket name is my-bucket, and the inventory name is my-inventory, the manifest file directory is:

reports/my-bucket/my-inventory
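
The same composition can be expressed as a small helper, which is handy when generating the value for many buckets. This is a minimal sketch: the function name manifest_dir is hypothetical, and trimming leading and trailing slashes from each part is an assumption made here to avoid doubled separators, not documented EMR Doctor behavior.

```python
def manifest_dir(inventory_path: str, inventory_bucket: str, inventory_name: str) -> str:
    """Join the three parts into inventory_path/inventory_bucket/inventory_name."""
    # Assumption: surrounding slashes are not significant, so they are trimmed.
    parts = [p.strip("/") for p in (inventory_path, inventory_bucket, inventory_name)]
    if not all(parts):
        raise ValueError("all three parts must be non-empty")
    return "/".join(parts)


print(manifest_dir("reports", "my-bucket", "my-inventory"))
# prints: reports/my-bucket/my-inventory
```
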
Important

For multiple buckets, list the bucket names and their corresponding manifest file directories in the same order, separated by commas. The order of entries in collect.oss.bucket must match the order in collect.oss.manifest.dir.

Single bucket

collect.oss.bucket:       my-bucket
collect.oss.manifest.dir: reports/my-bucket/my-inventory

Multiple buckets

collect.oss.bucket:       my-bucket1,my-bucket2
collect.oss.manifest.dir: reports1/my-bucket1/my-inventory1,reports2/my-bucket2/my-inventory2
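
Because the two values are matched by position, a mismatch in count or order is easy to introduce. The following Python sketch checks a pair of values before you paste them into the console. The function name pair_buckets is hypothetical, and the check assumes that the inventory name contains no slash, so the bucket name is always the second-to-last path segment.

```python
from typing import List, Tuple


def pair_buckets(bucket_value: str, manifest_dir_value: str) -> List[Tuple[str, str]]:
    """Split both comma-separated values and pair them by position."""
    buckets = [b.strip() for b in bucket_value.split(",")]
    dirs = [d.strip() for d in manifest_dir_value.split(",")]
    if len(buckets) != len(dirs):
        raise ValueError(f"{len(buckets)} buckets but {len(dirs)} manifest directories")
    for bucket, directory in zip(buckets, dirs):
        # Expected shape: inventory_path/inventory_bucket/inventory_name, so the
        # second-to-last segment should equal the bucket name at this position.
        segments = directory.strip("/").split("/")
        if len(segments) < 3 or segments[-2] != bucket:
            raise ValueError(f"{directory!r} does not match bucket {bucket!r}")
    return list(zip(buckets, dirs))


pairs = pair_buckets(
    "my-bucket1,my-bucket2",
    "reports1/my-bucket1/my-inventory1,reports2/my-bucket2/my-inventory2",
)
for bucket, directory in pairs:
    print(bucket, "->", directory)
```

If the directories are listed in a different order than the buckets, the function raises ValueError instead of returning mismatched pairs.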

What's next