
Data Lake Formation: Storage overview

Last Updated: Mar 25, 2026

Storage overview gives you a consolidated view of your data lake's storage health — how much space you're using, which tables and databases are growing fastest, and where you can reduce costs or improve query performance. All metrics reflect the previous day's data (T-1).

The feature covers two analysis areas:

  • Metadata analysis: identify storage trends, find the largest consumers, and spot optimization opportunities across storage class, file distribution, and storage format

  • Location analysis: track storage and request trends for registered locations, filterable by OSS bucket and time period

Prerequisites

Before you begin, ensure that you have:

  • Object Storage Service (OSS) activated

  • Location hosting completed in DLF — see Location hosting

Activate storage overview

Important

Enabling storage overview writes statistical files to the OSS buckets of your databases, and you are charged for storing these files. No statistics are generated on the day you enable the feature; data becomes available the next day.

  1. Log on to the DLF console.

  2. In the left-side navigation pane, click Lake Management > Storage Overview, then click Enable Now.

Metadata analysis

Metadata analysis surfaces storage metrics across all tables and databases registered in DLF. Use it to track overall usage trends, identify the largest consumers of OSS storage, and find optimization opportunities — such as tables with inefficient storage classes or many small files that degrade query performance.

Total resources

Displays the current totals and their monthly and daily changes:

  • Total storage volume: total OSS storage used by tables in metadata management (excludes Hadoop Distributed File System (HDFS) storage)

  • Total number of tables: all tables registered in metadata management

  • Total number of databases: all databases registered in metadata management

  • API monthly/daily access volume: API calls for the current calendar month, with a daily breakdown
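As a rough illustration of how a daily and monthly change could be derived from daily totals, the sketch below works on a hypothetical dictionary of snapshots. The function name, the sample numbers, and the choice to compare against the value 30 days earlier are assumptions for illustration, not the console's documented calculation.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def period_changes(snapshots: Dict[date, int], as_of: date) -> Dict[str, Optional[int]]:
    """Return the current value plus its daily and monthly change.

    `snapshots` maps a statistics date to the metric value recorded for it.
    A change is None when the earlier snapshot is missing.
    """
    current = snapshots[as_of]
    day_before = snapshots.get(as_of - timedelta(days=1))
    # Assumption: the monthly change compares against the value 30 days earlier.
    month_before = snapshots.get(as_of - timedelta(days=30))
    return {
        "current": current,
        "daily_change": None if day_before is None else current - day_before,
        "monthly_change": None if month_before is None else current - month_before,
    }


# Hypothetical table counts keyed by statistics date.
table_counts = {
    date(2026, 2, 22): 900,
    date(2026, 3, 23): 1150,
    date(2026, 3, 24): 1200,
}
changes = period_changes(table_counts, date(2026, 3, 24))
print(changes)
```

The same function applies to storage volume or database count, since each is a single number per statistics date.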

Trend changes

Shows time-series graphs for storage volume, table count, database count, and API calls. Select a time segment to focus on a specific period — for example, to see whether a spike in storage coincides with a batch ingestion job.

Table/database storage ranking

Lists tables and databases ranked by OSS storage consumed. Use this to identify which tables are taking up the most space so you can prioritize optimization — for example, by compacting files, changing the storage class, or archiving data that is no longer frequently accessed.
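The ranking idea can be sketched offline if you have per-table sizes from another source. The example below assumes a hypothetical mapping from (database, table) to bytes; it does not call any DLF API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

TableKey = Tuple[str, str]  # (database name, table name)


def rank_storage(
    table_sizes: Dict[TableKey, int], top_n: int = 10
) -> Tuple[List[Tuple[TableKey, int]], List[Tuple[str, int]]]:
    """Return the top tables and the top databases by bytes stored."""
    table_rank = sorted(table_sizes.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

    # Sum table sizes per database to build the database ranking.
    db_totals: Dict[str, int] = defaultdict(int)
    for (database, _table), size in table_sizes.items():
        db_totals[database] += size
    db_rank = sorted(db_totals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return table_rank, db_rank


# Hypothetical sizes in bytes.
sizes = {
    ("sales", "orders"): 500,
    ("sales", "items"): 200,
    ("logs", "raw"): 600,
}
top_tables, top_databases = rank_storage(sizes, top_n=2)
print(top_tables)
print(top_databases)
```

Note that the largest table and the largest database need not belong together: here `logs.raw` is the largest table, while `sales` is the largest database.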

Storage class distribution

Shows how your OSS storage is distributed across storage classes: Standard, Infrequent Access (IA), Archive, and Cold Archive. Use this view to spot mismatches between access patterns and storage cost, for example data that hasn't been accessed in months but is still stored in the Standard class.
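A distribution of this kind amounts to each class's share of the total bytes. The sketch below computes those shares from hypothetical per-class totals; the class labels and numbers are illustrative only.

```python
from typing import Dict


def storage_class_shares(bytes_by_class: Dict[str, int]) -> Dict[str, float]:
    """Return each storage class's fraction of the total bytes stored."""
    total = sum(bytes_by_class.values())
    if total == 0:
        # Avoid dividing by zero when nothing is stored yet.
        return {name: 0.0 for name in bytes_by_class}
    return {name: size / total for name, size in bytes_by_class.items()}


# Hypothetical totals in bytes per storage class.
usage = {"Standard": 750, "IA": 150, "Archive": 100, "Cold Archive": 0}
shares = storage_class_shares(usage)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```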

Storage format distribution

Shows the distribution of storage formats across the tables included in the statistics. Use this to check whether tables are using efficient columnar formats for your query workloads.

File distribution and rankings of small files

Shows the size distribution of files and ranks tables by their small file count. Tables with many small files increase metadata overhead and slow down query planning. Use this ranking to identify which tables to compact first.
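To make the ranking concrete, the sketch below counts small files per table from a hypothetical list of (table, file size) pairs. The 4 MiB threshold is an assumption chosen for illustration; this page does not state the cutoff that DLF uses.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical cutoff: files below this size count as "small".
SMALL_FILE_THRESHOLD = 4 * 1024 * 1024  # 4 MiB


def rank_by_small_files(
    files: Iterable[Tuple[str, int]], threshold: int = SMALL_FILE_THRESHOLD
) -> List[Tuple[str, int]]:
    """Return (table, small_file_count) pairs, highest count first."""
    counts = Counter(table for table, size in files if size < threshold)
    return counts.most_common()


# Hypothetical file listing: (table name, file size in bytes).
file_listing = [
    ("orders", 1024),
    ("orders", 2048),
    ("events", 512),
    ("events", 10 * 1024 * 1024),
    ("orders", 8 * 1024 * 1024),
]
ranking = rank_by_small_files(file_listing)
print(ranking)
```

Tables at the top of such a ranking are the first candidates for compaction.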

Location analysis

Location analysis shows storage and request trends for locations registered in DLF. Filter by OSS bucket and time segment to drill down into specific data sources.

Location storage trend analysis

Shows how storage volume changes over time for each registered location. Use this to detect sustained growth in specific locations and plan capacity ahead of time.
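One possible way to quantify "sustained growth" from a daily series is sketched below. Both the definition used here (a location grew on at least 75% of the days in the window) and the sample numbers are assumptions for illustration, not a DLF rule.

```python
from typing import Sequence


def average_daily_growth(daily_bytes: Sequence[int]) -> float:
    """Return the mean change per day between the first and last value."""
    if len(daily_bytes) < 2:
        raise ValueError("need at least two daily values")
    return (daily_bytes[-1] - daily_bytes[0]) / (len(daily_bytes) - 1)


def is_sustained_growth(daily_bytes: Sequence[int], min_fraction: float = 0.75) -> bool:
    """Return True when storage grew on at least `min_fraction` of the days."""
    if len(daily_bytes) < 2:
        raise ValueError("need at least two daily values")
    increases = sum(1 for prev, cur in zip(daily_bytes, daily_bytes[1:]) if cur > prev)
    return increases / (len(daily_bytes) - 1) >= min_fraction


# Hypothetical storage volume for one location over six days.
series = [100, 110, 120, 118, 130, 140]
growth = average_daily_growth(series)
sustained = is_sustained_growth(series)
print(growth, sustained)
```

A single dip, such as the fourth day in the sample, does not disqualify the series under this definition, which keeps routine cleanup jobs from masking a longer upward trend.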

Location request trend analysis

Shows request volume trends for each registered location. Use this to correlate request spikes with storage growth or identify locations with unusual access patterns.

Location storage ranking

Ranks registered locations by OSS storage consumed. Use this to identify which locations are the largest consumers and prioritize cost optimization efforts.
