Object Storage Service:Apsara File Storage for HDFS

Last Updated: Mar 20, 2026

Apsara File Storage for HDFS is a cloud file storage service that exposes the same interface as a Hadoop Distributed File System (HDFS). MapReduce, Hive, Spark, and Flink jobs connect to it as the default file system without code changes or recompilation.

Key capabilities:

  • Unlimited capacity with linear performance scaling

  • High-throughput, high-IOPS, low-latency access from Elastic Compute Service (ECS) instances and Container Service

  • A single, unified namespace shared across multiple compute nodes

  • 99.999999999% (eleven 9s) data durability

  • Network isolation, security groups, and RAM user authorization

  • Pay-as-you-go billing by default, with subscription resource plans for additional discounts

Use cases

Apsara File Storage for HDFS is suited for workloads that require sustained high throughput, such as big data analytics and machine learning. ECS instances and other compute resources access stored data directly—no need to copy data to local storage before processing. Deploy Hadoop or machine learning applications across multiple compute nodes and run online or offline computing jobs against the same file system. Export results back to the file system for permanent storage.

Performance

Throughput is the primary performance metric. The practical throughput of a file system is bounded by the maximum bandwidth of the attached ECS instance. For example, an ECS instance with 1.5 Gbit/s of bandwidth supports a maximum file system throughput of 187.5 MB/s (1.5 Gbit/s divided by 8 bits per byte). Throughput scales linearly with file system capacity: provisioning more capacity directly increases available throughput.
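The bandwidth bound can be checked with a short calculation. The following is a minimal sketch, assuming decimal units (1 Gbit = 10^9 bits, 1 MB = 10^6 bytes) and ignoring protocol overhead; the function names are illustrative and not part of any SDK.

```python
def max_throughput_mb_per_s(bandwidth_gbit_per_s: float) -> float:
    """Upper bound on file system throughput (MB/s) for a given ECS bandwidth (Gbit/s)."""
    bits_per_second = bandwidth_gbit_per_s * 1_000_000_000  # Gbit/s -> bit/s
    bytes_per_second = bits_per_second / 8                  # 8 bits per byte
    return bytes_per_second / 1_000_000                     # byte/s -> MB/s


def effective_throughput_mb_per_s(file_system_mb_per_s: float,
                                  bandwidth_gbit_per_s: float) -> float:
    """Practical throughput: the smaller of the file system's throughput
    and the bound imposed by the instance's network bandwidth."""
    return min(file_system_mb_per_s, max_throughput_mb_per_s(bandwidth_gbit_per_s))


if __name__ == "__main__":
    print(max_throughput_mb_per_s(1.5))               # 187.5
    print(effective_throughput_mb_per_s(500.0, 1.5))  # 187.5, the instance is the bottleneck
```

In this sketch, a file system that could deliver 500 MB/s is still limited to 187.5 MB/s when accessed from a single instance with 1.5 Gbit/s of bandwidth.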

Data durability and availability

Apsara File Storage for HDFS stores multiple replicas of every file. Replicas are placed on devices in separate fault domains for geo-redundancy, providing 99.999999999% (eleven 9s) data durability.
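To give a sense of scale for the durability figure, the sketch below computes the expected number of lost files for a given file count. It assumes the figure applies per file over one measurement period (commonly a year); this page does not state the period or the unit, so treat both as assumptions. `Decimal` is used because the loss probability is too small to express reliably with binary floating point.

```python
from decimal import Decimal

# Eleven 9s, as stated in this document.
DURABILITY = Decimal("0.99999999999")


def expected_losses(file_count: int, durability: Decimal = DURABILITY) -> Decimal:
    """Expected number of files lost per period, assuming the durability
    figure applies independently to each file (an assumption)."""
    loss_probability = Decimal(1) - durability
    return Decimal(file_count) * loss_probability


if __name__ == "__main__":
    # With ten billion stored files, the expected loss is about 0.1 files per period.
    print(expected_losses(10_000_000_000))
```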

Security

Apsara File Storage for HDFS protects data using five complementary mechanisms:

| Mechanism | Description |
| --- | --- |
| Network isolation | Isolates file systems within a Virtual Private Cloud (VPC) |
| Classic network user isolation | Controls access in classic network environments |
| File system permission control | Standard permission control for file systems |
| Security group access control | Restricts access at the network level using security groups |
| RAM user authorization | Grants fine-grained permissions to RAM users |

SDK and console

Apsara File Storage for HDFS provides two management interfaces:

  • SDK — The Apsara File Storage for HDFS SDK for Java (aliyun-sdk-dfs-x.y.z.jar) implements Hadoop-compatible file system operations. Applications built on MapReduce, Hive, Spark, and Flink can use the SDK to switch to Apsara File Storage for HDFS as the default file system without modifying or recompiling code.

  • Console — Use the Apsara File Storage for HDFS console to create and manage file systems through a graphical web interface.

Note: During public preview, only the file system SDK is available.
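Switching the default file system is a Hadoop configuration change rather than a code change. The sketch below generates a `core-site.xml` fragment for that purpose. `fs.defaultFS` is the standard Hadoop property; the `dfs://` scheme, the two `impl` property names, and the class names are assumptions recalled from typical SDK setup and should be verified against the SDK documentation. The mount point address is a placeholder, and `build_core_site` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_core_site(properties: dict[str, str]) -> str:
    """Render Hadoop-style <configuration> XML from name/value pairs."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


# Assumed values: replace the placeholder with the file system's real mount point,
# and confirm the implementation class names in the SDK documentation.
properties = {
    "fs.defaultFS": "dfs://MOUNT_POINT_DOMAIN:PORT",
    "fs.dfs.impl": "com.alibaba.dfs.DistributedFileSystem",
    "fs.AbstractFileSystem.dfs.impl": "com.alibaba.dfs.DFS",
}

if __name__ == "__main__":
    print(build_core_site(properties))
```

With the SDK JAR on the Hadoop classpath and properties like these in place, existing jobs resolve paths against the configured file system without being recompiled.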

Billing

Apsara File Storage for HDFS is billed based on file system capacity and preset throughput.

| Billing method | Description |
| --- | --- |
| Pay-as-you-go | Billed hourly based on used resources. The default option. |
| Subscription (resource plan) | Purchase capacity in advance for a lower per-unit price. |