Decoupled Storage Architecture for Scalable Data Processing - MaxCompute

MaxCompute uses a compute-storage decoupled architecture, storing data in columnar format with a compression ratio of approximately 6.5 times and three replicas by default. Storage and compute scale independently, supporting both single-zone and multi-zone deployments. This topic describes the key characteristics of MaxCompute storage, including data types, billing, ingestion methods, and lifecycle management.

Main features

MaxCompute storage has two core features:

Fully managed: No provisioning or capacity planning required. MaxCompute automatically allocates storage when you write data, and you pay only for what you use. Storage and compute are billed separately. For details, see Billable items and billing methods.
Storage encryption: Data at rest is encrypted and decrypted through Key Management Service (KMS). For details, see Storage encryption.

Table data

MaxCompute stores data primarily as table data. The following table data types are billable:

Internal tables: Mainly structured data stored in columnar format.
Table clones: Independent data replicas of internal tables created using CLONE TABLE. Each clone is stored separately from the base table and billed independently.
Local backups: System-generated snapshots of historical table versions, created automatically when data is deleted or modified. These data versions support quick recovery of data within the retention period. Backup data is retained free of charge for one day by default. Set a longer retention period to meet your data recovery needs. For details, see Local table backups.
Materialized views: Pre-computed views that store query results for reuse, reducing repeated execution of expensive queries. Materialized views occupy physical storage and are billed accordingly.

External tables reference data stored outside MaxCompute. Examples include OSS external tables (data stored in Object Storage Service) and Hologres external tables (data stored in Hologres). An external table has a table schema, but its data path points to an external storage service. MaxCompute does not charge for external table storage; the external storage service bills for the underlying data. For details on OSS external tables, see OSS external tables.

For storage fee details, see Storage fees.

Metadata

Metadata includes table schemas, partitioning and clustering specifications, table lifecycles, and permission configurations. MaxCompute stores metadata at no charge.
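The billable and non-billable categories described above can be summarized in a small lookup. This is a minimal sketch, not a MaxCompute API: the category keys and the `billable_bytes` helper are hypothetical names chosen for illustration, and the sketch does not model the free retention window for local backups.

```python
# Hypothetical category names; the flags follow the descriptions in this topic.
BILLED_BY_MAXCOMPUTE = {
    "internal_table": True,
    "table_clone": True,        # billed independently of the base table
    "local_backup": True,       # free retention window is not modeled here
    "materialized_view": True,  # occupies physical storage
    "external_table": False,    # billed by the external storage service
    "metadata": False,          # stored at no charge
}


def billable_bytes(usage):
    """Sum the bytes of the categories that MaxCompute bills for.

    `usage` maps a category name to its stored size in bytes.
    """
    total = 0
    for category, size in usage.items():
        if category not in BILLED_BY_MAXCOMPUTE:
            raise KeyError(f"unknown storage category: {category}")
        if BILLED_BY_MAXCOMPUTE[category]:
            total += size
    return total
```

Sizes passed to the helper are assumed to be the already-compressed, single-replica sizes that storage billing is based on.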

Storage billing

Storage fees are calculated based on the size of a single replica after columnar compression. With a compression ratio of approximately 6.5 times, the billed size is roughly 15% of the original data size. Three-replica redundancy and zone configuration (single-zone or multi-zone) do not affect the billed amount; you always pay for one compressed replica. For details, see Storage fees.
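The billing rule above reduces to a single division. The following sketch estimates the billed size from the raw data size, assuming the approximate 6.5 compression ratio stated in this topic; the function name and parameters are hypothetical, and the real ratio depends on your data.

```python
# Approximate ratio quoted in this topic; treat it as an estimate, not a guarantee.
COMPRESSION_RATIO = 6.5


def billed_size_gb(raw_size_gb, compression_ratio=COMPRESSION_RATIO,
                   replicas=3, multi_zone=False):
    """Estimate the billed size in GB for a given raw data size.

    `replicas` and `multi_zone` are accepted only to make the point explicit:
    they are deliberately ignored, because billing covers one compressed replica.
    """
    if raw_size_gb < 0:
        raise ValueError("raw_size_gb must be non-negative")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_size_gb / compression_ratio
```

Under these assumptions, 650 GB of raw data is billed as roughly 100 GB, whether the project is single-zone or multi-zone.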

Tiered storage

MaxCompute supports three storage tiers to help you balance performance and cost based on how often data is accessed:

Storage tier	Best for
Standard storage	Frequently accessed data
Infrequent access (IA) storage	Data accessed less often
Long-term storage	Rarely accessed, archival data

Set storage tiers per table based on your access patterns to reduce costs. For details, see Configure storage tiers for storage resources.
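One way to reason about tier choice is to map how recently a table was accessed to a tier. The sketch below is illustrative only: the thresholds are supplied by the caller and are not MaxCompute defaults, and `StorageTier` and `suggest_tier` are hypothetical names rather than part of any MaxCompute SDK.

```python
from enum import Enum


class StorageTier(Enum):
    STANDARD = "Standard storage"
    INFREQUENT_ACCESS = "Infrequent access (IA) storage"
    LONG_TERM = "Long-term storage"


def suggest_tier(days_since_last_access, ia_after_days, long_term_after_days):
    """Suggest a tier from access recency using caller-defined thresholds."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if not 0 <= ia_after_days <= long_term_after_days:
        raise ValueError("thresholds must satisfy 0 <= ia <= long_term")
    if days_since_last_access >= long_term_after_days:
        return StorageTier.LONG_TERM
    if days_since_last_access >= ia_after_days:
        return StorageTier.INFREQUENT_ACCESS
    return StorageTier.STANDARD
```

Choose the thresholds from your own access patterns and the pricing of each tier; the function only encodes the ordering of the three tiers.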

Data writing

MaxCompute supports multiple data writing modes. The right mode depends on how much latency your workflow can tolerate and where your data originates:

Mode	Latency	Best for
Offline batch writing (data channel)	Scheduling-cycle latency	Workflows where data delivery within a batch window is acceptable
Offline data streaming writing (data channel)	Higher visibility and request latency than real-time writing	Continuous data feeds that do not require near-real-time visibility
Real-time data writing (data channel)	Second-level visibility latency	Scenarios where data is written to DataHub first, then synced to MaxCompute
Generated data	Not applicable	Transforming or copying data already in MaxCompute using SQL `INSERT` statements
Offline batch writing (lakehouse solution)	Not applicable	Federated computing scenarios where occasional data migration is needed

For details on each mode, see Overview of the data transmission service.
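The table above can be read as a short decision procedure. The following sketch encodes one possible reading of it; the flags and the order in which they are checked are assumptions made for illustration, and `WriteMode` and `choose_write_mode` are hypothetical names, not part of the data transmission service.

```python
from enum import Enum


class WriteMode(Enum):
    OFFLINE_BATCH = "Offline batch writing (data channel)"
    OFFLINE_STREAMING = "Offline data streaming writing (data channel)"
    REALTIME = "Real-time data writing (data channel)"
    GENERATED = "Generated data"
    LAKEHOUSE_BATCH = "Offline batch writing (lakehouse solution)"


def choose_write_mode(data_in_maxcompute=False, federated=False,
                      via_datahub=False, continuous=False):
    """Pick a writing mode from where the data originates and how it arrives."""
    if data_in_maxcompute:
        # Data already in MaxCompute is transformed or copied with SQL INSERT.
        return WriteMode.GENERATED
    if federated:
        # Occasional migration in a federated computing setup.
        return WriteMode.LAKEHOUSE_BATCH
    if via_datahub:
        # Written to DataHub first, then synced; second-level visibility.
        return WriteMode.REALTIME
    if continuous:
        # Continuous feed that tolerates slower visibility.
        return WriteMode.OFFLINE_STREAMING
    # Default: delivery within a batch window is acceptable.
    return WriteMode.OFFLINE_BATCH
```

For example, `choose_write_mode(continuous=True)` returns `WriteMode.OFFLINE_STREAMING`, while calling the function with no arguments falls back to offline batch writing.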

Data reading

Data in MaxCompute is primarily read for analytics and queries within MaxCompute. Additional reading options include:

Batch reading via data channels
Querying through JDBC
Reading via external tables based on the lakehouse solution, with results written to Object Storage Service (OSS)

Data lifecycle

A lifecycle defines how long a table or partition retains data before MaxCompute automatically reclaims it. The lifecycle is counted from the last time the data was modified, so every change restarts the countdown. Configure lifecycle rules to automate data expiration and reduce storage costs over time.
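The reset behavior can be illustrated with a small date calculation. This is a sketch under two assumptions: the lifecycle is expressed in whole days, and it is measured from the last modification time. The function names are hypothetical, and the exact moment at which MaxCompute reclaims expired data may differ.

```python
from datetime import datetime, timedelta


def expires_at(last_modified, lifecycle_days):
    """Return the time at which the data becomes eligible for reclamation."""
    if lifecycle_days <= 0:
        raise ValueError("lifecycle_days must be positive")
    return last_modified + timedelta(days=lifecycle_days)


def is_expired(last_modified, lifecycle_days, now):
    """Check whether the lifecycle has elapsed since the last modification."""
    return now >= expires_at(last_modified, lifecycle_days)


# A 30-day lifecycle on data last modified on 2024-01-01 has elapsed by 2024-02-01.
created = datetime(2024, 1, 1)
check_time = datetime(2024, 2, 1)
expired_without_change = is_expired(created, 30, check_time)

# Modifying the data on 2024-01-20 restarts the countdown from that date.
modified = datetime(2024, 1, 20)
expired_after_change = is_expired(modified, 30, check_time)
```

In this example the unmodified data is past its lifecycle at the check time, while the data modified on 2024-01-20 is not, because its countdown restarted.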