MaxCompute decouples compute from storage. Data is stored in columnar format, compressed at a ratio of approximately 6.5:1, and kept in three replicas by default. Storage and compute scale independently, and both single-zone and multi-zone deployments are supported. This topic describes the key characteristics of MaxCompute storage, including data types, billing, ingestion methods, and lifecycle management.
Main features
MaxCompute storage has two core features:
- Fully managed: No provisioning or capacity planning required. MaxCompute automatically allocates storage when you write data, and you pay only for what you use. Storage and compute are billed separately. For details, see Billable items and billing methods.
- Storage encryption: Data at rest is encrypted and decrypted through Key Management Service (KMS). For details, see Storage encryption.
Table data
MaxCompute stores data primarily as table data. The following table data types are billable:
- Internal tables: Mainly structured data stored in columnar format.
- Table clones: Independent data replicas of internal tables created using CLONE TABLE. Each clone is stored separately from the base table and billed independently.
- Local backups: System-generated snapshots of historical table versions, created automatically when data is deleted or modified. These versions support quick recovery of data within the retention period. Backup data is retained free of charge for one day by default. You can set a longer retention period if your data recovery needs require it. For details, see Local table backups.
- Materialized views: Pre-computed views that persist query results for reuse and refresh them periodically, reducing repeated execution of expensive queries. Materialized views occupy physical storage and are billed accordingly.
External tables store data outside MaxCompute. Examples include OSS external tables (data stored in Object Storage Service) and Hologres external tables (data stored in Hologres). An external table has a table schema in MaxCompute, but its data path points to an external storage service. MaxCompute does not charge for external table storage; the external storage service bills for the underlying data. For details on OSS external tables, see OSS external tables.
For storage fee details, see Storage fees.
Metadata
Metadata includes table schemas, partitioning and clustering specifications, table lifecycles, and permission configurations. MaxCompute stores metadata at no charge.
Storage billing
Storage fees are calculated based on the size of a single replica after columnar compression. With a compression ratio of approximately 6.5:1, the billed size is substantially smaller than the original data size. Three-replica redundancy and zone configuration (single-zone or multi-zone) do not affect the billed amount: you always pay for one compressed replica. For details, see Storage fees.
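The billing rule can be illustrated with a little arithmetic. The following Python sketch estimates the billed size from the uncompressed size. The function name `estimate_billed_gb` and its parameters are hypothetical and not part of any MaxCompute API, and the 6.5 ratio is only the approximate figure quoted in this topic; actual compression depends on your data.

```python
# Approximate compression ratio quoted in this topic; real ratios vary by data.
APPROX_COMPRESSION_RATIO = 6.5


def estimate_billed_gb(raw_gb: float,
                       compression_ratio: float = APPROX_COMPRESSION_RATIO,
                       replicas: int = 3) -> float:
    """Estimate the billed size in GB for raw_gb of uncompressed data.

    Hypothetical helper for illustration only. The replicas argument is
    accepted but deliberately ignored: billing is based on one compressed
    replica, regardless of redundancy or zone configuration.
    """
    if raw_gb < 0:
        raise ValueError("raw_gb must be non-negative")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_gb / compression_ratio


# 650 GB of raw data at roughly 6.5:1 is billed as about 100 GB.
print(f"{estimate_billed_gb(650.0):.1f} GB billed")
```

Treat the result as a rough planning estimate, not a quote; the Storage fees topic remains the authoritative source for how fees are calculated.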
Tiered storage
MaxCompute supports three storage tiers to help you balance performance and cost based on how often data is accessed:
| Storage tier | Best for |
|---|---|
| Standard storage | Frequently accessed data |
| Infrequent access (IA) storage | Data accessed less often |
| Long-term storage | Rarely accessed, archival data |
Set storage tiers per table based on your access patterns to reduce costs. For details, see Configure storage tiers for storage resources.
Data writing
MaxCompute supports multiple data writing modes. The right mode depends on how much latency your workflow can tolerate and where your data originates:
| Mode | Latency | Best for |
|---|---|---|
| Offline batch writing (data channel) | Scheduling-cycle latency | Workflows where data delivery within a batch window is acceptable |
| Offline data streaming writing (data channel) | Higher visibility and request latency | Continuous data feeds that do not require near-real-time visibility |
| Real-time data writing (data channel) | Second-level visibility latency | Scenarios where data is written to DataHub first, then synced to MaxCompute |
| Generated data | Not applicable | Transforming or copying data already in MaxCompute using SQL INSERT statements |
| Offline batch writing (lakehouse solution) | Not applicable | Federated computing scenarios where occasional data migration is needed |
For details on each mode, see Overview of the data transmission service.
Data reading
Data in MaxCompute is primarily read for analytics and queries within MaxCompute. Additional reading options include:
- Batch reading via data channels
- Querying through JDBC
- Reading via external tables based on the lakehouse solution, with results written to Object Storage Service (OSS)
Data lifecycle
A lifecycle defines how long a table or partition retains data before MaxCompute automatically reclaims it. The lifecycle clock resets each time data changes. Configure lifecycle rules to automate data expiration and reduce storage costs over time.
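The reset behavior can be made concrete with a small Python sketch. It models a lifecycle as a number of days counted from the last data change. The function `is_reclaimable` is hypothetical, and the day granularity and boundary handling are assumptions made for illustration; when MaxCompute actually reclaims expired data is determined by the service.

```python
from datetime import datetime, timedelta


def is_reclaimable(last_modified: datetime, lifecycle_days: int,
                   now: datetime) -> bool:
    """Return True if lifecycle_days have elapsed since the last data change.

    Hypothetical model for illustration; day granularity and the inclusive
    boundary are assumptions, not documented service behavior.
    """
    if lifecycle_days <= 0:
        raise ValueError("lifecycle_days must be positive")
    return now - last_modified >= timedelta(days=lifecycle_days)


lifecycle_days = 30
check_time = datetime(2024, 2, 5)

# Data last changed on January 1: 35 days have passed, so it has expired.
print(is_reclaimable(datetime(2024, 1, 1), lifecycle_days, check_time))   # True

# A write on January 20 resets the clock: only 16 days have passed.
print(is_reclaimable(datetime(2024, 1, 20), lifecycle_days, check_time))  # False
```

The second call shows why frequently updated tables may never expire under this model: each change moves the reference point forward.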