By Alibaba Cloud Storage Team
At FAST 2026, a leading international storage conference held recently, a new paper jointly published by Alibaba Cloud, Shanghai Jiao Tong University, and Solidigm once again won the Best Paper award. This is the third time in four years that Alibaba Cloud's storage research has received this award.
The paper gives a systematic overview of three generations of storage architecture and proposes Latte, a new local-cloud converged storage architecture. Drawing on Alibaba Cloud's engineering experience, it offers a new paradigm for breaking the "impossible triangle" of storage performance, cost, and reliability.

Below is an in-depth look at the Best Paper "Here, There and Everywhere: The Past, the Present and the Future of Local Storage in Cloud."
In the cloud-native era, local storage has long faced a trade-off between "extreme performance" and "operational usability." It must deliver microsecond-level latency while also providing multi-tenant data isolation, elastic O&M, and high availability. Drawing on Alibaba Cloud's large-scale production practice, the research team gives the first systematic account of the "three-generation evolutionary history" of local disk technology, from pure software optimization to software-hardware synergy. The team then proposes the new local-cloud converged storage architecture, Latte, and lays out a clear technical roadmap.

Figure | New Local-Cloud Converged Storage Architecture
The paper uses the "extraction" process of coffee to vividly illustrate the three-generation evolutionary history of storage technology: how local disk technology evolves from pure software optimization to software-hardware synergy.
The first-generation storage technology resembles Espresso. Alibaba Cloud pioneered the user-space polling architecture. It is fast and unleashes the potential of NVMe disks, but it sacrifices CPU efficiency, much like clearing out an entire kitchen just to drink a cup of coffee.
The second-generation technology resembles Double Espresso (Doppio). It introduces hardware assistance to improve isolation, but rigid hardware makes it difficult to adapt to the rapid iteration of SSDs. This is akin to buying a machine that can only brew specific beans, which fails to keep pace with rapid disk innovation.
Finally, the technology evolved to the third generation, akin to Ristretto. The third-generation software-hardware synergy design retains the high-speed hardware channel and uses a programmable "intelligent brain" to adapt flexibly to new disks. This allows storage to approach the physical limit of the disk in large-scale deployments while balancing speed, security, and future upgrades.
Building on this, the next-generation hybrid architecture proposed in the paper, Latte, combines the ultra-fast response of local storage with the elasticity of the cloud. Through intelligent scheduling, it achieves a long-tail latency prediction accuracy of 95.6% with less than 10% CPU overhead. Its caching strategy moves beyond traditional "first-in, first-out" (FIFO) caching, achieving an 80% read hit rate with almost no increase in write overhead. The elastic disaster recovery mechanism lets the local side absorb burst traffic while the cloud side silently acts as a backstop. Even if a server suddenly fails, the system can recover quickly and maintain service continuity.
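To make the "local cache plus cloud backstop" idea concrete, here is a minimal Python sketch. It is not the design described in the paper: the class name `LocalCloudStore`, the use of LRU eviction as a stand-in for a non-FIFO policy, the in-memory dictionary that plays the role of the cloud tier, and the synchronous write-through are all assumptions made for illustration.

```python
from collections import OrderedDict


class LocalCloudStore:
    """Toy two-tier store: a small local cache in front of a durable cloud tier.

    Hypothetical illustration only; LRU is a stand-in for a non-FIFO policy.
    """

    def __init__(self, local_capacity, cloud=None):
        self.local_capacity = local_capacity
        self.local = OrderedDict()  # block_id -> data, oldest access first
        # A plain dict stands in for the cloud tier (assumption).
        self.cloud = cloud if cloud is not None else {}
        self.hits = 0
        self.misses = 0

    def _admit(self, block_id, data):
        # Insert or refresh the block, then evict least recently used blocks.
        self.local[block_id] = data
        self.local.move_to_end(block_id)
        while len(self.local) > self.local_capacity:
            self.local.popitem(last=False)

    def write(self, block_id, data):
        # Simplification: write-through, so the cloud copy is always current.
        self._admit(block_id, data)
        self.cloud[block_id] = data

    def read(self, block_id):
        if block_id in self.local:
            self.hits += 1
            self.local.move_to_end(block_id)  # a read refreshes recency
            return self.local[block_id]
        self.misses += 1
        data = self.cloud[block_id]  # fall back to the cloud backstop
        self._admit(block_id, data)
        return data


store = LocalCloudStore(local_capacity=2)
for block_id, data in [("a", b"1"), ("b", b"2"), ("c", b"3")]:
    store.write(block_id, data)

hit_value = store.read("c")   # served from the local cache
miss_value = store.read("a")  # evicted locally, fetched from the cloud tier

# Simulate a server failure: the local cache is lost, the cloud tier survives.
recovered = LocalCloudStore(local_capacity=2, cloud=store.cloud)
recovered_value = recovered.read("b")
```

In this sketch, a read refreshes a block's position in the cache, which a pure FIFO queue would not do. A replacement node can still serve every block because the cloud tier holds a full copy.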

Figure | IO Latency Comparison Experiment Result
Now that LLM inference has become a mainstream workload, the value of the Latte architecture is even more significant. The architecture can serve as a high-performance, large-capacity, and cost-effective elastic cache layer, addressing the GPU starvation caused by data throughput bottlenecks. Through the pattern of "local absorption + intelligent offloading," Latte also significantly improves response speed and resource utilization. It provides a "data-on-demand" storage foundation for LLM inference, with faster responses, lower cost, and more flexible scale-out.
The path from Espresso to Latte is not only an evolution of storage forms but also a microcosm of how the underlying architecture of cloud computing is moving from "siloed resources" to "unified resource pooling." With this work, Alibaba Cloud shows the industry how to realize the technical benefits of software-hardware synergy and local-cloud convergence. It lays a more solid foundation for cloud-native databases, AI inference, and big data analytics, and moves global storage technology toward a new, intelligent stage.
The full name of FAST is the Conference on File and Storage Technologies. Founded in 2002, it is a top international conference on storage, jointly organized by the Advanced Computing Systems Association (USENIX) and the Association for Computing Machinery Special Interest Group on Operating Systems (ACM SIGOPS). It represents the highest international level in the field of computer storage. In the more than 20 years since its inception, FAST has advanced many storage-related technologies, such as software-hardware co-design, RAID, flash memory file systems, non-volatile memory technology, and distributed storage.