MiniMax is a global AI foundation model company. Founded in early 2022, MiniMax is committed to advancing the frontiers of AI towards AGI, guided by its mission, "Intelligence with Everyone."
MiniMax's proprietary multimodal models, led by MiniMax M2.1, Hailuo2.3, Speech 2.6 and Music 2.0, offer advanced coding capability, strong agentic performance, and ultra-long context processing. They can understand, generate, and integrate a wide range of modalities, including text, audio, images, video, and music. These models power MiniMax's major AI-native products, including MiniMax Agent, Hailuo AI, MiniMax Audio, Talkie, and the enterprise- and developer-facing Open API Platform, which collectively deliver intelligent, dynamic experiences to enhance productivity and quality of life for users worldwide.
To date, MiniMax's proprietary models and AI-native products have cumulatively served more than 212 million individual users in over 200 countries and regions, and more than 130,000 enterprises and developers in over 100 countries and regions.
Starting in 2024, MiniMax's products, including Hailuo AI, MiniMax Audio, and Talkie, grew rapidly both domestically and internationally. User data volume surged as a result, quickly reaching tens of petabytes (PBs), which posed significant technical challenges for building a robust data platform.
Alibaba Cloud helped MiniMax build a globally unified, cloud-native data warehouse architecture. Centered around Alibaba Cloud’s DataWorks—a one-stop data development and governance platform—this solution enables seamless integration of heterogeneous data sources, unified stream-batch processing, real-time/offline data collaboration, and end-to-end data lifecycle management.
Aggregates diverse, heterogeneous storage systems, covering OLTP databases, unstructured data, and real-time streaming data.
Object Storage Service (OSS) serves as the cold data tier, seamlessly integrated with MaxCompute to enable intelligent hot/cold data tiering, optimizing the balance between cost and performance.
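The hot/cold tiering idea can be illustrated with a minimal Python sketch. It assumes a simple age-based rule: a partition stays in the hot tier if it was accessed within a fixed window, and is otherwise a candidate for the cold tier. The function names, the 30-day window, and the partition names are hypothetical and chosen only for illustration. They do not describe MaxCompute's or OSS's actual tiering logic or API.

```python
from datetime import datetime, timedelta


def assign_tier(last_access, now, hot_window_days=30):
    """Return "hot" if the partition was accessed within the window, else "cold".

    The 30-day default is an assumed threshold, not a documented value.
    """
    return "hot" if now - last_access <= timedelta(days=hot_window_days) else "cold"


def plan_tiering(partitions, now, hot_window_days=30):
    """Group partition names by target tier.

    `partitions` maps a (hypothetical) partition name to its last access time.
    """
    plan = {"hot": [], "cold": []}
    for name, last_access in partitions.items():
        plan[assign_tier(last_access, now, hot_window_days)].append(name)
    return plan


if __name__ == "__main__":
    now = datetime(2025, 1, 31)
    partitions = {
        "ds=20250130": datetime(2025, 1, 30),  # recently read, stays hot
        "ds=20240101": datetime(2024, 1, 1),   # untouched for a year, goes cold
    }
    print(plan_tiering(partitions, now))
```

In practice a tiering decision would likely weigh more than last access time, such as query frequency and storage price; the single threshold here only shows the shape of the policy.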

Leveraging Alibaba Cloud’s cloud-native data warehouse solution, MiniMax established a unified global data warehouse technology stack. Powered by high performance, low latency, and serverless elasticity, this infrastructure provides efficient and stable support for critical business scenarios such as operational analytics and user growth.
Through DataWorks’ visual ETL capabilities, MiniMax achieved real-time full and incremental data synchronization from source systems directly into Hologres. By utilizing cross-engine data federation between MaxCompute and Hologres, the company decoupled real-time storage from offline computation. As a result, key data now lands in the warehouse approximately one hour earlier, significantly improving the timeliness of business decisions.
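As a rough illustration of what "full and incremental synchronization" means, the Python sketch below loads a full snapshot into a keyed table and then replays change events in order. It is a conceptual model only: the event format (`op` and `row` fields), the function names, and the in-memory dictionary are assumptions made for this example, not the DataWorks or Hologres interface.

```python
def apply_full_snapshot(rows, key="id"):
    """Build the initial target table from a full snapshot, keyed by primary key."""
    return {row[key]: dict(row) for row in rows}


def apply_incremental(table, changes, key="id"):
    """Replay change events in arrival order on top of the snapshot.

    Each change is a dict with an "op" ("insert", "update", or "delete")
    and a "row" that carries at least the primary key.
    """
    for change in changes:
        op, row = change["op"], change["row"]
        if op in ("insert", "update"):
            # Upsert: the latest version of a row wins.
            table[row[key]] = dict(row)
        elif op == "delete":
            table.pop(row[key], None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return table


if __name__ == "__main__":
    snapshot = [{"id": 1, "plan": "free"}, {"id": 2, "plan": "pro"}]
    changes = [
        {"op": "update", "row": {"id": 1, "plan": "pro"}},
        {"op": "insert", "row": {"id": 3, "plan": "free"}},
        {"op": "delete", "row": {"id": 2}},
    ]
    table = apply_incremental(apply_full_snapshot(snapshot), changes)
    print(table)
```

Treating inserts and updates as the same upsert keeps replay idempotent, so re-delivering an event after a retry does not corrupt the target table.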
A globally consistent tech stack—built on Alibaba Cloud’s serverless, storage-compute decoupled architecture—dramatically reduced operational complexity and enhanced team delivery velocity.
The integrated big data platform, comprising DataWorks, MaxCompute, and Hologres, enables unified management across development, scheduling, operations, and governance. It currently handles tens of PBs of total data, with daily processing volumes reaching hundreds of TBs.
Through techniques like storage-compute separation and operator-level optimizations, MiniMax reduced compute resource consumption by 50%. Further refinements brought overall compute usage down by 75%. Additionally, implementing data lifecycle management policies lowered storage costs by 40%, achieving an optimal balance between performance and cost.
In the era of rapid advancement in large language models (LLMs), the deep integration of data and artificial intelligence has become essential for enterprises seeking competitive advantage. LLM training continuously drives innovation in large-scale data processing technologies, demanding greater elasticity, higher-performance preprocessing operators, and unified data governance frameworks.
Building on MiniMax’s extensive experience with Alibaba Cloud’s cloud-native data warehouse solution, both parties are jointly exploring next-generation solutions that further fuse large-scale data processing with AI. By leveraging Alibaba Cloud’s MaxFrame—a next-generation distributed computing framework—they aim to enhance data processing efficiency and accelerate the practical deployment of AI innovations.
Model training cycles are fast-paced, often requiring temporary access to massive elastic resources for short-duration, high-efficiency preprocessing of PB-scale datasets, followed by immediate resource release. Traditional architectures struggle to simultaneously meet demands for elasticity, processing speed, and cost control.
Common issues during data preprocessing—such as file size limits, out-of-memory (OOM) errors, and failed full-dataset MinHash deduplication tasks—led to low job success rates and poor stability, severely impacting overall pipeline efficiency.
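For readers unfamiliar with MinHash deduplication, the following is a minimal single-machine Python sketch of the technique. It builds a fixed-length signature per document from character shingles and estimates Jaccard similarity as the fraction of matching signature positions. The shingle size, the number of hash functions, and all function names are illustrative assumptions. This is not MiniMax's pipeline, which runs at PB scale and would need distributed execution and bucketing (for example, locality-sensitive hashing) rather than pairwise comparison.

```python
import hashlib


def shingles(text, k=5):
    """Return the set of k-character shingles of the lowercased,
    whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    if len(normalized) <= k:
        return {normalized}
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def _seeded_hash(seed, item):
    """64-bit hash of `item`; the seed selects one of the hash functions."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(text, num_perm=64, k=5):
    """For each seeded hash function, keep the minimum hash over all shingles."""
    items = shingles(text, k)
    return [min(_seeded_hash(seed, s) for s in items) for seed in range(num_perm)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of positions where two equal-length signatures agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    a = minhash_signature("The quick brown fox jumps over the lazy dog")
    b = minhash_signature("The quick brown fox jumped over the lazy dog")
    print(f"estimated similarity: {estimated_jaccard(a, b):.2f}")
```

Because each signature has a fixed length regardless of document size, memory per document is bounded, which is one reason MinHash is a common choice for large-scale near-duplicate detection.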
The original workflow relied on Python scripts for development, debugging, and production execution, lacking visual tools for task development, management, scheduling, and operations. This made it difficult to evaluate the impact of multi-parameter iterations and hampered developer productivity.
Custom data preprocessing pipelines (e.g., for Common Crawl datasets) demanded significant engineering effort for development and maintenance, diverting talent away from core AI innovation.
MiniMax built a fully managed, one-stop Data + AI data processing platform on Alibaba Cloud’s MaxCompute, powered by the MaxFrame distributed computing framework. This solution delivers unified management and elastic, large-scale preprocessing capabilities for diverse data types—including structured, unstructured, and multimodal data.

Key features include:

By adopting the MaxFrame distributed computing framework, MiniMax achieved significant improvements in resource utilization, processing efficiency, and platform architecture:
Through deep technical collaboration with Alibaba Cloud, MiniMax has successfully built a highly efficient, cost-effective, cloud-native Data + AI integrated data processing platform centered on a modern data warehouse—effectively addressing the challenges of rapid business iteration and elastic scalability in the age of large models.
This solution not only delivers substantial gains in data processing performance and significant reductions in operational costs but also establishes a widely reusable engineering paradigm for AI application development driven by large models.
Looking ahead, MiniMax and Alibaba Cloud will continue to deepen their joint innovation in frontier areas such as large-model data preprocessing and multimodal data processing, working together to advance the large-scale industrial adoption of Data + AI technologies worldwide.