MaxCompute (formerly known as ODPS) is an enterprise-level cloud data warehouse that uses the software as a service (SaaS) model and is designed for data analysis scenarios. It provides a fast, fully managed online data warehousing service in a serverless architecture. MaxCompute removes the resource scalability and elasticity constraints of traditional data platforms, minimizes operations and maintenance (O&M) costs, and allows you to efficiently process and analyze large amounts of data at low cost.
As data collection techniques continue to diversify, enterprises in various industries accumulate terabytes, petabytes, or even exabytes of data. This rapid growth in data volume outpaces the processing capacity of traditional software. MaxCompute provides offline and streaming data access, supports large-scale data computing and query acceleration, and provides data warehousing solutions and analysis and modeling services for a variety of computing scenarios. MaxCompute also provides comprehensive data import solutions and various typical distributed computing models. It allows you to complete big data analytics without expertise in distributed computing or in maintaining distributed systems.
MaxCompute is suitable for scenarios in which more than 100 GB of data needs to be stored or computed. It can process up to exabytes of data and is widely used in Alibaba Group. Typical big data processing scenarios include data warehousing and business intelligence (BI) analysis for large Internet enterprises, website log analysis, e-commerce transaction analysis, and exploration of user characteristics and interests.
- DataWorks
DataWorks provides a variety of features for MaxCompute, such as end-to-end data synchronization, workflow design, data development, data management, and O&M.
- Machine Learning Platform for AI (PAI)
The algorithm components of PAI can be used to train models based on data in MaxCompute.
- Quick BI
Quick BI allows you to create reports for data in MaxCompute and analyze the data in a visualized manner.
For more information about the concepts, basic operations, and advanced operations of MaxCompute, see MaxCompute Learning Path.
Functions and features
|Fully managed online data warehousing service in a serverless architecture||
|High elasticity and extensibility||
|Centralized, rich computing and storage capabilities||
|Deep integration with DataWorks||Integrates with DataWorks, an end-to-end data development and data governance platform. DataWorks enables global data aggregation, processing, and governance. You can use DataWorks to manage MaxCompute projects and to write and edit query code in a web-based editor.|
|Integrated AI capability||
|Deep integration with a Spark engine||
For more information, see Lakehouse of MaxCompute.
|Streaming data collection and near real-time analysis||
|Continuous SaaS-based data protection in the cloud||Provides enterprises with more than 20 security features at three levels, including infrastructure, data center, network, power supply, and platform security capabilities, user permission management, and privacy protection. MaxCompute also provides the same security capabilities as open source big data services and managed databases.|
The following figure shows the architecture of MaxCompute.
|Compute engine||MaxCompute supports various compute engines. MaxCompute runs Spark jobs on the Cupid platform developed by Alibaba Cloud. The Cupid platform is fully compatible with the computing framework that is supported by open source YARN.|
|Data tunnels for computing models||MaxCompute supports various data tunnels, which can meet your requirements in different
|User interfaces||MaxCompute provides the following user interfaces:|
|Unified metadata and security systems||The Information Schema service of MaxCompute provides information such as project metadata and historical data. You can analyze job metrics such as the resource usage, job execution duration, and size of processed data to optimize jobs or plan resource capacity. MaxCompute also provides comprehensive security management systems, such as access control, data encryption, and dynamic data masking systems, to ensure data security. For more information about security, see Security features.|
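The job-metric analysis described for the Information Schema service can be sketched as follows. This is a minimal, self-contained illustration and not the MaxCompute API: the JobRecord type, its field names, and the sample values are all hypothetical, and the real Information Schema views use their own column names. The sketch assumes that historical job data has already been exported into plain Python records, then totals resource usage, execution duration, and processed data size per job to show which jobs are worth optimizing first.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    # Hypothetical fields; the real Information Schema views use their own column names.
    job_name: str
    duration_s: float   # job execution duration, in seconds
    cpu_core_s: float   # resource usage, expressed here as CPU core-seconds
    input_bytes: int    # size of processed data, in bytes


def summarize(records):
    """Total the run count, duration, CPU usage, and input size per job name."""
    totals = defaultdict(
        lambda: {"runs": 0, "duration_s": 0.0, "cpu_core_s": 0.0, "input_bytes": 0}
    )
    for r in records:
        t = totals[r.job_name]
        t["runs"] += 1
        t["duration_s"] += r.duration_s
        t["cpu_core_s"] += r.cpu_core_s
        t["input_bytes"] += r.input_bytes
    return dict(totals)


def heaviest_jobs(records, top_n=3):
    """Return job names ordered by total CPU usage, highest first."""
    summary = summarize(records)
    ranked = sorted(summary, key=lambda name: summary[name]["cpu_core_s"], reverse=True)
    return ranked[:top_n]


# Made-up sample history, for illustration only.
history = [
    JobRecord("daily_sales_rollup", 120.0, 960.0, 50 * 1024**3),
    JobRecord("daily_sales_rollup", 150.0, 1200.0, 60 * 1024**3),
    JobRecord("log_cleanup", 30.0, 60.0, 2 * 1024**3),
]

summary = summarize(history)
top = heaviest_jobs(history, top_n=1)
```

Ranking by total CPU usage is only one possible choice. Ranking by duration or by processed data size uses the same summary and may fit capacity planning better.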
- Helps you build a data warehouse that delivers high-performance storage and computing.
- Pre-integrates multiple services, which simplifies standard SQL development.
- Provides comprehensive management and security capabilities.
- Is O&M-free and supports the pay-as-you-go billing method. You are charged only for the resources that you use.
- High scalability to meet business requirements
Supports independent scaling of storage and computing capabilities. The dynamic scaling feature frees you from planning capacity in advance and meets the storage and computing requirements of rapid business growth.
- Various analysis scenarios
Uses an open, unified platform to meet business requirements in various scenarios, such as data warehousing, BI, near real-time analysis, data lake analysis, and machine learning.
- Open platform
  - Supports open interfaces and data ecosystems, which ensures flexible data migration, application migration, and custom software development.
  - Supports flexible combination with commercial or open source services, such as Airflow and Tableau, to build various data applications.
If you have questions or suggestions about MaxCompute, fill in the DingTalk group application form to join the DingTalk feedback group.