Data is the new corporate currency, but many businesses are failing to analyze and capitalize on petabytes of information because, quite frankly, they don’t know where to start.
This is where Alibaba Cloud MaxCompute can help. It is an AI-enabled big data processing platform to help enterprises unlock the immense value of their data.
The platform offers a combination of data intelligence services, mainly for batch storage and processing of structured data. It is inexpensive to use and can process 100 PB of data in six hours. That’s roughly the same amount of data as 100 million HD movies, or one-third of Facebook’s entire data warehouse.
Let’s look at an example. What do you do with all the data captured from your social media streams? With MaxCompute, you could upload every Facebook like or retweet in a matter of minutes, and then use its machine learning tools to gain insights into how the market responds to your promotions and products.
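To put those figures in perspective, here is a quick back-of-the-envelope check. It is only a sketch that assumes decimal units (1 PB = 10^15 bytes); the variable names are illustrative and nothing here comes from MaxCompute itself.

```python
# Rough sanity check of the quoted figures, assuming decimal units.
GB = 10**9
TB = 10**12
PB = 10**15

total_bytes = 100 * PB          # 100 PB of data
seconds = 6 * 60 * 60           # six hours

# Sustained throughput needed to process 100 PB in six hours.
throughput_tb_per_s = total_bytes / seconds / TB

# Size per movie implied by "100 million HD movies".
movie_size_gb = total_bytes / 100_000_000 / GB

print(f"Sustained throughput: {throughput_tb_per_s:.2f} TB/s")
print(f"Implied size per HD movie: {movie_size_gb:.0f} GB")
```

Under these assumptions the platform would need to sustain roughly 4.63 TB per second, and the movie comparison works out to about 1 GB per film.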
You could break down this information by campaign or date or even mine user characteristics and spending habits to further optimize and personalize your social media streams.
MaxCompute is an incredibly low-cost service. At just USD 1.44 to sort 1 TB of data, the platform set a new low-price record in the CloudSort category of the 2016 Sort Benchmark competition.
Project owners, data analysts and developers can all work concurrently in MaxCompute, so your data ecosystem can keep expanding. The platform also provides powerful security services and disaster recovery to protect your data.
A single MaxCompute cluster can scale up to 10,000 servers. Your data analysts do not need to design a distributed computing model themselves to overcome the limited processing capacity of a single server. MaxCompute handles the distribution for you, so you can analyze your data without worrying about the service requirements or the underlying model.
With usability and scalability at this level, MaxCompute is bringing big data analysis to the masses.
Alibaba Cloud launched MaxCompute in Mainland China and Singapore at the start of 2017. In China, the platform has already been used to help ease traffic congestion, diagnose diseases using medical imagery and predict the winner of a singing talent competition.
The MaxCompute service is now available in Hong Kong, Europe, and Australia through the Internet, a classic network or VPC. If you’re not located in those regions you can still connect to the service over the Internet.
MaxCompute is incredibly easy to learn as it is based on traditional SQL syntax and uses a Java programming interface. It uses a relational DBMS as its primary database model with a simple additional key-value store.
There are three core components: the MaxCompute proprietary TUNNEL function for data uploads and downloads; a combination of MaxCompute SQL, Google’s MapReduce data processing model and a graph function for computing and analysis; and an SDK toolkit for developers.
MaxCompute does not collect data; it only processes it. You can upload offline data into the system, or download offline data from MaxCompute, using the TUNNEL data channel. Data can only be uploaded and downloaded as tables.
MaxCompute SQL acts and feels just like a traditional piece of database software – but you are now querying and analyzing terabytes or petabytes of data. It supports the data definition language DDL so you can use the ALTER, CREATE and DROP commands to manage tables and partitions, as well as your traditional SELECT, JOIN, GROUP BY and WHERE clauses.
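To give a feel for the kind of statements described above, here is a minimal sketch that uses Python's built-in `sqlite3` module purely as a local stand-in. It is not MaxCompute and does not show MaxCompute-specific features such as partitions. The table names, column names and the `likes_per_campaign` helper are hypothetical, chosen to match the social media example.

```python
import sqlite3

def likes_per_campaign(campaigns, events, since):
    """Count 'like' events per campaign on or after the date `since`."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database

    # DDL: CREATE the tables, then ALTER one of them.
    conn.execute("CREATE TABLE campaigns (campaign_id INTEGER, name TEXT)")
    conn.execute(
        "CREATE TABLE events (campaign_id INTEGER, event_type TEXT, event_date TEXT)"
    )
    conn.executemany("INSERT INTO campaigns VALUES (?, ?)", campaigns)
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)
    conn.execute("ALTER TABLE events ADD COLUMN platform TEXT")

    # Query: SELECT with JOIN, WHERE and GROUP BY.
    query = """
        SELECT c.name, COUNT(*) AS likes
        FROM events AS e
        JOIN campaigns AS c ON c.campaign_id = e.campaign_id
        WHERE e.event_type = 'like' AND e.event_date >= ?
        GROUP BY c.name
        ORDER BY likes DESC, c.name
    """
    result = conn.execute(query, (since,)).fetchall()

    # DDL: DROP the table once it is no longer needed.
    conn.execute("DROP TABLE events")
    conn.close()
    return result

campaigns = [(1, "spring_sale"), (2, "new_launch")]
events = [
    (1, "like", "2017-03-01"),
    (1, "like", "2017-03-02"),
    (2, "like", "2017-03-02"),
    (2, "retweet", "2017-03-02"),
    (1, "like", "2017-02-01"),
]
print(likes_per_campaign(campaigns, events, "2017-03-01"))
```

The same shape of query, written in MaxCompute SQL, is how you might break down likes by campaign or date, though exact syntax and data types should be checked against the MaxCompute documentation.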
The MaxCompute SQL syntax is intuitive if you are familiar with standard database operations, though there are small differences. For example, MaxCompute SQL does not support transactions, indexes, or UPDATE/DELETE operations.
Using the MapReduce programming interface, you can process your data efficiently by splitting the input dataset into independent chunks. These can then be processed by the map tasks in a completely parallel manner. Graph is a framework for iterative graph computing and effective data modeling.
There are Eclipse plugins for developers, and DataHub services are also available so you can publish and subscribe to real-time data.
MaxCompute is a powerful tool. Going back to our social media example, imagine if you could optimize your products, prices and promotions for every user on the fly. Your conversion rates would skyrocket!
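The MapReduce idea described above can be sketched in a few lines of plain Python. This is an illustration of the general model only, not the MaxCompute MapReduce interface (which is Java-based). All function names and the sample records are hypothetical, and a thread pool stands in for the cluster's parallel map tasks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def split(records, chunk_size):
    """Split the input dataset into independent chunks."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]

def map_chunk(chunk):
    """Map task: emit a (campaign, 1) pair for every 'like' in one chunk."""
    return [(campaign, 1) for campaign, event in chunk if event == "like"]

def shuffle(mapped_chunks):
    """Group the emitted values by key across all map outputs."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_groups(groups):
    """Reduce task: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}

def count_likes(records, chunk_size=2):
    chunks = split(records, chunk_size)
    # Chunks do not depend on each other, so the map tasks can run in parallel.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_groups(shuffle(mapped))

records = [
    ("spring_sale", "like"),
    ("new_launch", "retweet"),
    ("spring_sale", "like"),
    ("new_launch", "like"),
    ("spring_sale", "retweet"),
]
print(count_likes(records))
```

Because each chunk is processed without reference to any other, adding more workers (or, on a real cluster, more servers) lets the map phase scale with the size of the input.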
But it doesn’t stop there. Alibaba Cloud DataWorks is a perfect companion to MaxCompute, helping you build a one-stop Big Data solution. DataWorks works straight out of the box, so you don’t need to worry about setting up complex underlying clusters or handling operations and management. Additionally, you can try Alibaba Cloud E-MapReduce to add Hadoop capabilities to your Big Data solution.