Comprehensive Upgrade of the Open Source Big Data Platform EMR

On December 27, Alibaba Cloud officially released EMR 2.0, its cloud native open source big data platform. At the same cost, the upgraded platform improves scale-out performance by up to 6 times.

Alibaba Cloud EMR 2.0 reportedly offers users an improved product experience across a new platform, development tools, resource forms, and analysis scenarios. With upgraded operation and maintenance capabilities, including the EMR Doctor health check, comprehensive service inspection and event notification, and node fault compensation, operation and maintenance costs are estimated to fall by 20% to 30%. The new platform is intended to help customers quickly build an open source big data platform that is cost-effective, secure, reliable, and compatible with the open source ecosystem.

Comparison of elastic scaling speed between EMR 2.0 and EMR 1.0

Under the cloud native trend, open source big data is being restructured. The open source big data ecosystem, once centered on Hadoop, has begun to evolve into one in which diverse technologies develop in parallel. He Yuan, head of Alibaba Cloud EMR products, said that Alibaba Cloud EMR began serving Alibaba Group's internal customers in 2009. In 2016, Alibaba Cloud turned that accumulated technical capability into a product and began offering it to customers commercially. As a leading product in the open source big data field, EMR 2.0 redefines the new generation of open source big data platforms by using cloud native capabilities to rebuild the platform layer, data layer, and computing layer. It is designed to meet the multi-scenario needs of thousands of customers, covering stream processing, data visualization, interactive analysis, and data lakes, and to give customers a new generation of open source big data infrastructure.

EMR 2.0 product architecture diagram

Based on the EMR 2.0 platform, customers can manage big data clusters and develop applications at lower cost, more efficiently, and more intelligently. With preemptible instances, costs measured in production can be reduced by more than 80% in the best case. With automatic compensation of faulty instances enabled, stability across all cluster scenarios can be improved by one nine.

The newly released EMR Doctor uses the cluster daily report feature of its health check service to check whether resources are being wasted in the cluster. It identifies the jobs that waste the most resources through a top-N ranking in reverse order of task score, so that those jobs can be optimized. Through continuous optimization, it helps customers maximize resource utilization and avoid waste. It can also help customers detect and address some risks in advance.

EMR Studio provides Notebook and Workflow services. The fully managed Notebook is compatible with users' existing Jupyter habits and connects seamlessly to EMR compute and storage engines for interactive big data development and debugging. Jobs that have been developed and debugged can then be added to a Workflow for scheduling and release to production. In addition, EMR Studio's Workflow service also supports Flink and other job types.
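As an illustration only, the top-N ranking of jobs by task score that the article attributes to EMR Doctor might look like the following Python sketch. The `JobReport` type, its `score` field, the sample job names, and the assumption that a lower score means more wasted resources are all hypothetical; none of them are taken from the EMR Doctor API.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class JobReport:
    """Hypothetical per-job entry in a daily cluster report."""
    job_id: str
    score: float  # assumed 0-100 task score; lower = more waste (assumption)


def most_wasteful_jobs(reports, n):
    """Return the n lowest-scoring jobs, lowest score first."""
    return heapq.nsmallest(n, reports, key=lambda r: r.score)


# Made-up sample data for illustration.
reports = [
    JobReport("etl_daily", 82.0),
    JobReport("ad_hoc_join", 35.5),
    JobReport("stream_agg", 61.0),
    JobReport("backfill", 12.0),
]

worst = most_wasteful_jobs(reports, 2)
print([r.job_id for r in worst])  # ['backfill', 'ad_hoc_join']
```

Using `heapq.nsmallest` avoids sorting the full report when only a handful of the worst jobs are needed; for a small report, `sorted(reports, key=...)[:n]` would work equally well.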

In June 2022, a cloud native data lake solution built by Alibaba Cloud EMR together with OSS, DLF, DataWorks, and other products passed the evaluation and certification of the ICT Academy, becoming the first and only solution in China to receive full marks. The solution provides users with comprehensive data lake capabilities: "fully managed lake storage, comprehensive lake acceleration, unified lake management, multimodal lake computing, and intelligent lake governance". (Alibaba Cloud's cloud native data lake products were among the first batch in China to pass the evaluation and certification of the ICT Academy)

Huiliang Technology, a well-known advertising and marketing service provider in China, has used EMR for 4 years. As its business grew rapidly, Huiliang Technology faced a growing number of challenges, such as complex data sources, large data volumes, many data dimensions, and second-level data freshness requirements for real-time operations. After this upgrade, data synchronization and query efficiency on the big data platforms behind its material platform, thermal engine, and other businesses improved several times over, and system stability improved significantly. The earlier episodes of high CPU, memory, and I/O load have not recurred.

With the release of EMR 2.0, Alibaba Cloud is turning its leading technical strengths into cloud-based product and service capabilities. The redefined new generation of EMR aims to provide a solid foundation for customers in all industries to build open source big data platforms.
