Apsara Conference 2022 | Panjiu Server Release Review

Date: Oct 1, 2022

Abstract: On the morning of the 20th, at the main technology forum, Zhou Ming, head of Alibaba Cloud infrastructure, explained "infrastructure born of the cloud" for the first time from three perspectives: stability and security, self-developed innovation, and a green future.

On October 19th, at the 2021 Apsara Conference, Zhang Jianfeng, President of Alibaba Cloud Intelligence, described a new world on the cloud in a keynote titled "Deep Cloud, New World". He believes that computing will continue to migrate to the cloud. On the morning of the 20th, at the main technology forum, Zhou Ming, head of Alibaba Cloud infrastructure, explained "infrastructure born of the cloud" for the first time from three perspectives: stability and security, self-developed innovation, and a green future. As an important component of Alibaba Cloud's infrastructure, the self-developed Panjiu server is one of the most comprehensive demonstrations of Alibaba Cloud's infrastructure engineering capabilities. It takes both technical security and resource security into account, responds to the national green and low-carbon strategy, and incorporates cutting-edge basic technologies from DAMO Academy. The goal is to build a continuously leading server infrastructure that gives customers a range of more technically competitive cloud services.

Two major releases



01:Self-developed Panjiu server family



Recommendation: For the next-generation cloud-native architecture, Alibaba Cloud launches its self-developed "Panjiu" server family

At the main forum on the 19th, the self-developed Panjiu server family based on the Fangsheng architecture was officially unveiled (Alibaba Cloud Server). It includes three series: Panjiu high-performance computing, Panjiu high-performance storage and Panjiu high-capacity storage. Alibaba Cloud's server hardware has always pursued technological and architectural innovation and self-development, from chips and components to complete machines. Facing the extreme performance requirements of Alibaba Cloud's basic products and Alibaba's world-class application scenarios, it has consistently integrated and optimized software and hardware, pursued usability (performance + energy saving) and extreme stability, and gradually established a new generation of cloud-native server hardware architecture standards in the industry - the Fangsheng Open Architecture.

Product performance highlights

● First, Panjiu servers can achieve clear performance advantages in multiple technical scenarios. In general-purpose computing, they support various computing architectures such as X86, ARM, and RISC-V, and in e-commerce scenarios the SPEC performance can be improved by up to 30%. In heterogeneous computing, they support a variety of heterogeneous computing components such as GPGPU/xPU/FPGA, and through intelligent computing optimization the multimedia processing performance can be improved by more than 10 times. In storage, they support tiered high-performance storage across DRAM/PCM/SSD/HDD. In scenarios where computing and storage hardware are separated and pooled, the large-capacity storage model can support 160T SSD and 10PB+ HDD in a single machine.

● Second, relying on the Fangsheng open architecture, Panjiu servers use one unified motherboard for air-cooled and liquid-cooled models, and one unified motherboard for compute and storage models, through flexible modularization and generalized design. This modular design ultimately reduces the cooling energy consumption of the overall server model by 76% compared to the industry average, and increases final server delivery efficiency by 50%.

● Third, Alibaba's world-class business scenarios and customer demands have driven Alibaba Cloud Panjiu servers to pursue extreme performance. Alibaba Cloud began exploring the integration and optimization of software and hardware very early, to support hardware acceleration of computing and storage applications as well as hardware acceleration of network protocols. Hardware offload capabilities such as Shenlong MOC, AliFPGA and AliFIC are industry-leading. In addition, Panjiu servers support full-link chip-level hardware encryption.

● Fourth, Panjiu servers draw on intelligent big-data operations across Alibaba Cloud's existing fleet of millions of servers. They support sustainable design of server devices and natively support memory isolation and QoS guarantees, and their overall level of intelligent prediction, diagnosis and repair of server hardware failures is higher than the industry average.

In the future, the Panjiu self-developed server family will continue to evolve based on the needs of customers' business scenarios, especially the business needs of Alibaba Cloud's basic products and various industry solutions, relying on innovative computing, storage and interconnection technologies.


02:A new generation of high-density GPU servers and liquid cooling cluster solutions



Recommendation: The industry's highest density: Alibaba Cloud releases a new generation of immersion liquid-cooled GPU server cluster solutions

On the morning of the 20th, Alibaba Cloud and NVIDIA released an immersion liquid-cooled GPU server cluster solution, which aims to address the rising energy consumption caused by the continuous performance growth of high-performance computing and artificial intelligence. It can support eight 500W NVIDIA A100 GPU chips, interconnected at high speed through NVSwitch, in an industry-standard 2U space. Industry-leading full immersion liquid cooling improves heat dissipation efficiency and supports high-density deployment of GPU chips running stably in high-performance mode over the long term, so that computing potential can be unleashed to the fullest. Whether in emerging fields such as deep learning, VR or 3D rendering, or in traditional industries such as financial analysis, science and technology, and genetic engineering, customers will obtain AI heterogeneous computing power that is more powerful, more flexible and easier to use.

Product performance highlights

● First, high density. Traditional heterogeneous computing servers usually need 4U or even more space to hold 8 GPGPU cards, while Panjiu high-performance heterogeneous computing servers fit them in 2U. A single cabinet can reach a power density of 80 kW or even 100 kW, achieving high computing power through high density.

● Second, high energy efficiency. GPU chips and boards dedicated to liquid cooling have been custom-developed to match the heat dissipation characteristics of immersion liquid cooling and the hardware advantages of Fangsheng architecture servers. Compared with traditional air-cooled models, liquid-cooled GPU servers deliver substantially higher overall computing power and better results across various indicators, with GPU performance improved by more than 20%.

● Third, high reliability. In a traditional air-cooled environment, high energy consumption raises the operating temperature, which makes chip components prone to damage or instability. Even though the mainstream chip manufacturing process has evolved to 7nm, the power consumption of mainstream GPUs has soared with the rapid increase in the number of cores. The liquid-cooled solution can support long-term stable operation of GPU chips in high-performance mode, and chip reliability is improved by more than 50% compared with traditional servers.
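
As a rough illustration of the density figures above, the following sketch estimates how many 2U, 8-GPU servers fit within a cabinet power budget. The GPU count, the 500W per-GPU power, the 2U height and the 80 kW / 100 kW cabinet figures come from the announcement. The 30% allowance for non-GPU components (CPUs, memory, network cards) and the function names are assumptions made only for this example, not published specifications.

```python
# Back-of-the-envelope cabinet density estimate (illustrative only).
GPUS_PER_SERVER = 8          # from the announcement
GPU_POWER_W = 500            # from the announcement, per A100 GPU
SERVER_HEIGHT_U = 2          # from the announcement
NON_GPU_OVERHEAD_PCT = 30    # ASSUMPTION: non-GPU components add 30% to GPU power


def server_power_w(overhead_pct: int = NON_GPU_OVERHEAD_PCT) -> int:
    """Estimated power of one server: GPU power plus an assumed overhead."""
    gpu_power = GPUS_PER_SERVER * GPU_POWER_W
    return gpu_power * (100 + overhead_pct) // 100


def servers_per_cabinet(cabinet_power_w: int,
                        overhead_pct: int = NON_GPU_OVERHEAD_PCT) -> int:
    """Whole servers that fit within the cabinet power budget."""
    return cabinet_power_w // server_power_w(overhead_pct)


if __name__ == "__main__":
    for budget_w in (80_000, 100_000):
        count = servers_per_cabinet(budget_w)
        print(f"{budget_w // 1000} kW cabinet: {count} servers, "
              f"{count * GPUS_PER_SERVER} GPUs, {count * SERVER_HEIGHT_U}U")
```

Under these assumptions the power budget, rather than rack space, becomes the limiting factor, which is why the cabinet power density matters as much as the 2U form factor.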

Two sub-forums



01:Cloud Server Technology Frontier Exploration Forum



Servers have long provided abundant computing power, massive storage and high-speed networking for cloud computing, and the related industry technologies have continued to evolve and innovate. On the morning of the 20th, in the sub-forum "Exploring the Frontiers of Cloud Server Technology", industry experts from Alibaba Cloud, the benchmarking organization MLCommons, the CXL Alliance, China Merchants Bank and others gathered to discuss topics of current industry concern: heterogeneous computing, secure computing, immersion liquid cooling and cloud server architecture design.

The participants believed that with the popularization of cloud computing and the rise of cloud native, the hardware and software architecture of cloud servers in data centers is being continuously reconstructed and evolved. Architecture designs that are modular and unified across air and liquid cooling will continue to innovate to support the technological iteration of chips and components. Work on memory pooling and on pooling AI heterogeneous computing power is also searching for the best application scenarios for future cloud-native applications. In addition, the combination of immersion liquid cooling, heterogeneous computing standardization and extreme optimization will greatly improve the energy efficiency of server infrastructure and bring more cost-effective services to customers.

02:One cloud with multiple cores - ARM Cloud Ecology Forum

As a new force, the ARM architecture has gradually been adopted in many business fields, but it also faces many challenges. Among them, the ecosystem is the key to large-scale adoption of ARM. This sub-forum invited experts and scholars from industry and academia, including Zhejiang University, ARM China, Linaro, and Alibaba Cloud internal teams, to share best practices for the ARM software and hardware ecosystem in the fields of OS, JVM, compilers, and core products.
