On site at the Alibaba Cloud Summit: what makes "one cloud with multiple cores" so strong?

"When Han Xin commands troops, the more the better." Today's cloud platforms can manage more and more nodes, but unified management of nodes across different CPU architectures remains an unsolved problem in the industry. The author has been attending the Alibaba Cloud Summit over the past few days and personally believes that "one cloud with multiple cores", released by Zhang Jianfeng, president of Alibaba Cloud Intelligence, is one of the most revolutionary technologies in cloud computing this year. It packages hardware, including dedicated chips, into standard computing power: whether the underlying layer is X86, ARM or RISC-V, customers receive standard, high-quality cloud computing services.

In the past, a cloud operating system could only shield the hardware details of CPUs of a single architecture. For example, different models of Intel X86 CPUs could form one cluster, and tenants using cloud services would not know which CPU model their instance ran on. But if a cloud platform had both X86 and ARM, the two had to be managed as two separate clouds. Previously, CISC processors represented by X86 focused on the server and cloud computing markets, while RISC processors represented by ARM and RISC-V focused on mobile and IoT terminals. Recently, however, the situation has changed.
RISC that opens new doors
After Nvidia launched its acquisition of ARM, it unveiled its first CPU chip, Grace, as scheduled at its product launch event in April. Because ARM uses a RISC-style reduced instruction set, ARM cores have natural advantages over X86 in areas such as instruction prediction, and they also consume less energy. These, of course, are ARM's traditional advantages over X86. Grace's biggest innovation is that it raises the communication speed between CPU and GPU by nearly 10 times. According to Huang Renxun, "This is the result of several years of research and development by 10,000 engineers, aiming to meet the computing needs of the world's most advanced applications, and its computing performance and throughput are unmatched by any previous architecture."
Judging from the latest AI trends, state-of-the-art models often demand enormous computing power. For example, GPT-3, which can write code automatically, has more than 100 billion parameters. PLUG, regarded as the Chinese counterpart of GPT-3, is also a very large model with tens of billions of parameters, and DALL.E, the GPT-3 variant that generates images from text descriptions across modalities, has 12 billion parameters. Quite a few scientists have pointed out that larger models tend to perform better and that scaling up may remain the route to better performance. In Huang Renxun's words at the launch event, "The number of parameters of large-scale pre-training models has increased by 3,000 times in three years. We estimate that models with 100 trillion parameters will appear in 2023." As models grow ever larger, ordinary startups will only be able to use the newest and best AI models through an AI cloud, so I personally think the technical route Nvidia has laid out with Grace is entirely correct.
The indispensable X86
However, the SGX secure computing instruction set in Intel's newly released third-generation Xeon Ice Lake chips is also hard to give up. For someone like me who has worked in banking for many years, multi-party secure computing is a future-oriented "black technology". Its application scenario can be illustrated with the millionaires' problem: two millionaires meet on the street and want to find out who is richer, but for privacy reasons neither wants the other to learn how much wealth they actually have. How can they determine who is richer without the help of a third party? In the 1980s, Yao Qizhi (Andrew Chi-Chih Yao), now an academician at Tsinghua University, proposed a solution to this problem, demonstrating at the theoretical level that multi-party secure computation is feasible; he later received the Turing Award.
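
To make the millionaires' problem concrete, here is a toy sketch in the spirit of Yao's original public-key protocol. It is an illustration under loud assumptions, not a secure implementation: wealth is mapped to a small integer level between 1 and `n_levels`, the RSA key is built from two fixed, publicly known primes, the separation check on Alice's values is omitted, and both parties are simulated inside one hypothetical function, `alice_at_least_as_rich`.

```python
import random

# Toy RSA key for Alice, built from two known Mersenne primes.
# Assumption for illustration only: a real deployment would use a vetted library.
P1, P2 = 2**61 - 1, 2**89 - 1
N_MOD = P1 * P2
E = 65537
D = pow(E, -1, (P1 - 1) * (P2 - 1))  # Alice's private exponent (Python 3.8+)

SMALL_PRIME = 2**31 - 1  # prime Alice uses to shrink the values she sends back


def alice_at_least_as_rich(i, j, n_levels=10, seed=None):
    """Return True if Alice's level i >= Bob's level j (both in 1..n_levels).

    Both roles are simulated here; in a real run each step would be a message
    between two machines, and neither side would ever see the other's level.
    """
    rng = random.Random(seed)

    # Bob: encrypt a random x with Alice's public key and hide j as an offset.
    x = rng.randrange(2, N_MOD)
    m = pow(x, E, N_MOD) - j + 1  # sent to Alice

    # Alice: decrypt every candidate offset; only the j-th yields x,
    # but Alice cannot tell which one that is.
    ys = [pow((m + u - 1) % N_MOD, D, N_MOD) for u in range(1, n_levels + 1)]
    zs = [y % SMALL_PRIME for y in ys]

    # Alice: leave entries up to her own level i untouched, add 1 to the rest.
    sent = [z if u <= i else (z + 1) % SMALL_PRIME
            for u, z in enumerate(zs, start=1)]

    # Bob: his own entry is untouched exactly when j <= i.
    return sent[j - 1] == x % SMALL_PRIME
```

Calling `alice_at_least_as_rich(7, 3)` simulates Alice at level 7 and Bob at level 3. The only thing Bob learns is the yes/no answer, which he would then share with Alice.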

In practice, however, multi-party secure computing still troubles the industry, especially the financial industry where the author works. Financial institutions hold a great deal of very valuable data, but the major banks have struggled to unlock its value. Even the Industrial and Commercial Bank of China, known as the "Universal Bank", has a market share of less than 10%, so no bank can train a particularly good model on its own data alone. Pooling the data of the major banks, on the other hand, carries the risk of leaking customers' privacy. How to perform calculations without letting the other participants see the real data, and so put Academician Yao Qizhi's solution into practice, has become a hard problem.
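
One simple way to see how several banks could compute a joint result without revealing their raw figures is additive secret sharing. The sketch below is only an illustration of that general idea. It is not Yao's protocol and not the method of any platform named in this article, and the function names and the modulus are assumptions chosen for the example.

```python
import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo this public number


def split_into_shares(value, n_parties):
    """Split a private value into n random-looking shares that sum to it."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def joint_sum(private_values):
    """Simulate n parties computing the sum of their private values.

    Row i of share_matrix holds the shares party i hands out; column k holds
    what party k receives. No single column reveals any party's input.
    """
    n = len(private_values)
    share_matrix = [split_into_shares(v, n) for v in private_values]

    # Each party publishes only the sum of the shares it received.
    partial_sums = [
        sum(share_matrix[i][k] for i in range(n)) % MODULUS
        for k in range(n)
    ]

    # Adding the published partial sums recovers the true total.
    return sum(partial_sums) % MODULUS
```

For example, `joint_sum([120, 45, 300])` returns 465, the correct total, even though every value exchanged along the way looks like a random number.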

For this classic problem, only a few platforms at present, such as GAIA CUBE from Blue Elephant Zhilian, can combine the data of multiple parties for joint computation and produce plaintext results without leaking the data, thereby separating data ownership from data usage rights. These platforms rely on blockchain-based software mechanisms to ensure security and credibility.

The SGX supported by third-generation Intel Xeon reassures users at the hardware level. Secure computing in effect adds a safe room to the computer. Even the most privileged administrator cannot enter the safe room, let alone deploy monitoring in front of it. All interactions between the safe room and the outside world must be encrypted and integrity-checked.

In fact, Intel introduced SGX several years ago, but at that time the protected memory space SGX could create was only 128 MB, while current machine learning models need hundreds of MB, or even tens or hundreds of GB. SGX simply could not hold such models and so could not be used for multi-party secure computing. This time, however, Ice Lake-SP supports up to 1 TB of secure space, an improvement that will greatly broaden SGX's application scenarios. For example, Tencent has teamed up with the Beijing Microchip Edge Computing and Blockchain Research Institute to combine blockchain with SGX, keeping data secure and making the final data "usable but invisible", thereby breaking down the data silos between institutions and unlocking the full value of the data.
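
A rough back-of-the-envelope check makes the gap between 128 MB and 1 TB tangible. The sketch below assumes each parameter is stored as a 32-bit float (4 bytes) and counts only the weights, ignoring activations, code and any protected memory the platform reserves for itself. The helper names are hypothetical.

```python
MB = 1024**2
GB = 1024**3
TB = 1024**4


def model_size_bytes(n_params, bytes_per_param=4):
    """Approximate memory needed for the weights alone (assumes 32-bit floats)."""
    return n_params * bytes_per_param


def fits_in_secure_memory(n_params, secure_bytes, bytes_per_param=4):
    """True if the weights fit in the given amount of protected memory."""
    return model_size_bytes(n_params, bytes_per_param) <= secure_bytes


# A 100-million-parameter model needs about 400 MB: too big for 128 MB.
small_model_fits_old = fits_in_secure_memory(100_000_000, 128 * MB)

# A 100-billion-parameter model needs about 400 GB: within 1 TB.
large_model_fits_new = fits_in_secure_memory(100_000_000_000, 1 * TB)
```

Under these assumptions `small_model_fits_old` is `False` and `large_model_fits_new` is `True`, which is why the jump in secure space matters for large models.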
One cloud with multiple cores unifies the field
The problem now is that end users often want both the AI inference capabilities of NVIDIA's ARM chips and the secure computing capabilities of X86. Under the previous approach, this required multi-cloud collaboration: the ARM cluster and the X86 cluster each needed their own supporting storage and network equipment for the whole cloud system to operate normally. This not only wastes resources but also brings the management problems of multi-cloud collaboration. Alibaba Cloud's latest "one cloud with multiple cores" solution instead uses a single cloud operating system to manage hardware server clusters of different architectures. Its biggest feature is that it standardizes the computing power of CPUs of different architectures, fundamentally solving the multi-cloud management problem caused by the coexistence of different CPU types. We can expect Alibaba Cloud to keep moving forward and truly standardize the computing power of different CPUs from the bottom layer of the cloud operating system, so that all requirements can be met within the cloud's standardized resource pool.
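
The idea of one resource pool spanning several architectures can be sketched as a tiny placement routine. This is purely a conceptual illustration and says nothing about how Alibaba Cloud actually implements the feature. The `Node` and `Workload` classes, the feature tags such as `"sgx"` and `"gpu"`, and the "standard compute unit" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    arch: str            # e.g. "x86", "arm", "riscv"
    features: frozenset  # hypothetical capability tags, e.g. {"sgx"}
    free_units: int      # capacity in hypothetical standardized compute units


@dataclass
class Workload:
    name: str
    units: int                      # standardized compute units requested
    needs: frozenset = frozenset()  # required capability tags, if any


def place(workload, pool):
    """Place a workload on the first node in the shared pool that satisfies it.

    The tenant asks only for units and capabilities, never for an architecture.
    Returns the chosen node name, or None if nothing fits.
    """
    for node in pool:
        if workload.needs <= node.features and node.free_units >= workload.units:
            node.free_units -= workload.units
            return node.name
    return None


def demo_pool():
    """Build a small mixed-architecture pool for experimentation."""
    return [
        Node("x86-1", "x86", frozenset({"sgx"}), free_units=8),
        Node("arm-1", "arm", frozenset({"gpu"}), free_units=16),
        Node("riscv-1", "riscv", frozenset(), free_units=4),
    ]
```

With `pool = demo_pool()`, a workload that needs `"sgx"` lands on `x86-1` and one that needs `"gpu"` lands on `arm-1`. Both draw from the same pool, so no second cloud is needed.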

