A New Practice: Yitian 710 Instances Power ABACUS
01 AI for Science explores a new paradigm for materials research and development
First, let's talk about how AI for Science defines a new paradigm for materials R&D. Materials innovation drives progress in drug design, new energy, and other fields. The paradigm of materials R&D has shifted from traditional trial and error, based on large numbers of repeated experiments, to a computation-driven process in which candidate materials are first screened by theoretical simulation and then verified experimentally.
However, in theoretical simulation, the curse of dimensionality hinders high-precision calculation. To borrow Paul Dirac's words: "The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble."
To cope with the curse of dimensionality, scientists abstract the most fundamental and accurate physical model layer by layer, and choose the model appropriate to each system so that results can be obtained in a reasonable time. However, physical models at different scales differ by many orders of magnitude in time and length, and the differences in the accuracy of their results are not small.
AI for Science aims to solve these problems: through machine learning, features learned from high-precision methods can be applied to larger systems, combining high-precision results with an efficient time to solution.
Atomic-scale molecular simulation methods still face challenges. Traditional molecular dynamics requires scientists to supply empirical force-field parameters, and developing a potential function takes a long time.
For density functional theory (DFT), the code bases of DFT software are very complex and development cycles are long. DFT algorithms rely on approximations to the exchange-correlation functional, and the more accurate the approximation, the greater the computational cost.
The Deep Potential method is a machine-learning-based molecular dynamics method. It combines scientific computing, machine learning, and high-performance computing.
The figure on the left shows the training workflow of the Deep Potential method. DFT is used to compute reference data for the interatomic potential, a neural network learns the potential function, and the learned potential is then applied in molecular dynamics, achieving calculations that are both efficient and highly accurate.
The results in the middle of the figure above compare the radial distribution function calculated by the Deep Potential method with that from traditional DFT; the two agree very well.
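The radial distribution function used in this comparison can be computed from atomic positions by histogramming pair distances and normalizing by the ideal-gas expectation. The following is a minimal sketch, assuming a cubic periodic box; the function name `radial_distribution` and the random test configuration are illustrative and are not taken from DeePMD-kit or ABACUS.

```python
import numpy as np

def radial_distribution(positions, box_length, n_bins=50, r_max=None):
    """Return (bin centers, g(r)) for particles in a cubic periodic box."""
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    if r_max is None:
        r_max = box_length / 2.0  # minimum-image distances are valid up to L/2
    # Pairwise displacement vectors with the minimum-image convention.
    delta = positions[:, None, :] - positions[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.sqrt((delta ** 2).sum(axis=-1))
    # Count each pair once (upper triangle, diagonal excluded).
    pair_dist = dist[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(pair_dist, bins=n_bins, range=(0.0, r_max))
    centers = 0.5 * (edges[:-1] + edges[1:])
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    # Expected pair count per shell for uncorrelated particles at the same density.
    ideal_counts = 0.5 * n * (n - 1) * shell_volume / box_length ** 3
    return centers, counts / ideal_counts

# Illustrative input: uniformly random positions, for which g(r) should be close to 1.
rng = np.random.default_rng(0)
r, g = radial_distribution(rng.uniform(0.0, 10.0, size=(500, 3)), box_length=10.0)
```

For a real comparison, the same function would be applied to trajectory frames from both methods and the resulting curves averaged over frames.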
DFT computes the properties of matter directly from quantum mechanics, by approximately solving the Schrödinger equation for the electrons.
Given the cell parameters and other information about a crystal, a DFT calculation can yield basic physical properties such as conductivity and density. It is a first-principles method that requires almost no empirical parameters. Walter Kohn shared the 1998 Nobel Prize in Chemistry for developing it.
As shown in the formula above, the core of density functional theory is to express the total energy E of the system as a functional of the electron density ρ.
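For readers without the figure, the standard Kohn-Sham decomposition of the total energy is given here in atomic units. This is the textbook form and is only assumed to match the formula the article refers to.

```latex
E[\rho] = T_s[\rho]
        + \int v_{\mathrm{ext}}(\mathbf{r})\,\rho(\mathbf{r})\,\mathrm{d}\mathbf{r}
        + \frac{1}{2}\iint \frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'
        + E_{\mathrm{xc}}[\rho]
```

Here T_s is the kinetic energy of the non-interacting reference system, v_ext is the external (nuclear) potential, the double integral is the Hartree energy, and E_xc is the exchange-correlation functional, the only term that has to be approximated.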
The lower left corner shows the Jacob's ladder of DFT, which starts from the simplest local density approximation and, rung by rung, approaches the results of exact quantum mechanical methods at increasing computational cost. The DeePKS method uses a neural network model to represent the difference between a high-precision method and a low-precision method. Researchers first run a low-precision functional, which gives results relatively efficiently, and then add the correction given by DeePKS, so that the results approximate those of a high-precision functional. The figure on the right shows the oxygen-oxygen radial distribution function of water: the results from DeePKS and from high-precision DFT agree well, whereas the results from the PBE functional deviate substantially.
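The idea of learning only the difference between a cheap and an expensive method can be illustrated with a toy one-dimensional example. This is a sketch of the general delta-learning idea, not of DeePKS itself: the two energy functions are hypothetical, and a polynomial fit stands in for the neural network.

```python
import numpy as np

def cheap_energy(x):
    # Hypothetical low-precision model: a harmonic approximation.
    return 0.5 * x ** 2

def expensive_energy(x):
    # Hypothetical high-precision reference: harmonic plus anharmonic terms.
    return 0.5 * x ** 2 + 0.1 * x ** 3 + 0.05 * x ** 4

# A few training points labeled with the expensive method.
x_train = np.linspace(-1.0, 1.0, 9)
delta_train = expensive_energy(x_train) - cheap_energy(x_train)

# Fit the correction only; a degree-4 polynomial stands in for the neural network.
delta_model = np.polynomial.Polynomial.fit(x_train, delta_train, deg=4)

def corrected_energy(x):
    # Cheap result plus learned correction approximates the expensive result.
    return cheap_energy(x) + delta_model(x)

x_test = np.linspace(-1.0, 1.0, 101)
err_cheap = np.max(np.abs(cheap_energy(x_test) - expensive_energy(x_test)))
err_corrected = np.max(np.abs(corrected_energy(x_test) - expensive_energy(x_test)))
```

The design point is that the correction is usually smoother and smaller than the full energy, so it can be learned from fewer expensive reference calculations.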
The AI + materials science paradigm relies on DFT software to generate training data for AI models. The model being trained in turn affects the results of the DFT software, and this process must be iterated until convergence, which requires a large number of DFT calculations.
ABACUS (Chinese name: "Atomic Computing") is an open-source density functional theory package developed in China. It was first developed by the research group of Professor He Lixin at the University of Science and Technology of China, and joined the DeepModeling open-source community in March 2021.
As scientific computing software, ABACUS has moved beyond the code development model of a traditional research group by hosting its code on GitHub. Open-source contributors are welcome to develop new features and fix bugs together.
After joining the DeepModeling community, ABACUS welcomed contributors from institutions including the University of Science and Technology of China, Peking University, the Institute of Physics, and the AI for Science Institute, Beijing (AISI).
It is worth mentioning that AISI, established in 2021 by Academician E Weinan, is the first scientific research institution with AI for Science as its mission.
02 Adaptation and optimization for Yitian 710
Next, we introduce how ABACUS was migrated to the Yitian 710 cloud platform. Yitian 710 instances run Alibaba Cloud Linux 3, which supports the existing Linux software ecosystem well. When installing software, users can download the required dependencies directly from the package manager and do not need to compile them manually.
The Yitian 710 chip is based on the Armv9 architecture and supports instruction-set extensions such as SVE and the INT8 and BF16 matrix-multiply instructions. In addition, the Arm platform provides a high-performance math library, which lets researchers focus on algorithm development without worrying about how matrix operations are implemented.
The figure above compares ABACUS performance on its own test cases. The rightmost column is the calculation time on Yitian 710, which is on par with that of the 7th-generation x86 instances and the 6th-generation high-frequency instances.
Note that in this test the Yitian 710 instance used the 4xlarge specification, only half the specification of the x86 instances. This result is attributable to Yitian 710's independent physical cores, independent caches, and ALU performance, with no hyper-threading overhead.
03 Verification at the 10,000-core scale on EHPC
The researchers carried out a compute-capacity verification close to real workloads on Alibaba Cloud EHPC. The Elastic High Performance Computing (EHPC) service provided by the Alibaba Cloud team creates compute nodes from ECS virtual machine images, enabling elastic scaling of cluster resources, and lets computations run efficiently in the cloud with the same operating experience as on a local supercomputer.
When creating ECS instances, EHPC can use spot (preemptible) instances, so that scientific computing tasks that are not time-sensitive can run at a very low price during periods of low cloud resource usage.
Using the newly developed stochastic DFT method, ABACUS simulated a system of 32 boron atoms at an extremely high temperature of 350 eV on a cluster of Yitian 710 instances.
As shown in the figure above, the researchers used 11,008 Yitian 710 CPU cores, that is, 86 nodes with 128 cores each. This is a weak-scaling test: the amount of computation per core is fixed, so the total consumption of computing resources grows linearly with the number of cores.
Under the two task-partitioning modes implemented in ABACUS, orbital parallelism and k-point parallelism, the computing time increases only within a limited range, so parallel performance is very good. In terms of accuracy, the energies and pressures computed with different core counts are consistent, and the results are correct.
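In a weak-scaling test, efficiency is commonly reported as the ratio of the baseline wall time to the wall time at a larger core count, with work per core held fixed. The sketch below shows that arithmetic; the wall times in `timings` are hypothetical placeholders, not the measurements from this experiment.

```python
def weak_scaling_efficiency(t_base, t_scaled):
    """Return t_base / t_scaled; 1.0 means wall time did not grow with scale."""
    return t_base / t_scaled

# Core count reported in the article: 86 nodes with 128 cores each.
nodes, cores_per_node = 86, 128
total_cores = nodes * cores_per_node

# HYPOTHETICAL wall times in seconds, keyed by core count (placeholders only).
timings = {128: 100.0, 1024: 104.0, total_cores: 110.0}

base_time = timings[128]
efficiency = {
    cores: weak_scaling_efficiency(base_time, wall_time)
    for cores, wall_time in timings.items()
}
```

An efficiency that stays close to 1.0 as the core count grows corresponds to the "computing time increases only within a limited range" behavior described above.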
Each data point shown is computed from runs initialized with ten different random seeds, to limit the influence of random error. As the amount of computation increases, the standard deviations of pressure and energy steadily converge toward the theoretical optimum. In practice, researchers can choose the core count according to the accuracy required by the downstream task.
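The way the seed-to-seed spread shrinks as more stochastic samples are used can be illustrated with a stochastic trace estimator, the kind of statistical estimate on which stochastic methods rely. This is a toy sketch under stated assumptions: the random symmetric matrix is a hypothetical stand-in for a physical operator, and the code is not ABACUS's stochastic DFT implementation.

```python
import numpy as np

def stochastic_trace(matrix, n_vectors, seed):
    """Hutchinson estimate of trace(matrix) using random +/-1 vectors."""
    rng = np.random.default_rng(seed)
    z = rng.choice([-1.0, 1.0], size=(matrix.shape[0], n_vectors))
    # Each column z_k contributes one sample z_k^T A z_k; average over columns.
    return np.sum(z * (matrix @ z)) / n_vectors

# Hypothetical symmetric test matrix.
rng = np.random.default_rng(42)
a = rng.normal(size=(200, 200))
matrix = (a + a.T) / 2.0
exact = np.trace(matrix)

# For each sample count, repeat the estimate with ten seeds and record the spread.
spread = {}
for n_vectors in (16, 256, 4096):
    estimates = np.array(
        [stochastic_trace(matrix, n_vectors, seed) for seed in range(10)]
    )
    spread[n_vectors] = estimates.std(ddof=1)
```

The standard deviation across seeds is expected to fall roughly as the inverse square root of the number of samples, which is why more computation buys a tighter error bar.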
This experiment verified the following advantages of Yitian ECS instances:
First, Yitian 710 has a stable clock frequency: it does not throttle under compute-intensive scientific workloads, so performance output remains consistent.
Second, Yitian 710 instances scale very well, achieving near-linear speedup at the scale of 10,000 cores.
Finally, compared with x86, Yitian instances are highly cost-effective, and the research institute can save nearly 70% of the cost.
Yitian helps us reduce costs and increase efficiency when moving traditional scientific computing tasks from local supercomputers to the cloud.