How Is Alibaba DAMO Academy Helping to Fight the Outbreak of the Novel Coronavirus?

In this blog, we'll take a look at Alibaba DAMO Academy's algorithm innovation that helped improve the diagnostic efficiency in the fight against the coronavirus outbreak.

Alibaba DAMO Academy is using AI algorithms to fight the outbreak of the novel coronavirus (COVID-19). On February 1, the Zhejiang Center for Disease Control and Prevention (Zhejiang CDC) launched an automated whole-genome detection and analysis platform. Using an AI algorithm developed by Alibaba DAMO Academy, the platform cuts the time required for genetic analysis of suspected cases of pneumonia caused by the novel coronavirus from several hours to 30 minutes. This greatly shortens diagnosis and allows virus mutations to be detected accurately.

Dr. Gu Fei, an algorithm expert from Alibaba DAMO Academy, performs genetic detection and analysis at the CDC.

The outbreak of the novel coronavirus remains a serious problem in China, and rapid, precise diagnosis is especially important for controlling the epidemic. According to official information, the coronavirus has one of the longest genetic sequences of any RNA virus, with a total length of 29,847 bp. In clinical diagnosis, patient samples must be compared against the genetic sequence of the virus.

Currently, hospitals generally use the nucleic acid detection method, which targets only part of the virus genome. As a result, if the virus mutates in the targeted region, this method may no longer be able to detect it.

Structure of the novel coronavirus

Unlike traditional nucleic acid detection methods, whole-genome detection technology can analyze and compare the entire genetic sequence of virus samples in suspected cases to effectively prevent misdiagnosis due to virus mutations. The platform jointly developed by Alibaba DAMO Academy and Matridx Biotechnology uses the whole-genome detection method. Its major advantage is its significant reduction of the detection time.
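To make the difference concrete, here is a minimal Python sketch of the two approaches. It is a toy illustration only: the sequences, the probe region, and the function names are invented for this example, the exact-match probe is a deliberately strict stand-in for a targeted test, and the comparison assumes the sample is already aligned to the reference with no insertions or deletions. It does not represent the platform's actual implementation.

```python
# Toy illustration only: sequences, probe, and function names are hypothetical.

def probe_detects(sample: str, probe: str) -> bool:
    """Targeted detection: positive only if the probe region matches exactly."""
    return probe in sample


def whole_genome_diff(reference: str, sample: str) -> list:
    """Whole-genome comparison: list every substitution as (position, ref, alt).

    Assumes equal-length, already-aligned sequences (no insertions/deletions).
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, r, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


reference = "ATGGCTAGCTTACGGATCCGTA"
probe = reference[8:16]                            # region a targeted test looks at
mutated = reference[:12] + "T" + reference[13:]    # C -> T inside the probe region

print(probe_detects(reference, probe))        # True
print(probe_detects(mutated, probe))          # False: the mutation hides the virus
print(whole_genome_diff(reference, mutated))  # [(12, 'C', 'T')]
```

In this toy case the targeted test turns negative after a single substitution, while the whole-genome comparison still recognizes the sample and also reports where the mutation is.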

In the genetic analysis stage, the system built by Alibaba DAMO Academy and the Alibaba Cloud Elastic Compute Service team also offers rapid virus genome splicing, allowing technicians to quickly and accurately capture mutated virus sequences, secondary structures, and three-dimensional structures. This information provides the foundation for developing vaccines and drugs.

Setting the parameters for genome detection and analysis

The improvement in diagnostic efficiency comes from algorithm innovation. Alibaba DAMO Academy performed feature analysis on the coronavirus genome and developed multiple algorithm models. In the sequence comparison process, it introduced a distributed algorithm design to improve comparison efficiency. In the virus genome splicing stage, it uses a distributed de Bruijn graph algorithm to accurately detect virus variants.
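For readers unfamiliar with de Bruijn graphs, the sketch below shows the basic idea on a single machine: reads are broken into k-mers, each k-mer becomes an edge between its (k-1)-mer prefix and suffix, and a path that uses every edge once spells out the assembled sequence. This is a textbook simplification, not DAMO Academy's distributed implementation. The function names, the toy genome, and the choice of k are assumptions, and the example relies on error-free reads with no repeated k-mers.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:            # keep one edge per distinct k-mer
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return graph


def assemble(graph):
    """Walk an Eulerian path (Hierholzer's algorithm) and spell out the contig."""
    in_deg = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            in_deg[node] += 1
    # Start at a node with one more outgoing than incoming edge, if there is one.
    start = next(
        (n for n in graph if len(graph[n]) - in_deg[n] == 1),
        next(iter(graph)),
    )
    edges = {n: list(v) for n, v in graph.items()}   # working copy we can consume
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if edges.get(node):
            stack.append(edges[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    return path[0] + "".join(node[-1] for node in path[1:])


genome = "ATGGCGTGCAATC"                                       # hypothetical toy genome
reads = [genome[i:i + 7] for i in range(len(genome) - 6)]     # overlapping 7-base reads
contig = assemble(build_de_bruijn(reads, k=4))
print(contig == genome)   # True
```

A distributed version would typically partition the k-mers across machines and merge the resulting subgraphs, which this single-process sketch does not attempt.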

Dr. Sun Yi, director of genetic sequencing at the Zhejiang CDC, said, "Based on Alibaba Cloud's powerful computing capabilities and the new algorithms supplied by Alibaba DAMO Academy, this platform can support virus analysis. In the future, it will be able to cover the entire range of confirmed cases in a short period of time, laying a solid foundation for subsequent vaccine and drug development."

We spoke with the algorithm experts from Alibaba DAMO Academy, who described the platform as follows:

What Are the Features of This Platform?

This automated whole-genome detection and analysis platform is a high-throughput sequencing platform jointly developed by the Zhejiang CDC, Alibaba DAMO Academy, and Matridx Biotechnology. It provides the Zhejiang CDC with fully automated library creation and distributed computing analysis capabilities for the mitigation and control of the novel coronavirus outbreak. Matridx Biotechnology developed a component for building fully automated high-throughput sequencing libraries, allowing technicians to do what used to take 12 hours in just two hours. Each sequencing process generates a large amount of data. Alibaba DAMO Academy uses distributed analysis algorithms to reduce the time required for sample genome analysis from several hours to half an hour. The distributed algorithms also cut the time required for virus genome splicing from between 30 minutes and 1 hour to only 15 to 30 minutes. In addition, unlike traditional nucleic acid detection methods, this platform examines the entire genome of the virus, which avoids false negatives caused by virus mutation.
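The general pattern behind this kind of speedup is to split the sequencing reads into chunks, analyze the chunks in parallel, and merge the partial results. The sketch below illustrates that pattern with a simple k-mer lookup against a reference. Everything in it is an assumption made for illustration: the function names, the toy sequences, the matching rule, and the use of a local thread pool as a stand-in for separate machines. In CPython, threads will not actually speed up CPU-bound work like this, so the example shows the structure of the computation rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def kmer_index(reference, k):
    """Set of all k-mers in the reference, used for quick membership tests."""
    return {reference[i:i + k] for i in range(len(reference) - k + 1)}


def count_matching(reads, index, k):
    """Count reads that share at least one k-mer with the reference."""
    return sum(
        any(read[i:i + k] in index for i in range(len(read) - k + 1))
        for read in reads
    )


def distributed_count(reads, reference, k=5, workers=4):
    """Split reads into chunks, score each chunk in a worker, then merge."""
    index = kmer_index(reference, k)
    chunks = [reads[i::workers] for i in range(workers)]   # round-robin partition
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda chunk: count_matching(chunk, index, k), chunks))


reference = "ATGGCGTGCAATCGGATTACA"   # hypothetical toy reference
reads = ["GGCGTGC", "TTTTTTT", "CAATCGG", "ACACACA", "GATTACA"]
print(distributed_count(reads, reference))   # 3
```

Because each chunk is scored independently, the merged count is the same as a sequential pass over all reads, which is what makes this step easy to spread across many workers.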

What Is the Value of the Algorithms Provided by Alibaba DAMO Academy?

Alibaba DAMO Academy analyzed the features of the novel coronavirus genome, then optimized and trained the algorithms on data from common datasets, such as the Protein Data Bank (PDB). Virus detection and virus mutation analysis are based on open-source algorithms, and we designed distributed algorithms to accelerate the analysis process. After virus genome splicing, we apply a model trained using the BiLSTM + DNN method to predict the secondary structures of viral proteins. At the same time, Alibaba DAMO Academy is researching models for predicting protein three-dimensional structures and for drug screening, both based on virus genome sequencing.

While continuing to wage war against the worldwide outbreak, Alibaba Cloud will play its part and will do all it can to help others in their battles with the coronavirus. Learn how we can support your business continuity at https://www.alibabacloud.com/campaign/supports-your-business-anytime


