Nowadays, artificial intelligence (AI) technology is widely applied and indispensable to various industries. How to use AI to boost development is a question that every enterprise is thinking about. In this article, Hua Xiansheng analyzes the development of AI based on how AI is applied in different industries and predicts its future development. Hua Xiansheng is the Director of the AI Center and the City Brain Lab of Alibaba DAMO Academy, and a Fellow of the Institute of Electrical and Electronics Engineers (IEEE).
The development of AI has undergone three peaks. The first peak was the initial rise of AI in the 1950s. The second peak was the AI expert system developed in the 1980s. Now, we predict that a third peak is coming. The first two peaks did not transform society as drastically as expected. Will the third one make a difference? The third peak has four distinctive characteristics:
As a powerful supporting tool for AI, deep learning provides solutions to many unsolved problems.
The improvement of cloud computing and chip computing power greatly increases the data processing capability of computers.
The massive amount of data accumulated in various industries provides the precondition for creating value for AI applications.
The successful application of search engines, e-commerce recommendations, and facial recognition payment provides references for the development of AI.
Rapid Growth of the AI Industry
The AI industry has developed rapidly since around 2012. In 2017 and earlier, more than 8,000 AI startups emerged around the world, and the number of active AI startups increased 14-fold. In 2017, 167 Chinese AI enterprises received a total investment of more than USD 5 billion. Currently, most enterprises have set up their own AI laboratories to assist in developing their businesses.
Predicament of AI
Although AI is developing rapidly, many problems still exist in the practical application of the technology, such as revenue difficulties, large differences in data, difficulty in realizing core value, and high user expectations.
The use of AI to generate revenue is the top challenge for AI enterprises. According to industry statistics for 2018, more than 90% of AI enterprises are losing money. AI projects have the disadvantages of requiring high investment but offering a low return on investment (ROI) because enterprises have to invest heavily in project customization and development. As the technology gap between enterprises is narrowing, the returns on developing general AI products are also decreasing.
The second predicament of AI implementation is the huge difference between experimental data and actual data. Due to the huge differences between public data sets and real enterprise data, results in real enterprise scenarios are often unsatisfactory. Enterprises cannot correctly estimate the effect of the application of these technologies, greatly reducing the confidence of enterprises. For example, in labs, developers' code has high accuracy on Labeled Faces in the Wild (LFW), a famous facial recognition dataset. However, due to the great differences between scenarios, the results are often unsatisfactory when such code runs in actual scenarios.
Another example is person re-identification (Person Re-ID), where the difference between the public test dataset and the actual application is even greater. Compared with the data in the public dataset, people in real scenarios often wear different clothes, ride different vehicles, and perform different actions. Such changes make identification much harder for the algorithm and greatly reduce its accuracy.
Available Technologies vs. User Demands
The third predicament of AI is the existence of large differences between the technologies that are available and the real demands of users. Enterprise users have high expectations for AI and often hope to solve most of their business problems by using AI. However, AI is only a good solution for some types of business problems. It cannot meet all business demands.
Generally, successful AI technology creates core values for enterprises. The values of AI technology can be divided into the following categories.
If an enterprise invests a lot of manpower and material resources into improving existing applications, it often cannot get the expected return, resulting in the waste of resources.
In other cases, the technology developed by an enterprise creates irreplaceable value for the industry and provides good solutions for some business problems.
A new AI technology brings new demands and new businesses. The large screen of mobile phones is a good example. Mobile phones were designed for communication, but as the technology developed, mobile phones allowed users to watch videos, browse websites, and make video calls through larger screens. Larger screens have become an indispensable function of mobile phones.
Large-Scale AI Applications
Based on the large-scale AI application, Alibaba promotes its AI research and development by solving problems in actual application scenarios. The following figure shows some of the AI applications of Alibaba. This article describes AI based on some of these applications.
Presently, visual search technology is widely used throughout many fields, such as general search, commodity search, city search, and raw material search. The following example uses visual search in the e-commerce field to describe the key technologies of visual search.
The e-commerce visual search process can be divided into six steps: category classification, subject detection, feature extraction, retrieval and indexing, sorting, and result presentation. First, the system identifies the category of the products in the image. Then, it marks the target products in the image through subject detection and converts the pixels of the image into computable features. The product image search engine compares these features with the feature data stored in the index, sorts the returned results by similarity, and finally displays the re-ranked products.
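The six steps can be sketched end to end as a toy pipeline. This is only an illustration under loud assumptions: the `Product` type, the `category_hint` and `box` fields, and every stage function are hypothetical stand-ins, and the "feature extractor" is a row-brightness average rather than the deep learning model a production system would use.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Product:
    product_id: str
    category: str
    features: list  # feature vector precomputed when the product was indexed


def classify_category(image: dict) -> str:
    # Step 1 (stand-in): a real system would run a classifier on the pixels.
    return image["category_hint"]


def detect_subject(image: dict) -> list:
    # Step 2 (stand-in): crop the pixel grid to the box around the main product.
    top, left, bottom, right = image["box"]
    return [row[left:right] for row in image["pixels"][top:bottom]]


def extract_features(crop: list) -> list:
    # Step 3 (stand-in): mean brightness per row instead of a learned embedding.
    return [sum(row) / len(row) for row in crop]


def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def visual_search(image: dict, index: list, top_k: int = 3) -> list:
    category = classify_category(image)
    query = extract_features(detect_subject(image))
    # Step 4: retrieve candidates from the index, limited to the predicted category.
    candidates = [p for p in index if p.category == category]
    # Step 5: sort candidates by similarity to the query features.
    ranked = sorted(
        candidates,
        key=lambda p: cosine_similarity(query, p.features),
        reverse=True,
    )
    # Step 6: present the best matches.
    return [p.product_id for p in ranked[:top_k]]
```

Calling `visual_search` with an image dictionary and a list of `Product` entries returns product IDs ordered from most to least similar within the predicted category.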
Feature learning is a key step in visual search. The feature learning method based on deep learning extracts the features of products from images and converts images into comparable vectors. Deep learning improves the effect of feature learning and allows users to design network structures that push neural networks toward learning well-separated image features. This greatly improves the accuracy and recall rate of the search algorithm.
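The article does not name the training objective used. As one widely used illustration of how a network can be pushed toward comparable vectors, the sketch below computes a triplet margin loss on plain Python lists. The function names and the default margin are assumptions made for this example only.

```python
from math import sqrt


def euclidean_distance(a: list, b: list) -> float:
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor: list, positive: list, negative: list,
                        margin: float = 0.2) -> float:
    """Loss for one (anchor, positive, negative) triple of feature vectors.

    The loss is zero once the positive (same product) is closer to the anchor
    than the negative (different product) by at least `margin`.
    """
    d_pos = euclidean_distance(anchor, positive)
    d_neg = euclidean_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

Minimizing a loss of this kind over many triples pulls images of the same product together and pushes different products apart, which is what makes the resulting vectors directly comparable at search time.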
Indexing and the search system are two other challenges in visual search. Searching over vectors is difficult, so a common method is to transform the vector features of the image into data that can be indexed. In the search phase, enterprises build a search system to process users' search requests. The search system distributes the index data to multiple servers for storage, sends each search request to these servers for processing, and then merges and sorts all of the search results.
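The distribute, search, and merge flow described above can be sketched as a scatter-gather query. In this minimal sketch each "server" is just an in-memory dictionary, threads stand in for network calls, and each shard is scanned by brute force rather than through a real vector index. The names `search_shard` and `distributed_search` are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def dot(a: list, b: list) -> float:
    return sum(x * y for x, y in zip(a, b))


def search_shard(shard: dict, query: list, k: int) -> list:
    """Return the k best (score, product_id) pairs held by one shard."""
    scored = ((dot(query, vector), product_id) for product_id, vector in shard.items())
    return heapq.nlargest(k, scored)


def distributed_search(shards: list, query: list, k: int = 3) -> list:
    # Scatter: send the same query to every shard in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partial_results = list(pool.map(lambda s: search_shard(s, query, k), shards))
    # Gather: merge the per-shard top-k lists and keep the global top-k.
    merged = heapq.nlargest(k, (hit for part in partial_results for hit in part))
    return [product_id for _, product_id in merged]
```

Because every shard returns its own top `k`, the merged list is guaranteed to contain the global top `k`, so the coordinator never needs to see the full index.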
The pailitao feature of the Taobao app uses the visual search technology to recognize products in users' images and retrieve them. The following figure shows the effect of pailitao. Here, the products retrieved by the system are basically the same as or similar to the products in the images uploaded by the user. This feature reduces the time for users to manually search for products, greatly improving the user's shopping experience.
Visual creation technology applies algorithms to create visual data, including images, three-dimensional (3D) graphics, and videos. This technology converts users' ideas into visual content by combining visual analysis, search, and visual creation engines. The following are examples of the application of visual creation in different scenarios.
Alibaba has used visual creation technology to turn product pages on Taobao into videos in seconds. The system automatically analyzes the image and text on the product page and generates a video based on the analysis results. By converting a static product page to a dynamic video, it can increase the click-through rate (CTR) and the conversion rate of the product, and reduce the video production cost.
Visual creation can also be used to repair videos with low resolution. Repairing 1.5 hours of film manually would take 40 days, but it would only take 3 hours to repair it with AI technology. In the "Classic HD Films" module launched by Alibaba and Youku, more than 1,000 classic films are automatically repaired in a very short period through video repair and enhancement. The following figure shows the repair effect of the TV series Soldiers Sortie. The repaired series was shown on a movie theater screen and was highly praised by the audience.
Through video creation, enterprises can embed their advertisements at the appropriate positions in the video. The technology for embedding video content uses algorithms to analyze video scenarios, places advertisements on horizontal/vertical planes or curved surfaces, and renders the advertisements seamlessly into the scenarios. This way, the enterprise advertising is naturally embedded into the video, without destroying the video content or increasing the video duration, avoiding negative reactions from viewers. The following figure shows the effect of video content embedding.
Luban is another application for visual creation. As a smart platform developed by Alibaba, Luban can automatically design print advertisements for users. It integrates users' copywriting, images, and desired advertising styles to automatically generate advertising images or posters that meet users' needs. Novice users can use Luban to generate 8,000 banners per second. During the Double 11 Global Shopping Festival in 2017, Alibaba Group used Luban to generate 410 million banners, which increased the influence of Double 11 and helped enterprises substantially reduce publicity costs.
The development of Luban was a valuable experience. The manual generation of a large number of advertising banners in such a short period would be a difficult task. Using automatic design technology makes this kind of scenario possible and is indispensable in big promotion scenarios. Enterprises have realized that AI application should focus on core demands first. AI promotes business innovation and business innovation improves AI.
Visual diagnosis is another important AI application. It is divided into two types: diagnostics for people (medical imaging technology) and diagnostics for products or machines (industrial visual technology, especially quality inspection technology.) The following are the technologies and applications from Alibaba Group for visual diagnosis.
The healthcare AI team of Alibaba is committed to making medical analysis and health management more efficient, inclusive, and cost-effective by using AI technologies. Statistics show that healthy behavior accounts for as much as 50% of the factors that determine our health, yet most of our health expenses are spent on medical services. Intensive care units (ICUs) are often the places where people spend the most on health. To protect people's health more effectively, the Alibaba medical team built an intelligent health management platform by collecting human auditory, visual, perceptual, and text data. The platform integrates and analyzes vital signs and health data to provide warnings for users with high-risk diseases such as diabetes, hyperlipidemia, and cardiovascular diseases, and analyzes the health data of users each day. It helps users keep track of their physical condition in real time and make adjustments at any time. It encourages users to maintain healthy behaviors to protect their health.
Through deep learning and 3D image detection, the CT angiography (CTA) image analysis technology accurately segments and labels the coronary arteries, and identifies stenosis areas and small lesion plaques in the coronary arteries. Together, accurate imaging and deep learning make it possible to detect small lesions inside the patient.
In orthopedics, AI is applied to extract the spine structure and to precisely segment and measure the vertebral bodies and intervertebral discs. Algorithms can help doctors with diagnosis and treatment and can differentiate degenerative diseases at a fine-grained level. This greatly improves the efficiency of diagnosis.
The following figure shows an example of the application of smart orthopedic technology in hip and knee joint surgery. By using the algorithm, the system can automatically mark the position, angle, and length of feature points in the joint, providing reliable references for doctors during surgery.
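Once feature points have been located, the angle and length measurements reduce to plane geometry. The sketch below assumes landmarks are already available as 2D pixel coordinates (locating them is the hard part and is not shown); `segment_length` and `angle_at` are hypothetical helper names, not part of any Alibaba system.

```python
from math import acos, degrees, hypot


def segment_length(p: tuple, q: tuple) -> float:
    """Distance between two landmarks, in the same units as the coordinates."""
    return hypot(q[0] - p[0], q[1] - p[1])


def angle_at(vertex: tuple, p: tuple, q: tuple) -> float:
    """Angle in degrees between the rays vertex->p and vertex->q.

    The three points must be distinct; a zero-length ray has no direction.
    """
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (hypot(*v1) * hypot(*v2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return degrees(acos(max(-1.0, min(1.0, cos_theta))))
```

Converting pixel distances to physical lengths would additionally require the pixel spacing recorded with the image, which is omitted here.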
On the health search platform built by Alibaba, doctors can find information and medical images similar to the current cases on the platform. By referring to historical treatment records and treatment experience, they can better diagnose patients and formulate more reasonable solutions for patients.
Alibaba builds a knowledge graph based on healthcare data and stores the knowledge in databases that are accessible by Tmall Genie. Users can call information from databases through Tmall Genie. Tmall Genie provides health improvement solutions based on the knowledge graph. Tmall Genie can also automatically analyze and manage the health status of users.
Traditional medical culture emphasizes intervening in health before people get sick. The future health management solution of Alibaba helps users prevent health problems before they occur through cognition, judgment, decision-making, and learning, and combines with the AI medical technology to protect people's health.
The Alibaba AI medical team successfully launched a CT image analysis system for the coronavirus disease (COVID-19) on February 16. The system analyzes and outputs data, such as the probability of COVID-19 and the percentage of the diseased area of the lungs within 20 seconds, with a prediction accuracy as high as 96%. It has been installed in more than 160 designated hospitals. With more than 290,000 calls (based on most recent data), it reliably supports the rapid diagnosis of COVID-19.
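One of the reported outputs, the percentage of the lungs that is diseased, can be illustrated with binary masks. The sketch assumes a segmentation model has already produced a lung mask and a lesion mask of the same shape. It uses a single 2D slice for brevity, whereas a real CT study is a 3D volume, and the function name is hypothetical.

```python
def diseased_lung_percentage(lung_mask: list, lesion_mask: list) -> float:
    """Share of lung pixels that are also marked as lesion, in percent.

    Both masks are equally sized 2D grids of 0/1 values. Lesion pixels that
    fall outside the lung mask are ignored.
    """
    lung_pixels = 0
    diseased_pixels = 0
    for lung_row, lesion_row in zip(lung_mask, lesion_mask):
        for in_lung, in_lesion in zip(lung_row, lesion_row):
            if in_lung:
                lung_pixels += 1
                if in_lesion:
                    diseased_pixels += 1
    if lung_pixels == 0:
        raise ValueError("lung mask is empty")
    return 100.0 * diseased_pixels / lung_pixels
```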
The complete genome sequencing and analysis technology designed by Alibaba for COVID-19 can complete the comparison of all genomes within 10 minutes. The algorithm covers up to 95% of complete genomes, delivering diagnosis accuracy close to 100%. The technology reduces the complete genome sequencing time for COVID-19 from two or three days to 14 hours.
Industrial Visual Diagnosis
Industrial visual diagnosis is widely used in product quality inspection and fault diagnosis in the manufacturing processes of cell plates, textiles, and large machinery. It aims to save manpower, improve the yield of products, and increase the accuracy and stability of the equipment. The following are examples of the application of industrial diagnosis in various industries.
Manual detection of solar cell plate defects takes a long time and fails to detect all defects. To solve these problems, Alibaba designed a solar cell plate detection system. This system can detect all cell plates and use the AI technology in analysis, improving the detection efficiency for enterprises 36-fold. The following figure compares the accuracy, speed, and granularity of identification between manual detection and AI-based detection. AI-based detection is superior to manual detection in all these aspects.
The industrial visual detection technology is widely used in transmission line inspection, food quality inspection, and other industry scenarios, and has had satisfactory results.
The AI pig breeding model created by Alibaba uses AI technology to monitor the physical condition of each pig in real-time. The AI technology also supports features such as remote counting, behavior analysis, feed monitoring, and health alerting.
The city brain system designed by Alibaba Group aims to integrate accumulated data in a city, use AI technology to analyze collected data, and propose optimization solutions for the city. The system uses AI and computing power to analyze city data and provide a data-backed basis for urban governance and services, making urban governance intelligent, efficient, cost-effective, and convenient. The system has made breakthroughs in the urban governance model, service model, and industrial development.
The following figure shows the structure of the city brain system. The city brain system collects various types of data, such as video data, Global Positioning System (GPS) data, and microwave data, and analyzes the video data to gain a preliminary understanding of it. It analyzes and processes the generated cognitive information by using AI algorithms, and provides optimization solutions, such as traffic-light optimization, bus optimization, and accident-event alerting. The system automatically searches and mines data by integrating urban elements into the search engine. As a result, the system can search for suspicious vehicles, detect traffic patterns, and locate congestion causes at the same time. The system can also predict various data, such as traffic flow and traffic accident probability, according to factors such as current traffic conditions, weather, and events, and perform interventions based on the predictions.
Currently, Alibaba City Brain has been deployed in over 60 projects in more than 30 cities and urban areas. The City Brain AI Open Innovation Platform developed by Alibaba assists in the R&D and deployment of more than 10 research institutions and third-party manufacturers. The six groups of products of City Brain have been widely used in many important fields, such as transportation, security, and municipal management.
The City Brain AI Open Innovation Platform has five advantages: comprehensive functions, flexibility, high real-time performance, efficient operation, and high openness. It can provide safe and reliable support for development and research teams at the AI platform level. The large-scale video analysis and processing acceleration technology provided by the platform enables a single server to process more than 100 videos at the same time, greatly increasing the efficiency of video data processing.
The all-weather incident detection feature of City Brain automatically arranges the incidents detected in cities in near real-time on the dashboard and continuously updates the relevant data. Based on the incident type, the system automatically handles the incidents or promptly notifies the traffic police.
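The type-based handling described above amounts to a routing rule applied to each detected incident. The following is a minimal sketch: the incident types, the set of automatically handled types, and the action names are all hypothetical, since the article does not list them.

```python
from dataclasses import dataclass

# Hypothetical examples of incident types a system could resolve without people.
AUTO_HANDLED_TYPES = {"congestion", "signal_fault"}


@dataclass(frozen=True)
class Incident:
    incident_id: str
    incident_type: str
    location: str


def route_incident(incident: Incident) -> str:
    """Decide whether an incident is handled automatically or escalated."""
    if incident.incident_type in AUTO_HANDLED_TYPES:
        return "auto_handle"
    return "notify_traffic_police"


def build_dashboard(incidents: list) -> list:
    """Produce one dashboard row per incident, in detection order."""
    return [
        {
            "id": incident.incident_id,
            "type": incident.incident_type,
            "location": incident.location,
            "action": route_incident(incident),
        }
        for incident in incidents
    ]
```

Unknown incident types fall through to the traffic police, so a new category is escalated to a person by default rather than silently handled.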
Through the traffic perception and traffic signal optimization provided by Alibaba City Brain, the traffic efficiency in Hangzhou is increased by 15.3%. The system can report 20,000 incidents per day with 96% accuracy.
City Brain reduces the travel time of special vehicles, such as police cars, ambulances, and fire engines, by adjusting traffic lights along their routes and optimizing road traffic.
City Brain provides additional features, such as vehicle inspection, risky driving behavior detection, traffic flow prediction, municipal administration, and security detection. Through these features, City Brain supports the development of the city.
The 3D urban reconstruction and 4D inference features provided by City Brain can display the real-time city status and the running status of the city at different time points through a 3D sandbox.
During the design and implementation of City Brain, the Alibaba AI team created irreplaceable value by using the AI technology, explored application scenarios, optimized product features, and generated core competitiveness for the products. Finally, the team transformed the product into a platform and then into an ecosystem, providing strong technical support for urban governance and management.
Despite the limitations of today's AI methodologies, AI still has room for development because many problems in various industries have not yet been solved. Traditional and digital industries should embrace AI technology to support their further development. AI professionals should embrace the industry. To implement AI on a large scale, enterprises should create sufficient value for their customers. For individuals, AI is ubiquitous. We should embrace it and the changes it brings.
The role of AI depends on how we understand, develop, and use it. If it is not used well, AI will not have much influence. If we embrace AI, design products with advanced core AI technologies, and create irreplaceable value, AI can have a major impact worldwide and help everyone succeed.