Jia Yangqing's Thoughts on the Direction of Artificial Intelligence

Introduction: Jia Yangqing is a leading figure in AI, perhaps best known for Caffe, the AI framework he wrote six years ago. After years of accumulated experience, and now as a newcomer to Alibaba, what does he think of artificial intelligence? Jia Yangqing recently shared his thoughts and insights inside Alibaba, and everyone is welcome to join the discussion.

Jia Yangqing, a native of Shangyu, Zhejiang, graduated from the Department of Automation of Tsinghua University and obtained a Ph.D.

The recent popularity of deep learning is generally traced to one milestone: the success of AlexNet in image recognition in 2012. AlexNet raised the acceptance of machine learning across the entire industry. Before it, many machine learning algorithms were at the level of "almost good enough for a demo", but AlexNet's results crossed the threshold required by many applications, causing interest in applications to explode.

Of course, nothing is accomplished overnight. Many of the ingredients for success had already emerged before 2012: the ImageNet database in 2009 laid the foundation for large amounts of labeled data; from 2010, Dan Ciresan of IDSIA was the first to use GPGPU for object recognition; and in 2011, at the ICDAR conference in Beijing, neural networks shone in offline Chinese recognition. Even the ReLU layer used in AlexNet was mentioned in the neuroscience literature as early as 2001. To a certain extent, therefore, the success of neural networks was a natural progression. There is plenty to read about what happened after 2012, so I won't go into detail here.

Success and limitations
While looking at the success of neural networks, we also need to dig deeper into the theoretical and engineering background behind them. Why did neural networks and deep learning fail decades ago but succeed now? What are the reasons for this success? And what are the limitations? We can only touch on a few key points here:

The reasons for the success are big data and high-performance computing.
The limitations lie in structured understanding and in effective learning algorithms for small data.
On the data side, the rise of the mobile Internet, together with low-cost ways of obtaining labeled data through platforms such as AWS, has freed machine learning algorithms from data constraints. On the computing side, the rise of high-performance computing such as GPGPU lets us complete exaflop-level computations within a manageable time, in days or even less, which makes it possible to train complex networks. It should be noted that high-performance computing is not limited to GPUs: heavily vectorized computation on CPUs and the MPI abstraction in distributed computing are both inseparable from the research results of the HPC field, which began to rise in the 1960s.
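To make the "exaflop-level computation in days" claim concrete, here is a rough back-of-the-envelope sketch. The device throughput (10 TFLOP/s) and the 50% utilization figure are illustrative assumptions, not numbers from the article, and `training_days` is a hypothetical helper name.

```python
EXA = 10**18  # one exaFLOP = 10^18 floating-point operations

def training_days(total_flop, device_flops, num_devices=1, utilization=1.0):
    """Days needed to execute `total_flop` operations.

    device_flops: peak throughput of one device, in FLOP/s (assumed value).
    utilization:  fraction of peak actually sustained (assumed value).
    """
    if min(total_flop, device_flops, num_devices, utilization) <= 0:
        raise ValueError("all arguments must be positive")
    sustained = device_flops * num_devices * utilization  # FLOP/s
    seconds = total_flop / sustained
    return seconds / 86_400  # seconds per day

# Assumed example: 1 exaFLOP of work on 10 TFLOP/s devices at 50% utilization.
one_device = training_days(EXA, 10e12, num_devices=1, utilization=0.5)
eight_devices = training_days(EXA, 10e12, num_devices=8, utilization=0.5)
print(f"1 device:  {one_device:.2f} days")
print(f"8 devices: {eight_devices:.2f} days")
```

Under these assumptions a single device needs a little over two days, and eight devices need well under one, which is the order of magnitude the paragraph describes.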

However, we also need to see the limitations of deep learning. Today, many deep learning algorithms have made breakthroughs at the level of perception: they can recognize speech, images, and other unstructured data. When faced with more structured problems, simply applying deep learning algorithms may not achieve very good results. Some may ask why systems such as AlphaGo and those built for StarCraft can succeed. On the one hand, deep learning solves the problem of perception; on the other hand, reinforcement learning algorithms work together with it to underpin the entire system. Moreover, when the amount of data is very small, the complex networks of deep learning often cannot achieve good results, yet in many fields, especially ones like medical care, data is very difficult to obtain. This may be a very meaningful research direction for the next step.

Next, what will be the direction of deep learning or, more generally, AI? My personal feeling is that although everyone has been paying attention to AI frameworks in the past few years, the recent homogenization of frameworks shows that they are no longer a problem that requires a great deal of effort to solve. Frameworks such as TensorFlow are widely used in industry, and the various frameworks draw on Python's strengths in modeling to help us solve many problems that previously required writing our own code.

Looking upward, toward products and research, we will encounter many new challenges, such as:

How should traditional deep learning applications, such as speech and images, deliver products and value? For example, computer vision is still mostly applied in security. How can it penetrate medical care, traditional industries, and even social care (how do we help blind people see the world)? These fields require not only technology, but also product thinking.
How to solve more problems than just speech and images. At Alibaba and many other Internet companies there is a "silent majority" application, the recommendation system, which often occupies more than 80% or even 90% of the machine learning computing power. How can deep learning be further integrated with traditional recommendation systems? Finding new models for search and recommendation may not be as well known as work on speech and images, but it is an indispensable skill for companies.
Even in scientific research, our challenges have just begun. Berkeley professor Jitendra Malik once said, "We used to manually adjust the algorithm, but now we manually adjust the network architecture. If we are confined to this model, then artificial intelligence cannot progress." How to leave behind the old way of manual tuning and use intelligence to improve intelligence is a very interesting question. The first AutoML systems were still at the level of brute-forcing the model structure with a lot of computing power, but more efficient AutoML techniques are now starting to emerge and are worth watching.
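The "brute-force" style of architecture search mentioned in the last item can be sketched as an exhaustive loop over a small search space. This is a minimal illustration, not any particular AutoML system: the search space, the `toy_score` formula, and all names below are hypothetical, and `toy_score` merely stands in for the expensive step of training a model and measuring validation accuracy.

```python
import itertools

# Hypothetical search space: every combination is one candidate architecture.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "width": [64, 128, 256],
    "activation": ["relu", "tanh"],
}

def toy_score(config):
    """Invented stand-in for 'train this architecture and return its accuracy'."""
    score = 0.5 + 0.03 * config["num_layers"] + 0.0005 * config["width"]
    score -= 0.00001 * config["num_layers"] * config["width"]  # made-up cost penalty
    if config["activation"] == "relu":
        score += 0.02
    return score

def brute_force_search(space, score_fn):
    """Evaluate every configuration in the space and keep the best one."""
    keys = list(space)
    best_config, best_score, evaluated = None, float("-inf"), 0
    for values in itertools.product(*(space[k] for k in keys)):
        config = dict(zip(keys, values))
        score = score_fn(config)
        evaluated += 1
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, evaluated

best_config, best_score, evaluated = brute_force_search(SEARCH_SPACE, toy_score)
print(f"evaluated {evaluated} candidates, best: {best_config}")
```

The cost is the product of the option counts (3 × 3 × 2 = 18 evaluations here), and each real evaluation is a full training run, which is why this approach consumes so much computing power and why more efficient search strategies matter.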

Looking downward, toward systems, we will find that traditional knowledge of systems and architecture, along with the practice of software engineering, brings many new opportunities to AI, such as:

Traditional AI frameworks rely on handwritten high-performance code, but models vary widely and new hardware platforms keep emerging. How should we further improve software efficiency? We have already seen projects that optimize AI frameworks through compiler technology and traditional AI search methods, such as Google's XLA and the University of Washington's TVM. Although these projects are at an early stage, they are already showing their potential.
How platforms can improve their integration capabilities. In the open source world, the usual approach is one person training a fairly academic model on one machine with a few GPUs. In large-scale applications, however, the data volume is very large, the models are very complex, and the cluster faces all kinds of scheduling challenges (can we request 256 GPUs at once? can computing resources be scheduled elastically?). These questions present many challenges both for building our own machine learning platform and for offering services to customers in the cloud.
How to carry out the co-design of software and hardware. As the computing patterns of deep learning gradually solidify (as with CNNs), the advantages of new and specialized hardware (such as ASICs) begin to show. How to co-design software and hardware so as to avoid problems such as "the hardware is out, but I don't know how to program it" or "the model has changed, so the hardware is outdated as soon as it ships" will be a major direction in the next few years.

Artificial intelligence is a field that changes by the day. We have a joke that the research results of 2012 are now ancient history. The many opportunities and challenges brought about by this rapid iteration are very exciting. Whether we are experienced researchers or engineers new to AI, in today's cloud-based and intelligent era, if we can quickly learn and refresh our knowledge of algorithms and engineering, we can lead and empower all fields of society through algorithmic innovation. In this regard, the open source code, research articles, and platforms in the field of artificial intelligence have lowered the barrier to entry, and the opportunities are in our own hands.

Author: Jia Yangqing
