After their second low point in the 1990s, neural networks re-entered public consciousness around 2006. Then in 2012, the convolutional neural network (CNN) achieved a major breakthrough in image classification on the ImageNet challenge.
There are two core concepts in convolutional neural networks: convolution and pooling. At this point, some may ask why we don't simply use a feed-forward (fully connected) neural network instead. Take a 1000x1000 image as an example: it has 1 million pixels, and if the hidden layer also has 1 million nodes, a fully connected network would need 10^12 parameters. Learning is nearly impossible at that scale, since it would require estimating an absolutely massive number of parameters.
However, images have strong local structure. If we use a convolutional neural network to classify images, then because of convolution, each hidden-layer node connects to, and scans, only one local region of the image. If each hidden node connects to a 10x10 patch, the parameter count drops to 100 million; and if those local weights are shared across the hidden nodes, as they are with a true convolution kernel, the number of parameters decreases dramatically.
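The parameter counts above can be checked directly. The sketch below uses the article's hypothetical sizing (a 1000x1000 image and a hidden layer of 1 million nodes with 10x10 local patches) and compares the three connection schemes:

```python
# Hypothetical sizing from the text: 1000x1000 image, 1M hidden nodes.
image_pixels = 1000 * 1000          # 10^6 input values
hidden_nodes = 1000 * 1000          # 10^6 hidden-layer nodes

# Fully connected: every hidden node sees every pixel.
fully_connected = image_pixels * hidden_nodes      # 10^12 parameters

# Locally connected: each hidden node sees only its own 10x10 patch.
locally_connected = hidden_nodes * 10 * 10         # 10^8 parameters

# Weight sharing (true convolution): all nodes share one 10x10 kernel.
shared_kernel = 10 * 10                            # 100 parameters

print(fully_connected, locally_connected, shared_kernel)
# 1000000000000 100000000 100
```

In practice a convolutional layer uses many kernels, so the shared count is multiplied by the number of kernels, but it remains tiny compared to the fully connected case.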
The other operation is pooling. On top of convolution, a convolutional neural network inserts an intermediate hidden layer, the pooling layer. The most common method is max pooling, in which each pooling node outputs the largest value in its window. Because pooling is applied to the output of multiple kernels, we obtain multiple intermediate hidden-layer feature maps.
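As a minimal sketch of max pooling, the helper below (a hypothetical function, not from any framework) applies a 2x2 window with stride 2 to a small feature map using NumPy:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a square feature map."""
    h, w = feature_map.shape
    # Split the map into 2x2 windows, then keep the max of each window.
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 0],
               [7, 2, 9, 8],
               [4, 1, 3, 5]])
print(max_pool_2x2(fm))
# [[6 4]
#  [7 9]]
```

Each output value is the strongest response in its window, which is why max pooling preserves the most salient features while shrinking the feature map.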
These two characteristics have made CNN popular in image processing, where it has become a standard tool. CNN also has widespread real-world applications, for example in investigations, self-driving cars, image segmentation, and neural style transfer. Neural style transfer is a fascinating application. For example, a popular app in the App Store called Prisma allows users to upload an image and convert it into a different style, such as that of Van Gogh's The Starry Night. This process relies heavily on CNN.
For more details about deep learning, particularly convolutional neural networks (CNN) and recurrent neural networks (RNN), please see All You Need to Know About Neural Networks – Part 2.
Person re-identification (ReID) has been a research focus in computer vision in recent years. It is the task of retrieving images of a person across different camera devices, given a query image of that person.
In AlignedReID, a deep convolutional neural network extracts both global features and local features. The distance between every pair of local features in two images is calculated to form a distance matrix. Then the shortest path from the upper-left corner to the lower-right corner of the matrix is computed through dynamic programming. Each edge of the shortest path corresponds to a match between a pair of local features, which yields an alignment of the human body: the total distance of this alignment is minimal while the relative order of the body parts is preserved. During training, the length of the shortest path is added to the loss function to aid the learning of a person's overall features.
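The dynamic program over the distance matrix can be sketched as follows. This is a simplified illustration, not the paper's exact formulation (AlignedReID additionally normalizes the local distances): a path moves only right or down, so the relative order of local features is preserved, and its cost is the alignment distance.

```python
def shortest_path_distance(dist):
    """Length of the shortest monotone path (right/down moves only)
    from the top-left to the bottom-right of a distance matrix,
    accumulating local-feature distances along the way."""
    m, n = len(dist), len(dist[0])
    cost = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            if i == 0 and j == 0:
                cost[i][j] = dist[0][0]
            elif i == 0:                       # first row: only right moves
                cost[i][j] = cost[i][j - 1] + dist[i][j]
            elif j == 0:                       # first column: only down moves
                cost[i][j] = cost[i - 1][j] + dist[i][j]
            else:                              # best of coming from above or left
                cost[i][j] = min(cost[i - 1][j], cost[i][j - 1]) + dist[i][j]
    return cost[m - 1][n - 1]

# Toy 3x3 distance matrix between local stripes of two images.
d = [[1.0, 9.0, 9.0],
     [1.0, 1.0, 9.0],
     [9.0, 1.0, 1.0]]
print(shortest_path_distance(d))  # 5.0
```

The cheap path hugs the diagonal-like band of small distances, matching each body part of one image to the corresponding part of the other even when the parts are vertically misaligned.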
The well-known convolutional neural network (CNN) has helped us complete the task of image recognition. However, we still need some additional machinery to complete the task of localization; this is where object detection comes into play.
In this article, we will discuss the evolution of object detection technology from the perspective of object localization. Briefly, that evolution can be denoted as: R-CNN -> SPP-Net -> Fast R-CNN -> Faster R-CNN
This guide builds an image recognition model with the deep learning framework TensorFlow in Alibaba Cloud Machine Learning Platform for AI, using a convolutional neural network (CNN) as the training model. The entire procedure takes about 30 minutes to complete. Afterward, the system is able to recognize the bird in the following image.
The concept of neural networks originated in biology, where it describes the network of nerves in the brain. Mathematical formulas were then used to simulate how the brain analyzes objects. Later, deep learning frameworks were introduced, and scientists now write code to build deep learning networks. Complex deep learning networks often contain tens or even hundreds of lines of code.
Machine Learning Platform for AI provides end-to-end machine learning services, including data processing, feature engineering, model training, model prediction, and model evaluation.
Alibaba Cloud Image Search is an intelligent image search service that helps users find similar or identical images. Based on machine learning and deep learning, the product enables end-users to take a screenshot or upload an image to search and find desired products and fulfill other search requests.