As one of the hottest subtopics of artificial intelligence, deep learning can be applied in many fields, such as image recognition. But how can you make use of the unstructured data contained in large collections of images?
Processing unstructured data usually involves deep learning algorithms, which can be daunting for beginners. It also usually requires powerful GPUs and a large amount of computing resources. This article introduces a method of image recognition using deep learning. The method can be applied to scenarios such as illicit image filtering, facial recognition, and object detection.
Creating an image recognition model with the deep learning framework TensorFlow on Alibaba Cloud Machine Learning Platform for AI takes about 30 minutes.
check_point for model storage,
cifar-10-batches-py for training data storage,
train_code. Then upload the dataset and source data to the corresponding directories of the OSS bucket.
For detailed information, please go to Image Classification with TensorFlow.
On December 25, 2018, Stanford University released the latest DAWNBench deep learning inference rankings. Alibaba Cloud ranked first in terms of image recognition performance and cost.
To achieve the fastest performance and lowest cost, the participating team made optimizations in three areas: deep learning model selection, 8-bit quantization, and Alibaba Cloud GPU instance selection.
The Alibaba Cloud team selected the popular TensorFlow deep learning framework for optimization, so that Alibaba Cloud customers can benefit from the optimization results. Int8 quantization is based on TensorRT. The difficulty lies in quantizing the trained TensorFlow model into a TensorRT Int8 model and loading the quantized TensorRT model into the TensorFlow computation graph for inference.
The team then performed deep optimization based on the TensorFlow benchmark code. During Int8 quantization, the Kullback-Leibler divergence between the distributions before and after quantization is calculated to calibrate the dynamic range of the activation values at each layer of the neural network. The Alibaba Cloud team completed calibration in three phases: (1) create an Int8 quantization model; (2) calibrate the quantization model; (3) generate the optimized Int8 model based on the calibration results. The team then optimized the benchmark inference mode to import the optimized inference engine.
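The calibration idea described above can be sketched in a few lines of plain Python. This is a simplified illustration of KL-divergence-based threshold selection, not the team's code and not TensorRT's actual implementation. The function names, the bin count (1024), and the number of quantization levels (128) are assumptions chosen for the example.

```python
import math


def kl_divergence(p, q):
    """Return KL(P || Q) for two unnormalized histograms of equal length."""
    p_sum, q_sum = sum(p), sum(q)
    if p_sum == 0 or q_sum == 0:
        return float("inf")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            p_norm = p_i / p_sum
            # Floor Q so an empty quantized bin is penalized instead of log(0).
            q_norm = max(q_i / q_sum, 1e-12)
            total += p_norm * math.log(p_norm / q_norm)
    return total


def quantize_histogram(hist, num_levels):
    """Merge bins into num_levels groups, then spread each group's mass
    evenly over the bins in that group that were non-empty."""
    n = len(hist)
    out = [0.0] * n
    for level in range(num_levels):
        start = level * n // num_levels
        end = (level + 1) * n // num_levels
        nonzero = sum(1 for v in hist[start:end] if v > 0)
        if nonzero == 0:
            continue
        average = sum(hist[start:end]) / nonzero
        for i in range(start, end):
            if hist[i] > 0:
                out[i] = average
    return out


def calibrate_threshold(activations, num_bins=1024, num_levels=128):
    """Pick the clipping threshold whose quantized histogram is closest
    (in KL divergence) to the original activation histogram."""
    max_abs = max(abs(a) for a in activations)
    if max_abs == 0:
        return 0.0
    bin_width = max_abs / num_bins
    hist = [0] * num_bins
    for a in activations:
        hist[min(int(abs(a) / bin_width), num_bins - 1)] += 1

    best_kl, best_edge = float("inf"), num_bins
    for edge in range(num_levels, num_bins + 1):
        clipped = [float(v) for v in hist[:edge]]
        reference = list(clipped)
        reference[-1] += sum(hist[edge:])  # saturate outliers into the last bin
        candidate = quantize_histogram(clipped, num_levels)
        kl = kl_divergence(reference, candidate)
        if kl < best_kl:
            best_kl, best_edge = kl, edge
    return best_edge * bin_width
```

Calling `calibrate_threshold` on the activation values recorded for one layer returns a clipping range for that layer. A single extreme outlier should not force the range to cover it, because the small KL penalty for saturating the outlier is usually lower than the cost of spreading 128 levels over a much wider range.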
This article provides a fully verified solution (with code) to run LR and GBDT on a LibSVM-formatted dataset efficiently using TensorFlow.
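To make the data format concrete, here is a minimal sketch of parsing LibSVM-formatted lines and fitting logistic regression (LR) on them. It is written in plain Python rather than TensorFlow so that it runs without dependencies, and it is not the solution from the referenced article. The function names, the learning rate, the epoch count, and the assumption that labels are 0 or 1 are all choices made for this example.

```python
import math


def parse_libsvm_line(line):
    """Parse '<label> <index>:<value> ...' into (label, {index: value})."""
    parts = line.strip().split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":")
        features[int(index)] = float(value)
    return label, features


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, features):
    """Return P(label = 1) for one sparse feature dictionary."""
    z = bias + sum(weights.get(i, 0.0) * v for i, v in features.items())
    return sigmoid(z)


def train_lr(rows, epochs=200, learning_rate=0.5):
    """Fit LR with plain stochastic gradient descent on sparse rows.
    Assumes labels are 0 or 1 (LibSVM files often use -1/+1 instead)."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for label, features in rows:
            error = predict(weights, bias, features) - label
            bias -= learning_rate * error
            # Only the features present in the row are updated (sparse update).
            for i, v in features.items():
                weights[i] = weights.get(i, 0.0) - learning_rate * error * v
    return weights, bias


lines = ["1 1:1.0 3:0.5", "0 2:1.0", "1 1:0.8", "0 2:0.7 3:0.1"]
rows = [parse_libsvm_line(line) for line in lines]
weights, bias = train_lr(rows)
```

Because each row stores only its non-zero features, the update touches just those weights, which is the main reason the sparse LibSVM layout suits high-dimensional data.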
MapReduce, Spark, and TensorFlow all use distributed computing to perform calculations and solve specific problems. From this perspective, each of them defines a distributed computing model, that is, a method for distributing computation over large amounts of data. However, the three differ in the model they define. MapReduce, as its name implies, is a basic map-reduce computing model. Spark defines a set of RDD models, which are essentially a DAG of map and reduce operations. The TensorFlow computing model is also a graph, but one that is more complex than that of Spark. You need to define each node and edge in the TensorFlow graph, and these definitions specify how TensorFlow computes the graph. Because these definitions typically describe a neural network, TensorFlow is suited to a specific type of computation. The RDD model, in turn, makes Spark suited to data-parallel tasks whose records do not depend on one another. Is it possible to implement a general-purpose, simple, and high-performance distributed computing model? In my opinion, this is very difficult. A general-purpose model usually cannot be optimized for specific circumstances, whereas a distributed framework written for specific tasks is neither general-purpose nor simple.
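The difference between the two styles can be illustrated with a toy, single-process sketch. The first half mimics the map-reduce model with a word count. The second half mimics a graph model in which every node and its incoming edges are declared explicitly before evaluation. All names here are hypothetical and none of this uses the real MapReduce, Spark, or TensorFlow APIs.

```python
from collections import defaultdict
from functools import reduce


# --- Map-reduce style: independent records, then aggregation by key ---
def map_phase(records):
    """Emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values of each key."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


word_counts = reduce_phase(shuffle(map_phase(["a b a", "b c"])))


# --- Graph style: each node names its operation and its input edges ---
graph = {
    "x": (None, []),                       # input node, value is fed at run time
    "w": (None, []),
    "b": (None, []),
    "mul": (lambda a, b: a * b, ["x", "w"]),
    "out": (lambda a, b: a + b, ["mul", "b"]),
}


def evaluate(graph, node, feeds, cache=None):
    """Recursively evaluate a node after evaluating the nodes it depends on."""
    cache = {} if cache is None else cache
    if node in feeds:
        return feeds[node]
    if node not in cache:
        operation, inputs = graph[node]
        args = [evaluate(graph, name, feeds, cache) for name in inputs]
        cache[node] = operation(*args)
    return cache[node]


result = evaluate(graph, "out", {"x": 2.0, "w": 3.0, "b": 1.0})
```

In the map-reduce half, the framework only needs to know the map and reduce functions. In the graph half, the user spells out the whole dependency structure, which is more work but lets the runtime schedule and optimize a specific kind of computation.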
TensorFlow Serving uses Alibaba Cloud elastic computing resources (Elastic Compute Service (ECS) or EGS), Server Load Balancer, and Object Storage Service (OSS) to perform prediction for TensorFlow models.
Data Science is a new cluster model available in E-MapReduce (EMR) 3.13.0 and later versions for machine learning and deep learning. With Data Science clusters, you can use GPU or CPU models to train on your data. Training data can be stored in HDFS or OSS. EMR supports TensorFlow for distributed training on large amounts of data.
EMR is an all-in-one enterprise-ready big data platform that provides cluster, job, and data management services based on open-source ecosystems, such as Hadoop, Spark, Kafka, Flink, and Storm.
Alibaba Cloud Object Storage Service (OSS) is an encrypted, secure, cost-effective, and easy-to-use object storage service that enables you to store, back up, and archive large amounts of data in the cloud, with a guaranteed reliability of 99.999999999%.