By Alex, Alibaba Cloud Community Blog author.
This tutorial explores how you can implement an object detection method that relies on deep neural networks on Alibaba Cloud. This kind of method employs the Single Shot MultiBox Detector (SSD) technique based on the paper by Wei, et. al published on Cornel State Library website. The SSD technique functions by discretizing bounding boxes in the output space to a "a set of default boxes with varying aspect ratios and scales" on the feature map location.
The SSD model's network generates scores for each object's presence in the default boxes while making predictions. Thereafter, the model makes adjustments to the respective boxes until the object shape matches. For handling objects of different sizes, the network uses multiple feature maps with different resolutions to come up with predictions of varying resolutions.
The SSD model is relatively simple. The model does not require proposal generation as it eliminates the need for object proposals. In effect, image or pixel resampling is the entire computation required for a single network. SSD is very easy to train and integrate for systems that require a detection component.
The previous tutorial in the series already discussed how to install MXNet and its dependencies. With this tutorial let's understand how to deploy and train an MXNet model using an open-source project.
The following are the essential requirements for this tutorial:
To get started with this model, get the latest Docker image using the command below.
docker pull daviddocker78/mxnet-ssd:gpu_0.12.0_cuda9
Refer to the Docker section below for a more comprehensive description.
Alternatively, install supporting modules and then clone the SSD from GitHub. For this demonstration, begin by installing cv2, matplotlib, and numpy python modules. To implement this, install the modules using a pip or package manager as shown below.
sudo apt-get install python-opencv python-matplotlib python-numpy
Next, clone the repository in GitHub using the following commands.
sudo apt-get install git # cd where you would like to clone this repo cd ~ mkdir mxnet-ssd cd mxnet-ssd git clone --recursive https://github.com/zhreshold/mxnet-ssd.git # ensure that you clone this with --recursive # if not done correctly or you are using downloaded repo, pull them all via: # git submodule update --recursive šCinit
Previous tutorials in the series comprehensively explain how to install the MXNet tool on Alibaba Cloud instance running Ubuntu 16.04. Let's take a quick look at the following commands used for building MXNet.
# for Ubuntu/Debian OS cp make/config.mk ./config.mk # modify it if required
This tutorial uses a pre-trained model for the demonstration. Download the zip from ssd_resnet50_0712.zip by running the following commands.
cd ~ mkdir model cd model/ wget https://github.com/zhreshold/mxnet-ssd/releases/download/v0.6/resnet50_ssd_512_voc0712_trainval.zip
Extract the zip in the directory and then run the commands below.
python demo.py --gpu 0 #Some pretrained models to work with: python demo.py --epoch 0 --images ./data/demo/dog.jpg --thresh 0.5 python demo.py --cpu --network resnet50 --data-shape 512 # Give the library some time to load as it may take up some time
For more details, refer to the instructions on
python demo.py --help.
If you have successfully run the test pre-trained model, the system is all set for the next step. For training the model, let's understand how to use a Pascal VOC dataset. Although it is possible to use other datasets as well by adding a subclass in the
Imdb class in
dataset/imdb.py class. The example below has more information about the
Note: Since CPU training is exceedingly slow, make sure to enable Compute Unified Device Architecture (CUDA) for training the model. While using CUDNN is optional, we recommend using it.
Refer to this page to download a pre-trained
VGG16_reduced model as shown below.
Access the model/directory and unzip the
Now, run the following command to create a directory for the datasets.
cd~ mkdir data/VOCdevkit mkdir VOCdevkit
Next, refer to the commands below to download the Pascal VOC dataset for training.
cd data/VOCdevkit wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
Run the following commands to extract the data.
tar -xvf VOCtrainval_11-May-2012.tar tar -xvf VOCtrainval_06-Nov-2007.tar tar -xvf VOCtest_06-Nov-2007.tar
Make sure to use the
trainval set in the VOC2007/2012. It is recommended to have a VOCdevkit folder to store the directories for VOC2007 and VOC2012.
tar -xvf VOCtest_06-Nov-2007.tar
Subsequently, link the VOCdevkit folder and data/VOCdevkit folder by default.
ln -s /home/*user/VOCdevkit /home/*user/data/VOCdevkit
Use a hard link instead of a copy to save some disk space.
Additionally, improve the training rate by creating the binary file below.
cd /home/*user/mxnet-ssd bash tools/prepare_pascal.sh # or if you are using windows OS python tools/prepare_dataset.py --dataset pascal --year 2007,2012 --set trainval --target ./data/train.lst python tools/prepare_dataset.py --dataset pascal --year 2007 --set test --target ./data/val.lst --shuffle False
Now the system is all set to start training the model.
In all the instances, the created model uses
learning_rate=0.004. Change these parameters to accommodate other configurations that you intend to use. There are other training parameters available on
train.py --help. For instance, use the following option while working with four GPUs.
# do note that there is yet a perfect training method for training on several GPUs simultaneously. python train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001
MXNet is very efficient with memory usage. For example, using a
batch-size 32 to train on a
VGG16_reduced model only requires 4684 MB if CUDNN (conv1_x and conv2_x fixed) are not used.
Use the command below to evaluate the trained model.
cd /home/*user/mxnet-ssd python evaluate.py --gpus 0,1 --batch-size 128 --epoch 0
Conversion of a trained model to the deployment model eliminates its loss layers and attaches another one to merge results and non-maximum suppression. Note that there is no symbol while loading python.
cd /home/*user/mxnet-ssd python deploy.py --num-class 20 # then you can run demo with new model without loading python symbol python demo.py --prefix model/ssd_300_deploy --epoch 0 --deploy
Find Caffe converter at
/path/to/mxnet-ssd/tools/caffe_converter as shown below. The modified converter handles all custom layers in
cd /home/*user/mxnet-ssd/ tools/caffe_converter make python convert_model.py deploy.prototxt name_of_pretrained_caffe_model.caffemodel ssd_converted # you will use this model in deploy mode without loading from python symbol python demo.py --prefix ssd_converted --epoch 1 --deploy
While the conversion does not always work, in this case, it should work without trouble.
This tutorial uses a new interface for composing models. It is significant to note that older models have inconsistencies with names of weights. However, it is still possible to load older models on the new interface by renaming their symbols to
legacy_xxx.py and using python
train/demo.py --network legacy_xxx to call them.
Consider the example below.
python demo.py --network 'legacy_vgg16_ssd_300.py' --prefix model/ssd_300 --epoch 0
docker pull daviddocker78/mxnet-ssd:gpu_0.12.0_cuda9
However, it is also possible to build Docker images by yourself using Dockerfiles.
Run Docker container instances using the following command.
nvidia-docker run -it --rm myImageName:tag
For more detailed information on how to work with Docker, follow this Guide.
Tensorboard is compatible with MXNet as a visualisation tool. Docker prebuilt images already include it and it is recommended to use for Docker.
Specify the following parameters to save training loss graphs, classes validation APs, and ROC validation graphs to tensorboard.
python train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001 --tensorboard True
Also, specify the following to save the layer distribution.
python train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.001 --tensorboard True --monitor 40
Download the tensorflow docker-image for visualization with Docker using the commands below.
# download pre-built image from Dockerhub docker pull tensorflow/tensorflow:1.4.0-devel-gpu # run a container and open a port using '-p' flag. # attach a volume from where you stored your logs, to a directory inside the container nvidia-docker run -it --rm -p 0.0.0.0:6006:6006 -v /my/full/experiment/path:/res tensorflow/tensorflow:1.4.0-devel-gpu cd /res tensorboard --logdir=.
Note: It is possible to launch TensorBoard without Docker by running the last command. It helps to load TensorEvents for the experiments. Access TensorBoard on the browser from '0.0.0.0:6006'.
The following snapshots show some visualization examples.
Refer to the following demo result snapshots.
The SSD approach uses shallow layers and is not very suitable for neural network prediction due to inadequate high-level features. Thus, it's not dependable for very small objects.
Furthermore, there is a need for a large dataset for training, thereby necessitating a pre-training using Pascal VOC, COCO and Open Images datasets. Overall, it is actually quite powerful in spite of some implementation technicalities.
This tutorial demonstrates the successful deployment and training of an MXNet model. It is crucial to note that the Git repository is synchronized with the official MXNet backend to allow popular methods. The GitHub repository supports MXNet 1.1.0 and 1.2.0. The sample supports multiple trained models through a simple method of composing networks for classification networks. Further, it allows running GPU workloads such as training on Alibaba Cloud with ease and flexibility.
Alibaba Offline Video Intelligence - October 9, 2019
Alibaba Clouder - August 3, 2017
Alibaba Clouder - November 7, 2019
Alibaba Tech - February 24, 2020
Alibaba Clouder - September 30, 2019
digoal - September 12, 2019
A high-performance container manage service that provides containerized application lifecycle managementLearn More
An encrypted and secure cloud storage service which stores, processes and accesses massive amounts of data from anywhere in the worldLearn More
An online computing service that offers elastic and secure virtual cloud servers to cater all your cloud hosting needs.Learn More
A secure image hosting platform providing containerized image lifecycle managementLearn More
More Posts by Alex