[DSW Gallery] Example of image detection based on YOLOX model

This article uses the YOLOX model as an example to show how to train an object detection model and run prediction with EasyCV.
The YOLO series includes versions V1 (2015), V2 (2016), V3 (2018), V4 (2020), and V5 (2020). YOLO is a one-stage object detection network proposed by Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi in "You Only Look Once: Unified, Real-Time Object Detection" in 2015. The YOLO series is popular with engineers and researchers for its fast inference, high accuracy, simple structure, and easy deployment. However, because positive and negative samples must be assigned by manually designed rules, the YOLO series tends to generalize poorly.
YOLOX builds on the YOLO series by incorporating recent advances from object detection research while keeping the series' ease of deployment. It adds a decoupled head, stronger data augmentation, anchor-free detection, and SimOTA label assignment. Its code currently supports deployment on several platforms (MegEngine, TensorRT, ONNX, OpenVINO, and ncnn).
EasyCV provides YOLOX models of various sizes pre-trained on the COCO2017 dataset, which can be fine-tuned for downstream tasks.
This article shows how to quickly train a YOLOX image detection model and run inference with EasyCV in PAI-DSW.
Operating environment requirements
PAI-Pytorch 1.8 image, P100 or V100 GPU, 32 GB of memory
Install dependencies
1. Get the torch and CUDA versions, adjust the mmcv installation command to match them, and install the corresponding versions of mmcv and nvidia-dali
import torch
import os
os.environ['CUDA']='cu' + torch.version.cuda.replace('.', '')
os.environ['Torch']='torch'+torch.version.__version__.replace('+PAI', '')
!echo $CUDA
!echo $Torch
# install some python deps
!pip install --upgrade tqdm
!pip install mmcv-full==1.4.4 -f https://download.openmmlab.com/mmcv/dist/$CUDA/$Torch/index.html
!pip install http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/third_party/nvidia_dali_cuda100-0.25.0-1535750-py3-none-manylinux2014_x86_64.whl
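The version tags used in the mmcv index URL can also be assembled in plain Python, which makes it easy to inspect the URL before running pip. The following is a minimal sketch; `build_mmcv_index_url` is a hypothetical helper, not part of mmcv or EasyCV, and it only mirrors the string handling shown above. The version strings in the example are placeholders.

```python
def build_mmcv_index_url(cuda_version: str, torch_version: str) -> str:
    """Build the mmcv wheel index URL used in the pip command above.

    cuda_version:  the value of torch.version.cuda, such as "10.1"
    torch_version: the value of torch.__version__, such as "1.8.0+PAI"
    """
    # "10.1" -> "cu101"
    cuda_tag = "cu" + cuda_version.replace(".", "")
    # Drop the PAI build suffix: "1.8.0+PAI" -> "torch1.8.0"
    torch_tag = "torch" + torch_version.replace("+PAI", "")
    return f"https://download.openmmlab.com/mmcv/dist/{cuda_tag}/{torch_tag}/index.html"


if __name__ == "__main__":
    # Placeholder values; in the notebook, pass torch.version.cuda and torch.__version__.
    print(build_mmcv_index_url("10.1", "1.8.0+PAI"))
```

If the printed URL does not open in a browser, there is probably no prebuilt mmcv-full wheel for that CUDA and torch combination.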
2. Install the EasyCV algorithm package
Note: The pai-easycv library is pre-installed in the PAI-DSW docker image, so this step can be skipped.
!pip install pai-easycv
# !pip install http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/pkgs/whl/2022_6/pai_easycv-0.3.0-py3-none-any.whl
3. Simple verification
from easycv.apis import *
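If the import above fails, it can help to see which of the installed packages is actually missing. The following is a small sketch that uses only the standard library; `check_packages` is a hypothetical helper, and the import names `torch`, `mmcv`, and `easycv` are assumed from the packages installed in the previous steps.

```python
import importlib.util


def check_packages(names):
    """Return {name: True/False} depending on whether each top-level package can be found."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised packages
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_packages(["torch", "mmcv", "easycv"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

This only checks that a package can be located, not that it imports cleanly, so the import test above is still the final verification.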
Image Detection Model Training & Prediction
The following example shows how to use COCO-format data and the YOLOX-S model to quickly train an image detection model, export it, and run prediction.
Data preparation
You can download the COCO2017 data, or use the sample COCO data we provide
! wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/small_coco_demo/small_coco_demo.tar.gz && tar -zxf small_coco_demo.tar.gz
Move the extracted directory to data/coco so that its layout matches the COCO dataset format
!mkdir -p data/ && mv small_coco_demo data/coco
The data/coco format is as follows
├── annotations
│ ├── instances_train2017.json
│ └── instances_val2017.json
├── train2017
│ ├── 000000005802.jpg
│ ├── 000000060623.jpg
│ ├── 000000086408.jpg
│ ├── 000000118113.jpg
│ ├── 000000184613.jpg
│ ├── 000000193271.jpg
│ ├── 000000222564.jpg
│ └── 000000574769.jpg
└── val2017
  ├── 000000006818.jpg
  ├── 000000017627.jpg
  ├── 000000037777.jpg
  ├── 000000087038.jpg
  ├── 000000174482.jpg
  ├── 000000181666.jpg
  ├── 000000184791.jpg
  ├── 000000252219.jpg
  └── 000000522713.jpg
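Before training, it can be worth confirming that the images on disk match the annotation files. The sketch below assumes the standard COCO annotation layout, in which the JSON file has an `images` list whose entries carry a `file_name` field; `check_coco_split` is a hypothetical helper, not an EasyCV function.

```python
import json
from pathlib import Path


def check_coco_split(root, split):
    """Compare the image files on disk with those listed in the annotation file.

    Returns (missing_on_disk, not_annotated) as two sorted lists of file names.
    """
    root = Path(root)
    ann_path = root / "annotations" / f"instances_{split}.json"
    with open(ann_path, encoding="utf-8") as f:
        ann = json.load(f)
    listed = {img["file_name"] for img in ann["images"]}
    on_disk = {p.name for p in (root / split).glob("*.jpg")}
    return sorted(listed - on_disk), sorted(on_disk - listed)


if __name__ == "__main__":
    for split in ("train2017", "val2017"):
        try:
            missing, extra = check_coco_split("data/coco", split)
        except FileNotFoundError as e:
            print(f"{split}: {e}")
            continue
        print(f"{split}: {len(missing)} listed images missing, "
              f"{len(extra)} files without an entry")
```

Both counts should be zero for a consistent dataset; a non-zero count usually means the archive was only partly extracted or the directory was moved to the wrong place.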
Train the model
Download the sample configuration file for YOLOX-S model training
!rm -rf yolox_s_8xb16_300e_coco.py
!wget https://raw.githubusercontent.com/alibaba/EasyCV/master/configs/detection/yolox/yolox_s_8xb16_300e_coco.py
To adapt to the small dataset, modify the following fields in the configuration file yolox_s_8xb16_300e_coco.py to reduce the number of training epochs, lower the learning rate, and print logs more frequently
total_epochs = 3
# optimizer.lr -> 0.0002
optimizer = dict(
    type='SGD', lr=0.0002, momentum=0.9, weight_decay=5e-4, nesterov=True)
# log_config.interval -> 1
log_config = dict(interval=1)
Note: When training on the full COCO dataset, a single machine with 8 GPUs is recommended for best results. To train on a single GPU instead, reduce the learning rate optimizer.lr.
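One common way to choose a reduced learning rate is the linear scaling heuristic, which scales the rate in proportion to the total batch size. The sketch below is only an illustration of that heuristic, not a recommendation from EasyCV. It reads the "8xb16" in the config name as 8 GPUs with 16 images each, which is an assumption, and the base learning rate in the example is a placeholder.

```python
def scale_lr(base_lr, base_gpus, base_batch_per_gpu, gpus, batch_per_gpu):
    """Scale a learning rate in proportion to the total batch size."""
    base_total = base_gpus * base_batch_per_gpu
    new_total = gpus * batch_per_gpu
    return base_lr * new_total / base_total


if __name__ == "__main__":
    # base_lr=0.01 is a placeholder; read the real value from optimizer.lr in your config.
    print(scale_lr(base_lr=0.01, base_gpus=8, base_batch_per_gpu=16,
                   gpus=1, batch_per_gpu=16))
```

Going from 8 GPUs to 1 with the same per-GPU batch size divides the learning rate by 8. Treat the result as a starting point and adjust it according to the training loss.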
To get a good model, we fine-tune from the pre-trained model. Run the following command to start training
!python -m easycv.tools.train yolox_s_8xb16_300e_coco.py --work_dir work_dir/detection/yolox/yolox_s_8xb16_300e_coco --load_from http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/detection/yolox/yolox_s_bs16_lr002/epoch_300.pth
Export the model
Export the YOLOX model for prediction. First, run the following command to list the model files generated by training
# List the .pth files generated by training
! ls work_dir/detection/yolox/yolox_s_8xb16_300e_coco/*.pth
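If the number of training epochs changes, the name of the last checkpoint changes with it. The sketch below picks the checkpoint with the highest epoch number, assuming the files follow the `epoch_N.pth` naming used in this article; `latest_checkpoint` is a hypothetical helper, not an EasyCV function.

```python
import re
from pathlib import Path


def latest_checkpoint(work_dir):
    """Return the epoch_N.pth file with the largest N in work_dir, or None if there is none."""
    pattern = re.compile(r"epoch_(\d+)\.pth$")
    best_epoch, best_path = -1, None
    for path in Path(work_dir).glob("epoch_*.pth"):
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path


if __name__ == "__main__":
    print(latest_checkpoint("work_dir/detection/yolox/yolox_s_8xb16_300e_coco"))
```

Comparing the epoch numbers as integers avoids the string-sorting pitfall where `epoch_10.pth` would sort before `epoch_2.pth`.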
Before exporting the model, modify the configuration file to set the score threshold used by NMS
# model.test_conf 0.01 -> 0.5
model = dict(
    # ... other fields unchanged ...
    model_type='s',  # s m l x tiny nano
    test_conf=0.5)
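To see why the threshold is raised for prediction, consider what a score threshold does to a list of candidate boxes. The following is a simplified illustration with made-up boxes and scores, not EasyCV's actual post-processing code; `filter_by_score` is a hypothetical helper.

```python
def filter_by_score(boxes, scores, score_thr):
    """Keep only the boxes whose score is at least score_thr."""
    kept = [(b, s) for b, s in zip(boxes, scores) if s >= score_thr]
    return [b for b, _ in kept], [s for _, s in kept]


if __name__ == "__main__":
    # Made-up candidates: one confident box, one weak box, one near-noise box
    boxes = [[10, 10, 50, 50], [12, 11, 52, 49], [100, 80, 160, 140]]
    scores = [0.92, 0.30, 0.04]
    for thr in (0.01, 0.5):
        kept_boxes, kept_scores = filter_by_score(boxes, scores, thr)
        print(f"threshold {thr}: {len(kept_boxes)} boxes kept")
```

A very low threshold such as 0.01 keeps almost every candidate, which suits evaluation metrics computed over all scores. A higher threshold such as 0.5 keeps only confident boxes, which gives cleaner visualizations.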
Execute the following command to export the model
!cp yolox_s_8xb16_300e_coco.py yolox_s_8xb16_300e_coco_export.py && sed -i 's#test_conf=0.01#test_conf=0.5#g' yolox_s_8xb16_300e_coco_export.py
!python -m easycv.tools.export yolox_s_8xb16_300e_coco_export.py work_dir/detection/yolox/yolox_s_8xb16_300e_coco/epoch_3.pth work_dir/detection/yolox/yolox_s_8xb16_300e_coco/yolox_export.pth
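The `cp` and `sed` step can also be done in Python, which has the advantage of reporting whether the text to replace was actually found. This is a minimal sketch under the same assumption as the `sed` command, namely that the config contains the literal text `test_conf=0.01`; `write_export_config` is a hypothetical helper.

```python
from pathlib import Path


def write_export_config(src, dst, old="test_conf=0.01", new="test_conf=0.5"):
    """Copy src to dst, replacing every occurrence of old with new.

    Returns the number of replacements so a silent no-op is easy to spot.
    """
    text = Path(src).read_text(encoding="utf-8")
    count = text.count(old)
    Path(dst).write_text(text.replace(old, new), encoding="utf-8")
    return count


if __name__ == "__main__":
    src = Path("yolox_s_8xb16_300e_coco.py")
    if src.exists():
        n = write_export_config(src, "yolox_s_8xb16_300e_coco_export.py")
        print(f"replaced {n} occurrence(s)")
```

A return value of 0 means the config was copied unchanged, for example because the threshold is written with different spacing, and the exported model would still use the original value.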
Download test image
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/small_coco_demo/val2017/000000017627.jpg
Import model weights and predict detection results for test images
import cv2
from easycv.predictors import TorchYoloXPredictor
output_ckpt = 'work_dir/detection/yolox/yolox_s_8xb16_300e_coco/yolox_export.pth'
detector = TorchYoloXPredictor(output_ckpt)
img = cv2.imread('000000017627.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
output = detector.predict([img])
# view detection results
%matplotlib inline
from matplotlib import pyplot as plt
image = img.copy()
for box, cls_name in zip(output[0]['detection_boxes'], output[0]['detection_class_names']):
    # box is [x1,y1,x2,y2]
    box = [int(b) for b in box]
    image = cv2.rectangle(image, tuple(box[:2]), tuple(box[2:4]), (0,255,0), 2)
    cv2.putText(image, cls_name, (box[0], box[1]-5), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0,0,255), 2)
# image is already RGB, so it can be passed to imshow directly
plt.imshow(image)
plt.show()
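Besides drawing the boxes, a quick text summary of the result can be useful when checking many images. The sketch below counts detections per class name using the `detection_class_names` key from the prediction code above. `count_detections` is a hypothetical helper, and the example dictionary is hand-written, not real model output.

```python
from collections import Counter


def count_detections(result):
    """Count detections per class name for one image's prediction result."""
    return Counter(result["detection_class_names"])


if __name__ == "__main__":
    # Hand-written stand-in with the same two keys used above; not real model output.
    fake_result = {
        "detection_boxes": [[10, 20, 110, 220], [30, 40, 90, 200], [5, 5, 60, 60]],
        "detection_class_names": ["person", "person", "dog"],
    }
    for name, n in count_detections(fake_result).most_common():
        print(f"{name}: {n}")
```

In the notebook, the same function can be called as `count_detections(output[0])` on the predictor's result for the first image.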
