Machine Learning Platform for AI (PAI) provides EasyVision, an enhanced algorithm framework for visual intelligence. EasyVision provides a variety of model training and prediction capabilities. You can use EasyVision to train and apply computer vision (CV) models for your CV applications.

The rapid development of deep learning technologies has driven large-scale commercial use of CV. As a CV developer, you may encounter the following difficulties when you use deep learning technologies to build CV models:
  • Developing and debugging deep learning algorithms is costly.
  • Models are updated frequently, and you must spend considerable time understanding how each model works and its details.
  • Training algorithms and improving inference performance require specialized, systematic knowledge.
  • Data annotation is costly.
  • To use open source algorithms in PAI, you must spend time learning and rebuilding the algorithms.
To overcome these difficulties, PAI provides EasyVision, an easy-to-use algorithm framework for training and applying CV models. EasyVision has the following advantages:
  • Ease of use

    EasyVision provides pluggable API operations that you can call to complete tasks in different modules. In addition, EasyVision provides features such as data I/O, preprocessing, model training, and offline prediction to support the entire modeling process. You can use EasyVision in multiple modules of PAI, such as Machine Learning Studio and Data Science Workshop (DSW).

  • High performance

    EasyVision encapsulates a variety of PAI-TensorFlow optimization engines, such as the engines for compilation optimization, distributed training, and mixed precision training. You can enable these engines in the related configuration files to improve system performance. You can also use EasyVision with open source TensorFlow.

  • Rich models

    EasyVision provides a variety of models, such as the optical character recognition (OCR) model. The models are trained on open source datasets, which reduces your development and training costs.


EasyVision builds a model zoo that contains a large number of models. In addition, EasyVision provides a variety of model training and prediction capabilities and is compatible with PAI commands, Video Intelligence Platform (VIP), and DSW. This way, EasyVision can meet various modeling requirements.

EasyVision uses a distributed pipeline architecture for offline prediction. This flexible and highly available architecture allows EasyVision to process hundreds of millions of data records offline within a short period. You can deploy models in Elastic Algorithm Service (EAS) of PAI to make predictions. The system and model optimization features of PAI allow you to make predictions more efficiently with fewer parameters. In addition, you can use EasyVision to customize model training and prediction operations. This way, you can reuse existing features and optimize models. The following figure shows the architecture of EasyVision.

[Figure: Architecture diagram of EasyVision]


  • Ease of use

    You may have different requirements for model training and prediction. For example, you may want to simplify model training operations, run model training and prediction tasks on a schedule, or reuse existing models and algorithms. To meet these requirements, EasyVision is compatible with PAI commands, VIP, and DSW.

  • Optimized performance

    EasyVision optimizes distributed training based on PAI-TensorFlow. EasyVision allows you to train a model on one or more multi-GPU servers. EasyVision also improves inference performance by using techniques such as graph optimization and model compression.

  • Connection to Smart Labeling of PAI

    EasyVision is connected to Smart Labeling of PAI, which is used to label data. PAI provides a conversion tool that converts files containing labeled data to TFRecord files. You can use the TFRecord files to train EasyVision models. In addition, EasyVision provides a variety of data augmentation modules that augment data on the fly during training.

  • Efficient offline prediction

    EasyVision allows you to use multiple servers to make predictions concurrently. Each server processes its own share of the data. This way, you can make predictions on offline data based on the models that are trained by EasyVision. Each processing job in a prediction task can be accelerated by using multiple threads on multiple servers. The jobs run asynchronously, one after another in the pipeline, which improves processing efficiency. You can also customize jobs.

  • Connection to EAS

    A SavedModel file is generated during training. You can use the SavedModel file in your own system or in EAS to make online predictions. EasyVision provides a processor that supports the online prediction capabilities of EAS. After you specify the model information, such as the endpoint and type, in the configuration file, you can use this processor to process data in real time.
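
The offline prediction design described above (each server handles its own share of the data, and each processing job runs on multiple threads while the jobs run asynchronously in a pipeline) can be illustrated with a minimal sketch. This is not EasyVision code or its API. The names `shard`, `run_stage`, `run_pipeline`, `decode`, and `predict` are hypothetical, and the "model" is a stand-in that only sums numbers. The sketch uses only the Python standard library.

```python
import queue
import threading

_STOP = object()  # sentinel that marks the end of a stage's input


def shard(records, worker_id, num_workers):
    """Return the share of records that one server would process (assumed round-robin split)."""
    return [r for i, r in enumerate(records) if i % num_workers == worker_id]


def run_stage(func, inbox, outbox, num_threads):
    """Start num_threads workers that apply func to every item taken from inbox."""
    def worker():
        while True:
            item = inbox.get()
            if item is _STOP:
                inbox.put(_STOP)  # put it back so sibling workers also stop
                break
            outbox.put(func(item))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()

    def closer():
        for t in threads:
            t.join()
        outbox.put(_STOP)  # signal the next stage once all workers are done

    threading.Thread(target=closer).start()


def run_pipeline(records, stages, num_threads=4):
    """Push records through all stages; result order is not guaranteed."""
    first = queue.Queue()
    inbox = first
    for func in stages:
        outbox = queue.Queue()
        run_stage(func, inbox, outbox, num_threads)
        inbox = outbox  # the output of this stage feeds the next one
    for record in records:
        first.put(record)
    first.put(_STOP)

    results = []
    while True:
        item = inbox.get()
        if item is _STOP:
            break
        results.append(item)
    return results


def decode(record):
    """Hypothetical preprocessing job: turn a record ID into a fake sample."""
    return {"id": record, "pixels": [record] * 3}


def predict(sample):
    """Hypothetical prediction job: a stand-in for real model inference."""
    return (sample["id"], sum(sample["pixels"]))


if __name__ == "__main__":
    # Pretend this process is server 0 of 2 and handles only its own shard.
    my_records = shard(list(range(10)), worker_id=0, num_workers=2)
    print(sorted(run_pipeline(my_records, [decode, predict], num_threads=2)))
```

In this sketch, the stages are connected by queues, so a later job can start working on early records while an earlier job is still processing the rest. Because several threads serve each stage, results can arrive out of order, which is why the usage example sorts them.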