PAI-Blade combines multiple optimization technologies to help trained models achieve optimal inference performance, and provides a C++ SDK for deploying the optimized models. This topic describes how Blade works and its usage flow.
Background information
PAI-Blade is a general-purpose inference optimization tool that performs joint model and system optimization to help models achieve optimal inference performance. Blade combines a variety of optimization technologies, such as computational graph optimization, vendor optimization libraries (TensorRT and oneDNN), AI compilation optimization, Blade's hand-optimized operator library, Blade mixed precision, and Blade Auto-Compression. Blade first analyzes a model and then applies some or all of these technologies as appropriate.
All optimization technologies in Blade are designed for general use and apply across different business scenarios. Blade also verifies the numerical accuracy of each optimization step, which ensures that optimization does not unexpectedly degrade the model's accuracy or metrics.
PAI introduced Blade as a new product to lower the entry barrier for model optimization, improve user experience, and increase production efficiency.
How it works
Install Blade as a wheel package in your environment to avoid complex steps such as requesting resources or uploading models and data. Call Blade's Python APIs in your code to integrate model optimization into your workflow and verify the performance of the optimized model locally. This lets you easily try different optimization policies and explore more parameter combinations.
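The try-and-measure loop described above can be sketched as follows. This is a minimal illustration only: `optimize` here is a hypothetical identity stand-in so the sketch runs without the Blade wheel installed (Blade's actual entry point is its Python API, e.g. a call such as `blade.optimize`, whose exact signature depends on your installed version), and the timing harness is a generic latency benchmark, not part of Blade.

```python
import time


def optimize(model):
    """Hypothetical stand-in for Blade's optimization call.
    The real API returns an optimized model plus an optimization
    report; here we return the model unchanged so the sketch runs
    without Blade installed."""
    return model


def benchmark(fn, data, warmup=5, runs=50):
    """Return the average latency of fn(data) in milliseconds."""
    for _ in range(warmup):
        fn(data)
    start = time.perf_counter()
    for _ in range(runs):
        fn(data)
    return (time.perf_counter() - start) / runs * 1000.0


# A trivial "model" standing in for a TensorFlow or PyTorch model.
model = lambda x: [v * 2 for v in x]
data = list(range(1024))

optimized = optimize(model)

baseline_ms = benchmark(model, data)
optimized_ms = benchmark(optimized, data)
print(f"baseline: {baseline_ms:.3f} ms, optimized: {optimized_ms:.3f} ms")
```

Measuring with a warmup phase and averaging over many runs, as above, is what makes a local before/after comparison meaningful when you try different optimization policies.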
Blade also provides a C++ SDK for service deployment. The optimized model depends on the SDK at runtime, but you do not need to modify the model code. Simply link the Blade library files.
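Linking an inference program against the SDK might look like the following build fragment. The library name (`blade`), header directory, and the `BLADE_SDK_DIR` variable are hypothetical placeholders; substitute the actual file names shipped with the Blade SDK for your framework.

```shell
# Hypothetical link line: the SDK directory layout and library name
# are placeholders, not the SDK's actual file names.
g++ -std=c++14 infer_main.cc \
    -I"${BLADE_SDK_DIR}/include" \
    -L"${BLADE_SDK_DIR}/lib" -lblade \
    -o infer_main
```

The point is that no model or inference code changes are needed; only the link step gains a dependency on the Blade runtime libraries.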
Usage flow
Follow these steps to use Blade:
1. Optimize a model. For more information, see Optimize a TensorFlow model and Optimize a PyTorch model. To perform quantization optimization on the model, see Quantization optimization. To specify a mode for compilation optimization, see AI compiler optimization.
2. Interpret the optimization report. For more information, see Optimization report.
3. Deploy the model for inference. For more information, see Use an SDK to deploy a TensorFlow model for inference, Use an SDK to deploy a PyTorch model for inference, and Use the Blade EAS Plugin to optimize and deploy models.