
Platform for AI: Overview

Last Updated: Mar 07, 2024

The Elastic Algorithm Service (EAS) module of Platform for AI (PAI) provides multiple methods to help you deploy model services based on your business requirements.

Deployment methods

You can deploy a model by using an image or a processor in EAS.

Model deployment by using an image (recommended)

When you use an image to deploy a model, EAS pulls the runtime image from Container Registry (ACR) and mounts storage services, such as Object Storage Service (OSS) and Apsara File Storage NAS, to obtain everything that the deployment requires: the runtime environment, the model, and related files, such as the code that processes model requests.

The following figure describes the workflow of deploying a model by using an image in EAS.

(Figure: workflow of image-based model deployment in EAS)

Take note of the following items:

  • EAS supports two methods of model deployment by using an image: Deploy Service by Using Image and Deploy Web App by Using Image.

    • Deploy Service by Using Image: Use an image to deploy a model service. After you deploy the service, you can call it by using an API operation.

    • Deploy Web App by Using Image: Use an image to deploy a web application. After you deploy the application, you can access it by using a link.

      For more information about the two deployment methods, see the Step 2: Deploy the service section of this topic.

  • PAI provides multiple official images for model deployment. You can also develop a model and create a custom image based on your business requirements. You must upload the custom image to ACR before you can use it for deployment.

  • We recommend that you upload the model and the code files that process the model to a cloud storage service and mount the storage service, instead of packaging the model into a custom image. This allows you to update the model in a convenient manner.

  • When you use an image to deploy a model service, we recommend that you build an HTTP server in the image. After the service is deployed, EAS forwards incoming requests to this HTTP server. The server cannot listen on port 8080 or 9090, because the EAS engine reserves those ports.

Note
  • If you use a custom image for deployment, you must upload the image to ACR before you use it. Otherwise, the system may fail to pull the image during deployment. If you use Data Science Workshop (DSW) for model development and training, you must also upload the resulting image to ACR before you can use it in EAS.

  • If you want to reuse your custom images or datasets in other scenarios, you can manage them in a centralized manner by using AI Computing Asset Management in PAI. CPFS datasets of NAS are not supported in EAS.
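As a concrete illustration of the HTTP-server recommendation above, the following minimal server shows the shape of a service that EAS could forward requests to. This is a sketch using only the Python standard library: the `predict` function is a placeholder, and port 0 is used only so the local demo picks a free port; in EAS you would listen on a fixed port other than 8080 and 9090.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Placeholder model: replace with real inference code that loads the
    model from the storage path mounted at deployment time."""
    return {"prediction": sum(features)}

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and run the placeholder model on it.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload["features"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the local demo output quiet

# In EAS, listen on a fixed port other than 8080 and 9090, which the EAS
# engine reserves. Port 0 here only makes the local demo pick a free port.
server = HTTPServer(("127.0.0.1", 0), InferenceHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Simulate a forwarded call the way a client (or EAS) would issue it.
request = urllib.request.Request(
    f"http://127.0.0.1:{port}/",
    data=json.dumps({"features": [1, 2, 3]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    result = json.loads(response.read())
server.shutdown()
# result == {"prediction": 6}
```

In a real deployment, the server would start when the container starts, and EAS would route client requests to the port it listens on.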

Model deployment by using a processor

After you prepare the model and the processor file, upload them to a storage service, such as OSS or NAS, and mount the storage service to EAS so that EAS can obtain the files for deployment.

The following figure describes the workflow of deploying a model by using a processor in EAS.

(Figure: workflow of processor-based model deployment in EAS)

Take note of the following items:

  • PAI provides multiple official processors for model deployment. You can also develop a model and a custom processor file based on your business requirements and upload the model and the processor to OSS or NAS.

  • We recommend that you develop and store the model and the processor file separately. You can configure the mount path when you deploy the model and use the get_model_path parameter in the processor file to obtain the specified model path. This allows you to update the model in a convenient manner.

  • When you use a processor to deploy a model service, EAS automatically pulls an official image based on your inference framework to deploy the service, and deploys an HTTP server based on the processor file to receive service calls.

Note

If you use a processor to deploy a model service, make sure that the model's inference framework and the processor file meet the requirements of the runtime environment. This deployment method is less flexible and efficient than image-based deployment. Therefore, we recommend that you use an image to deploy the model.
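The separation described above, with the model in mounted storage and its path resolved at startup, can be sketched as a generic processor class. This is illustrative only: the class, the `weights.json` file name, and the `MODEL_DIR` fallback are hypothetical stand-ins for the mount path and the get_model_path parameter that EAS provides.

```python
import json
import os
import tempfile

class Processor:
    """Sketch of an EAS-style processor. The model artifact lives at a mount
    path configured at deployment time, not inside the code package, so the
    model can be updated without rebuilding the processor."""

    def __init__(self, model_dir=None):
        # In a real EAS processor, the get_model_path parameter supplies this
        # directory; the MODEL_DIR environment variable is a hypothetical fallback.
        self.model_dir = model_dir or os.environ.get("MODEL_DIR", "/mnt/model")
        self.model = None

    def initialize(self):
        # Load model artifacts from the mounted storage (for example, OSS or NAS).
        with open(os.path.join(self.model_dir, "weights.json")) as f:
            self.model = json.load(f)

    def process(self, request_bytes):
        # Score a request with a toy linear model and return (body, status).
        payload = json.loads(request_bytes)
        score = sum(w * x for w, x in zip(self.model["weights"], payload["features"]))
        return json.dumps({"score": score}).encode(), 200

# Local demo: a temporary directory stands in for the mounted model directory.
with tempfile.TemporaryDirectory() as model_dir:
    with open(os.path.join(model_dir, "weights.json"), "w") as f:
        json.dump({"weights": [0.5, 0.5]}, f)
    processor = Processor(model_dir=model_dir)
    processor.initialize()
    body, status = processor.process(json.dumps({"features": [2, 4]}).encode())
# body == b'{"score": 3.0}', status == 200
```

Because only `initialize` touches the mount path, swapping in a new model is a matter of updating the files in storage and redeploying, without changing the processor code.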

Deployment tools and methods

  • In terms of deployment tools, EAS allows you to deploy and manage model services in the PAI console or by using the command-line interface (CLI). The deployment procedure varies based on the tool that you use:

    • Deploy services

      • PAI console: Model service deployment by using the PAI console or Deploy a model service by using Machine Learning Designer.

      • CLI: Deploy model services by using EASCMD or DSW.

    • Manage online model services

      • PAI console: You can manage model services in EAS. For more information, see Model service deployment by using the PAI console. The following operations are supported:

        • View model calling information.

        • View logs, monitoring information, and service deployment information.

        • Scale in, scale out, start, stop, and delete model services.

      • CLI: Use the EASCMD client to manage model services. For more information, see Run commands to use the EASCMD client.

    When you use a dedicated resource group to deploy a model service, you can configure a storage mount to store the required data. For more information, see Mount storage to services (advanced).
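For the CLI path, EASCMD reads a JSON service description. A minimal example might look like the following; the field names follow the commonly documented EAS service schema, and all values, including the OSS path, are illustrative:

```json
{
  "name": "demo_service",
  "processor": "pmml",
  "model_path": "oss://examplebucket/models/model.pmml",
  "metadata": {
    "instance": 1,
    "cpu": 2,
    "memory": 4000
  }
}
```

You would then pass this file to the EASCMD create command to deploy the service, and use the other EASCMD subcommands to inspect, update, or delete it.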

  • In terms of deployment methods, EAS allows you to deploy model services by using an image or a processor:

    • Deploy Service by Using Image (recommended)

      • Scenario: Use an image to deploy a model service.

      • Benefits:

        • An image ensures consistency between the model development and training environment and the deployment and runtime environment.

        • EAS provides official images that are suitable for various scenarios. You can use an official image to implement push-button deployment.

        • You can also use a custom image to deploy a model service without modifying the image.

    • Deploy Web App by Using Image (recommended)

      • Scenario: Use an image to deploy a model service or a web application.

      • Benefits:

        • EAS provides multiple preset official images, such as Stable-Diffusion-Webui and Chat-LLM-Webui, and supports frameworks such as Gradio, Flask, and FastAPI to build HTTP servers.

        • You can also use a custom image to deploy a web application without modifying the image.

    • Deploy Service by Using Model and Processor

      • Benefits:

        • EAS provides built-in processors for commonly used model frameworks, such as PMML and XGBoost. You can use a built-in processor to start a service quickly.

        • You can also build custom processors to implement more flexible business logic.

Advanced configurations

  • Service groups

    EAS supports service groups, which you can use in scenarios that require traffic distribution across multiple services, such as canary releases and blue-green deployment. For more information, see Manage service groups.

  • Scheduled service deployment

    You can use DataWorks to automatically deploy services on a regular basis. For more information, see Configure scheduled model deployment.

  • Instance utilization

    EAS provides preemptible instances and allows you to select multiple instance types. This way, you can deploy services in a cost-effective manner. For more information, see Create and use preemptible instances and Specify multiple instance types.

  • Storage integration

    EAS can mount data from multiple storage services, such as Object Storage Service (OSS), Apsara File Storage NAS, and Git repositories. For more information, see Mount storage to services (advanced).

  • Model warm-up

    EAS provides the model warm-up feature to reduce the delay in processing the first request after deployment. This ensures that model services can work as expected immediately after they are published. For more information, see Warm up model services (advanced).

References

  • You can use multiple methods to call the service that you deployed. For more information, see Methods for calling services.

  • You can view the metrics related to service invocation and operational health on the Service Monitoring tab. For more information, see Service monitoring.