Elastic Algorithm Service (EAS) of Platform for AI (PAI) allows you to deploy trained models as model services or AI-powered web applications. The models can be ones that you trained or ones that you obtained from open source communities. EAS provides multiple methods to deploy models that are trained in different ways. You can use the PAI console to deploy models as API services. This topic describes how to deploy models and manage EAS online services by using the PAI console.
Prerequisites
A trained model is obtained.
Background information
You can deploy models and manage EAS online services in the console.
Upload and deploy models in the console
You can deploy models by using the following methods:
Custom deployment: Custom deployment allows you to deploy models in a more flexible manner. You can deploy a model as an AI-powered web application or an inference service by using images, models, or processors.
Scenario-based model deployment: EAS provides simplified deployment solutions that are tailored to common scenarios and model types, such as ModelScope, Hugging Face, Triton, TFServing, Stable Diffusion (for AI painting), and pre-trained large language models (LLMs).
You can manage deployed model services in the PAI console, such as viewing service details, updating service resource configurations, adding a version for a deployed model service, or scaling resources.
Upload and deploy models in the console
On the EAS-Online Model Services page, you can upload models that you trained or public models that you obtained from open-source communities, and deploy them as online model services.
Step 1: Go to the EAS-Online Model Services page.
Log on to the PAI console.
In the left-side navigation pane, click Workspaces. On the Workspace list page, click the name of the workspace that you want to manage.
In the left-side navigation pane, go to the EAS-Online Model Services page.
Step 2: Select a deployment method
On the Inference Service tab, click Deploy Service.
In the Select Deployment Mode dialog box, select a deployment method and click OK.
The following deployment methods are supported:
Custom Deployment: provides a more flexible deployment method. This is the default value. You can quickly deploy a model as an online inference service by using a processor, or by configuring a preset image and third-party code library, mounting models and code, and running commands. For more information, see Configure parameters for custom deployment.
Scenario-based Model Deployment: provides the following deployment methods for general deployment scenarios. For more information about how to configure the parameters in each scenario, see Configure parameters for scenario-based model deployment.
ModelScope Model Deployment: deploys an open-source ModelScope model in a few clicks and starts model services.
Hugging Face Model Deployment: deploys an open-source Hugging Face model in a few clicks and starts model services.
Triton Deployment: deploys a model that is trained by using an AI framework, such as TensorRT, TensorFlow, PyTorch, or ONNX, as an online inference service in a few clicks by using Triton Inference Server.
TensorFlow Serving Deployment: deploys a model in the standard SavedModel format as an online service by using the TensorFlow Serving engine.
AI Painting - SD Web UI Deployment: deploys an AI painting service based on open-source SD web application in a few clicks, and calls the deployed service by using the web application or API operations. EAS isolates users and computing resources to implement enterprise-level applications.
Large Language Model (LLM) Deployment: deploys open-source foundation models or custom models that you trained and fine-tuned, and uses the built-in inference acceleration provided by PAI-Blade to implement simplified model deployment in a cost-effective manner.
Step 3: Deploy the service
Configure the parameters based on the deployment method. After you configure the parameters, click Deploy. When the Service Status changes to Running, the service is deployed.
Configure parameters for custom deployment
On the Deploy Service page, select a service type.
The following table describes the supported service types.
Service
Description
Create Service
Creates a service.
Update Service
Adds a version for a deployed service. EAS allows you to switch between different versions as needed.
Group Service
Creates a service and classifies the service into a service group. You can create a service group if no service group is available. For more information, see Manage service groups.
A service group has a unified data ingress. You can use the ingress to distribute traffic to services in the group based on the specific use scenario.
Add Blue-green Deployment
Creates a service and associates it with a deployed service. The two services are independent of each other and you can distribute traffic between the two services. After you verify the performance of the associated service, you can route all the traffic of the deployed service to the associated service. This ensures seamless switching from the deployed service to the associated service.
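Conceptually, the blue-green traffic distribution described above behaves like a weighted router. The following sketch is purely illustrative of the idea (it is not how EAS implements routing internally, and the names and weights are placeholders):

```python
# Illustrative sketch of blue-green traffic distribution: each request
# is routed to the new ("green") service with a given probability and
# to the current ("blue") service otherwise.
import random

def route(green_weight: float) -> str:
    """Return which deployment serves a request.

    green_weight is the share of traffic (0.0 to 1.0) sent to the
    newly deployed service while you validate it.
    """
    return "green" if random.random() < green_weight else "blue"

# Example: send 10% of traffic to the new deployment with route(0.10).
# After validation, a weight of 1.0 corresponds to routing all traffic
# to the associated service, which completes the seamless switch.
```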
In the Model Service Information section, set the following parameters.
Service Name: The name of the service. You need to enter a custom service name only when you set Service Type to Create Service or Group Service. If you select Update Service or Add Blue-green Deployment, select the name of a deployed service from the drop-down list.
Group: This parameter is required only when the Service Type is set to Group Service. Valid values:
New Group: creates a service group to which the service that you want to create belongs.
Join: selects an existing service group to which the service that you want to create belongs.
Deployment Method: Three deployment methods are supported: Deploy Service by Using Image, Deploy Web App by Using Image, and Deploy Service by Using Model and Processor.
Note: In complex model inference scenarios, such as AIGC and video processing, inference takes a long time. We recommend that you turn on the Asynchronous Service switch to implement asynchronous inference. For more information, see Asynchronous inference services. The asynchronous inference service is available only when the Deployment Method is set to Deploy Service by Using Image or Deploy Service by Using Model and Processor.
Deploy Service by Using Image: Select this deployment method if you want to quickly deploy AI inference services by mounting images, code, and models.
Deploy Web App by Using Image: Select this deployment method if you want to quickly deploy the web application by mounting images, code, and models.
Deploy Service by Using Model and Processor: Select this deployment method if you want to deploy AI inference services by using models and processors, such as built-in processors or custom processors. For more information, see Deploy model services by using built-in processors and Deploy services by using custom processors.
Deploy service or web application by using image
The following table describes the parameter configurations if you set Deployment Method to Deploy Service by Using Image or Deploy Web App by Using Image.
Parameter
Description
Select Image
Valid values:
Image Address: The URL of the image that is used to deploy the model service. Example: registry.cn-shanghai.aliyuncs.com/xxx/image:tag. You can specify the address of an image provided by PAI or a custom image. For more information about how to obtain the image address, see View and add images.
Important: The specified image address must be in the same region as the service that you want to deploy.
If you want to use an image from a private repository, enter the username and password of the image repository.
Custom Image: Select a custom image. For more information about how to create a custom image, see View and add images.
PAI Image: Select an Alibaba Cloud image.
Model Settings
Click Specify Model Settings to configure the model. You can use one of the following methods to configure model files:
OSS
The path of the source OSS bucket.
In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified OSS path.
Mount NAS File System
NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system. For more information about how to create a general-purpose NAS file system, see Create a NAS file system.
NAS Source Path: the NAS path where the files are stored.
Mount Path: the mount path of the service instance. The mount path is used to read files from the NAS file system.
Mount PAI Model
Set Model Name and Model Version for an existing model that you want to use. For more information about how to view registered models, see Register and manage models.
Mount Path: the mount path of the service instance. The mount path is used to read the model file.
Code Settings
Click Specify Code Settings to configure the code. You can use one of the following mounting methods to provide access to the code that is required in the service deployment process.
OSS
The path of the source OSS bucket.
In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified OSS path.
Mount NAS File System
NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system.
NAS Source Path: the NAS path where the files are stored.
Mount Path: the mount path of the service instance. The mount path is used to read files from the specified NAS path.
Mount Git Path
Git Repository Address: the address of the Git repository.
Mount Path: the mount path of the service instance. The path is used to read the code file from the Git directory.
Mount PAI Dataset
Select an existing dataset. If no dataset is available, you can click Create Dataset to create a dataset.
In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified PAI dataset.
Mount PAI Code
Select an existing code build. If no code set is available, you can click Create Code Build to create a new code build.
In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified PAI code build.
Third-party Libraries
Click Specify Third-party Libraries to configure the third-party library. Valid values:
Third-party Libraries: specifies a third-party library in the text field.
Directory of requirements.txt: specifies the path of the requirements.txt file in the text field. You must include the address of the third-party library in the requirements.txt file.
Environment Variables
Click Specify Environment Variables to configure environment variables.
Set the variable name and variable value to add one or more environment variables.
Name: the name of the environment variable used to run the image.
Value: the value of the environment variable used to run the image.
Command
The command to run the image. Example: python /run.py. You also need to enter the port number, which is the local HTTP port that the model service listens on after the image is deployed.
Important: You cannot specify port 8080 or 9090, because the EAS engine listens on ports 8080 and 9090.
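To make the command and port settings concrete, the following minimal sketch shows what an HTTP entry point inside the image might look like. The file name (run.py), port 8000, and the echo "model" are all hypothetical; the only hard requirement from the console is that the port you enter matches the one the service listens on and is not 8080 or 9090.

```python
# Minimal sketch of an inference entry point for the "Deploy Service
# by Using Image" method. PORT is a placeholder: use any local HTTP
# port except 8080 and 9090, which the EAS engine reserves.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

PORT = 8000  # must match the port number entered in the console

class InferenceHandler(BaseHTTPRequestHandler):
    """Toy handler: echoes the "input" field back as the prediction."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        result = {"prediction": payload.get("input")}  # placeholder "model"
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = PORT) -> None:
    """Blocking entry point; the image's start command would run this."""
    HTTPServer(("0.0.0.0", port), InferenceHandler).serve_forever()
```

If this file were saved as /run.py in the image, a start command of python /run.py (with serve() invoked at the bottom of the script) would bring the service up on the configured port.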
Deploy service by using model and processor
The following table describes the parameter configuration if you set the Deployment Method parameter to Deploy Service by Using Model and Processor.
Parameter
Description
Model File
Valid values:
OSS
Select the OSS path that stores the model file.
Upload Data
Select an OSS path in the current region.
Click Browse Local Files and select the local model file that you want to upload. You can also directly drag and drop the model file to the blank area.
Publicly Accessible Download URL
Select Publicly Accessible Download URL. Then, enter a publicly accessible URL in the field below the parameter.
Model Selection
Set Model Name and Model Version for an existing model that you want to use. For more information about how to view registered models, see Register and manage models.
Processor Type
The type of processor. You can select a built-in official processor or customize a processor based on your business requirements. For more information about built-in official processors, see Built-in processors.
Model Type
This parameter is required only when the Processor Type is set to EasyVision(CPU), EasyVision(GPU), EasyTransfer(CPU), EasyTransfer(GPU), EasyNLP, or EasyCV. The available model types vary based on the processor type. You can set the Processor Type and Model Type parameters based on your business requirements.
Processor Language
This parameter is available if you set the Processor Type parameter to Custom Processor.
Valid values: Cpp, Java, and Python.
Processor Package
This parameter is available if you set the Processor Type parameter to Custom Processor. Valid values:
Upload Local File
Select Upload Local File.
Select an OSS path in the current region.
Click Browse Local Files and select the local processor package that you want to upload. You can also directly drag and drop the processor package to the blank area.
The package is uploaded to the OSS path in the current region, and the Processor Package parameter is automatically set.
Note: You can accelerate the loading speed of a processor during model deployment by uploading a local processor package.
Import OSS File
Select Import OSS File. Then, select the OSS path in which the processor package is stored.
Download from Internet
Select Download from Internet. Then, enter a public URL.
Processor Master File
This parameter is available if you set the Processor Type parameter to Custom Processor. This parameter specifies the main file of the processor package.
Mount Settings
Click Specify Mount Settings to configure the mounting method. You can use one of the following mount methods.
OSS
The path of the source OSS bucket.
In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified OSS path.
Mount NAS File System
NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system.
NAS Source Path: the NAS path where the files are stored.
Mount Path: the mount path of the service instance. The mount path is used to read files from the specified NAS path.
Mount PAI Dataset
Select an existing dataset. If no dataset is available, you can click Create Dataset to create a dataset.
In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified PAI dataset.
Mount PAI Code
Select an existing code build. If no code set is available, you can click Create Code Build to create a new code build.
In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified PAI code build.
Environment Variables
Click Specify Environment Variables to configure environment variables.
Set the variable name and variable value to add one or more environment variables.
Name: the name of the environment variable used to run the image.
Value: the value of the environment variable used to run the image.
In the Resource Deployment Information section, set the following parameters.
Parameter
Description
Resource Group Type
The type of resource group in which you want to deploy the model. You can deploy the model by using the public resource group or a dedicated resource group. For more information, see Work with dedicated resource groups.
Note: If you run a small number of tasks and do not have high requirements on latency, we recommend that you use the public resource group.
Number Of Instances
We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment.
If you set Resource Group Type to a dedicated resource group, you must set the CPU, Memory (MB), and GPU parameters for each service instance.
Resource Configuration Mode
This parameter is available only if you set the Resource Group Type parameter to Public Resource Group. Valid values:
General
You can select a single CPU or GPU instance type.
Cost-effective Resource Configuration
You can configure multiple instance types or use preemptible instances. For more information, see Specify multiple instance types and Create and use preemptible instances.
Preemptible Instance Protection Period: You can set a 1-hour protection period for a preemptible instance. During the protection period, the system ensures that you can use the instance.
Deploy Resources: You can configure common instances and preemptible instances at the same time. Resources are started in the sequence in which the instance types are configured, and you can add up to five resource types. If you use preemptible instances, you must set a bid price to bid for the instances.
Elastic Resource Pool
You can set this parameter only when the Resource Group Type is set to a dedicated resource group.
You can enable Elastic Resource Pool and configure your resources based on the instructions in the Resource Configuration Mode section.
If you enable Elastic Resource Pool and the dedicated resource group that you use to deploy services is fully occupied, the system automatically adds pay-as-you-go instances to the public resource group during scale-outs. The added instances are billed as public resources. The instances in the public resource group are first released during scale-ins.
System Disks
This parameter is available only if you set the Resource Group Type parameter to Public Resource Group.
Click System Disks to configure additional system disks for the EAS service. Unit: GB. Valid values: 0 to 2000. You have a free quota of 30 GB on the system disk. If you specify 20 in the field, the available storage space is 30 GB + 20 GB = 50 GB. Additional system disks are billed based on their capacity and usage duration. For more information, see Billing of EAS.
Optional. In the VPC Settings section, set the VPC, vSwitch, and Security Group Name parameters to enable VPC for the EAS service deployed in the public resource group.
After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC.
In the Configuration Editor section, the configurations of the service are displayed in the code editor. You can add configuration items that are not included in previous steps. For more information, see the "Create a service" section in the Run commands to use the EASCMD client topic.
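As a rough illustration of what appears in the Configuration Editor, the sketch below assembles a minimal service configuration in Python. The field names follow the EASCMD "create service" JSON format, but every value here (service name, processor, OSS path, resource sizes) is a placeholder; consult the EASCMD topic for the authoritative list of supported fields.

```python
# Hedged sketch of an EASCMD-style service configuration.
# All values are illustrative placeholders, not working settings.
import json

service_config = {
    "name": "demo_service",                            # service name
    "processor": "tensorflow_cpu_1.15",                # example built-in processor
    "model_path": "oss://examplebucket/models/demo/",  # example model location
    "metadata": {
        "instance": 2,   # number of service instances
        "cpu": 4,        # vCPUs per instance
        "memory": 8000,  # memory per instance, in MB
    },
}

# The editor displays the equivalent JSON document.
print(json.dumps(service_config, indent=2))
```

Editing the JSON directly is how you add configuration items that the form-based steps do not expose.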
Configure parameters for scenario-based model deployment
The following section describes the parameters based on the scenario that you select.
ModelScope Model Deployment
Parameter | Description | |
Model Service Information | Service Name | The name of the service. |
Select Model | Select a ModelScope model from the drop-down list. | |
Model Version | Select a model version from the drop-down list. By default, the latest version is used. | |
Model Type | After you select a model, the system automatically specifies the Model Type parameter. | |
Number Of Instances | Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment. | |
Resource Configuration | Select the instance type for model deployment based on your business requirements. Only the public resource group is supported. | |
VPC | VPC | Enable virtual private cloud (VPC) connection for the EAS services deployed in the public resource group. After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC. |
vSwitch | ||
Security Group Name |
Hugging Face Model Deployment
Parameter | Description | |
Model Service Information | Service Name | The name of the service. |
Model ID | The ID of the Hugging Face model. Example: | |
Model Type | The type of the Hugging Face model. Example: text-classification. | |
Model Version | The version of the Hugging Face model. Example: main. | |
Instances | Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment. | |
Resource Configuration | Select the instance type for model deployment based on your business requirements. Only the public resource group is supported. | |
VPC | VPC | Enable virtual private cloud (VPC) connection for the EAS services deployed in the public resource group. After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC. |
vSwitch | ||
Security Group Name |
Triton Deployment
Parameter | Description | |
Model Service Information | Service Name | The name of the service. |
Model Settings | Make sure that the model you deploy meets the structure requirements of Triton. After you prepare the model, select one of the following methods to deploy the model:
| |
Instances | Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment. | |
Resource Configuration | Select the instance type for model deployment based on your business requirements. Only the public resource group is supported. | |
VPC | VPC | Enable virtual private cloud (VPC) connection for the EAS services deployed in the public resource group. After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC. |
vSwitch | ||
Security Group Name |
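The model structure that Triton requires is a model repository: one directory per model containing a config.pbtxt file and numbered version subdirectories. The following sketch builds a minimal skeleton of that layout; the model name, platform, and config contents are illustrative placeholders, and a real config.pbtxt also declares the model's inputs and outputs.

```python
# Sketch of the Triton model repository layout:
#   model_repository/<model_name>/config.pbtxt
#   model_repository/<model_name>/<version>/<model file>
import pathlib
import tempfile

repo = pathlib.Path(tempfile.mkdtemp()) / "model_repository"
model_dir = repo / "demo_onnx"   # one directory per model (name is a placeholder)
version_dir = model_dir / "1"    # numeric version subdirectory
version_dir.mkdir(parents=True)

# Minimal illustrative config.pbtxt; real files also declare
# the input and output tensors of the model.
(model_dir / "config.pbtxt").write_text(
    'name: "demo_onnx"\n'
    'platform: "onnxruntime_onnx"\n'
    "max_batch_size: 8\n"
)
# The model binary itself would be placed at version_dir / "model.onnx".
```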
TensorFlow Serving deployment
Parameter | Description | |
Model Service Information | Service Name | The name of the service. |
Deployment Mode | The following deployment methods are supported:
| |
Model Settings | A model deployed by using TensorFlow Serving must follow a fixed directory structure.
| |
Instances | Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment. | |
Resource Configuration | Select the instance type for model deployment based on your business requirements. Only the public resource group is supported. | |
VPC | VPC | Enable virtual private cloud (VPC) connection for the EAS services deployed in the public resource group. After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC. |
vSwitch | ||
Security Group Name |
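The fixed structure that TensorFlow Serving expects is the SavedModel layout: a model directory containing numbered version subdirectories, each with a saved_model.pb file and a variables/ directory. The sketch below creates an empty skeleton of that layout; the model name is a placeholder and the touched files stand in for real exported artifacts.

```python
# Sketch of the SavedModel layout required by TensorFlow Serving:
#   <model_name>/<version>/saved_model.pb
#   <model_name>/<version>/variables/
import pathlib
import tempfile

export_root = pathlib.Path(tempfile.mkdtemp())
version_dir = export_root / "demo_model" / "1"   # numeric version directory
(version_dir / "variables").mkdir(parents=True)

# Placeholders for the files a real export would produce.
(version_dir / "saved_model.pb").touch()                       # serialized graph
(version_dir / "variables" / "variables.index").touch()        # variable index
(version_dir / "variables" / "variables.data-00000-of-00001").touch()
```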
AI Painting - SD Web UI Deployment
Parameter | Description | |
Model Service Information | Service Name | The name of the service. |
Edition | Valid values:
| |
Specify Model Settings | You can specify model settings in the following scenarios: (1) you want to use an open source model that you downloaded from communities or a model that you fine-tuned; (2) you want to save the output data to your data source; (3) you need to install third-party plug-ins or configurations. Valid values:
| |
Instances | Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment. | |
Resource Configuration | Select the instance type for model deployment based on your business requirements. Only the public resource group is supported. We recommend that you use the instance type ml.gu7i.c16m60.1-gu30 in terms of cost-effectiveness. | |
VPC | VPC | Enable virtual private cloud (VPC) connection for the EAS services deployed in the public resource group. After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC. |
vSwitch | ||
Security Group Name |
Large Language Model (LLM) Deployment
Parameter | Description | |
Model Service Information | Service Name | The name of the service. |
Model Source | Valid values:
| |
Model Settings | This parameter is required if you set the Model Source parameter to Custom Fine-tuned Model. Valid values:
| |
Model Type | Select a model category. | |
Instances | Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment. | |
Resource Configuration | Select the instance type for model deployment based on your business requirements. Only the public resource group is supported. We recommend that you use the instance type ml.gu7i.c16m60.1-gu30 in terms of cost-effectiveness. | |
VPC | VPC | Enable virtual private cloud (VPC) connection for the EAS services deployed in the public resource group. After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC. |
vSwitch | ||
Security Group Name |
Manage online model services in EAS
On the Inference Service tab of the EAS-Online Model Services page, you can view deployed services, and stop, start, or delete services.
If you stop or delete a model service, requests that rely on the model service fail. Proceed with caution.
View service details
Click the name of the service that you want to manage to go to the Service Details page. On this page, you can view the basic information, instances, and configurations of the service.
On the Service Details page, you can click different tabs to view information about service monitoring, logs, and deployment events.
Update service resource configurations
On the Service Details tab, click Resource Configuration in the Resource Information section. In the Resource Configuration dialog box, update the resources that are used to run the service. For more information, see Upload and deploy models in the console.
Add a version for a deployed model service
On the EAS-Online Model Services page, find the service that you want to update and click Update Service in the Actions column. For more information, see Upload and deploy models in the console.
Warning: When you add a version for a model service, the service is temporarily interrupted. Consequently, the requests that rely on the service fail until the service recovers. Proceed with caution.
After you update the service, click the version number in the Current Version column to view the Version Information or change the service version.
Scale resources
On the EAS-Online Model Services page, find the service that you want to manage and click Scale in the Actions column. In the Scale dialog box, specify the number of Instances to adjust the instances that are used to run the model service.
Enable auto scaling
You can configure automatic scaling for the service to enable the service to automatically adjust the resources that are used to run the online model services in EAS based on your business requirements. For more information, see the "Method 1: Manage the horizontal auto scaling feature in the console" section in the Enable or disable the horizontal auto-scaling feature topic.
Switch traffic
Follow the instructions in the following figure to switch traffic between services that are deployed by using blue-green deployment.
References
After you deploy the service, you can use Online Debugging to check whether the service runs as expected. For more information, see Online service debugging.
After you deploy a model based on the scenario-based deployment method, you can call the service to verify the model performance. For more information, see EAS use cases.
For more information about how to deploy model services in EAS, see Deploy a model service by using Machine Learning Designer or Deploy model services by using EASCMD or DSW.
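To complement the debugging and invocation topics listed above, the following hedged sketch shows the general shape of an HTTP call to a deployed EAS service: a POST request with the service token in the Authorization header. The endpoint URL and token below are placeholders; copy the real values from the service's invocation information in the console.

```python
# Hedged sketch of calling a deployed EAS service over HTTP.
# ENDPOINT and TOKEN are placeholders -- use the real values shown
# in the console for your service.
import json
import urllib.request

ENDPOINT = "http://example.cn-shanghai.pai-eas.aliyuncs.com/api/predict/demo_service"
TOKEN = "<service-token>"  # from the console; do not hard-code in production

def build_request(payload: dict) -> urllib.request.Request:
    """Build the authenticated POST request without sending it."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": TOKEN, "Content-Type": "application/json"},
        method="POST",
    )

# Sending the request requires a reachable service:
# response = urllib.request.urlopen(build_request({"input": [1.0, 2.0]}))
```

The request body format depends on the processor or image that serves the model, so check the service's documentation or use Online Debugging to confirm the expected payload.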