
Platform for AI: Model service deployment by using the PAI console

Last Updated: Feb 27, 2024

Elastic Algorithm Service (EAS) of Platform for AI (PAI) allows you to deploy trained models as online model services or AI-powered web applications. The models can be trained by yourself or obtained from open source communities, and EAS provides deployment methods suited to each case. This topic describes how to use the PAI console to deploy models as API services and manage EAS online services.

Prerequisites

A trained model is obtained.

Background information

You can deploy models and manage EAS online services in the console.

  • Upload and deploy models in the console

    You can deploy a model by using one of the following methods:

    • Custom deployment: Custom deployment allows you to deploy models in a more flexible manner. You can deploy a model as an AI-powered web application or an inference service by using images, models, or processors.

    • Scenario-based model deployment: EAS provides simplified, scenario-specific deployment solutions for common scenarios, such as ModelScope, Hugging Face, Triton, TensorFlow Serving, Stable Diffusion (for AI painting), and pre-trained large language models (LLMs).

  • Manage online model services

    You can manage deployed model services in the PAI console, such as viewing service details, updating service resource configurations, adding a version for a deployed model service, or scaling resources.

Upload and deploy models in the console

On the EAS-Online Model Services page, you can upload models that you trained or public models that you obtained from open source communities and deploy them as online model services.

Step 1: Go to the EAS-Online Model Services page

  1. Log on to the PAI console.

  2. In the left-side navigation pane, click Workspaces. On the Workspace list page, click the name of the workspace that you want to manage.

  3. In the left-side navigation pane, choose Model Deployment > Elastic Algorithm Service (EAS) to go to the EAS-Online Model Services page.

Step 2: Select a deployment method

  1. On the Inference Service tab, click Deploy Service.

  2. In the Select Deployment Mode dialog box, select a deployment method and click OK.

    The following deployment methods are supported:

    • Custom Deployment: provides a more flexible deployment method. This is the default value. You can quickly deploy a model as an online inference service by using a processor, or by configuring a preset image and third-party code library, mounting models and code, and running commands. For more information, see Configure parameters for custom deployment.

    • Scenario-based Model Deployment: provides the following deployment methods for general deployment scenarios. For more information about how to configure the parameters in each scenario, see Configure parameters for scenario-based model deployment.

      • ModelScope Model Deployment: deploys an open-source ModelScope model in a few clicks and starts model services.

      • Hugging Face Model Deployment: deploys an open-source Hugging Face model in a few clicks and starts model services.

      • Triton Deployment: deploys a model that is trained by using AI frameworks, such as TensorRT, TensorFlow, PyTorch, or ONNX, as an online inference service in a few clicks by using Triton Inference Server.

      • TensorFlow Serving Deployment: deploys a model in the standard SavedModel format as an online service by using the TensorFlow Serving engine.

      • AI Painting - SD Web UI Deployment: deploys an AI painting service based on the open-source Stable Diffusion web UI in a few clicks. You can call the deployed service by using the web application or API operations. EAS isolates users and computing resources to implement enterprise-level applications.

      • Large Language Model (LLM) Deployment: deploys open-source foundation models or custom models that you trained and fine-tuned, and uses the built-in inference acceleration provided by PAI-Blade to implement simplified model deployment in a cost-effective manner.

Step 3: Deploy the service

Configure the parameters based on the deployment method. After you configure the parameters, click Deploy. When the Service Status changes to Running, the service is deployed.
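
After the service is deployed, you can verify it by sending an HTTP request to its endpoint. The following is a minimal sketch that uses the requests library; the endpoint URL and token are hypothetical placeholders, and the actual request format depends on the processor or image that serves the model. You can find the real endpoint and token on the service details page.

```python
import requests

# Hypothetical values: copy the real endpoint and token from the
# service details page in the PAI console.
SERVICE_URL = "http://xxxxxx.cn-shanghai.pai-eas.aliyuncs.com/api/predict/my_service"
TOKEN = "<your-service-token>"

# The body format depends on the processor or image that serves the
# model; raw bytes are used here only for illustration.
response = requests.post(
    SERVICE_URL,
    headers={"Authorization": TOKEN},
    data=b"your model input",
    timeout=30,
)
print(response.status_code, response.content)
```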

Configure parameters for custom deployment

  1. On the Deploy Service page, select a service type.

    The following service types are supported:

    • Create Service: creates a service.

    • Update Service: adds a version for a deployed service. EAS allows you to switch between versions as needed.

    • Group Service: creates a service and adds it to a service group. You can create a service group if no service group is available. For more information, see Manage service groups.

      A service group has a unified data ingress. You can use the ingress to distribute traffic to the services in the group based on your use scenario.

    • Add Blue-green Deployment: creates a service and associates it with a deployed service. The two services are independent of each other, and you can distribute traffic between them. After you verify the performance of the associated service, you can route all traffic of the deployed service to the associated service. This ensures seamless switching from the deployed service to the associated service.

  2. In the Model Service Information section, set the following parameters.

    • Service Name: The name of the service. You need to enter a custom service name only when you set Service Type to Create Service or Group Service. If you select Update Service or Add Blue-green Deployment, select the name of a deployed service from the drop-down list.

    • Group: This parameter is required only when the Service Type is set to Group Service. Valid values:

      • New Group: creates a service group to which the service that you want to create belongs.

      • Join: selects an existing service group to which the service that you want to create belongs.

    • Deployment Method: Three deployment methods are supported: Deploy Service by Using Image, Deploy Web App by Using Image, and Deploy Service by Using Model and Processor.

      Note

      In complex model inference scenarios, such as AIGC and video processing, inference takes a long time. We recommend that you turn on the Asynchronous Service switch to implement asynchronous inference. For more information, see Asynchronous inference services. The asynchronous inference service is available only when the Deployment Method is set to Deploy Service by Using Image or Deploy Service by Using Model and Processor.

      • Deploy Service by Using Image: Select this deployment method if you want to quickly deploy AI inference services by mounting images, code, and models.

      • Deploy Web App by Using Image: Select this deployment method if you want to quickly deploy a web application by mounting images, code, and models.

      • Deploy Service by Using Model and Processor: Select this deployment method if you want to deploy AI inference services by using models and processors, such as built-in processors or custom processors. For more information, see Deploy model services by using built-in processors and Deploy services by using custom processors.

      Deploy service or web application by using image

      The following parameters are available if you set the Deployment Method parameter to Deploy Service by Using Image or Deploy Web App by Using Image.

      Select Image

      Valid values:

      • Image Address: The URL of the image that is used to deploy the model service. Example: registry.cn-shanghai.aliyuncs.com/xxx/image:tag. You can specify the address of an image provided by PAI or a custom image. For more information about how to obtain the image address, see View and add images.

        Important

        The specified image address must be in the same region as the service that you want to deploy.

        If you want to use an image from a private repository, enter the username and password of the image repository.

      • Custom Image: Select a custom image. For more information about how to create a custom image, see View and add images.

      • PAI Image: Select an Alibaba Cloud image.

      Model Settings

      Click Specify Model Settings to configure the model. You can use one of the following methods to configure model files:

      • OSS

        • The path of the source OSS bucket.

        • In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified OSS path.

      • Mount NAS File System

        • NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system. For more information about how to create a general-purpose NAS file system, see Create a NAS file system.

        • NAS Source Path: the NAS path where the files are stored.

        • Mount Path: the mount path of the service instance. The mount path is used to read files from the NAS file system.

      • Mount PAI Model

        • Set Model Name and Model Version for an existing model that you want to use. For more information about how to view registered models, see Register and manage models.

        • Mount Path: the mount path of the service instance. The mount path is used to read the model file.

      Code Settings

      Click Specify Code Settings to configure the code. You can use one of the following mounting methods to provide access to the code that is required in the service deployment process.

      • OSS

        • The path of the source OSS bucket.

        • In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified OSS path.

      • Mount NAS File System

        • NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system.

        • NAS Source Path: the NAS path where the files are stored.

        • Mount Path: the mount path of the service instance. The mount path is used to read files from the specified NAS path.

      • Mount Git Path

        • Git Repository Address: the address of the Git repository.

        • Mount Path: the mount path of the service instance. The path is used to read the code file from the Git directory.

      • Mount PAI Dataset

        • Select an existing dataset. If no dataset is available, you can click Create Dataset to create a dataset.

        • In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified PAI dataset.

      • Mount PAI Code

        • Select an existing code build. If no code build is available, you can click Create Code Build to create one.

        • In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified PAI code build.

      Third-party Libraries

      Click Specify Third-party Libraries to configure the third-party library. Valid values:

      • Third-party Libraries: Enter the third-party libraries in the field.

      • Directory of requirements.txt: Enter the path of the requirements.txt file in the field. The requirements.txt file must contain the third-party libraries that you want to install.

      Environment Variables

      Click Specify Environment Variables to configure environment variables.

      Set the variable name and variable value to add one or more environment variables.

      • Name: the name of the environment variable used to run the image.

      • Value: the value of the environment variable used to run the image.

      Command

      The command that is used to start the image. Example: python /run.py.

      You also need to enter the port number, which is the local HTTP port on which the model service listens after the image is deployed.

      Important

      You cannot specify port 8080 or 9090 because the EAS engine listens on ports 8080 and 9090.
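
      For reference, the following is a minimal sketch of an entry script for an image-based deployment, assuming a Flask HTTP server; the file name run.py, the /predict route, and port 8000 are illustrative assumptions, not a prescribed layout.

      ```python
      # run.py - minimal sketch of an HTTP entry point for an image-based
      # deployment. Flask, the /predict route, and port 8000 are
      # assumptions; any HTTP server and port work, except ports 8080 and
      # 9090, which the EAS engine reserves.
      from flask import Flask, request

      app = Flask(__name__)

      @app.route("/predict", methods=["POST"])
      def predict():
          data = request.get_data()  # raw request body sent to the service
          # Run your model on the input here and return the result.
          return data

      if __name__ == "__main__":
          # Listen on the port that you enter in the console.
          app.run(host="0.0.0.0", port=8000)
      ```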

      Deploy service by using model and processor

      The following parameters are available if you set the Deployment Method parameter to Deploy Service by Using Model and Processor.

      Model File

      Valid values:

      • OSS

        Select the OSS path that stores the model file.

      • Upload Data

        1. Select an OSS path in the current region.

        2. Click Browse Local Files and select the local model file that you want to upload. You can also directly drag and drop the model file to the blank area.

      • Publicly Accessible Download URL

        Select Publicly Accessible Download URL. Then, enter a publicly accessible URL in the field below the parameter.

      • Model selection

        Set Model Name and Model Version for an existing model that you want to use. For more information about how to view registered models, see Register and manage models.

      Processor Type

      The type of processor. You can select a built-in official processor or customize a processor based on your business requirements. For more information about built-in official processors, see Built-in processors.

      Model Type

      This parameter is required only when the Processor Type is set to EasyVision(CPU), EasyVision(GPU), EasyTransfer(CPU), EasyTransfer(GPU), EasyNLP, or EasyCV. The available model types vary based on the processor type. You can set the Processor Type and Model Type parameters based on your business requirements.

      Processor Language

      This parameter is available if you set the Processor Type parameter to Custom Processor.

      Valid values: Cpp, Java, and Python.

      Processor Package

      This parameter is available if you set the Processor Type parameter to Custom Processor. Valid values:

      • Upload Local File

        1. Select Upload Local File.

        2. Select an OSS path in the current region.

        3. Click Browse Local Files and select the local processor package that you want to upload. You can also directly drag and drop the processor package to the blank area.

          The package is uploaded to the OSS path in the current region, and the Processor Package parameter is automatically set.

          Note

          You can accelerate the loading speed of a processor during model deployment by uploading a local processor package.

      • Import OSS File

        Select Import OSS File. Then, select the OSS path in which the processor package is stored.

      • Download from Internet

        Select Download from Internet. Then, enter a public URL.

      Processor Master File

      This parameter is available if you set the Processor Type parameter to Custom Processor. This parameter specifies the main file of the processor package. For a sketch of a custom Python processor, see the example after this section.

      Mount Settings

      Click Specify Mount Settings to configure the mounting method. You can use one of the following mount methods.

      • OSS

        • The path of the source OSS bucket.

        • In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified OSS path.

      • Mount NAS File System

        • NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system.

        • NAS Source Path: the NAS path where the files are stored.

        • Mount Path: the mount path of the service instance. The mount path is used to read files from the specified NAS path.

      • Mount PAI Dataset

        • Select an existing dataset. If no dataset is available, you can click Create Dataset to create a dataset.

        • In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified PAI dataset.

      • Mount PAI Code

        • Select an existing code build. If no code build is available, you can click Create Code Build to create one.

        • In the Mount Path section, specify the mount path of the service instance. The mount path is used to read files from the specified PAI code build.

      Environment Variables

      Click Specify Environment Variables to configure environment variables.

      Set the variable name and variable value to add one or more environment variables.

      • Name: the name of the environment variable used to run the image.

      • Value: the value of the environment variable used to run the image.
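
      If you develop a custom processor in Python, the entry file typically defines a processor class that loads the model once and handles each request. The following is a minimal sketch based on the allspark framework of the EAS Python processor SDK; the class name, thread count, and echo logic are illustrative assumptions. For the authoritative interface, see Deploy services by using custom processors.

      ```python
      # Minimal sketch of a custom Python processor for EAS. The allspark
      # module comes with the EAS Python processor SDK; the class name and
      # echo behavior are illustrative assumptions.
      import allspark


      class EchoProcessor(allspark.BaseProcessor):
          def initialize(self):
              # Load the model once when the service starts, for example
              # from the path configured in Mount Settings.
              pass

          def process(self, data):
              # data is the raw request body. Return the response bytes
              # and an HTTP status code.
              return data, 200


      if __name__ == "__main__":
          # worker_threads controls request-handling concurrency.
          runner = EchoProcessor(worker_threads=4)
          runner.run()
      ```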

  3. In the Resource Deployment Information section, set the following parameters.

    Resource Group Type

    The type of resource group in which you want to deploy the model. You can deploy the model by using the public resource group or a dedicated resource group. For more information, see Work with dedicated resource groups.

    Note

    If you run a small number of tasks and do not have high requirements on latency, we recommend that you use the public resource group.

    Number Of Instances

    We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment.

    If you set Resource Group Type to a dedicated resource group, you must set the CPU, Memory (MB), and GPU parameters for each service instance.

    Resource Configuration Mode

    This parameter is available only if you set the Resource Group Type parameter to Public Resource Group. Valid values:

    • General

      You can select a single CPU or GPU instance type.

    • Cost-effective Resource Configuration

      You can configure multiple instance types or use preemptible instances. For more information, see Specify multiple instance types and Create and use preemptible instances.

      • Preemptible Instance Protection Period: You can set a 1-hour protection period for a preemptible instance. During the protection period, the system ensures that you can use the instance.

      • Deploy Resources: You can configure common instances and preemptible instances at the same time. Resources are started in the sequence in which the instance types are configured, and you can add up to five instance types. If you use preemptible instances, you must set a bid price to bid for the instances.

    Elastic Resource Pool

    You can set this parameter only when the Resource Group Type is set to a dedicated resource group.

    You can enable Elastic Resource Pool and configure your resources based on the instructions in the Resource Configuration Mode section.

    If you enable Elastic Resource Pool and the dedicated resource group that you use to deploy services is fully occupied, the system automatically adds pay-as-you-go instances to the public resource group during scale-outs. The added instances are billed as public resources. The instances in the public resource group are first released during scale-ins.

    System Disks

    This parameter is available only if you set the Resource Group Type parameter to Public Resource Group.

    Click System Disks to configure additional system disk space for the EAS service. Unit: GB. Valid values: 0 to 2000. You have a free quota of 30 GB on the system disk. For example, if you specify 20 in the field, the available storage space is 30 GB + 20 GB = 50 GB.

    Additional system disks are billed based on their capacity and usage duration. For more information, see Billing of EAS.

  4. Optional. In the VPC Settings section, set the VPC, vSwitch, and Security Group Name parameters to enable VPC for the EAS service deployed in the public resource group.

    After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC.

  5. In the Configuration Editor section, the configurations of the service are displayed in the code editor. You can add configuration items that are not included in previous steps. For more information, see the "Create a service" section in the Run commands to use the EASCMD client topic.

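    For example, the editor holds a JSON service configuration that you can adjust directly. The following sketch generates a minimal configuration of the kind described in the EASCMD topic; the service name, model path, processor, and resource values are illustrative assumptions.

    ```python
    import json

    # Minimal sketch of a service configuration of the kind shown in the
    # Configuration Editor. All values are illustrative assumptions; see
    # the "Create a service" section of the EASCMD topic for the
    # authoritative field reference.
    config = {
        "name": "demo_service",
        "model_path": "oss://examplebucket/models/demo/",
        "processor": "tensorflow_cpu_1.15",
        "metadata": {
            "instance": 2,    # number of service instances
            "cpu": 2,         # vCPUs per instance
            "memory": 4000,   # memory per instance, in MB
        },
    }

    print(json.dumps(config, indent=2))
    ```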

Configure parameters for scenario-based model deployment

The following section describes the parameters based on the scenario that you select.

ModelScope Model Deployment

Model Service Information

Service Name

The name of the service.

Select Model

Select a ModelScope model from the drop-down list.

Model Version

Select a model version from the drop-down list. By default, the latest version is used.

Model Type

After you select a model, the system automatically specifies the Model Type parameter.

Number Of Instances

Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment.

Resource Configuration

Select the instance type for model deployment based on your business requirements. Only the public resource group is supported.

VPC

Set the VPC, vSwitch, and Security Group Name parameters to enable virtual private cloud (VPC) connection for the EAS service deployed in the public resource group.

After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC.

Hugging Face Model Deployment

Model Service Information

Service Name

The name of the service.

Model ID

The ID of the Hugging Face model. Example: distilbert-base-uncased-finetuned-sst-2-english.

Model Type

The type of the Hugging Face model. Example: text-classification.

Model Version

The version of the Hugging Face model. Example: main.

Instances

Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment.

Resource Configuration

Select the instance type for model deployment based on your business requirements. Only the public resource group is supported.

VPC

Set the VPC, vSwitch, and Security Group Name parameters to enable virtual private cloud (VPC) connection for the EAS service deployed in the public resource group.

After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC.

Triton Deployment

Model Service Information

Service Name

The name of the service.

Model Settings

Make sure that the model that you want to deploy meets the structure requirements of Triton Inference Server (see the layout sketch after this section). After you prepare the model, select one of the following methods to deploy the model:

  • OSS Path: the OSS bucket directory in which the model is stored.

  • NAS File System Path

    • NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system. For more information about how to create a general-purpose NAS file system, see Create a NAS file system.

    • NAS Source Path: the source path of the model in NAS.

  • Select PAI Model: Select a registered model based on the model name and model version. For more information about how to register models, see Register and manage models.

Instances

Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment.

Resource Configuration

Select the instance type for model deployment based on your business requirements. Only the public resource group is supported.

VPC

Set the VPC, vSwitch, and Security Group Name parameters to enable virtual private cloud (VPC) connection for the EAS service deployed in the public resource group.

After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC.
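
If you deploy by using an OSS or NAS path, make sure that the directory follows the model repository layout that Triton Inference Server expects: a directory per model that contains a config.pbtxt file and a numbered version subdirectory with the model file. The following sketch creates such a layout; the model name, platform, and batch size are illustrative assumptions.

```python
from pathlib import Path

# Sketch of the repository layout that Triton Inference Server expects:
# <repository>/<model_name>/config.pbtxt plus a numbered version
# directory that holds the model file. All names are illustrative.
repo = Path("model_repository/my_onnx_model")
(repo / "1").mkdir(parents=True, exist_ok=True)
(repo / "config.pbtxt").write_text(
    'name: "my_onnx_model"\n'
    'platform: "onnxruntime_onnx"\n'
    "max_batch_size: 8\n"
)
# Place the model file at model_repository/my_onnx_model/1/model.onnx.
```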

TensorFlow Serving Deployment

Model Service Information

Service Name

The name of the service.

Deployment Mode

The following deployment methods are supported:

  • Standard Model Deployment: used to deploy a single-model service.

  • Configuration File Deployment: used to deploy a multi-model service.

Model Settings

Models deployed by using TensorFlow Serving must follow the fixed directory structure that TensorFlow Serving requires (see the export sketch after this section).

  • If you set the Deployment Mode parameter to Standard Model Deployment, you must configure the OSS bucket directory in which the model file is stored.

  • If you set the Deployment Mode parameter to Configuration File Deployment, you must configure the following parameters:

    • OSS Path: the OSS bucket directory in which the model is stored.

    • Mount Path: the mount path of the service instance. The mount path is used to read the model file.

    • Configuration File Path: the OSS path in which the configuration file is stored.

Instances

Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment.

Resource Configuration

Select the instance type for model deployment based on your business requirements. Only the public resource group is supported.

VPC

Set the VPC, vSwitch, and Security Group Name parameters to enable virtual private cloud (VPC) connection for the EAS service deployed in the public resource group.

After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC.
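
TensorFlow Serving loads models from a versioned directory, for example my_model/1/ that contains saved_model.pb and a variables/ folder. The following sketch shows one way to export a Keras model into that layout; the stand-in model and the directory name my_model are illustrative assumptions.

```python
import tensorflow as tf

# Trivial stand-in model; in practice, export your trained model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=(4,)),
])

# TensorFlow Serving expects <model_name>/<version>/ as the export
# path, such as my_model/1/ with saved_model.pb and variables/ inside.
tf.saved_model.save(model, "my_model/1")
```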

AI Painting - SD Web UI Deployment

Model Service Information

Service Name

The name of the service.

Edition

Valid values:

  • Standard Edition

    The standard edition is suitable for a single user who deploys common tests and applications. It supports calls by using both the web application and API operations (see the call sketch after this section).

  • API Edition

    The API edition is suitable for scenarios in which you integrate your business by calling API operations. The system automatically deploys the service as an asynchronous inference service. For more information, see Asynchronous inference services.

  • Cluster Edition WebUI

    The cluster edition is suitable for teamwork scenarios in which multiple members use the web application to generate images. The cluster edition ensures that each user has an independent model and output path. The backend computing resources are shared and scheduled in a centralized manner, which improves cost-effectiveness.

Specify Model Settings

You can specify model settings in the following scenarios: you want to use an open source model that you downloaded from a community or a model that you fine-tuned, you want to save the output data to your own data source, or you need to install third-party plug-ins or configurations. Valid values:

  • OSS Path: an empty file directory in the OSS bucket. For more information about how to create a bucket, see Create a bucket. For more information about how to create an empty directory, see Manage directories.

  • NAS File System Path

    • NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system.

    • NAS Source Path: the NAS path where the files are stored.

Instances

Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment.

Resource Configuration

Select the instance type for model deployment based on your business requirements. Only the public resource group is supported. We recommend the ml.gu7i.c16m60.1-gu30 instance type for cost-effectiveness.

VPC

Set the VPC, vSwitch, and Security Group Name parameters to enable virtual private cloud (VPC) connection for the EAS service deployed in the public resource group.

After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC.
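
After an AI painting service is deployed, the standard edition can also be called over HTTP through the standard SD web UI API routes. The following is a minimal sketch that assumes the common txt2img route; the service URL, token, and payload fields are hypothetical placeholders, and the exact request format depends on your deployment.

```python
import requests

# Hypothetical values: copy the real endpoint and token from the
# service details page in the PAI console.
SERVICE_URL = "http://xxxxxx.cn-shanghai.pai-eas.aliyuncs.com/api/predict/sd_demo"
TOKEN = "<your-service-token>"

# txt2img is the standard SD web UI route; the payload below is a
# minimal, illustrative subset of the supported fields.
payload = {"prompt": "a watercolor landscape", "steps": 20}

response = requests.post(
    f"{SERVICE_URL}/sdapi/v1/txt2img",
    headers={"Authorization": TOKEN},
    json=payload,
    timeout=300,
)
print(response.status_code)
```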

Large Language Model (LLM) Deployment

Model Service Information

Service Name

The name of the service.

Model Source

Valid values:

  • Open Source Model: You can select a model from the Model Type drop-down list to quickly load and deploy a built-in LLM without the need to upload your model.

  • Custom Fine-tuned Model: You must configure model settings to mount the fine-tuned model and set the parameters to deploy the model.

Model Settings

This parameter is required if you set the Model Source parameter to Custom Fine-tuned Model.

Valid values:

  • OSS Path: the OSS bucket directory in which the fine-tuned model is stored.

  • NAS File System Path

    • NAS Mount Target: the mount point of the NAS file system. The EAS service uses the mount point to access the NAS file system.

    • NAS Source Path: the source path of the NAS file system in which the fine-tuned model is stored.

  • Select PAI Model: Select a registered model based on the model name and model version. For more information about how to register models, see Register and manage models.

Model Type

Select a model category.

Instances

Default value: 1. We recommend that you specify multiple service instances to prevent risks caused by single-instance deployment.

Resource Configuration

Select the instance type for model deployment based on your business requirements. Only the public resource group is supported. We recommend the ml.gu7i.c16m60.1-gu30 instance type for cost-effectiveness.

VPC

Set the VPC, vSwitch, and Security Group Name parameters to enable virtual private cloud (VPC) connection for the EAS service deployed in the public resource group.

After you enable the feature, the ECS instances that reside in the VPC can access EAS services deployed in the public resource group by using the created elastic network interface (ENI). In addition, the EAS services can access other cloud services that reside in the VPC.

Manage online model services in EAS

On the Inference Service tab of the EAS-Online Model Services page, you can view deployed services, and stop, start, or delete services.

Warning

If you stop or delete a model service, requests that rely on the model service fail. Proceed with caution.

  • View service details

    • Click the name of the service that you want to manage to go to the Service Details page. On this page, you can view the basic information, instances, and configurations of the service.

    • On the Service Details page, you can click the other tabs to view information about service monitoring, logs, and deployment events.

  • Update service resource configurations

    On the Service Details tab, click Resource Configuration in the Resource Information section. In the Resource Configuration dialog box, update the resources that are used to run the service. For more information, see Upload and deploy models in the console.

  • Add a version for a deployed model service

    On the EAS-Online Model Services page, find the service that you want to update and click Update Service in the Actions column. For more information, see Upload and deploy models in the console.

    Warning

    When you add a version for a model service, the service is temporarily interrupted. Consequently, the requests that rely on the service fail until the service recovers. Proceed with caution.

    After you update the service, click the version number in the Current Version column to view the version information or change the service version.

  • Scale resources

    On the EAS-Online Model Services page, find the service that you want to manage and click Scale in the Actions column. In the Scale dialog box, specify the number of instances that are used to run the model service.

  • Enable auto scaling

    You can configure automatic scaling for the service to enable the service to automatically adjust the resources that are used to run the online model services in EAS based on your business requirements. For more information, see the "Method 1: Manage the horizontal auto scaling feature in the console" section in the Enable or disable the horizontal auto-scaling feature topic.

  • Switch traffic

    On the EAS-Online Model Services page, you can switch traffic between services that are deployed by using blue-green deployment.
