Online inference is a key step in applying algorithm models to business scenarios. To help users apply algorithms from end to end, Machine Learning Platform for AI (PAI) provides Elastic Algorithm Service (EAS) for online inference. EAS loads models that run on CPU or GPU resources and handles service requests in real time. EAS provides multiple methods for you to deploy models online as RESTful APIs. It also supports auto scaling and blue-green deployment, which help you run stable, highly concurrent online model services at low cost.
EAS allows you to deploy a model online as a RESTful API. You can then send HTTP requests to the API to call the service.
- Supported regions include China (Beijing), China (Shanghai), Singapore (Singapore), Germany (Frankfurt), India (Mumbai), and Indonesia (Jakarta).
- EAS supports version management, resource monitoring, blue-green deployment (rolling update), and auto scaling. You can use these features to apply models to your businesses.
- The service is charged after its status changes to Running. For more information, see Pricing.
- You must bind the service to your domain name in API Gateway. Otherwise, you can only call the service up to 1,000 times per day by using the public endpoint of the service.
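As a minimal sketch of calling a deployed service over HTTP, the following Python snippet builds a POST request with the standard library. The endpoint URL, token value, and request body are hypothetical placeholders, and passing the token in the `Authorization` header is an assumption. Use the endpoint and token shown for your service in the console, and the input format that your model expects.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the values for your service.
SERVICE_URL = "http://example.com/api/predict/my_service"
SERVICE_TOKEN = "your-service-token"


def build_request(url, token, payload):
    """Build (but do not send) a POST request for a model service."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            # Assumption: the service authenticates with a token header.
            "Authorization": token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call_service(request, timeout=10):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


request = build_request(SERVICE_URL, SERVICE_TOKEN, {"inputs": [1.0, 2.0]})
print(request.get_method(), request.full_url)
```

Building the request separately from sending it lets you inspect the headers and body before you call `call_service(request)` against a real endpoint.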
In the PAI EAS console, you can manage model services as follows:
- View model call information.
- Debug services online.
- View logs, monitoring data, and service deployment information.
- Scale services in or out, start or stop services, and delete services.
- Before you deploy models online to provide model services through PAI, you must activate Alibaba Cloud API Gateway. For more information, see API Gateway pricing.
- Billing of online prediction services in PAI: For more information, see EAS billing.
The following table defines the permissions involved in Elastic Algorithm Service (EAS).
|Category|Permission|Description|
|---|---|---|
|Model service permissions|eas:EditInstance|The write permission for creating, updating, and deleting model services.|
|Model service permissions|eas:listInstance|The list permission for listing model services on the overview page of the console.|
|Model service permissions|eas:ReadInstance|The read permission for viewing the monitoring, logs, service addresses, and online debugging of model services.|
|Model service permissions|eas:OperateInstance|The operation permission for running or stopping model services, and distributing network traffic.|
|Resource group permissions|eas:ListResourceGroup|The list permission for listing resource groups on the Overview page of the console. Resource groups can be listed and used when a service is deployed.|
|Resource group permissions|eas:ReadResourceGroup|The read permission for viewing details about a resource group, including the number of servers, models, and service status.|
|Resource group permissions|eas:OperateResourceGroup|The operation permission for creating (purchasing), deleting, renewing (applicable for pay-as-you-go resource groups), scaling out, and scaling in (applicable for pay-as-you-go resource groups) resource groups, and for activating and deactivating Virtual Private Cloud (VPC) direct connect for resource groups.|
To allow a Resource Access Management (RAM) user to perform operations on model services or resource groups in the console, you must use the Alibaba Cloud account to authorize the RAM user in Alibaba Cloud RAM.
Log on to the RAM console at https://www.aliyun.com/product/ram.
Authorize the RAM user as follows:
Step 1: Create a custom policy. For the definition of a permission policy, see the notes that follow these steps.
Step 2: Grant the custom permission policy to a RAM user.
- A permission policy is the basic unit for authorizing an Alibaba Cloud RAM user.
- A permission policy is the parent concept of the permissions defined in the preceding table. That is, a permission policy may include one or more permissions in the preceding table.
- A policy has its own name. The Alibaba Cloud account can identify policies based on their policy names and grant permissions to RAM users.
- Alibaba Cloud provides two types of policies: Alibaba Cloud’s system policies (that is, general policies), and custom policies, which are used to meet users’ special requirements for specific Alibaba Cloud products. EAS permission policies are custom policies.
Because EAS permission policies are not Alibaba Cloud system policies, you must create them as custom policies.
Log on to the RAM console. In the left-side navigation pane, choose Permissions > Policies. On the Policies page, click Create Policy. On the Create Custom Policy page, select Script for Configuration Mode and set other parameters as required.
You can then define the policy and give it a name that makes it easy to identify and grant. Define policies with caution, based on the permissions that a RAM user actually requires. One policy may include one or more permissions.
For example, to define a policy that grants RAM users both the read and write permissions for model services, include the eas:ReadInstance and eas:EditInstance permissions in the policy. You can name the policy Model_R&W. After the Alibaba Cloud account grants the “Model_R&W” permission policy to a RAM user, the RAM user has both the read and write permissions for model services.
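A minimal sketch of the policy script for this example is shown below. It uses Python only to assemble and print the JSON document that you would paste into the script editor. The two action names come from the preceding permissions table. The `Version` value, the `Allow` effect, and the wildcard `Resource` follow the general RAM policy format and are assumptions about how you want to scope the policy, so narrow `Resource` if your account requires it.

```python
import json


def build_policy(actions):
    """Return a RAM policy document that allows the given actions."""
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: the policy applies to all resources.
                "Resource": "*",
            }
        ],
    }


# Read and write permissions for model services, as in Model_R&W.
model_rw_policy = build_policy(["eas:ReadInstance", "eas:EditInstance"])
print(json.dumps(model_rw_policy, indent=2))
```

The same helper can produce other policies from the table, for example by passing `["eas:ListResourceGroup", "eas:ReadResourceGroup"]` for read-only access to resource groups.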
After defining the policy, you can grant permissions to the RAM user.
On the Users page, find the target RAM user and click Add Permissions in the Actions column.
In the Add Permissions pane, select Custom Policy and find the custom policy you just defined. Select the policy for authorization.
To create and purchase a resource group, you need to grant both the eas:OperateResourceGroup permission and the finance permission to the RAM user. Otherwise, you cannot use the RAM user to place an order for payment when purchasing the resource group.
The finance permission AliyunFinanceConsoleFullAccess is a system policy that you do not need to define but can be directly granted to the RAM user.
Log on to the RAM console at https://ram.console.aliyun.com/users. On the Users page, find the target RAM user and click Add Permissions in the Actions column.
In the Add Permissions pane, select System Policy, click the AliyunFinanceConsoleFullAccess policy to add it to the Selected section, and then click OK.
To allow a RAM user to deploy model services, you must also log on to the DTplus console and bind the AccessKey pair of the RAM user in User Info. With the Alibaba Cloud account, you can view the AccessKey pair of the RAM user in User Management. If the RAM user does not have an AccessKey pair, click Create AccessKey to create one, and then send the AccessKey pair to the RAM user.
After you bind the AccessKey pair, you can then use the RAM user to create model services and perform other operations in EAS.