Elastic Algorithm Service (EAS) is the online prediction service of Machine Learning Platform for AI. It completes the end-to-end path from model training to production use and is intended for scenarios that involve online inference. EAS loads models on heterogeneous hardware such as CPUs and GPUs and responds to inference requests in real time.
You can use EAS to deploy a model as a RESTful API and then call the API by sending HTTP requests. EAS provides features such as auto scaling and blue-green deployment, which allow a model service to handle highly concurrent requests stably at low resource costs. EAS also supports resource group management, versioning, and resource monitoring, which help you operate model services in your business.
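As a rough illustration of calling a deployed model service over HTTP, the following Python sketch builds a POST request with the standard library. The endpoint URL, the token, the use of an `Authorization` header, and the JSON payload shape are all assumptions made for this example. The actual endpoint and token come from your service details, and the request body format depends on the processor that the service uses.

```python
import json
import urllib.request

# Hypothetical values: replace them with the endpoint and token of your own service.
ENDPOINT = "http://example.invalid/api/predict/demo_service"
TOKEN = "YOUR_SERVICE_TOKEN"


def build_request(endpoint, token, payload):
    """Build (but do not send) a POST request that carries a JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        # Assumption: the service token is passed in the Authorization header.
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


def predict(endpoint, token, payload, timeout=5.0):
    """Send the request and return the raw response bytes."""
    request = build_request(endpoint, token, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Build a request without sending it, so the sketch runs without a live service.
request = build_request(ENDPOINT, TOKEN, {"features": [1.0, 2.0, 3.0]})
print(request.get_method(), request.full_url)
```

Call `predict(ENDPOINT, TOKEN, payload)` only after you replace the placeholder values with those of a real service.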
EAS allows you to deploy model services in a shared resource group or a dedicated resource group. If you use a shared resource group, you are charged based on the amount of resources that are used by your model services. If you use a dedicated resource group, you are charged based on the Elastic Compute Service (ECS) instances in the resource group. Both the subscription and pay-as-you-go billing methods are supported. For more information about the pricing and billing rules of EAS, see Billing of EAS.
EAS is supported in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Hong Kong), Singapore, Indonesia (Jakarta), India (Mumbai), US (Virginia), and Germany (Frankfurt).
|Resource group|EAS allows you to group resources in a cluster for isolation. When you create a model service, you can deploy it in the default shared resource group or in a dedicated resource group that you purchase.|
|Model service|A model service is a resident (long-running) service that is deployed based on model files and online prediction logic. You can create, update, start, stop, scale out, and scale in model services.|
|Model file|A model file is an offline model that is obtained after offline training. Different frameworks produce models in different formats. In general, a model file is deployed together with a processor to provide a model service.|
|Processor|A processor is a package that contains online prediction logic. A processor is deployed together with a model file to provide a model service. EAS provides built-in processors for Predictive Model Markup Language (PMML), TensorFlow SavedModel, and Caffe models.|
|Custom processor|If the built-in processors of EAS cannot meet your service deployment requirements, you can use custom processors to deploy services more flexibly. EAS allows you to develop custom processors in C++, Java, and Python.|
|Service instance|You can deploy multiple service instances for each service to increase the maximum number of concurrent requests that the service can handle. If a resource group contains multiple ECS instances, EAS automatically distributes the service instances across different ECS instances to ensure high service availability.|
|High-speed direct connection|After a dedicated resource group is connected to your virtual private cloud (VPC), clients in the VPC can access the service instances of a service directly.|
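To make the idea of spreading service instances across ECS instances concrete, the following Python sketch assigns instances to nodes in round-robin order. Round-robin placement is an assumption chosen for illustration only. The scheduling policy that EAS actually uses is not described here, and the function name and node names are hypothetical.

```python
from collections import defaultdict


def spread_instances(instance_count, nodes):
    """Assign service instance indexes to nodes in round-robin order."""
    if not nodes:
        raise ValueError("at least one node is required")
    placement = defaultdict(list)
    for index in range(instance_count):
        # Consecutive instances land on different nodes, so losing one node
        # affects only a fraction of the instances.
        placement[nodes[index % len(nodes)]].append(index)
    return dict(placement)


# Hypothetical node names that stand in for ECS instances in a resource group.
placement = spread_instances(5, ["ecs-a", "ecs-b", "ecs-c"])
print(placement)
```

With five service instances and three nodes, no node holds more than two instances, so the service keeps serving requests if a single node becomes unavailable.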
Resource group management
EAS allows you to deploy models in a shared resource group that is provided by the system or a dedicated resource group that you create and manage. For more information, see Dedicated resource groups.
Model deployment methods
You can upload and deploy models on the Elastic Algorithm Service page of the Machine Learning Platform for AI console. You can also deploy models from Machine Learning Studio, from Data Science Workshop (DSW), and from on-premises clients. For more information, see Deploy models.
Model service management
- View model calling information.
- Perform online debugging.
- View logs, monitoring information, and service deployment information.
- Scale in, scale out, start, stop, and delete model services.
To support model service deployment in more environments, EAS provides the EASCMD client, which you can use to perform all service deployment operations by running commands. For more information, see Run commands to use the EASCMD client.
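A deployment is typically described by the pieces introduced earlier: a service name, a model file, a processor, and the number of service instances. The following Python sketch assembles such a description as JSON. The field names (`name`, `model_path`, `processor`, `metadata.instance`, `metadata.cpu`), the model URL, and the helper function are assumptions for illustration. See the EASCMD documentation for the format that the client actually accepts.

```python
import json


def build_service_description(name, model_path, processor, instances=1, cpu=1):
    """Return a dict that describes a model service (hypothetical schema)."""
    if instances < 1 or cpu < 1:
        raise ValueError("instances and cpu must be at least 1")
    return {
        "name": name,
        "model_path": model_path,   # where the offline model file is stored
        "processor": processor,     # package that holds the prediction logic
        "metadata": {"instance": instances, "cpu": cpu},
    }


# Hypothetical service: a PMML model served by two service instances.
description = build_service_description(
    name="demo_service",
    model_path="http://example.invalid/models/demo.pmml",
    processor="pmml",
    instances=2,
)
print(json.dumps(description, indent=2))
```

Keeping the description in a file makes a deployment repeatable: the same file can be reviewed, versioned, and reused across environments.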
- To access a deployed model service from a public endpoint, you must activate the API Gateway service.
- If you do not bind your own domain name to API Gateway, you can call the service from the public endpoint at most 1,000 times per day. To remove this limit, bind your domain name to API Gateway.