Elastic Algorithm Service (EAS) provides preset images to deploy community models and includes acceleration mechanisms for model distribution and image startup. You can quickly deploy community models to the EAS platform by configuring only a few parameters. This topic describes how to deploy community models from Hugging Face.
Background information
Open model communities, such as Hugging Face, offer a vast collection of machine learning models and code implementations. Their library APIs encapsulate models, frameworks, and the associated pre-processing and post-processing logic. This lets you perform end-to-end tasks, such as model training and inference, with only a few lines of code. You do not need to manage complex environment dependencies, framework types, or other challenges related to model deployment. This ecosystem represents an evolution of the traditional framework-model paradigm led by TensorFlow and PyTorch.
EAS is optimized to support this approach, letting you easily deploy community models.
Deploy a Hugging Face model
Platform for AI (PAI) lets you quickly deploy models from the official Hugging Face tasks library as services on EAS. Follow these steps:
Go to the official tasks library and choose a model to deploy. This topic uses the distilbert-base-uncased-finetuned-sst-2-english text classification model as an example. On the model page, find and save the values for MODEL_ID, TASK, and REVISION, as shown in the image below.
Use the following table to find the correct TASK value for deploying an EAS service. Only the TASK types listed in the table are supported.

| TASK displayed on the Hugging Face page | TASK to specify in EAS |
| --- | --- |
| Audio Classification | audio-classification |
| Automatic Speech Recognition (ASR) | automatic-speech-recognition |
| Feature Extraction | feature-extraction |
| Fill Mask | fill-mask |
| Image Classification | image-classification |
| Question Answering | question-answering |
| Summarization | summarization |
| Text Classification | text-classification |
| Sentiment Analysis | sentiment-analysis |
| Text Generation | text-generation |
| Translation | translation |
| Translation (xx-to-yy) | translation_xx_to_yy |
| Text-to-Text Generation | text2text-generation |
| Zero-Shot Classification | zero-shot-classification |
| Document Question Answering | document-question-answering |
| Visual Question Answering | visual-question-answering |
| Image-to-Text | image-to-text |
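If you script your deployments, the display-name-to-TASK mapping above can be captured in a small lookup helper. This is an illustrative sketch that mirrors the table, not an official EAS API:

```python
# Illustrative mapping of task names shown on the Hugging Face page to the
# TASK values accepted by EAS, mirroring the table above (not an official API).
HF_TO_EAS_TASK = {
    "Audio Classification": "audio-classification",
    "Automatic Speech Recognition (ASR)": "automatic-speech-recognition",
    "Feature Extraction": "feature-extraction",
    "Fill Mask": "fill-mask",
    "Image Classification": "image-classification",
    "Question Answering": "question-answering",
    "Summarization": "summarization",
    "Text Classification": "text-classification",
    "Sentiment Analysis": "sentiment-analysis",
    "Text Generation": "text-generation",
    "Translation": "translation",
    "Translation (xx-to-yy)": "translation_xx_to_yy",
    "Text-to-Text Generation": "text2text-generation",
    "Zero-Shot Classification": "zero-shot-classification",
    "Document Question Answering": "document-question-answering",
    "Visual Question Answering": "visual-question-answering",
    "Image-to-Text": "image-to-text",
}


def eas_task_for(display_name: str) -> str:
    """Return the EAS TASK value for a task name displayed on the Hugging Face page."""
    try:
        return HF_TO_EAS_TASK[display_name]
    except KeyError:
        # Only the tasks listed in the table are supported by EAS.
        raise ValueError(f"Task {display_name!r} is not supported by EAS")


print(eas_task_for("Text Classification"))  # text-classification
```

Looking up an unsupported task name raises an error, which catches typos before you create the service.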
Deploy the Hugging Face model on the EAS Model Online Service page.
Log on to the PAI console. Select a region on the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).
Click Deploy Service. In the Custom Model Deployment section, click Custom Deployment.
On the Custom Deployment page, configure the following key parameters. For more information about other parameters, see Custom deployment.

| Parameter | Description |
| --- | --- |
| **Basic Information** | |
| Service Name | Enter a custom service name as prompted. |
| **Environment Information** | |
| Deployment Method | Select Image Deployment and select Enable Web App. |
| Image Configuration | From the Official Image list, select huggingface-inference. Then, select an image name based on the version. |
| Environment Variables | Click Add and configure the following environment variables using the values from Step 1: MODEL_ID: distilbert-base-uncased-finetuned-sst-2-english. TASK: text-classification. REVISION: main. |
| Run Command | The system automatically populates the run command after you select the image. You do not need to change it. |
| **Resource Information** | |
| Configure System Disk | Set the system disk to 130 GB. |
Click Deploy. When the Service Status changes to Running, the service is deployed.
Call the deployed model service.
Invoke the service from the console
Click the name of the target service to open its details page. Then, click View Web Application in the upper-right corner.

In the Actions column of the target service, click Online Debugging. On the Body tab, enter the request data, such as {"data": ["hello"]}, and then click Send Request.

Note: The input data format ({"data": ["XXX"]}) for the text classification model is defined by the /api/predict endpoint of the Gradio framework. If you use other types of models, such as those for image classification or speech processing, refer to the /api/predict definition to build your request.
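As a hedged illustration of the note above, the sketch below builds the {"data": [...]} envelope for different input types. Gradio apps commonly accept images as base64-encoded data URLs, but the exact input form depends on your model's /api/predict definition, so treat the image example as an assumption to verify:

```python
import base64
import json


def gradio_payload(*inputs) -> str:
    """Wrap one or more inputs in the {"data": [...]} envelope used by /api/predict."""
    return json.dumps({"data": list(inputs)})


def image_as_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL, a common Gradio image input form
    (assumption: confirm against your model's /api/predict definition)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"


# Text classification: a plain string input.
text_body = gradio_payload("hello")

# Image classification (assumed format): a data URL in place of the string.
image_body = gradio_payload(image_as_data_url(b"\x89PNG..."))

print(text_body)  # {"data": ["hello"]}
```

The same envelope is used for both console debugging and API calls; only the element types inside "data" change per task.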
Invoke the service by using an API
Click the service name to open the service details page. On the Overview tab, click View Invocation Information.
In the Invocation Information dialog box, on the Shared Gateway tab, view and save the values for Endpoint and Token.
Use the following code to call the service. Replace <service_url> and <token> with the endpoint and token that you obtained in the previous step.

```python
import requests

resp = requests.post(
    url="<service_url>",
    headers={"Authorization": "<token>"},
    json={"data": ["hello"]},
)
print(resp.json())
```

Output:

```json
{
  "data": [
    {
      "label": "POSITIVE",
      "confidences": [
        {
          "label": "POSITIVE",
          "confidence": 0.9995185136795044
        }
      ]
    }
  ],
  "is_generating": false,
  "duration": 0.280987024307251,
  "average_duration": 0.280987024307251
}
```
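When you consume this response programmatically, you typically want only the top label and its confidence. A minimal sketch that parses the response shape shown above (field names are taken from the sample output):

```python
def top_prediction(resp_json: dict) -> tuple:
    """Return (label, confidence) for the highest-confidence prediction
    from a response shaped like the sample output above."""
    result = resp_json["data"][0]
    best = max(result["confidences"], key=lambda c: c["confidence"])
    return best["label"], best["confidence"]


# The sample response from the service, as a parsed dict.
sample = {
    "data": [
        {
            "label": "POSITIVE",
            "confidences": [
                {"label": "POSITIVE", "confidence": 0.9995185136795044}
            ],
        }
    ],
    "is_generating": False,
    "duration": 0.280987024307251,
    "average_duration": 0.280987024307251,
}

label, confidence = top_prediction(sample)
print(label, round(confidence, 4))  # POSITIVE 0.9995
```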