EMR Serverless Spark lets you register external model services, including Model Studio, PAI-EAS, and self-hosted models. You can then use SQL for codeless batch sentiment analysis, content generation, tag extraction, and vector embedding, seamlessly integrating AI inference into your data processing workflows.
Procedure
This topic shows how to deploy the Qwen3-0.6B model in PAI-EAS, register it in EMR Serverless Spark, and call it for AI inference.
|
Step |
Goal |
Platform |
|
Deploy a service |
Publish a service in PAI-EAS |
Alibaba Cloud PAI console |
|
Obtain credentials |
Get the |
Alibaba Cloud PAI console |
|
Register the service |
Register the external model in EMR Serverless Spark |
EMR Serverless Spark console |
|
Call the model |
Execute ai_query() using SQL |
EMR Serverless Spark console |
If you have already deployed a service in the PAI console, you can skip to Obtain credentials.
Deploy a service
This example shows how to deploy the Qwen3-0.6B public model.
Public models include pre-configured deployment templates for one-click deployment, so you do not need to prepare model files. If you choose to deploy a custom model, you must mount the model files from a source such as OSS.
-
Log on to the PAI console. Select a region on the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).
-
On the Inference Services tab, click Components. In the Scenario-based Model Deployment section, click Deploy LLM.
-
On the Deploy LLM page, configure the following key parameters.
-
Model configuration: Select Public Model. Search for and select Qwen3-0.6B from the list.
-
Inference engine: SGLang/vLLM is recommended for its high compatibility with the OpenAI API standard. This example uses vLLM. For more information, see Choose a suitable inference engine.
-
Deployment template: Select Standalone. The template automatically provides recommended parameters, such as instance specification and image.
-
-
Click Deploy. The deployment process takes about 5 minutes. When the service status changes to Running, the deployment is successful.
NoteIf the deployment fails, see Troubleshoot service deployment and status issues.
Obtain credentials
After the service is deployed, you need to get its VPC endpoint and token. These credentials are required to register the service in EMR Serverless Spark.
-
On the Inference Services tab, click the name of your service to go to the Overview page. In the Basic Information section, click View Invocation Information.
-
In the Invocation Information dialog box, select the shared gateway tab to view and copy the public endpoint, VPC endpoint, and token.
Register the service
Register the PAI-EAS service in EMR Serverless Spark to allow the ai_query() function in Spark SQL to call it.
-
Navigate to the model service page.
-
Log on to the E-MapReduce console.
-
In the left navigation pane, choose .
-
On the Spark page, click the name of your target workspace.
-
On the EMR Serverless Spark page, click in the left navigation pane.
-
-
On the model service tab, click Create External Model Service and configure the following parameters:
Parameter
Example Value
Description
Model service name
my_qwen_serviceThis name is used in a subsequent
AI functionas the value for theendpointNameinput parameter, is unique within the workspace, and cannot be modified later.Endpoint
http://12*******39.vpc.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/<ServiceName>/v1Enter the VPC endpoint from the previous step and manually append
/v1to the end.Model name
Qwen3-0.6BThe model name used in service calls.
Model type
ChatSelect
ChatorEmbeddingbased on the deployed model type.API key
nMzI**********************Zg==Enter the
Tokenthat you obtained in the previous step.Description
Latest Qwen multimodal model service
Enter a brief description for easy identification.
-
After configuring the parameters, click create to register the model service.
Call the model
Once the model service is registered, you can use the built-in ai_query() function in EMR Serverless Spark. This function lets you call the large model service with standard SQL statements, enabling codeless AI capabilities like data masking.
Gateway-type jobs (Livy, Kyuubi) are not currently supported.
-
Create a Spark SQL job and enable the AI function
-
On the Development tab, click the
(New) icon. -
In the dialog box that appears, enter a Name, select SparkSQL as the type, and then click OK.
-
In the drop-down list in the upper-right corner, click Connect to SQL Session and configure the following information:
Parameter
Description
Engine version
Select one of the following versions.
-
esr-4.x: esr-4.6.0 or later.
-
esr-3.x: esr-3.5.0 or later.
-
esr-2.x: esr-2.9.0 or later.
Advanced configurations
In the custom configuration, add the
spark.emr.serverless.ai.function.enable trueSpark configuration to enable the AI feature. -
-
-
Write SQL to call the model
On the data development page, call the model using the ai_query() function in an SQL statement.
-- The second parameter, 'my_qwen_service', is the Model Service Name you set during registration. select ai_query('Perform data masking on the following text according to these rules: 1) Replace all Chinese names with "" 2) Keep the first 5 digits of phone numbers and replace the rest with "*" 3) Replace complete addresses with "*****" 4) Keep all other text unchanged 5) Output only the masked text without any explanation Original text: My name is Zhang San, my phone number is 12345678900, please navigate to Smart Home, Longgang District, Shenzhen City', 'my_qwen_service'); -
View the masked result
After successful execution, the result is as follows.
My name is , my phone number is 12345*****, please navigate to *****
Related documentation
-
To learn more about deploying custom models on PAI-EAS, see Quick start for Elastic Algorithm Service (EAS).
-
PAI-EAS provides a one-stop solution for LLM deployment. For details, see Deploy large language models (LLMs).