All Products
Search
Document Center

E-MapReduce:Integrate EMR Serverless Spark with an external model service

Last Updated:Jun 18, 2026

EMR Serverless Spark lets you register external model services, including Model Studio, PAI-EAS, and self-hosted models. You can then use SQL for codeless batch sentiment analysis, content generation, tag extraction, and vector embedding, seamlessly integrating AI inference into your data processing workflows.

Procedure

This topic shows how to deploy the Qwen3-0.6B model in PAI-EAS, register it in EMR Serverless Spark, and call it for AI inference.

Step

Goal

Platform

Deploy a service

Publish a service in PAI-EAS

Alibaba Cloud PAI console

Obtain credentials

Get the endpoint and token

Alibaba Cloud PAI console

Register the service

Register the external model in EMR Serverless Spark

EMR Serverless Spark console

Call the model

Execute ai_query() using SQL

EMR Serverless Spark console

If you have already deployed a service in the PAI console, you can skip to Obtain credentials.

Deploy a service

This example shows how to deploy the Qwen3-0.6B public model.

Note

Public models include pre-configured deployment templates for one-click deployment, so you do not need to prepare model files. If you choose to deploy a custom model, you must mount the model files from a source such as OSS.

  1. Log on to the PAI console. Select a region on the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).

  2. On the Inference Services tab, click Components. In the Scenario-based Model Deployment section, click Deploy LLM.

  3. On the Deploy LLM page, configure the following key parameters.

    • Model configuration: Select Public Model. Search for and select Qwen3-0.6B from the list.

    • Inference engine: SGLang/vLLM is recommended for its high compatibility with the OpenAI API standard. This example uses vLLM. For more information, see Choose a suitable inference engine.

    • Deployment template: Select Standalone. The template automatically provides recommended parameters, such as instance specification and image.

  4. Click Deploy. The deployment process takes about 5 minutes. When the service status changes to Running, the deployment is successful.

    Note

    If the deployment fails, see Troubleshoot service deployment and status issues.

Obtain credentials

After the service is deployed, you need to get its VPC endpoint and token. These credentials are required to register the service in EMR Serverless Spark.

  1. On the Inference Services tab, click the name of your service to go to the Overview page. In the Basic Information section, click View Invocation Information.

  2. In the Invocation Information dialog box, select the shared gateway tab to view and copy the public endpoint, VPC endpoint, and token.

Register the service

Register the PAI-EAS service in EMR Serverless Spark to allow the ai_query() function in Spark SQL to call it.

  1. Navigate to the model service page.

    1. Log on to the E-MapReduce console.

    2. In the left navigation pane, choose EMR Serverless > Spark.

    3. On the Spark page, click the name of your target workspace.

    4. On the EMR Serverless Spark page, click AI center > > model service in the left navigation pane.

  2. On the model service tab, click Create External Model Service and configure the following parameters:

    Parameter

    Example Value

    Description

    Model service name

    my_qwen_service

    This name is used in a subsequent AI function as the value for the endpointName input parameter, is unique within the workspace, and cannot be modified later.

    Endpoint

    http://12*******39.vpc.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/<ServiceName>/v1

    Enter the VPC endpoint from the previous step and manually append /v1 to the end.

    Model name

    Qwen3-0.6B

    The model name used in service calls.

    Model type

    Chat

    Select Chat or Embedding based on the deployed model type.

    API key

    nMzI**********************Zg==

    Enter the Token that you obtained in the previous step.

    Description

    Latest Qwen multimodal model service

    Enter a brief description for easy identification.

  3. After configuring the parameters, click create to register the model service.

Call the model

Once the model service is registered, you can use the built-in ai_query() function in EMR Serverless Spark. This function lets you call the large model service with standard SQL statements, enabling codeless AI capabilities like data masking.

Note

Gateway-type jobs (Livy, Kyuubi) are not currently supported.

  1. Create a Spark SQL job and enable the AI function

    1. On the Development tab, click the image (New) icon.

    2. In the dialog box that appears, enter a Name, select SparkSQL as the type, and then click OK.

    3. In the drop-down list in the upper-right corner, click Connect to SQL Session and configure the following information:

      Parameter

      Description

      Engine version

      Select one of the following versions.

      • esr-4.x: esr-4.6.0 or later.

      • esr-3.x: esr-3.5.0 or later.

      • esr-2.x: esr-2.9.0 or later.

      Advanced configurations

      In the custom configuration, add the spark.emr.serverless.ai.function.enable true Spark configuration to enable the AI feature.

  2. Write SQL to call the model

    On the data development page, call the model using the ai_query() function in an SQL statement.

    -- The second parameter, 'my_qwen_service', is the Model Service Name you set during registration.
    select ai_query('Perform data masking on the following text according to these rules:
    1) Replace all Chinese names with ""
    2) Keep the first 5 digits of phone numbers and replace the rest with "*"
    3) Replace complete addresses with "*****"
    4) Keep all other text unchanged
    5) Output only the masked text without any explanation
    Original text: My name is Zhang San, my phone number is 12345678900, please navigate to Smart Home, Longgang District, Shenzhen City', 'my_qwen_service');
  3. View the masked result

    After successful execution, the result is as follows.

    My name is , my phone number is 12345*****, please navigate to *****

Related documentation