
PolarDB:Integrate external model services

Last Updated: Mar 28, 2026

PolarSearch uses the ML Plugin's connector framework to connect to external model services. This guide shows you how to integrate external services, such as self-hosted models or Alibaba Cloud Model Studio (Bailian), with PolarSearch to perform text embedding, result reranking, and large language model (LLM) inference.

Overview

The integration uses a connector architecture. You first create a connector to define how PolarSearch communicates with the external model's API. Then, you register the connector as a model in PolarSearch and deploy it. After deployment, you can start sending requests to the model.

  1. Prepare your environment: Configure credentials and enable outbound network access.

  2. Create a connector: Define the API endpoint, authentication method, and request/response format for each model type.

  3. Register and deploy the model: Register the connector as a callable model in PolarSearch.

  4. Test the model: Verify end-to-end connectivity by sending a test request.

Step 1: Prepare your environment

Configure access credentials and environment variables

Before you begin, prepare the following information and set it as environment variables. Centralizing these values simplifies the following curl commands and makes them easier to copy and run.

  • POLARSEARCH_HOST_PORT: The connection address of your PolarSearch node. For more information, see Get connection string. Example: pc-xxx.polardbsearch.rds.aliyuncs.com:3001

  • USER_PASSWORD: The username and password of the PolarSearch node's admin account, in the username:password format. Example: polarsearch_user:your_password

  • YOUR_API_KEY: The API key for the external model service. If you use Alibaba Cloud Model Studio (Bailian), see Get API key. If your self-hosted model service does not require authentication, you can skip setting this variable. Example: sk-xxxxxxxxxxxxxxxxxxxxxxxx

Procedure: Run the following commands in your terminal, replacing the example values with your actual information.

# Set the PolarSearch host and port
export POLARSEARCH_HOST_PORT="pc-xxx.polardbsearch.rds.aliyuncs.com:3001"

# Set the PolarSearch admin username and password
export USER_PASSWORD="polarsearch_user:your_password"

# Set your Alibaba Cloud Model Studio (Bailian) API key
export YOUR_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxx"

Add trusted connector endpoints

For security, PolarSearch blocks access to external networks by default. You must modify the cluster settings to add the model service's endpoint to a trusted list, which allows PolarSearch to call it.

  • Self-hosted models: Ensure that your self-hosted model and PolarSearch are in the same Virtual Private Cloud (VPC).

  • Alibaba Cloud Model Studio (Bailian): We recommend accessing the model service over a private network by using a PrivateLink endpoint.

    • When you create the endpoint, ensure that the endpoint and PolarSearch are in the same Virtual Private Cloud (VPC).

    • After you create the endpoint, you get an endpoint domain name, such as ep-xxx.dashscope.cn-beijing.privatelink.aliyuncs.com. Add this domain to the trusted list.

Command line

curl -XPUT "https://${POLARSEARCH_HOST_PORT}/_cluster/settings" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "persistent": {
    "plugins.ml_commons.connector.private_ip_enabled": true,
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^http://ep-.*\\.dashscope\\.cn-beijing\\.privatelink\\.aliyuncs\\.com/.*$"
    ]
  }
}'

Dashboard

PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.connector.private_ip_enabled": true,
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^http://ep-.*\\.dashscope\\.cn-beijing\\.privatelink\\.aliyuncs\\.com/.*$"
    ]
  }
}
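Before you apply the setting, you can verify locally that the regular expression actually matches the URL that your connectors will call. The following is a quick sketch using grep -E; the ep-12345 endpoint domain is a placeholder, so substitute your own endpoint:

```shell
# Unescaped form of the trusted_connector_endpoints_regex entry.
REGEX='^http://ep-.*\.dashscope\.cn-beijing\.privatelink\.aliyuncs\.com/.*$'

# The full URL a connector will call (placeholder endpoint domain).
URL='http://ep-12345.dashscope.cn-beijing.privatelink.aliyuncs.com/compatible-mode/v1/embeddings'

if printf '%s\n' "$URL" | grep -Eq "$REGEX"; then
  echo "trusted: $URL"
else
  echo "NOT trusted: $URL"
fi
```

If the check prints "NOT trusted", adjust the regular expression before writing it to the cluster settings; an unmatched endpoint causes connector calls to be rejected.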

Step 2: Create a connector

A connector defines how PolarSearch communicates with an external model service, including the API endpoint, authentication method, and request/response format. You must create a separate connector for each model type you want to use.

The following examples use Alibaba Cloud Model Studio (Bailian) (DashScope). To connect to other service providers, adjust the endpoint, model name, and request format accordingly.

Text embedding connector

A text embedding connector converts text into vector representations for semantic search. The following is an example of a connector for the text-embedding-v4 model, created using Alibaba Cloud Model Studio (Bailian):

Note

If your self-hosted embedding model does not require a credential, you can omit the credential and headers.Authorization fields.

Command line

curl -XPOST "https://${POLARSEARCH_HOST_PORT}/_plugins/_ml/connectors/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
    "name": "text-embedding connector",
    "description": "text-embedding connector",
    "version": "1",
    "protocol": "http",
    "parameters": {
        "endpoint": "<your_model_endpoint>",
        "model": "text-embedding-v4"
    },
    "credential": {
        "api_key": "'"${YOUR_API_KEY}"'"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "headers": {
                "Authorization": "Bearer ${credential.api_key}"
            },
            "url": "http://${parameters.endpoint}/compatible-mode/v1/embeddings",
            "request_body": "{ \"model\": \"${parameters.model}\", \"input\": ${parameters.input} }",
            "pre_process_function": "connector.pre_process.openai.embedding",
            "post_process_function": "connector.post_process.openai.embedding"
        }
    ]
}'

Dashboard

POST /_plugins/_ml/connectors/_create
{
    "name": "text-embedding connector",
    "description": "text-embedding connector",
    "version": "1",
    "protocol": "http",
    "parameters": {
        "endpoint": "<your_model_endpoint>",
        "model": "text-embedding-v4"
    },
    "credential": {
        "api_key": "<your_api_key>"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "headers": {
                "Authorization": "Bearer ${credential.api_key}"
            },
            "url": "http://${parameters.endpoint}/compatible-mode/v1/embeddings",
            "request_body": "{ \"model\": \"${parameters.model}\", \"input\": ${parameters.input} }",
            "pre_process_function": "connector.pre_process.openai.embedding",
            "post_process_function": "connector.post_process.openai.embedding"
        }
    ]
}

Upon successful execution, the connector_id is returned. Record this ID for use in subsequent steps.

{
  "connector_id": "zocsGp0BFhPfW-xxxxxx"
}
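If you script these steps, you can capture the connector_id into a shell variable instead of copying it by hand. The following is a minimal sketch using sed; the RESPONSE literal below is the sample response shown above, and in practice you would capture the output of the curl command instead:

```shell
# Sample create-connector response; in practice, capture the curl output here.
RESPONSE='{"connector_id":"zocsGp0BFhPfW-xxxxxx"}'

# Extract the connector_id value from the JSON response.
CONNECTOR_ID=$(printf '%s' "$RESPONSE" | sed -n 's/.*"connector_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
echo "CONNECTOR_ID=$CONNECTOR_ID"
```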

Reranking connector

A reranking connector reorders search results based on relevance to improve the quality of results in a Retrieval-Augmented Generation (RAG) workflow. The following example shows a connector for the qwen3-rerank model created in Alibaba Cloud Model Studio (Bailian):

Command line

curl -XPOST "https://${POLARSEARCH_HOST_PORT}/_plugins/_ml/connectors/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "text-rerank connector",
  "description": "text-rerank connector",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "endpoint": "<your_model_endpoint>",
    "model": "qwen3-rerank",
    "query": "",
    "documents": []
  },
  "credential": {
    "api_key": "'"${YOUR_API_KEY}"'"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}"
      },
      "url": "http://${parameters.endpoint}/api/v1/services/rerank/text-rerank/text-rerank",
      "request_body": "{ \"model\": \"${parameters.model}\", \"input\": { \"query\": \"${parameters.query}\", \"documents\": ${parameters.documents} } }"
    }
  ]
}'

Dashboard

POST /_plugins/_ml/connectors/_create
{
  "name": "text-rerank connector",
  "description": "text-rerank connector",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "endpoint": "<your_model_endpoint>",
    "model": "qwen3-rerank",
    "query": "",
    "documents": []
  },
  "credential": {
    "api_key": "<your_api_key>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}"
      },
      "url": "http://${parameters.endpoint}/api/v1/services/rerank/text-rerank/text-rerank",
      "request_body": "{ \"model\": \"${parameters.model}\", \"input\": { \"query\": \"${parameters.query}\", \"documents\": ${parameters.documents} } }"
    }
  ]
}

Large language model (LLM) connector

A large language model (LLM) connector calls an LLM to generate text. This supports conversational AI and content generation in PolarSearch. The following example creates a connector for the qwen-max model using Alibaba Cloud Model Studio (Bailian):

Command line

curl -XPOST "https://${POLARSEARCH_HOST_PORT}/_plugins/_ml/connectors/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
    "name": "qwen-llm connector",
    "description": "qwen-llm connector",
    "version": "1",
    "protocol": "http",
    "parameters": {
        "endpoint": "<your_model_endpoint>",
        "model": "qwen-max"
    },
    "credential": {
        "api_key": "'"${YOUR_API_KEY}"'"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "headers": {
                "Authorization": "Bearer ${credential.api_key}"
            },
            "url": "http://${parameters.endpoint}/compatible-mode/v1/chat/completions",
            "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
        }
    ]
}'

Dashboard

POST /_plugins/_ml/connectors/_create
{
    "name": "qwen-llm connector",
    "description": "qwen-llm connector",
    "version": "1",
    "protocol": "http",
    "parameters": {
        "endpoint": "<your_model_endpoint>",
        "model": "qwen-max"
    },
    "credential": {
        "api_key": "<your_api_key>"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "headers": {
                "Authorization": "Bearer ${credential.api_key}"
            },
            "url": "http://${parameters.endpoint}/compatible-mode/v1/chat/completions",
            "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
        }
    ]
}

Step 3: Register and deploy the model

After creating a connector, register it as a model in PolarSearch and then deploy it. The model must be deployed before it can process prediction requests. Repeat the registration and deployment for each connector that you want to use, such as the embedding, reranking, and LLM connectors.

Register the model

This example uses the connector created in Step 2 to register a remote model.

Command line

curl -XPOST "https://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/_register" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "embedding model",
  "function_name": "remote",
  "description": "embedding model description",
  "connector_id": "<connector_id>"
}'

Dashboard

POST /_plugins/_ml/models/_register
{
  "name": "embedding model",
  "function_name": "remote",
  "description": "embedding model description",
  "connector_id": "<connector_id>"
}

A successful operation returns a model_id. Record this ID for subsequent deployment and prediction requests.

{
  "task_id": "wv83Gp0Bd4rNxxxx",
  "status": "CREATED",
  "model_id": "w_83Gp0Bdxxxx"
}
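As with the connector_id, the model_id can be captured into a shell variable for the deployment and prediction calls. The following is a minimal sketch using sed; the RESPONSE literal is the sample response shown above, and in practice you would capture the curl output instead:

```shell
# Sample register-model response; in practice, capture the curl output here.
RESPONSE='{"task_id":"wv83Gp0Bd4rNxxxx","status":"CREATED","model_id":"w_83Gp0Bdxxxx"}'

# Extract the model_id value from the JSON response.
MODEL_ID=$(printf '%s' "$RESPONSE" | sed -n 's/.*"model_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
echo "MODEL_ID=$MODEL_ID"
```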

Deploy the model

Command line

curl -XPOST "https://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<model_id>/_deploy" \
--user "${USER_PASSWORD}"

Dashboard

POST /_plugins/_ml/models/<model_id>/_deploy

If the operation is successful, the response contains "status": "COMPLETED". This indicates that the model is deployed and ready for use.

{
  "task_id": "0Yc4Gp0BFhPfW-xxxx",
  "task_type": "DEPLOY_MODEL",
  "status": "COMPLETED"
}

Step 4: Test the model

After you deploy the model, send a prediction request to verify the end-to-end connectivity between PolarSearch and the external model service.

Test the text embedding model

The following example sends a text snippet to the embedding model to verify that it returns a vector:

Command line

curl -XPOST "https://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<embedding_model_id>/_predict" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "parameters": {
    "input": ["hello world"]
  }
}'

Dashboard

POST /_plugins/_ml/models/<embedding_model_id>/_predict
{
  "parameters": {
    "input": ["hello world"]
  }
}

If the command runs successfully and the inference_results array in the response contains vector data, the model is successfully integrated.
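To confirm that real vector data came back, you can count the returned dimensions. The following sketch parses an abbreviated sample response with python3; the exact response schema can vary by plugin version, so treat the field layout as illustrative:

```shell
# Abbreviated sample _predict response containing a 3-dimensional vector.
RESPONSE='{"inference_results":[{"output":[{"name":"sentence_embedding","data":[0.12,-0.03,0.88]}]}]}'

printf '%s' "$RESPONSE" | python3 -c '
import json, sys

result = json.load(sys.stdin)
# The embedding values are returned in the "data" array of the first output.
vector = result["inference_results"][0]["output"][0]["data"]
print("dimensions:", len(vector))
'
```

The reported dimension count should match the output dimension of your embedding model.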

Test the rerank model

The following example sends a query and a set of candidate documents to the rerank model:

Command line

curl -XPOST "https://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<rerank_model_id>/_predict" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "parameters": {
    "query": "What is machine learning?",
    "documents": [
      "Machine learning is a subfield of artificial intelligence.",
      "Deep learning uses neural networks.",
      "The weather is nice today."
    ],
    "top_n": 3
  }
}'

Dashboard

POST /_plugins/_ml/models/<rerank_model_id>/_predict
{
  "parameters": {
    "query": "What is machine learning?",
    "documents": [
      "Machine learning is a subfield of artificial intelligence.",
      "Deep learning uses neural networks.",
      "The weather is nice today."
    ],
    "top_n": 3
  }
}

If the command runs successfully and the response returns a list of documents sorted by relevance with a relevance_score for each document, the model is successfully integrated.
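On the consuming side, a RAG pipeline typically just orders the candidates by their relevance_score. The following sketch sorts an abbreviated sample result, where each index refers back to a document in the input list; the field names follow the response format described above and are illustrative:

```shell
# Abbreviated sample rerank result; "index" points into the input documents.
RESPONSE='{"results":[{"index":2,"relevance_score":0.01},{"index":0,"relevance_score":0.92},{"index":1,"relevance_score":0.55}]}'

printf '%s' "$RESPONSE" | python3 -c '
import json, sys

results = json.load(sys.stdin)["results"]
# Print document indexes from most to least relevant.
for r in sorted(results, key=lambda x: -x["relevance_score"]):
    print(r["index"], r["relevance_score"])
'
```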

Test the large language model

The following example sends a conversation message to the large language model (LLM):

Command line

curl -XPOST "https://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<llm_model_id>/_predict" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "parameters": {
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is PolarSearch?"}
    ]
  }
}'

Dashboard

POST /_plugins/_ml/models/<llm_model_id>/_predict
{
  "parameters": {
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is PolarSearch?"}
    ]
  }
}

If the command runs successfully and the inference_results array in the response contains a text reply from the model, the model is successfully integrated.
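Because the connector targets an OpenAI-compatible chat endpoint, the reply text from the model service sits under choices[0].message.content. The following sketch pulls it out of an abbreviated sample payload; the wrapping inference_results layer that PolarSearch adds is omitted here for brevity:

```shell
# Abbreviated sample chat-completion payload from the model service.
RESPONSE='{"choices":[{"message":{"role":"assistant","content":"This is a sample reply."}}]}'

printf '%s' "$RESPONSE" | python3 -c '
import json, sys

# The generated text is the content of the first choice message.
reply = json.load(sys.stdin)["choices"][0]["message"]["content"]
print(reply)
'
```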

(Optional) Clean up resources

When model resources are no longer needed, clean them up in the following order to avoid dependency errors:

Important

You cannot delete a deployed model directly. Undeploy the model first.

  1. Undeploy the model: Undeploying a model changes its status from DEPLOYED to UNDEPLOYED. The model will no longer accept prediction requests.

    POST /_plugins/_ml/models/<model_id>/_undeploy
  2. Delete the model: Deleting a registered model removes its registration information.

    DELETE /_plugins/_ml/models/<model_id>
  3. Delete the connector: You can delete a connector if it is not used by any other models. The delete operation fails if any model depends on the connector.

    DELETE /_plugins/_ml/connectors/<connector_id>
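The three cleanup calls can be scripted so that they always run in the required order. The following is a minimal sketch; the IDs are placeholders, and the commented curl line shows how each call would be issued for real:

```shell
MODEL_ID="<model_id>"             # placeholder: your deployed model ID
CONNECTOR_ID="<connector_id>"     # placeholder: your connector ID

# Dependency order: undeploy the model, delete the model, then delete the connector.
for call in \
  "POST /_plugins/_ml/models/${MODEL_ID}/_undeploy" \
  "DELETE /_plugins/_ml/models/${MODEL_ID}" \
  "DELETE /_plugins/_ml/connectors/${CONNECTOR_ID}"
do
  method=${call%% *}   # HTTP method (text before the first space)
  path=${call#* }      # request path (text after the first space)
  echo "$method $path"
  # curl -X"$method" "https://${POLARSEARCH_HOST_PORT}${path}" --user "${USER_PASSWORD}"
done
```

In a real script, check each response before moving on, because deleting the model fails while it is still deployed, and deleting the connector fails while a model still references it.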