Build a Retrieval-Augmented Generation (RAG) System with PolarSearch

How it works

The RAG feature of PolarSearch consists of two core flows: data ingestion and question answering. The following figure shows how they work.

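Data ingestion flow:

1. Write the raw text documents to a PolarSearch index.
2. The configured ingest pipeline automatically intercepts the write request.
3. The text_embedding processor in the pipeline calls an external text embedding model (for example, text-embedding-v4) to convert the specified text field into a vector.
4. The original text and the generated vector are stored together in the vector index.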

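Question answering flow:

1. Send a search request that contains the question.
2. The search pipeline receives the request. A neural query first vectorizes the question and recalls a batch of the most semantically relevant documents from the vector index.
3. The rerank processor calls the rerank model to re-order the recalled documents and improve result relevance.
4. The retrieval_augmented_generation processor combines the question and the selected documents into a prompt.
5. The processor calls an external text generation model (for example, a Qwen model) and sends it the assembled prompt.
6. The model generates the final answer from the prompt, and PolarSearch returns it to the user.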

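Applicable scope

Feature node: add a PolarSearch search node and set an administrator account for the search node.
API key: obtain an API key for Alibaba Cloud Model Studio (Bailian).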

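Preparation: configure access credentials and environment variables

Before you begin, gather the following information and set it as environment variables. Keeping all configuration and credentials in one place means the curl commands in later steps can be copied and run without editing each one.

Variable	Description	Example value
`POLARSEARCH_HOST_PORT`	Endpoint and port of the PolarSearch node.	`pc-xxx.polardbsearch.rds.aliyuncs.com:3001`
`USER_PASSWORD`	Administrator account and password of the PolarSearch node, in `user:password` form.	`polarsearch_user:your_password`
`YOUR_API_KEY`	API key for Alibaba Cloud Model Studio (Bailian).	`sk-xxxxxxxxxxxxxxxxxxxxxxxx`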

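Procedure: run the following commands in your terminal, replacing the example values with your own.

# Set the PolarSearch endpoint and port
export POLARSEARCH_HOST_PORT="pc-xxx.polardbsearch.rds.aliyuncs.com:3001"

# Set the PolarSearch administrator account and password (user:password)
export USER_PASSWORD="polarsearch_user:your_password"

# Set your Qwen API key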
export YOUR_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxx"
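
If you script these steps instead of running curl by hand, the same variables can be read from the environment. The following is a minimal Python sketch; the helper name load_polarsearch_config is hypothetical, and splitting USER_PASSWORD at the first colon assumes that the account name itself contains no colon.

```python
import os


def load_polarsearch_config(env=None):
    """Read the three variables used in this guide (hypothetical helper)."""
    env = os.environ if env is None else env
    # USER_PASSWORD is "user:password"; split only at the first colon so a
    # password that itself contains ":" is kept intact.
    user, sep, password = env["USER_PASSWORD"].partition(":")
    if not sep:
        raise ValueError("USER_PASSWORD must look like 'user:password'")
    return {
        "base_url": "http://" + env["POLARSEARCH_HOST_PORT"],
        "auth": (user, password),
        "api_key": env["YOUR_API_KEY"],
    }
```

The returned auth tuple has the (user, password) shape that most HTTP client libraries accept for basic authentication.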

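Step 1: Configure and deploy the large language models

Register and deploy in PolarSearch the external models that the RAG flow needs: a text generation model that produces answers and a text embedding model that vectorizes text.

Configure the allowlist for external model access

For security reasons, PolarSearch requires the API endpoint of every external model service to be added to a trusted list. This example uses the official Qwen model endpoints.

Note

Before PolarSearch can access models on Alibaba Cloud Model Studio (Bailian), an access channel must be established between the VPC of the cluster that PolarSearch belongs to and the Model Studio VPC. If you need this, submit a ticket and we will set it up for you.

Command line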

curl -XPUT "http://${POLARSEARCH_HOST_PORT}/_cluster/settings" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "persistent": {
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^https://dashscope.aliyuncs.com/compatible-mode.*$",
      "^https://dashscope.aliyuncs.com/compatible-api.*$"
    ]
  }
}'

Dashboard

PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^https://dashscope.aliyuncs.com/compatible-mode.*$",
      "^https://dashscope.aliyuncs.com/compatible-api.*$"
    ]
  }
}
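
The trusted list is a set of regular expressions that connector URLs must match. As a quick local sanity check before creating a connector, the Python sketch below tests candidate URLs against the two patterns above. It only approximates the server-side check, which PolarSearch performs itself, and the helper name is_trusted is an assumption.

```python
import re

TRUSTED_PATTERNS = [
    r"^https://dashscope.aliyuncs.com/compatible-mode.*$",
    r"^https://dashscope.aliyuncs.com/compatible-api.*$",
]


def is_trusted(url, patterns=TRUSTED_PATTERNS):
    """Return True if the URL fully matches at least one trusted pattern."""
    return any(re.fullmatch(p, url) for p in patterns)


chat_url = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"
other_url = "https://example.com/v1/chat/completions"
print(is_trusted(chat_url), is_trusted(other_url))
```

A URL that fails this check locally is a hint to extend the allowlist before the connector is created.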

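Configure and deploy the text generation model (LLM)

Create a connector that calls the Qwen qwen-plus model, then register and deploy it as a model in PolarSearch for final answer generation.

Create the connector: this step defines how to connect to the external qwen-plus model.

Command line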

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/connectors/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "QWen Chat",
  "description": "The connector to QWen Chat",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "qwen-plus",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode"
  },
  "credential": {
      "api_key": "'"${YOUR_API_KEY}"'"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}",
        "content-type": "application/json"
      },
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
      }
    ]
}'

Dashboard

Important

Replace <YOUR_API_KEY> in the following command with your actual Alibaba Cloud Model Studio (Bailian) API key.

POST _plugins/_ml/connectors/_create
{
  "name": "QWen Chat",
  "description": "The connector to QWen Chat",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "qwen-plus",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode"
  },
  "credential": {
      "api_key": "<YOUR_API_KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}",
        "content-type": "application/json"
      },
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
      }
    ]
}

On success, the system returns a connector_id. Record this ID for use in later steps.

{"connector_id":"GUCpu5sBy-xxx"}
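
In the connector definition, ${parameters.*} and ${credential.*} are placeholders that are filled in when a request is sent to the model. The Python sketch below imitates that substitution to show what the final request body looks like. It is an illustration only, not the plugin's actual implementation, and render_template is a hypothetical name.

```python
import json
import re


def render_template(template, values):
    """Replace ${section.key} placeholders with values[section][key]."""
    def lookup(match):
        section, key = match.group(1), match.group(2)
        return values[section][key]
    return re.sub(r"\$\{(\w+)\.(\w+)\}", lookup, template)


request_body = '{ "model": "${parameters.model}", "messages": ${parameters.messages} }'
values = {
    "parameters": {
        "model": "qwen-plus",
        # Lists are inserted as serialized JSON, which is why the template
        # does not wrap ${parameters.messages} in quotes.
        "messages": json.dumps([{"role": "user", "content": "Hello!"}]),
    }
}
body = json.loads(render_template(request_body, values))
print(body["model"], body["messages"][0]["content"])
```

The same idea explains why string parameters such as the model name are quoted in request_body while array parameters are not.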

Register and deploy the model: this step registers the connector created in the previous step as a model.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/_register" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "QWen Chat model",
  "function_name": "remote",
  "description": "QWen Chat model",
  "connector_id": "GUCpu5sBy-xxx"
}'

Dashboard

POST _plugins/_ml/models/_register
{
  "name": "QWen Chat model",
  "function_name": "remote",
  "description": "QWen Chat model",
  "connector_id": "GUCpu5sBy-xxx"
}

On success, the system returns a model_id. Record this ID for use in later steps.

# Record the returned model_id
{"task_id":"xxx_nIW","status":"CREATED","model_id":"H0Cwu5sBy-xxx"}
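
When the workflow is scripted, the ID can be read from the JSON response instead of being copied by hand. A minimal Python sketch follows, using the sample response above as input; extract_id is a hypothetical helper.

```python
import json


def extract_id(response_text, key):
    """Return the value of `key` from a JSON response, or raise a clear error."""
    payload = json.loads(response_text)
    if key not in payload:
        raise KeyError(f"response has no {key!r}: {payload}")
    return payload[key]


register_response = '{"task_id":"xxx_nIW","status":"CREATED","model_id":"H0Cwu5sBy-xxx"}'
model_id = extract_id(register_response, "model_id")
print(model_id)
```

The same helper works for the connector_id returned when a connector is created.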

Deploy the model:

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<model ID from the previous step>/_deploy" \
--user "${USER_PASSWORD}"

Dashboard

POST _plugins/_ml/models/<model ID from the previous step>/_deploy

On success, a status of COMPLETED indicates that the model has been deployed.

{"task_id":"xxx","task_type":"DEPLOY_MODEL","status":"COMPLETED"}

Test the model: the following request lets you chat with the LLM directly and view its response.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<model ID from the previous step>/_predict" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'
{
   "parameters": {
     "messages": [
       {
         "role": "system",
         "content": "You are a helpful assistant."
       },
       {
         "role": "user",
         "content": "Hello!"
       }
     ]
   }
 }
'

Dashboard

POST _plugins/_ml/models/<model ID from the previous step>/_predict
{
   "parameters": {
     "messages": [
       {
         "role": "system",
         "content": "You are a helpful assistant."
       },
       {
         "role": "user",
         "content": "Hello!"
       }
     ]
   }
 }

The expected response is as follows:

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "choices": [
              {
                "message": {
                  "role": "assistant",
                  "content": "Hello! How can I assist you today?"
                },
                "finish_reason": "stop",
                "index": 0,
                "logprobs": null
              }
            ],
            "object": "chat.completion",
            "usage": {
              "prompt_tokens": 21,
              "completion_tokens": 20,
              "total_tokens": 41,
              "prompt_tokens_details": {
                "cached_tokens": 0
              }
            },
            "created": 1768380502,
            "system_fingerprint": null,
            "model": "qwen-plus",
            "id": "chatcmpl-908cdd4e-xxxx-xxxx-xxxx-04787257f31a"
          }
        }
      ],
      "status_code": 200
    }
  ]
}
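
The generated text sits several levels deep in the response. The Python sketch below shows one way to pull it out, assuming the response has exactly the shape shown above; first_reply is a hypothetical helper and the sample dictionary is a trimmed copy of that response.

```python
def first_reply(predict_response):
    """Return the assistant text from a _predict response shaped as above."""
    output = predict_response["inference_results"][0]["output"][0]
    return output["dataAsMap"]["choices"][0]["message"]["content"]


sample = {
    "inference_results": [{
        "output": [{
            "name": "response",
            "dataAsMap": {
                "choices": [{
                    "message": {"role": "assistant",
                                "content": "Hello! How can I assist you today?"},
                    "finish_reason": "stop",
                    "index": 0,
                }]
            },
        }],
        "status_code": 200,
    }]
}
reply = first_reply(sample)
print(reply)
```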

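Configure and deploy the text embedding model

Create another connector that calls the Qwen text-embedding-v4 model and deploy it to convert text data into vectors.

Create the connector: this step defines how to connect to the external text-embedding-v4 model. pre_process_function and post_process_function are built-in functions that automatically convert between the PolarSearch data format and the OpenAI-compatible API format.

Command line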

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/connectors/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "qwen embedding connector",
  "description": "The connector to qwen embedding model",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-v4",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode"
  },
  "credential": {
      "api_key": "'"${YOUR_API_KEY}"'"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}",
        "content-type": "application/json"
      },
      "url": "https://${parameters.endpoint}/v1/embeddings",
      "request_body": "{ \"model\": \"${parameters.model}\", \"input\": ${parameters.input} }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function":"connector.post_process.openai.embedding"
      }
    ]
}'

Dashboard

Important

Replace <YOUR_API_KEY> in the following command with your actual Alibaba Cloud Model Studio (Bailian) API key.

POST _plugins/_ml/connectors/_create
{
  "name": "qwen embedding connector",
  "description": "The connector to qwen embedding model",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-v4",
    "endpoint": "dashscope.aliyuncs.com/compatible-mode"
  },
  "credential": {
      "api_key": "<YOUR_API_KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "Authorization": "Bearer ${credential.api_key}",
        "content-type": "application/json"
      },
      "url": "https://${parameters.endpoint}/v1/embeddings",
      "request_body": "{ \"model\": \"${parameters.model}\", \"input\": ${parameters.input} }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function":"connector.post_process.openai.embedding"
      }
    ]
}

On success, the system returns a connector_id. Record this ID for use in later steps.

{"connector_id":"GUCpu5sBy-xxx"}

Register and deploy the model: this step registers the connector created in the previous step as a model.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/_register" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "name": "qwen embedding model",
  "function_name": "remote",
  "description": "Embedding model for memory",
  "connector_id": "GUCpu5sBy-xxx"
}'

Dashboard

POST _plugins/_ml/models/_register
{
  "name": "qwen embedding model",
  "function_name": "remote",
  "description": "Embedding model for memory",
  "connector_id": "GUCpu5sBy-xxx"
}

On success, the system returns a model_id. Record this ID for use in later steps.

# Record the returned model_id
{"task_id":"xxx_nIW","status":"CREATED","model_id":"J0C8u5sBy-xxx"}

Deploy the model:

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<model ID from the previous step>/_deploy" \
--user "${USER_PASSWORD}"

Dashboard

POST _plugins/_ml/models/<model ID from the previous step>/_deploy

On success, a status of COMPLETED indicates that the model has been deployed.

{"task_id":"xxx","task_type":"DEPLOY_MODEL","status":"COMPLETED"}

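Test the model: the following request sends a sample sentence to the embedding model so that you can inspect the returned vector.

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/<model ID from the previous step>/_predict" \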
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
  "text_docs":[ "Bob likes swimming. Context: He expressed his interest in swimming."],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}'

Dashboard

POST /_plugins/_ml/_predict/text_embedding/<model ID from the previous step>
{
  "text_docs":[ "Bob likes swimming. Context: He expressed his interest in swimming."],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}

The expected response is as follows:

{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [
            1024
          ],
          "data": [
            0.019832463935017586,
            -0.017113497480750084,
            ...
          ]
        }
      ],
      "status_code": 200
    }
  ]
}
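
The length of the returned vector matters because the index mapping must declare the same dimension. The Python sketch below reads the dimension from a response shaped like the one above and cross-checks it against the data array. embedding_dimension is a hypothetical helper, and the three-value sample is a stand-in for a real 1024-value response.

```python
def embedding_dimension(predict_response):
    """Return the length of the first embedding and check it against `shape`."""
    output = predict_response["inference_results"][0]["output"][0]
    declared = output["shape"][0]
    actual = len(output["data"])
    if declared != actual:
        raise ValueError(f"shape says {declared} but data has {actual} values")
    return actual


# Tiny stand-in response; the response shown above reports shape [1024].
sample = {"inference_results": [{"output": [{
    "name": "sentence_embedding", "data_type": "FLOAT32",
    "shape": [3], "data": [0.1, -0.2, 0.3]}], "status_code": 200}]}
dim = embedding_dimension(sample)
print(dim)
```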

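Step 2: Build the vector data ingestion pipeline

Create an automated flow so that, when a document is written to the index, its specified text field is vectorized automatically.

Create the ingest pipeline

Define a pipeline named nlp-ingest-pipeline. It uses the embedding model deployed in the previous step to convert the content of the text field into a vector at ingestion time and stores the result in the content_embedding field.

Command line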

curl -XPUT "http://${POLARSEARCH_HOST_PORT}/_ingest/pipeline/nlp-ingest-pipeline" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'
{
  "description": "An NLP ingest pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<ID of the text embedding model from Step 1>",
        "field_map": {
          "text": "content_embedding"
        }
      }
    }
  ]
}
'

Dashboard

PUT _ingest/pipeline/nlp-ingest-pipeline
{
  "description": "An NLP ingest pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<ID of the text embedding model from Step 1>",
        "field_map": {
          "text": "content_embedding"
        }
      }
    }
  ]
}

The expected response is as follows:

{"acknowledged": true}

Create the vector index

Create an index named my-nlp-index to store the knowledge base documents and their vectors.

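index.knn: true: enables k-NN vector search.
default_pipeline: "nlp-ingest-pipeline": binds the index to the ingest pipeline created in the previous step.
content_embedding field:
- type: "knn_vector": defines the field as a vector type.
- dimension: 1024: key parameter. The dimension must exactly match the output dimension of the embedding model in use (text-embedding-v4 in this article).
- space_type: "l2": the distance metric of the vector space; l2 is Euclidean distance. cosinesimil (cosine similarity) is also commonly used. Choose based on the model's recommendation and your business scenario.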
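
To see why the choice of space_type matters, the Python sketch below computes both metrics by hand for two toy vectors. These are the textbook formulas only; how PolarSearch turns a distance into a relevance score is not covered here.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: larger means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
same_direction_longer = [3.0, 0.0]
different_direction = [0.0, 1.0]

print(l2_distance(query, same_direction_longer))        # 2.0
print(cosine_similarity(query, same_direction_longer))  # 1.0
print(l2_distance(query, different_direction))          # about 1.414
print(cosine_similarity(query, different_direction))    # 0.0
```

With l2, the vector pointing in a different direction is the closer one (about 1.414 versus 2.0); with cosine similarity, the vector pointing in the same direction wins (1.0 versus 0.0). The two metrics can therefore rank the same candidates differently when vectors are not normalized.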

Command line

curl -XPUT "http://${POLARSEARCH_HOST_PORT}/my-nlp-index" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "text"
      },
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 1024,
        "space_type": "l2"
      },
      "text": {
        "type": "text"
      }
    }
  }
}
'

Dashboard

PUT /my-nlp-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "text"
      },
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 1024,
        "space_type": "l2"
      },
      "text": {
        "type": "text"
      }
    }
  }
}

The expected response is as follows:

{"acknowledged": true,"shards_acknowledged": true, "index": "my-nlp-index"}

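Step 3: Build the basic RAG search pipeline

Create a search pipeline named my-conversation-search-pipeline-qwen-chat that handles user queries and calls the LLM to generate answers.

The retrieval_augmented_generation processor is the core of RAG: it combines the search results with the user question and interacts with the LLM.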

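model_id: the ID of the text generation model deployed in Step 1.
context_field_list: the fields whose content is provided to the LLM as context; this example uses the original text field.
system_prompt and user_instructions: the instructions given to the LLM. Tune them to your business needs to control the style, length, and format of the answer.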

Command line

curl -XPUT "http://${POLARSEARCH_HOST_PORT}/_search/pipeline/my-conversation-search-pipeline-qwen-chat" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "Demo pipeline",
        "description": "Demo pipeline Using QWen Chat",
        "model_id": "<ID of the text generation model from Step 1>",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant.",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}
'

Dashboard

PUT _search/pipeline/my-conversation-search-pipeline-qwen-chat
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "Demo pipeline",
        "description": "Demo pipeline Using QWen Chat",
        "model_id": "<ID of the text generation model from Step 1>",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant.",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}

The expected response is as follows:

{"acknowledged": true}

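Step 4: Ingest data and test the RAG query

Write a batch of sample documents to the index and run an end-to-end RAG query to verify that the whole flow works.

Write sample documents

Use the _bulk API to write several documents about PolarDB and PolarSearch in one request. Because the index is bound to the ingest pipeline, the text field is vectorized automatically on write.

Note

The placeholder [具体年份] (a specific year) in the following text is for illustration only and does not affect functionality.

Command line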

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_bulk?refresh=true" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '
{"index": {"_index": "my-nlp-index", "_id": "1"}}
{"text": "PolarSearch是PolarDB基于Opensearch研发的高性能分布式数据检索与分析引擎，依托兼容Elasticsearch、Opensearch生态。依托于PolarStore分布式共享存储及云原生计算存储分离架构，定位服务于海量PB级异构数据的存储、分析与多路融合实时检索。产品深度融合自研的智能搜索引擎与分布式计算框架，兼容Elasticsearch DSL语法协议，支持文本文档、图片特征、日志等多模态数据的毫秒级全文检索、向量检索与智能分析，帮助企业快速构建高并发、高可用的数据搜索服务，释放数据价值。"}
{"index": {"_index": "my-nlp-index", "_id": "2"}}
{"text": "PolarDB MySQL版新增向量检索功能，完全兼容MySQL 9.0语法并支持InnoDB事务保证，同时提供100%召回率的精确KNN（基于列存加速）与高性能的近似ANN（基于HNSW等索引）检索。适用于希望在关系型数据库中进行向量检索、RAG应用的客户。"}
{"index": {"_index": "my-nlp-index", "_id": "3"}}
{"text": "[具体年份]2月26日，在阿里云PolarDB开发者大会上，云原生数据库PolarDB正式推出内置大模型的PolarDB AI版本，帮助个人和企业开发者快速部署并上线AI应用。PolarDB AI节点采用模型算子化形态，支持用户在数据库内部直接进行搜索推理优化，在线推理吞吐量可提升10倍以上，显著降低用户部署成本。"}
{"index": {"_index": "my-nlp-index", "_id": "4"}}
{"text": "阿里云瑶池数据库致力于从云原生数据底座全面演进为“AI就绪”的多模态数据底座，为客户提供了未来数据战略的坚实保障。凭借面向AI时代的多模数据管理架构、完整的Data+AI平台体系与丰富的行业实践，阿里云瑶池数据库产品家族荣获「Data & Al最具价值平台奖」。"}
{"index": {"_index": "my-nlp-index", "_id": "5"}}
{"text": "[具体年份]云栖大会的硬件基础设施展区，PolarDB磐久CXL内存池化服务器极具突破性。这是全球首款基于CXL（Compute Express Link）2.0 Switch技术的PolarDB数据库专用服务器，在英特尔至强6处理器的支持下，它用CXL技术替代了原来的RDMA网络。在相同配置下，与本地内存相比，阿里云PolarDB数据库的扩展性可提升16倍。"}
{"index": {"_index": "my-nlp-index", "_id": "6"}}
{"text": "[具体年份]2月26日，阿里云宣布PolarDB登顶全球数据库性能及性价比排行榜。根据国际数据库事务处理性能委员会（TPC）官网披露，阿里云PolarDB云原生数据库以每分钟20.55亿笔交易（tpmC）和单位成本0.8元人民币（price/tpmC）的成绩刷新TPC-C性能和性价比双榜的世界纪录。"}
'

Dashboard

POST /_bulk
{"index": {"_index": "my-nlp-index", "_id": "1"}}
{"text": "PolarSearch是PolarDB基于Opensearch研发的高性能分布式数据检索与分析引擎，依托兼容Elasticsearch、Opensearch生态。依托于PolarStore分布式共享存储及云原生计算存储分离架构，定位服务于海量PB级异构数据的存储、分析与多路融合实时检索。产品深度融合自研的智能搜索引擎与分布式计算框架，兼容Elasticsearch DSL语法协议，支持文本文档、图片特征、日志等多模态数据的毫秒级全文检索、向量检索与智能分析，帮助企业快速构建高并发、高可用的数据搜索服务，释放数据价值。"}
{"index": {"_index": "my-nlp-index", "_id": "2"}}
{"text": "PolarDB MySQL版新增向量检索功能，完全兼容MySQL 9.0语法并支持InnoDB事务保证，同时提供100%召回率的精确KNN（基于列存加速）与高性能的近似ANN（基于HNSW等索引）检索。适用于希望在关系型数据库中进行向量检索、RAG应用的客户。"}
{"index": {"_index": "my-nlp-index", "_id": "3"}}
{"text": "[具体年份]2月26日，在阿里云PolarDB开发者大会上，云原生数据库PolarDB正式推出内置大模型的PolarDB AI版本，帮助个人和企业开发者快速部署并上线AI应用。PolarDB AI节点采用模型算子化形态，支持用户在数据库内部直接进行搜索推理优化，在线推理吞吐量可提升xxx倍以上，显著降低用户部署成本。"}
{"index": {"_index": "my-nlp-index", "_id": "4"}}
{"text": "阿里云瑶池数据库致力于从云原生数据底座全面演进为“AI就绪”的多模态数据底座，为客户提供了未来数据战略的坚实保障。凭借面向AI时代的多模数据管理架构、完整的Data+AI平台体系与丰富的行业实践，阿里云瑶池数据库产品家族荣获「Data & Al最具价值平台奖」。"}
{"index": {"_index": "my-nlp-index", "_id": "5"}}
{"text": "[具体年份]云栖大会的硬件基础设施展区，PolarDB磐久CXL内存池化服务器极具突破性。这是全球首款基于CXL（Compute Express Link）2.0 Switch技术的PolarDB数据库专用服务器，在英特尔至强6处理器的支持下，它用CXL技术替代了原来的RDMA网络。在相同配置下，与本地内存相比，阿里云PolarDB数据库的扩展性可提升xxx倍。"}
{"index": {"_index": "my-nlp-index", "_id": "6"}}
{"text": "[具体年份]2月26日，阿里云宣布PolarDB登顶全球数据库性能及性价比排行榜。根据国际数据库事务处理性能委员会（TPC）官网披露，阿里云PolarDB云原生数据库以每分钟xxx亿笔交易（tpmC）和单位成本xxx元人民币（price/tpmC）的成绩刷新TPC-C性能和性价比双榜的世界纪录。"}
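
For larger document sets, the bulk body is usually generated rather than typed. The Python sketch below builds the same newline-delimited format from a list of (id, text) pairs; build_bulk_body is a hypothetical helper, and the two short documents are stand-ins for the samples above.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON _bulk body: one action line plus one source line per doc."""
    lines = []
    for doc_id, text in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # ensure_ascii=False keeps Chinese text readable in the payload.
        lines.append(json.dumps({"text": text}, ensure_ascii=False))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


body = build_bulk_body("my-nlp-index", [("1", "first document"), ("2", "第二篇文档")])
print(body)
```

Each document contributes exactly two lines, so a body for N documents contains 2N newline-terminated lines.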

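Run the RAG query

Search my-nlp-index through the my-conversation-search-pipeline-qwen-chat pipeline.

search_pipeline=my-conversation-search-pipeline-qwen-chat: specifies the search pipeline to use in the URL.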
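query.neural: issues a vector search request.
- query_text: the original question.
- model_id: the ID of the text embedding model used to convert the question text into a vector.
- k: the number of most similar documents (top-K) that vector search recalls. This value is a trade-off between answer quality and performance.
ext.generative_qa_parameters: parameters passed dynamically at query time; they can override the defaults defined in the search pipeline.
- llm_question: the user question, passed again for use by the RAG processor.
- context_size: the number of documents finally passed to the LLM.
- timeout: the timeout for the LLM call.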
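
Because the question appears twice in the request (once for retrieval and once for the LLM), generating the body from a single function avoids the two copies drifting apart. The Python sketch below does this; build_rag_query is a hypothetical helper, and its field names and default values are taken from the example request in this section.

```python
def build_rag_query(question, embedding_model_id, k=5, size=4, context_size=5, timeout=15):
    """Build the _search body used in this step (defaults mirror the example)."""
    return {
        "query": {
            "neural": {
                "content_embedding": {
                    "query_text": question,
                    "model_id": embedding_model_id,
                    "k": k,
                }
            }
        },
        "size": size,
        "_source": ["text"],
        "ext": {
            "generative_qa_parameters": {
                "llm_model": "qwen-plus",
                # The same question is sent twice: once for retrieval, once for the LLM.
                "llm_question": question,
                "context_size": context_size,
                "timeout": timeout,
            }
        },
    }


body = build_rag_query("PolarDB是否支持向量检索", "<embedding-model-id>")
```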

Command line

curl -XGET "http://${POLARSEARCH_HOST_PORT}/my-nlp-index/_search?search_pipeline=my-conversation-search-pipeline-qwen-chat" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d'
{
  "query": {
    "neural": {
      "content_embedding": {
        "query_text": "PolarDB是否支持向量检索",
        "model_id": "<ID of the text embedding model from Step 1>",
        "k": 5
      }
    }
  },
  "size": 4,
  "_source": [
    "text"
  ],
  "ext": {
    "generative_qa_parameters": {      
      "llm_model": "qwen-plus",
      "llm_question": "PolarDB是否支持向量检索",
      "context_size": 5,
      "timeout": 15
    }
  }
}
'

Dashboard

GET /my-nlp-index/_search?search_pipeline=my-conversation-search-pipeline-qwen-chat
{
  "query": {
    "neural": {
      "content_embedding": {
        "query_text": "PolarDB是否支持向量检索",
        "model_id": "<ID of the text embedding model from Step 1>",
        "k": 5
      }
    }
  },
  "size": 4,
  "_source": [
    "text"
  ],
  "ext": {
    "generative_qa_parameters": {      
      "llm_model": "qwen-plus",
      "llm_question": "PolarDB是否支持向量检索",
      "context_size": 5,
      "timeout": 15
    }
  }
}

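Expected response: besides hits (the original documents recalled by vector search), the returned JSON should contain an ext.retrieval_augmented_generation field whose answer field holds the summary answer generated by the LLM.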

{
  // ... (took, timed_out, _shards, hits)
  "hits": {
    // ... (list of recalled documents)
  },
  "ext": {
    "retrieval_augmented_generation": {
      "answer": "是的，PolarDB支持向量检索。PolarDB MySQL版提供精确KNN（基于列存加速）和高性能近似ANN（基于HNSW索引）检索，完全兼容MySQL 9.0语法，并支持InnoDB事务。此外，PolarSearch作为PolarDB生态的一部分，也支持多模态数据的向量与全文融合检索，适用于RAG等AI应用场景。"
    }
  }
}
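
An application typically shows the generated answer together with the passages it was based on. The Python sketch below extracts both from a parsed response; extract_answer is a hypothetical helper, and the sample dictionary is a shortened stand-in for the response above.

```python
def extract_answer(search_response):
    """Return (answer, source_texts) from a RAG search response."""
    answer = search_response["ext"]["retrieval_augmented_generation"]["answer"]
    hits = search_response.get("hits", {}).get("hits", [])
    sources = [hit["_source"]["text"] for hit in hits]
    return answer, sources


sample = {
    "hits": {"hits": [{"_id": "2", "_source": {"text": "PolarDB MySQL版新增向量检索功能"}}]},
    "ext": {"retrieval_augmented_generation": {"answer": "是的，PolarDB支持向量检索。"}},
}
answer, sources = extract_answer(sample)
print(answer)
```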

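(Optional) Step 5: Integrate a rerank model to improve recall quality

Add a rerank step to the basic RAG flow. Vector search (recall) quickly finds potentially relevant documents in a large data set. Reranking then applies a more precise model to that small set of candidates and re-orders them, which improves the quality of the context finally sent to the LLM.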
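
The two-stage idea can be sketched in a few lines of Python. Both scoring functions here are toy stand-ins chosen for illustration: word overlap plays the role of vector similarity, and an exact-phrase bonus plays the role of the rerank model. Neither reflects how the real models score documents.

```python
def recall_then_rerank(query, docs, recall_score, rerank_score, k, top_n):
    """Stage 1 keeps the k best docs by a cheap score; stage 2 re-orders them
    with a more precise score and keeps top_n."""
    candidates = sorted(docs, key=lambda d: recall_score(query, d), reverse=True)[:k]
    reranked = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
    return reranked[:top_n]


def word_overlap(query, doc):
    """Cheap stand-in for vector similarity: number of shared words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def phrase_bonus(query, doc):
    """Stand-in for a rerank model: overlap plus a bonus for an exact phrase match."""
    return word_overlap(query, doc) + (10 if query.lower() in doc.lower() else 0)


docs = [
    "search vector index tuning guide",
    "PolarDB supports vector search natively",
    "billing and pricing overview",
]
top = recall_then_rerank("vector search", docs, word_overlap, phrase_bonus, k=2, top_n=1)
print(top)
```

The first two documents tie on the cheap score, so recall alone cannot separate them; the second stage promotes the one that actually contains the phrase.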

Configure and deploy the rerank model

As with the previous models, create a connector and deploy the Qwen qwen3-rerank model.

Create the connector:

Command line

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/connectors/_create" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '
{
    "name": "qwen3 rerank",
    "description": "The connector to qwen3 reranker model",
    "version": "1",
    "protocol": "http",
    "credential": {
        "api_key": "'"${YOUR_API_KEY}"'"
    },
    "parameters": {
        "model": "qwen3-rerank"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://dashscope.aliyuncs.com/compatible-api/v1/reranks",
            "headers": {
                "Authorization": "Bearer ${credential.api_key}"
            },
            "request_body": "{ \"documents\": ${parameters.documents}, \"query\": \"${parameters.query}\", \"model\": \"${parameters.model}\", \"top_n\": ${parameters.top_n} }",
            "pre_process_function": "connector.pre_process.cohere.rerank",
            "post_process_function": "connector.post_process.cohere.rerank"
        }
    ]
}'

Dashboard

Important

Replace <YOUR_API_KEY> in the following command with your actual Alibaba Cloud Model Studio (Bailian) API key.

POST _plugins/_ml/connectors/_create
{
    "name": "qwen3 rerank",
    "description": "The connector to qwen3 reranker model",
    "version": "1",
    "protocol": "http",
    "credential": {
        "api_key": "<YOUR_API_KEY>"
    },
    "parameters": {
        "model": "qwen3-rerank"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://dashscope.aliyuncs.com/compatible-api/v1/reranks",
            "headers": {
                "Authorization": "Bearer ${credential.api_key}"
            },
            "request_body": "{ \"documents\": ${parameters.documents}, \"query\": \"${parameters.query}\", \"model\": \"${parameters.model}\", \"top_n\": ${parameters.top_n} }",
            "pre_process_function": "connector.pre_process.cohere.rerank",
            "post_process_function": "connector.post_process.cohere.rerank"
        }
    ]
}

On success, the system returns a connector_id. Record this ID for use in later steps.
```
{"connector_id":"JO0Dju5sBy-xxx"}
```

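Register and deploy the model: this step registers the connector created in the previous step as a model and adds the deploy=true parameter so that the model is deployed automatically.

Command line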

curl -XPOST "http://${POLARSEARCH_HOST_PORT}/_plugins/_ml/models/_register?deploy=true" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '{
    "name": "QWen rerank model",
    "function_name": "remote",
    "description": "QWen rerank model",
    "connector_id": "JO0Dju5sBy-xxx"
}'

Dashboard

POST _plugins/_ml/models/_register?deploy=true
{
    "name": "QWen rerank model",
    "function_name": "remote",
    "description": "QWen rerank model",
    "connector_id": "JO0Dju5sBy-xxx"
}

On success, the system returns a model_id. Record this ID for use in later steps.

# Record the returned model_id
{"task_id":"xxx_nIW","status":"CREATED","model_id":"PUDlu5sBy-xxx"}

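Create a search pipeline with reranking

Create a new search pipeline, my-rag-search-pipeline-with-reranker, which adds a rerank processor before the retrieval_augmented_generation processor.

rerank processor: calls the rerank model deployed in the previous step.
- document_fields: key parameter. Specifies the text field to rerank; it must match the field name text in the index.
Processing order: PolarSearch first performs vector recall, then hands the results to the rerank processor, and finally passes the reranked results to the retrieval_augmented_generation processor.

Command line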

curl -XPUT "http://${POLARSEARCH_HOST_PORT}/_search/pipeline/my-rag-search-pipeline-with-reranker" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '
{
  "response_processors": [
    {
      "rerank": {
        "ml_opensearch": {
          "model_id": "<ID of the rerank model created in the previous step>"
        },
        "context": {
          "document_fields": [
            "text"
          ]
        }
      }
    },
    {
      "retrieval_augmented_generation": {
        "tag": "Demo pipeline",
        "description": "Demo pipeline Using QWen Chat",
        "model_id": "<ID of the text generation model from Step 1>",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant.",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}
'

Dashboard

PUT _search/pipeline/my-rag-search-pipeline-with-reranker
{
  "response_processors": [
    {
      "rerank": {
        "ml_opensearch": {
          "model_id": "<ID of the rerank model created in the previous step>"
        },
        "context": {
          "document_fields": [
            "text"
          ]
        }
      }
    },
    {
      "retrieval_augmented_generation": {
        "tag": "Demo pipeline",
        "description": "Demo pipeline Using QWen Chat",
        "model_id": "<ID of the text generation model from Step 1>",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant.",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}

The expected response is as follows:

{"acknowledged": true}

Run the RAG query with reranking

Run the query through the new rerank-enabled pipeline.

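search_pipeline: set this URL parameter to the rerank-enabled pipeline created in the previous section (my-rag-search-pipeline-with-reranker). The basic RAG pipeline from Step 3 (my-conversation-search-pipeline-qwen-chat) contains no rerank processor, so a query sent through it skips reranking.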
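k: the k value for vector recall is usually set larger (for example, 50) so that the rerank model has enough candidate documents to choose from.
ext.rerank:
- top_n: the number of documents kept after reranking; these documents are provided to the LLM as the final context.

Flow walkthrough: the following query first recalls 50 documents related to the question from the index. The reranker model then picks the 5 most relevant of those 50, and finally these 5 documents are sent to the LLM to produce a summary answer.

Command line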

curl -XGET "http://${POLARSEARCH_HOST_PORT}/my-nlp-index/_search?search_pipeline=my-rag-search-pipeline-with-reranker" \
--user "${USER_PASSWORD}" \
-H 'Content-Type: application/json' \
-d '
{
  "query": {
    "neural": {
      "content_embedding": {
        "query_text": "PolarDB是否支持向量检索",
        "model_id": "<ID of the text embedding model from Step 1>",
        "k": 50
      }
    }
  },
  "size": 5,
  "_source": [
    "text"
  ],
  "ext": {
    "rerank": {
      "query_context": {
         "query_text": "PolarDB是否支持向量检索",
         "top_n": 5
      }
    },
    "generative_qa_parameters": {
      "llm_model": "qwen-plus",
      "llm_question": "PolarDB是否支持向量检索",
      "context_size": 5,
      "timeout": 15
    }
  }
}
'

Dashboard

GET /my-nlp-index/_search?search_pipeline=my-rag-search-pipeline-with-reranker
{
  "query": {
    "neural": {
      "content_embedding": {
        "query_text": "PolarDB是否支持向量检索",
        "model_id": "<ID of the text embedding model from Step 1>",
        "k": 50
      }
    }
  },
  "size": 50,
  "_source": [
    "text"
  ],
  "ext": {
    "rerank": {
      "query_context": {
         "query_text": "PolarDB是否支持向量检索",
         "top_n": 5
      }
    },
    "generative_qa_parameters": {
      "llm_model": "qwen-plus",
      "llm_question": "PolarDB是否支持向量检索",
      "context_size": 5,
      "timeout": 15
    }
  }
}
