OpenSearch: Unified Q&A API

Last Updated: Nov 04, 2025

This topic describes the MultiSearch unified question and answer (Q&A) API. You can use this API to retrieve content from a knowledge base and perform text-based and table-based Q&A queries.

Prerequisites

  • Obtain an API key. You must use this key for identity authentication when you call operations of OpenSearch LLM-Based Conversational Search. For more information, see Manage API keys.

  • Obtain a service endpoint. You must provide a service endpoint when you call the operations of OpenSearch LLM-Based Conversational Search. For more information, see Obtain service endpoints.

Precautions

  • If a custom table exists, a table-based Q&A query is performed first. Otherwise, a text-based Q&A query is performed.

  • If only one custom table exists, the Q&A query is performed on that table. If multiple custom tables exist, the system automatically selects the table most relevant to the user's question and performs the Q&A query on that table.

  • If the table-based Q&A query returns no results, a text-based Q&A query is performed.

  • If you use this API and multiple custom tables exist, ensure that the tables are highly distinct from each other. Otherwise, incorrect answers may be returned.
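The precedence rules above can be summarized as a small decision function. The following Python sketch only illustrates the documented order. The function name and its two inputs are hypothetical, and the actual routing, including how the most relevant table is selected, happens on the server.

```python
def resolve_qa_path(custom_table_count: int, table_query_has_results: bool) -> str:
    """Return "table" or "text" following the documented fallback order."""
    if custom_table_count < 0:
        raise ValueError("custom_table_count must not be negative")
    # No custom table exists: a text-based Q&A query is performed.
    if custom_table_count == 0:
        return "text"
    # At least one custom table exists: the table-based query runs first,
    # and the text-based query is the fallback when it returns no results.
    return "table" if table_query_has_results else "text"
```

For example, `resolve_qa_path(3, False)` yields `"text"`, which mirrors the fallback described in the third precaution.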

API information

Request method: POST

Protocol: HTTP

Request data format: JSON

Request URL

{host}/v3/openapi/apps/[app_group_identity]/actions/multi-search
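As a minimal sketch, the request URL can be assembled from the service endpoint and the application name. The helper name is hypothetical, and the code assumes that `host` is the service endpoint including its scheme and that `app_group_identity` is the name of your application.

```python
from urllib.parse import quote


def build_multi_search_url(host: str, app_group_identity: str) -> str:
    """Fill in the {host} and [app_group_identity] placeholders of the request URL."""
    if not host or not app_group_identity:
        raise ValueError("host and app_group_identity are both required")
    # Drop a trailing slash so the joined URL contains exactly one separator.
    base = host.rstrip("/")
    # Percent-encode the application name in case it contains reserved characters.
    app = quote(app_group_identity, safe="")
    return f"{base}/v3/openapi/apps/{app}/actions/multi-search"
```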

Request parameters

Header parameters

Parameter

Type

Required

Description

Example

Content-Type

string

Yes

The data format of the request. Set the value to "application/json".

application/json

Authorization

string

Yes

The API key for request authentication. The value must start with Bearer.

Bearer OS-d1**2a

accept

string

No

For Server-Sent Events (SSE) requests, set the value to "text/event-stream".

text/event-stream
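The header parameters above could be assembled as follows. This is a sketch, not an SDK call: the function name is hypothetical, and the API key shown in the usage note is a placeholder.

```python
def build_headers(api_key: str, sse: bool = False) -> dict:
    """Build the request headers listed in the header parameters table."""
    if not api_key:
        raise ValueError("api_key is required")
    headers = {
        "Content-Type": "application/json",
        # The Authorization value must start with "Bearer".
        "Authorization": f"Bearer {api_key}",
    }
    if sse:
        # Only needed for Server-Sent Events (SSE) requests.
        headers["Accept"] = "text/event-stream"
    return headers
```

Calling `build_headers("OS-placeholder", sse=True)` adds the `Accept` header. Without `sse=True`, only the two required headers are returned.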

Body parameters

Parameter

Type

Required

Description

Example

question

map

Yes

The input question.

{
  "text": "user question",
  "type": "TEXT",
  "session": ""
}

question.text

string

Yes

The text of the input question.

user question

question.session

string

No

The session ID for a multi-round conversation. This ID identifies the context of the conversation. The behavior depends on the value:

  • If you do not set this parameter or leave it empty, multi-round conversation is disabled.

  • If you set this parameter to a non-empty value, multi-round conversation is enabled. The system retains conversations with the same session ID as the context for the multi-round conversation.

1725530408586

question.type

string

No

The type of the input question. Set the value to TEXT.

TEXT

options

map

No

The additional request parameters that control retrieval, models, prompts, and more.

options.chat

map

No

The parameters related to large language model (LLM) access.

options.chat.disable

boolean

No

Specifies whether to disable LLM access.

  • false (default): Accesses the LLM to summarize and generate results.

  • true: Does not access the LLM.

false

options.chat.stream

boolean

No

Specifies whether to return results in a stream.

  • true (default): Returns results in a stream.

  • false: Returns results in a non-streaming manner.

true

options.chat.model

string

No

The LLM to use. Valid values in the Singapore region:

  • opensearch-llama2-13b

  • opensearch-falcon-7b

  • qwen-turbo

  • qwen-plus

  • qwen-max

  • qwen2-72b-instruct

opensearch-llama2-13b

options.chat.enable_deep_search

boolean

No

Specifies whether to enable deep search.

  • true: Enables deep search. This requires multiple rounds of inference to synthesize data and return results. A single conversation consumes more time and computing resources.

  • false: Disables deep search.

false

options.chat.model_generation

integer

No

When using a product-customized model, set the corresponding model version. By default, the oldest version is used.

20

options.chat.prompt_template

string

No

The name of the custom prompt template. If this parameter is empty, the system's built-in prompt template is used.

user_defined_prompt_name

options.chat.prompt_config

object

No

The key-value pairs configured in the custom prompt. The parameter format is:

{
  "String_key": "value",
  "Integer_key": 1
}

{
  "attitude": "normal",
  "rule": "detailed",
  "noanswer": "sorry",
  "language": "Chinese",
  "role": false,
  "role_name": "AI Assistant"
}

options.chat.prompt_config.attitude

string

No

A parameter in the built-in template that controls the tone of the conversation. The default value is normal.

  • normal: Neutral.

  • polite: Uses a kind and polite tone.

  • patience: Uses a gentle and patient tone.

normal

options.chat.prompt_config.rule

string

No

The level of detail in the conversation. The default value is detailed.

  • detailed: Detailed and professional.

  • stepbystep: Detailed and step-by-step.

detailed

options.chat.prompt_config.noanswer

string

No

The response when an answer cannot be found. The default value is sorry.

  • sorry: Sorry, I cannot answer this question based on the available information.

  • uncertain: I don't know.

sorry

options.chat.prompt_config.language

string

No

The language used for the answer. The default value is Chinese.

  • Chinese

  • English

  • Thai

  • Korean

Chinese

options.chat.prompt_config.role

boolean

No

Specifies whether to enable a custom role for the answer. If enabled, a custom role for the answer is used.

false

options.chat.prompt_config.role_name

string

No

The custom role for the answer. Example: AI Assistant.

AI Assistant

options.chat.prompt_config.out_format

string

No

The format of the output content. The default value is text.

  • text

  • table

  • list

  • markdown

text

options.chat.generate_config.repetition_penalty

float

No

Controls the repetition of consecutive sequences in the model's output. Increasing the repetition_penalty reduces the repetition in the generated text. A value of 1.0 means no penalty. There is no strict value range.

1.01

options.chat.generate_config.top_k

integer

No

The size of the candidate set for sampling during generation. For example, a value of 50 means that only the top 50 tokens with the highest scores in a single generation are used to form the random sampling candidate set. A larger value increases the randomness of the generated text. A smaller value increases the determinism. The default value is 0, which disables the top_k strategy. In this case, only the top_p strategy takes effect.

50

options.chat.generate_config.top_p

float

No

The probability threshold for nucleus sampling during generation. For example, a value of 0.8 means that only the smallest set of most likely tokens with a cumulative probability of 0.8 or higher is retained as the candidate set. The value must be in the range of (0, 1.0). A larger value increases the randomness of the generated text. A smaller value increases the determinism.

0.5

options.chat.generate_config.temperature

float

No

Controls the degree of randomness and diversity. Specifically, the temperature value controls the degree of smoothing applied to the probability distribution of each candidate word during text generation. A higher temperature value flattens the probability distribution, allowing more low-probability words to be selected and making the output more diverse. A lower temperature value sharpens the probability distribution, making high-probability words more likely to be selected and making the output more deterministic.

The value must be in the range of [0, 2). We recommend that you do not set this parameter to 0.

Supported versions: python >= 1.10.1, java >= 2.5.1.

0.7

options.chat.history_max

integer

No

The maximum number of historical rounds for a multi-round conversation. The maximum value is 20. The default value is 1.

20

options.chat.link

boolean

No

Specifies whether to return links. This controls whether the content generated by the model indicates the source of the reference. Valid values:

  • true

  • false (default)

The following sample response is returned if the content includes the source:

An ECS disk can be resized online or offline[^1^]. Online resizing does not require an instance restart, but offline resizing does[^1^]. To resize a disk, select the disk on the ECS console, choose Resize from the Actions column, and then select a resizing method as needed[^1^]. To resize partitions and file systems, obtain the required information from the console[^2^]. After a disk is resized, its capacity cannot be reduced. Plan your storage space carefully[^3^].

The number enclosed by [^ and ^] indicates the index of the document in the reference of the result. For example, [^1^] indicates the first document in the reference.

false

options.chat.rich_text_strategy

string

No

The post-processing method for the output of a rich text LLM. If this parameter is not configured or is empty, the rich text feature is disabled and the default behavior is used.

  • inside_response: The tags in the answer are restored directly to the original text in Markdown format. Note that tables are inserted directly into the Markdown file in HTML format.

  • extend_response: If the answer contains rich text tags, the actual content of each tag is returned separately in rich_text_ref. Image content is returned as a URL, table content is in HTML format, and code is in text format.

inside_response

options.chat.agent

map

No

The options for configuring retrieval-augmented generation (RAG) tool capabilities. When enabled, the model decides whether to execute the corresponding tool based on the existing content. The following LLMs support this feature:

  • qwen-plus

  • qwen-max

  • qwen2-72b-instruct

options.chat.agent.think_process

boolean

No

Specifies whether to return the thinking process.

true

options.chat.agent.max_think_round

integer

No

The number of thinking rounds. The maximum value is 20.

10

options.chat.agent.language

string

No

The language for the thinking process and the answer.

  • AUTO: Automatically determines whether to use Chinese or English based on the user query.

  • CN: Chinese.

  • EN: English.

AUTO

options.chat.agent.tools

list of string

No

The names of the RAG tools to use. The following tool is available:

  • knowledge_search: knowledge base retrieval

["knowledge_search"]

options.retrieve

map

No

The parameters that control retrieval, including document, manually intervened data, and image retrieval, web search, query rewriting, and reranking.

options.retrieve.web_search.enable

boolean

No

Specifies whether to enable web search.

  • true: Enables web search. The results are returned based on web search data. A single conversation consumes more time and computing resources.

  • false: Disables web search.

false

options.retrieve.doc

map

No

The parameters that control document retrieval from the knowledge base.

options.retrieve.doc.disable

boolean

No

Specifies whether to disable knowledge base retrieval.

  • false (default)

  • true

false

options.retrieve.doc.filter

string

No

Filters the data retrieved from the knowledge base. By default, this parameter is empty. For more information about how to use the filter parameter, see filter parameter.

Supported fields:

  • table: The table where the document is located.

  • raw_pk: The primary key of the document.

  • category: The category of the document.

  • score: The score of the document.

  • timestamp: The timestamp of the document.

Example format:

"filter" : "raw_pk=\"123\""   # Retrieves data only from the document whose primary key is 123.
"filter" : "category=\"value1\""   # Retrieves data only from documents in the category 'value1'.
"filter" : "category=\"value1\" OR category=\"value2\"" # Retrieves data only from documents in the category 'value1' or 'value2'.
"filter" : "score>1.0"   # Retrieves data only from documents with a score greater than 1.0.
"filter" : "timestamp>1356969600"   # Retrieves data only from documents with a timestamp after 2013-01-01 00:00:00 (UTC+8).

category=\"value1\"

options.retrieve.doc.sf

float

No

The threshold for the vector score in vector retrieval.

  • If sparse vectors are not enabled, the value must be in the range of [0, 2.0]. The default value is 1.3. A smaller value indicates that the results are more relevant, but fewer results are returned. A larger value may retrieve less relevant results.

  • If sparse vectors are enabled, the default value is 0.35. A larger value indicates that the retrieved results are more relevant, but fewer results are returned. A smaller value may retrieve less relevant results.

0.35

options.retrieve.doc.top_n

integer

No

The number of documents to retrieve. The default value is 5. The value must be in the range of (0, 50].

5

options.retrieve.doc.formula

string

No

The formula for sorting documents during retrieval.

Note

For more information about the syntax, see Fine sort functions. The algorithm relevance and geographical location relevance features are not supported.

-timestamp: Sorts documents in descending order by the timestamp field.

options.retrieve.doc.rerank_size

integer

No

The number of documents to rerank when the rerank feature is enabled. The default value is 30. The value must be in the range of (0, 100].

30

options.retrieve.doc.operator

string

No

The relationship between the terms after the question.text is tokenized for knowledge base retrieval. This parameter takes effect only when sparse vectors are not enabled.

  • AND (default): All terms must appear to be retrieved.

  • OR: At least one term must appear to be retrieved.

AND

options.retrieve.doc.dense_weight

float

No

The weight of the dense vector during document retrieval when sparse vectors are enabled. The value must be in the range of (0.0, 1.0). The default value is 0.7.

0.7

options.retrieve.entry

map

No

The parameters that control the retrieval of results from manually intervened data.

options.retrieve.entry.disable

boolean

No

Specifies whether to disable the retrieval of manually intervened data.

  • false (default)

  • true

false

options.retrieve.entry.sf

float

No

The threshold for the vector score for retrieving manually intervened data. The value must be in the range of [0, 2.0]. The default value is 0.3. A smaller value indicates that the results are more relevant, but fewer results are returned. A larger value may retrieve less relevant results.

0.3

options.retrieve.image

map

No

The parameters that control the retrieval of image results from the knowledge base.

options.retrieve.image.disable

boolean

No

Specifies whether to disable the retrieval of image data. The default value is false.

  • false (default)

  • true

false

options.retrieve.image.sf

float

No

The threshold for the vector score in vector retrieval.

  • If sparse vectors are not enabled, the value must be in the range of [0, 2.0]. The default value is 1.0. A smaller value indicates that the results are more relevant, but fewer results are returned. A larger value may retrieve less relevant results.

  • If sparse vectors are enabled, the default value is 0.5. A larger value indicates that the retrieved results are more relevant, but fewer results are returned. A smaller value may retrieve less relevant results.

1.0

options.retrieve.image.dense_weight

float

No

The weight of the dense vector during image retrieval when sparse vectors are enabled. The value must be in the range of (0.0, 1.0). The default value is 0.7.

0.7

options.retrieve.qp

map

No

The options for query rewriting.

options.retrieve.qp.query_extend

boolean

No

Specifies whether to expand the user query. The expanded queries are used to retrieve document segments in the engine. The default value is false.

  • false (default): Does not perform query expansion. This maintains consistency with the original logic.

  • true: Performs an additional interaction with the LLM, which slows down the response speed. Do not enable this for time-sensitive applications.

false

options.retrieve.qp.query_extend_num

integer

No

The maximum number of queries to expand when similar query expansion is enabled. The default value is 5.

5

options.retrieve.rerank

map

No

The options for reranking during document retrieval.

options.retrieve.rerank.enable

boolean

No

Specifies whether to use a model to rerank the retrieved results based on relevance. If options.retrieve.doc.formula is not empty, the default value is false. Otherwise, the default value is true. Valid values:

  • true

  • false

true

options.retrieve.rerank.model

string

No

The name of the LLM used for reranking.

  • ops-bge-reranker-larger (default): The bge-reranker model.

  • ops-text-reranker-001: A self-developed reranker model.

ops-bge-reranker-larger

options.retrieve.return_hits

boolean

No

Specifies whether to return the document retrieval results in the search_hits parameter of the response.

false
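Several body parameters have documented value ranges. A client could check them before sending a request, as in the following sketch. The function name is hypothetical, only ranges stated in the table above are checked, and the server remains the authority on validation.

```python
def validate_options(options: dict) -> list:
    """Return descriptions of values that fall outside the documented ranges."""
    chat = options.get("chat", {})
    doc = options.get("retrieve", {}).get("doc", {})
    generate = chat.get("generate_config", {})
    agent = chat.get("agent", {})

    # (parameter path, supplied value, validity check, documented constraint)
    rules = [
        ("options.chat.history_max", chat.get("history_max"),
         lambda v: v <= 20, "maximum value 20"),
        ("options.chat.agent.max_think_round", agent.get("max_think_round"),
         lambda v: v <= 20, "maximum value 20"),
        ("options.chat.generate_config.top_p", generate.get("top_p"),
         lambda v: 0 < v < 1.0, "range (0, 1.0)"),
        ("options.chat.generate_config.temperature", generate.get("temperature"),
         lambda v: 0 <= v < 2, "range [0, 2)"),
        ("options.retrieve.doc.top_n", doc.get("top_n"),
         lambda v: 0 < v <= 50, "range (0, 50]"),
        ("options.retrieve.doc.rerank_size", doc.get("rerank_size"),
         lambda v: 0 < v <= 100, "range (0, 100]"),
        ("options.retrieve.doc.dense_weight", doc.get("dense_weight"),
         lambda v: 0.0 < v < 1.0, "range (0.0, 1.0)"),
    ]

    problems = []
    for path, value, is_valid, constraint in rules:
        # Parameters that were not supplied keep their server-side defaults.
        if value is not None and not is_valid(value):
            problems.append(f"{path}={value!r} violates {constraint}")
    return problems
```

An empty list means that none of the checked parameters is out of range. Parameters without a documented range, such as repetition_penalty, are deliberately not checked.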

Sample request body

{
    "question" : {
        "text" : "user question",
        "session" : "The session of the conversation. If you specify this parameter, the multi-round conversation feature is enabled.",
        "type" : "TEXT"
    },
    "options": {
        "chat": {
            "disable" : false, # Specifies whether to disable LLM access and directly return document retrieval results. The default value is false, which indicates that LLM generation is enabled.
            "stream" : false, # Specifies whether to return the results in a stream. Set this parameter to false to return the results in a non-streaming manner.
            "model" : "qwen-turbo", # The LLM to use.
            "prompt_template" : "user_defined_prompt_name", # The name of the custom prompt.
            "prompt_config" : { # The custom prompt configuration. This parameter is optional.
                "key" : "value" # Replace the key and value with a specific key-value pair.
            },
            "generate_config" : {
                "repetition_penalty": 1.01,
                "top_k": 50,
                "top_p": 0.5,
                "temperature": 0.7
            },
            "history_max": 20, # The maximum number of historical rounds for a multi-round conversation.
            "link": false, # Return links.
            "agent":{
                "tools":["knowledge_search"]
            }
        },
        "retrieve": {
            "doc": {
                "disable": false, # Specifies whether to disable document retrieval. The default value is false.
                "filter": "category=\"type\"", # Filters documents by the category field during retrieval. By default, this parameter is empty.
                "sf": 0.35,    # The vector retrieval threshold.
                "top_n": 5,    # The number of documents to retrieve. The default value is 5. The value must be in the range of (0, 50].
                "formula" : "", # The default is vector similarity.
                "rerank_size" : 5,  # The number of documents to rerank. This parameter is optional. The default value is 30.
                "operator":"OR"  # Specifies that the relationship between text tokens is OR during text retrieval. The default is AND.
            },
            "web_search": {
                "enable": false # Specifies whether to enable web search. The default is false.
            },
            "entry": {
                "disable": false, # Specifies whether to disable the retrieval of manually intervened data. The default value is false.
                "sf": 0.3 # The vector relevance for the retrieval of intervened data. The default value is 0.3.
            },
            "image": {
                "disable": false,  # Specifies whether to disable image data retrieval. The default value is false.
                "sf": 1.0          # The vector relevance for image data retrieval. The default value is 1.0.
            },
            "qp": {
                "query_extend": false, # Specifies whether to perform query expansion on the user query.
                "query_extend_num": 5 # The number of expanded queries. The default value is 5.
            },
            "rerank" : {
                "enable": true,  # Specifies whether to use an LLM to rerank the retrieved results. The default value is true.
                "model":"model_name" # Replace with a specific model name.
            },
            "return_hits": false   # Specifies whether to return the document retrieval results in the response, which is the search_hits parameter.
        }
    }
}
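The `#` annotations in the sample above are explanatory only. JSON has no comment syntax, so they must be removed before the body is sent. The following sketch builds a minimal non-streaming request with the Python standard library. The function names and the URL and key in the usage note are placeholders, and the code assumes that a non-streaming call returns a single JSON document.

```python
import json
import urllib.request


def build_multi_search_request(url: str, api_key: str, text: str,
                               session: str = "", stream: bool = False) -> urllib.request.Request:
    """Prepare a POST request for the multi-search action without sending it."""
    body = {
        "question": {"text": text, "type": "TEXT", "session": session},
        "options": {"chat": {"stream": stream}},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if stream:
        # Streaming results are delivered as Server-Sent Events.
        headers["Accept"] = "text/event-stream"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a non-streaming request and decode the JSON response (network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `send(build_multi_search_request(url, "OS-placeholder", "user question"))` would then return the decoded response. Passing a non-empty `session` enables multi-round conversation as described for question.session.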

Response parameters

Parameter

Type

Description

request_id

string

The request ID.

status

string

The processing status of the request.

  • OK

  • FAIL

latency

float

The time it took the server to process the request. Unit: milliseconds.

id

integer

The primary key ID.

title

string

The title of the document.

category

string

The category name.

url

string

The document link.

answer

string

The Q&A result.

type

string

The type of the returned result.

scores

array

The document content score.

event

string

The thinking event.

A round of the thinking process consists of THINK, ACTION, and ANSWER. The THINK event is not always returned. THINK indicates the thinking process. ACTION indicates the action performed. ANSWER indicates the conclusion of the current thinking round. SUMMARY is the final answer. Only one SUMMARY event of the text type is returned.

event_status

string

Indicates whether the result is complete.

  • PROCESSING

  • FINISHED

code

string

The error code. This parameter is returned if an error occurs.

message

string

The error message. This parameter is returned if an error occurs.
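A caller would typically check the status parameter before reading the result. The sketch below assumes that error details arrive as a list of code and message pairs under an errors key, as in the sample response body. The exception class and function name are hypothetical.

```python
class MultiSearchError(Exception):
    """Raised when the response status is not OK."""


def ensure_ok(response: dict) -> dict:
    """Return the response unchanged if status is OK; otherwise raise MultiSearchError."""
    if response.get("status") == "OK":
        return response
    errors = response.get("errors") or []
    # Combine every reported code and message into one readable string.
    details = "; ".join(f"{e.get('code')}: {e.get('message')}" for e in errors)
    request_id = response.get("request_id")
    raise MultiSearchError(
        f"request {request_id} failed: {details or 'no error details returned'}"
    )
```

Including the request_id in the exception message makes it easier to reference a failed call when you ask for support.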

Sample response body

{
  "request_id": "6859E98D-D885-4AEF-B61C-9683A0184744",
  "status": "OK",
  "latency": 6684.410397,
  "result" : {
    "data" : [
      {
        "answer" : "answer text",
        "type" : "TEXT",
        "reference" : [
          {"url" : "http://....","title":"doc title"}
        ]
      },
      {
        "reference": [
          {"id": "16","title": "Test title","category": "Test category","url": "Test URL"}
        ],
        "answer": "https://ecmb.bdimg.com/tam-ogel/-xxxx.jpg",
        "type": "IMAGE"
      }
    ],
    "search_hits" : [  // This parameter is returned only if you set options.retrieve.return_hits to true in the request.
      {
        "fields" : {
          "content" : "....",
          "key1" : "value1"
        },
        "scores" : ["10000.1234"],
        "type" : "doc"
      },
      {
        "fields" : {
          "answer" : "...",
          "key1" : "value1"
        },
        "scores" : ["10000.1234"],
        "type" : "entry"
      }
    ]
  },
  "errors" : [
    {
      "code" : "The error code. This parameter is returned if an error occurs.",
      "message" : "The error message. This parameter is returned if an error occurs."
    }
  ]
}
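Once the response is decoded, the entries in result.data can be grouped by type, and the [^n^] markers that appear when options.chat.link is true can be mapped back to the reference list. The following sketch assumes the response layout of the sample above and treats marker numbers as 1-based, as described for options.chat.link. Both function names are hypothetical.

```python
import re

# Matches citation markers such as [^1^] and captures the number.
CITATION = re.compile(r"\[\^(\d+)\^\]")


def answers_by_type(response: dict) -> dict:
    """Group the answer strings in result.data by their type (for example TEXT or IMAGE)."""
    grouped = {}
    for item in response.get("result", {}).get("data", []):
        grouped.setdefault(item.get("type", "UNKNOWN"), []).append(item.get("answer"))
    return grouped


def cited_references(item: dict) -> list:
    """Return the reference entries cited in item["answer"], in order of first citation."""
    references = item.get("reference") or []
    seen = []
    for match in CITATION.finditer(item.get("answer") or ""):
        index = int(match.group(1))
        # Skip markers that point outside the reference list and repeated markers.
        if 1 <= index <= len(references) and index not in seen:
            seen.append(index)
    return [references[i - 1] for i in seen]
```

Markers that point outside the reference list are skipped rather than raising an error, which keeps the helper tolerant of partial responses.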