OpenSearch: Content generation LLM

Last Updated: Apr 01, 2026

Sends a multi-turn conversation to a hosted model and returns generated text via an OpenAI-compatible /chat/completions endpoint.

Endpoint

{host}/compatible-mode/v1/chat/completions

host is the service endpoint address. The service is reachable over the internet or through a Virtual Private Cloud (VPC). See Query service endpoint for your address.

Quick start

The following example sends a two-message conversation and returns a single completion:

curl http://xxxx-cn-shanghai.opensearch.aliyuncs.com/compatible-mode/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
        "model": "ops-qwen-turbo",
        "messages": [
          {"role": "system", "content": "You are a robot assistant"},
          {"role": "user", "content": "Recommend 1 science fiction book"}
        ]
      }'

Replace <your-api-key> with your API key. For valid model values, see List of supported services.
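The same request can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper name build_chat_request is hypothetical, and the host and key are the same placeholders used in the curl example. The request is built but not sent, so the snippet runs without network access.

```python
import json
import urllib.request

# Placeholder values copied from the curl example; substitute your own.
HOST = "http://xxxx-cn-shanghai.opensearch.aliyuncs.com"
API_KEY = "<your-api-key>"


def build_chat_request(host, api_key, model, messages):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    url = f"{host}/compatible-mode/v1/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    HOST,
    API_KEY,
    "ops-qwen-turbo",
    [
        {"role": "system", "content": "You are a robot assistant"},
        {"role": "user", "content": "Recommend 1 science fiction book"},
    ],
)

# To send it (requires a real host and API key):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
#     print(result["choices"][0]["message"]["content"])
```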

Sample response:

{
  "id": "fb4b3860e051ecad0b019971******",
  "object": "chat.completion",
  "created": 1749804786,
  "model": "ops-qwen-turbo",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The 'Three-Body Problem' series by Liu Cixin. This is a story about......"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 48,
    "total_tokens": 70
  }
}

Request parameters

messages

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| messages | List[Dict] | Yes | The conversation history. Each item has a role and content. |

Each message object requires two fields:

  • role: The speaker. Valid values: system, user, assistant.

  • content: The message text. Cannot be empty.

Role constraints:

| Role | Position | Description |
| --- | --- | --- |
| system | Must be messages[0] if present | Sets the model's behavior for the session. Optional, but must appear first if included. |
| user | Any position after system | A message from the end user. |
| assistant | Any position after system | A message from the model. Use this to supply conversation history. |

user and assistant messages should alternate, mirroring the turns of a real conversation.

Example messages array:

[
  {"role": "system", "content": "You are a robot assistant"},
  {"role": "user", "content": "What is the capital of Henan?"},
  {"role": "assistant", "content": "Zhengzhou"},
  {"role": "user", "content": "What are some fun places to visit there?"}
]
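A client can check these constraints before sending a request. The sketch below is one possible pre-flight check, not part of the service: the function name validate_messages and its error strings are hypothetical, and rejecting an empty array is an assumption based on messages being a required field. Alternation of user and assistant is not enforced here because it is described as a recommendation rather than a hard rule.

```python
VALID_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Return a list of problems found; an empty list means the array passes."""
    if not messages:
        # Assumption: a required field should contain at least one message.
        return ["messages must contain at least one item"]
    problems = []
    for i, message in enumerate(messages):
        role = message.get("role")
        content = message.get("content")
        if role not in VALID_ROLES:
            problems.append(f"messages[{i}]: invalid role {role!r}")
        if not content:
            problems.append(f"messages[{i}]: content cannot be empty")
        if role == "system" and i != 0:
            problems.append(f"messages[{i}]: system must be messages[0]")
    return problems


conversation = [
    {"role": "system", "content": "You are a robot assistant"},
    {"role": "user", "content": "What is the capital of Henan?"},
    {"role": "assistant", "content": "Zhengzhou"},
    {"role": "user", "content": "What are some fun places to visit there?"},
]
print(validate_messages(conversation))
```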

model

| Field | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| model | String | Yes | The service ID that identifies which model to call. | ops-qwen-turbo |

For valid values, see List of supported services.

Generation parameters

| Parameter | Type | Required | Range | Default | Description |
| --- | --- | --- | --- | --- | --- |
| max_tokens | Int | No | | | Maximum number of tokens to generate. If the model reaches this limit before finishing, finish_reason is set to length. |
| temperature | Float | No | [0, 2) | | Controls output randomness. Lower values produce more deterministic responses, suited to factual Q&A. Higher values produce more varied responses, suited to creative tasks. Note: although 0 lies within the range, a value of 0 is not meaningful. |
| top_p | Float | No | (0, 1.0) | | Nucleus sampling threshold. Lower values restrict the token selection pool and increase determinism. Higher values allow more diverse word choices. |
| presence_penalty | Float | No | [-2.0, 2.0] | 0 | Penalizes tokens that have appeared anywhere in the output so far, reducing repetition of topics. |
| frequency_penalty | Float | No | [-2.0, 2.0] | 0 | Penalizes tokens based on how often they appear in the output so far, reducing repetition of specific phrases. |
| stop | String or List[String] | No | | null | One or more sequences that stop generation when encountered. The stop sequence itself is not included in the output. |
| stream | Boolean | No | | false | Set to true to receive output as a stream of incremental chunks. In stream mode, the interface returns results as a generator that must be iterated to retrieve each incremental chunk. |
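The documented ranges can be checked on the client side before a request is sent. The following is a minimal sketch under that assumption; check_generation_params is a hypothetical helper, and how the service itself responds to out-of-range values is not specified here.

```python
def check_generation_params(params):
    """Return a list of range violations for the documented generation parameters."""
    errors = []

    # temperature: half-open interval [0, 2)
    temperature = params.get("temperature")
    if temperature is not None and not (0 <= temperature < 2):
        errors.append("temperature must be in [0, 2)")

    # top_p: open interval (0, 1.0)
    top_p = params.get("top_p")
    if top_p is not None and not (0 < top_p < 1.0):
        errors.append("top_p must be in (0, 1.0)")

    # Both penalties share the closed interval [-2.0, 2.0].
    for name in ("presence_penalty", "frequency_penalty"):
        value = params.get(name)
        if value is not None and not (-2.0 <= value <= 2.0):
            errors.append(f"{name} must be in [-2.0, 2.0]")

    return errors


params = {
    "max_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "presence_penalty": 0,
    "frequency_penalty": 2.0,
    "stop": ["\n\n"],
    "stream": False,
}
print(check_generation_params(params))
```

Parameters that pass the check can be merged into the request body alongside model and messages.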

Response parameters

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| id | String | The request ID. | 2244F3A8-4201-4F37-BF86-42013B1026D6 |
| object | String | Always chat.completion. | chat.completion |
| created | Long | Unix timestamp (seconds) when the response was created. | 1719313883 |
| model | String | The service ID used to generate the response. | ops-qwen-turbo |
| choices.index | Int | Index of this result. 0 is the first result. | 0 |
| choices.message | Map | The model's response message, with role and content fields. | {"role": "assistant", "content": "This is an example"} |
| choices.finish_reason | String | Reason generation stopped. See Finish reasons. | stop |
| usage.prompt_tokens | Int | Number of tokens in the input messages. | 180 |
| usage.completion_tokens | Int | Number of tokens in the generated response. | 150 |
| usage.total_tokens | Int | Total tokens used (prompt_tokens + completion_tokens). | 330 |
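As an illustration of how these fields fit together, the sketch below reads the generated text and token counts from a response that has already been parsed into a dictionary. The helper summarize_response is hypothetical, and the sample dictionary mirrors the sample response from the quick start with its content shortened.

```python
def summarize_response(response):
    """Pull the generated text, finish reason, and a token-count check from a response."""
    first_choice = response["choices"][0]
    usage = response["usage"]
    return {
        "text": first_choice["message"]["content"],
        "finish_reason": first_choice["finish_reason"],
        # total_tokens is documented as prompt_tokens + completion_tokens.
        "tokens_consistent": usage["total_tokens"]
        == usage["prompt_tokens"] + usage["completion_tokens"],
    }


sample = {
    "object": "chat.completion",
    "created": 1749804786,
    "model": "ops-qwen-turbo",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The 'Three-Body Problem' series by Liu Cixin."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 22, "completion_tokens": 48, "total_tokens": 70},
}

summary = summarize_response(sample)
print(summary["text"])
```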

Finish reasons

| Value | Meaning |
| --- | --- |
| stop | The model returned a complete response. |
| length | Generation stopped because max_tokens was reached. Increase max_tokens to get longer output. |
| content_filter* | The response was filtered by content safety. Values starting with content_filter indicate a safety filter result. |
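Client code will typically branch on finish_reason. The sketch below shows one way to do that, using a prefix match for the content_filter family; the function name and the returned labels are hypothetical, and the suffix used in the last line of the example is invented purely for illustration.

```python
def describe_finish_reason(finish_reason):
    """Map a finish_reason value to a coarse outcome label."""
    if finish_reason == "stop":
        return "complete"
    if finish_reason == "length":
        # Output was cut off at max_tokens; a larger limit may help.
        return "truncated"
    if finish_reason.startswith("content_filter"):
        # Any value with this prefix indicates a safety filter result.
        return "filtered"
    return "unknown"


print(describe_finish_reason("stop"))
print(describe_finish_reason("length"))
print(describe_finish_reason("content_filter_example"))  # hypothetical suffix
```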

Status codes

For HTTP status code details, see Status codes.