The Qwen models from Alibaba Cloud Model Studio support OpenAI-compatible interfaces. You can migrate your existing OpenAI code to Model Studio by updating only the API key, BASE_URL, and model name.
Information required for OpenAI compatibility
BASE_URL
The BASE_URL is the network endpoint of the model service. This address lets you access the features or data that the service provides. When you use web services or APIs, the BASE_URL is typically the root URL for the API, to which specific endpoints are appended. You must set the BASE_URL when you use an OpenAI-compatible interface to call Model Studio.
When you make a call using the OpenAI SDK or other OpenAI-compatible SDKs, set the BASE_URL as follows:
Singapore: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
China (Beijing): https://dashscope.aliyuncs.com/compatible-mode/v1
When you make a call using an HTTP request, set the complete access endpoint as follows:
Singapore: POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions
China (Beijing): POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
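If your code needs to work across both regions, you can keep the two endpoints in one place. The following is a minimal sketch; the BASE_URLS mapping and get_base_url helper are illustrative, not part of the service:

# Hypothetical helper: map a region name to its OpenAI-compatible base URL.
BASE_URLS = {
    "singapore": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    "beijing": "https://dashscope.aliyuncs.com/compatible-mode/v1",
}

def get_base_url(region: str) -> str:
    return BASE_URLS[region]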
Model availability
The following table lists the Qwen models currently supported by the OpenAI-compatible interface.
Category | Model |
Qwen | qwen-max, qwen-max-latest, qwen-max-2025-01-25, qwen-plus, qwen-plus-latest, qwen-plus-2025-04-28, qwen-plus-2025-01-25, qwen-turbo, qwen-turbo-latest, qwen-turbo-2025-04-28, qwen-turbo-2024-11-01 |
Qwen open-source series | qwq-32b, qwen3-235b-a22b, qwen3-32b, qwen3-30b-a3b, qwen3-14b, qwen3-8b, qwen3-4b, qwen3-1.7b, qwen3-0.6b, qwen2.5-14b-instruct-1m, qwen2.5-7b-instruct-1m, qwen2.5-72b-instruct, qwen2.5-32b-instruct, qwen2.5-14b-instruct, qwen2.5-7b-instruct, qwen2-72b-instruct, qwen2-7b-instruct, qwen1.5-110b-chat, qwen1.5-72b-chat, qwen1.5-32b-chat, qwen1.5-14b-chat, qwen1.5-7b-chat |
Use the OpenAI SDK
Prerequisites
Make sure that a Python environment is installed on your computer.
Install the latest version of the OpenAI SDK.
# If the following command reports an error, replace pip with pip3.
pip install -U openai
Activate Model Studio and create an API key. For more information, see Preparations: Create and export an API key.
Export the API key as an environment variable to reduce the risk of leakage. For more information, see Export the API key as an environment variable. You can also set the API key directly in the code, but this increases the risk of leakage. A small verification sketch follows this list.
Select a model to use. For more information, see Model availability.
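As a minimal sketch (assuming the DASHSCOPE_API_KEY variable name used throughout this topic), you can fail fast before making any calls when the key is missing:

import os

# Read the API key from the environment and stop early with a clear error if it is absent.
api_key = os.getenv("DASHSCOPE_API_KEY")
if not api_key:
    raise RuntimeError("DASHSCOPE_API_KEY is not set. Export it before running the examples.")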
Usage
The following examples show how to use the OpenAI SDK to access Qwen models on Model Studio.
Non-streaming call example
from openai import OpenAI
import os
def get_response():
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"), # If you have not configured an environment variable, replace this line with api_key="sk-xxx" using your Model Studio API key.
# Set the base_url of the DashScope SDK. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1.
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="qwen-plus", # This example uses qwen-plus. You can replace the model name as needed. For a list of models, see https://www.alibabacloud.com/help/en/model-studio/getting-started/models.
messages=[{'role': 'system', 'content': 'You are a helpful assistant.'},
{'role': 'user', 'content': 'Who are you?'}]
)
print(completion.model_dump_json())
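    # To print only the reply text instead of the full JSON response, you could use:
    # print(completion.choices[0].message.content)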
if __name__ == '__main__':
    get_response()
Running the code produces the following result:
{
"id": "chatcmpl-xxx",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "I am a large-scale pre-trained model from Alibaba Cloud. My name is Qwen.",
"role": "assistant",
"function_call": null,
"tool_calls": null
}
}
],
"created": 1716430652,
"model": "qwen-plus",
"object": "chat.completion",
"system_fingerprint": null,
"usage": {
"completion_tokens": 18,
"prompt_tokens": 22,
"total_tokens": 40
}
}
Streaming call example
from openai import OpenAI
import os
def get_response():
client = OpenAI(
# If you have not configured an environment variable, replace the following line with api_key="sk-xxx" using your Model Studio API key.
api_key=os.getenv("DASHSCOPE_API_KEY"),
# Set the base_url of the DashScope SDK. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1.
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="qwen-plus", # This example uses qwen-plus. You can replace the model name as needed. For a list of models, see https://www.alibabacloud.com/help/en/model-studio/getting-started/models.
messages=[{'role': 'system', 'content': 'You are a helpful assistant.'},
{'role': 'user', 'content': 'Who are you?'}],
stream=True,
# Use the following setting to display token usage information in the last line of the streaming output.
stream_options={"include_usage": True}
)
for chunk in completion:
print(chunk.model_dump_json())
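        # Note: with stream_options={"include_usage": True}, the final chunk has an
        # empty choices list, and delta.content can be None. To rebuild the full
        # reply, you could collect the deltas, for example:
        # if chunk.choices:
        #     full_text += chunk.choices[0].delta.content or ""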
if __name__ == '__main__':
get_response()
Running the code produces the following result:
{"id":"chatcmpl-xxx","choices":[{"delta":{"content":"","function_call":null,"role":"assistant","tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1719286190,"model":"qwen-plus","object":"chat.completion.chunk","system_fingerprint":null,"usage":null}
{"id":"chatcmpl-xxx","choices":[{"delta":{"content":"I am"},"function_call":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1719286190,"model":"qwen-plus","object":"chat.completion.chunk","system_fingerprint":null,"usage":null}
{"id":"chatcmpl-xxx","choices":[{"delta":{"content":" a large-scale"},"function_call":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1719286190,"model":"qwen-plus","object":"chat.completion.chunk","system_fingerprint":null,"usage":null}
{"id":"chatcmpl-xxx","choices":[{"delta":{"content":" language"},"function_call":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1719286190,"model":"qwen-plus","object":"chat.completion.chunk","system_fingerprint":null,"usage":null}
{"id":"chatcmpl-xxx","choices":[{"delta":{"content":" model from Alibaba Cloud"},"function_call":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1719286190,"model":"qwen-plus","object":"chat.completion.chunk","system_fingerprint":null,"usage":null}
{"id":"chatcmpl-xxx","choices":[{"delta":{"content":", and my name is Qwen."},"function_call":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1719286190,"model":"qwen-plus","object":"chat.completion.chunk","system_fingerprint":null,"usage":null}
{"id":"chatcmpl-xxx","choices":[{"delta":{"content":"","function_call":null,"role":null,"tool_calls":null},"finish_reason":"stop","index":0,"logprobs":null}],"created":1719286190,"model":"qwen-plus","object":"chat.completion.chunk","system_fingerprint":null,"usage":null}
{"id":"chatcmpl-xxx","choices":[],"created":1719286190,"model":"qwen-plus","object":"chat.completion.chunk","system_fingerprint":null,"usage":{"completion_tokens":16,"prompt_tokens":22,"total_tokens":38}}Function calling example
This section provides an example of how to implement the tool calling feature with an OpenAI-compatible interface using weather and time query tools. The sample code supports multi-round tool calling.
from openai import OpenAI
from datetime import datetime
import json
import os
client = OpenAI(
# If you have not configured an environment variable, replace the following line with api_key="sk-xxx" using your Model Studio API key.
api_key=os.getenv("DASHSCOPE_API_KEY"),
# Set the base_url of the DashScope SDK. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1.
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
# Define the tool list. The model refers to the name and description of the tools when selecting which tool to use.
tools = [
# Tool 1: Get the current time.
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Useful for when you want to know the current time.",
# Because no input parameters are required to get the current time, parameters is an empty dictionary.
"parameters": {}
}
},
# Tool 2: Get the weather of a specified city.
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Useful for querying the weather in a specific city.",
"parameters": {
"type": "object",
"properties": {
# A location must be provided to query the weather, so the parameter is set to location.
"location": {
"type": "string",
"description": "A city or county, such as Beijing, Hangzhou, or Yuhang District."
}
                },
                # "required" must be inside "parameters": it lists the properties the model must supply.
                "required": [
                    "location"
                ]
            }
        }
}
]
# Simulate the weather query tool. Sample result: "It is rainy in Beijing today."
def get_current_weather(location):
return f"It is rainy in {location} today. "
# A tool to query the current time. Sample result: "Current time: 2024-04-15 17:15:18."
def get_current_time():
# Get the current date and time.
current_datetime = datetime.now()
# Format the current date and time.
formatted_time = current_datetime.strftime('%Y-%m-%d %H:%M:%S')
# Return the formatted current time.
return f"Current time: {formatted_time}."
# Encapsulate the model response function.
def get_response(messages):
completion = client.chat.completions.create(
model="qwen-plus", # This example uses qwen-plus. You can replace the model name as needed. For a list of models, see https://www.alibabacloud.com/help/en/model-studio/getting-started/models.
messages=messages,
tools=tools
)
return completion.model_dump()
def call_with_messages():
print('\n')
messages = [
{
"content": input('Please enter your query:'), # Sample questions: "What time is it?" "What time will it be in an hour?" "What is the weather in Beijing?"
"role": "user"
}
]
print("-"*60)
# The first round of model calling.
i = 1
first_response = get_response(messages)
assistant_output = first_response['choices'][0]['message']
print(f"\nLarge model output in round {i}: {first_response}\n")
if assistant_output['content'] is None:
assistant_output['content'] = ""
messages.append(assistant_output)
# If no tool needs to be called, return the final answer directly.
    if assistant_output['tool_calls'] is None:  # If the model determines that no tool needs to be called, print the assistant's reply directly. A second round of model calling is not required.
print(f"No tool call is needed. I can answer directly: {assistant_output['content']}")
return
# If a tool needs to be called, perform multiple rounds of model calling until the model determines that no tool needs to be called.
    while assistant_output['tool_calls'] is not None:
# If the model determines that the weather query tool needs to be called, run the weather query tool.
if assistant_output['tool_calls'][0]['function']['name'] == 'get_current_weather':
tool_info = {"name": "get_current_weather", "role":"tool"}
# Extract the location parameter information.
location = json.loads(assistant_output['tool_calls'][0]['function']['arguments'])['location']
tool_info['content'] = get_current_weather(location)
# If the model determines that the time query tool needs to be called, run the time query tool.
elif assistant_output['tool_calls'][0]['function']['name'] == 'get_current_time':
tool_info = {"name": "get_current_time", "role":"tool"}
tool_info['content'] = get_current_time()
print(f"Tool output: {tool_info['content']}\n")
print("-"*60)
messages.append(tool_info)
assistant_output = get_response(messages)['choices'][0]['message']
if assistant_output['content'] is None:
assistant_output['content'] = ""
messages.append(assistant_output)
i += 1
print(f"Large model output in round {i}: {assistant_output}\n")
print(f"Final answer: {assistant_output['content']}")
if __name__ == '__main__':
    call_with_messages()
If you enter What is the weather in Singapore?, the program first calls the weather query tool and then returns a final answer based on the tool output.
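As the number of tools grows, the if/elif dispatch in the sample becomes unwieldy. A common refactor (illustrative, not part of the original sample) is a mapping from tool name to function; this sketch reuses the two tool functions defined above:

# Hypothetical dispatch table: map tool names to callables that accept the parsed JSON arguments.
TOOL_REGISTRY = {
    "get_current_weather": lambda args: get_current_weather(args["location"]),
    "get_current_time": lambda args: get_current_time(),
}

def run_tool(tool_call):
    name = tool_call['function']['name']
    # arguments is a JSON string; it may be empty for tools that take no parameters.
    args = json.loads(tool_call['function']['arguments'] or "{}")
    return {"name": name, "role": "tool", "content": TOOL_REGISTRY[name](args)}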

Request parameters
The input parameters are aligned with the OpenAI interface parameters. The following parameters are supported:
Parameter | Type | Default | Description |
model | string | - | Specifies the model to use. For a list of available models, see Model availability. |
messages | array | - | The conversation history between the user and the model. Each element in the array is an object in the format {"role": "...", "content": "..."}. |
top_p (optional) | float | - | The probability threshold for nucleus sampling. For example, if you set this parameter to 0.8, the model generates tokens from the smallest set of top tokens that have a cumulative probability of 0.8 or higher. Valid values: (0, 1.0). A larger value indicates higher randomness. A smaller value indicates higher determinism. |
temperature (optional) | float | - | Controls the randomness and diversity of the model's responses. A higher temperature value smooths the probability distribution of candidate words, which allows less probable words to be selected and generates more diverse results. A lower temperature value sharpens the probability distribution, which makes high-probability words more likely to be selected and generates more deterministic results. Valid values: [0, 2). We recommend that you do not set the value to 0. |
presence_penalty (optional) | float | - | Controls the repetition of the entire sequence in the generated content. A higher value for presence_penalty reduces repetition. Valid values: [-2.0, 2.0]. Note: Currently, this parameter is supported only on Qwen commercial models and open-source models of qwen1.5 and later. |
n (optional) | integer | 1 | The number of responses to generate. Valid values range from 1 up to a model-specific maximum. A larger value for n does not increase input token consumption but increases output token consumption. Currently, this parameter is supported only for the qwen-plus model. The value must be 1 when the tools parameter is passed. |
max_tokens (optional) | integer | - | The maximum number of tokens that the model can generate. For example, if the maximum output length of the model is 2,000 tokens, you can set this parameter to 1,000 to prevent the model from generating excessively long content. Different models have different output limits. For more information, see the model list. |
seed (optional) | integer | - | The random number seed used during generation to control the randomness of the generated content. The seed parameter supports unsigned 64-bit integers. |
stream (optional) | boolean | False | Specifies whether to use streaming output. When results are streamed, the API returns a generator that you must iterate over; each chunk contains the incremental content generated at that step. |
stop (optional) | string or array | None | The stop parameter provides precise control over the content generation procedure. The model automatically stops generating content when it is about to include a specified string or token ID. The stop parameter can be a string or an array. |
tools (optional) | array | None | Specifies the tool library that the model can call. In a tool calling process, the model selects one tool from the library. Each element in the tools array is an object with a type field set to "function" and a function field that contains the tool's name, description, and parameters, as shown in the function calling example above. You must set the tools parameter both when you initiate a round of tool calls and when you submit the execution results of the tool function to the model. The currently supported models include qwen-turbo, qwen-plus, and qwen-max. Note: The tools parameter cannot be used with stream=True at the same time. |
stream_options (optional) | object | None | Configures whether to display the number of tokens used during streaming output. This parameter takes effect only when stream is set to True. To count the number of tokens in streaming output mode, set this parameter to {"include_usage": true}. |
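For illustration, the following call combines several of the optional parameters above. The specific values are arbitrary, and client is the OpenAI client object configured in the earlier examples:

completion = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Write a one-line greeting."}],
    temperature=0.7,   # moderate randomness
    top_p=0.8,         # nucleus sampling threshold
    max_tokens=256,    # cap the response length
    seed=1234,         # improve reproducibility across runs
    stop=["\n\n"],     # stop before a blank line would be emitted
    stream=False,
)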
Response parameters
Parameter | Data type | Description | Note |
id | string | The system-generated ID for this call. | None |
model | string | The name of the model that is called. | None |
system_fingerprint | string | The configuration version used by the model runtime. This is not currently supported and returns an empty string "". | None |
choices | array | Details of the content generated by the model. | None |
choices[i].finish_reason | string | The reason the model stopped generating. Three cases apply: null when the model is still generating, stop when generation ends naturally or a stop condition is triggered, and length when generation stops because the output reached the maximum length. | None |
choices[i].message | object | The message that the model outputs. | None |
choices[i].message.role | string | The role of the model, which is fixed to assistant. | None |
choices[i].message.content | string | The text generated by the model. | None |
choices[i].index | integer | The sequence number of the generated result. Default value: 0. | None |
created | integer | The UNIX timestamp (in seconds) when the result was generated. | None |
usage | object | The token consumption of the request. | None |
usage.prompt_tokens | integer | The number of tokens in the input text. | None |
usage.completion_tokens | integer | The number of tokens in the generated response. | None |
usage.total_tokens | integer | The sum of usage.prompt_tokens and usage.completion_tokens. | None |
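As a quick reference, this is how the fields above map onto the object returned by the OpenAI SDK. The snippet assumes the completion object from the non-streaming example:

print(completion.id)                          # system-generated call ID
print(completion.model)                       # model that served the request
print(completion.choices[0].finish_reason)    # stop, length, or null
print(completion.choices[0].message.content)  # the generated text
print(completion.usage.total_tokens)          # prompt_tokens + completion_tokens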
Use the langchain_openai SDK
Prerequisites
Make sure that a Python environment is installed on your computer.
Run the following command to install the langchain_openai SDK.
# If the following command reports an error, replace pip with pip3.
pip install -U langchain_openai
Activate Model Studio and create an API key. For more information, see Preparations: Create and export an API key.
Export the API key as an environment variable to reduce the risk of leakage. For more information, see Export the API key as an environment variable. You can also set the API key directly in the code, but this increases the risk of leakage.
Select a model to use. For more information, see Model availability.
Usage
The following examples show how to use the langchain_openai SDK to call the Qwen models from Model Studio.
Non-streaming output
Non-streaming output is implemented using the invoke method. The following is the sample code:
from langchain_openai import ChatOpenAI
import os
def get_response():
llm = ChatOpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"), # If you have not configured an environment variable, replace this line with api_key="sk-xxx" using your Model Studio API key.
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1.
model="qwen-plus" # This example uses qwen-plus. You can replace the model name as needed. For a list of models, see https://www.alibabacloud.com/help/en/model-studio/getting-started/models.
)
messages = [
{"role":"system","content":"You are a helpful assistant."},
{"role":"user","content":"Who are you?"}
]
response = llm.invoke(messages)
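    # response is an AIMessage object; response.content holds just the reply text.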
print(response.json())
if __name__ == "__main__":
    get_response()
Running the code produces the following result:
{
"content": "I am a large language model from Alibaba Cloud. My name is Qwen.",
"additional_kwargs": {},
"response_metadata": {
"token_usage": {
"completion_tokens": 16,
"prompt_tokens": 22,
"total_tokens": 38
},
"model_name": "qwen-plus",
"system_fingerprint": "",
"finish_reason": "stop",
"logprobs": null
},
"type": "ai",
"name": null,
"id": "run-xxx",
"example": false,
"tool_calls": [],
"invalid_tool_calls": []
}
Streaming output
Streaming output is implemented using the stream method. You do not need to set the stream parameter.
from langchain_openai import ChatOpenAI
import os
def get_response():
llm = ChatOpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"), # If you have not configured an environment variable, replace this line with api_key="sk-xxx" using your Model Studio API key.
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1.
model="qwen-plus", # This example uses qwen-plus. You can replace the model name as needed. For a list of models, see https://www.alibabacloud.com/help/en/model-studio/getting-started/models.
stream_usage=True
)
messages = [
{"role":"system","content":"You are a helpful assistant."},
{"role":"user","content":"Who are you?"},
]
response = llm.stream(messages)
for chunk in response:
print(chunk.model_dump_json())
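        # chunk.content carries the incremental text; concatenating all chunks rebuilds the full reply.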
if __name__ == "__main__":
    get_response()
Running the code produces the following result:
{"content": "", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": "I am", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": " a large", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": " language model", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": " from", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": " Alibaba", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": " Cloud", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": ", and my name is Qwen.", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": "", "additional_kwargs": {}, "response_metadata": {"finish_reason": "stop"}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null, "tool_call_chunks": []}
{"content": "", "additional_kwargs": {}, "response_metadata": {}, "type": "AIMessageChunk", "name": null, "id": "run-xxx", "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": {"input_tokens": 22, "output_tokens": 16, "total_tokens": 38}, "tool_call_chunks": []}For information about input parameter settings, see Request parameters. The relevant parameters are defined in the ChatOpenAI object.
Use the HTTP interface
You can call Model Studio using the HTTP interface to receive a response that has the same structure as a response from the OpenAI service.
Prerequisites
Activate Model Studio and create an API key. For more information, see Preparations: Create and export an API key.
Export the API key as an environment variable to reduce the risk of API key leakage, see Export the API key as an environment variable. You can also set the API key in the code, but this increases the risk of leakage.
Submit an API request
Singapore: POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions
China (Beijing): POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
Sample request
The following example shows a script that calls the API using the cURL command.
If you have not configured the API key as an environment variable, you must replace $DASHSCOPE_API_KEY with your API key.
Non-streaming output
# If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions.
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen-plus",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Who are you?"
}
]
}'
Running the command produces the following result:
{
"choices": [
{
"message": {
"role": "assistant",
"content": "I am a large language model from Alibaba Cloud. My name is Qwen."
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 11,
"completion_tokens": 16,
"total_tokens": 27
},
"created": 1715252778,
"system_fingerprint": "",
"model": "qwen-plus",
"id": "chatcmpl-xxx"
}
Streaming output
If you want to use streaming output, set the stream parameter to true in the request body.
# If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions.
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen-plus",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Who are you?"
}
],
"stream":true
}'
Running the command produces the following result:
data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"finish_reason":null,"delta":{"content":"I am"},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":" a large-scale"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":" language model"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":" from Alibaba Cloud"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":", and my name is Qwen."},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":""},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: [DONE]
For details about the input parameters, see Request parameters.
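If you prefer to stay in Python without the OpenAI SDK, the same HTTP call can be made with the requests library. This is a sketch of the non-streaming request above; requests is a third-party package that you must install separately:

import os
import requests

# Use https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions for the China (Beijing) region.
url = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {os.environ['DASHSCOPE_API_KEY']}",
    "Content-Type": "application/json",
}
payload = {
    "model": "qwen-plus",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
    ],
}
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()  # raise on non-2xx status codes
print(response.json()["choices"][0]["message"]["content"])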
Abnormal response example
If an error occurs during the request, the response indicates the cause of the error in the code and message fields.
{
"error": {
"message": "Incorrect API key provided. ",
"type": "invalid_request_error",
"param": null,
"code": "invalid_api_key"
}
}
Status codes
Error code | Description |
400 - Invalid Request Error | The request is invalid. For more information, see the error message. |
401 - Incorrect API key provided | The API key is incorrect. |
429 - Rate limit reached for requests | The queries per second (QPS), queries per minute (QPM), or other limits are exceeded. |
429 - You exceeded your current quota, please check your plan and billing details | Your quota is exceeded or your account has an overdue payment. |
500 - The server had an error while processing your request | A server-side error occurred. |
503 - The engine is currently overloaded, please try again later | The server is overloaded. Try again later. |
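When you call through the OpenAI SDK, these HTTP status codes surface as exceptions. The following is a minimal handling sketch; the exception classes come from the openai package, the mapping to status codes reflects standard SDK behavior, and client is the OpenAI client configured earlier in this topic:

from openai import APIStatusError, AuthenticationError, RateLimitError

try:
    completion = client.chat.completions.create(
        model="qwen-plus",
        messages=[{"role": "user", "content": "Who are you?"}],
    )
except AuthenticationError:   # 401: check your API key
    raise
except RateLimitError:        # 429: back off and retry later
    raise
except APIStatusError as e:   # other non-2xx responses
    print(e.status_code, e.message)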