This topic describes the input and output parameters for Qwen-ASR. Call the API using the OpenAI compatible protocol or the DashScope protocol.
User guide: For model details and how to select them, see Audio file recognition - Qwen.
Model connection types
Different models support different connection types. Select the appropriate integration method from the following table.
Model | Connection type
Qwen3-ASR-Flash-Filetrans | DashScope asynchronous only
Qwen3-ASR-Flash | OpenAI compatible and DashScope synchronous
OpenAI compatible
The US region does not support the OpenAI compatible mode.
URL
International
In the International deployment mode, the endpoint and data storage are located in the Singapore region, and model inference compute resources are dynamically scheduled globally, excluding Mainland China.
HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions
base_url for SDK: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
Mainland China
In the Mainland China deployment mode, the endpoint and data storage are located in the Beijing region, and model inference compute resources are restricted to Mainland China.
HTTP endpoint: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
base_url for SDK: https://dashscope.aliyuncs.com/compatible-mode/v1
Request body | Input: Audio file URL (Python SDK, Node.js SDK, and cURL examples). You can configure the context for customized recognition using the system message.
Input: Base64-encoded audio file (Python SDK and Node.js SDK examples; the audio file used in the examples is welcome.mp3). You can pass Base64-encoded data as a Data URL in the format data:<MIME type>;base64,<Base64-encoded data>. |
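The following is a minimal Python sketch of such a request with the openai SDK. The audio URL, the input_audio content schema, and the asr_options key are assumptions for illustration; see the parameter descriptions below and the SDK examples referenced above for the authoritative format.

```python
# Minimal sketch: recognize an audio file by URL through the OpenAI compatible mode.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # International endpoint
)

completion = client.chat.completions.create(
    model="qwen3-asr-flash",
    messages=[
        # Optional context for customized recognition goes in the system message.
        {"role": "system", "content": [{"type": "text", "text": ""}]},
        {
            "role": "user",
            "content": [
                {
                    "type": "input_audio",
                    # Assumption: a public audio URL is accepted in "data";
                    # a Base64 Data URL can be passed here instead.
                    "input_audio": {"data": "https://example.com/welcome.mp3", "format": "mp3"},
                }
            ],
        },
    ],
    # asr_options is a DashScope-specific field, so pass it through extra_body.
    extra_body={"asr_options": {"enable_itn": False}},  # assumed option key
)
print(completion.choices[0].message.content)  # recognized text
```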
model | The model name. This parameter is applicable only to Qwen3-ASR-Flash.
messages | The list of messages.
asr_options | Specifies whether to enable certain features.
stream | Specifies whether to use streaming output for the response. For more information, see Streaming output. Valid values: false (default) and true. We recommend that you set this parameter to true to reduce the risk of timeouts when processing long audio files.
stream_options | The configuration items for streaming output. This parameter takes effect only when stream is set to true.
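A hedged sketch of a streaming request follows, reusing the message structure from the previous sketch. Setting include_usage to true in stream_options adds a final usage-only chunk.

```python
# Streaming sketch: print incremental transcription chunks as they arrive.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
messages = [{"role": "user", "content": [{"type": "input_audio",
             "input_audio": {"data": "https://example.com/welcome.mp3", "format": "mp3"}}]}]

stream = client.chat.completions.create(
    model="qwen3-asr-flash",
    messages=messages,
    stream=True,
    stream_options={"include_usage": True},  # final chunk reports token usage
)
for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
    elif chunk.usage:
        print("\n", chunk.usage)  # usage-only final chunk (choices is empty)
```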
Response body | Examples: Non-streaming output, Streaming output |
id | The unique identifier for this call.
choices | The output information of the model.
created | The UNIX timestamp (in seconds) when the request was created.
model | The model used for this request.
object | Always chat.completion.
usage | The token consumption information for this request.
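For reference, the documented fields map onto the SDK response object as follows. This is a sketch; completion is the non-streaming return value from the first example above.

```python
# Reading the documented response fields from a non-streaming call.
print(completion.id)                          # unique identifier for this call
print(completion.created)                     # UNIX timestamp (seconds)
print(completion.model)                       # model used for this request
print(completion.object)                      # always "chat.completion"
print(completion.choices[0].message.content)  # recognized text
print(completion.usage)                       # token consumption
```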
DashScope synchronous
URL
International
In the International deployment mode, the endpoint and data storage are located in the Singapore region, and model inference compute resources are dynamically scheduled globally, excluding Mainland China.
HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v1
United States
In the US deployment mode, the endpoint and data storage are located in the US (Virginia) region, and model inference compute resources are restricted to the United States.
HTTP endpoint: POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
base_url for SDK: https://dashscope-us.aliyuncs.com/api/v1
Mainland China
In the Mainland China deployment mode, the endpoint and data storage are located in the Beijing region, and model inference compute resources are restricted to Mainland China.
HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
base_url for SDK: https://dashscope.aliyuncs.com/api/v1
Request body | Qwen3-ASR-Flash: The following example shows how to recognize audio from a URL (cURL, Java, and Python examples). For an example of how to recognize a local audio file, see QuickStart. |
model | The name of the model. This parameter is applicable only to Qwen3-ASR-Flash.
messages | The list of messages. When you make an HTTP call, place messages in the input object.
asr_options | Specifies whether to enable certain features. This parameter is supported only by Qwen3-ASR-Flash.
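The following is a minimal Python sketch of a DashScope synchronous call. The audio URL, the message content schema, and the asr_options key are assumptions for illustration; see the cURL, Java, and Python examples referenced above for the authoritative format.

```python
# Minimal sketch: DashScope synchronous call with the Python SDK.
import os
import dashscope

dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"  # International

messages = [
    {"role": "user", "content": [{"audio": "https://example.com/welcome.mp3"}]},
]
response = dashscope.MultiModalConversation.call(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen3-asr-flash",
    messages=messages,
    asr_options={"enable_itn": False},  # assumed option key; see asr_options above
)
print(response.output.choices[0].message.content)  # recognized content (list of content items)
print(response.usage)                              # token consumption
```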
Response body | Example: Qwen3-ASR-Flash |
request_id | The unique identifier for this call. The parameter returned by the Java SDK is requestId.
output | The information about the call result.
usage | The token consumption information for this request.
DashScope asynchronous
Process description
Unlike the OpenAI compatible mode and the DashScope synchronous call, which involve a single request and an immediate response, asynchronous invocation is designed for processing long audio files and other time-consuming tasks. This mode uses a two-step "submit-poll" process to prevent request timeouts caused by long waiting times:
Step 1: Submit a task
The client initiates an asynchronous processing request.
After validating the request, the server does not execute the task immediately. Instead, it returns a unique task_id to indicate that the task has been successfully created.
Step 2: Get the result
The client uses the obtained task_id to repeatedly call the result query API through polling. After the task is complete, the result query API returns the final recognition result.
You can choose to use an SDK or directly call the RESTful API based on your integration environment.
1. Use an SDK (see Getting Started for sample code, Submit a task's request body for request parameters, and Asynchronous call recognition result for returned results).
The SDK encapsulates the underlying API call details, providing a more convenient programming experience.
Submit a task: Call the async_call() (Python) or asyncCall() (Java) method to submit the task. This method returns a task object that contains a task_id.
Get the result: Use the task object returned in the previous step or the task_id to call the fetch() method to retrieve the result. The SDK automatically handles the polling logic until the task is complete or times out.
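A minimal Python sketch of this SDK flow follows. The Transcription class and the file_urls parameter are assumptions based on the DashScope Python SDK; see Getting Started for the official sample code.

```python
# Minimal sketch: submit a file-transcription task, then wait for the result.
import os
import dashscope
from dashscope.audio.asr import Transcription

dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")
dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"  # International

# Step 1: submit the task; the returned object carries a task_id.
task = Transcription.async_call(
    model="qwen3-asr-flash-filetrans",
    file_urls=["https://example.com/welcome.mp3"],  # assumed request field
)
print(task.output.task_id)

# Step 2: wait() blocks and polls internally until the task is done.
# Alternatively, call Transcription.fetch(task=...) in your own polling loop.
result = Transcription.wait(task=task.output.task_id)
print(result.output)  # includes task_status and the recognition result
```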
2. Use a RESTful API
Directly calling the HTTP API provides maximum flexibility.
Submit a task. If the request is successful, the response body contains a task_id.
Use the task_id from the previous step to retrieve the task execution result.
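A minimal sketch of the two REST calls with the Python requests library follows. The X-DashScope-Async header and the input.file_urls field are assumptions for illustration; the authoritative request body is documented under Submit a task below.

```python
# Minimal sketch: submit-then-poll over the raw REST API.
import os
import time
import requests

BASE = "https://dashscope-intl.aliyuncs.com/api/v1"  # International
HEADERS = {
    "Authorization": f"Bearer {os.getenv('DASHSCOPE_API_KEY')}",
    "Content-Type": "application/json",
    "X-DashScope-Async": "enable",  # assumed header marking the request as asynchronous
}

# Step 1: submit the task.
submit = requests.post(
    f"{BASE}/services/audio/asr/transcription",
    headers=HEADERS,
    json={
        "model": "qwen3-asr-flash-filetrans",
        "input": {"file_urls": ["https://example.com/welcome.mp3"]},  # assumed field
    },
)
task_id = submit.json()["output"]["task_id"]

# Step 2: poll until the task leaves the RUNNING state.
while True:
    poll = requests.get(f"{BASE}/tasks/{task_id}", headers=HEADERS).json()
    status = poll["output"]["task_status"]
    if status in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(5)
print(poll["output"])
```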
Submit a task
URL
International
In the International deployment mode, the endpoint and data storage are located in the Singapore region, and model inference compute resources are dynamically scheduled globally, excluding Mainland China.
HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription
base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v1
Mainland China
In the Mainland China deployment mode, the endpoint and data storage are located in the Beijing region, and model inference compute resources are restricted to Mainland China.
HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription
base_url for SDK: https://dashscope.aliyuncs.com/api/v1
Request body | cURL example. For Java and Python SDK examples, see Getting Started. |
model | The name of the model. This parameter is applicable only to Qwen3-ASR-Flash-Filetrans.
input |
parameters |
Response body |
request_id | The unique identifier for this call.
output | The information about the call result.
Get the task execution result
URL
International
HTTP endpoint: GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}
base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v1
Mainland China
HTTP endpoint: GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}
base_url for SDK: https://dashscope.aliyuncs.com/api/v1
Request body | cURL example. For Java and Python SDK examples, see Getting Started. |
task_id | The ID of the task. Pass the task_id returned by the Submit a task operation to query the speech recognition result.
Response body | Examples: RUNNING, SUCCEEDED, FAILED |
request_id | The unique identifier for this call.
output | The information about the call result.
Asynchronous call recognition result description |
file_url | The URL of the recognized audio file.
audio_info | Information about the recognized audio file.
transcripts | A list of complete recognition results. Each element corresponds to the recognized content of an audio track.
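As a short sketch of consuming these fields, assume result holds the parsed recognition result (a dict) of a SUCCEEDED task; how you obtain it depends on the response body described above.

```python
# Sketch: read the recognition result fields described above.
# `result` is assumed to be the parsed recognition result of a SUCCEEDED task;
# the field names follow the table above.
print(result["file_url"])    # URL of the recognized audio file
print(result["audio_info"])  # information about the recognized audio file
for transcript in result["transcripts"]:
    print(transcript)        # one element per recognized audio track
```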