Alibaba Cloud Model Studio: Audio file recognition (Qwen-ASR) API reference

Last Updated: Jan 30, 2026

This topic describes the input and output parameters for Qwen-ASR. Call the API using the OpenAI compatible protocol or the DashScope protocol.

User guide: For model details and how to select them, see Audio file recognition - Qwen.

Model connection types

Different models support different connection types. Select the appropriate integration method from the following table.

Model                        Connection type
Qwen3-ASR-Flash-Filetrans    DashScope asynchronous only
Qwen3-ASR-Flash              OpenAI compatible and DashScope synchronous

OpenAI compatible

Important

The US region does not support the OpenAI compatible mode.

URL

International

In the International deployment mode, the endpoint and data storage are located in the Singapore region, and model inference compute resources are dynamically scheduled globally, excluding Mainland China.

HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions

base_url for SDK: https://dashscope-intl.aliyuncs.com/compatible-mode/v1

Mainland China

In the Mainland China deployment mode, the endpoint and data storage are located in the Beijing region, and model inference compute resources are restricted to Mainland China.

HTTP endpoint: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions

base_url for SDK: https://dashscope.aliyuncs.com/compatible-mode/v1

Request body

Input: Audio file URL

Python SDK

from openai import OpenAI
import os

try:
    client = OpenAI(
        # The API keys for the Singapore/US and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
        # If you have not configured environment variables, replace the following line with: api_key = "sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        # The following URL is for the Singapore/US region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1
        base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    )
    

    stream_enabled = False  # Whether to enable streaming output
    completion = client.chat.completions.create(
        model="qwen3-asr-flash",
        messages=[
            {
                "content": [
                    {
                        "type": "input_audio",
                        "input_audio": {
                            "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                        }
                    }
                ],
                "role": "user"
            }
        ],
        stream=stream_enabled,
        # When stream is set to False, you cannot set the stream_options parameter.
        # stream_options={"include_usage": True},
        extra_body={
            "asr_options": {
                # "language": "zh",
                "enable_itn": False
            }
        }
    )
    if stream_enabled:
        full_content = ""
        print("Streaming output content:")
        for chunk in completion:
            # If stream_options.include_usage is True, the choices field of the last chunk is an empty list and should be skipped. You can get token usage from chunk.usage.
            print(chunk)
            if chunk.choices and chunk.choices[0].delta.content:
                full_content += chunk.choices[0].delta.content
        print(f"Full content: {full_content}")
    else:
        print(f"Non-streaming output content: {completion.choices[0].message.content}")
except Exception as e:
    print(f"Error message: {e}")

Node.js SDK

// Preparations:
// For Windows/Mac/Linux:
// 1. Make sure Node.js is installed (version >= 14 is recommended).
// 2. Run the following command to install the necessary dependencies: npm install openai

import OpenAI from "openai";

const client = new OpenAI({
  // The API keys for the Singapore/US and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
  // If you have not configured environment variables, replace the following line with: apiKey: "sk-xxx",
  apiKey: process.env.DASHSCOPE_API_KEY,
  // The following URL is for the Singapore/US region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1
  baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1", 
});

async function main() {
  try {
    const streamEnabled = false; // Whether to enable streaming output
    const completion = await client.chat.completions.create({
      model: "qwen3-asr-flash",
      messages: [
        {
          role: "user",
          content: [
            {
              type: "input_audio",
              input_audio: {
                data: "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
              }
            }
          ]
        }
      ],
      stream: streamEnabled,
      // When stream is set to False, you cannot set the stream_options parameter.
      // stream_options: {
      //   "include_usage": true
      // },
      extra_body: {
        asr_options: {
          // language: "zh",
          enable_itn: false
        }
      }
    });

    if (streamEnabled) {
      let fullContent = "";
      console.log("Streaming output content:");
      for await (const chunk of completion) {
        console.log(JSON.stringify(chunk));
        if (chunk.choices && chunk.choices.length > 0) {
          const delta = chunk.choices[0].delta;
          if (delta && delta.content) {
            fullContent += delta.content;
          }
        }
      }
      console.log(`Full content: ${fullContent}`);
    } else {
      console.log(`Non-streaming output content: ${completion.choices[0].message.content}`);
    }
  } catch (err) {
    console.error(`Error message: ${err}`);
  }
}

main();

cURL

You can configure the context for customized recognition using the text parameter of the System Message.

# ======= Important =======
# The following URL is for the Singapore/US region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
# The API keys for the Singapore/US and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# === Delete this comment before execution ===

curl -X POST 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3-asr-flash",
    "messages": [
        {
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                    }
                }
            ],
            "role": "user"
        }
    ],
    "stream":false,
    "asr_options": {
        "enable_itn": false
    }
}'

Input: Base64-encoded audio file

You can input Base64-encoded data (Data URL) in the format: data:<mediatype>;base64,<data>.

  • <mediatype>: The Multipurpose Internet Mail Extensions (MIME) type.

    This varies by audio format. For example:

    • WAV: audio/wav

    • MP3: audio/mpeg

  • <data>: The Base64-encoded string of the audio.

    Base64 encoding increases the file size. Ensure the original file size is small enough that the encoded file does not exceed the 10 MB input audio size limit.

  • Example: data:audio/wav;base64,SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjU4LjI5LjEwMAAAAAAAAAAAAAAA//PAxABQ/BXRbMPe4IQAhl9

    Sample code

    Python

    import base64, pathlib

    # input.mp3 is the local audio file to be recognized. Replace it with the path to your own audio file and ensure it meets the audio requirements.
    file_path = pathlib.Path("input.mp3")
    base64_str = base64.b64encode(file_path.read_bytes()).decode()
    data_uri = f"data:audio/mpeg;base64,{base64_str}"

    Java

    import java.nio.file.*;
    import java.util.Base64;

    public class Main {
        /**
         * filePath is the local audio file to be recognized. Replace it with the path to your own audio file and ensure it meets the audio requirements.
         */
        public static String toDataUrl(String filePath) throws Exception {
            byte[] bytes = Files.readAllBytes(Paths.get(filePath));
            String encoded = Base64.getEncoder().encodeToString(bytes);
            return "data:audio/mpeg;base64," + encoded;
        }

        // Example usage
        public static void main(String[] args) throws Exception {
            System.out.println(toDataUrl("input.mp3"));
        }
    }

Python SDK

The audio file used in the example is welcome.mp3.

import base64
from openai import OpenAI
import os
import pathlib

try:
    # Replace with the actual path to your audio file
    file_path = "welcome.mp3"
    # Replace with the actual MIME type of your audio file
    audio_mime_type = "audio/mpeg"

    file_path_obj = pathlib.Path(file_path)
    if not file_path_obj.exists():
        raise FileNotFoundError(f"Audio file not found: {file_path}")

    base64_str = base64.b64encode(file_path_obj.read_bytes()).decode()
    data_uri = f"data:{audio_mime_type};base64,{base64_str}"

    client = OpenAI(
        # The API keys for the Singapore/US and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
        # If you have not configured environment variables, replace the following line with: api_key = "sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        # The following URL is for the Singapore/US region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1
        base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    )
    

    stream_enabled = False  # Whether to enable streaming output
    completion = client.chat.completions.create(
        model="qwen3-asr-flash",
        messages=[
            {
                "content": [
                    {
                        "type": "input_audio",
                        "input_audio": {
                            "data": data_uri
                        }
                    }
                ],
                "role": "user"
            }
        ],
        stream=stream_enabled,
        # When stream is set to False, you cannot set the stream_options parameter.
        # stream_options={"include_usage": True},
        extra_body={
            "asr_options": {
                # "language": "zh",
                "enable_itn": False
            }
        }
    )
    if stream_enabled:
        full_content = ""
        print("Streaming output content:")
        for chunk in completion:
            # If stream_options.include_usage is True, the choices field of the last chunk is an empty list and should be skipped. You can get token usage from chunk.usage.
            print(chunk)
            if chunk.choices and chunk.choices[0].delta.content:
                full_content += chunk.choices[0].delta.content
        print(f"Full content: {full_content}")
    else:
        print(f"Non-streaming output content: {completion.choices[0].message.content}")
except Exception as e:
    print(f"Error message: {e}")

Node.js SDK

The audio file used in the example is welcome.mp3.

// Preparations:
// For Windows/Mac/Linux:
// 1. Make sure Node.js is installed (version >= 14 is recommended).
// 2. Run the following command to install the necessary dependencies: npm install openai

import OpenAI from "openai";
import { readFileSync } from 'fs';

const client = new OpenAI({
  // The API keys for the Singapore/US and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
  // If you have not configured environment variables, replace the following line with: apiKey: "sk-xxx",
  apiKey: process.env.DASHSCOPE_API_KEY,
  // The following URL is for the Singapore/US region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/compatible-mode/v1
  baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1", 
});

const encodeAudioFile = (audioFilePath) => {
    const audioFile = readFileSync(audioFilePath);
    return audioFile.toString('base64');
};

// Replace with the actual path to your audio file
const dataUri = `data:audio/mpeg;base64,${encodeAudioFile("welcome.mp3")}`;

async function main() {
  try {
    const streamEnabled = false; // Whether to enable streaming output
    const completion = await client.chat.completions.create({
      model: "qwen3-asr-flash",
      messages: [
        {
          role: "user",
          content: [
            {
              type: "input_audio",
              input_audio: {
                data: dataUri
              }
            }
          ]
        }
      ],
      stream: streamEnabled,
      // When stream is set to False, you cannot set the stream_options parameter.
      // stream_options: {
      //   "include_usage": true
      // },
      extra_body: {
        asr_options: {
          // language: "zh",
          enable_itn: false
        }
      }
    });

    if (streamEnabled) {
      let fullContent = "";
      console.log("Streaming output content:");
      for await (const chunk of completion) {
        console.log(JSON.stringify(chunk));
        if (chunk.choices && chunk.choices.length > 0) {
          const delta = chunk.choices[0].delta;
          if (delta && delta.content) {
            fullContent += delta.content;
          }
        }
      }
      console.log(`Full content: ${fullContent}`);
    } else {
      console.log(`Non-streaming output content: ${completion.choices[0].message.content}`);
    }
  } catch (err) {
    console.error(`Error message: ${err}`);
  }
}

main();

model string (Required)

The model name. In OpenAI compatible mode, only Qwen3-ASR-Flash is supported.

messages array (Required)

The list of messages.

Message type

System Message object (Optional)

The goal or role of the model. If you set a system message, place it at the beginning of the messages list.

Properties

content array (Required)

The content of the message. Only one set of messages is allowed.

Properties

text string

Specifies the context. Qwen3-ASR-Flash lets you provide background text, entity vocabularies, and other reference information (context) during speech recognition to obtain customized recognition results.

Length limit: 10,000 tokens.

For more information, see Context biasing.

role string (Required)

Set to system.
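
For illustration, the following Python sketch shows a messages list that supplies context through a System Message, following the content and text properties described above. The context string and the exact layout of the system content item are illustrative only; see Context biasing for the authoritative format.

# Hypothetical sketch: supply context (biasing) text through a System Message.
# The context string below is a made-up example.
messages = [
    {
        "role": "system",
        "content": [{"text": "Names likely to appear: Doris Jackson, Wakefield"}],
    },
    {
        "role": "user",
        "content": [
            {
                "type": "input_audio",
                "input_audio": {
                    "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                },
            }
        ],
    },
]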

User Message object (Required)

The message sent by the user to the model.

Properties

content array (Required)

The content of the user message. Only one set of messages is allowed.

Properties

type string (Required)

Set to input_audio, which indicates that the input is audio.

input_audio object (Required)

The audio to be recognized, passed in the data field as shown in the examples above. For more information, see Getting started.

In OpenAI compatible mode, Qwen3-ASR-Flash supports two input formats: Base64-encoded files and URLs of files that are accessible over the Internet.

When you use the SDK, if your recording files are stored in OSS, you cannot use temporary URLs that start with the oss:// prefix.

When you use a RESTful API, if the audio files are stored in OSS, you can use temporary URLs that start with the oss:// prefix. Note the following:

Important
  • The temporary URL is valid for 48 hours and cannot be used after it expires. Do not use it in a production environment.

  • The API for obtaining an upload credential is limited to 100 QPS and does not support scaling out. Do not use it in production environments, high-concurrency scenarios, or stress testing scenarios.

  • For production environments, use a stable storage service such as OSS to ensure long-term file availability and avoid rate limiting issues.

role string (Required)

The role of the user message. Set to user.

asr_options object (Optional)

Specifies whether to enable certain features.

asr_options is not a standard OpenAI parameter. If you use an OpenAI SDK, pass this parameter through extra_body.

Properties

language string (Optional) No default value

If you know the language of the audio, you can specify it using this parameter to improve recognition accuracy.

You can specify only one language.

If the language of the audio is uncertain or contains multiple languages, such as a mix of Chinese, English, Japanese, and Korean, do not specify this parameter.

Valid values

  • zh: Chinese (Mandarin, Sichuanese, Minnan, and Wu)

  • yue: Cantonese

  • en: English

  • ja: Japanese

  • de: German

  • ko: Korean

  • ru: Russian

  • fr: French

  • pt: Portuguese

  • ar: Arabic

  • it: Italian

  • es: Spanish

  • hi: Hindi

  • id: Indonesian

  • th: Thai

  • tr: Turkish

  • uk: Ukrainian

  • vi: Vietnamese

  • cs: Czech

  • da: Danish

  • fil: Filipino

  • fi: Finnish

  • is: Icelandic

  • ms: Malay

  • no: Norwegian

  • pl: Polish

  • sv: Swedish

enable_itn boolean (Optional) Defaults to: false

Specifies whether to enable Inverse Text Normalization (ITN). This feature is applicable only to Chinese and English audio.

Parameter values:

  • true

  • false

stream boolean (Optional) Defaults to: false

Specifies whether to use streaming output for the response. For more information, see Streaming output.

Valid values:

  • false: The model generates all content and returns it at once.

  • true: The model generates and outputs content simultaneously. A data block (chunk) is returned each time a part of the content is generated. You must read these blocks in real time to assemble the complete reply.

We recommend that you set this parameter to true to improve responsiveness and reduce the risk of timeouts.

stream_options object (Optional)

The configuration items for streaming output. This parameter takes effect only when stream is set to true.

Properties

include_usage boolean (Optional) Defaults to: false

Specifies whether to include token consumption information in the last data block of the response.

Valid values:

  • true

  • false

When streaming output is enabled, token consumption information can appear only in the last data block of the response.

Response body

Non-streaming output

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "annotations": [
                    {
                        "emotion": "neutral",
                        "language": "zh",
                        "type": "audio_info"
                    }
                ],
                "content": "Welcome to Alibaba Cloud.",
                "role": "assistant"
            }
        }
    ],
    "created": 1767683986,
    "id": "chatcmpl-487abe5f-d4f2-9363-a877-xxxxxxx",
    "model": "qwen3-asr-flash",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 12,
        "completion_tokens_details": {
            "text_tokens": 12
        },
        "prompt_tokens": 42,
        "prompt_tokens_details": {
            "audio_tokens": 42,
            "text_tokens": 0
        },
        "seconds": 1,
        "total_tokens": 54
    }
}

Streaming output

data: {"model":"qwen3-asr-flash","id":"chatcmpl-3fb97803-d27f-9289-8889-xxxxx","created":1767685989,"object":"chat.completion.chunk","usage":null,"choices":[{"logprobs":null,"index":0,"delta":{"content":"","role":"assistant"}}]}

data: {"model":"qwen3-asr-flash","id":"chatcmpl-3fb97803-d27f-9289-8889-xxxxx","choices":[{"delta":{"annotations":[{"type":"audio_info","language":"zh","emotion":"neutral"}],"content":"Welcome","role":null},"index":0}],"created":1767685989,"object":"chat.completion.chunk","usage":null}

data: {"model":"qwen3-asr-flash","id":"chatcmpl-3fb97803-d27f-9289-8889-xxxxx","choices":[{"delta":{"annotations":[{"type":"audio_info","language":"zh","emotion":"neutral"}],"content":" to","role":null},"index":0}],"created":1767685989,"object":"chat.completion.chunk","usage":null}

data: {"model":"qwen3-asr-flash","id":"chatcmpl-3fb97803-d27f-9289-8889-xxxxx","choices":[{"delta":{"annotations":[{"type":"audio_info","language":"zh","emotion":"neutral"}],"content":" Alibaba","role":null},"index":0}],"created":1767685989,"object":"chat.completion.chunk","usage":null}

data: {"model":"qwen3-asr-flash","id":"chatcmpl-3fb97803-d27f-9289-8889-xxxxx","choices":[{"delta":{"annotations":[{"type":"audio_info","language":"zh","emotion":"neutral"}],"content":" Cloud","role":null},"index":0}],"created":1767685989,"object":"chat.completion.chunk","usage":null}

data: {"model":"qwen3-asr-flash","id":"chatcmpl-3fb97803-d27f-9289-8889-xxxxx","choices":[{"delta":{"annotations":[{"type":"audio_info","language":"zh","emotion":"neutral"}],"content":".","role":null},"index":0}],"created":1767685989,"object":"chat.completion.chunk","usage":null}

data: {"model":"qwen3-asr-flash","id":"chatcmpl-3fb97803-d27f-9289-8889-xxxxx","choices":[{"delta":{"role":null},"index":0,"finish_reason":"stop"}],"created":1767685989,"object":"chat.completion.chunk","usage":null}

data: [DONE]

id string

The unique identifier for this call.

choices array

The output information of the model.

Properties

finish_reason string

The following three cases apply:

  • null: Generation is in progress.

  • stop: The model finished generating output naturally or was stopped by a stop condition in the input parameters.

  • length: Generation was stopped because the output exceeded the maximum length.

index integer

The index of the current object in the choices array.

message object

The message object output by the model.

Properties

role string

The role of the output message. The value is assistant.

content string

The speech recognition result.

annotations array

The output annotation information, such as the language.

Properties

language string

The language of the recognized audio. If the language request parameter is specified, this value is the same as the value of that parameter.

Valid values

  • zh: Chinese (Mandarin, Sichuanese, Minnan, and Wu)

  • yue: Cantonese

  • en: English

  • ja: Japanese

  • de: German

  • ko: Korean

  • ru: Russian

  • fr: French

  • pt: Portuguese

  • ar: Arabic

  • it: Italian

  • es: Spanish

  • hi: Hindi

  • id: Indonesian

  • th: Thai

  • tr: Turkish

  • uk: Ukrainian

  • vi: Vietnamese

  • cs: Czech

  • da: Danish

  • fil: Filipino

  • fi: Finnish

  • is: Icelandic

  • ms: Malay

  • no: Norwegian

  • pl: Polish

  • sv: Swedish

type string

Set to audio_info, which indicates audio information.

emotion string

The emotion of the recognized audio. The following emotions are supported:

  • surprised

  • neutral

  • happy

  • sad

  • disgusted

  • angry

  • fearful

created integer

The UNIX timestamp (in seconds) when the request was created.

model string

The model used for this request.

object string

Always chat.completion.

usage object

The token consumption information for this request.

Properties

completion_tokens integer

The number of tokens in the model's output.

completion_tokens_details object

The fine-grained details of the tokens in the model's output.

Properties

text_tokens integer

The number of tokens in the model's output text.

prompt_tokens integer

The number of tokens in the input.

prompt_tokens_details object

The fine-grained details of the tokens in the input.

Properties

audio_tokens integer

The length of the input audio in tokens. The conversion rule is that each second of audio is converted to 25 tokens. Audio shorter than 1 second is counted as 1 second.

text_tokens integer

Ignore this parameter.

seconds integer

The duration of the audio in seconds.

total_tokens integer

The total number of tokens in the input and output (total_tokens = completion_tokens + prompt_tokens).

DashScope synchronous

URL

International

In the International deployment mode, the endpoint and data storage are located in the Singapore region, and model inference compute resources are dynamically scheduled globally, excluding Mainland China.

HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v1

United States

In the US deployment mode, the endpoint and data storage are located in the US (Virginia) region, and model inference compute resources are restricted to the United States.

HTTP endpoint: POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

base_url for SDK: https://dashscope-us.aliyuncs.com/api/v1

Mainland China

In the Mainland China deployment mode, the endpoint and data storage are located in the Beijing region, and model inference compute resources are available only in Mainland China.

HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

base_url for SDK: https://dashscope.aliyuncs.com/api/v1

Request body

Qwen3-ASR-Flash

The following example shows how to recognize audio from a URL. For an example of how to recognize a local audio file, see QuickStart.

cURL

# ======= Important =======
# The following URL is for the Singapore region. If you use a model in the US region, replace the URL with: https://dashscope-us.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# The API keys for the Singapore/US and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# If you use a model in the US region, add the "-us" suffix to the model name, for example, qwen3-asr-flash-us.
# === Delete this comment before execution ===

curl -X POST "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation" \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3-asr-flash",
    "input": {
        "messages": [
            {
                "content": [
                    {
                        "text": ""
                    }
                ],
                "role": "system"
            },
            {
                "content": [
                    {
                        "audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                    }
                ],
                "role": "user"
            }
        ]
    },
    "parameters": {
        "asr_options": {
            "enable_itn": false
        }
    }
}'

Java

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.Constants;
import com.alibaba.dashscope.utils.JsonUtils;

public class Main {
    public static void simpleMultiModalConversationCall()
            throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder()
                .role(Role.USER.getValue())
                .content(Arrays.asList(
                        Collections.singletonMap("audio", "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3")))
                .build();

        MultiModalMessage sysMessage = MultiModalMessage.builder().role(Role.SYSTEM.getValue())
                // Configure the context for customized recognition here
                .content(Arrays.asList(Collections.singletonMap("text", "")))
                .build();

        Map<String, Object> asrOptions = new HashMap<>();
        asrOptions.put("enable_itn", false);
        // asrOptions.put("language", "zh"); // Optional. If the audio language is known, you can specify it using this parameter to improve recognition accuracy.
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                // The API keys for the Singapore/US and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
                // If you have not configured environment variables, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                // If you use a model in the US region, add the "-us" suffix to the model name, for example, qwen3-asr-flash-us
                .model("qwen3-asr-flash")
                .message(sysMessage)
                .message(userMessage)
                .parameter("asr_options", asrOptions)
                .build();
        MultiModalConversationResult result = conv.call(param);
        System.out.println(JsonUtils.toJson(result));
    }
    public static void main(String[] args) {
        try {
            // The following URL is for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1. If you use a model in the US region, replace the URL with: https://dashscope-us.aliyuncs.com/api/v1
            Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
            simpleMultiModalConversationCall();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

Python

import os
import dashscope

# The following URL is for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1. If you use a model in the US region, replace the URL with: https://dashscope-us.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

messages = [
    {"role": "system", "content": [{"text": ""}]},  # Configure the context for customized recognition
    {"role": "user", "content": [{"audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"}]}
]

response = dashscope.MultiModalConversation.call(
    # The API keys for the Singapore/US and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    # If you have not configured environment variables, replace the following line with: api_key = "sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # If you use a model in the US region, add the "-us" suffix to the model name, for example, qwen3-asr-flash-us
    model="qwen3-asr-flash",
    messages=messages,
    result_format="message",
    asr_options={
        #"language": "zh", # Optional. If the audio language is known, you can specify it using this parameter to improve recognition accuracy.
        "enable_itn":False
    }
)
print(response)

model string (Required)

The name of the model. This parameter is applicable only to Qwen3-ASR-Flash.

messages array (Required)

The list of messages.

When you make an HTTP call, place messages in the input object.

Message type

System Message object (Optional)

The goal or role of the model. If you set a system message, place it at the beginning of the messages list.

This parameter is supported only by Qwen3-ASR-Flash.

Properties

content array (Required)

The content of the message. Only one set of messages is allowed.

Properties

text string

Specifies the context. Qwen3-ASR-Flash lets you provide background text, entity vocabularies, and other reference information (context) during speech recognition to obtain customized recognition results.

Length limit: 10,000 tokens.

For more information, see Context biasing.

role string (Required)

Set to system.

User Message object (Required)

The message sent by the user to the model.

Properties

content array (Required)

The content of the user message. Only one set of messages is allowed.

Properties

audio string (Required)

The audio to be recognized. For more information, see Getting started.

When called through DashScope, Qwen3-ASR-Flash supports three input formats: Base64-encoded files, absolute paths of local files, and URLs of files that are accessible over the Internet.

When you use the SDK, if your recording files are stored in OSS, you cannot use temporary URLs that start with the oss:// prefix.

When you use a RESTful API, if the audio files are stored in OSS, you can use temporary URLs that start with the oss:// prefix. Note the following:

Important
  • The temporary URL is valid for 48 hours and cannot be used after it expires. Do not use it in a production environment.

  • The API for obtaining an upload credential is limited to 100 QPS and does not support scaling out. Do not use it in production environments, high-concurrency scenarios, or stress testing scenarios.

  • For production environments, use a stable storage service such as OSS to ensure long-term file availability and avoid rate limiting issues.

role string (Required)

The role of the user message. Set to user.

asr_options object (Optional)

Specifies whether to enable certain features.

This parameter is supported only by Qwen3-ASR-Flash.

Properties

language string (Optional) No default value

If you know the language of the audio, you can specify it using this parameter to improve recognition accuracy.

You can specify only one language.

If the language of the audio is uncertain or contains multiple languages, such as a mix of Chinese, English, Japanese, and Korean, do not specify this parameter.

Valid values

  • zh: Chinese (Mandarin, Sichuanese, Minnan, and Wu)

  • yue: Cantonese

  • en: English

  • ja: Japanese

  • de: German

  • ko: Korean

  • ru: Russian

  • fr: French

  • pt: Portuguese

  • ar: Arabic

  • it: Italian

  • es: Spanish

  • hi: Hindi

  • id: Indonesian

  • th: Thai

  • tr: Turkish

  • uk: Ukrainian

  • vi: Vietnamese

  • cs: Czech

  • da: Danish

  • fil: Filipino

  • fi: Finnish

  • is: Icelandic

  • ms: Malay

  • no: Norwegian

  • pl: Polish

  • sv: Swedish

enable_itn boolean (Optional) Defaults to: false

Specifies whether to enable Inverse Text Normalization (ITN). This feature is applicable only to Chinese and English audio.

Parameter values:

  • true

  • false

Response body

Qwen3-ASR-Flash

{
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "annotations": [
                        {
                            "language": "zh",
                            "type": "audio_info",
                            "emotion": "neutral"
                        }
                    ],
                    "content": [
                        {
                            "text": "Welcome to Alibaba Cloud."
                        }
                    ],
                    "role": "assistant"
                }
            }
        ]
    },
    "usage": {
        "input_tokens_details": {
            "text_tokens": 0
        },
        "output_tokens_details": {
            "text_tokens": 6
        },
        "seconds": 1
    },
    "request_id": "568e2bf0-d6f2-97f8-9f15-a57b11dc6977"
}

request_id string

The unique identifier for this call.

The parameter returned by the Java SDK is requestId.

output object

The information about the call result.

Properties

choices array

The output of the model. The choices parameter is returned only when result_format is set to message.

Properties

finish_reason string

The following three cases apply:

  • null: Generation is in progress.

  • stop: The model finished generating output naturally or was stopped by a stop condition in the input parameters.

  • length: Generation was stopped because the output exceeded the maximum length.

message object

The message object output by the model.

Properties

role string

The role of the output message. The value is assistant.

content array

The content of the output message.

Properties

text string

The speech recognition result.

annotations array

The output annotation information, such as the language.

Properties

language string

The language of the recognized audio. If the language request parameter is specified, this value is the same as the value of that parameter.

Valid values

  • zh: Chinese (Mandarin, Sichuanese, Minnan, and Wu)

  • yue: Cantonese

  • en: English

  • ja: Japanese

  • de: German

  • ko: Korean

  • ru: Russian

  • fr: French

  • pt: Portuguese

  • ar: Arabic

  • it: Italian

  • es: Spanish

  • hi: Hindi

  • id: Indonesian

  • th: Thai

  • tr: Turkish

  • uk: Ukrainian

  • vi: Vietnamese

  • cs: Czech

  • da: Danish

  • fil: Filipino

  • fi: Finnish

  • is: Icelandic

  • ms: Malay

  • no: Norwegian

  • pl: Polish

  • sv: Swedish

type string

Set to audio_info, which indicates audio information.

emotion string

The emotion of the recognized audio. The following emotions are supported:

  • surprised

  • neutral

  • happy

  • sad

  • disgusted

  • angry

  • fearful

usage object

The token consumption information for this request.

Properties

input_tokens_details object

The length of the input content for Qwen3-ASR-Flash in tokens.

Properties

text_tokens integer

Ignore this parameter.

output_tokens_details object

The length of the output content from Qwen3-ASR-Flash in tokens.

Properties

text_tokens integer

The length of the recognized text output by Qwen3-ASR-Flash in tokens.

seconds integer

The duration of the audio for Qwen3-ASR-Flash in seconds.

DashScope asynchronous

Process description

Unlike the OpenAI compatible mode or DashScope synchronous call, which involve a single request and an immediate response, asynchronous invocation is designed for processing long audio files or other time-consuming tasks. This mode uses a two-step "submit-poll" process to prevent request timeouts caused by long waiting times:

  1. Step 1: Submit a task

    • The client initiates an asynchronous processing request.

    • After validating the request, the server does not execute the task immediately. Instead, it returns a unique task_id to indicate that the task has been successfully created.

  2. Step 2: Get the result

    • The client uses the obtained task_id to repeatedly call the result query API through polling.

    • After the task is complete, the result query API returns the final recognition result.

You can choose to use an SDK or directly call the RESTful API based on your integration environment.

  • Use an SDK (see Getting started for sample code, Submit a task's request body for request parameters, and Asynchronous call recognition result for returned results).

    The SDK encapsulates the underlying API call details, providing a more convenient programming experience.

    1. Submit a task: Call the async_call() (Python) or asyncCall() (Java) method to submit the task. This method returns a task object that contains a task_id.

    2. Get the result: Use the task object returned in the previous step or the task_id to call the fetch() method to retrieve the result. The SDK automatically handles the polling logic until the task is complete or times out.

  • Use a RESTful API

    Directly calling the HTTP API provides maximum flexibility.

    1. Submit a task. If the request is successful, the response body contains a task_id.

    2. Use the task_id from the previous step to retrieve the task execution result.
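
The following Python sketch chains the two RESTful calls described in this section: it submits a task to the transcription endpoint and then polls GET /api/v1/tasks/{task_id} until the task leaves the PENDING and RUNNING states. It is a minimal illustration of the submit-poll flow using the Singapore endpoints; the polling interval is arbitrary.

import os
import time

import requests

api_key = os.getenv("DASHSCOPE_API_KEY")

# Step 1: submit the transcription task (see "Submit a task" below for the full parameter reference).
submit = requests.post(
    "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "X-DashScope-Async": "enable",
    },
    json={
        "model": "qwen3-asr-flash-filetrans",
        "input": {"file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"},
        "parameters": {"channel_id": [0], "enable_itn": False},
    },
)
submit.raise_for_status()
task_id = submit.json()["output"]["task_id"]

# Step 2: poll the task result until it is no longer PENDING or RUNNING.
while True:
    result = requests.get(
        f"https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}",
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-DashScope-Async": "enable",
            "Content-Type": "application/json",
        },
    ).json()
    status = result["output"]["task_status"]
    if status not in ("PENDING", "RUNNING"):
        break
    time.sleep(3)  # arbitrary polling interval

if status == "SUCCEEDED":
    print(result["output"]["result"]["transcription_url"])
else:
    print(result)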

Submit a task

URL

International

In the International deployment mode, the endpoint and data storage are located in the Singapore region, and model inference compute resources are dynamically scheduled globally, excluding Mainland China.

HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription

base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v1

Mainland China

In the Mainland China deployment mode, the endpoint and data storage are located in the Beijing region, and model inference compute resources are restricted to Mainland China.

HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription

base_url for SDK: https://dashscope.aliyuncs.com/api/v1

Request body

cURL

# ======= Important =======
# The following URL is for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription
# The API keys for the Singapore and Beijing regions are different. For more information about how to obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key.
# === Delete this comment before running the command. ===

curl --location --request POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--header "X-DashScope-Async: enable" \
--data '{
    "model": "qwen3-asr-flash-filetrans",
    "input": {
        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    },
    "parameters": {
        "channel_id":[
            0
        ], 
        "enable_itn": false
    }
}'

Java

For SDK samples, see Getting Started.

import com.google.gson.Gson;
import com.google.gson.annotations.SerializedName;
import okhttp3.*;

import java.io.IOException;

public class Main {
    // The following URL is for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription
    private static final String API_URL = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription";

    public static void main(String[] args) {
        // The API keys for the Singapore and Beijing regions are different. For more information about how to obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key.
        // If you have not configured the environment variable, replace the following line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        OkHttpClient client = new OkHttpClient();
        Gson gson = new Gson();

        /*String payloadJson = """
                {
                    "model": "qwen3-asr-flash-filetrans",
                    "input": {
                        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                    },
                    "parameters": {
                        "channel_id": [0],
                        "enable_itn": false,
                        "language": "zh",
                        "corpus": {
                            "text": ""
                        }
                    }
                }
                """;*/
        String payloadJson = """
                {
                    "model": "qwen3-asr-flash-filetrans",
                    "input": {
                        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                    },
                    "parameters": {
                        "channel_id": [0],
                        "enable_itn": false
                    }
                }
                """;

        RequestBody body = RequestBody.create(payloadJson, MediaType.get("application/json; charset=utf-8"));
        Request request = new Request.Builder()
                .url(API_URL)
                .addHeader("Authorization", "Bearer " + apiKey)
                .addHeader("Content-Type", "application/json")
                .addHeader("X-DashScope-Async", "enable")
                .post(body)
                .build();

        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful() && response.body() != null) {
                String respBody = response.body().string();
                // Parse JSON with Gson.
                ApiResponse apiResp = gson.fromJson(respBody, ApiResponse.class);
                if (apiResp.output != null) {
                    System.out.println("task_id: " + apiResp.output.taskId);
                } else {
                    System.out.println(respBody);
                }
            } else {
                System.out.println("task failed! HTTP code: " + response.code());
                if (response.body() != null) {
                    System.out.println(response.body().string());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    static class ApiResponse {
        @SerializedName("request_id")
        String requestId;

        Output output;
    }

    static class Output {
        @SerializedName("task_id")
        String taskId;

        @SerializedName("task_status")
        String taskStatus;
    }
}

Python

For SDK examples, see Getting Started.

import requests
import json
import os

# The following URL is for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription"

# The API keys for the Singapore and Beijing regions are different. For more information about how to obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key.
# If you have not configured the environment variable, replace the following line with your Model Studio API key: DASHSCOPE_API_KEY = "sk-xxx"
DASHSCOPE_API_KEY = os.getenv("DASHSCOPE_API_KEY")

headers = {
    "Authorization": f"Bearer {DASHSCOPE_API_KEY}",
    "Content-Type": "application/json",
    "X-DashScope-Async": "enable"
}

payload = {
    "model": "qwen3-asr-flash-filetrans",
    "input": {
        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    },
    "parameters": {
        "channel_id": [0],
        # "language": "zh",
        "enable_itn": False
        # "corpus": {
        #     "text": ""
        # }
    }
}

response = requests.post(url, headers=headers, data=json.dumps(payload))
if response.status_code == 200:
    print(f"task_id: {response.json()["output"]["task_id"]}")
else:
    print("task failed!")
    print(response.json())

model string (Required)

The name of the model. This parameter is applicable only to Qwen3-ASR-Flash-Filetrans.

input object (Required)

Properties

file_url string (Required)

The URL of the audio file to be recognized. The URL must be accessible over the Internet.

When you use the SDK, if your recording files are stored in OSS, you cannot use temporary URLs that start with the oss:// prefix.

When you use a RESTful API, if the audio files are stored in OSS, you can use temporary URLs that start with the oss:// prefix. Note the following:

Important
  • The temporary URL is valid for 48 hours and cannot be used after it expires. Do not use it in a production environment.

  • The API for obtaining an upload credential is limited to 100 QPS and does not support scaling out. Do not use it in production environments, high-concurrency scenarios, or stress testing scenarios.

  • For production environments, use a stable storage service such as OSS to ensure long-term file availability and avoid rate limiting issues.

parameters object (Optional)

Properties

language string (Optional) No default value

If you know the language of the audio, you can specify it using this parameter to improve recognition accuracy.

You can specify only one language.

If the language of the audio is uncertain or contains multiple languages, such as a mix of Chinese, English, Japanese, and Korean, do not specify this parameter.

Valid values

  • zh: Chinese (Mandarin, Sichuanese, Minnan, and Wu)

  • yue: Cantonese

  • en: English

  • ja: Japanese

  • de: German

  • ko: Korean

  • ru: Russian

  • fr: French

  • pt: Portuguese

  • ar: Arabic

  • it: Italian

  • es: Spanish

  • hi: Hindi

  • id: Indonesian

  • th: Thai

  • tr: Turkish

  • uk: Ukrainian

  • vi: Vietnamese

  • cs: Czech

  • da: Danish

  • fil: Filipino

  • fi: Finnish

  • is: Icelandic

  • ms: Malay

  • no: Norwegian

  • pl: Polish

  • sv: Swedish

enable_itn boolean (Optional) Defaults to: false

Specifies whether to enable Inverse Text Normalization (ITN). This feature is applicable only to Chinese and English audio.

Parameter values:

  • true

  • false

enable_words boolean (Optional) Defaults to: false

Controls whether to return word-level timestamps:

  • false: Returns sentence-level timestamps.

  • true: Returns word-level timestamps.

    Word-level timestamps are supported only for the following languages: Chinese, English, Japanese, Korean, German, French, Spanish, Italian, Portuguese, and Russian. Accuracy may not be guaranteed for other languages.

This parameter also affects the sentence segmentation rules:

  • false: Segments sentences based on voice activity detection (VAD).

  • true: Segments sentences based on VAD and punctuation.

text string

Specifies the context. Qwen3-ASR-Flash lets you provide background text, entity vocabularies, and other reference information (context) during speech recognition to obtain customized recognition results.

Length limit: 10,000 tokens.

For more information, see Context biasing.
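
In the sample requests in this section, this context text is passed inside a corpus object under parameters. A minimal Python sketch of such a parameters payload follows; the context string is a made-up example.

parameters = {
    "channel_id": [0],
    "enable_itn": False,
    # Context (biasing) text; replace with background text or vocabulary relevant to your audio.
    "corpus": {
        "text": "Speaker names: Doris Jackson; school: Wakefield"
    },
}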

channel_id array (Optional) Defaults to: [0]

Specifies the indexes of the audio tracks to be recognized in a multi-track audio file. The index starts from 0. For example, [0] indicates that the first track is recognized, and [0, 1] indicates that the first and second tracks are recognized simultaneously. If this parameter is omitted, the first track is processed by default.

Important

Each specified audio track is billed separately. For example, requesting [0, 1] for a single file incurs two separate charges.

Response body

{
    "request_id": "92e3decd-0c69-47a8-************",
    "output": {
        "task_id": "8fab76d0-0eed-4d20-************",
        "task_status": "PENDING"
    }
}

request_id string

The unique identifier for this call.

output object

The information about the call result.

Properties

task_id string

The task ID. This ID is passed as a request parameter in the API for querying speech recognition tasks.

task_status string

The task status:

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • UNKNOWN: The task does not exist or its status is unknown.

Get the task execution result

URL

International

HTTP endpoint: GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}

base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v1

Mainland China

HTTP endpoint: GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}

base_url for SDK: https://dashscope.aliyuncs.com/api/v1

Request body

cURL

# ======= Important =======
# The following URL is for the Singapore region. If you use a model in the Beijing region, replace the URL with https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}. Note that you must replace {task_id} with the ID of the task to be queried.
# The API keys for the Singapore and Beijing regions are different. For more information about how to obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key.
# === Delete this comment before running the command. ===

curl --location --request GET 'https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "X-DashScope-Async: enable" \
--header "Content-Type: application/json"

Java

For SDK examples, see Getting Started.

import okhttp3.*;

import java.io.IOException;

public class Main {
    public static void main(String[] args) {
        // Replace with the actual task_id.
        String taskId = "xxx";
        // The API keys for the Singapore and Beijing regions are different. For more information about how to obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key.
        // If you have not configured the environment variable, replace the following line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        // The following URL is for the Singapore region. If you use a model in the Beijing region, replace the URL with https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}. Note that you must replace {task_id} with the ID of the task to be queried.
        String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/tasks/" + taskId;

        OkHttpClient client = new OkHttpClient();

        Request request = new Request.Builder()
                .url(apiUrl)
                .addHeader("Authorization", "Bearer " + apiKey)
                .addHeader("X-DashScope-Async", "enable")
                .addHeader("Content-Type", "application/json")
                .get()
                .build();

        try (Response response = client.newCall(request).execute()) {
            if (response.body() != null) {
                System.out.println(response.body().string());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Python

For SDK examples, see Getting Started.

import os
import requests


# The API keys for the Singapore and Beijing regions are different. For more information about how to obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key.
# If you have not configured the environment variable, replace the following line with your Model Studio API key: DASHSCOPE_API_KEY = "sk-xxx"
DASHSCOPE_API_KEY = os.getenv("DASHSCOPE_API_KEY")

# Replace with the actual task_id.
task_id = "xxx"
# The following URL is for the Singapore region. If you use a model in the Beijing region, replace the URL with https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}. Note that you must replace {task_id} with the ID of the task to be queried.
url = f"https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}"

headers = {
    "Authorization": f"Bearer {DASHSCOPE_API_KEY}",
    "X-DashScope-Async": "enable",
    "Content-Type": "application/json"
}

response = requests.get(url, headers=headers)
print(response.json())

task_id string (Required)

The ID of the task. Pass the task_id returned by the Submit a task operation to query the speech recognition result.

Response body

RUNNING

{
    "request_id": "6769df07-2768-4fb0-ad59-************",
    "output": {
        "task_id": "9be1700a-0f8e-4778-be74-************",
        "task_status": "RUNNING",
        "submit_time": "2025-10-27 14:19:31.150",
        "scheduled_time": "2025-10-27 14:19:31.233",
        "task_metrics": {
            "TOTAL": 1,
            "SUCCEEDED": 0,
            "FAILED": 0
        }
    }
}

SUCCEEDED

{
    "request_id": "1dca6c0a-0ed1-4662-aa39-************",
    "output": {
        "task_id": "8fab76d0-0eed-4d20-929f-************",
        "task_status": "SUCCEEDED",
        "submit_time": "2025-10-27 13:57:45.948",
        "scheduled_time": "2025-10-27 13:57:46.018",
        "end_time": "2025-10-27 13:57:47.079",
        "result": {
            "transcription_url": "http://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/pre/pre-funasr-mlt-v1/20251027/13%3A57/7a3a8236-ffd1-4099-a280-0299686ac7da.json?Expires=1761631066&OSSAccessKeyId=LTAI**************&Signature=1lKv4RgyWCarRuUdIiErOeOBnwM%3D&response-content-disposition=attachment%3Bfilename%3D7a3a8236-ffd1-4099-a280-0299686ac7da.json"
        }
    },
    "usage": {
        "seconds": 3
    }
}

FAILED

{
    "request_id": "3d141841-858a-466a-9ff9-************",
    "output": {
        "task_id": "c58c7951-7789-4557-9ea3-************",
        "task_status": "FAILED",
        "submit_time": "2025-10-27 15:06:06.915",
        "scheduled_time": "2025-10-27 15:06:06.967",
        "end_time": "2025-10-27 15:06:07.584",
        "code": "FILE_403_FORBIDDEN",
        "message": "FILE_403_FORBIDDEN"
    }
}

request_id string

The unique identifier for this call.

output object

The information about the call result.

Properties

task_id string

The task ID. Pass this ID as a request parameter when you query the speech recognition task.

task_status string

The task status (see the polling sketch after this list):

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • UNKNOWN: The task does not exist or its status is unknown.
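
Because the task runs asynchronously, a typical client polls this query operation until task_status reaches a terminal value (SUCCEEDED, FAILED, or UNKNOWN). The following is a minimal polling sketch in Python; the wait_for_task helper and the poll_interval and timeout values are illustrative, not part of the API.

import os
import time
import requests

DASHSCOPE_API_KEY = os.getenv("DASHSCOPE_API_KEY")

def wait_for_task(task_id, poll_interval=5, timeout=600):
    """Poll the task query endpoint until the task reaches a terminal status."""
    # The following URL is for the Singapore region. If you use a model in the Beijing region,
    # replace the URL with https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}.
    url = f"https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}"
    headers = {"Authorization": f"Bearer {DASHSCOPE_API_KEY}"}
    deadline = time.time() + timeout
    while time.time() < deadline:
        output = requests.get(url, headers=headers).json()["output"]
        if output["task_status"] in ("SUCCEEDED", "FAILED", "UNKNOWN"):
            return output
        # PENDING or RUNNING: wait before querying again.
        time.sleep(poll_interval)
    raise TimeoutError(f"Task {task_id} did not finish within {timeout} seconds")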

result object

The speech recognition result.

Properties

transcription_url string

The download URL for the recognition result file. The link is valid for 24 hours. After expiration, you cannot query the task or download the result using the previous URL.
The recognition result is saved as a JSON file. You can download the file from this link or directly read the file content using an HTTP request.

For more information, see Asynchronous invocation results.
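
Because the recognition result is stored as a JSON file at transcription_url, you can fetch and parse it directly over HTTP without saving it to disk. A minimal sketch, assuming transcription_url was taken from the result object of a SUCCEEDED query response (the URL below is a placeholder):

import requests

# Replace with the transcription_url from output.result in a SUCCEEDED query response.
transcription_url = "https://example.com/transcription.json"

response = requests.get(transcription_url)
response.raise_for_status()
transcription = response.json()

print(transcription["file_url"])                 # URL of the recognized audio file
for transcript in transcription["transcripts"]:  # one entry per audio track
    print(transcript["channel_id"], transcript["text"])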

submit_time string

The time when the task was submitted.

scheduled_time string

The time when the task was scheduled, which is the start time of execution.

end_time string

The time when the task ended.

task_metrics object

Task metrics, which include statistics on the status of subtasks.

Properties

TOTAL integer

The total number of subtasks.

SUCCEEDED integer

The number of successful subtasks.

FAILED integer

The number of failed subtasks.

code string

The error code. This is returned only when the task fails.

message string

The error message. This is returned only when the task fails.

usage object

The usage information for this request.

Properties

seconds integer

The duration of the audio processed by Qwen3-ASR-Flash, in seconds.

Asynchronous call recognition result description

{
    "file_url": "https://***.wav",
    "audio_info": {
        "format": "wav",
        "sample_rate": 16000
    },
    "transcripts": [
        {
            "channel_id": 0,
            "text": "Senior staff, Principal Doris Jackson, Wakefield faculty, and of course my fellow classmates.I am honored to have been chosen to speak before my classmates along with the students across America today.",
            "sentences": [
                {
                    "sentence_id": 0,
                    "begin_time": 240,
                    "end_time": 6720,
                    "language": "en",
                    "emotion": "happy",
                    "text": "Senior staff, Principal Doris Jackson, Wakefield faculty, and of course my fellow classmates.",
                    "words": [
                        {
                            "begin_time": 240,
                            "end_time": 1120,
                            "text": "Senior ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 1120,
                            "end_time": 1200,
                            "text": "staff",
                            "punctuation": ","
                        },
                        {
                            "begin_time": 1680,
                            "end_time": 1920,
                            "text": " Principal ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 2000,
                            "end_time": 2320,
                            "text": "Doris ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 2320,
                            "end_time": 2960,
                            "text": "Jackson",
                            "punctuation": ","
                        },
                        {
                            "begin_time": 3360,
                            "end_time": 3840,
                            "text": " Wakefield ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 3840,
                            "end_time": 4480,
                            "text": "faculty",
                            "punctuation": ","
                        },
                        {
                            "begin_time": 4800,
                            "end_time": 4960,
                            "text": " and ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 4960,
                            "end_time": 5040,
                            "text": "of ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 5040,
                            "end_time": 5520,
                            "text": "course ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 5520,
                            "end_time": 5680,
                            "text": "my ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 5760,
                            "end_time": 6000,
                            "text": "fellow ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 6000,
                            "end_time": 6720,
                            "text": "classmates",
                            "punctuation": "."
                        }
                    ]
                },
                {
                    "sentence_id": 1,
                    "begin_time": 12268,
                    "end_time": 17388,
                    "language": "en",
                    "emotion": "neutral",
                    "text": "I am honored to have been chosen to speak before my classmates along with the students across America today.",
                    "words": [
                        {
                            "begin_time": 12268,
                            "end_time": 12428,
                            "text": "I ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 12428,
                            "end_time": 12508,
                            "text": "am ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 12588,
                            "end_time": 12828,
                            "text": "honored ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 12908,
                            "end_time": 12908,
                            "text": "to ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 12908,
                            "end_time": 13068,
                            "text": "have ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 13068,
                            "end_time": 13228,
                            "text": "been ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 13228,
                            "end_time": 13628,
                            "text": "chosen ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 13628,
                            "end_time": 13708,
                            "text": "to ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 13708,
                            "end_time": 14028,
                            "text": "speak ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 14028,
                            "end_time": 14268,
                            "text": "before ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 14268,
                            "end_time": 14428,
                            "text": "my ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 14428,
                            "end_time": 15148,
                            "text": "classmates ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 15308,
                            "end_time": 15468,
                            "text": "as ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 15468,
                            "end_time": 15628,
                            "text": "well ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 15628,
                            "end_time": 15788,
                            "text": "as ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 15788,
                            "end_time": 15788,
                            "text": "the ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 15788,
                            "end_time": 16188,
                            "text": "students ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 16188,
                            "end_time": 16588,
                            "text": "across ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 16588,
                            "end_time": 16988,
                            "text": "America ",
                            "punctuation": ""
                        },
                        {
                            "begin_time": 16988,
                            "end_time": 17388,
                            "text": "today",
                            "punctuation": "."
                        }
                    ]
                }
            ]
        }
    ]
}

file_url string

The URL of the recognized audio file.

audio_info object

Information about the recognized audio file.

Properties

format string

The audio format.

sample_rate integer

The audio sampling rate.

transcripts array

A list of complete recognition results. Each element corresponds to the recognized content of an audio track.

Properties

channel_id integer

The audio track index, starting from 0.

text string

The recognized text.

sentences array

A list of sentence-level recognition results.

Properties

begin_time integer

The start timestamp of the sentence in milliseconds.

end_time integer

The end timestamp of the sentence in milliseconds.

text string

The recognized text.

sentence_id integer

The sentence index, starting from 0.

language string

The language of the recognized audio. If the language request parameter is specified, this value is the same as the value of that parameter.

Valid values

  • zh: Chinese (Mandarin, Sichuanese, Minnan, and Wu)

  • yue: Cantonese

  • en: English

  • ja: Japanese

  • de: German

  • ko: Korean

  • ru: Russian

  • fr: French

  • pt: Portuguese

  • ar: Arabic

  • it: Italian

  • es: Spanish

  • hi: Hindi

  • id: Indonesian

  • th: Thai

  • tr: Turkish

  • uk: Ukrainian

  • vi: Vietnamese

  • cs: Czech

  • da: Danish

  • fil: Filipino

  • fi: Finnish

  • is: Icelandic

  • ms: Malay

  • no: Norwegian

  • pl: Polish

  • sv: Swedish

emotion string

The emotion of the recognized audio. The following emotions are supported:

  • surprised: surprised

  • neutral: neutral

  • happy: happy

  • sad: sad

  • disgusted: disgusted

  • angry: angry

  • fearful: fearful

words array

A list of word-level recognition results. This parameter is returned only if the enable_words request parameter is set to true.

Properties

begin_time integer

The start timestamp in milliseconds.

end_time integer

The end timestamp in milliseconds.

text string

The recognized text.

punctuation string

The punctuation mark.
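
The sentence- and word-level fields described above can be combined into time-aligned output, for example for subtitling. The following sketch assumes a parsed result dictionary shaped like the sample JSON above; print_timings is an illustrative helper, not part of any SDK.

def print_timings(result):
    """Print sentence- and word-level timestamps from a parsed recognition result."""
    for transcript in result["transcripts"]:
        print(f"Track {transcript['channel_id']}")
        for sentence in transcript["sentences"]:
            print(f"  [{sentence['begin_time']}-{sentence['end_time']} ms] "
                  f"({sentence['language']}, {sentence['emotion']}) {sentence['text']}")
            # The words list is present only when enable_words was set to true in the request.
            for word in sentence.get("words", []):
                text = word["text"].strip() + word["punctuation"]
                print(f"    {word['begin_time']}-{word['end_time']} ms: {text}")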