Server-side events for the qwen3-livetranslate-flash-realtime API.
Reference: Real-time audio and video translation - Qwen
error
Error message returned by the server.
event_id string The unique identifier for this event. | {
"event_id": "event_RoUu4T8yExPMI37GKwaOC",
"type": "error",
"error": {
"type": "invalid_request_error",
"code": "invalid_value",
"message": "Invalid modalities: ['audio']. Supported combinations are: ['text'] and ['audio', 'text'].",
"param": "session.modalities"
}
}
|
type string The event type. The value is always error. |
error object Detailed information about the error. Properties type string The error type. code string The error code. message string The error message. param string The parameter that is related to the error, such as session.modalities. |
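A client will usually want to turn an error event into an exception or a readable log line. The following is a minimal sketch using only the standard library. The `RealtimeError` class and the `raise_for_error` helper are hypothetical names introduced for illustration and are not part of the API.

```python
import json


class RealtimeError(Exception):
    """Hypothetical exception wrapping a server-side error event."""

    def __init__(self, event: dict):
        err = event.get("error", {})
        self.event_id = event.get("event_id")
        self.error_type = err.get("type")
        self.code = err.get("code")
        self.param = err.get("param")
        super().__init__(
            f"{self.error_type}/{self.code} ({self.param}): {err.get('message')}"
        )


def raise_for_error(raw: str) -> dict:
    """Parse one server message; raise if it is an error event, else return it."""
    event = json.loads(raw)
    if event.get("type") == "error":
        raise RealtimeError(event)
    return event
```

Keeping `code` and `param` as attributes lets calling code branch on them, for example to correct `session.modalities` and resend the request.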
session.created
When a client connects, the server returns this event first with the default session configurations.
event_id string The unique identifier for this event. | {
"event_id": "event_QxBGpjBDmDDQQWDtrqBKB",
"type": "session.created",
"session": {
"id": "sess_OozZ1vtbPt2muDflHODIH",
"object": "realtime.session",
"model": "qwen3-livetranslate-flash-realtime",
"modalities": [
"text",
"audio"
],
"voice": "Cherry",
"input_audio_format": "pcm16",
"output_audio_format": "pcm24",
"translation": {
"language": "en"
}
}
}
|
type string The event type. The value is always session.created. |
session object The session configuration. Properties id string The unique identifier for the session. object string The value is always realtime.session. model string The model in use. modalities array The output modality settings for the model. voice string The voice for the audio generated by the model. input_audio_format string The format of the input audio. The value is always pcm16. output_audio_format string The format of the output audio. The value is always pcm24. translation object (Optional) The translation configuration. Properties language string (Optional) The target language for translation. |
session.updated
Returned after a successful session.update request. If the request fails, an error event is returned instead.
event_id string The unique identifier for this event. | {
"event_id": "event_QxBGpjBDmDDQQWDtrqBKB",
"type": "session.updated",
"session": {
"id": "sess_OozZ1vtbPt2muDflHODIH",
"object": "realtime.session",
"model": "qwen3-livetranslate-flash-realtime",
"modalities": [
"text",
"audio"
],
"voice": "Ethan",
"input_audio_format": "pcm16",
"output_audio_format": "pcm24",
"translation": {
"language": "en"
}
}
}
|
type string The event type. The value is always session.updated. |
session object The session configuration. Properties id string The unique identifier for the session. object string The value is always realtime.session. model string The model in use. modalities array The output modality settings for the model. voice string The voice for the audio generated by the model. input_audio_format string The format of the input audio. The value is always pcm16. output_audio_format string The format of the output audio. The value is always pcm24. translation object (Optional) The translation configuration. Properties language string (Optional) The target language for translation. |
session.finished
Returned when the session is finished and all audio translations are complete.
Sent only after the client sends a session.finish request. The client can then disconnect.
event_id string The unique identifier for this event. | {
"event_id": "event_xxx",
"type": "session.finished"
}
|
type string The event type. The value is always session.finished. |
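The shutdown order described for session.finished (wait for the event, then disconnect) can be sketched as a generator that stops consuming messages once the event arrives. This is only an illustration: `events_until_finished` is a hypothetical helper, and `messages` stands in for whatever yields raw JSON frames from the connection.

```python
import json
from typing import Iterable, Iterator


def events_until_finished(messages: Iterable[str]) -> Iterator[dict]:
    """Yield parsed events and stop right after session.finished is seen.

    The transport itself is outside this sketch; any iterable of JSON
    strings works.
    """
    for raw in messages:
        event = json.loads(raw)
        yield event
        if event.get("type") == "session.finished":
            return  # the client can disconnect at this point
```

Because session.finished is itself yielded before the generator ends, the caller can still react to it (for example, to flush buffers) before closing the connection.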
response.created
Returned when the server generates a new model response.
event_id string The unique identifier for this event. | {
"event_id": "event_L8hHVI5jYis6BzAjnPWJh",
"type": "response.created",
"response": {
"id": "resp_P79OOMs8LnrXVpiIHUCKR",
"object": "realtime.response",
"conversation_id": "conv_UFClXtYkRkFXrs48y8pmK",
"status": "in_progress",
"modalities": [
"text",
"audio"
],
"voice": "Cherry",
"output_audio_format": "pcm24",
"output": []
}
}
|
type string The event type. The value is always response.created. |
response object The response object. Properties id string The unique identifier for the response. conversation_id string The unique identifier for the current session. object string The object type. For this event, the value is always realtime.response. status string The response status. Valid values: completed, failed, in_progress, incomplete. modalities array The response modalities. voice string The voice of the generated audio. output_audio_format string The format of the output audio. The value is always pcm24. output array The output items. For this event, the array is currently always empty. |
response.done
Returned after response generation is complete. Contains all output items except raw audio data.
event_id string The unique identifier for this event. | {
"event_id": "event_CNea8oXNipVanSg2VIzkO",
"type": "response.done",
"response": {
"id": "resp_TfhYTqej692vsGA2jNEtH",
"object": "realtime.response",
"conversation_id": "conv_ZtyLfKVm8XqLwYRlsuDih",
"status": "completed",
"modalities": [
"text",
"audio"
],
"voice": "Cherry",
"output_audio_format": "pcm24",
"output": [
{
"id": "item_MKtkMwN9RtcyE9eJShyWy",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "audio",
"transcript": "Hello? "
}
]
}
],
"usage": {
"total_tokens": 56,
"input_tokens": 47,
"output_tokens": 9,
"input_tokens_details": {
"text_tokens": 20,
"audio_tokens": 27
},
"output_tokens_details": {
"text_tokens": 2,
"audio_tokens": 7
}
}
}
}
|
type string The event type. The value is always response.done. |
response object The response object. Properties id string The unique identifier for the response. conversation_id string The unique identifier for the current session. object string The object type. For this event, the value is always realtime.response. status string The status of the response. modalities array The modality of the response. voice string The voice used for the audio generated by the model. output_audio_format string The format of the output audio. The value is always pcm24. output array The output items of the response. Properties id string The unique identifier for the output item. type string The type of the output item. The value is currently always message. object string The object type of the output item. The value is currently always realtime.item. status string The status of the output item. role string The role of the output item. content array The content of the output item. Properties type string The type of the output content. The value is text for plain text output and audio when the output includes audio. text string The text content of the output. transcript string The text transcription of the audio content. usage object The token consumption information for this response. |
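Since response.done carries the output items and token usage in one place, it is a convenient point to collect a summary. The sketch below reads only fields that appear in the example event; `summarize_response_done` is a hypothetical helper name.

```python
def summarize_response_done(event: dict) -> dict:
    """Collect output text and token usage from a response.done event."""
    response = event["response"]
    pieces = []
    for item in response.get("output", []):
        for part in item.get("content", []):
            # Audio parts carry `transcript`; text parts carry `text`.
            pieces.append(part.get("transcript") or part.get("text") or "")
    usage = response.get("usage", {})
    return {
        "status": response.get("status"),
        "text": "".join(pieces),
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }
```

Missing fields come back as `None` or an empty string rather than raising, which is a design choice of this sketch so that a response with a status other than completed can still be logged.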
response.text.text
Returned for text-only output when the model generates text incrementally.
event_id string A unique identifier for the event. | {
"event_id": "event_B1lIeyOXR7qJMEExbqtTG",
"type": "response.text.text",
"response_id": "resp_B1lIdtjF4Noqpn5NOjznj",
"item_id": "item_B1lIdJsAJlJiFs8ztWpJt",
"output_index": 0,
"content_index": 0,
"text": "How are",
"stash": " you today?"
}
|
type string The type of the event. The value is always response.text.text. |
text string The incremental text that is returned. |
response_id string The response ID. |
item_id string A unique identifier for the message item. |
output_index integer Currently, the value is always 0. |
content_index integer Currently, the value is always 0. |
stash string A temporary text generated by the model. Concatenate this with the current text to form a temporary result. The system updates text and stash continuously using response.text.text events until it receives a response.text.done event. At that point, get the final full text from the text field. |
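The provisional result described for stash is the concatenation of the two fields of a single event. A minimal sketch follows; `interim_text` is a hypothetical helper. It deliberately combines only the fields of one event and makes no assumption about how text relates across successive events.

```python
def interim_text(event: dict) -> str:
    """Return the current provisional result: text followed by stash."""
    return event.get("text", "") + event.get("stash", "")
```

For the example event above, `interim_text` returns `"How are you today?"`. Once response.text.done arrives, the provisional string should be discarded in favor of that event's text field.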
response.text.done
Returned when text generation finishes for text-only output, or if the response is interrupted, incomplete, or canceled.
event_id string The unique identifier for this event. | {
"event_id": "event_B1lIeE2Nac33zn5V7h2mm",
"type": "response.text.done",
"response_id": "resp_B1lIdtjF4Noqpn5NOjznj",
"item_id": "item_B1lIdJsAJlJiFs8ztWpJt",
"output_index": 0,
"content_index": 0,
"text": "How can I assist you today?"
}
|
type string The event type. The value is always response.text.done. |
response_id string The unique identifier for the response. |
item_id string The unique identifier for the message item. |
output_index integer The value is currently always 0. |
content_index integer The value is currently always 0. |
text string The complete text output from the model. |
response.audio.delta
Returned when audio output is enabled and the model generates audio incrementally.
event_id string A unique identifier for the event. | {
"event_id": "event_B1osWMZBtrEQbiIwW0qHQ",
"type": "response.audio.delta",
"response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
"item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
"output_index": 0,
"content_index": 0,
"delta": "UklGRnoGAABXQVZFZm10IBAAAAAB..."
}
|
type string The event type. The value is always response.audio.delta. |
response_id string A unique identifier for the response. |
item_id string A unique identifier for the message item. |
output_index integer The value is always 0. |
content_index integer The value is always 0. |
delta string The incremental audio data that is output by the model. The data is Base64-encoded. |
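Because each delta is Base64-encoded, the client has to decode it before playback or storage. The sketch below accumulates decoded chunks in memory. `AudioBuffer` is a hypothetical class, and how the resulting bytes are played or written to a file is left out because it depends on the audio format details.

```python
import base64


class AudioBuffer:
    """Hypothetical accumulator for decoded response.audio.delta payloads."""

    def __init__(self) -> None:
        self._chunks = []

    def feed(self, event: dict) -> int:
        """Decode one delta event; return the number of bytes added."""
        if event.get("type") != "response.audio.delta":
            return 0  # ignore every other event type
        chunk = base64.b64decode(event["delta"])
        self._chunks.append(chunk)
        return len(chunk)

    def data(self) -> bytes:
        """Return all decoded audio received so far, in arrival order."""
        return b"".join(self._chunks)
```

A streaming player would typically hand each decoded chunk to the audio device right away instead of buffering everything; the buffer here just keeps the example short.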
response.audio.done
Returned when audio generation is complete. Also returned if the response is interrupted, incomplete, or canceled. Does not contain complete audio data.
event_id string The unique identifier for this event. | {
"event_id": "event_B1osWMWoDRYyITDyNYcBu",
"type": "response.audio.done",
"response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
"item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
"output_index": 0,
"content_index": 0
}
|
type string The event type. This is always response.audio.done. |
response_id string The unique identifier for the response. |
item_id string The unique identifier for the message item. |
output_index integer The value is always 0. |
content_index integer The value is always 0. |
conversation.item.input_audio_transcription.text
When input_audio_transcription.model is configured, the server streams speech recognition results in the original source language.
event_id string The unique identifier for this event. | {
"event_id": "event_xxx",
"type": "conversation.item.input_audio_transcription.text",
"item_id": "item_xxx",
"content_index": 0,
"text": "",
"stash": "The weather is really nice today",
"language": "zh"
}
|
type string The event type. The value is always conversation.item.input_audio_transcription.text. |
item_id string The unique identifier for the message item. |
content_index integer The value is currently always 0. |
text string The confirmed recognition text. |
stash string The recognition text that is pending confirmation. This text may be corrected by subsequent events. |
language string The detected source language. |
conversation.item.input_audio_transcription.completed
When input_audio_transcription.model is configured, returns the final recognition result after speech recognition completes.
event_id string The unique identifier for this event. | {
"event_id": "event_xxx",
"type": "conversation.item.input_audio_transcription.completed",
"item_id": "item_xxx",
"content_index": 0,
"transcript": "The weather is really nice today, let's go for a walk in the park.",
"language": "zh"
}
|
type string The event type. This is always conversation.item.input_audio_transcription.completed. |
item_id string The unique identifier for the message item. |
content_index integer This is currently always 0. |
transcript string The complete speech recognition result in the original source language. |
language string The detected source language. |
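The two input transcription events can be combined into a small per-item view: interim text while recognition is running, then the final transcript once it completes. The following is a sketch under that reading; `SourceTranscripts` and `PREFIX` are hypothetical names.

```python
PREFIX = "conversation.item.input_audio_transcription."


class SourceTranscripts:
    """Hypothetical per-item view of source-language recognition results."""

    def __init__(self) -> None:
        self.interim = {}  # item_id -> (confirmed text, pending stash)
        self.final = {}    # item_id -> completed transcript

    def handle(self, event: dict) -> None:
        kind = event.get("type", "")
        item_id = event.get("item_id")
        if kind == PREFIX + "text":
            self.interim[item_id] = (event.get("text", ""), event.get("stash", ""))
        elif kind == PREFIX + "completed":
            self.final[item_id] = event.get("transcript", "")
            self.interim.pop(item_id, None)  # pending text is superseded

    def display(self, item_id: str) -> str:
        """Best available text: final if known, else confirmed + pending."""
        if item_id in self.final:
            return self.final[item_id]
        text, stash = self.interim.get(item_id, ("", ""))
        return text + stash
```

Each text event replaces the previous interim entry for its item, so a stash that is corrected by a later event never lingers in the display.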
response.audio_transcript.text
Returned for audio output to display real-time translation.
event_id string The unique identifier for this event. | {
"event_id": "event_xxx",
"type": "response.audio_transcript.text",
"response_id": "resp_xxx",
"item_id": "item_xxx",
"output_index": 0,
"content_index": 0,
"text": "Hello,",
"stash": " who are you?"
}
|
type string The type of the event. The value is always response.audio_transcript.text. |
response_id string The unique identifier for the response. |
item_id string The unique identifier for the message item. |
output_index integer Currently, the value is always 0. |
content_index integer Currently, the value is always 0. |
text string The confirmed translation text segment. |
stash string Temporary translation text, concatenated with text to form interim results. The system updates text and stash continuously via response.audio_transcript.text events until a response.audio_transcript.done event is received. Then retrieve the final translation from transcript. |
response.audio_transcript.done
Returned when audio output is enabled and text generation finishes.
event_id string The unique identifier for this event. | {
"event_id": "event_VN4Q4GJugLcc1S23viW8E",
"type": "response.audio_transcript.done",
"response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
"item_id": "item_JvJauNH2CTXb1D9WV6pD4",
"output_index": 0,
"content_index": 0,
"transcript": "How can I assist you today?"
}
|
type string The event type. This is always response.audio_transcript.done. |
response_id string The unique identifier for the response. |
item_id string The unique identifier for the message item. |
output_index integer This is currently always 0. |
content_index integer This is currently always 0. |
transcript string The complete text. |
response.output_item.added
Returned when a new output item is added during response generation.
event_id string The unique identifier for this event. | {
"event_id": "event_B4O5yPt3Gjnjy5eYH3plG",
"type": "response.output_item.added",
"response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
"output_index": 0,
"item": {
"id": "item_OFaPGtzfWCPyGzxnuEX9i",
"object": "realtime.item",
"type": "message",
"status": "in_progress",
"role": "assistant",
"content": []
}
}
|
type string The event type. The value is always response.output_item.added. |
response_id string The unique identifier for the response. |
output_index integer The value is currently always 0. |
item object Information about the output item. Properties id string The unique identifier for the output item. type string The value is always message. object string The value is always realtime.item. status string The status of the output item. role string The role of the message. content array The content of the message. |
response.output_item.done
Returned when an output item is completed.
event_id string The unique identifier for this event. | {
"event_id": "event_XkiwbYTBC9Wcdwy6uYJ2G",
"type": "response.output_item.done",
"response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
"output_index": 0,
"item": {
"id": "item_JvJauNH2CTXb1D9WV6pD4",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "audio",
"text": "Hello, I am a large language model developed by Alibaba Cloud. My name is Qwen. How can I help you?"
}
]
}
}
|
type string The event type. The value is always response.output_item.done. |
response_id string The unique identifier for the response. |
output_index integer The value is currently always 0. |
item object Information about the output item. Properties id string The unique identifier for the output item. object string The value is always realtime.item. type string The value is always message. status string The status of the output item. role string The role of the message sender. content array The content of the message. |
response.content_part.added
Returned when a new content part is added.
event_id string The unique ID of the event. | {
"event_id": "event_J2UixwYKZsXg7c9YXZetL",
"type": "response.content_part.added",
"response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
"item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
"output_index": 0,
"content_index": 0,
"part": {
"type": "audio",
"text": ""
}
}
|
type string The type of the event. The value is always response.content_part.added. |
response_id string The unique ID of the response. |
item_id string The unique ID of the message item. |
output_index integer The value is always 0. |
content_index integer The value is always 0. |
part object Information about the content part. Properties type string The type of the content part. text string The text of the content part. |
response.content_part.done
Returned when a content part is completed.
event_id string The unique identifier for this event. | {
"event_id": "event_VN4Q4GJugLcc1S23viW8E",
"type": "response.content_part.done",
"response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
"item_id": "item_JvJauNH2CTXb1D9WV6pD4",
"output_index": 0,
"content_index": 0,
"part": {
"type": "audio",
"text": "Hello, I am a large language model developed by Alibaba Cloud. My name is Qwen. How can I help you?"
}
}
|
type string The event type. This is always response.content_part.done. |
response_id string The unique identifier for the response. |
item_id string The unique identifier for the message item. |
output_index integer The value is always 0. |
content_index integer The value is always 0. |
part object Information about the content part. Properties type string The type of the content part. text string The text of the content part. |
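Every server event carries a type field, so a client can route incoming messages with a simple lookup table. The sketch below shows one way to do that with the standard library. `EventRouter` is a hypothetical class, not part of any SDK.

```python
import json


class EventRouter:
    """Hypothetical dispatcher that routes server events by their `type`."""

    def __init__(self) -> None:
        self._handlers = {}
        self.unhandled = []  # event types seen without a registered handler

    def on(self, event_type: str):
        """Decorator that registers a handler for one event type."""
        def register(fn):
            self._handlers[event_type] = fn
            return fn
        return register

    def dispatch(self, raw: str) -> bool:
        """Parse one message and call its handler; return True if handled."""
        event = json.loads(raw)
        handler = self._handlers.get(event.get("type"))
        if handler is None:
            self.unhandled.append(event.get("type"))
            return False
        handler(event)
        return True
```

A handler is attached with, for example, `@router.on("response.audio_transcript.done")`. Recording unhandled types instead of raising keeps the client tolerant of event types it does not care about.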