This topic describes the server events for the Qwen-Omni-Realtime API.
Reference: Real-time (Qwen-Omni-Realtime).
error
Error message from server.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "error". |
| error | Error details. |
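As a minimal sketch of how a client might recognize this event, the snippet below parses one raw server message and checks its type. The sample payload and the helper name `parse_server_event` are hypothetical. The structure of the error object is not specified in this topic, so the code treats it as opaque.

```python
import json


def parse_server_event(raw: str) -> dict:
    """Decode one server message and check the fields every event carries."""
    event = json.loads(raw)
    for field in ("event_id", "type"):
        if field not in event:
            raise ValueError(f"server event is missing {field!r}")
    return event


# Hypothetical payload; the contents of "error" are illustrative only.
raw_message = '{"event_id": "event_001", "type": "error", "error": {"message": "example"}}'

event = parse_server_event(raw_message)
if event["type"] == "error":
    error_details = event["error"]  # opaque object; log or inspect as needed
    print("server error:", event["event_id"], error_details)
```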
session.created
The first event returned after the client connects. It contains the default session configuration.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "session.created". |
| session | Session configuration. |
session.updated
Returned after a successful session.update request. On error, an error event is returned instead.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "session.updated". |
| session | Session configuration. |
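Because a session.update request is answered with either session.updated or error, a client may want to scan incoming events until one of the two arrives. The following is a rough sketch under that assumption. The function name `wait_for_session_update` and the sample events are hypothetical, and a real client would also need to decide whether an error event actually belongs to the update request.

```python
from typing import Iterable


def wait_for_session_update(events: Iterable[dict]) -> dict:
    """Return the first session.updated or error event found in the stream."""
    for event in events:
        if event["type"] in ("session.updated", "error"):
            return event
    raise RuntimeError("stream ended before session.updated or error arrived")


# Hypothetical event stream, already decoded from JSON.
incoming = [
    {"event_id": "event_001", "type": "session.created", "session": {}},
    {"event_id": "event_002", "type": "session.updated", "session": {}},
]

result = wait_for_session_update(incoming)
update_succeeded = result["type"] == "session.updated"
print("update succeeded:", update_succeeded)
```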
input_audio_buffer.speech_started
Returned when voice activity detection (VAD) detects the start of speech in the audio buffer.
It can also be triggered whenever audio is added to the buffer before speech has been detected.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "input_audio_buffer.speech_started". |
| audio_start_ms | Milliseconds from when audio writing begins until speech is first detected. |
| item_id | User message item ID (created on speech stop). User message items append input to conversation history for subsequent model inference and generation. |
input_audio_buffer.speech_stopped
Returned when VAD detects the end of speech in the audio buffer.
The server also returns a conversation.item.created event to create the user message item.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "input_audio_buffer.speech_stopped". |
| audio_end_ms | Milliseconds from session start to speech stop. |
| item_id | User message item ID (to be created). |
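One way to use the speech_started and speech_stopped events together is to pair them by item_id and keep both offsets. This is only a sketch: the class name `SpeechSegmentTracker` and the sample values are hypothetical, and it assumes both events for one utterance carry the same item_id. The two offsets are described with different reference points, so the code stores them as reported instead of subtracting them.

```python
class SpeechSegmentTracker:
    """Pairs speech_started and speech_stopped events by item_id."""

    def __init__(self) -> None:
        self._starts = {}   # item_id -> audio_start_ms
        self.segments = {}  # item_id -> (audio_start_ms, audio_end_ms)

    def handle(self, event: dict) -> None:
        kind = event["type"]
        if kind == "input_audio_buffer.speech_started":
            self._starts[event["item_id"]] = event["audio_start_ms"]
        elif kind == "input_audio_buffer.speech_stopped":
            # start is None if no matching speech_started event was seen
            start = self._starts.pop(event["item_id"], None)
            self.segments[event["item_id"]] = (start, event["audio_end_ms"])


tracker = SpeechSegmentTracker()
# Hypothetical events, already decoded from JSON.
tracker.handle({"event_id": "e1", "type": "input_audio_buffer.speech_started",
                "audio_start_ms": 320, "item_id": "item_1"})
tracker.handle({"event_id": "e2", "type": "input_audio_buffer.speech_stopped",
                "audio_end_ms": 2150, "item_id": "item_1"})
print(tracker.segments)
```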
input_audio_buffer.committed
Returned when the audio buffer is committed.
- In VAD mode, returned automatically when speech ends.
- In Manual mode, returned after the client sends input_audio_buffer.commit.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "input_audio_buffer.committed". |
| item_id | User message item ID (to be created). |
input_audio_buffer.cleared
Returned after client sends input_audio_buffer.clear.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "input_audio_buffer.cleared". |
conversation.item.created
Returned when a conversation item is created.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "conversation.item.created". |
| item | Conversation item to add. |
conversation.item.input_audio_transcription.completed
Provides the transcription result after the user's audio is buffered. Transcription is performed by the gummy-realtime-v1 speech recognition model.
The transcribed text may differ from the text that the Qwen-Omni-Realtime model processes, so treat it as a reference only.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "conversation.item.input_audio_transcription.completed". |
| item_id | User message item ID. |
| content_index | Currently fixed at 0. |
| transcript | Transcribed text. |
conversation.item.input_audio_transcription.failed
Returned when input audio transcription fails. This event is separate from the error event so that transcription problems are easier to identify.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "conversation.item.input_audio_transcription.failed". |
| item_id | User message item ID. |
| content_index | Currently fixed at 0. |
| error | Error details. |
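Since transcription success and failure arrive as two distinct events, a client can record them separately per user message item. The sketch below illustrates this. The function name `record_transcription`, the two dictionaries, and the sample payloads are hypothetical, and the error object is stored as-is because its structure is not described here.

```python
COMPLETED = "conversation.item.input_audio_transcription.completed"
FAILED = "conversation.item.input_audio_transcription.failed"


def record_transcription(event: dict, transcripts: dict, failures: dict) -> None:
    """Store the transcript or the error details under the event's item_id."""
    if event["type"] == COMPLETED:
        transcripts[event["item_id"]] = event["transcript"]
    elif event["type"] == FAILED:
        failures[event["item_id"]] = event["error"]


transcripts, failures = {}, {}
# Hypothetical events, already decoded from JSON.
record_transcription(
    {"event_id": "e1", "type": COMPLETED, "item_id": "item_1",
     "content_index": 0, "transcript": "hello"},
    transcripts, failures)
record_transcription(
    {"event_id": "e2", "type": FAILED, "item_id": "item_2",
     "content_index": 0, "error": {"message": "example"}},
    transcripts, failures)
print(transcripts, failures)
```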
response.created
Returned when the server starts generating a new model response.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.created". |
| response | Response details. |
response.done
Returned when response generation is complete. The response object contains all output items except raw audio data.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.done". |
| response | Response details. |
response.text.delta
Returned when model incrementally generates text (text-only output modality).
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.text.delta". |
| delta | Incremental text. |
| response_id | Response ID. |
| item_id | Message item ID. You can use this ID to associate items from the same message. |
| output_index | Output item index in response. Currently fixed at 0. |
| content_index | Content part index within output item. Currently fixed at 0. |
response.text.done
Returned when model finishes generating text (text-only output modality).
Also returned when the response is interrupted, incomplete, or cancelled.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.text.done". |
| response_id | Response ID. |
| item_id | Message item ID. |
| output_index | Output item index in response. |
| content_index | Content part index within output item. |
| text | Complete model text output. |
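The delta and done events for text can be combined on the client: deltas are appended as they arrive for live display, and the text field of the done event replaces them as the final value. The following is a sketch of that pattern. The class name `TextAccumulator` and the sample events are hypothetical, and keying by (response_id, item_id) is an assumption about how a client might group events.

```python
from collections import defaultdict


class TextAccumulator:
    """Collects response.text.delta chunks until response.text.done arrives."""

    def __init__(self) -> None:
        self._parts = defaultdict(list)  # (response_id, item_id) -> [delta, ...]
        self.completed = {}              # (response_id, item_id) -> final text

    def handle(self, event: dict) -> None:
        key = (event.get("response_id"), event.get("item_id"))
        if event["type"] == "response.text.delta":
            self._parts[key].append(event["delta"])
        elif event["type"] == "response.text.done":
            self._parts.pop(key, None)          # deltas are no longer needed
            self.completed[key] = event["text"]

    def partial(self, response_id: str, item_id: str) -> str:
        """Text received so far for a message that is still streaming."""
        return "".join(self._parts.get((response_id, item_id), []))


acc = TextAccumulator()
base = {"response_id": "resp_1", "item_id": "item_1", "output_index": 0, "content_index": 0}
acc.handle({**base, "event_id": "e1", "type": "response.text.delta", "delta": "Hel"})
acc.handle({**base, "event_id": "e2", "type": "response.text.delta", "delta": "lo"})
in_progress = acc.partial("resp_1", "item_1")
acc.handle({**base, "event_id": "e3", "type": "response.text.done", "text": "Hello"})
print(in_progress, acc.completed)
```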
response.audio.delta
Returned when model incrementally generates audio data.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.audio.delta". |
| response_id | Response ID. |
| item_id | Message item ID. |
| output_index | Output item index in response. |
| content_index | Content part index within output item. |
| delta | Incremental audio data (Base64-encoded). |
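Because each delta is Base64-encoded, a client has to decode it before playing or storing the audio. The sketch below decodes each chunk and appends it to a per-item buffer. The class name `AudioCollector` and the sample bytes are hypothetical, and the audio format of the decoded bytes is not covered in this topic, so the code handles them as raw bytes.

```python
import base64


class AudioCollector:
    """Decodes response.audio.delta chunks and concatenates them per item."""

    def __init__(self) -> None:
        self._buffers = {}  # item_id -> bytearray of decoded audio

    def handle(self, event: dict) -> None:
        if event["type"] != "response.audio.delta":
            return
        chunk = base64.b64decode(event["delta"])
        self._buffers.setdefault(event["item_id"], bytearray()).extend(chunk)

    def audio_for(self, item_id: str) -> bytes:
        return bytes(self._buffers.get(item_id, b""))


collector = AudioCollector()
# Hypothetical chunks; real deltas would carry encoded audio samples.
for raw_chunk in (b"\x01\x02", b"\x03"):
    collector.handle({
        "event_id": "e", "type": "response.audio.delta",
        "response_id": "resp_1", "item_id": "item_1",
        "output_index": 0, "content_index": 0,
        "delta": base64.b64encode(raw_chunk).decode("ascii"),
    })
print(collector.audio_for("item_1"))
```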
response.audio.done
Returned when model finishes generating audio data.
Also returned when the response is interrupted, incomplete, or cancelled.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.audio.done". |
| response_id | Response ID. |
| item_id | Message item ID. |
| output_index | Output item index in response. |
| content_index | Content part index within output item. |
response.audio_transcript.delta
Returned when model incrementally generates text for audio output.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.audio_transcript.delta". |
| response_id | Response ID. |
| item_id | Message item ID. |
| output_index | Output item index in response. |
| content_index | Content part index within output item. |
| delta | Incremental text. |
response.audio_transcript.done
Returned when the model finishes generating the transcript of the audio output.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.audio_transcript.done". |
| response_id | Response ID. |
| item_id | Message item ID. |
| output_index | Output item index in response. |
| content_index | Content part index within output item. |
| transcript | Complete transcript. |
response.output_item.added
Returned when creating a new item during response generation.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.output_item.added". |
| response_id | Response ID. |
| output_index | Output item index in response. |
| item | Output item details. |
response.output_item.done
Returned when output of the new item is complete.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.output_item.done". |
| response_id | Response ID. |
| output_index | Output item index in response. |
| item | Output item details. |
response.content_part.added
Returned when adding a new content part to assistant message during response generation.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.content_part.added". |
| response_id | Response ID. |
| item_id | Message item ID. |
| output_index | Output item index in response. Currently fixed at 0. |
| content_index | Content part index within output item. Currently fixed at 0. |
| part | Content part details. |
response.content_part.done
Returned when streaming of a content part in the assistant message is complete.
| Parameter | Description |
|---|---|
| event_id | Unique event identifier. |
| type | Always "response.content_part.done". |
| response_id | Response ID. |
| item_id | Message item ID. |
| output_index | Output item index in response. Currently fixed at 0. |
| content_index | Content part index within output item. Currently fixed at 0. |
| part | Content part details. |
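The content_part.added and content_part.done events bracket the streaming of one content part, so a client can track which parts are still open. The following is a sketch of such tracking. The class name `ContentPartTracker` and the sample part contents are hypothetical, and identifying a part by (response_id, item_id, output_index, content_index) is an assumption based on the fields listed above.

```python
class ContentPartTracker:
    """Tracks content parts between their added and done events."""

    def __init__(self) -> None:
        self.open_parts = set()   # keys of parts still streaming
        self.finished_parts = {}  # key -> part object from the done event

    @staticmethod
    def _key(event: dict) -> tuple:
        return (event["response_id"], event["item_id"],
                event["output_index"], event["content_index"])

    def handle(self, event: dict) -> None:
        if event["type"] == "response.content_part.added":
            self.open_parts.add(self._key(event))
        elif event["type"] == "response.content_part.done":
            key = self._key(event)
            self.open_parts.discard(key)
            self.finished_parts[key] = event["part"]


parts = ContentPartTracker()
common = {"response_id": "resp_1", "item_id": "item_1", "output_index": 0, "content_index": 0}
# Hypothetical part payloads; the real structure of "part" is not shown here.
parts.handle({**common, "event_id": "e1", "type": "response.content_part.added", "part": {}})
still_open = len(parts.open_parts)
parts.handle({**common, "event_id": "e2", "type": "response.content_part.done",
              "part": {"text": "example"}})
print(still_open, parts.finished_parts)
```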