Alibaba Cloud Model Studio: Server events

Last updated: Mar 24, 2026

Server-side events for the qwen3-livetranslate-flash-realtime API.

Reference: Real-time audio and video translation - Qwen

error

Error message returned by the server.

event_id string

The unique identifier for this event.

{
  "event_id": "event_RoUu4T8yExPMI37GKwaOC",
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_value",
    "message": "Invalid modalities: ['audio']. Supported combinations are: ['text'] and ['audio', 'text'].",
    "param": "session.modalities"
  }
}

type string

The event type. The value is always error.

error object

Detailed information about the error.

Properties

type string

The error type.

code string

The error code.

message string

The error message.

param string

The parameter that is related to the error, such as session.modalities.
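As a minimal sketch of how a client might surface this event, the Python helper below turns an error event into a one-line summary. The function name describe_error is hypothetical, and the code assumes that param can be absent for errors not tied to a specific parameter; the source does not say whether that is the case.

```python
import json


def describe_error(raw_message: str) -> str:
    """Return a one-line summary of an `error` event, or "" for other events."""
    event = json.loads(raw_message)
    if event.get("type") != "error":
        return ""
    err = event.get("error", {})
    # Assumption: `param` may be missing, so it is only appended when present.
    param = err.get("param")
    location = f" (param: {param})" if param else ""
    return f"{err.get('type')}/{err.get('code')}: {err.get('message')}{location}"
```

Logging the summary together with event_id makes it easier to correlate a failure with the request that caused it.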

session.created

The first event that the server returns after a client connects. It contains the default session configuration.

event_id string

The unique identifier for this event.

{
    "event_id": "event_QxBGpjBDmDDQQWDtrqBKB",
    "type": "session.created",
    "session": {
        "id": "sess_OozZ1vtbPt2muDflHODIH",
        "object": "realtime.session",
        "model": "qwen3-livetranslate-flash-realtime",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Cherry",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm24",
        "translation": {
           "language": "en"
        }
    }
}

type string

The event type. The value is always session.created.

session object

The session configuration.

Properties

id string

The unique identifier for the session.

object string

The value is always realtime.session.

model string

The model in use.

modalities array

The output modality settings for the model.

voice string

The voice for the audio generated by the model.

input_audio_format string

The format of the input audio. The value is always pcm16.

output_audio_format string

The format of the output audio. The value is always pcm24.

translation object (Optional)

The translation configuration.

Properties

language string (Optional)

The target language for translation.
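Because the translation object is optional, a client that reads the target language should tolerate its absence. The sketch below follows the field names in the example payload (session.translation.language); the function name target_language is hypothetical.

```python
import json
from typing import Optional


def target_language(raw_message: str) -> Optional[str]:
    """Return session.translation.language from a session.created or
    session.updated event, or None if it is absent."""
    event = json.loads(raw_message)
    if event.get("type") not in ("session.created", "session.updated"):
        return None
    # `translation` is documented as optional, so fall back to an empty dict.
    translation = event.get("session", {}).get("translation") or {}
    return translation.get("language")
```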

session.updated

Returned after a successful session.update request. If the request fails, an error event is returned instead.

event_id string

The unique identifier for this event.

{
    "event_id": "event_QxBGpjBDmDDQQWDtrqBKB",
    "type": "session.updated",
    "session": {
        "id": "sess_OozZ1vtbPt2muDflHODIH",
        "object": "realtime.session",
        "model": "qwen3-livetranslate-flash-realtime",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Ethan",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm24",
        "translation": {
           "language": "en"
        }
    }
}

type string

The event type. The value is always session.updated.

session object

The session configuration.

Properties

id string

The unique identifier for the session.

object string

The value is always realtime.session.

model string

The model in use.

modalities array

The output modality settings for the model.

voice string

The voice for the audio generated by the model.

input_audio_format string

The format of the input audio. The value is always pcm16.

output_audio_format string

The format of the output audio. The value is always pcm24.

translation object (Optional)

The translation configuration.

Properties

language string (Optional)

The target language for translation.

session.finished

Indicates that the session has finished and all audio translations are complete.

Sent only after the client sends a session.finish request. The client can then disconnect.

event_id string

The unique identifier for this event.

{
    "event_id": "event_xxx",
    "type": "session.finished"
}

type string

The event type. The value is always session.finished.
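One way to honor this ordering is to keep consuming events until session.finished arrives and only then close the connection. The generator below is a sketch that works on any iterable of already-parsed event dictionaries; how events are read from the connection is left out, and the name events_until_finished is hypothetical.

```python
from typing import Iterable, Iterator


def events_until_finished(events: Iterable[dict]) -> Iterator[dict]:
    """Yield server events until `session.finished` arrives, then stop."""
    for event in events:
        if event.get("type") == "session.finished":
            return  # all translations are complete; the client can disconnect
        yield event
```

A caller would send session.finish, loop over this generator to process the remaining events, and disconnect once the loop ends.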

response.created

Returned when the server starts generating a new model response.

event_id string

The unique identifier for this event.

{
    "event_id": "event_L8hHVI5jYis6BzAjnPWJh",
    "type": "response.created",
    "response": {
        "id": "resp_P79OOMs8LnrXVpiIHUCKR",
        "object": "realtime.response",
        "conversation_id": "conv_UFClXtYkRkFXrs48y8pmK",
        "status": "in_progress",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Cherry",
        "output_audio_format": "pcm24",
        "output": []
    }
}

type string

The event type. The value is always response.created.

response object

The response object.

Properties

id string

The unique identifier for the response.

conversation_id string

The unique identifier for the conversation that this response belongs to.

object string

The object type. For this event, the value is always realtime.response.

status string

The response status. Valid values:

  • completed

  • failed

  • in_progress

  • incomplete

modalities array

The output modalities of the response.

voice string

The voice of the generated audio.

output_audio_format string

The format of the output audio. The value is always pcm24.

output array

The output items of the response. In this event, the array is currently always empty.

response.done

Returned after response generation is complete. Contains all output items except raw audio data.

event_id string

The unique identifier for this event.

{
  "event_id": "event_CNea8oXNipVanSg2VIzkO",
  "type": "response.done",
  "response": {
    "id": "resp_TfhYTqej692vsGA2jNEtH",
    "object": "realtime.response",
    "conversation_id": "conv_ZtyLfKVm8XqLwYRlsuDih",
    "status": "completed",
    "modalities": [
      "text",
      "audio"
    ],
    "voice": "Cherry",
    "output_audio_format": "pcm24",
    "output": [
      {
        "id": "item_MKtkMwN9RtcyE9eJShyWy",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "assistant",
        "content": [
          {
            "type": "audio",
            "transcript": "Hello? "
          }
        ]
      }
    ],
    "usage": {
      "total_tokens": 56,
      "input_tokens": 47,
      "output_tokens": 9,
      "input_tokens_details": {
        "text_tokens": 20,
        "audio_tokens": 27
      },
      "output_tokens_details": {
        "text_tokens": 2,
        "audio_tokens": 7
      }
    }
  }
}

type string

The event type. The value is always response.done.

response object

The response object.

Properties

id string

The unique identifier for the response.

conversation_id string

The unique identifier for the conversation that this response belongs to.

object string

The object type. For this event, the value is always realtime.response.

status string

The status of the response.

modalities array

The modality of the response.

voice string

The voice used for the audio generated by the model.

output_audio_format string

The format of the output audio. The value is always pcm24.

output array

The output items of the response.

Properties

id string

The unique identifier for the response output.

type string

The type of the output item. The value is currently always message.

object string

The object type of the output item. The value is currently always realtime.item.

status string

The status of the output item.

role string

The role of the output item.

content array

The content of the output item.

Properties

type string

The type of the output content. The value is text for plain text output and audio when the output includes audio.

text string

The text content of the output.

transcript string

The text transcription of the audio content.

usage object

The token consumption information for this response.
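To track consumption across a session, a client can sum the usage object of each response.done event. The sketch below uses the top-level field names shown in the example payload (input_tokens, output_tokens, total_tokens); the helper name add_usage is hypothetical, and missing fields are treated as zero as a defensive assumption.

```python
from collections import Counter


def add_usage(totals: Counter, event: dict) -> Counter:
    """Add the token counts of one `response.done` event to running totals."""
    if event.get("type") != "response.done":
        return totals
    usage = event.get("response", {}).get("usage", {})
    for key in ("input_tokens", "output_tokens", "total_tokens"):
        # Assumption: a missing counter contributes nothing.
        totals[key] += usage.get(key, 0)
    return totals
```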

response.text.text

Returned for text-only output when the model generates text incrementally.

event_id string

A unique identifier for the event.

{
    "event_id": "event_B1lIeyOXR7qJMEExbqtTG",
    "type": "response.text.text",
    "response_id": "resp_B1lIdtjF4Noqpn5NOjznj",
    "item_id": "item_B1lIdJsAJlJiFs8ztWpJt",
    "output_index": 0,
    "content_index": 0,
    "text": "How are",
    "stash": " you today?"
}

type string

The type of the event. The value is always response.text.text.

text string

The incremental text that is returned.

response_id string

The response ID.

item_id string

A unique identifier for the message item.

output_index integer

Currently, the value is always 0.

content_index integer

Currently, the value is always 0.

stash string

Temporary text generated by the model that may still change. Concatenate it with text from the same event to form an interim result. The server keeps updating text and stash through response.text.text events until it sends a response.text.done event. At that point, take the final, complete text from the text field of the response.text.done event.
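The interplay of text and stash can be sketched as a small state holder. This is an illustration under one assumption: the interim display for each event is exactly that event's text followed by its stash, as the description of stash suggests. The class name InterimText is hypothetical.

```python
from typing import Optional


class InterimText:
    """Tracks the text to display while a text-only response is streaming."""

    def __init__(self) -> None:
        self.display = ""                 # what the UI should show right now
        self.final: Optional[str] = None  # set once response.text.done arrives

    def handle(self, event: dict) -> str:
        event_type = event.get("type")
        if event_type == "response.text.text":
            # Interim result: text plus the still-changeable stash.
            self.display = event.get("text", "") + event.get("stash", "")
        elif event_type == "response.text.done":
            # Final result: the done event carries the complete text.
            self.final = event.get("text", "")
            self.display = self.final
        return self.display
```

Replacing the display on every event, rather than appending to it, keeps the UI correct when a later event revises the stash.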

response.text.done

Returned when text generation finishes for text-only output, or if the response is interrupted, incomplete, or canceled.

event_id string

The unique identifier for this event.

{
    "event_id": "event_B1lIeE2Nac33zn5V7h2mm",
    "type": "response.text.done",
    "response_id": "resp_B1lIdtjF4Noqpn5NOjznj",
    "item_id": "item_B1lIdJsAJlJiFs8ztWpJt",
    "output_index": 0,
    "content_index": 0,
    "text": "How can I assist you today?"
}

type string

The event type. The value is always response.text.done.

response_id string

The unique identifier for the response.

item_id string

The unique identifier for the message item.

output_index integer

The value is currently always 0.

content_index integer

The value is currently always 0.

text string

The complete text output from the model.

response.audio.delta

Returned when audio output is enabled and the model generates audio incrementally.

event_id string

A unique identifier for the event.

{
    "event_id": "event_B1osWMZBtrEQbiIwW0qHQ",
    "type": "response.audio.delta",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0,
    "delta": "UklGRnoGAABXQVZFZm10IBAAAAAB..."
}

type string

The event type. The value is always response.audio.delta.

response_id string

A unique identifier for the response.

item_id string

A unique identifier for the message item.

output_index integer

The value is always 0.

content_index integer

The value is always 0.

delta string

The incremental audio data that is output by the model. The data is Base64-encoded.
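Since response.audio.done does not carry the complete audio, the client has to assemble it from the deltas. A minimal sketch, assuming the decoded chunks can simply be concatenated in arrival order; the helper name append_audio is hypothetical.

```python
import base64


def append_audio(buffer: bytearray, event: dict) -> int:
    """Decode one `response.audio.delta` event into `buffer`.

    Returns the number of bytes appended (0 for other event types).
    """
    if event.get("type") != "response.audio.delta":
        return 0
    chunk = base64.b64decode(event["delta"])  # delta is Base64-encoded audio
    buffer.extend(chunk)
    return len(chunk)
```

A player can consume the buffer as it grows, or the client can write it out once response.audio.done arrives.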

response.audio.done

Returned when audio generation is complete. Also returned if the response is interrupted, incomplete, or canceled. Does not contain complete audio data.

event_id string

The unique identifier for this event.

{
    "event_id": "event_B1osWMWoDRYyITDyNYcBu",
    "type": "response.audio.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0
}

type string

The event type. This is always response.audio.done.

response_id string

The unique identifier for the response.

item_id string

The unique identifier for the message item.

output_index integer

The value is always 0.

content_index integer

The value is always 0.

conversation.item.input_audio_transcription.text

When input_audio_transcription.model is configured, the server streams speech recognition results in the original source language.

event_id string

The unique identifier for this event.

{
    "event_id": "event_xxx",
    "type": "conversation.item.input_audio_transcription.text",
    "item_id": "item_xxx",
    "content_index": 0,
    "text": "",
    "stash": "The weather is really nice today",
    "language": "zh"
}

type string

The event type. The value is always conversation.item.input_audio_transcription.text.

item_id string

The unique identifier for the message item.

content_index integer

The value is currently always 0.

text string

The confirmed recognition text.

stash string

The recognition text that is pending confirmation. This text may be corrected by subsequent events.

language string

The detected source language.

conversation.item.input_audio_transcription.completed

When input_audio_transcription.model is configured, returns the final recognition result after speech recognition completes.

event_id string

The unique identifier for this event.

{
    "event_id": "event_xxx",
    "type": "conversation.item.input_audio_transcription.completed",
    "item_id": "item_xxx",
    "content_index": 0,
    "transcript": "The weather is really nice today, let's go for a walk in the park.",
    "language": "zh"
}

type string

The event type. This is always conversation.item.input_audio_transcription.completed.

item_id string

The unique identifier for the message item.

content_index integer

This is currently always 0.

transcript string

The complete speech recognition result in the original source language.

language string

The detected source language.
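The two transcription events can be combined into a per-item view of the source-language text. The sketch below keeps the latest text for each item_id: interim events store text plus stash, and the completed event overwrites the entry with the final transcript. The names PREFIX and update_transcripts are hypothetical.

```python
from typing import Dict

PREFIX = "conversation.item.input_audio_transcription."


def update_transcripts(transcripts: Dict[str, str], event: dict) -> Dict[str, str]:
    """Keep the latest source-language text per item_id."""
    event_type = event.get("type", "")
    if event_type == PREFIX + "text":
        # Confirmed text followed by text that is still pending confirmation.
        transcripts[event["item_id"]] = event.get("text", "") + event.get("stash", "")
    elif event_type == PREFIX + "completed":
        # The final recognition result replaces any interim value.
        transcripts[event["item_id"]] = event.get("transcript", "")
    return transcripts
```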

response.audio_transcript.text

Returned when audio output is enabled. Carries the translated text incrementally so that the client can display the translation in real time.

event_id string

The unique identifier for this event.

{
  "event_id": "event_xxx",
  "type": "response.audio_transcript.text",
  "response_id": "resp_xxx",
  "item_id": "item_xxx",
  "output_index": 0,
  "content_index": 0,
  "text": "Hello,",
  "stash": " who are you?"
}

type string

The type of the event. The value is always response.audio_transcript.text.

response_id string

The unique identifier for the response.

item_id string

The unique identifier for the message item.

output_index integer

Currently, the value is always 0.

content_index integer

Currently, the value is always 0.

text string

The confirmed translation text segment.

stash string

Temporary translation text that may still change. Concatenate it with text from the same event to form an interim result. The server keeps updating text and stash through response.audio_transcript.text events until it sends a response.audio_transcript.done event. At that point, take the final translation from the transcript field of the response.audio_transcript.done event.

response.audio_transcript.done

Returned when audio output is enabled and text generation finishes.

event_id string

The unique identifier for this event.

{
    "event_id": "event_VN4Q4GJugLcc1S23viW8E",
    "type": "response.audio_transcript.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_JvJauNH2CTXb1D9WV6pD4",
    "output_index": 0,
    "content_index": 0,
    "transcript": "How can I assist you today?"
}

type string

The event type. This is always response.audio_transcript.done.

response_id string

The unique identifier for the response.

item_id string

The unique identifier for the message item.

output_index integer

This is currently always 0.

content_index integer

This is currently always 0.

transcript string

The complete text.

response.output_item.added

Returned when a new output item is added during response generation.

event_id string

The unique identifier for this event.

{
    "event_id": "event_B4O5yPt3Gjnjy5eYH3plG",
    "type": "response.output_item.added",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "output_index": 0,
    "item": {
        "id": "item_OFaPGtzfWCPyGzxnuEX9i",
        "object": "realtime.item",
        "type": "message",
        "status": "in_progress",
        "role": "assistant",
        "content": []
    }
}

type string

The event type. The value is always response.output_item.added.

response_id string

The unique identifier for the response.

output_index integer

The value is currently always 0.

item object

Information about the output item.

Properties

id string

The unique identifier for the output item.

type string

The value is always message.

object string

The value is always realtime.item.

status string

The status of the output item.

role string

The role of the message.

content array

The content parts of the message.

response.output_item.done

Returned when an output item is completed.

event_id string

The unique identifier for this event.

{
    "event_id": "event_XkiwbYTBC9Wcdwy6uYJ2G",
    "type": "response.output_item.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "output_index": 0,
    "item": {
        "id": "item_JvJauNH2CTXb1D9WV6pD4",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "assistant",
        "content": [
            {
                "type": "audio",
                "text": "Hello, I am a large language model developed by Alibaba Cloud. My name is Qwen. How can I help you?"
            }
        ]
    }
}

type string

The event type. The value is always response.output_item.done.

response_id string

The unique identifier for the response.

output_index integer

The value is currently always 0.

item object

Information about the output item.

Properties

id string

The unique identifier for the output item.

object string

The value is always realtime.item.

type string

The value is always message.

status string

The status of the output item.

role string

The role of the message sender.

content array

The content parts of the message.

response.content_part.added

Returned when a new content part is added.

event_id string

The unique ID of the event.

{
    "event_id": "event_J2UixwYKZsXg7c9YXZetL",
    "type": "response.content_part.added",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0,
    "part": {
        "type": "audio",
        "text": ""
    }
}

type string

The type of the event. The value is always response.content_part.added.

response_id string

The unique ID of the response.

item_id string

The unique ID of the message item.

output_index integer

The value is always 0.

content_index integer

The value is always 0.

part object

Information about the content part.

Properties

type string

The type of the content part.

text string

The text of the content part.

response.content_part.done

Returned when a content part is completed.

event_id string

The unique identifier for this event.

{
    "event_id": "event_VN4Q4GJugLcc1S23viW8E",
    "type": "response.content_part.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_JvJauNH2CTXb1D9WV6pD4",
    "output_index": 0,
    "content_index": 0,
    "part": {
        "type": "audio",
        "text": "Hello, I am a large language model developed by Alibaba Cloud. My name is Qwen. How can I help you?"
    }
}

type string

The event type. This is always response.content_part.done.

response_id string

The unique identifier for the response.

item_id string

The unique identifier for the message item.

output_index integer

The value is always 0.

content_index integer

The value is always 0.

part object

Information about the content part.

Properties

type string

The type of the content part.

text string

The text of the content part.