Alibaba Cloud Model Studio: Server events

Last Updated: Mar 15, 2026

Server events for the Qwen-TTS-Realtime API.

Reference: Real-time speech synthesis - Qwen.

error

Sent for both client-side and server-side errors.

event_id string

The server-side event ID.

{
  "event_id": "event_QzAVZRVa9hKqM5VOaHunh",
  "type": "error",
  "error": {
    "code": "invalid_value",
    "message": "Session update error: session already started or finished or failed."
  }
}

type string

The event type. This value is always error.

error object

The error details.

Properties

code string

The error code.

message string

The error message.
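A minimal sketch of how a client might surface this event, assuming each server message arrives as a JSON string. The `RealtimeError` class and the `raise_if_error` helper are hypothetical names introduced for illustration; they are not part of the API.

```python
import json


class RealtimeError(Exception):
    """Hypothetical exception carrying the fields of an `error` event."""

    def __init__(self, code, message, event_id=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.event_id = event_id


def raise_if_error(raw: str) -> dict:
    """Parse one server message; raise if it is an `error` event."""
    event = json.loads(raw)
    if event.get("type") == "error":
        err = event.get("error", {})
        raise RealtimeError(
            err.get("code", "unknown"),
            err.get("message", ""),
            event.get("event_id"),
        )
    # Any other event type is returned unchanged for further handling.
    return event
```

Raising an exception keeps error handling out of the per-event logic; a client that prefers to keep the connection open could log the error and continue instead.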

session.created

Sent when a client connects. Includes the default session configuration.

event_id string

The server-side event ID.

{
  "event_id": "event_xxx",
  "type": "session.created",
  "session": {
    "object": "realtime.session",
    "mode": "server_commit",
    "model": "qwen-tts-realtime",
    "voice": "Cherry",
    "response_format": "pcm",
    "sample_rate": 24000,
    "id": "sess_xxx"
  }
}

type string

The event type. This value is always session.created.

session object

The session configuration.

Properties

id string

The session ID.

object string

The session service name.

mode string

The interaction mode. Valid values: server_commit and commit.

model string

The model in use.

voice string

The voice in use.

response_format string

The audio format.

sample_rate integer

The audio sampling rate.
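The session properties above can be read into a typed structure. The following is a sketch only: `SessionConfig` and `parse_session` are hypothetical names, and `language_type` is treated as optional because it appears in the session.updated sample but not in the session.created sample.

```python
from dataclasses import dataclass
from typing import Optional

VALID_MODES = ("server_commit", "commit")


@dataclass
class SessionConfig:
    id: str
    object: str
    mode: str
    model: str
    voice: str
    response_format: str
    sample_rate: int
    language_type: Optional[str] = None


def parse_session(event: dict) -> SessionConfig:
    """Build a SessionConfig from a session.created or session.updated event."""
    if event["type"] not in ("session.created", "session.updated"):
        raise ValueError(f"not a session event: {event['type']}")
    s = event["session"]
    if s["mode"] not in VALID_MODES:
        raise ValueError(f"unexpected mode: {s['mode']}")
    return SessionConfig(
        id=s["id"],
        object=s["object"],
        mode=s["mode"],
        model=s["model"],
        voice=s["voice"],
        response_format=s["response_format"],
        sample_rate=s["sample_rate"],
        language_type=s.get("language_type"),
    )
```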

session.updated

Sent after the server successfully processes a session.update request. On error, an `error` event is sent instead.

event_id string

The server-side event ID.

{
  "event_id": "event_xxx",
  "type": "session.updated",
  "session": {
    "id": "sess_xxx",
    "object": "realtime.session",
    "model": "qwen-tts-realtime",
    "voice": "Cherry",
    "language_type": "Chinese",
    "mode": "commit",
    "response_format": "pcm",
    "sample_rate": 24000
  }
}

type string

The event type. This value is always session.updated.

session object

The session configuration.

Properties

id string

The session ID.

object string

The session service name.

mode string

The interaction mode. Valid values: server_commit and commit.

model string

The model in use.

voice string

The voice in use.

response_format string

The audio format.

sample_rate integer

The audio sampling rate.

language_type string

The language of the audio.
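One way to confirm that an update took effect is to compare the fields the client requested with the configuration returned in session.updated. This sketch assumes the server returns the applied values under the same field names the client sent, which this page does not state explicitly; `unapplied_fields` is a hypothetical helper, and the voice name `Ethan` in the usage example is a placeholder.

```python
def unapplied_fields(requested: dict, updated_event: dict) -> dict:
    """Return {field: (requested, actual)} for every requested field whose
    value differs from the session returned in session.updated."""
    session = updated_event["session"]
    return {
        key: (value, session.get(key))
        for key, value in requested.items()
        if session.get(key) != value
    }


# Usage: an empty result means every requested field matches the response.
event = {
    "type": "session.updated",
    "session": {"voice": "Cherry", "mode": "commit", "sample_rate": 24000},
}
print(unapplied_fields({"mode": "commit"}, event))
print(unapplied_fields({"voice": "Ethan"}, event))
```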

input_text_buffer.committed

Sent after the server receives an input_text_buffer.commit event.

event_id string

The server-side event ID.

{
  "event_id": "event_FC6MA88wS2oEeXkPvWsxX",
  "type": "input_text_buffer.committed",
  "item_id": ""
}

type string

The event type. This value is always input_text_buffer.committed.

item_id string

The ID of the user message item to create.

input_text_buffer.cleared

Sent after the client sends an input_text_buffer.clear event.

event_id string

The server-side event ID.

{
    "event_id": "event_1122",
    "type": "input_text_buffer.cleared"
}

type string

The event type. This value is always input_text_buffer.cleared.

response.created

The server sends this event after it receives an input_text_buffer.commit event from the client.

event_id string

The server-side event ID.

{
  "event_id": "event_IMnLqDvG6Ahhk7sWV2uOs",
  "type": "response.created",
  "response": {
    "id": "resp_USvBwHktHcz76r6GaIJUV",
    "object": "realtime.response",
    "conversation_id": "",
    "status": "in_progress",
    "voice": "Cherry",
    "output": []
  }
}

type string

The event type. This value is always response.created.

response object

The response details.

Properties

id string

The response ID.

object string

The object type. This value is always realtime.response.

status string

The status of the response. Valid values are:

  • completed

  • failed

  • in_progress

  • incomplete

voice string

The voice in use.

output array

This field is empty for this event.

response.output_item.added

Sent when a new output item is ready.

event_id string

The server-side event ID.

{
  "event_id": "event_INDGnGNulaXCrStd9ZM5X",
  "type": "response.output_item.added",
  "response_id": "resp_USvBwHktHcz76r6GaIJUV",
  "output_index": 0,
  "item": {
    "id": "item_FIrYGaNVK3rbIZqeY4QjM",
    "object": "realtime.item",
    "type": "message",
    "status": "in_progress",
    "role": "assistant",
    "content": []
  }
}

type string

The event type. This value is always response.output_item.added.

response_id string

The ID of the response.

output_index integer

The index of the response output item. This value is always 0.

item object

The output item details.

Properties

id string

The output item ID.

object string

This value is always realtime.item.

status string

The status of the output item.

content array

The content of the message.

response.content_part.added

Sent when a new content part is ready.

event_id string

The server-side event ID.

{
  "event_id": "event_DigZ95MWN36YYyyjcENoq",
  "type": "response.content_part.added",
  "response_id": "resp_USvBwHktHcz76r6GaIJUV",
  "item_id": "item_FIrYGaNVK3rbIZqeY4QjM",
  "output_index": 0,
  "content_index": 0,
  "part": {
    "type": "audio",
    "text": ""
  }
}

type string

The event type. This value is always response.content_part.added.

response_id string

The ID of the response.

item_id string

The ID of the message item.

output_index integer

The index of the response output item. This value is always 0.

content_index integer

The index of the content part within the response output item. This value is always 0.

part object

The content part that was added.

Properties

type string

The type of the content part.

text string

The text of the content part.

response.audio.delta

Sent when the model generates new audio data incrementally.

event_id string

The server-side event ID.

{
  "event_id": "event_B1osWMZBtrEQbiIwW0qHQ",
  "type": "response.audio.delta",
  "response_id": "resp_B1osWTzBb8hO0WsELHgVP",
  "item_id": "item_B1osWH81fXDoyim1T5fsF",
  "output_index": 0,
  "content_index": 0,
  "delta": "base64 audio"
}

type string

The event type. This value is always response.audio.delta.

response_id string

The ID of the response.

item_id string

The ID of the message item.

output_index integer

The index of the response output item. This value is always 0.

content_index integer

The index of the content part within the response output item. This value is always 0.

delta string

Audio data generated incrementally by the model. The data is Base64-encoded.
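Because `delta` is Base64-encoded, a client has to decode each chunk before playing or storing it. The sketch below accumulates the decoded bytes per `item_id`; `AudioCollector` is a hypothetical helper, and it assumes the chunks arrive in playback order.

```python
import base64


class AudioCollector:
    """Collects decoded audio bytes from response.audio.delta events."""

    def __init__(self):
        self._chunks = {}  # item_id -> bytearray of decoded audio

    def on_delta(self, event: dict) -> bytes:
        """Decode one delta, append it to its item's buffer, and return it."""
        chunk = base64.b64decode(event["delta"])
        self._chunks.setdefault(event["item_id"], bytearray()).extend(chunk)
        return chunk

    def audio_for(self, item_id: str) -> bytes:
        """Return all audio received so far for the given item."""
        return bytes(self._chunks.get(item_id, b""))
```

Returning the decoded chunk from `on_delta` lets a caller stream it to a player immediately while the collector keeps the full buffer for saving later.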

response.content_part.done

Sent when a content part is complete.

event_id string

The server-side event ID.

{
  "event_id": "event_Vo2YUjlYQJ4colH8nVzkU",
  "type": "response.content_part.done",
  "response_id": "resp_USvBwHktHcz76r6GaIJUV",
  "item_id": "item_FIrYGaNVK3rbIZqeY4QjM",
  "output_index": 0,
  "content_index": 0,
  "part": {
    "type": "audio",
    "text": ""
  }
}

type string

The event type. This value is always response.content_part.done.

response_id string

The ID of the response.

item_id string

The ID of the message item.

output_index integer

The index of the response output item. This value is always 0.

content_index integer

The index of the content part within the response output item. This value is always 0.

part object

The completed content part.

Properties

type string

The type of the content part.

text string

The text of the content part.

response.output_item.done

Sent when an output item is complete.

event_id string

The server-side event ID.

{
  "event_id": "event_LO6SJRKIQ9NBayyYB8a1A",
  "type": "response.output_item.done",
  "response_id": "resp_USvBwHktHcz76r6GaIJUV",
  "output_index": 0,
  "item": {
    "id": "item_FIrYGaNVK3rbIZqeY4QjM",
    "object": "realtime.item",
    "type": "message",
    "status": "completed",
    "role": "assistant",
    "content": [
      {
        "type": "audio",
        "text": ""
      }
    ]
  }
}

type string

The event type. This value is always response.output_item.done.

response_id string

The ID of the response.

output_index integer

The index of the response output item. This value is always 0.

item object

The output item details.

Properties

id string

The output item ID.

object string

This value is always realtime.item.

status string

The status of the output item.

content array

The content of the message.

response.audio.done

Sent when the model finishes generating audio data.

event_id string

The server-side event ID.

{
  "event_id": "event_LZaOHPzXYMUXGBcVkBmKX",
  "type": "response.audio.done",
  "response_id": "resp_USvBwHktHcz76r6GaIJUV",
  "item_id": "item_FIrYGaNVK3rbIZqeY4QjM",
  "output_index": 0,
  "content_index": 0
}

type string

The event type. This value is always response.audio.done.

response_id string

The ID of the response.

item_id string

The ID of the message item.

output_index integer

The index of the response output item. This value is always 0.

content_index integer

The index of the content part within the response output item. This value is always 0.

response.done

Sent when response generation is complete. The `response` object contains all output items but excludes the already-sent raw audio data.

event_id string

The server-side event ID.

Qwen3-TTS Realtime

{
    "event_id": "event_Aemy83XqHFFDDSeJIDn6N",
    "type": "response.done",
    "response": {
        "id": "resp_LFeR42yXZ9SxUAeXjmyTz",
        "object": "realtime.response",
        "conversation_id": "",
        "status": "completed",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Cherry",
        "output": [
            {
                "id": "item_Ae1lv2XmRljRSG96L8Zm1",
                "object": "realtime.item",
                "type": "message",
                "status": "completed",
                "role": "assistant",
                "content": [
                    {
                        "type": "audio",
                        "transcript": ""
                    }
                ]
            }
        ],
        "usage": {
            "characters": 25
        }
    }
}

Qwen-TTS Realtime

{
  "event_id": "event_xxx",
  "type": "response.done",
  "response": {
    "id": "resp_xxx",
    "object": "realtime.response",
    "conversation_id": "",
    "status": "completed",
    "modalities": [
      "text",
      "audio"
    ],
    "voice": "Cherry",
    "output": [
      {
        "id": "item_FIrYGaNVK3rbIZqeY4QjM",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "assistant",
        "content": [
          {
            "type": "audio",
            "transcript": ""
          }
        ]
      }
    ],
    "usage": {
      "total_tokens": 67,
      "input_tokens": 3,
      "output_tokens": 64,
      "input_tokens_details": {
        "text_tokens": 3
      },
      "output_tokens_details": {
        "text_tokens": 0,
        "audio_tokens": 64
      }
    }
  }
}

type string

The event type. This value is always response.done.

response_id string

The ID of the response.

response object

The response details.

Properties

id string

The response ID.

object string

The object type. This value is always realtime.response.

output array

Response output.

usage object

Billing information for this speech synthesis request.

Properties

characters integer

The number of characters billed for Qwen3-TTS Realtime.

total_tokens integer

Total token count for input and output (synthesized audio).

input_tokens integer

Token count for input content.

output_tokens integer

Token count for output content.

input_tokens_details object

Detailed token count for input content.

input_tokens_details.text_tokens integer

Token count for input text content.

output_tokens_details object

Detailed token count for output content.

output_tokens_details.text_tokens integer

Token count for output text content.

output_tokens_details.audio_tokens integer

Token count for output audio content.

Audio-to-token conversion rule: 1 second = 50 tokens. Audio under 1 second counts as 50 tokens.
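The conversion rule and the two `usage` shapes shown in the samples can be sketched as follows. Both function names are hypothetical. The rounding of fractional seconds is an assumption (rounded up here); the page states only the rate of 50 tokens per second and the 50-token minimum.

```python
import math

TOKENS_PER_SECOND = 50  # from the conversion rule above


def estimate_audio_tokens(duration_seconds: float) -> int:
    """Estimate audio_tokens for synthesized audio of the given length."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Assumption: partial tokens are rounded up. Audio shorter than one
    # second is counted as a full second (50 tokens), as stated above.
    return max(TOKENS_PER_SECOND, math.ceil(duration_seconds * TOKENS_PER_SECOND))


def billed_quantity(usage: dict) -> tuple:
    """Return (unit, amount) from the usage object of a response.done event."""
    if "characters" in usage:
        # Shape shown in the Qwen3-TTS Realtime sample.
        return ("characters", usage["characters"])
    if "total_tokens" in usage:
        # Shape shown in the Qwen-TTS Realtime sample.
        return ("tokens", usage["total_tokens"])
    raise ValueError("usage has neither characters nor total_tokens")
```

For the Qwen-TTS Realtime sample above, `billed_quantity` returns `("tokens", 67)`, which is the 3 input tokens plus the 64 output audio tokens.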

session.finished

Sent when all responses have been generated.

event_id string

The server-side event ID.

{
  "event_id": "event_2239",
  "type": "session.finished"
}

type string

The event type. This value is always session.finished.
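Taken together, the events on this page suggest a simple receive loop: decode audio deltas, record usage from each response.done, stop at session.finished, and fail on error. The sketch below works on any iterable of raw JSON strings so that it stays independent of a particular WebSocket library; `consume_events` is a hypothetical name, and event types it does not need are ignored.

```python
import base64
import json


def consume_events(raw_messages):
    """Process server messages until session.finished.

    Returns (audio_bytes, usages): the concatenated decoded audio and the
    list of usage objects, one per response.done event.
    """
    audio = bytearray()
    usages = []
    for raw in raw_messages:
        event = json.loads(raw)
        event_type = event["type"]
        if event_type == "error":
            err = event.get("error", {})
            raise RuntimeError(f"{err.get('code')}: {err.get('message')}")
        if event_type == "response.audio.delta":
            audio.extend(base64.b64decode(event["delta"]))
        elif event_type == "response.done":
            usages.append(event["response"].get("usage", {}))
        elif event_type == "session.finished":
            break  # no further responses are expected
        # Other lifecycle events (response.created, *.added, *.done) carry
        # no audio data and are skipped in this sketch.
    return bytes(audio), usages
```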