Alibaba Cloud Model Studio: Client events

Last Updated: Jan 21, 2026

This topic describes the client events for the qwen3-livetranslate-flash-realtime API.

Reference: Real-time audio and video translation - Qwen.

session.update

After you establish a WebSocket connection, send this event to update the default session configuration.

When the service receives the session.update event, it validates the parameters. If any parameter is invalid, the service returns an error. If all parameters are valid, the service applies the update and returns the complete session configuration.

type string (Required)

The event type. This must be set to session.update.

{
  "event_id": "event_ToPZqeobitzUJnt3QqtWg",
  "type": "session.update",
  "session": {
    "modalities": [
      "text",
      "audio"
    ],
    "voice": "Cherry",
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm24",
    "input_audio_transcription": {
      "model": "qwen3-asr-flash-realtime",
      "language": "zh"
    },
    "translation": {
      "language": "en"
    }
  }
}

session object (Optional)

The session configuration.

Properties

modalities array (Optional)

The output modalities of the model. Valid values:

  • ["text"]

    Outputs text only.

  • ["text","audio"] (Default)

    Outputs text and audio.

voice string (Optional)

The voice for the generated audio. For valid values, see Supported voices. Default value: Cherry.

input_audio_transcription object (Optional)

The configuration for transcribing (speech recognition of) the input audio.

Properties

model string (Optional)

The speech recognition model. If this parameter is configured, the server returns the speech recognition result (original text in the source language) of the input audio along with the translation. The result is returned through the conversation.item.input_audio_transcription.text and conversation.item.input_audio_transcription.completed events.

Valid value: qwen3-asr-flash-realtime.

language string (Optional)

The source language for translation. For valid values, see Supported languages. Default value: en.

input_audio_format string (Optional)

The format of the input audio. Currently, this parameter can only be set to pcm16.

output_audio_format string (Optional)

The format of the output audio. Currently, this parameter can only be set to pcm24.

translation object (Optional)

The translation configuration.

Properties

language string (Optional)

The target language for translation. For valid values, see Supported languages. Default value: en.
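
The session.update payload described above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the function names build_session_update and make_event_id are hypothetical, and generating event_id from a UUID is an assumption, since this reference shows only sample values. The returned JSON string would be sent as a text message over the WebSocket connection by whichever client library you use.

```python
import json
import uuid


def make_event_id() -> str:
    # Assumption: any unique string works as event_id; the reference shows only samples.
    return "event_" + uuid.uuid4().hex


def build_session_update(target_language="en", source_language=None,
                         voice="Cherry", with_audio=True, transcribe=False) -> str:
    """Return a session.update event as a JSON string (hypothetical helper)."""
    session = {
        "modalities": ["text", "audio"] if with_audio else ["text"],
        "input_audio_format": "pcm16",   # only value listed in this reference
        "output_audio_format": "pcm24",  # only value listed in this reference
        "translation": {"language": target_language},
    }
    if with_audio:
        session["voice"] = voice

    # input_audio_transcription is optional; include it only when needed.
    transcription = {}
    if transcribe:
        transcription["model"] = "qwen3-asr-flash-realtime"
    if source_language is not None:
        transcription["language"] = source_language
    if transcription:
        session["input_audio_transcription"] = transcription

    return json.dumps({
        "event_id": make_event_id(),
        "type": "session.update",
        "session": session,
    })


if __name__ == "__main__":
    # Chinese speech in, English text and audio out, with source transcription.
    print(build_session_update(target_language="en", source_language="zh",
                               transcribe=True))
```

Omitting optional fields rather than sending empty values leaves the service defaults in place.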

input_audio_buffer.append

This event appends audio bytes to the input audio buffer. The service uses this buffer to detect speech and to decide when to submit the buffered audio for processing.

type string (Required)

The event type. This must be set to input_audio_buffer.append.

{
    "event_id": "event_xxx",
    "type": "input_audio_buffer.append",
    "audio": "xxx"
}

audio string (Required)

The Base64-encoded audio data.
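
As an illustration of the audio field, the sketch below wraps a chunk of raw PCM bytes in an input_audio_buffer.append event. The helper name build_audio_append is hypothetical, the UUID-based event_id is an assumption, and the chunk size used in the example is arbitrary because this reference does not specify one.

```python
import base64
import json
import uuid


def build_audio_append(pcm_chunk: bytes) -> str:
    """Return an input_audio_buffer.append event as a JSON string (hypothetical helper)."""
    return json.dumps({
        # Assumption: any unique string is acceptable as event_id.
        "event_id": "event_" + uuid.uuid4().hex,
        "type": "input_audio_buffer.append",
        # Base64-encode the raw bytes, then decode to str so the value is JSON-safe.
        "audio": base64.b64encode(pcm_chunk).decode("ascii"),
    })


if __name__ == "__main__":
    # 3200 bytes of silence stands in for real pcm16 audio; the size is arbitrary.
    message = build_audio_append(b"\x00" * 3200)
    print(message[:80] + "...")
```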

input_image_buffer.append

This event adds image data to the image buffer. The image can be from a local file or captured in real-time from a video stream.

The following limits apply to image inputs:

  • The image format must be JPG or JPEG. For optimal performance, a resolution of 480p or 720p is recommended. The maximum resolution is 1080p.

  • The size of a single image cannot exceed 500 KB before Base64 encoding.

  • The image data must be Base64-encoded.

  • The frequency for adding images to the buffer must not exceed 2 images per second.

  • You must send at least one input_audio_buffer.append event before you send an input_image_buffer.append event.

type string (Required)

The event type. This must be set to input_image_buffer.append.

{
    "event_id": "event_xxx",
    "type": "input_image_buffer.append",
    "image": "xxx"
}

image string (Required)

The Base64-encoded image data.
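
The limits listed for image inputs can be checked on the client before an event is built. The sketch below is one possible approach under stated assumptions: the class ImageEventBuilder and its methods are hypothetical, 500 KB is interpreted as 500 × 1024 bytes, and the rate limit is approximated by enforcing a minimum spacing of 0.5 seconds between images and dropping frames that arrive sooner. Resolution is not checked.

```python
import base64
import json
import time
import uuid

MAX_IMAGE_BYTES = 500 * 1024   # assumption: 1 KB = 1024 bytes
MIN_INTERVAL_SECONDS = 0.5     # spacing that approximates "at most 2 images per second"
JPEG_MAGIC = b"\xff\xd8"       # JPEG files begin with these two bytes


class ImageEventBuilder:
    """Builds input_image_buffer.append events while enforcing client-side limits."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._audio_sent = False
        self._last_image_at = None

    def mark_audio_sent(self) -> None:
        # Call after the first input_audio_buffer.append event has been sent.
        self._audio_sent = True

    def build(self, jpeg_bytes: bytes):
        """Return the event JSON, or None if the frame is dropped by rate limiting."""
        if not self._audio_sent:
            raise RuntimeError("send at least one input_audio_buffer.append first")
        if not jpeg_bytes.startswith(JPEG_MAGIC):
            raise ValueError("image does not look like a JPEG")
        if len(jpeg_bytes) > MAX_IMAGE_BYTES:
            raise ValueError("image exceeds 500 KB before Base64 encoding")

        now = self._clock()
        if (self._last_image_at is not None
                and now - self._last_image_at < MIN_INTERVAL_SECONDS):
            return None  # too soon after the previous image; skip this frame
        self._last_image_at = now

        return json.dumps({
            "event_id": "event_" + uuid.uuid4().hex,  # assumption: any unique string
            "type": "input_image_buffer.append",
            "image": base64.b64encode(jpeg_bytes).decode("ascii"),
        })
```

Dropping frames instead of queueing them keeps the image stream aligned with the live audio, which suits frames captured from a video stream.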

session.finish

Send this event to end the current session. The server responds with a session.finished event, after which the client must disconnect.

type string (Required)

The event type. This must be set to session.finish.

{
    "event_id": "event_xxx",
    "type": "session.finish"
}
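
The shutdown sequence can be sketched as two small helpers: one that builds the session.finish event and one that recognizes the session.finished event, after which the connection should be closed. Both function names are hypothetical. The sketch assumes that server events are JSON objects carrying a type field, mirroring the client events in this topic.

```python
import json
import uuid


def build_session_finish() -> str:
    """Return a session.finish event as a JSON string (hypothetical helper)."""
    return json.dumps({
        "event_id": "event_" + uuid.uuid4().hex,  # assumption: any unique string
        "type": "session.finish",
    })


def is_session_finished(raw_message: str) -> bool:
    """Return True if a server message is the session.finished event."""
    try:
        event = json.loads(raw_message)
    except json.JSONDecodeError:
        return False
    # Assumption: server events are JSON objects with a "type" field.
    return isinstance(event, dict) and event.get("type") == "session.finished"


if __name__ == "__main__":
    print(build_session_finish())
    # A receive loop would close the WebSocket once this returns True.
    print(is_session_finished('{"type": "session.finished"}'))
```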