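Server events - Alibaba Cloud Model Studio

This topic describes the server events of the qwen3-livetranslate-flash-realtime API.

Related documentation: Real-time audio and video translation - Qwen.

error

Error information returned by the server.

event_id string

The unique identifier of this event.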

{
  "event_id": "event_RoUu4T8yExPMI37GKwaOC",
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_value",
    "message": "Invalid modalities: ['audio']. Supported combinations are: ['text'] and ['audio', 'text'].",
    "param": "session.modalities"
  }
}

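type string

The event type. Always error.

error object

The error details.

Properties

type string

The error type.

code string

The error code.

message string

The error message.

param string

The parameter related to the error, for example session.modalities.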
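
As a minimal sketch of consuming this event, the helper below turns an error event into a single log line. The name `format_error` is hypothetical and not part of the API; the sketch assumes the event has already been parsed from JSON and uses only the fields documented above.

```python
import json

def format_error(event: dict) -> str:
    """Build a one-line summary from an `error` server event."""
    if event.get("type") != "error":
        raise ValueError("not an error event")
    err = event.get("error", {})
    # Assumption: `param` may be absent when the error is not tied to a
    # specific parameter; the reference only shows an example that has it.
    param = err.get("param")
    suffix = f" (param: {param})" if param else ""
    return f"[{err.get('type')}/{err.get('code')}] {err.get('message')}{suffix}"

raw = (
    '{"event_id": "event_x", "type": "error", "error": {'
    '"type": "invalid_request_error", "code": "invalid_value", '
    '"message": "Invalid modalities", "param": "session.modalities"}}'
)
line = format_error(json.loads(raw))
print(line)
```

Raising on a non-error event keeps the helper from silently formatting unrelated events; a caller that prefers leniency could return None instead.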

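session.created

The first event that the server returns after the client connects. It contains the default configuration for this connection.

event_id string

The unique identifier of this event.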

{
    "event_id": "event_QxBGpjBDmDDQQWDtrqBKB",
    "type": "session.created",
    "session": {
        "id": "sess_OozZ1vtbPt2muDflHODIH",
        "object": "realtime.session",
        "model": "qwen3-livetranslate-flash-realtime",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Cherry",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm24",
        "translation": {
           "language": "en"
        }
    }
}

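type string

The event type. Always session.created.

session object

The session configuration.

Properties

id string

The unique identifier of the session.

object string

Always realtime.session.

model string

The model in use.

modalities array

The output modalities of the model.

voice string

The voice used for the audio that the model generates.

input_audio_format string

The input audio format. Always pcm16.

output_audio_format string

The output audio format. Always pcm24.

translation object (optional)

The translation configuration.

Properties

language string (optional)

The configured target language for translation.

session.updated

Returned after the server successfully processes a session.update request from the client. If the request fails, the server returns an error event instead.

event_id string

The unique identifier of this event.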

{
    "event_id": "event_QxBGpjBDmDDQQWDtrqBKB",
    "type": "session.updated",
    "session": {
        "id": "sess_OozZ1vtbPt2muDflHODIH",
        "object": "realtime.session",
        "model": "qwen3-livetranslate-flash-realtime",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Ethan",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm24",
        "translation": {
           "language": "en"
        }
    }
}

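type string

The event type. Always session.updated.

session object

The session configuration.

Properties

id string

The unique identifier of the session.

object string

Always realtime.session.

model string

The model in use.

modalities array

The output modalities of the model.

voice string

The voice used for the audio that the model generates.

input_audio_format string

The input audio format. Always pcm16.

output_audio_format string

The output audio format. Always pcm24.

translation object (optional)

The translation configuration.

Properties

language string (optional)

The configured target language for translation.

session.finished

The session-end event. It indicates that all audio translation in the current session is complete.

The server sends this event only after the client sends session.finish. After receiving it, the client can close the connection.

event_id string

The unique identifier of this event.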

{
    "event_id": "event_xxx",
    "type": "session.finished"
}

type string

The event type. Always session.finished.
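
The shutdown order described above (send session.finish, keep reading, disconnect only after session.finished) can be sketched as follows. This is an illustration under assumptions: `drain_until_finished` is a hypothetical helper, and `events` stands for any iterable of already-parsed server events, for example messages read from a WebSocket.

```python
def drain_until_finished(events):
    """Consume parsed events and return those seen before session.finished.

    Raises RuntimeError if the stream ends without session.finished, which
    would mean the connection dropped before the session ended cleanly.
    """
    seen = []
    for event in events:
        if event.get("type") == "session.finished":
            return seen  # the caller can now close the connection
        seen.append(event)
    raise RuntimeError("stream ended before session.finished")

stream = [
    {"event_id": "event_1", "type": "response.done"},
    {"event_id": "event_2", "type": "session.finished"},
]
pending = drain_until_finished(stream)
print([e["type"] for e in pending])
```

Collecting the trailing events rather than discarding them matters because translation results may still arrive between session.finish and session.finished.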

response.created

Returned when the server generates a new model response.

event_id string

The unique identifier of this event.

{
    "event_id": "event_L8hHVI5jYis6BzAjnPWJh",
    "type": "response.created",
    "response": {
        "id": "resp_P79OOMs8LnrXVpiIHUCKR",
        "object": "realtime.response",
        "conversation_id": "conv_UFClXtYkRkFXrs48y8pmK",
        "status": "in_progress",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Cherry",
        "output_audio_format": "pcm24",
        "output": []
    }
}

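type string

The event type. Always response.created.

response object

The response object.

Properties

id string

The unique identifier of the response.

conversation_id string

The unique identifier of the current conversation.

object string

The object type. Always realtime.response for this event.

status string

The response status. Valid values:

completed: the response finished.
failed: the response failed.
in_progress: the response is still being generated.
incomplete: the response is incomplete.

modalities array

The modalities of the response.

voice string

The voice used for the audio that the model generates.

output_audio_format string

The output audio format. Always pcm24.

output array

Currently empty for this event.

response.done

Returned when the response has finished generating. The response object in this event contains all output items except the raw audio data.

event_id string

The unique identifier of this event.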

{
  "event_id": "event_CNea8oXNipVanSg2VIzkO",
  "type": "response.done",
  "response": {
    "id": "resp_TfhYTqej692vsGA2jNEtH",
    "object": "realtime.response",
    "conversation_id": "conv_ZtyLfKVm8XqLwYRlsuDih",
    "status": "completed",
    "modalities": [
      "text",
      "audio"
    ],
    "voice": "Cherry",
    "output_audio_format": "pcm24",
    "output": [
      {
        "id": "item_MKtkMwN9RtcyE9eJShyWy",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "assistant",
        "content": [
          {
            "type": "audio",
            "transcript": "Hello? "
          }
        ]
      }
    ],
    "usage": {
      "total_tokens": 56,
      "input_tokens": 47,
      "output_tokens": 9,
      "input_tokens_details": {
        "text_tokens": 20,
        "audio_tokens": 27
      },
      "output_tokens_details": {
        "text_tokens": 2,
        "audio_tokens": 7
      }
    }
  }
}

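type string

The event type. Always response.done.

response object

The response object.

Properties

id string

The unique identifier of the response.

conversation_id string

The unique identifier of the current conversation.

object string

The object type. Always realtime.response for this event.

status string

The status of the response.

modalities array

The modalities of the response.

voice string

The voice used for the audio that the model generates.

output_audio_format string

The output audio format. Always pcm24.

output array

The output of the response.

Properties

id string

The unique identifier of the output item.

type string

The type of the output item. Currently always message.

object string

The object type of the output item. Currently always realtime.item.

status string

The status of the output item.

role string

The role of the output item.

content array

The content of the output item.

Properties

type string

The type of the output content: text when the output is text only, audio when the output includes audio.

text string

The text content of the output.

transcript string

The text transcript of the audio.

usage object

Token usage for this response.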
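
To show how the nested fields fit together, here is a small sketch that pulls the status, the translated text, and the total token count out of a response.done event. `summarize_response` is a hypothetical helper; the sketch assumes that audio content parts carry `transcript` and text content parts carry `text`, as described above.

```python
def summarize_response(event: dict) -> dict:
    """Extract status, joined text, and total tokens from response.done."""
    response = event["response"]
    texts = []
    for item in response.get("output", []):
        for part in item.get("content", []):
            # Audio parts expose `transcript`; text-only parts expose `text`.
            texts.append(part.get("transcript") or part.get("text") or "")
    usage = response.get("usage", {})
    return {
        "status": response.get("status"),
        "text": "".join(texts),
        "total_tokens": usage.get("total_tokens"),
    }

done = {
    "type": "response.done",
    "response": {
        "status": "completed",
        "output": [
            {"type": "message", "content": [{"type": "audio", "transcript": "Hello? "}]}
        ],
        "usage": {"total_tokens": 56, "input_tokens": 47, "output_tokens": 9},
    },
}
summary = summarize_response(done)
print(summary)
```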

response.text.text

Returned when the output modalities contain only text and the model incrementally generates new text.

event_id string

The unique identifier of this event.

{
    "event_id": "event_B1lIeyOXR7qJMEExbqtTG",
    "type": "response.text.text",
    "response_id": "resp_B1lIdtjF4Noqpn5NOjznj",
    "item_id": "item_B1lIdJsAJlJiFs8ztWpJt",
    "output_index": 0,
    "content_index": 0,
    "text": "How are"
}

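type string

The event type. Always response.text.text.

text string

The incremental text returned.

response_id string

The ID of the response.

item_id string

The ID of the message item. Use it to associate events that belong to the same message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

response.text.done

Returned when the output modalities contain only text and the model finishes generating text.

The server also returns this event when the response is interrupted, incomplete, or canceled.

event_id string

The unique identifier of this event.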

{
    "event_id": "event_B1lIeE2Nac33zn5V7h2mm",
    "type": "response.text.done",
    "response_id": "resp_B1lIdtjF4Noqpn5NOjznj",
    "item_id": "item_B1lIdJsAJlJiFs8ztWpJt",
    "output_index": 0,
    "content_index": 0,
    "text": "How can I assist you today?"
}

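type string

The event type. Always response.text.done.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

text string

The complete text output by the model.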
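
One way to combine response.text.text and response.text.done is sketched below. It assumes each response.text.text event carries only the newly generated increment, as its field description states, and it treats the text in response.text.done as authoritative. `TextAssembler` is a hypothetical name, not part of the API.

```python
from collections import defaultdict

class TextAssembler:
    """Collect incremental text per message item until the done event."""

    def __init__(self):
        self._parts = defaultdict(list)  # item_id -> list of increments

    def feed(self, event: dict):
        """Return the complete text on response.text.done, otherwise None."""
        kind = event.get("type")
        if kind == "response.text.text":
            self._parts[event["item_id"]].append(event["text"])
            return None
        if kind == "response.text.done":
            # Drop the buffered increments and trust the final text.
            self._parts.pop(event["item_id"], None)
            return event["text"]
        return None

    def partial(self, item_id: str) -> str:
        """Return the text accumulated so far for a message item."""
        return "".join(self._parts.get(item_id, []))

asm = TextAssembler()
asm.feed({"type": "response.text.text", "item_id": "item_1", "text": "How can"})
asm.feed({"type": "response.text.text", "item_id": "item_1", "text": " I help?"})
preview = asm.partial("item_1")
final = asm.feed({"type": "response.text.done", "item_id": "item_1", "text": "How can I help?"})
print(preview, "|", final)
```

Keying the buffer by item_id follows the note that item_id associates events of the same message item.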

response.audio.delta

Returned when the output modalities include audio and the model incrementally generates new audio data.

event_id string

The unique identifier of this event.

{
    "event_id": "event_B1osWMZBtrEQbiIwW0qHQ",
    "type": "response.audio.delta",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0,
    "delta": "UklGRnoGAABXQVZFZm10IBAAAAAB..."
}

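type string

The event type. Always response.audio.delta.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

delta string

The incremental audio data output by the model, Base64-encoded.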
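
Because delta is Base64-encoded, a client has to decode each chunk before playing or storing it. The sketch below appends the decoded bytes to a buffer; `append_audio` is a hypothetical helper, and interpreting the bytes (sample rate, sample width) is left to the caller because this reference only names the format as pcm24.

```python
import base64

def append_audio(buffer: bytearray, event: dict) -> int:
    """Decode one response.audio.delta event into `buffer`.

    Returns the number of bytes added (0 for any other event type).
    """
    if event.get("type") != "response.audio.delta":
        return 0
    chunk = base64.b64decode(event["delta"])
    buffer.extend(chunk)
    return len(chunk)

audio = bytearray()
# Stand-in payload: four arbitrary bytes, not real audio.
sample = base64.b64encode(b"\x00\x01\x02\x03").decode("ascii")
added = append_audio(audio, {"type": "response.audio.delta", "delta": sample})
print(added, bytes(audio))
```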

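response.audio.done

Returned when the output modalities include audio and the model finishes generating audio.

The server also returns this event when the response is interrupted, incomplete, or canceled.

This event does not carry the complete audio data.

event_id string

The unique identifier of this event.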

{
    "event_id": "event_B1osWMWoDRYyITDyNYcBu",
    "type": "response.audio.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0
}

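type string

The event type. Always response.audio.done.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

conversation.item.input_audio_transcription.text

When the input_audio_transcription.model parameter is configured, the server streams the speech recognition results for the input audio (the original text in the source language).

event_id string

The unique identifier of this event.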

{
    "event_id": "event_xxx",
    "type": "conversation.item.input_audio_transcription.text",
    "item_id": "item_xxx",
    "content_index": 0,
    "text": "",
    "stash": "今天天氣真好",
    "language": "zh"
}

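type string

The event type. Always conversation.item.input_audio_transcription.text.

item_id string

The unique identifier of the message item.

content_index integer

Currently always 0.

text string

The confirmed recognition text.

stash string

The recognition text pending confirmation (later events may revise it).

language string

The detected source language.

conversation.item.input_audio_transcription.completed

When the input_audio_transcription.model parameter is configured, the server returns this event after speech recognition completes. It contains the final, complete recognition result.

event_id string

The unique identifier of this event.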

{
    "event_id": "event_xxx",
    "type": "conversation.item.input_audio_transcription.completed",
    "item_id": "item_xxx",
    "content_index": 0,
    "transcript": "今天天氣真好，我們一起去公園散步吧。",
    "language": "zh"
}

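type string

The event type. Always conversation.item.input_audio_transcription.completed.

item_id string

The unique identifier of the message item.

content_index integer

Currently always 0.

transcript string

The complete speech recognition result (the original text in the source language).

language string

The detected source language.

response.audio_transcript.text

When the output modalities include audio, the server may return this event, which is used to display the translation in real time.

event_id string

The unique identifier of this event.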

{
  "event_id": "event_xxx",
  "type": "response.audio_transcript.text",
  "response_id": "resp_xxx",
  "item_id": "item_xxx",
  "output_index": 0,
  "content_index": 0,
  "text": "Hello,",
  "stash": " who are you?"
}

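type string

The event type. Always response.audio_transcript.text.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

text string

A translated text segment that has been confirmed.

stash string

Provisional text from the preliminary translation. Concatenated with the current text, it forms the provisional translation result. The server keeps updating text and stash through response.audio_transcript.text events until the response.audio_transcript.done event arrives, at which point the transcript field provides the complete final translation.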
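
The interplay of text, stash, and transcript can be sketched as a function that decides what to show as a live caption. `caption_for` is a hypothetical helper. It joins text and stash within a single event, as described above, and makes no assumption about whether text is cumulative across events, since this reference does not say.

```python
def caption_for(event: dict):
    """Return (caption, is_final) for transcript events, otherwise None."""
    kind = event.get("type")
    if kind == "response.audio_transcript.text":
        # Confirmed text followed by the provisional part that may change.
        return event.get("text", "") + event.get("stash", ""), False
    if kind == "response.audio_transcript.done":
        return event["transcript"], True
    return None

live = caption_for({
    "type": "response.audio_transcript.text",
    "text": "Hello,",
    "stash": " who are you?",
})
final = caption_for({
    "type": "response.audio_transcript.done",
    "transcript": "Hello, who are you?",
})
print(live, final)
```

Returning the is_final flag lets a user interface render provisional captions differently, for example in a lighter color, and replace them once the final transcript arrives.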

response.audio_transcript.done

Returned when the output modalities include audio and the model finishes generating text.

event_id string

The unique identifier of this event.

{
    "event_id": "event_VN4Q4GJugLcc1S23viW8E",
    "type": "response.audio_transcript.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_JvJauNH2CTXb1D9WV6pD4",
    "output_index": 0,
    "content_index": 0,
    "transcript": "How can I assist you today?"
}

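type string

The event type. Always response.audio_transcript.done.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

transcript string

The complete text.

response.output_item.added

Returned when a new output item is created during response generation.

event_id string

The unique identifier of this event.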

{
    "event_id": "event_B4O5yPt3Gjnjy5eYH3plG",
    "type": "response.output_item.added",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "output_index": 0,
    "item": {
        "id": "item_OFaPGtzfWCPyGzxnuEX9i",
        "object": "realtime.item",
        "type": "message",
        "status": "in_progress",
        "role": "assistant",
        "content": []
    }
}

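type string

The event type. Always response.output_item.added.

response_id string

The unique identifier of the response.

output_index integer

Currently always 0.

item object

Information about the output item.

Properties

id string

The unique identifier of the output item.

type string

Always message.

object string

Always realtime.item.

status string

The status of the output item.

role string

The role of the message.

content array

The content of the message.

response.output_item.done

Returned when an output item has finished being output.

event_id string

The unique identifier of this event.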

{
    "event_id": "event_XkiwbYTBC9Wcdwy6uYJ2G",
    "type": "response.output_item.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "output_index": 0,
    "item": {
        "id": "item_JvJauNH2CTXb1D9WV6pD4",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "assistant",
        "content": [
            {
                "type": "audio",
                "text": "你好，我是阿里雲研發的大規模語言模型，我叫通義千問。有什麼我可以協助你的嗎？"
            }
        ]
    }
}

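type string

The event type. Always response.output_item.done.

response_id string

The unique identifier of the response.

output_index integer

Currently always 0.

item object

Information about the output item.

Properties

id string

The unique identifier of the output item.

object string

Always realtime.item.

type string

Always message.

status string

The status of the output item.

role string

The role that sent the message.

content array

The content of the message.

response.content_part.added

Returned when a new content part is output.

event_id string

The unique identifier of this event.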

{
    "event_id": "event_J2UixwYKZsXg7c9YXZetL",
    "type": "response.content_part.added",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0,
    "part": {
        "type": "audio",
        "text": ""
    }
}

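type string

The event type. Always response.content_part.added.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

part object

Information about the content part.

Properties

type string

The type of the content part.

text string

The text of the content part.

response.content_part.done

Returned when a content part has finished being output.

event_id string

The unique identifier of this event.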

{
    "event_id": "event_VN4Q4GJugLcc1S23viW8E",
    "type": "response.content_part.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_JvJauNH2CTXb1D9WV6pD4",
    "output_index": 0,
    "content_index": 0,
    "part": {
        "type": "audio",
        "text": "你好，我是阿里雲研發的大規模語言模型，我叫通義千問。有什麼我可以協助你的嗎？"
    }
}

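type string

The event type. Always response.content_part.done.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

part object

Information about the content part.

Properties

type string

The type of the content part.

text string

The text of the content part.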
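
Since every server event carries a type field, a client can route incoming messages with a simple lookup table. The sketch below is one possible shape for such a router, not an official client: `EventRouter`, `on`, and `dispatch` are hypothetical names, and the raw messages are assumed to be JSON strings as in the examples on this page.

```python
import json

class EventRouter:
    """Route server events to handlers based on their `type` field."""

    def __init__(self):
        self._handlers = {}
        self.unhandled = []  # types seen without a registered handler

    def on(self, event_type: str):
        """Decorator that registers a handler for one event type."""
        def register(func):
            self._handlers[event_type] = func
            return func
        return register

    def dispatch(self, raw: str):
        """Parse one JSON message and call the matching handler, if any."""
        event = json.loads(raw)
        handler = self._handlers.get(event.get("type"))
        if handler is None:
            self.unhandled.append(event.get("type"))
            return None
        return handler(event)

router = EventRouter()

@router.on("session.created")
def on_created(event):
    return event["session"]["id"]

session_id = router.dispatch('{"type": "session.created", "session": {"id": "sess_1"}}')
router.dispatch('{"type": "response.created", "response": {}}')
print(session_id, router.unhandled)
```

Recording unhandled types instead of raising keeps the client tolerant of event types that it does not yet process.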