Parameter | Type | Description |
type | string | The event type. The value is fixed to session.update. |
event_id | string | The identifier for this event. |
session | object | The session configuration. |
session.mode | string | The interaction mode. Valid values: server_commit (default), commit. |
session.voice | string | The voice used for speech synthesis. Both system voices and custom voices are supported. For more information, see Supported voices. |
session.language_type | string | Specifies the language of the synthesized audio. The default value is Auto. Auto: Use this value when the language of the text is uncertain or the text contains multiple languages. The model automatically matches the pronunciation for segments in different languages, but cannot guarantee perfect accuracy. Specific language: Specify the language when the text is in a single language. This significantly improves synthesis quality and typically yields better results than Auto. Valid values include the following: Chinese, English, German, Italian, Portuguese, Spanish, Japanese, Korean, French, Russian. |
session.response_format | string | The format of the audio output from the model. Supported formats: pcm (default), wav, mp3, opus. Qwen-TTS-Realtime (see Supported models) supports only pcm. |
session.sample_rate | integer | The sample rate (in Hz) of the audio output from the model. Supported sample rates: 8000, 16000, 24000 (default), 48000. Qwen-TTS-Realtime (see Supported models) supports only 24000. |
session.speech_rate | float | The speech rate of the audio. A value of 1.0 indicates a normal speed. A value less than 1.0 indicates a slower speed, and a value greater than 1.0 indicates a faster speed. Default value: 1.0. Valid values: [0.5, 2.0]. Qwen-TTS-Realtime (see Supported models) does not support this parameter. |
session.volume | integer | The volume of the audio. Default value: 50. Value range: [0, 100]. Qwen-TTS-Realtime (see Supported models) does not support this parameter. |
session.pitch_rate | float | The pitch of the synthesized audio. Default value: 1.0. Value range: [0.5, 2.0]. Qwen-TTS-Realtime (see Supported models) does not support this parameter. |
session.bit_rate | integer | Specifies the bitrate (in kbps) of the audio. A higher bitrate results in better audio quality and a larger file size. This parameter is available only when the audio format (response_format) is set to opus. Default value: 128. Value range: [6, 510]. Qwen-TTS-Realtime (see Supported models) does not support this parameter. |
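
The parameters above can be assembled into a session.update event along the following lines. This is a minimal sketch, not an official client: the helper name build_session_update, the event_id format, and the voice name "example-voice" are all hypothetical, and the nesting of the dotted parameter names under a "session" object is an assumption based on the table. Only the value ranges and defaults are taken from the table.

```python
import json
import uuid

# Choices copied from the parameter table above.
VALID_MODES = {"server_commit", "commit"}
VALID_FORMATS = {"pcm", "wav", "mp3", "opus"}
VALID_SAMPLE_RATES = {8000, 16000, 24000, 48000}


def _check_range(name, value, low, high):
    """Raise ValueError if value falls outside the closed interval [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be in [{low}, {high}], got {value}")


def build_session_update(voice, *, mode="server_commit", language_type="Auto",
                         response_format="pcm", sample_rate=24000,
                         speech_rate=None, volume=None, pitch_rate=None,
                         bit_rate=None):
    """Build a session.update event dict (hypothetical helper).

    Optional parameters left as None are omitted from the payload, which
    matters for models that do not support them.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode}")
    if response_format not in VALID_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    if sample_rate not in VALID_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")

    session = {
        "mode": mode,
        "voice": voice,
        "language_type": language_type,  # passed through without validation
        "response_format": response_format,
        "sample_rate": sample_rate,
    }
    if speech_rate is not None:
        _check_range("speech_rate", speech_rate, 0.5, 2.0)
        session["speech_rate"] = speech_rate
    if volume is not None:
        _check_range("volume", volume, 0, 100)
        session["volume"] = volume
    if pitch_rate is not None:
        _check_range("pitch_rate", pitch_rate, 0.5, 2.0)
        session["pitch_rate"] = pitch_rate
    if bit_rate is not None:
        # Per the table, bit_rate applies only when response_format is opus.
        if response_format != "opus":
            raise ValueError("bit_rate requires response_format='opus'")
        _check_range("bit_rate", bit_rate, 6, 510)
        session["bit_rate"] = bit_rate

    return {
        "type": "session.update",
        # Assumption: any unique string is acceptable as event_id.
        "event_id": f"event_{uuid.uuid4().hex}",
        "session": session,
    }


if __name__ == "__main__":
    # "example-voice" is a placeholder; pick a name from Supported voices.
    event = build_session_update("example-voice", language_type="English",
                                 response_format="opus", bit_rate=64)
    print(json.dumps(event, indent=2))
```

The helper's defaults (pcm, 24000 Hz, no optional parameters) match what the table lists as supported by Qwen-TTS-Realtime, so calling it with only a voice should yield a payload that avoids the unsupported fields. Sending the serialized event over the connection is left out, since the transport is not described in this section.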