Alibaba Cloud Model Studio: Speech synthesis - Qwen

Last Updated: Dec 11, 2025

The Qwen speech synthesis model offers a variety of human-like voices. They support multiple languages and dialects and can generate multilingual content in a single voice. The system automatically adapts its tone and processes complex text fluently.

Model availability

We recommend Qwen3-TTS-Flash.

Qwen3-TTS-Flash offers 49 voices and supports multiple languages and dialects.

Qwen-TTS offers up to 7 voices and supports only Chinese and English.

International (Singapore)

Model

Version

Unit price

Max input characters

Supported languages

Free quota (Note)

qwen3-tts-flash

Capabilities are identical to qwen3-tts-flash-2025-09-18.

Stable

$0.1/10,000 characters

600

Chinese (Mandarin, Beijing, Shanghai, Sichuan, Nanjing, Shaanxi, Minnan, Tianjin, Cantonese), English, Spanish, Russian, Italian, French, Korean, Japanese, German, and Portuguese

If you activate Model Studio before 00:00 on November 13, 2025: 2,000 characters

If you activate Model Studio on or after 00:00 on November 13, 2025: 10,000 characters

Validity: Valid for 90 days after you activate Model Studio.

qwen3-tts-flash-2025-11-27

Snapshot

10,000 characters

Validity: Valid for 90 days after you activate Model Studio.

qwen3-tts-flash-2025-09-18

Snapshot

If you activate Model Studio before 00:00 on November 13, 2025: 2,000 characters

If you activate Model Studio on or after 00:00 on November 13, 2025: 10,000 characters

Validity: Valid for 90 days after you activate Model Studio.

Billing is based on the number of input characters. The calculation rules are as follows:

  • Each Chinese character (including simplified/traditional Chinese, Japanese Kanji, and Korean Hanja) is counted as 2 characters.

  • Each other character, such as an English letter, a punctuation mark, or a space, is counted as 1 character.
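The counting rules above can be sketched in Python to estimate a bill before sending text. This is a rough illustration, not the service's actual counter: the helper names are hypothetical, and treating the listed Unicode ideograph blocks as "Chinese characters" is an assumption. Kana, Hangul, and everything else count as 1 here. The default price is the Singapore rate from the table above.

```python
def is_cjk_ideograph(ch: str) -> bool:
    """Assumption: ideographs are the code points in these Unicode blocks."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF        # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF     # Extension A
        or 0xF900 <= cp <= 0xFAFF     # Compatibility Ideographs
        or 0x20000 <= cp <= 0x2FA1F   # Extensions B and later, supplements
    )


def billable_characters(text: str) -> int:
    """Count 2 for each ideograph and 1 for every other character."""
    return sum(2 if is_cjk_ideograph(ch) else 1 for ch in text)


def estimate_cost(text: str, price_per_10k: float = 0.1) -> float:
    """Estimated cost in USD at the given price per 10,000 billable characters."""
    return billable_characters(text) / 10_000 * price_per_10k


print(billable_characters("Hello, world!"))  # 13 single-width characters
print(billable_characters("你好, world"))     # 2 ideographs * 2 + 7 others = 11
```

For the China (Beijing) region, pass `price_per_10k=0.114682` instead.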

China (Beijing)

Qwen3-TTS-Flash

Model

Version

Unit price

Max input characters

Supported languages

qwen3-tts-flash

Same capabilities as qwen3-tts-flash-2025-09-18

Stable

$0.114682/10,000 characters

600

Chinese (Mandarin, Beijing, Shanghai, Sichuan, Nanjing, Shaanxi, Minnan, Tianjin, Cantonese), English, Spanish, Russian, Italian, French, Korean, Japanese, German, Portuguese

qwen3-tts-flash-2025-11-27

Snapshot

qwen3-tts-flash-2025-09-18

Snapshot

Billing is based on the number of input characters. The calculation rules are as follows:

  • Each Chinese character (including simplified/traditional Chinese, Japanese Kanji, and Korean Hanja) is counted as 2 characters.

  • Each other character, such as an English letter, a punctuation mark, or a space, is counted as 1 character.
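Because the tables above list a maximum input of 600 characters, longer text has to be split across several requests. The following is a minimal sketch of one way to do that, preferring sentence boundaries. `split_text` and its `measure` parameter are hypothetical helpers, not part of the DashScope SDK. This page does not say whether the limit counts a Chinese character as one or two, so the measuring function is left pluggable and defaults to `len`.

```python
import re

# A run of non-terminator characters plus its trailing punctuation and spaces,
# or a bare run of punctuation. Every character lands in exactly one piece.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*\s*|[.!?。！？]+\s*")


def _hard_split(piece, limit, measure):
    """Break a single over-long piece character by character."""
    if measure(piece) <= limit:
        return [piece]
    parts, buf = [], ""
    for ch in piece:
        if buf and measure(buf + ch) > limit:
            parts.append(buf)
            buf = ""
        buf += ch
    if buf:
        parts.append(buf)
    return parts


def split_text(text, limit=600, measure=len):
    """Split text into chunks whose measured size does not exceed limit."""
    chunks, current = [], ""
    for sentence in _SENTENCE.findall(text):
        for part in _hard_split(sentence, limit, measure):
            if current and measure(current + part) > limit:
                chunks.append(current.strip())
                current = ""
            current += part
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Each returned chunk can then be sent as the `text` of its own request, and the resulting audio files concatenated in order.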

Qwen-TTS

Model

Version

Context window (tokens)

Max input (tokens)

Max output (tokens)

Input cost (per 1,000 tokens)

Output cost (per 1,000 tokens)

qwen-tts

Same capabilities as qwen-tts-2025-04-10.

Stable

8,192

512

7,680

$0.230

$1.434

qwen-tts-latest

Same capabilities as the latest snapshot version.

Latest

qwen-tts-2025-05-22

Snapshot

qwen-tts-2025-04-10

Audio is converted to tokens at a rate of 50 tokens per second. Audio clips shorter than 1 second are billed as 50 tokens.
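As a worked example of the conversion rule above, the sketch below turns an audio duration into billed tokens and an output cost. The function names are hypothetical, and rounding fractional token counts up is an assumption, because the rounding behavior is not stated here. The default price is the Qwen-TTS output cost from the table above.

```python
import math

TOKENS_PER_SECOND = 50  # conversion rate stated in the note above


def audio_tokens(duration_seconds: float) -> int:
    """Billed tokens for a clip; clips under 1 second are billed as 50 tokens."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Assumption: fractional token counts are rounded up.
    return max(TOKENS_PER_SECOND, math.ceil(duration_seconds * TOKENS_PER_SECOND))


def output_cost(duration_seconds: float, price_per_1k_tokens: float = 1.434) -> float:
    """Estimated output cost in USD for a clip of the given duration."""
    return audio_tokens(duration_seconds) / 1000 * price_per_1k_tokens


print(audio_tokens(0.4))  # 50, the minimum for a clip shorter than 1 second
print(audio_tokens(10))   # 500
```

Under these assumptions, 10 seconds of synthesized audio is 500 tokens, which is half of the per-1,000-token output price.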

Features

Features

Qwen3-TTS-Flash

Qwen-TTS

Connection type

Java/Python SDK, RESTful API

Streaming output

Supported

Streaming input

Not supported

Synthesized audio format

  • wav

  • Base64-encoded PCM for streaming output

Synthesized audio sample rate

24 kHz

Timestamp

Not supported

Language

Chinese (Mandarin, Beijing, Shanghai, Sichuan, Nanjing, Shaanxi, Minnan, Tianjin, Cantonese), English, Spanish, Russian, Italian, French, Korean, Japanese, German, Portuguese. Varies by voice. For more information, see Supported voices

Chinese (Mandarin, Beijing, Shanghai, Sichuan), English. Varies by model and voice. For more information, see Supported voices

Voice cloning

Not supported

SSML

Not supported

Getting started

Preparations

Key parameters

  • text: The text to synthesize.

  • voice: The voice to use.

  • language_type: The language for the synthesized audio. Valid values are Chinese, English, German, Italian, Portuguese, Spanish, Japanese, Korean, French, Russian, and Auto (default).

Use the returned url to retrieve the synthesized audio. The URL is valid for 24 hours.

Python

# DashScope SDK version 1.24.6 or later
import os
import dashscope

# This is the URL for the Singapore region. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

text = "Today is a wonderful day to build something people love!"
# To use the SpeechSynthesizer interface: dashscope.audio.qwen_tts.SpeechSynthesizer.call(...)
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-flash",
    # API keys for the Singapore and China (Beijing) regions are different. To get an API key: https://www.alibabacloud.com/help/en/model-studio/get-api-key
    # If you have not configured an environment variable, replace the following line with your Model Studio API key: api_key = "sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English", # We recommend that this matches the text language to ensure correct pronunciation and natural intonation.
    stream=False
)
print(response)
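The Python example above only prints the response. To keep the audio, download it from the returned URL before the link expires. The sketch below uses only the standard library. `download_audio` is a hypothetical helper, and reading the link from `response.output.audio.url` is an assumption based on the field names used in the Java and streaming examples on this page.

```python
import shutil
import urllib.request


def download_audio(url: str, path: str = "output.wav", timeout: float = 30.0) -> str:
    """Stream the file at url to path and return the local path."""
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return path


# Usage with the response from the example above (attribute path is an assumption):
# audio_url = response.output.audio.url
# download_audio(audio_url, "downloaded_audio.wav")
```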

Java

You must import the Gson dependency. If you use Maven or Gradle, you can add the dependency as follows:

Maven

Add the following content to pom.xml:

<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.13.1</version>
</dependency>

Gradle

Add the following content to build.gradle:

// https://mvnrepository.com/artifact/com.google.code.gson/gson
implementation("com.google.code.gson:gson:2.13.1")

// DashScope SDK version 2.21.9 or later
// Versions 2.20.7 and later support the Dylan, Jada, and Sunny voices.
import com.alibaba.dashscope.aigc.multimodalconversation.AudioParameters;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.protocol.Protocol;
import com.alibaba.dashscope.utils.Constants;

import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;

public class Main {
    private static final String MODEL = "qwen3-tts-flash";
    public static void call() throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                // API keys for the Singapore and China (Beijing) regions are different. To get an API key: https://www.alibabacloud.com/help/en/model-studio/get-api-key
                // If you have not configured an environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model(MODEL)
                .text("Today is a wonderful day to build something people love!")
                .voice(AudioParameters.Voice.CHERRY)
                .languageType("English") // We recommend that this matches the text language to ensure correct pronunciation and natural intonation.
                .build();
        MultiModalConversationResult result = conv.call(param);
        String audioUrl = result.getOutput().getAudio().getUrl();
        System.out.print(audioUrl);

        // Download the audio file locally
        try (InputStream in = new URL(audioUrl).openStream();
             FileOutputStream out = new FileOutputStream("downloaded_audio.wav")) {
            byte[] buffer = new byte[1024];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
            System.out.println("\nAudio file downloaded to local path: downloaded_audio.wav");
        } catch (Exception e) {
            System.out.println("\nError downloading audio file: " + e.getMessage());
        }
    }
    public static void main(String[] args) {
        // This is the URL for the Singapore region. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
        try {
            call();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

cURL

# ======= Important =======
# This is the URL for the Singapore region. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# API keys for the Singapore and China (Beijing) regions are different. To get an API key: https://www.alibabacloud.com/help/en/model-studio/get-api-key
# === Delete this comment before execution ===

curl -X POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
    "model": "qwen3-tts-flash",
    "input": {
        "text": "Today is a wonderful day to build something people love!",
        "voice": "Cherry",
        "language_type": "English"
    }
}'
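The same REST call can be prepared from Python without the SDK. The sketch below mirrors the cURL request body and checks `language_type` against the values listed under Key parameters. `build_request` and `LANGUAGE_TYPES` are hypothetical names introduced for illustration. The endpoint is the Singapore URL from the cURL example.

```python
import json
import os
import urllib.request

API_URL = ("https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/"
           "multimodal-generation/generation")

# Valid values as listed under "Key parameters".
LANGUAGE_TYPES = {
    "Auto", "Chinese", "English", "German", "Italian", "Portuguese",
    "Spanish", "Japanese", "Korean", "French", "Russian",
}


def build_request(text, voice="Cherry", language_type="Auto",
                  model="qwen3-tts-flash", api_key=None):
    """Build (but do not send) the POST request shown in the cURL example."""
    if language_type not in LANGUAGE_TYPES:
        raise ValueError(f"unsupported language_type: {language_type!r}")
    key = api_key or os.environ.get("DASHSCOPE_API_KEY", "")
    body = {
        "model": model,
        "input": {"text": text, "voice": voice, "language_type": language_type},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


# Sending the request requires network access and a valid API key:
# with urllib.request.urlopen(build_request("Hello!", language_type="English")) as resp:
#     print(json.load(resp))
```

Validating `language_type` locally turns a typo into an immediate `ValueError` rather than a failed API call.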

Real-time playback

Stream audio data in Base64 format. The last packet contains the URL for the complete audio file.

Python

# DashScope SDK version 1.24.6 or later
# coding=utf-8
#
# Installation instructions for pyaudio:
# APPLE Mac OS X
#   brew install portaudio
#   pip install pyaudio
# Debian/Ubuntu
#   sudo apt-get install python-pyaudio python3-pyaudio
#   or
#   pip install pyaudio
# CentOS
#   sudo yum install -y portaudio portaudio-devel && pip install pyaudio
# Microsoft Windows
#   python -m pip install pyaudio

import os
import dashscope
import pyaudio
import time
import base64
import numpy as np

# This is the URL for the Singapore region. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

p = pyaudio.PyAudio()
# Create an audio stream
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=24000,
                output=True)


text = "Today is a wonderful day to build something people love!"
response = dashscope.MultiModalConversation.call(
    # API keys for the Singapore and China (Beijing) regions are different. To get an API key: https://www.alibabacloud.com/help/en/model-studio/get-api-key
    # If you have not configured an environment variable, replace the following line with your Model Studio API key: api_key = "sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen3-tts-flash",
    text=text,
    voice="Cherry",
    language_type="English", # We recommend that this matches the text language to ensure correct pronunciation and natural intonation.
    stream=True
)

for chunk in response:
    if chunk.output is not None:
        audio = chunk.output.audio
        if audio.data is not None:
            # Each chunk carries Base64-encoded 16-bit PCM samples
            pcm_bytes = base64.b64decode(audio.data)
            audio_np = np.frombuffer(pcm_bytes, dtype=np.int16)
            # Play the audio data directly
            stream.write(audio_np.tobytes())
        if chunk.output.finish_reason == "stop":
            print("finish at: {}".format(chunk.output.audio.expires_at))
time.sleep(0.8)
# Clean up resources
stream.stop_stream()
stream.close()
p.terminate()
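If the streamed audio should be saved rather than played, the Base64 chunks can be decoded and written into a WAV container with the standard `wave` module. This is a minimal sketch: `write_wav` is a hypothetical helper, and the defaults (24 kHz, mono, 16-bit samples) are taken from the playback settings in the example above.

```python
import base64
import wave


def write_wav(b64_chunks, path, sample_rate=24000, channels=1, sample_width=2):
    """Decode Base64 PCM chunks and write them to a WAV file at path."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)   # bytes per sample; 2 means 16-bit
        wav.setframerate(sample_rate)
        for chunk in b64_chunks:
            if chunk:                    # skip packets that carry no audio data
                wav.writeframes(base64.b64decode(chunk))
    return path


# Usage with the streaming response from the example above:
# write_wav((c.output.audio.data for c in response if c.output is not None),
#           "streamed.wav")
```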

Java

You must import the Gson dependency. If you use Maven or Gradle, you can add the dependency as follows:

Maven

Add the following content to pom.xml:

<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.13.1</version>
</dependency>

Gradle

Add the following content to build.gradle:

// https://mvnrepository.com/artifact/com.google.code.gson/gson
implementation("com.google.code.gson:gson:2.13.1")

// Install the latest version of the DashScope SDK
// Versions 2.20.7 and later support the Dylan, Jada, and Sunny voices.
import com.alibaba.dashscope.aigc.multimodalconversation.AudioParameters;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.protocol.Protocol;
import com.alibaba.dashscope.utils.Constants;
import io.reactivex.Flowable;
import javax.sound.sampled.*;
import java.util.Base64;

public class Main {
    private static final String MODEL = "qwen3-tts-flash";
    public static void streamCall() throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                // API keys for the Singapore and China (Beijing) regions are different. To get an API key: https://www.alibabacloud.com/help/en/model-studio/get-api-key
                // If you have not configured an environment variable, replace the following line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model(MODEL)
                .text("Today is a wonderful day to build something people love!")
                .voice(AudioParameters.Voice.CHERRY)
                .languageType("English") // We recommend that this matches the text language to ensure correct pronunciation and natural intonation.
                .build();
        Flowable<MultiModalConversationResult> result = conv.streamCall(param);
        result.blockingForEach(r -> {
            try {
                // 1. Get the Base64-encoded audio data
                String base64Data = r.getOutput().getAudio().getData();
                byte[] audioBytes = Base64.getDecoder().decode(base64Data);

                // 2. Configure the audio format (adjust based on the format returned by the API)
                AudioFormat format = new AudioFormat(
                        AudioFormat.Encoding.PCM_SIGNED,
                        24000, // Sample rate in Hz (must match the format returned by the API)
                        16,    // Sample size in bits
                        1,     // Number of channels
                        2,     // Frame size in bytes (sample size / 8 * channels)
                        24000, // Frame rate; for PCM this equals the sample rate
                        false  // Big-endian flag; false means little-endian
                );

                // 3. Play the audio data in real time
                DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
                try (SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info)) {
                    if (line != null) {
                        line.open(format);
                        line.start();
                        line.write(audioBytes, 0, audioBytes.length);
                        line.drain();
                    }
                }
            } catch (LineUnavailableException e) {
                e.printStackTrace();
            }
        });
    }
    public static void main(String[] args) {
        // This is the URL for the Singapore region. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
        try {
            streamCall();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

cURL

# ======= Important =======
# This is the URL for the Singapore region. If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# API keys for the Singapore and China (Beijing) regions are different. To get an API key: https://www.alibabacloud.com/help/en/model-studio/get-api-key
# === Delete this comment before execution ===

curl -X POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
    "model": "qwen3-tts-flash",
    "input": {
        "text": "Today is a wonderful day to build something people love!",
        "voice": "Cherry",
        "language_type": "English"
    }
}'
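With the `X-DashScope-SSE: enable` header, the response arrives as a server-sent event stream. The sketch below parses the `data:` lines of such a stream into JSON objects following the generic SSE framing rules. `iter_sse_json` is a hypothetical helper, and the payload shape in the usage comment (`output.audio.data`) is an assumption based on the field names in the Python streaming example.

```python
import json


def iter_sse_json(lines):
    """Yield the JSON payload of each event from an iterable of SSE text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):   # SSE strips one optional leading space
                value = value[1:]
            data_parts.append(value)
        elif line == "" and data_parts:  # a blank line ends the current event
            yield json.loads("\n".join(data_parts))
            data_parts = []
    if data_parts:                       # stream ended without a trailing blank line
        yield json.loads("\n".join(data_parts))


# Usage (assumed payload shape):
# for event in iter_sse_json(response_lines):
#     b64_audio = event.get("output", {}).get("audio", {}).get("data")
```

Lines other than `data:` (for example `id:` or `event:`) are ignored in this sketch.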

Supported voices

Qwen3-TTS-Flash

Supported voices vary by model. Set the voice request parameter to the value in the voice parameter column of the table.

View supported voices of qwen3-tts-flash-2025-11-27

Name

voice parameter

Sample

Description

Supported languages

Cherry

Cherry

A cheerful, friendly, and natural young woman's voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Serena

Serena

A gentle young woman's voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Ethan

Ethan

Standard Mandarin with a slight northern accent. A bright, warm, and energetic voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Chelsie

Chelsie

An anime-style virtual girlfriend voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Momo

Momo

A playful and cute voice to cheer you up.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Vivian

Vivian

A cool, cute, and slightly grumpy voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Moon

Moon

The dashing and carefree Yuebai

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Maia

Maia

A voice that blends intelligence and gentleness.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Kai

Kai

A spa for your ears.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Nofish

Nofish

A designer who does not use retroflex consonants.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Bella

Bella

A loli who drinks but does not practice Drunken Fist.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Jennifer

Jennifer

A premium, cinematic American English female voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Ryan

Ryan

A rhythmic, dramatic voice with realism and tension.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Katerina

Katerina

A mature and rhythmic female voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Aiden

Aiden

An American young man who is a great cook.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Eldric Sage

Eldric Sage

A calm, wise, and weathered old man's voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Mia

Mia

As gentle as spring water, as quiet as the first snow

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Mochi

Mochi

A smart and precocious child's voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Bellona

Bellona

A powerful and sonorous voice with clear articulation that brings characters to life and stirs passion in the listener.

The clash of swords and the thunder of hooves echo in your mind, as a world of a thousand voices unfolds in perfectly clear and resonant tones.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Vincent

Vincent

A unique, raspy voice that evokes epic tales of heroism.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Meng Xiaoji

Bunny

A little loli brimming with moe appeal.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Neil

Neil

A professional news anchor's voice with a clear, steady tone.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Elias

Elias

Explains complex topics with academic rigor and clear storytelling.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Arthur

Arthur

A rustic, weathered voice of an old storyteller.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Nini

Nini

A voice as soft and sweet as a mochi. Its drawn-out calls of 'older brother' are heart-meltingly sweet.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Ebona

Ebona

A whispering voice that unlocks your deepest childhood fears.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Seren

Seren

A gentle and soothing voice to help you fall asleep. Good night and sweet dreams.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Pip

Pip

Mischievous yet full of childlike innocence. Is this the Shin-chan you remember?

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Stella

Stella

A sweet magical girl voice that is both ditsy and powerful.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Bodega

Bodega

Enthusiastic Spanish uncle

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Sonrisa

Sonrisa

A warm and cheerful Latin American lady

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Alek

Alek

A Russian voice that is both cool and warm.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Dolce

Dolce

Lazy Italian uncle

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Sohee

Sohee

A gentle, cheerful, and expressive Korean older sister's voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Ono Anna

Ono Anna

A quirky and clever childhood friend's voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Lenn

Lenn

Rational at the core and rebellious in the details—a young German who wears a suit and listens to post-punk.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Emilien

Emilien

A romantic French gentleman

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Andre

Andre

A magnetic, natural, and calm male voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Radio Gol

Radio Gol

Football Poet Rádio Gol! Today, I will use names to call the football game for you.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Shanghai-Jada

Jada

A lively woman from Shanghai.

Chinese (Shanghainese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Beijing-Dylan

Dylan

A teenager who grew up in the hutongs of Beijing.

Chinese (Beijing dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Nanjing-Li

Li

A patient yoga teacher.

Chinese (Nanjing dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Shaanxi-Marcus

Marcus

A sincere and deep voice from Shaanxi.

Chinese (Shaanxi dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Minnan-Roy

Roy

A humorous, straightforward, and lively young man from Taiwan.

Chinese (Min Nan), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Tianjin-Peter

Peter

A voice for the straight man in Tianjin crosstalk.

Chinese (Tianjin dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Sichuan-Sunny

Sunny

A sweet Sichuan girl's voice that will melt your heart.

Chinese (Sichuanese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Sichuan-Eric

Eric

A man from Chengdu, Sichuan, who has risen above the mundane.

Chinese (Sichuanese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Cantonese-Rocky

Rocky

A witty and humorous male voice for online chats.

Chinese (Cantonese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Cantonese-Kiki

Kiki

A sweet best friend from Hong Kong.

Chinese (Cantonese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

View supported voices of qwen3-tts-flash, qwen3-tts-flash-2025-09-18

Name

voice parameter

Sample

Description

Supported languages

Cherry

Cherry

A cheerful, friendly, and natural young woman's voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Ethan

Ethan

Standard Mandarin with a slight northern accent. A bright, warm, and energetic voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Nofish

Nofish

A designer who does not use retroflex consonants.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Jennifer

Jennifer

A premium, cinematic American English female voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Ryan

Ryan

A rhythmic, dramatic voice with realism and tension.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Katerina

Katerina

A mature and rhythmic female voice.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Elias

Elias

Explains complex topics with academic rigor and clear storytelling.

Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Shanghai-Jada

Jada

A lively woman from Shanghai.

Chinese (Shanghainese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Beijing-Dylan

Dylan

A teenager who grew up in the hutongs of Beijing.

Chinese (Beijing dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Sichuan-Sunny

Sunny

A sweet Sichuan girl's voice that will melt your heart.

Chinese (Sichuanese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Nanjing-Li

Li

A patient yoga teacher.

Chinese (Nanjing dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Shaanxi-Marcus

Marcus

A sincere and deep voice from Shaanxi.

Chinese (Shaanxi dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Minnan-Roy

Roy

A humorous, straightforward, and lively young man from Taiwan.

Chinese (Min Nan), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Tianjin-Peter

Peter

A voice for the straight man in Tianjin crosstalk.

Chinese (Tianjin dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Cantonese-Rocky

Rocky

A witty and humorous male voice for online chats.

Chinese (Cantonese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Cantonese-Kiki

Kiki

A sweet best friend from Hong Kong.

Chinese (Cantonese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Sichuan-Eric

Eric

An extraordinary man from Chengdu, Sichuan

Chinese (Sichuanese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean

Qwen-TTS

Supported voices vary by model. Set the voice request parameter to the value in the voice parameter column of the table.

View supported voices of qwen-tts, qwen-tts-2025-04-10

Name

voice parameter

Sample

Description

Supported languages

Cherry

Cherry

A cheerful, friendly, and natural young woman's voice.

Chinese, English

Serena

Serena

A gentle young woman's voice.

Chinese, English

Ethan

Ethan

Standard Mandarin with a slight northern accent. A bright, warm, and energetic voice.

Chinese, English

Chelsie

Chelsie

An anime-style virtual girlfriend voice.

Chinese, English

View supported voices of qwen-tts-latest, qwen-tts-2025-05-22

Name

voice parameter

Sample

Description

Supported languages

Cherry

Cherry

A cheerful, friendly, and natural young woman's voice.

Chinese, English

Serena

Serena

A gentle young woman's voice.

Chinese, English

Ethan

Ethan

Standard Mandarin with a slight northern accent. A bright, warm, and energetic voice.

Chinese, English

Chelsie

Chelsie

An anime-style virtual girlfriend voice.

Chinese, English

Sichuan-Sunny

Sunny

A sweet Sichuan girl's voice that will melt your heart.

Chinese (Sichuanese), English

Shanghai-Jada

Jada

A lively woman from Shanghai.

Chinese (Shanghainese), English

Beijing-Dylan

Dylan

A teenager who grew up in the hutongs of Beijing.

Chinese (Beijing dialect), English
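Because the voice set depends on the model, it can help to check a voice parameter before sending a request. The sketch below encodes part of the tables above as a lookup. `VOICES_BY_MODEL` and `is_supported` are hypothetical names, and the mapping is deliberately abridged: it covers only qwen3-tts-flash, qwen3-tts-flash-2025-09-18, qwen-tts, qwen-tts-2025-04-10, and qwen-tts-latest.

```python
# Voice parameter values copied from the tables above (abridged model list).
_QWEN3_FLASH = frozenset({
    "Cherry", "Ethan", "Nofish", "Jennifer", "Ryan", "Katerina", "Elias",
    "Jada", "Dylan", "Sunny", "Li", "Marcus", "Roy", "Peter", "Rocky",
    "Kiki", "Eric",
})
_QWEN_TTS = frozenset({"Cherry", "Serena", "Ethan", "Chelsie"})
_QWEN_TTS_LATEST = _QWEN_TTS | {"Sunny", "Jada", "Dylan"}

VOICES_BY_MODEL = {
    "qwen3-tts-flash": _QWEN3_FLASH,
    "qwen3-tts-flash-2025-09-18": _QWEN3_FLASH,
    "qwen-tts": _QWEN_TTS,
    "qwen-tts-2025-04-10": _QWEN_TTS,
    "qwen-tts-latest": _QWEN_TTS_LATEST,
}


def is_supported(model: str, voice: str) -> bool:
    """Return True if voice is listed for model in the abridged mapping."""
    if model not in VOICES_BY_MODEL:
        raise KeyError(f"model not in this abridged mapping: {model}")
    return voice in VOICES_BY_MODEL[model]


print(is_supported("qwen-tts", "Sunny"))         # False
print(is_supported("qwen-tts-latest", "Sunny"))  # True
```

Note that the lookup uses the voice parameter value (for example `Jada`), not the display name (`Shanghai-Jada`).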

API reference

For more information, see Speech synthesis (Qwen-TTS).

FAQ

Q: How long is the audio file URL valid?

A: The audio file URL expires after 24 hours.