Qwen Voice Design API Reference - Alibaba Cloud Model Studio

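Voice design generates a custom voice from a text description. It supports multiple languages and multi-dimensional voice characteristics, and suits use cases such as advertising voice-overs, character creation, and audio content production. Voice design and speech synthesis are two sequential, linked steps. This document covers the parameters and interface details of voice design; for speech synthesis, see Realtime Speech Synthesis - Qwen or Speech Synthesis - Qwen.

User guide: for model overviews and selection guidance, see Realtime Speech Synthesis - Qwen or Speech Synthesis - Qwen.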

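Supported languages

The voice design service supports voice creation and speech synthesis in the following languages: Chinese (zh), English (en), German (de), Italian (it), Portuguese (pt), Spanish (es), Japanese (ja), Korean (ko), French (fr), and Russian (ru).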

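How to write a high-quality voice description

Requirements and limits

When writing a voice description (voice_prompt), follow these technical constraints:

Length limit: voice_prompt must not exceed 2048 characters.
Supported languages: the description text may only be written in Chinese or English.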
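The length limit above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name check_voice_prompt is hypothetical, and it assumes that "characters" corresponds to Python's len() on the string, which may differ from how the service counts. It does not attempt to verify the description language.

```python
MAX_VOICE_PROMPT_CHARS = 2048  # limit stated in this document


def check_voice_prompt(voice_prompt):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if not voice_prompt.strip():
        problems.append("voice_prompt is empty")
    # Assumption: the service counts characters the same way len() does.
    if len(voice_prompt) > MAX_VOICE_PROMPT_CHARS:
        problems.append(
            f"voice_prompt has {len(voice_prompt)} characters; "
            f"the limit is {MAX_VOICE_PROMPT_CHARS}"
        )
    return problems


print(check_voice_prompt("A calm middle-aged male voice, slow pace, low pitch."))
```

Running the check locally avoids a round trip for requests that would be rejected for length.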

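Core principles

A high-quality voice description (voice_prompt) is the key to creating the voice you want. It acts as the blueprint for voice design and directly guides the model toward a voice with specific characteristics.

Follow these core principles when describing a voice:

Specific, not vague: use words that depict concrete vocal qualities, such as "deep", "crisp", or "fairly fast pace". Avoid subjective, low-information words such as "nice" or "ordinary".
Multi-dimensional, not single-dimensional: good descriptions usually combine several dimensions (gender, age, emotion, and so on, as listed below). A single-dimension description such as just "female voice" is too broad to yield a distinctive voice.
Objective, not subjective: focus on the physical and perceptual characteristics of the voice itself rather than personal preference. For example, write "higher pitch, energetic" instead of "my favorite voice".
Original, not imitative: describe the qualities of the voice instead of asking to imitate a specific person (such as a celebrity or actor). Such requests carry copyright risk, and the model does not support direct imitation.
Concise, not redundant: make every word count. Avoid repeating synonyms or empty intensifiers (such as "a very very great voice").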

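Description dimensions reference

Dimension	Example descriptions
Gender	Male, female, neutral
Age	Child (5-12), teenager (13-18), young adult (19-35), middle-aged (36-55), senior (over 55)
Pitch	High, medium, low, fairly high, fairly low
Pace	Fast, medium, slow, fairly fast, fairly slow
Emotion	Cheerful, composed, gentle, serious, lively, calm, soothing
Timbre	Magnetic, crisp, husky, mellow, sweet, rich, powerful
Use case	News broadcasting, advertising voice-over, audiobooks, animated characters, voice assistants, documentary narration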
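One way to make sure a description covers several dimensions is to assemble it from named fields. The sketch below is purely illustrative: build_voice_prompt and its parameter names are hypothetical and are not part of the voice design API, which only receives the final string.

```python
def build_voice_prompt(gender, age, pitch=None, pace=None, emotion=None,
                       timbre=None, use_case=None):
    """Join the supplied dimensions into one comma-separated English description."""
    parts = [f"{age} {gender} voice"]
    if pitch:
        parts.append(f"{pitch} pitch")
    if pace:
        parts.append(f"{pace} speaking pace")
    if emotion:
        parts.append(f"{emotion} tone")
    if timbre:
        parts.append(f"{timbre} timbre")
    description = ", ".join(parts)
    if use_case:
        description += f", suitable for {use_case}"
    # Capitalize the first letter and end with a period.
    return description[0].upper() + description[1:] + "."


prompt = build_voice_prompt(
    gender="male", age="middle-aged", pitch="low", pace="slow",
    emotion="composed", timbre="magnetic", use_case="documentary narration",
)
print(prompt)
# Middle-aged male voice, low pitch, slow speaking pace, composed tone, magnetic timbre, suitable for documentary narration.
```

A template like this keeps descriptions consistent across a project, though hand-written prose is equally valid input.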

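Example comparison

✅ Recommended examples

"A young, lively female voice with a fast pace and a noticeably rising intonation, suitable for introducing fashion products."
Analysis: combines age, personality, pace, and intonation, and names the use case, giving a well-rounded picture.
"A composed middle-aged man with a slow pace and a deep, magnetic timbre, suitable for reading news or narrating documentaries."
Analysis: clearly defines gender, age range, pace, timbre, and application.
"A cute child's voice, a girl of about 8, speaking with a slightly childlike tone, suitable for animated character voice-overs."
Analysis: precise about age and vocal quality (childlike), with a clear target.
"A gentle, intellectual woman of about 30 with an even intonation, suitable for audiobook narration."
Analysis: words such as "intellectual" and "even" effectively convey the emotion and style of the voice.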

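❌ Examples to avoid and suggested improvements

Example to avoid	Main problem	Suggested improvement
A nice voice	Too vague and subjective; lacks actionable characteristics.	Add specific dimensions, for example: "a young female voice with a clear timbre and a gentle intonation".
A voice like a certain celebrity	Carries copyright risk; the model cannot imitate directly.	Describe the vocal qualities instead, for example: "a mature, magnetic male voice with a steady pace".
A very very very nice female voice	Redundant; repeated words do not help define the voice.	Remove the repetition and add useful detail, for example: "a female voice aged 20-24 with a brisk tone, lively pitch, and sweet timbre".
123456	Invalid input; cannot be interpreted as voice characteristics.	Provide a meaningful text description; see the recommended examples above.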
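Some of the weak patterns in the table above can be caught with simple heuristics before a request is sent. The sketch below is an assumption-laden illustration, not a service feature: lint_voice_prompt and the VAGUE_TERMS list are hypothetical, substring matching can produce false positives, and a prompt that passes is not guaranteed to produce a good voice.

```python
import re

# Hypothetical word list for illustration only; extend it for your own use.
VAGUE_TERMS = ("nice", "good", "ordinary", "好聽", "普通")


def lint_voice_prompt(voice_prompt):
    """Return heuristic warnings for a voice description."""
    text = voice_prompt.strip()
    warnings = []
    # Input such as "123456" contains no Latin letters or CJK characters.
    if not re.search(r"[A-Za-z\u4e00-\u9fff]", text):
        warnings.append("no descriptive words found")
    if any(term in text.lower() for term in VAGUE_TERMS):
        warnings.append("contains a vague, subjective term")
    # A run of two or more characters repeated three or more times in a row,
    # for example "very very very" or "非常非常非常".
    if re.search(r"(.{2,}?)\1{2,}", text):
        warnings.append("contains a word or phrase repeated three or more times")
    return warnings


for sample in ("123456", "非常非常非常好聽的女聲", "沉穩的中年男性，語速緩慢，音色低沉有磁性"):
    print(sample, "->", lint_voice_prompt(sample))
```

The checks only flag obvious problems; judging whether a description is specific and multi-dimensional still needs a human reader.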

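Quick start: from voice design to speech synthesis

1. Workflow

Voice design and speech synthesis are two separate but closely linked steps that follow a "create first, then use" flow:

Step 1: Prepare the voice description and preview text required for voice design.
- Voice description (voice_prompt): defines the characteristics of the target voice (for guidance, see the section on writing a high-quality voice description above).
- Preview text (preview_text): the text read aloud in the preview audio generated for the target voice (for example, "Hello everyone, welcome").
Step 2: Call the create-voice interface to create a custom voice and obtain the voice name and preview audio.
- This step must specify target_model, which declares the speech synthesis model that will drive the new voice.
- Listen to the preview audio to check whether it meets expectations. If it does, continue to the next step; otherwise, redesign the voice.
- If you already have a created voice (call the list-voices interface to check), you can skip this step and go straight to the next one.
Step 3: Use the voice for speech synthesis.
- Call the speech synthesis interface and pass in the voice obtained in the previous step. The speech synthesis model specified here must match the target_model from the previous step.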
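The create-then-use flow can be sketched without any network calls. In this illustration the request field names follow the sample requests in this document, while the helper names build_create_request and check_same_model and the default preferred_name are assumptions made for the example.

```python
def build_create_request(voice_prompt, preview_text, target_model,
                         preferred_name="myvoice", language="en"):
    """Build the JSON body for the create-voice request."""
    return {
        "model": "qwen-voice-design",
        "input": {
            "action": "create",
            "target_model": target_model,
            "voice_prompt": voice_prompt,
            "preview_text": preview_text,
            "preferred_name": preferred_name,
            "language": language,
        },
    }


def check_same_model(target_model, synthesis_model):
    """Fail early when synthesis would use a different model than the voice was created for."""
    if target_model != synthesis_model:
        raise ValueError(
            f"voice was created for {target_model!r} but synthesis uses {synthesis_model!r}"
        )


request_body = build_create_request(
    voice_prompt="A composed middle-aged male voice, slow pace, deep and magnetic.",
    preview_text="Hello everyone, welcome to the evening news.",
    target_model="qwen3-tts-vd-realtime-2026-01-15",
)
# Before synthesis, confirm the model matches the one the voice was created for.
check_same_model(request_body["input"]["target_model"], "qwen3-tts-vd-realtime-2026-01-15")
print(request_body["input"]["target_model"])
```

Storing the target_model next to the returned voice name makes the consistency check straightforward at synthesis time.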

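2. Model configuration and preparation

Choose a suitable model and complete the preparation.

Model configuration

Voice design requires you to specify two models:

Voice design model: qwen-voice-design
Speech synthesis model that drives the voice (two families):
- Qwen3-TTS-VD-Realtime (see Realtime Speech Synthesis - Qwen):
  - qwen3-tts-vd-realtime-2026-01-15
  - qwen3-tts-vd-realtime-2025-12-16
- Qwen3-TTS-VD (see Speech Synthesis - Qwen):
  - qwen3-tts-vd-2026-01-26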
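Because the chosen target_model determines which synthesis interface is used later, it can help to keep the model names in one place. The sketch below copies the model names from the list above; the function name synthesis_mode_for and the labels "realtime" and "non-realtime" are illustrative assumptions, and the list may change as new model versions are released.

```python
# Model names copied from the list above.
REALTIME_MODELS = {
    "qwen3-tts-vd-realtime-2026-01-15",
    "qwen3-tts-vd-realtime-2025-12-16",
}
NON_REALTIME_MODELS = {
    "qwen3-tts-vd-2026-01-26",
}


def synthesis_mode_for(target_model):
    """Return which synthesis family a target_model belongs to."""
    if target_model in REALTIME_MODELS:
        return "realtime"
    if target_model in NON_REALTIME_MODELS:
        return "non-realtime"
    raise ValueError(f"unknown target_model: {target_model!r}")


print(synthesis_mode_for("qwen3-tts-vd-realtime-2026-01-15"))  # realtime
print(synthesis_mode_for("qwen3-tts-vd-2026-01-26"))           # non-realtime
```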

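Preparation

Get an API key: obtain an API key and the API host. For security, store the API key in an environment variable.
Install the SDK: make sure the latest DashScope SDK is installed.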
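A small helper can read the key from the environment and fail with a clear message when it is missing. This is a minimal sketch: the variable name DASHSCOPE_API_KEY is the one used by the samples in this document, and the helper name load_api_key is hypothetical.

```python
import os


def load_api_key(env_var="DASHSCOPE_API_KEY"):
    """Read the API key from an environment variable, failing loudly if it is absent."""
    api_key = os.getenv(env_var)
    if not api_key:
        raise RuntimeError(f"{env_var} is not set; export it before calling the API")
    return api_key


if __name__ == "__main__":
    try:
        key = load_api_key()
        # Print only the length so the key itself never appears in logs.
        print("API key loaded, length:", len(key))
    except RuntimeError as err:
        print(err)
```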

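3. Sample code

Bidirectional streaming synthesis

Applies to the Qwen3-TTS-VD-Realtime model family. For details, see Realtime Speech Synthesis - Qwen.

Generate a custom voice and listen to the preview. If you are satisfied, go to the next step; otherwise, create the voice again.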

Python

import requests
import base64
import os

def create_voice_and_play():
    # 新加坡和北京地區的API Key不同。擷取API Key：https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # 若沒有配置環境變數，請用百鍊API Key將下行替換為：api_key = "sk-xxx"
    api_key = os.getenv("DASHSCOPE_API_KEY")
    
    if not api_key:
        print("錯誤: 未找到DASHSCOPE_API_KEY環境變數，請先設定API Key")
        return None, None, None
    
    # 準備請求資料
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    data = {
        "model": "qwen-voice-design",
        "input": {
            "action": "create",
            "target_model": "qwen3-tts-vd-realtime-2026-01-15",
            "voice_prompt": "A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, is suitable for news broadcasting or documentary commentary.",
            "preview_text": "Dear listeners, hello everyone. Welcome to the evening news.",
            "preferred_name": "announcer",
            "language": "en"
        },
        "parameters": {
            "sample_rate": 24000,
            "response_format": "wav"
        }
    }
    
    # 以下為新加坡地區url，若使用北京地區的模型，需將url替換為：https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
    url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
    
    try:
        # 發送請求
        response = requests.post(
            url,
            headers=headers,
            json=data,
            timeout=60  # 添加逾時設定
        )
        
        if response.status_code == 200:
            result = response.json()
            
            # 擷取音色名稱
            voice_name = result["output"]["voice"]
            print(f"音色名稱: {voice_name}")
            
            # 擷取預覽音頻資料
            base64_audio = result["output"]["preview_audio"]["data"]
            
            # 解碼Base64音頻資料
            audio_bytes = base64.b64decode(base64_audio)
            
            # 儲存音頻檔案到本地
            filename = f"{voice_name}_preview.wav"
            
            # 將音頻資料寫入本地檔案
            with open(filename, 'wb') as f:
                f.write(audio_bytes)
            
            print(f"音頻已儲存到本地檔案: {filename}")
            print(f"檔案路徑: {os.path.abspath(filename)}")
            
            return voice_name, audio_bytes, filename
        else:
            print(f"請求失敗，狀態代碼: {response.status_code}")
            print(f"響應內容: {response.text}")
            return None, None, None
            
    except requests.exceptions.RequestException as e:
        print(f"網路請求發生錯誤: {e}")
        return None, None, None
    except KeyError as e:
        print(f"響應資料格式錯誤，缺少必要的欄位: {e}")
        print(f"響應內容: {response.text if 'response' in locals() else 'No response'}")
        return None, None, None
    except Exception as e:
        print(f"發生未知錯誤: {e}")
        return None, None, None

if __name__ == "__main__":
    print("開始建立語音...")
    voice_name, audio_data, saved_filename = create_voice_and_play()
    
    if voice_name:
        print(f"\n成功建立音色 '{voice_name}'")
        print(f"音頻檔案已儲存: '{saved_filename}'")
        print(f"檔案大小: {os.path.getsize(saved_filename)} 位元組")
    else:
        print("\n音色建立失敗")
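Before listening, the saved preview can be sanity-checked against the requested sample_rate. The sketch below uses Python's standard wave module and assumes the preview is a standard PCM WAV file with a valid header; describe_wav is a hypothetical helper name.

```python
import wave


def describe_wav(path):
    """Read basic properties from a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        frame_count = wav_file.getnframes()
        return {
            "sample_rate": sample_rate,
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "duration_seconds": frame_count / sample_rate,
        }
```

Calling describe_wav on the file written by the sample above should report a sample_rate of 24000 if the service honored the request parameters.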

Java

The Gson dependency is required. If you use Maven or Gradle, add it as follows:

Maven

Add the following to pom.xml:

<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.13.1</version>
</dependency>

Gradle

Add the following to build.gradle:

// https://mvnrepository.com/artifact/com.google.code.gson/gson
implementation("com.google.code.gson:gson:2.13.1")

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class Main {
    public static void main(String[] args) {
        Main example = new Main();
        example.createVoice();
    }

    public void createVoice() {
        // 新加坡和北京地區的API Key不同。擷取API Key：https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // 若沒有配置環境變數，請用百鍊API Key將下行替換為：String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        // 建立JSON請求體字串
        String jsonBody = "{\n" +
                "    \"model\": \"qwen-voice-design\",\n" +
                "    \"input\": {\n" +
                "        \"action\": \"create\",\n" +
                "        \"target_model\": \"qwen3-tts-vd-realtime-2026-01-15\",\n" +
                "        \"voice_prompt\": \"A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, is suitable for news broadcasting or documentary commentary.\",\n" +
                "        \"preview_text\": \"Dear listeners, hello everyone. Welcome to the evening news.\",\n" +
                "        \"preferred_name\": \"announcer\",\n" +
                "        \"language\": \"en\"\n" +
                "    },\n" +
                "    \"parameters\": {\n" +
                "        \"sample_rate\": 24000,\n" +
                "        \"response_format\": \"wav\"\n" +
                "    }\n" +
                "}";

        HttpURLConnection connection = null;
        try {
            // 以下為新加坡地區url，若使用北京地區的模型，需將url替換為：https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
            URL url = new URL("https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization");
            connection = (HttpURLConnection) url.openConnection();

            // Set the request method and headers
            connection.setRequestMethod("POST");
            connection.setRequestProperty("Authorization", "Bearer " + apiKey);
            connection.setRequestProperty("Content-Type", "application/json");
            connection.setDoOutput(true);
            connection.setDoInput(true);

            // 發送請求體
            try (OutputStream os = connection.getOutputStream()) {
                byte[] input = jsonBody.getBytes("UTF-8");
                os.write(input, 0, input.length);
                os.flush();
            }

            // 擷取響應
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // 讀取響應內容
                StringBuilder response = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        response.append(responseLine.trim());
                    }
                }

                // 解析JSON響應
                JsonObject jsonResponse = JsonParser.parseString(response.toString()).getAsJsonObject();
                JsonObject outputObj = jsonResponse.getAsJsonObject("output");
                JsonObject previewAudioObj = outputObj.getAsJsonObject("preview_audio");

                // 擷取音色名稱
                String voiceName = outputObj.get("voice").getAsString();
                System.out.println("音色名稱: " + voiceName);

                // 擷取Base64編碼的音頻資料
                String base64Audio = previewAudioObj.get("data").getAsString();

                // 解碼Base64音頻資料
                byte[] audioBytes = Base64.getDecoder().decode(base64Audio);

                // 儲存音頻到本地檔案
                String filename = voiceName + "_preview.wav";
                saveAudioToFile(audioBytes, filename);

                System.out.println("音頻已儲存到本地檔案: " + filename);

            } else {
                // 讀取錯誤響應
                StringBuilder errorResponse = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getErrorStream(), "UTF-8"))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        errorResponse.append(responseLine.trim());
                    }
                }

                System.out.println("請求失敗，狀態代碼: " + responseCode);
                System.out.println("錯誤響應: " + errorResponse.toString());
            }

        } catch (Exception e) {
            System.err.println("請求發生錯誤: " + e.getMessage());
            e.printStackTrace();
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    private void saveAudioToFile(byte[] audioBytes, String filename) {
        try {
            File file = new File(filename);
            try (FileOutputStream fos = new FileOutputStream(file)) {
                fos.write(audioBytes);
            }
            System.out.println("音頻已儲存到: " + file.getAbsolutePath());
        } catch (IOException e) {
            System.err.println("儲存音頻檔案時發生錯誤: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

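Use the custom voice generated in the previous step for speech synthesis.

This example is based on the DashScope SDK "server commit mode" sample code for speech synthesis with system voices, with the voice parameter replaced by the custom voice generated by voice design.

Key principle: the model used for voice design (target_model) must be the same as the model used later for speech synthesis (model); otherwise synthesis fails.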

Python

# coding=utf-8
# Installation instructions for pyaudio:
# APPLE Mac OS X
#   brew install portaudio
#   pip install pyaudio
# Debian/Ubuntu
#   sudo apt-get install python-pyaudio python3-pyaudio
#   or
#   pip install pyaudio
# CentOS
#   sudo yum install -y portaudio portaudio-devel && pip install pyaudio
# Microsoft Windows
#   python -m pip install pyaudio

import pyaudio
import os
import base64
import threading
import time
import dashscope  # Requires DashScope Python SDK 1.23.9 or later
from dashscope.audio.qwen_tts_realtime import QwenTtsRealtime, QwenTtsRealtimeCallback, AudioFormat

# ======= 常量配置 =======
TEXT_TO_SYNTHESIZE = [
    '對吧~我就特別喜歡這種超市，',
    '尤其是過年的時候',
    '去逛超市',
    '就會覺得',
    '超級超級開心！',
    '想買好多好多的東西呢！'
]

def init_dashscope_api_key():
    """
    初始化 dashscope SDK 的 API key
    """
    # 新加坡和北京地區的API Key不同。擷取API Key：https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # 若沒有配置環境變數，請用百鍊API Key將下行替換為：dashscope.api_key = "sk-xxx"
    dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")

# ======= 回調類 =======
class MyCallback(QwenTtsRealtimeCallback):
    """
    自訂 TTS 流式回調
    """
    def __init__(self):
        self.complete_event = threading.Event()
        self._player = pyaudio.PyAudio()
        self._stream = self._player.open(
            format=pyaudio.paInt16, channels=1, rate=24000, output=True
        )

    def on_open(self) -> None:
        print('[TTS] Connection established')

    def on_close(self, close_status_code, close_msg) -> None:
        self._stream.stop_stream()
        self._stream.close()
        self._player.terminate()
        print(f'[TTS] Connection closed code={close_status_code}, msg={close_msg}')

    def on_event(self, response: dict) -> None:
        try:
            event_type = response.get('type', '')
            if event_type == 'session.created':
                print(f'[TTS] 會話開始: {response["session"]["id"]}')
            elif event_type == 'response.audio.delta':
                audio_data = base64.b64decode(response['delta'])
                self._stream.write(audio_data)
            elif event_type == 'response.done':
                print(f'[TTS] 響應完成, Response ID: {qwen_tts_realtime.get_last_response_id()}')
            elif event_type == 'session.finished':
                print('[TTS] 會話結束')
                self.complete_event.set()
        except Exception as e:
            print(f'[Error] 處理回調事件異常: {e}')

    def wait_for_finished(self):
        self.complete_event.wait()

# ======= 主執行邏輯 =======
if __name__ == '__main__':
    init_dashscope_api_key()
    print('[系統] 初始化 Qwen TTS Realtime ...')

    callback = MyCallback()
    qwen_tts_realtime = QwenTtsRealtime(
        # 聲音設計、語音合成要使用相同的模型
        model="qwen3-tts-vd-realtime-2026-01-15",
        callback=callback,
        # 以下為新加坡地區url，若使用北京地區的模型，需將url替換為：wss://dashscope.aliyuncs.com/api-ws/v1/realtime
        url='wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime'
    )
    qwen_tts_realtime.connect()
    
    qwen_tts_realtime.update_session(
        voice="myvoice", # 將voice參數替換為聲音設計產生的專屬音色
        response_format=AudioFormat.PCM_24000HZ_MONO_16BIT,
        mode='server_commit'
    )

    for text_chunk in TEXT_TO_SYNTHESIZE:
        print(f'[發送文本]: {text_chunk}')
        qwen_tts_realtime.append_text(text_chunk)
        time.sleep(0.1)

    qwen_tts_realtime.finish()
    callback.wait_for_finished()

    print(f'[Metric] session_id={qwen_tts_realtime.get_session_id()}, '
          f'first_audio_delay={qwen_tts_realtime.get_first_audio_delay()}s')

Java

import com.alibaba.dashscope.audio.qwen_tts_realtime.*;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.google.gson.JsonObject;

import javax.sound.sampled.*;
import java.io.*;
import java.util.Base64;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

public class Main {
    // ===== 常量定義 =====
    private static String[] textToSynthesize = {
            "對吧~我就特別喜歡這種超市",
            "尤其是過年的時候",
            "去逛超市",
            "就會覺得",
            "超級超級開心！",
            "想買好多好多的東西呢！"
    };

    // 即時音頻播放器類
    public static class RealtimePcmPlayer {
        private int sampleRate;
        private SourceDataLine line;
        private AudioFormat audioFormat;
        private Thread decoderThread;
        private Thread playerThread;
        private AtomicBoolean stopped = new AtomicBoolean(false);
        private Queue<String> b64AudioBuffer = new ConcurrentLinkedQueue<>();
        private Queue<byte[]> RawAudioBuffer = new ConcurrentLinkedQueue<>();

        // 建構函式初始化音頻格式和音頻線路
        public RealtimePcmPlayer(int sampleRate) throws LineUnavailableException {
            this.sampleRate = sampleRate;
            this.audioFormat = new AudioFormat(this.sampleRate, 16, 1, true, false);
            DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
            line = (SourceDataLine) AudioSystem.getLine(info);
            line.open(audioFormat);
            line.start();
            decoderThread = new Thread(new Runnable() {
                @Override
                public void run() {
                    while (!stopped.get()) {
                        String b64Audio = b64AudioBuffer.poll();
                        if (b64Audio != null) {
                            byte[] rawAudio = Base64.getDecoder().decode(b64Audio);
                            RawAudioBuffer.add(rawAudio);
                        } else {
                            try {
                                Thread.sleep(100);
                            } catch (InterruptedException e) {
                                throw new RuntimeException(e);
                            }
                        }
                    }
                }
            });
            playerThread = new Thread(new Runnable() {
                @Override
                public void run() {
                    while (!stopped.get()) {
                        byte[] rawAudio = RawAudioBuffer.poll();
                        if (rawAudio != null) {
                            try {
                                playChunk(rawAudio);
                            } catch (IOException e) {
                                throw new RuntimeException(e);
                            } catch (InterruptedException e) {
                                throw new RuntimeException(e);
                            }
                        } else {
                            try {
                                Thread.sleep(100);
                            } catch (InterruptedException e) {
                                throw new RuntimeException(e);
                            }
                        }
                    }
                }
            });
            decoderThread.start();
            playerThread.start();
        }

        // 播放一個音頻塊並阻塞直到播放完成
        private void playChunk(byte[] chunk) throws IOException, InterruptedException {
            if (chunk == null || chunk.length == 0) return;

            int bytesWritten = 0;
            while (bytesWritten < chunk.length) {
                bytesWritten += line.write(chunk, bytesWritten, chunk.length - bytesWritten);
            }
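            int audioLength = chunk.length / (this.sampleRate*2/1000);
            // Wait for the buffered audio to finish playing; clamp at 0 because
            // Thread.sleep throws IllegalArgumentException for negative values (chunks shorter than 10 ms)
            Thread.sleep(Math.max(0, audioLength - 10));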
        }

        public void write(String b64Audio) {
            b64AudioBuffer.add(b64Audio);
        }

        public void cancel() {
            b64AudioBuffer.clear();
            RawAudioBuffer.clear();
        }

        public void waitForComplete() throws InterruptedException {
            while (!b64AudioBuffer.isEmpty() || !RawAudioBuffer.isEmpty()) {
                Thread.sleep(100);
            }
            line.drain();
        }

        public void shutdown() throws InterruptedException {
            stopped.set(true);
            decoderThread.join();
            playerThread.join();
            if (line != null && line.isRunning()) {
                line.drain();
                line.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        QwenTtsRealtimeParam param = QwenTtsRealtimeParam.builder()
                // 聲音設計、語音合成要使用相同的模型
                .model("qwen3-tts-vd-realtime-2026-01-15")
                // 以下為新加坡地區url，若使用北京地區的模型，需將url替換為：wss://dashscope.aliyuncs.com/api-ws/v1/realtime
                .url("wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime")
                // 新加坡和北京地區的API Key不同。擷取API Key：https://www.alibabacloud.com/help/zh/model-studio/get-api-key
                // 若沒有配置環境變數，請用百鍊API Key將下行替換為：.apikey("sk-xxx")
                .apikey(System.getenv("DASHSCOPE_API_KEY"))
                .build();
        AtomicReference<CountDownLatch> completeLatch = new AtomicReference<>(new CountDownLatch(1));
        final AtomicReference<QwenTtsRealtime> qwenTtsRef = new AtomicReference<>(null);

        // 建立即時音頻播放器執行個體
        RealtimePcmPlayer audioPlayer = new RealtimePcmPlayer(24000);

        QwenTtsRealtime qwenTtsRealtime = new QwenTtsRealtime(param, new QwenTtsRealtimeCallback() {
            @Override
            public void onOpen() {
                // Handle the connection being established
            }
            @Override
            public void onEvent(JsonObject message) {
                String type = message.get("type").getAsString();
                switch(type) {
                    case "session.created":
                        // 會話建立時的處理
                        break;
                    case "response.audio.delta":
                        String recvAudioB64 = message.get("delta").getAsString();
                        // 即時播放音頻
                        audioPlayer.write(recvAudioB64);
                        break;
                    case "response.done":
                        // 響應完成時的處理
                        break;
                    case "session.finished":
                        // Session finished: release the waiting main thread
                        completeLatch.get().countDown();
                        break;
                    default:
                        break;
                }
            }
            @Override
            public void onClose(int code, String reason) {
                // Handle the connection being closed
            }
        });
        qwenTtsRef.set(qwenTtsRealtime);
        try {
            qwenTtsRealtime.connect();
        } catch (NoApiKeyException e) {
            throw new RuntimeException(e);
        }
        QwenTtsRealtimeConfig config = QwenTtsRealtimeConfig.builder()
                .voice("myvoice") // 將voice參數替換為聲音設計產生的專屬音色
                .responseFormat(QwenTtsRealtimeAudioFormat.PCM_24000HZ_MONO_16BIT)
                .mode("server_commit")
                .build();
        qwenTtsRealtime.updateSession(config);
        for (String text:textToSynthesize) {
            qwenTtsRealtime.appendText(text);
            Thread.sleep(100);
        }
        qwenTtsRealtime.finish();
        completeLatch.get().await();

        // 等待音頻播放完成並關閉播放器
        audioPlayer.waitForComplete();
        audioPlayer.shutdown();
        System.exit(0);
    }
}

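Non-streaming / unidirectional streaming synthesis

Applies to the Qwen3-TTS-VD model family. For details, see Speech Synthesis - Qwen.

Generate a custom voice and listen to the preview. If you are satisfied, go to the next step; otherwise, create the voice again.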

Python

import requests
import base64
import os

def create_voice_and_play():
    # 新加坡和北京地區的API Key不同。擷取API Key：https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # 若沒有配置環境變數，請用百鍊API Key將下行替換為：api_key = "sk-xxx"
    api_key = os.getenv("DASHSCOPE_API_KEY")
    
    if not api_key:
        print("錯誤: 未找到DASHSCOPE_API_KEY環境變數，請先設定API Key")
        return None, None, None
    
    # 準備請求資料
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    data = {
        "model": "qwen-voice-design",
        "input": {
            "action": "create",
            "target_model": "qwen3-tts-vd-2026-01-26",
            "voice_prompt": "A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, is suitable for news broadcasting or documentary commentary.",
            "preview_text": "Dear listeners, hello everyone. Welcome to the evening news.",
            "preferred_name": "announcer",
            "language": "en"
        },
        "parameters": {
            "sample_rate": 24000,
            "response_format": "wav"
        }
    }
    
    # 以下為新加坡地區url，若使用北京地區的模型，需將url替換為：https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
    url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
    
    try:
        # 發送請求
        response = requests.post(
            url,
            headers=headers,
            json=data,
            timeout=60  # 添加逾時設定
        )
        
        if response.status_code == 200:
            result = response.json()
            
            # 擷取音色名稱
            voice_name = result["output"]["voice"]
            print(f"音色名稱: {voice_name}")
            
            # 擷取預覽音頻資料
            base64_audio = result["output"]["preview_audio"]["data"]
            
            # 解碼Base64音頻資料
            audio_bytes = base64.b64decode(base64_audio)
            
            # 儲存音頻檔案到本地
            filename = f"{voice_name}_preview.wav"
            
            # 將音頻資料寫入本地檔案
            with open(filename, 'wb') as f:
                f.write(audio_bytes)
            
            print(f"音頻已儲存到本地檔案: {filename}")
            print(f"檔案路徑: {os.path.abspath(filename)}")
            
            return voice_name, audio_bytes, filename
        else:
            print(f"請求失敗，狀態代碼: {response.status_code}")
            print(f"響應內容: {response.text}")
            return None, None, None
            
    except requests.exceptions.RequestException as e:
        print(f"網路請求發生錯誤: {e}")
        return None, None, None
    except KeyError as e:
        print(f"響應資料格式錯誤，缺少必要的欄位: {e}")
        print(f"響應內容: {response.text if 'response' in locals() else 'No response'}")
        return None, None, None
    except Exception as e:
        print(f"發生未知錯誤: {e}")
        return None, None, None

if __name__ == "__main__":
    print("開始建立語音...")
    voice_name, audio_data, saved_filename = create_voice_and_play()
    
    if voice_name:
        print(f"\n成功建立音色 '{voice_name}'")
        print(f"音頻檔案已儲存: '{saved_filename}'")
        print(f"檔案大小: {os.path.getsize(saved_filename)} 位元組")
    else:
        print("\n音色建立失敗")

Java

The Gson dependency is required. If you use Maven or Gradle, add it as follows:

Maven

Add the following to pom.xml:

<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.13.1</version>
</dependency>

Gradle

Add the following to build.gradle:

// https://mvnrepository.com/artifact/com.google.code.gson/gson
implementation("com.google.code.gson:gson:2.13.1")

Important

When you synthesize speech with a custom voice generated by voice design, you must set the voice as follows:

MultiModalConversationParam param = MultiModalConversationParam.builder()
                .parameter("voice", "your_voice") // 將voice參數替換為聲音設計產生的專屬音色
                .build();

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class Main {
    public static void main(String[] args) {
        Main example = new Main();
        example.createVoice();
    }

    public void createVoice() {
        // 新加坡和北京地區的API Key不同。擷取API Key：https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // 若沒有配置環境變數，請用百鍊API Key將下行替換為：String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        // 建立JSON請求體字串
        String jsonBody = "{\n" +
                "    \"model\": \"qwen-voice-design\",\n" +
                "    \"input\": {\n" +
                "        \"action\": \"create\",\n" +
                "        \"target_model\": \"qwen3-tts-vd-2026-01-26\",\n" +
                "        \"voice_prompt\": \"A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, is suitable for news broadcasting or documentary commentary.\",\n" +
                "        \"preview_text\": \"Dear listeners, hello everyone. Welcome to the evening news.\",\n" +
                "        \"preferred_name\": \"announcer\",\n" +
                "        \"language\": \"en\"\n" +
                "    },\n" +
                "    \"parameters\": {\n" +
                "        \"sample_rate\": 24000,\n" +
                "        \"response_format\": \"wav\"\n" +
                "    }\n" +
                "}";

        HttpURLConnection connection = null;
        try {
            // 以下為新加坡地區url，若使用北京地區的模型，需將url替換為：https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
            URL url = new URL("https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization");
            connection = (HttpURLConnection) url.openConnection();

            // Set the request method and headers
            connection.setRequestMethod("POST");
            connection.setRequestProperty("Authorization", "Bearer " + apiKey);
            connection.setRequestProperty("Content-Type", "application/json");
            connection.setDoOutput(true);
            connection.setDoInput(true);

            // 發送請求體
            try (OutputStream os = connection.getOutputStream()) {
                byte[] input = jsonBody.getBytes("UTF-8");
                os.write(input, 0, input.length);
                os.flush();
            }

            // 擷取響應
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // 讀取響應內容
                StringBuilder response = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        response.append(responseLine.trim());
                    }
                }

                // 解析JSON響應
                JsonObject jsonResponse = JsonParser.parseString(response.toString()).getAsJsonObject();
                JsonObject outputObj = jsonResponse.getAsJsonObject("output");
                JsonObject previewAudioObj = outputObj.getAsJsonObject("preview_audio");

                // 擷取音色名稱
                String voiceName = outputObj.get("voice").getAsString();
                System.out.println("音色名稱: " + voiceName);

                // 擷取Base64編碼的音頻資料
                String base64Audio = previewAudioObj.get("data").getAsString();

                // 解碼Base64音頻資料
                byte[] audioBytes = Base64.getDecoder().decode(base64Audio);

                // 儲存音頻到本地檔案
                String filename = voiceName + "_preview.wav";
                saveAudioToFile(audioBytes, filename);

                System.out.println("音頻已儲存到本地檔案: " + filename);

            } else {
                // 讀取錯誤響應
                StringBuilder errorResponse = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getErrorStream(), "UTF-8"))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        errorResponse.append(responseLine.trim());
                    }
                }

                System.out.println("請求失敗，狀態代碼: " + responseCode);
                System.out.println("錯誤響應: " + errorResponse.toString());
            }

        } catch (Exception e) {
            System.err.println("請求發生錯誤: " + e.getMessage());
            e.printStackTrace();
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    private void saveAudioToFile(byte[] audioBytes, String filename) {
        try {
            File file = new File(filename);
            try (FileOutputStream fos = new FileOutputStream(file)) {
                fos.write(audioBytes);
            }
            System.out.println("Audio saved to: " + file.getAbsolutePath());
        } catch (IOException e) {
            System.err.println("Error while saving the audio file: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

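Use the custom voice generated in the previous step for speech synthesis (non-streaming).

This example is based on the "non-streaming output" sample code for synthesizing speech with a system voice through the DashScope SDK. The only change is that the voice parameter is set to the custom voice produced by voice design. For unidirectional streaming synthesis, see Speech synthesis - Qwen.

Key principle: the model used for voice design (target_model) must be the same as the model (model) used later for speech synthesis; otherwise synthesis fails.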
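
The key principle above can be enforced with a small check before synthesis. This is a minimal sketch, assuming you already hold the voice metadata as a dict with a `target_model` field (the shape of the `output` object returned by the query action). `ensure_same_model` is a hypothetical helper name, not part of the DashScope SDK.

```python
def ensure_same_model(voice_info: dict, synthesis_model: str) -> None:
    """Raise ValueError if the voice was designed for a different model."""
    target_model = voice_info.get("target_model")
    if target_model != synthesis_model:
        raise ValueError(
            f"Voice {voice_info.get('voice')!r} was created for {target_model!r}, "
            f"but synthesis is about to use {synthesis_model!r}."
        )


if __name__ == "__main__":
    # Hypothetical metadata, shaped like the `output` object of a query response
    voice_info = {"voice": "myvoice", "target_model": "qwen3-tts-vd-2026-01-26"}
    ensure_same_model(voice_info, "qwen3-tts-vd-2026-01-26")  # passes silently
    try:
        ensure_same_model(voice_info, "qwen3-tts-vd-realtime-2026-01-15")
    except ValueError as err:
        print(err)
```

Running the check locally avoids spending a synthesis call on a request that would fail anyway.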

Python

import os
import dashscope


if __name__ == '__main__':
    # The URL below is for the Singapore region. For models in the Beijing region, use https://dashscope.aliyuncs.com/api/v1 instead.
    dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

    text = "今天天氣怎麼樣？"
    # Usage with the SpeechSynthesizer interface: dashscope.audio.qwen_tts.SpeechSynthesizer.call(...)
    response = dashscope.MultiModalConversation.call(
        model="qwen3-tts-vd-2026-01-26",
        # API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        # If the environment variable is not set, replace the next line with your Model Studio API key: api_key = "sk-xxx"
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        text=text,
        voice="myvoice",  # Replace with the custom voice generated by voice design
        stream=False
    )
    print(response)

Java

import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;

import com.alibaba.dashscope.utils.Constants;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;

public class Main {
    private static final String MODEL = "qwen3-tts-vd-2026-01-26";
    public static void call() throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
                // If the environment variable is not set, replace the next line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model(MODEL)
                .text("Today is a wonderful day to build something people love!")
                .parameter("voice", "myvoice") // Replace with the custom voice generated by voice design
                .build();
        MultiModalConversationResult result = conv.call(param);
        String audioUrl = result.getOutput().getAudio().getUrl();
        System.out.print(audioUrl);

        // Download the audio file to the local disk
        try (InputStream in = new URL(audioUrl).openStream();
             FileOutputStream out = new FileOutputStream("downloaded_audio.wav")) {
            byte[] buffer = new byte[1024];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
            System.out.println("\nAudio file downloaded to: downloaded_audio.wav");
        } catch (Exception e) {
            System.out.println("\nError while downloading the audio file: " + e.getMessage());
        }
    }
    public static void main(String[] args) {
        try {
            // The URL below is for the Singapore region. For models in the Beijing region, use https://dashscope.aliyuncs.com/api/v1 instead.
            Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
            call();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

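API reference

When calling the different APIs, make sure you use the same account for all operations.

Create a voice

Provide a voice description and preview text to create a custom voice.

URL

Chinese mainland: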

POST https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization

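Request headers

Parameter	Type	Required	Description
Authorization	string	Yes	Authentication token in the format `Bearer <your_api_key>`. Replace `<your_api_key>` with your actual API key.
Content-Type	string	Yes	Media type of the data carried in the request body. Fixed to `application/json`.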
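
The endpoint choice and the two required headers can be captured in one helper. The sketch below uses only the URLs and header values listed above. The region labels `"cn"` and `"intl"` and the function name `build_request_target` are illustrative assumptions.

```python
CUSTOMIZATION_URLS = {
    # Region labels are hypothetical shorthand for the two documented endpoints
    "cn": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "intl": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}


def build_request_target(region: str, api_key: str):
    """Return (url, headers) for a voice design request."""
    if region not in CUSTOMIZATION_URLS:
        raise ValueError(f"Unknown region {region!r}; expected one of {sorted(CUSTOMIZATION_URLS)}")
    if not api_key:
        raise ValueError("An API key is required")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return CUSTOMIZATION_URLS[region], headers


if __name__ == "__main__":
    url, headers = build_request_target("intl", "sk-xxx")
    print(url)
    print(headers["Content-Type"])
```

Because API keys differ between regions, keeping the region and the key together in one call makes it harder to pair a key with the wrong endpoint.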

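Request body

The request body below contains all request parameters. Optional fields can be omitted as your use case requires.

Important

Do not confuse the following two parameters:

model: the voice design model, fixed to qwen-voice-design.
target_model: the speech synthesis model that drives the voice. It must match the speech synthesis model used later when you call the speech synthesis API; otherwise synthesis fails.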

{
    "model": "qwen-voice-design",
    "input": {
        "action": "create",
        "target_model": "qwen3-tts-vd-realtime-2026-01-15",
        "voice_prompt": "沉穩的中年男性播音員，音色低沉渾厚，富有磁性，語速平穩，吐字清晰，適合用於新聞播報或紀錄片解說。",
        "preview_text": "各位聽眾朋友，大家好，歡迎收聽晚間新聞。",
        "preferred_name": "announcer",
        "language": "zh"
    },
    "parameters": {
        "sample_rate": 24000,
        "response_format": "wav"
    }
}

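Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice design model. Fixed to `qwen-voice-design`.
action	string	-	Yes	Operation type. Fixed to `create`.
target_model	string	-	Yes	Speech synthesis model that drives the voice. Two groups of models are supported. Qwen3-TTS-VD-Realtime (see Real-time speech synthesis - Qwen): qwen3-tts-vd-realtime-2026-01-15, qwen3-tts-vd-realtime-2025-12-16. Qwen3-TTS-VD (see Speech synthesis - Qwen): qwen3-tts-vd-2026-01-26. The value must match the speech synthesis model used later when you call the speech synthesis API; otherwise synthesis fails.
voice_prompt	string	-	Yes	Voice description. Maximum length: 2048 characters. Only Chinese and English are supported. For guidance on writing voice descriptions, see "How do I write a high-quality voice description?".
preview_text	string	-	Yes	Text read in the preview audio. Maximum length: 1024 characters. Supported languages: Chinese (zh), English (en), German (de), Italian (it), Portuguese (pt), Spanish (es), Japanese (ja), Korean (ko), French (fr), Russian (ru).
preferred_name	string	-	Yes	An easily recognizable name for the voice (digits, English letters, and underscores only; at most 16 characters). Choose an identifier related to the character or scenario. This keyword appears in the generated voice name. For example, the keyword "announcer" yields a final voice name such as "qwen-tts-vd-announcer-voice-20251201102800-a1b2".
language	string	zh	No	Language code that sets the language tendency of the designed voice. It affects the linguistic characteristics and pronunciation of the voice, so choose the code that matches your use case. If you set this parameter, it must match the language of `preview_text`. Valid values: `zh` (Chinese), `en` (English), `de` (German), `it` (Italian), `pt` (Portuguese), `es` (Spanish), `ja` (Japanese), `ko` (Korean), `fr` (French), `ru` (Russian).
sample_rate	int	24000	No	Sample rate of the preview audio generated by voice design (Hz). Valid values: 8000, 16000, 24000, 48000.
response_format	string	wav	No	Format of the preview audio generated by voice design. Valid values: pcm, wav, mp3, opus.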
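
The constraints in the table above can be checked locally before sending a create request. This is a minimal sketch: `validate_create_request` is a hypothetical helper, and it assumes the service counts characters the way Python's `len()` counts code points, which the source does not confirm.

```python
import re
from typing import List

SUPPORTED_LANGUAGES = {"zh", "en", "de", "it", "pt", "es", "ja", "ko", "fr", "ru"}
SUPPORTED_SAMPLE_RATES = {8000, 16000, 24000, 48000}
SUPPORTED_FORMATS = {"pcm", "wav", "mp3", "opus"}
# Digits, English letters, and underscores only; at most 16 characters
PREFERRED_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,16}")


def validate_create_request(voice_prompt: str, preview_text: str, preferred_name: str,
                            language: str = "zh", sample_rate: int = 24000,
                            response_format: str = "wav") -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if not voice_prompt or len(voice_prompt) > 2048:
        problems.append("voice_prompt must be 1 to 2048 characters")
    if not preview_text or len(preview_text) > 1024:
        problems.append("preview_text must be 1 to 1024 characters")
    if not PREFERRED_NAME_PATTERN.fullmatch(preferred_name or ""):
        problems.append("preferred_name must match [A-Za-z0-9_]{1,16}")
    if language not in SUPPORTED_LANGUAGES:
        problems.append(f"language {language!r} is not supported")
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(f"sample_rate {sample_rate} is not supported")
    if response_format not in SUPPORTED_FORMATS:
        problems.append(f"response_format {response_format!r} is not supported")
    return problems


if __name__ == "__main__":
    print(validate_create_request(
        "A calm middle-aged male announcer with a deep voice.",
        "Welcome to the evening news.",
        "announcer",
        language="en",
    ))
```

The helper does not check that `language` matches the language of `preview_text`; that rule still has to be respected by the caller.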

Response parameters

Response example

{
    "output": {
        "preview_audio": {
            "data": "{base64_encoded_audio}",
            "sample_rate": 24000,
            "response_format": "wav"
        },
        "target_model": "qwen3-tts-vd-realtime-2026-01-15",
        "voice": "yourVoice"
    },
    "usage": {
        "count": 1
    },
    "request_id": "yourRequestId"
}

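Key parameters:

Parameter	Type	Description
voice	string	Voice name. It can be passed directly as the `voice` parameter of the speech synthesis API.
data	string	Preview audio data generated by voice design, returned as a Base64-encoded string.
sample_rate	int	Sample rate of the preview audio (Hz). It matches the sample rate specified when the voice was created and defaults to 24000 Hz if not specified.
response_format	string	Format of the preview audio. It matches the audio format specified when the voice was created and defaults to wav if not specified.
target_model	string	Speech synthesis model that drives the voice. Two groups of models are supported. Qwen3-TTS-VD-Realtime (see Real-time speech synthesis - Qwen): qwen3-tts-vd-realtime-2026-01-15, qwen3-tts-vd-realtime-2025-12-16. Qwen3-TTS-VD (see Speech synthesis - Qwen): qwen3-tts-vd-2026-01-26. The value must match the speech synthesis model used later when you call the speech synthesis API; otherwise synthesis fails.
request_id	string	Request ID.
count	integer	Number of "create voice" operations actually billed for this request. The fee for this request is $ $count \times 0.2$. For voice creation, count is always 1.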
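
The response fields above can be turned into a voice name, a decoded audio payload, and a fee. The following is a sketch under stated assumptions: `parse_create_response` and `PRICE_PER_VOICE` are illustrative names, the unit price 0.2 comes from the `count` row, and no currency is assumed beyond what the table states.

```python
import base64

PRICE_PER_VOICE = 0.2  # unit price from the `count` row; currency is not assumed here


def parse_create_response(result: dict):
    """Return (voice, audio_bytes, filename, fee) from a create-voice response."""
    output = result["output"]
    preview = output["preview_audio"]
    audio_bytes = base64.b64decode(preview["data"])
    # Use the reported format as the file extension instead of hard-coding "wav"
    filename = f"{output['voice']}_preview.{preview['response_format']}"
    fee = result["usage"]["count"] * PRICE_PER_VOICE
    return output["voice"], audio_bytes, filename, fee


if __name__ == "__main__":
    # Fabricated payload that mirrors the documented response shape
    sample = {
        "output": {
            "preview_audio": {
                "data": base64.b64encode(b"RIFF").decode("ascii"),
                "sample_rate": 24000,
                "response_format": "wav",
            },
            "target_model": "qwen3-tts-vd-realtime-2026-01-15",
            "voice": "yourVoice",
        },
        "usage": {"count": 1},
        "request_id": "yourRequestId",
    }
    voice, audio, filename, fee = parse_create_response(sample)
    print(voice, len(audio), filename, fee)
```

Deriving the file extension from `response_format` keeps the saved file consistent when a format other than wav is requested.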

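Sample code

Important

Do not confuse the following two parameters:

model: the voice design model, fixed to qwen-voice-design.
target_model: the speech synthesis model that drives the voice. It must match the speech synthesis model used later when you call the speech synthesis API; otherwise synthesis fails.

cURL

If you have not set the API key as an environment variable, replace $DASHSCOPE_API_KEY in the example with your actual API key.

# ======= Important =======
# The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
# API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# === Delete this comment block before running ===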

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-voice-design",
    "input": {
        "action": "create",
        "target_model": "qwen3-tts-vd-realtime-2026-01-15",
        "voice_prompt": "沉穩的中年男性播音員，音色低沉渾厚，富有磁性，語速平穩，吐字清晰，適合用於新聞播報或紀錄片解說。",
        "preview_text": "各位聽眾朋友，大家好，歡迎收聽晚間新聞。",
        "preferred_name": "announcer",
        "language": "zh"
    },
    "parameters": {
        "sample_rate": 24000,
        "response_format": "wav"
    }
}'

Python

import requests
import base64
import os

def create_voice_and_play():
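    # API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key = "sk-xxx"
    api_key = os.getenv("DASHSCOPE_API_KEY")

    if not api_key:
        print("Error: the DASHSCOPE_API_KEY environment variable is not set. Set your API key first.")
        return None, None, None

    # Prepare the request data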
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    data = {
        "model": "qwen-voice-design",
        "input": {
            "action": "create",
            "target_model": "qwen3-tts-vd-realtime-2026-01-15",
            "voice_prompt": "A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, is suitable for news broadcasting or documentary commentary.",
            "preview_text": "Dear listeners, hello everyone. Welcome to the evening news.",
            "preferred_name": "announcer",
            "language": "en"
        },
        "parameters": {
            "sample_rate": 24000,
            "response_format": "wav"
        }
    }
    
    # The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
    url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"

    try:
        # Send the request
        response = requests.post(
            url,
            headers=headers,
            json=data,
            timeout=60  # Request timeout in seconds
        )

        if response.status_code == 200:
            result = response.json()

            # Get the voice name
            voice_name = result["output"]["voice"]
            print(f"Voice name: {voice_name}")

            # Get the preview audio data
            base64_audio = result["output"]["preview_audio"]["data"]

            # Decode the Base64 audio data
            audio_bytes = base64.b64decode(base64_audio)

            # Write the audio data to a local file
            filename = f"{voice_name}_preview.wav"
            with open(filename, 'wb') as f:
                f.write(audio_bytes)

            print(f"Audio saved to local file: {filename}")
            print(f"File path: {os.path.abspath(filename)}")

            return voice_name, audio_bytes, filename
        else:
            print(f"Request failed, status code: {response.status_code}")
            print(f"Response body: {response.text}")
            return None, None, None

    except requests.exceptions.RequestException as e:
        print(f"Network request error: {e}")
        return None, None, None
    except KeyError as e:
        print(f"Unexpected response format, missing required field: {e}")
        print(f"Response body: {response.text if 'response' in locals() else 'No response'}")
        return None, None, None
    except Exception as e:
        print(f"Unexpected error: {e}")
        return None, None, None

if __name__ == "__main__":
    print("Creating voice...")
    voice_name, audio_data, saved_filename = create_voice_and_play()

    if voice_name:
        print(f"\nVoice '{voice_name}' created successfully")
        print(f"Audio file saved: '{saved_filename}'")
        print(f"File size: {os.path.getsize(saved_filename)} bytes")
    else:
        print("\nVoice creation failed")

Java

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class Main {
    public static void main(String[] args) {
        Main example = new Main();
        example.createVoice();
    }

    public void createVoice() {
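        // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        // Build the JSON request body string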
        String jsonBody = "{\n" +
                "    \"model\": \"qwen-voice-design\",\n" +
                "    \"input\": {\n" +
                "        \"action\": \"create\",\n" +
                "        \"target_model\": \"qwen3-tts-vd-realtime-2026-01-15\",\n" +
                "        \"voice_prompt\": \"A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, is suitable for news broadcasting or documentary commentary.\",\n" +
                "        \"preview_text\": \"Dear listeners, hello everyone. Welcome to the evening news.\",\n" +
                "        \"preferred_name\": \"announcer\",\n" +
                "        \"language\": \"en\"\n" +
                "    },\n" +
                "    \"parameters\": {\n" +
                "        \"sample_rate\": 24000,\n" +
                "        \"response_format\": \"wav\"\n" +
                "    }\n" +
                "}";

        HttpURLConnection connection = null;
        try {
            // The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
            URL url = new URL("https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization");
            connection = (HttpURLConnection) url.openConnection();

            // Set the request method and headers
            connection.setRequestMethod("POST");
            connection.setRequestProperty("Authorization", "Bearer " + apiKey);
            connection.setRequestProperty("Content-Type", "application/json");
            connection.setDoOutput(true);
            connection.setDoInput(true);

            // Send the request body
            try (OutputStream os = connection.getOutputStream()) {
                byte[] input = jsonBody.getBytes("UTF-8");
                os.write(input, 0, input.length);
                os.flush();
            }

            // Get the response
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // Read the response body
                StringBuilder response = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        response.append(responseLine.trim());
                    }
                }

                // Parse the JSON response
                JsonObject jsonResponse = JsonParser.parseString(response.toString()).getAsJsonObject();
                JsonObject outputObj = jsonResponse.getAsJsonObject("output");
                JsonObject previewAudioObj = outputObj.getAsJsonObject("preview_audio");

                // Get the voice name
                String voiceName = outputObj.get("voice").getAsString();
                System.out.println("Voice name: " + voiceName);

                // Get the Base64-encoded audio data
                String base64Audio = previewAudioObj.get("data").getAsString();

                // Decode the Base64 audio data
                byte[] audioBytes = Base64.getDecoder().decode(base64Audio);

                // Save the audio to a local file
                String filename = voiceName + "_preview.wav";
                saveAudioToFile(audioBytes, filename);

                System.out.println("Audio saved to local file: " + filename);

            } else {
                // Read the error response
                StringBuilder errorResponse = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getErrorStream(), "UTF-8"))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        errorResponse.append(responseLine.trim());
                    }
                }

                System.out.println("Request failed, status code: " + responseCode);
                System.out.println("Error response: " + errorResponse.toString());
            }

        } catch (Exception e) {
            System.err.println("Request error: " + e.getMessage());
            e.printStackTrace();
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    private void saveAudioToFile(byte[] audioBytes, String filename) {
        try {
            File file = new File(filename);
            try (FileOutputStream fos = new FileOutputStream(file)) {
                fos.write(audioBytes);
            }
            System.out.println("Audio saved to: " + file.getAbsolutePath());
        } catch (IOException e) {
            System.err.println("Error while saving the audio file: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

List voices

Query the voices you have created, page by page.

URL

Chinese mainland:

POST https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization

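Request headers

Parameter	Type	Required	Description
Authorization	string	Yes	Authentication token in the format `Bearer <your_api_key>`. Replace `<your_api_key>` with your actual API key.
Content-Type	string	Yes	Media type of the data carried in the request body. Fixed to `application/json`.

Request body

The request body below contains all request parameters. Optional fields can be omitted as your use case requires.

Important

model: the voice design model, fixed to qwen-voice-design. Do not change it.
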
```
{
    "model": "qwen-voice-design",
    "input": {
        "action": "list",
        "page_size": 10,
        "page_index": 0
    }
}
```

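Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice design model. Fixed to `qwen-voice-design`.
action	string	-	Yes	Operation type. Fixed to `list`.
page_index	integer	0	No	Page index. Valid range: [0, 200].
page_size	integer	10	No	Number of entries per page. Must be greater than 0.

Response parameters

Response example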

{
    "output": {
        "page_index": 0,
        "page_size": 2,
        "total_count": 26,
        "voice_list": [
            {
                "gmt_create": "2025-12-10 17:04:54",
                "gmt_modified": "2025-12-10 17:04:54",
                "language": "zh",
                "preview_text": "各位聽眾朋友們，大家好，歡迎收聽今天的節目。",
                "target_model": "qwen3-tts-vd-realtime-2026-01-15",
                "voice": "yourVoice1",
                "voice_prompt": "沉穩的中年男性播音員，音色低沉渾厚，富有磁性，語速平穩，吐字清晰，適合用於新聞播報或紀錄片解說。，低沉有磁性，語速平穩"
            },
            {
                "gmt_create": "2025-12-10 15:31:35",
                "gmt_modified": "2025-12-10 15:31:35",
                "language": "zh",
                "preview_text": "各位聽眾朋友們，大家好",
                "target_model": "qwen3-tts-vd-realtime-2026-01-15",
                "voice": "yourVoice2",
                "voice_prompt": "沉穩的中年男性播音員，音色低沉渾厚，富有磁性，語速平穩，吐字清晰，適合用於新聞播報或紀錄片解說。"
            }
        ]
    },
    "usage": {},
    "request_id": "yourRequestId"
}

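Key parameters:

Parameter	Type	Description
voice	string	Voice name. It can be passed directly as the `voice` parameter of the speech synthesis API.
target_model	string	Speech synthesis model that drives the voice. Two groups of models are supported. Qwen3-TTS-VD-Realtime (see Real-time speech synthesis - Qwen): qwen3-tts-vd-realtime-2026-01-15, qwen3-tts-vd-realtime-2025-12-16. Qwen3-TTS-VD (see Speech synthesis - Qwen): qwen3-tts-vd-2026-01-26. The value must match the speech synthesis model used later when you call the speech synthesis API; otherwise synthesis fails.
language	string	Language code. Valid values: `zh` (Chinese), `en` (English), `de` (German), `it` (Italian), `pt` (Portuguese), `es` (Spanish), `ja` (Japanese), `ko` (Korean), `fr` (French), `ru` (Russian).
voice_prompt	string	Voice description.
preview_text	string	Preview text.
gmt_create	string	Time when the voice was created.
gmt_modified	string	Time when the voice was last modified.
page_index	integer	Page index.
page_size	integer	Number of entries per page.
total_count	integer	Total number of entries matched by the query.
request_id	string	Request ID.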
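
Since results are paged, collecting every voice means walking `page_index` until `total_count` entries have been seen. The sketch below separates the paging logic from the HTTP call: `fetch_page` is a hypothetical callable that you supply, which posts the `list` action and returns the parsed response dict. The stop at index 200 follows the documented `page_index` range.

```python
from typing import Callable, Iterator

MAX_PAGE_INDEX = 200  # upper bound of the documented page_index range


def iter_voices(fetch_page: Callable[[int, int], dict], page_size: int = 10) -> Iterator[dict]:
    """Yield every voice entry by requesting pages until total_count is reached."""
    seen = 0
    for page_index in range(MAX_PAGE_INDEX + 1):
        output = fetch_page(page_index, page_size)["output"]
        voices = output.get("voice_list", [])
        yield from voices
        seen += len(voices)
        # Stop on an empty page or once all reported entries have been returned
        if not voices or seen >= output.get("total_count", 0):
            break


if __name__ == "__main__":
    # Stand-in for the real HTTP call: serves 5 fabricated voices
    all_voices = [{"voice": f"yourVoice{i}"} for i in range(5)]

    def fake_fetch(page_index: int, page_size: int) -> dict:
        start = page_index * page_size
        return {"output": {
            "page_index": page_index,
            "page_size": page_size,
            "total_count": len(all_voices),
            "voice_list": all_voices[start:start + page_size],
        }}

    for voice in iter_voices(fake_fetch, page_size=2):
        print(voice["voice"])
```

Passing the fetch function in keeps the paging logic testable without network access.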

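Sample code

Important

model: the voice design model, fixed to qwen-voice-design. Do not change it.

cURL

If you have not set the API key as an environment variable, replace $DASHSCOPE_API_KEY in the example with your actual API key.

# ======= Important =======
# The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
# API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# === Delete this comment block before running ===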

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-voice-design",
    "input": {
        "action": "list",
        "page_size": 10,
        "page_index": 0
    }
}'

Python

import os
import requests

# API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# If the environment variable is not set, replace the next line with your Model Studio API key: api_key = "sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY")
# The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"

payload = {
    "model": "qwen-voice-design",  # Do not change this value
    "input": {
        "action": "list",
        "page_size": 10,
        "page_index": 0
    }
}

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

# A timeout keeps the script from hanging if the service does not respond
response = requests.post(url, json=payload, headers=headers, timeout=60)

print("HTTP status code:", response.status_code)

if response.status_code == 200:
    data = response.json()
    voice_list = data["output"]["voice_list"]

    print("Voices found:")
    for item in voice_list:
        print(f"- Voice: {item['voice']}  Created: {item['gmt_create']}  Model: {item['target_model']}")
else:
    print("Request failed:", response.text)

Java

import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class Main {
    public static void main(String[] args) {
        // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        // The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
        String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";

        // JSON request body (older Java versions have no """ text blocks)
        String jsonPayload =
                "{"
                        + "\"model\": \"qwen-voice-design\"," // Do not change this value
                        + "\"input\": {"
                        +     "\"action\": \"list\","
                        +     "\"page_size\": 10,"
                        +     "\"page_index\": 0"
                        + "}"
                        + "}";

        try {
            HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Authorization", "Bearer " + apiKey);
            con.setRequestProperty("Content-Type", "application/json");
            con.setDoOutput(true);

            try (OutputStream os = con.getOutputStream()) {
                os.write(jsonPayload.getBytes("UTF-8"));
            }

            int status = con.getResponseCode();
            StringBuilder response = new StringBuilder();
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader br = new BufferedReader(new InputStreamReader(
                    status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(), "UTF-8"))) {
                String line;
                while ((line = br.readLine()) != null) {
                    response.append(line);
                }
            }

            System.out.println("HTTP status code: " + status);
            System.out.println("Response JSON: " + response.toString());

            if (status == 200) {
                Gson gson = new Gson();
                JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
                JsonArray voiceList = jsonObj.getAsJsonObject("output").getAsJsonArray("voice_list");

                System.out.println("\nVoices found:");
                for (int i = 0; i < voiceList.size(); i++) {
                    JsonObject voiceItem = voiceList.get(i).getAsJsonObject();
                    String voice = voiceItem.get("voice").getAsString();
                    String gmtCreate = voiceItem.get("gmt_create").getAsString();
                    String targetModel = voiceItem.get("target_model").getAsString();

                    System.out.printf("- Voice: %s  Created: %s  Model: %s\n",
                            voice, gmtCreate, targetModel);
                }
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Query a specific voice

Retrieve the details of a specific voice by its name.

URL

Chinese mainland:

POST https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization

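Request headers

Parameter	Type	Required	Description
Authorization	string	Yes	Authentication token in the format `Bearer <your_api_key>`. Replace `<your_api_key>` with your actual API key.
Content-Type	string	Yes	Media type of the data carried in the request body. Fixed to `application/json`.

Request body

The request body below contains all request parameters. Optional fields can be omitted as your use case requires.

Important

model: the voice design model, fixed to qwen-voice-design. Do not change it.
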
```
{
    "model": "qwen-voice-design",
    "input": {
        "action": "query",
        "voice": "voiceName"
    }
}
```

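Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice design model. Fixed to `qwen-voice-design`.
action	string	-	Yes	Operation type. Fixed to `query`.
voice	string	-	Yes	Name of the voice to query.

Response parameters

Response example

Voice found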

{
    "output": {
        "gmt_create": "2025-12-10 14:54:09",
        "gmt_modified": "2025-12-10 17:47:48",
        "language": "zh",
        "preview_text": "各位聽眾朋友們，大家好",
        "target_model": "qwen3-tts-vd-realtime-2026-01-15",
        "voice": "yourVoice",
        "voice_prompt": "沉穩的中年男性播音員，音色低沉渾厚，富有磁性，語速平穩，吐字清晰，適合用於新聞播報或紀錄片解說。"
    },
    "usage": {},
    "request_id": "yourRequestId"
}

Voice not found

If the queried voice does not exist, the API returns HTTP status code 400 and the response body contains the VoiceNotFound error code.

{
    "request_id":"yourRequestId",
    "code":"VoiceNotFound",
    "message":"Voice not found: qwen-tts-vd-announcer-voice-xxxx"
}

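Key parameters:

Parameter	Type	Description
voice	string	Voice name. It can be passed directly as the `voice` parameter of the speech synthesis API.
target_model	string	Speech synthesis model that drives the voice. Two groups of models are supported. Qwen3-TTS-VD-Realtime (see Real-time speech synthesis - Qwen): qwen3-tts-vd-realtime-2026-01-15, qwen3-tts-vd-realtime-2025-12-16. Qwen3-TTS-VD (see Speech synthesis - Qwen): qwen3-tts-vd-2026-01-26. The value must match the speech synthesis model used later when you call the speech synthesis API; otherwise synthesis fails.
language	string	Language code. Valid values: `zh` (Chinese), `en` (English), `de` (German), `it` (Italian), `pt` (Portuguese), `es` (Spanish), `ja` (Japanese), `ko` (Korean), `fr` (French), `ru` (Russian).
voice_prompt	string	Voice description.
preview_text	string	Preview text.
gmt_create	string	Time when the voice was created.
gmt_modified	string	Time when the voice was last modified.
request_id	string	Request ID.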
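
Because a missing voice is reported as HTTP 400 with the `VoiceNotFound` code rather than as an empty 200 response, client code needs to inspect the error body. A minimal sketch of that branching follows; `interpret_query_response` is a hypothetical helper that takes the status code and the already-parsed JSON body.

```python
from typing import Optional


def interpret_query_response(status_code: int, body: dict) -> Optional[dict]:
    """Return the voice details, None if the voice does not exist, or raise on other errors."""
    if status_code == 200:
        return body["output"]
    if body.get("code") == "VoiceNotFound":
        return None
    raise RuntimeError(
        f"Query failed with HTTP {status_code}: {body.get('code')} {body.get('message')}"
    )


if __name__ == "__main__":
    found = {"output": {"voice": "yourVoice", "language": "zh"}, "usage": {}, "request_id": "yourRequestId"}
    missing = {"request_id": "yourRequestId", "code": "VoiceNotFound",
               "message": "Voice not found: qwen-tts-vd-announcer-voice-xxxx"}
    print(interpret_query_response(200, found))
    print(interpret_query_response(400, missing))
```

Returning `None` for a missing voice while raising on every other failure lets callers distinguish "does not exist" from "request went wrong".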

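Sample code

Important

model: the voice design model, fixed to qwen-voice-design. Do not change it.

cURL

If you have not set the API key as an environment variable, replace $DASHSCOPE_API_KEY in the example with your actual API key.

# ======= Important =======
# The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
# API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# === Delete this comment block before running ===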

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-voice-design",
    "input": {
        "action": "query",
        "voice": "voiceName"
    }
}'

Python

import requests
import os

def query_voice(voice_name):
    """
    Query the details of a voice.
    :param voice_name: name of the voice
    :return: dict with the voice details, or None if the voice is not found or the request fails
    """
    # API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key = "sk-xxx"
    api_key = os.getenv("DASHSCOPE_API_KEY")

    # Prepare the request data
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    data = {
        "model": "qwen-voice-design",
        "input": {
            "action": "query",
            "voice": voice_name
        }
    }

    # The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
    url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
    # Send the request
    response = requests.post(
        url,
        headers=headers,
        json=data,
        timeout=60
    )

    if response.status_code == 200:
        # Get the voice details
        voice_info = response.json()["output"]
        print("Voice details:")
        print(f"  Voice name: {voice_info.get('voice')}")
        print(f"  Created: {voice_info.get('gmt_create')}")
        print(f"  Modified: {voice_info.get('gmt_modified')}")
        print(f"  Language: {voice_info.get('language')}")
        print(f"  Preview text: {voice_info.get('preview_text')}")
        print(f"  Model: {voice_info.get('target_model')}")
        print(f"  Voice description: {voice_info.get('voice_prompt')}")

        return voice_info

    # A missing voice is reported as HTTP 400 with the VoiceNotFound error code,
    # so the check belongs on the error path rather than the 200 path
    try:
        error = response.json()
    except ValueError:
        error = {}
    if error.get("code") == "VoiceNotFound":
        print(f"Voice not found: {voice_name}")
        print(f"Error message: {error.get('message', 'Voice not found')}")
        return None

    print(f"Request failed, status code: {response.status_code}")
    print(f"Response body: {response.text}")
    return None

def main():
    # Example: query a voice
    voice_name = "myvoice"  # Replace with the name of the voice you want to query

    print(f"Querying voice: {voice_name}")
    voice_info = query_voice(voice_name)

    if voice_info:
        print("\nVoice query succeeded.")
    else:
        print("\nVoice query failed or the voice does not exist.")

if __name__ == "__main__":
    main()

Java

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class Main {

    public static void main(String[] args) {
        Main example = new Main();
        // Example: query a voice
        String voiceName = "myvoice"; // Replace with the name of the voice you want to query
        System.out.println("Querying voice: " + voiceName);
        example.queryVoice(voiceName);
    }

    public void queryVoice(String voiceName) {
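        // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        // Build the JSON request body string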
        String jsonBody = "{\n" +
                "    \"model\": \"qwen-voice-design\",\n" +
                "    \"input\": {\n" +
                "        \"action\": \"query\",\n" +
                "        \"voice\": \"" + voiceName + "\"\n" +
                "    }\n" +
                "}";

        HttpURLConnection connection = null;
        try {
            // The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
            URL url = new URL("https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization");
            connection = (HttpURLConnection) url.openConnection();

            // Set the request method and headers
            connection.setRequestMethod("POST");
            connection.setRequestProperty("Authorization", "Bearer " + apiKey);
            connection.setRequestProperty("Content-Type", "application/json");
            connection.setDoOutput(true);
            connection.setDoInput(true);

            // Send the request body
            try (OutputStream os = connection.getOutputStream()) {
                byte[] input = jsonBody.getBytes("UTF-8");
                os.write(input, 0, input.length);
                os.flush();
            }

            // Get the response
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // Read the response body
                StringBuilder response = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        response.append(responseLine.trim());
                    }
                }

                // Parse the JSON response
                JsonObject jsonResponse = JsonParser.parseString(response.toString()).getAsJsonObject();

                // Defensive check in case an error code arrives with HTTP 200
                if (jsonResponse.has("code") && "VoiceNotFound".equals(jsonResponse.get("code").getAsString())) {
                    String errorMessage = jsonResponse.has("message") ?
                            jsonResponse.get("message").getAsString() : "Voice not found";
                    System.out.println("Voice not found: " + voiceName);
                    System.out.println("Error message: " + errorMessage);
                    return;
                }

                // Get the voice details
                JsonObject outputObj = jsonResponse.getAsJsonObject("output");

                System.out.println("Voice details:");
                System.out.println("  Voice name: " + outputObj.get("voice").getAsString());
                System.out.println("  Created: " + outputObj.get("gmt_create").getAsString());
                System.out.println("  Modified: " + outputObj.get("gmt_modified").getAsString());
                System.out.println("  Language: " + outputObj.get("language").getAsString());
                System.out.println("  Preview text: " + outputObj.get("preview_text").getAsString());
                System.out.println("  Model: " + outputObj.get("target_model").getAsString());
                System.out.println("  Voice description: " + outputObj.get("voice_prompt").getAsString());

            } else {
                // 讀取錯誤響應
                StringBuilder errorResponse = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getErrorStream(), "UTF-8"))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        errorResponse.append(responseLine.trim());
                    }
                }

                System.out.println("請求失敗，狀態代碼: " + responseCode);
                System.out.println("錯誤響應: " + errorResponse.toString());
            }

        } catch (Exception e) {
            System.err.println("請求發生錯誤: " + e.getMessage());
            e.printStackTrace();
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}

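Delete a voice

Deletes the specified voice and frees the corresponding quota.

URL

Chinese mainland:

POST https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization

Request headers

Parameter	Type	Required	Description
Authorization	string	Yes	Authentication token in the format `Bearer <your_api_key>`. Replace `<your_api_key>` with your actual API key.
Content-Type	string	Yes	Media type of the data in the request body. Fixed to `application/json`.

Request body
The request body with all request parameters is shown below. Optional fields can be omitted as your use case requires:
Important
model: the voice design model. The value is fixed to qwen-voice-design; do not change it.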
```
{
    "model": "qwen-voice-design",
    "input": {
        "action": "delete",
        "voice": "yourVoice"
    }
}
```
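Request parameters
Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice design model. Fixed to qwen-voice-design.
action	string	-	Yes	Operation type. Fixed to delete.
voice	string	-	Yes	The voice to delete.
Response parameters
Sample response: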
```
{
    "output": {
        "voice": "yourVoice"
    },
    "usage": {},
    "request_id": "yourRequestId"
}
```
The key parameters are:
Parameter	Type	Description
request_id	string	The request ID.
voice	string	The voice that was deleted.

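Sample code

Important

model: the voice design model. The value is fixed to qwen-voice-design; do not change it.

cURL

If you have not set the API key as an environment variable, replace $DASHSCOPE_API_KEY in the example with your actual API key.

# ======= Important =======
# The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
# API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# === Delete this comment block before running ===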

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-voice-design",
    "input": {
        "action": "delete",
        "voice": "yourVoice"
    }
}'

Python

import requests
import os

def delete_voice(voice_name):
    """
    Delete the specified voice.
    :param voice_name: name of the voice to delete
    :return: True if the voice was deleted, or if the request succeeded but the
             voice does not exist; False if the operation failed
    """
    # API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # If you have not set the environment variable, replace the next line with your Model Studio API key: api_key = "sk-xxx"
    api_key = os.getenv("DASHSCOPE_API_KEY")

    # Prepare the request headers and body
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    data = {
        "model": "qwen-voice-design",
        "input": {
            "action": "delete",
            "voice": voice_name
        }
    }

    # The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
    url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
    # Send the request; the timeout keeps the call from hanging indefinitely
    response = requests.post(
        url,
        headers=headers,
        json=data,
        timeout=30
    )

    if response.status_code == 200:
        result = response.json()

        # Check whether the response carries a VoiceNotFound error
        if "VoiceNotFound" in str(result.get("code", "")):
            print(f"Voice does not exist: {voice_name}")
            print(f"Error message: {result.get('message', 'Voice not found')}")
            return True  # A missing voice still counts as success: the target is already gone

        # Check whether the deletion succeeded
        if "usage" in result:
            print(f"Voice deleted: {voice_name}")
            print(f"Request ID: {result.get('request_id', 'N/A')}")
            return True
        else:
            print(f"Delete operation returned an unexpected format: {result}")
            return False
    else:
        print(f"Delete request failed with status code: {response.status_code}")
        print(f"Response body: {response.text}")
        return False

def main():
    # Example: delete a voice
    voice_name = "myvoice"  # Replace with the name of the voice you want to delete

    print(f"Deleting voice: {voice_name}")
    success = delete_voice(voice_name)

    if success:
        print(f"\nDelete operation for voice '{voice_name}' completed.")
    else:
        print(f"\nDelete operation for voice '{voice_name}' failed.")

if __name__ == "__main__":
    main()

Java

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class Main {

    public static void main(String[] args) {
        Main example = new Main();
        // Example: delete a voice
        String voiceName = "myvoice"; // Replace with the name of the voice you want to delete
        System.out.println("Deleting voice: " + voiceName);
        example.deleteVoice(voiceName);
    }

    public void deleteVoice(String voiceName) {
        // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If you have not set the environment variable, replace the next line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        // Build the JSON request body with Gson so that voiceName is escaped correctly
        JsonObject inputObj = new JsonObject();
        inputObj.addProperty("action", "delete");
        inputObj.addProperty("voice", voiceName);
        JsonObject bodyObj = new JsonObject();
        bodyObj.addProperty("model", "qwen-voice-design");
        bodyObj.add("input", inputObj);
        String jsonBody = bodyObj.toString();

        HttpURLConnection connection = null;
        try {
            // The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
            URL url = new URL("https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization");
            connection = (HttpURLConnection) url.openConnection();

            // Set the request method and headers
            connection.setRequestMethod("POST");
            connection.setRequestProperty("Authorization", "Bearer " + apiKey);
            connection.setRequestProperty("Content-Type", "application/json");
            connection.setDoOutput(true);
            connection.setDoInput(true);

            // Send the request body
            try (OutputStream os = connection.getOutputStream()) {
                byte[] payload = jsonBody.getBytes(StandardCharsets.UTF_8);
                os.write(payload, 0, payload.length);
                os.flush();
            }

            // Read the response
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // Read the response body
                StringBuilder response = new StringBuilder();
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                    String responseLine;
                    while ((responseLine = br.readLine()) != null) {
                        response.append(responseLine.trim());
                    }
                }

                // Parse the JSON response
                JsonObject jsonResponse = JsonParser.parseString(response.toString()).getAsJsonObject();

                // Check whether the response carries a VoiceNotFound error
                if (jsonResponse.has("code") && jsonResponse.get("code").getAsString().contains("VoiceNotFound")) {
                    String errorMessage = jsonResponse.has("message") ?
                            jsonResponse.get("message").getAsString() : "Voice not found";
                    System.out.println("Voice does not exist: " + voiceName);
                    System.out.println("Error message: " + errorMessage);
                    // A missing voice still counts as success: the target is already gone
                } else if (jsonResponse.has("usage")) {
                    // The deletion succeeded
                    System.out.println("Voice deleted: " + voiceName);
                    String requestId = jsonResponse.has("request_id") ?
                            jsonResponse.get("request_id").getAsString() : "N/A";
                    System.out.println("Request ID: " + requestId);
                } else {
                    System.out.println("Delete operation returned an unexpected format: " + response.toString());
                }

            } else {
                // Read the error response body; getErrorStream() may return null
                StringBuilder errorResponse = new StringBuilder();
                InputStream errorStream = connection.getErrorStream();
                if (errorStream != null) {
                    try (BufferedReader br = new BufferedReader(
                            new InputStreamReader(errorStream, StandardCharsets.UTF_8))) {
                        String responseLine;
                        while ((responseLine = br.readLine()) != null) {
                            errorResponse.append(responseLine.trim());
                        }
                    }
                }

                System.out.println("Delete request failed with status code: " + responseCode);
                System.out.println("Error response: " + errorResponse.toString());
            }

        } catch (Exception e) {
            System.err.println("Request error: " + e.getMessage());
            e.printStackTrace();
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}

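Speech synthesis

To learn how to synthesize personalized speech with a custom voice created through voice design, see Quick start: from voice design to speech synthesis.

Speech synthesis models for voice design (such as qwen3-tts-vd-realtime-2026-01-15) are dedicated models. They support only voices created through voice design and do not support system voices such as Chelsie, Serena, Ethan, and Cherry.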
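
The constraint above, together with the requirement that the synthesis model match the voice's target_model, could be checked before a synthesis request is sent. The following is only a sketch: the function name `check_synthesis_request` and its parameters are hypothetical and not part of the API, and the system-voice set holds only the four names mentioned above, so it is not exhaustive.

```python
# Hypothetical pre-flight check; not part of the DashScope API.
# Only the four system voices named in this document are listed (non-exhaustive).
KNOWN_SYSTEM_VOICES = {"Chelsie", "Serena", "Ethan", "Cherry"}


def check_synthesis_request(voice, voice_target_model, synthesis_model):
    """Return a list of problems found; an empty list means the check passed."""
    problems = []
    if voice in KNOWN_SYSTEM_VOICES:
        problems.append(
            f"'{voice}' is a system voice, which voice design models do not support"
        )
    if voice_target_model != synthesis_model:
        problems.append(
            f"synthesis model '{synthesis_model}' does not match the voice's "
            f"target_model '{voice_target_model}'"
        )
    return problems


if __name__ == "__main__":
    model = "qwen3-tts-vd-realtime-2026-01-15"
    # "myvoice" stands in for a voice created through voice design
    print(check_synthesis_request("myvoice", model, model))
    print(check_synthesis_request("Cherry", model, model))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.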

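Voice quota and automatic cleanup rules

Total limit: 1,000 voices per account
You can call the list-voices API to query the number of voices (total_count).
Automatic cleanup: if a voice has not been used in any speech synthesis request in the past year, the system automatically deletes it.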
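
A client could track these two rules locally. The sketch below assumes that the caller records the last synthesis time of each voice itself (this document does not describe an API field for it), and it approximates "one year" as 365 days. The names `remaining_quota` and `voices_at_risk` are hypothetical helpers.

```python
from datetime import datetime, timedelta

VOICE_LIMIT = 1000                   # per-account limit stated above
CLEANUP_AFTER = timedelta(days=365)  # assumption: "one year" taken as 365 days


def remaining_quota(total_count, limit=VOICE_LIMIT):
    """Number of voices that can still be created (never negative)."""
    return max(0, limit - total_count)


def voices_at_risk(last_used, now, warn_before=timedelta(days=30)):
    """Names of voices that are within `warn_before` of the cleanup threshold, or past it.

    `last_used` maps a voice name to the datetime of its last synthesis
    request, as recorded by the caller (an assumption of this sketch).
    """
    cutoff = now - (CLEANUP_AFTER - warn_before)
    return sorted(name for name, ts in last_used.items() if ts <= cutoff)


if __name__ == "__main__":
    # total_count would come from the list-voices response
    print(remaining_quota(997))
    usage_log = {
        "voice_a": datetime(2025, 12, 1),
        "voice_b": datetime(2025, 1, 15),
    }
    print(voices_at_risk(usage_log, now=datetime(2026, 1, 1)))
```

Synthesizing any text with a flagged voice before the threshold should count as a use, given the rule above.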

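Billing

Voice design and speech synthesis are billed separately:

Voice design: creating a voice is billed at $0.2 per voice. Failed creations are not billed.
Note
Free quota (available only in the Singapore region):
- Within 90 days of activating Alibaba Cloud Model Studio, you get 10 free voice creations.
- Failed creations do not consume the free quota.
- Deleting a voice does not restore the free quota.
- After the free quota is used up or the 90-day validity period expires, voice creation is billed at $0.2 per voice.
Speech synthesis with a custom voice created through voice design: billed by usage (number of text characters). For details, see Real-time speech synthesis - Qwen or Speech synthesis - Qwen.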
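
The voice design charge can be estimated from the rules above. This is a rough sketch, not an official calculator: the function `voice_design_cost` and its parameters are hypothetical, and treating day 90 itself as still inside the free period is an assumption. Speech synthesis charges are not covered.

```python
from decimal import Decimal

PRICE_PER_VOICE = Decimal("0.2")  # USD per successfully created voice
FREE_CREATIONS = 10               # free quota, Singapore region only
FREE_PERIOD_DAYS = 90             # counted from Model Studio activation


def voice_design_cost(new_voices, free_used=0, days_since_activation=0,
                      has_free_quota=True):
    """Estimate the USD charge for `new_voices` successful creations.

    Failed creations are neither billed nor counted against the free quota,
    so they should not be included in `new_voices` or `free_used`.
    """
    free_left = 0
    # Assumption: day 90 still counts as inside the free period
    if has_free_quota and days_since_activation <= FREE_PERIOD_DAYS:
        free_left = max(0, FREE_CREATIONS - free_used)
    billable = max(0, new_voices - free_left)
    return billable * PRICE_PER_VOICE


if __name__ == "__main__":
    # 12 creations on a fresh account with free quota: 2 are billable
    print(voice_design_cost(12))
    # Region without free quota: every creation is billable
    print(voice_design_cost(3, has_free_quota=False))
```

`Decimal` is used instead of `float` so that multiples of 0.2 stay exact.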

Error messages

If you encounter an error, see Error messages for troubleshooting.