Qwen Voice Cloning API Reference - Alibaba Cloud Model Studio

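Voice cloning uses a large model for feature extraction, so it can clone a voice without any training. Provide just 10 to 20 seconds of audio to generate a custom voice that closely resembles the original and sounds natural. Voice cloning and speech synthesis are two sequential, related steps. This document covers the parameters and interface details of voice cloning; for speech synthesis, see Realtime speech synthesis - Qwen or Speech synthesis - Qwen.

User guide: For model introductions and selection recommendations, see Realtime speech synthesis - Qwen or Speech synthesis - Qwen.

Important

This document applies only to the Qwen voice cloning API. If you use a CosyVoice model, see CosyVoice voice cloning API.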

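Audio requirements

High-quality input audio is the basis for a good cloning result.

Item	Requirement
Supported formats	WAV (16-bit), MP3, M4A
Audio duration	10 to 20 seconds recommended; must not exceed 60 seconds
File size	< 10 MB
Sampling rate	≥ 24 kHz
Channels	Mono
Content	The audio must contain at least 3 seconds of continuous, clear speech with no background sound; the rest may include only brief pauses (≤ 2 seconds). The whole clip should be free of background music, noise, and other voices so that the core speech stays clean. Use normal speaking audio as input; do not upload songs or singing, so that the cloned voice is accurate and usable.
Language	Chinese (zh), English (en), German (de), Italian (it), Portuguese (pt), Spanish (es), Japanese (ja), Korean (ko), French (fr), Russian (ru)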
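
The measurable rows of the table above (size, bit depth, channels, sampling rate, duration) can be checked locally before uploading. The following is a minimal sketch for WAV input only, using Python's standard `wave` module; the function names `check_wav` and `make_silent_wav` are illustrative and not part of any SDK, and the content rules (clear speech, no background sound) cannot be checked this way.

```python
import io
import wave

MAX_FILE_BYTES = 10 * 1024 * 1024   # file size must be < 10 MB
MIN_SAMPLE_RATE = 24_000            # sampling rate must be >= 24 kHz
MAX_DURATION_S = 60.0               # hard upper limit on duration
RECOMMENDED_RANGE_S = (10.0, 20.0)  # recommended duration range


def check_wav(data: bytes) -> list[str]:
    """Return a list of problems found in a WAV payload (empty list = OK)."""
    problems = []
    if len(data) >= MAX_FILE_BYTES:
        problems.append("file is 10 MB or larger")
    with wave.open(io.BytesIO(data), "rb") as wav:
        if wav.getsampwidth() != 2:
            problems.append("sample width is not 16-bit")
        if wav.getnchannels() != 1:
            problems.append("audio is not mono")
        if wav.getframerate() < MIN_SAMPLE_RATE:
            problems.append("sampling rate is below 24 kHz")
        duration = wav.getnframes() / wav.getframerate()
    if duration > MAX_DURATION_S:
        problems.append("duration exceeds 60 seconds")
    elif not RECOMMENDED_RANGE_S[0] <= duration <= RECOMMENDED_RANGE_S[1]:
        problems.append("duration is outside the recommended 10-20 seconds")
    return problems


def make_silent_wav(seconds: float, rate: int, channels: int = 1) -> bytes:
    """Build an in-memory 16-bit WAV of silence, used only to exercise check_wav."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(seconds * rate))
    return buffer.getvalue()
```

For a real file, `check_wav(pathlib.Path("voice.wav").read_bytes())` would return the list of violations; MP3 and M4A input would need a decoder that the standard library does not provide.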

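Quick start: from cloning to synthesis

1. Workflow

Voice cloning and speech synthesis are two separate but closely related steps that follow a "create first, then use" flow:

Create a voice
Call the create-voice API and upload an audio clip. The system analyzes the audio and creates a dedicated cloned voice. This step must specify target_model, which declares the speech synthesis model that will drive the voice.
If you already have a voice (call the list-voices API to check), skip this step and go straight to the next one.
Synthesize speech with the voice
Call the speech synthesis API and pass in the voice obtained in the previous step. The speech synthesis model specified here must match the target_model used in the previous step.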
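
Because an existing voice can be reused, one way to avoid creating a new voice on every run is to remember the voice name locally. The sketch below is an assumption-laden illustration, not an SDK feature: the cache file, the function name `get_or_create_voice`, and the `create_fn` callback (for example, a wrapper around the `create_voice` function in the samples further down) are all hypothetical. The cache is keyed by target_model because a voice is bound to the model it was created for.

```python
import json
import pathlib
from typing import Callable


def get_or_create_voice(cache_file: pathlib.Path,
                        target_model: str,
                        create_fn: Callable[[str], str]) -> str:
    """Return the cached voice for target_model, creating one only if none is cached."""
    cache = {}
    if cache_file.exists():
        cache = json.loads(cache_file.read_text(encoding="utf-8"))
    if target_model not in cache:
        # create_fn is expected to call the create-voice API and return the voice name
        cache[target_model] = create_fn(target_model)
        cache_file.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    return cache[target_model]
```

A local cache can go stale if the voice is removed on the server side, so the list-voices API remains the authoritative source.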

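2. Model configuration and preparation

Choose a suitable model and complete the preparation steps.

Model configuration

Voice cloning requires you to specify the following two models:

Voice cloning model: qwen-voice-enrollment
Speech synthesis model that drives the voice (two families):
- Qwen3-TTS-VC-Realtime (see Realtime speech synthesis - Qwen):
  - qwen3-tts-vc-realtime-2026-01-15
  - qwen3-tts-vc-realtime-2025-11-27
- Qwen3-TTS-VC (see Speech synthesis - Qwen):
  - qwen3-tts-vc-2026-01-22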
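
The model list above determines which synthesis interface a cloned voice can be used with. As a rough sketch, the mapping and the "same model" rule can be encoded as follows; the names `synthesis_mode` and `check_models_match` are illustrative, and the model sets simply copy the list above, so they would need updating when new snapshots are released.

```python
REALTIME_MODELS = {
    "qwen3-tts-vc-realtime-2026-01-15",
    "qwen3-tts-vc-realtime-2025-11-27",
}
NON_REALTIME_MODELS = {
    "qwen3-tts-vc-2026-01-22",
}


def synthesis_mode(target_model: str) -> str:
    """Return which synthesis family a target_model belongs to."""
    if target_model in REALTIME_MODELS:
        return "realtime"
    if target_model in NON_REALTIME_MODELS:
        return "non-realtime"
    raise ValueError(f"unsupported target_model: {target_model}")


def check_models_match(target_model: str, synthesis_model: str) -> None:
    """Raise early if the synthesis model differs from the voice's target_model."""
    if target_model != synthesis_model:
        raise ValueError(
            f"voice was created for {target_model!r} but synthesis uses {synthesis_model!r}"
        )
```

Calling `check_models_match` before synthesis turns a server-side synthesis failure into an immediate local error.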

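Preparation

Get an API key: Obtain an API key and the API host. For security, store the API key in an environment variable.
Install the SDK: Make sure the latest DashScope SDK is installed.
Prepare the audio to clone: The audio must meet the audio requirements.

3. End-to-end samples

The following samples show how to use a custom voice produced by voice cloning in speech synthesis, so that the output closely resembles the original voice.

Key rule: The target_model used for voice cloning (the speech synthesis model that drives the voice) must match the speech synthesis model specified in the later synthesis call; otherwise synthesis fails.
The samples clone a voice from the local audio file voice.mp3. Replace it with your own file before running the code.

Bidirectional streaming synthesis

Applies to the Qwen3-TTS-VC-Realtime model family. For details, see Realtime speech synthesis - Qwen.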

Python

# coding=utf-8
# Installation instructions for pyaudio:
# APPLE Mac OS X
#   brew install portaudio
#   pip install pyaudio
# Debian/Ubuntu
#   sudo apt-get install python-pyaudio python3-pyaudio
#   or
#   pip install pyaudio
# CentOS
#   sudo yum install -y portaudio portaudio-devel && pip install pyaudio
# Microsoft Windows
#   python -m pip install pyaudio

import pyaudio
import os
import requests
import base64
import pathlib
import threading
import time
import dashscope  # Requires DashScope Python SDK 1.23.9 or later
from dashscope.audio.qwen_tts_realtime import QwenTtsRealtime, QwenTtsRealtimeCallback, AudioFormat

# ======= Constants =======
DEFAULT_TARGET_MODEL = "qwen3-tts-vc-realtime-2026-01-15"  # Voice cloning and speech synthesis must use the same model
DEFAULT_PREFERRED_NAME = "guanyu"
DEFAULT_AUDIO_MIME_TYPE = "audio/mpeg"
VOICE_FILE_PATH = "voice.mp3"  # Relative path of the local audio file used for voice cloning

TEXT_TO_SYNTHESIZE = [
    '對吧~我就特別喜歡這種超市，',
    '尤其是過年的時候',
    '去逛超市',
    '就會覺得',
    '超級超級開心！',
    '想買好多好多的東西呢！'
]

def create_voice(file_path: str,
                 target_model: str = DEFAULT_TARGET_MODEL,
                 preferred_name: str = DEFAULT_PREFERRED_NAME,
                 audio_mime_type: str = DEFAULT_AUDIO_MIME_TYPE) -> str:
    """
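    Create a voice and return the voice parameter.
    """
    # API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key = "sk-xxx"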
    api_key = os.getenv("DASHSCOPE_API_KEY")

    file_path_obj = pathlib.Path(file_path)
    if not file_path_obj.exists():
        raise FileNotFoundError(f"Audio file not found: {file_path}")

    base64_str = base64.b64encode(file_path_obj.read_bytes()).decode()
    data_uri = f"data:{audio_mime_type};base64,{base64_str}"

    # The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
    url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
    payload = {
        "model": "qwen-voice-enrollment",  # Do not change this value
        "input": {
            "action": "create",
            "target_model": target_model,
            "preferred_name": preferred_name,
            "audio": {"data": data_uri}
        }
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    resp = requests.post(url, json=payload, headers=headers)
    if resp.status_code != 200:
        raise RuntimeError(f"Failed to create voice: {resp.status_code}, {resp.text}")

    try:
        return resp.json()["output"]["voice"]
    except (KeyError, ValueError) as e:
        raise RuntimeError(f"Failed to parse the voice response: {e}")

def init_dashscope_api_key():
    """
    Set the API key used by the DashScope SDK.
    """
    # API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # If the environment variable is not set, replace the next line with your Model Studio API key: dashscope.api_key = "sk-xxx"
    dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")

# ======= Callback class =======
class MyCallback(QwenTtsRealtimeCallback):
    """
    Custom callback for streaming TTS
    """
    def __init__(self):
        self.complete_event = threading.Event()
        self._player = pyaudio.PyAudio()
        self._stream = self._player.open(
            format=pyaudio.paInt16, channels=1, rate=24000, output=True
        )

    def on_open(self) -> None:
        print('[TTS] Connection established')

    def on_close(self, close_status_code, close_msg) -> None:
        self._stream.stop_stream()
        self._stream.close()
        self._player.terminate()
        print(f'[TTS] Connection closed code={close_status_code}, msg={close_msg}')

    def on_event(self, response: dict) -> None:
        try:
            event_type = response.get('type', '')
            if event_type == 'session.created':
                print(f'[TTS] Session started: {response["session"]["id"]}')
            elif event_type == 'response.audio.delta':
                audio_data = base64.b64decode(response['delta'])
                self._stream.write(audio_data)
            elif event_type == 'response.done':
                # qwen_tts_realtime is the module-level client created in the main block below
                print(f'[TTS] Response done, Response ID: {qwen_tts_realtime.get_last_response_id()}')
            elif event_type == 'session.finished':
                print('[TTS] Session finished')
                self.complete_event.set()
        except Exception as e:
            print(f'[Error] Exception while handling callback event: {e}')

    def wait_for_finished(self):
        self.complete_event.wait()

# ======= Main logic =======
if __name__ == '__main__':
    init_dashscope_api_key()
    print('[System] Initializing Qwen TTS Realtime ...')

    callback = MyCallback()
    qwen_tts_realtime = QwenTtsRealtime(
        model=DEFAULT_TARGET_MODEL,
        callback=callback,
        # The URL below is for the Singapore region. For models in the Beijing region, use: wss://dashscope.aliyuncs.com/api-ws/v1/realtime
        url='wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime'
    )
    qwen_tts_realtime.connect()

    qwen_tts_realtime.update_session(
        voice=create_voice(VOICE_FILE_PATH),  # Use the custom voice produced by voice cloning as the voice parameter
        response_format=AudioFormat.PCM_24000HZ_MONO_16BIT,
        mode='server_commit'
    )

    for text_chunk in TEXT_TO_SYNTHESIZE:
        print(f'[Sending text]: {text_chunk}')
        qwen_tts_realtime.append_text(text_chunk)
        time.sleep(0.1)

    qwen_tts_realtime.finish()
    callback.wait_for_finished()

    print(f'[Metric] session_id={qwen_tts_realtime.get_session_id()}, '
          f'first_audio_delay={qwen_tts_realtime.get_first_audio_delay()}s')

Java

The Gson dependency is required. If you use Maven or Gradle, add it as follows:

Maven

Add the following to pom.xml:

<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.13.1</version>
</dependency>

Gradle

Add the following to build.gradle:

// https://mvnrepository.com/artifact/com.google.code.gson/gson
implementation("com.google.code.gson:gson:2.13.1")

import com.alibaba.dashscope.audio.qwen_tts_realtime.*;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.google.gson.Gson;
import com.google.gson.JsonObject;

import javax.sound.sampled.*;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.*;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

public class Main {
    // ===== Constants =====
    // Voice cloning and speech synthesis must use the same model
    private static final String TARGET_MODEL = "qwen3-tts-vc-realtime-2026-01-15";
    private static final String PREFERRED_NAME = "guanyu";
    // Relative path of the local audio file used for voice cloning
    private static final String AUDIO_FILE = "voice.mp3";
    private static final String AUDIO_MIME_TYPE = "audio/mpeg";
    private static String[] textToSynthesize = {
            "對吧~我就特別喜歡這種超市",
            "尤其是過年的時候",
            "去逛超市",
            "就會覺得",
            "超級超級開心！",
            "想買好多好多的東西呢！"
    };

    // Build the data URI
    public static String toDataUrl(String filePath) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        String encoded = Base64.getEncoder().encodeToString(bytes);
        return "data:" + AUDIO_MIME_TYPE + ";base64," + encoded;
    }

    // Call the API to create a voice
    public static String createVoice() throws Exception {
        // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        String jsonPayload =
                "{"
                        + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
                        + "\"input\": {"
                        +     "\"action\": \"create\","
                        +     "\"target_model\": \"" + TARGET_MODEL + "\","
                        +     "\"preferred_name\": \"" + PREFERRED_NAME + "\","
                        +     "\"audio\": {"
                        +         "\"data\": \"" + toDataUrl(AUDIO_FILE) + "\""
                        +     "}"
                        + "}"
                        + "}";

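        // The URL below is for the Singapore region, matching the WebSocket URL used in main(). For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
        String url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();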
        con.setRequestMethod("POST");
        con.setRequestProperty("Authorization", "Bearer " + apiKey);
        con.setRequestProperty("Content-Type", "application/json");
        con.setDoOutput(true);

        try (OutputStream os = con.getOutputStream()) {
            os.write(jsonPayload.getBytes(StandardCharsets.UTF_8));
        }

        int status = con.getResponseCode();
        System.out.println("HTTP status code: " + status);

        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(),
                        StandardCharsets.UTF_8))) {
            StringBuilder response = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                response.append(line);
            }
            System.out.println("Response body: " + response);

            if (status == 200) {
                JsonObject jsonObj = new Gson().fromJson(response.toString(), JsonObject.class);
                return jsonObj.getAsJsonObject("output").get("voice").getAsString();
            }
            throw new IOException("Failed to create voice: " + status + " - " + response);
        }
    }

    // Realtime PCM audio player
    public static class RealtimePcmPlayer {
        private int sampleRate;
        private SourceDataLine line;
        private AudioFormat audioFormat;
        private Thread decoderThread;
        private Thread playerThread;
        private AtomicBoolean stopped = new AtomicBoolean(false);
        private Queue<String> b64AudioBuffer = new ConcurrentLinkedQueue<>();
        private Queue<byte[]> RawAudioBuffer = new ConcurrentLinkedQueue<>();

        // The constructor initializes the audio format and the audio line
        public RealtimePcmPlayer(int sampleRate) throws LineUnavailableException {
            this.sampleRate = sampleRate;
            this.audioFormat = new AudioFormat(this.sampleRate, 16, 1, true, false);
            DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
            line = (SourceDataLine) AudioSystem.getLine(info);
            line.open(audioFormat);
            line.start();
            decoderThread = new Thread(new Runnable() {
                @Override
                public void run() {
                    while (!stopped.get()) {
                        String b64Audio = b64AudioBuffer.poll();
                        if (b64Audio != null) {
                            byte[] rawAudio = Base64.getDecoder().decode(b64Audio);
                            RawAudioBuffer.add(rawAudio);
                        } else {
                            try {
                                Thread.sleep(100);
                            } catch (InterruptedException e) {
                                throw new RuntimeException(e);
                            }
                        }
                    }
                }
            });
            playerThread = new Thread(new Runnable() {
                @Override
                public void run() {
                    while (!stopped.get()) {
                        byte[] rawAudio = RawAudioBuffer.poll();
                        if (rawAudio != null) {
                            try {
                                playChunk(rawAudio);
                            } catch (IOException e) {
                                throw new RuntimeException(e);
                            } catch (InterruptedException e) {
                                throw new RuntimeException(e);
                            }
                        } else {
                            try {
                                Thread.sleep(100);
                            } catch (InterruptedException e) {
                                throw new RuntimeException(e);
                            }
                        }
                    }
                }
            });
            decoderThread.start();
            playerThread.start();
        }

        // Play one audio chunk and block until playback finishes
        private void playChunk(byte[] chunk) throws IOException, InterruptedException {
            if (chunk == null || chunk.length == 0) return;

            int bytesWritten = 0;
            while (bytesWritten < chunk.length) {
                bytesWritten += line.write(chunk, bytesWritten, chunk.length - bytesWritten);
            }
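            int audioLength = chunk.length / (this.sampleRate*2/1000);
            // Wait for the buffered audio to finish playing; clamp at 0 because Thread.sleep rejects negative values
            Thread.sleep(Math.max(0, audioLength - 10));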
        }

        public void write(String b64Audio) {
            b64AudioBuffer.add(b64Audio);
        }

        public void cancel() {
            b64AudioBuffer.clear();
            RawAudioBuffer.clear();
        }

        public void waitForComplete() throws InterruptedException {
            while (!b64AudioBuffer.isEmpty() || !RawAudioBuffer.isEmpty()) {
                Thread.sleep(100);
            }
            line.drain();
        }

        public void shutdown() throws InterruptedException {
            stopped.set(true);
            decoderThread.join();
            playerThread.join();
            if (line != null && line.isRunning()) {
                line.drain();
                line.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        QwenTtsRealtimeParam param = QwenTtsRealtimeParam.builder()
                .model(TARGET_MODEL)
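                // The URL below is for the Singapore region. For models in the Beijing region, use: wss://dashscope.aliyuncs.com/api-ws/v1/realtime
                .url("wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime")
                // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
                // If the environment variable is not set, replace the next line with your Model Studio API key: .apikey("sk-xxx")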
                .apikey(System.getenv("DASHSCOPE_API_KEY"))
                .build();
        AtomicReference<CountDownLatch> completeLatch = new AtomicReference<>(new CountDownLatch(1));
        final AtomicReference<QwenTtsRealtime> qwenTtsRef = new AtomicReference<>(null);

        // Create the realtime audio player instance
        RealtimePcmPlayer audioPlayer = new RealtimePcmPlayer(24000);

        QwenTtsRealtime qwenTtsRealtime = new QwenTtsRealtime(param, new QwenTtsRealtimeCallback() {
            @Override
            public void onOpen() {
                // Handle connection established
            }
            @Override
            public void onEvent(JsonObject message) {
                String type = message.get("type").getAsString();
                switch(type) {
                    case "session.created":
                        // Handle session created
                        break;
                    case "response.audio.delta":
                        String recvAudioB64 = message.get("delta").getAsString();
                        // Play the audio in real time
                        audioPlayer.write(recvAudioB64);
                        break;
                    case "response.done":
                        // Handle response done
                        break;
                    case "session.finished":
                        // Handle session finished
                        completeLatch.get().countDown();
                        break;
                    default:
                        break;
                }
            }
            @Override
            public void onClose(int code, String reason) {
                // Handle connection closed
            }
        });
        qwenTtsRef.set(qwenTtsRealtime);
        try {
            qwenTtsRealtime.connect();
        } catch (NoApiKeyException e) {
            throw new RuntimeException(e);
        }
        QwenTtsRealtimeConfig config = QwenTtsRealtimeConfig.builder()
                .voice(createVoice()) // Use the custom voice produced by voice cloning as the voice parameter
                .responseFormat(QwenTtsRealtimeAudioFormat.PCM_24000HZ_MONO_16BIT)
                .mode("server_commit")
                .build();
        qwenTtsRealtime.updateSession(config);
        for (String text:textToSynthesize) {
            qwenTtsRealtime.appendText(text);
            Thread.sleep(100);
        }
        qwenTtsRealtime.finish();
        completeLatch.get().await();

        // Wait for audio playback to finish, then shut down the player
        audioPlayer.waitForComplete();
        audioPlayer.shutdown();
        System.exit(0);
    }
}

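Non-streaming / unidirectional streaming synthesis

Applies to the Qwen3-TTS-VC model family. For details, see Speech synthesis - Qwen.

This sample is based on the "non-streaming output" DashScope SDK sample code for synthesizing speech with a system voice (non-streaming synthesis), with the voice parameter replaced by the custom voice produced by voice cloning. For unidirectional streaming synthesis, see Speech synthesis - Qwen.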

Python

import os
import requests
import base64
import pathlib
import dashscope

# ======= Constants =======
DEFAULT_TARGET_MODEL = "qwen3-tts-vc-2026-01-22"  # Voice cloning and speech synthesis must use the same model
DEFAULT_PREFERRED_NAME = "guanyu"
DEFAULT_AUDIO_MIME_TYPE = "audio/mpeg"
VOICE_FILE_PATH = "voice.mp3"  # Relative path of the local audio file used for voice cloning


def create_voice(file_path: str,
                 target_model: str = DEFAULT_TARGET_MODEL,
                 preferred_name: str = DEFAULT_PREFERRED_NAME,
                 audio_mime_type: str = DEFAULT_AUDIO_MIME_TYPE) -> str:
    """
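    Create a voice and return the voice parameter.
    """
    # API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key = "sk-xxx"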
    api_key = os.getenv("DASHSCOPE_API_KEY")

    file_path_obj = pathlib.Path(file_path)
    if not file_path_obj.exists():
        raise FileNotFoundError(f"Audio file not found: {file_path}")

    base64_str = base64.b64encode(file_path_obj.read_bytes()).decode()
    data_uri = f"data:{audio_mime_type};base64,{base64_str}"

    # The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
    url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
    payload = {
        "model": "qwen-voice-enrollment",  # Do not change this value
        "input": {
            "action": "create",
            "target_model": target_model,
            "preferred_name": preferred_name,
            "audio": {"data": data_uri}
        }
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    resp = requests.post(url, json=payload, headers=headers)
    if resp.status_code != 200:
        raise RuntimeError(f"Failed to create voice: {resp.status_code}, {resp.text}")

    try:
        return resp.json()["output"]["voice"]
    except (KeyError, ValueError) as e:
        raise RuntimeError(f"Failed to parse the voice response: {e}")


if __name__ == '__main__':
    # The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1
    dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

    text = "今天天氣怎麼樣？"
    # To use the SpeechSynthesizer interface instead: dashscope.audio.qwen_tts.SpeechSynthesizer.call(...)
    response = dashscope.MultiModalConversation.call(
        model=DEFAULT_TARGET_MODEL,
        # API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        # If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        text=text,
        voice=create_voice(VOICE_FILE_PATH),  # Use the custom voice produced by voice cloning as the voice parameter
        stream=False
    )
    print(response)

Java

The Gson dependency is required. If you use Maven or Gradle, add it as follows:

Maven

Add the following to pom.xml:

<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.13.1</version>
</dependency>

Gradle

Add the following to build.gradle:

// https://mvnrepository.com/artifact/com.google.code.gson/gson
implementation("com.google.code.gson:gson:2.13.1")

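Important

When you synthesize speech with a custom voice produced by voice cloning, you must set the voice as follows:

MultiModalConversationParam param = MultiModalConversationParam.builder()
                .parameter("voice", "your_voice") // Replace with the custom voice produced by voice cloning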
                .build();

import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.utils.Constants;
import com.google.gson.Gson;
import com.google.gson.JsonObject;

import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.*;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Main {
    // ===== Constants =====
    // Voice cloning and speech synthesis must use the same model
    private static final String TARGET_MODEL = "qwen3-tts-vc-2026-01-22";
    private static final String PREFERRED_NAME = "guanyu";
    // Relative path of the local audio file used for voice cloning
    private static final String AUDIO_FILE = "voice.mp3";
    private static final String AUDIO_MIME_TYPE = "audio/mpeg";

    // Build the data URI
    public static String toDataUrl(String filePath) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        String encoded = Base64.getEncoder().encodeToString(bytes);
        return "data:" + AUDIO_MIME_TYPE + ";base64," + encoded;
    }

    // Call the API to create a voice
    public static String createVoice() throws Exception {
        // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        String jsonPayload =
                "{"
                        + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
                        + "\"input\": {"
                        +     "\"action\": \"create\","
                        +     "\"target_model\": \"" + TARGET_MODEL + "\","
                        +     "\"preferred_name\": \"" + PREFERRED_NAME + "\","
                        +     "\"audio\": {"
                        +         "\"data\": \"" + toDataUrl(AUDIO_FILE) + "\""
                        +     "}"
                        + "}"
                        + "}";

        // The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
        String url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Authorization", "Bearer " + apiKey);
        con.setRequestProperty("Content-Type", "application/json");
        con.setDoOutput(true);

        try (OutputStream os = con.getOutputStream()) {
            os.write(jsonPayload.getBytes(StandardCharsets.UTF_8));
        }

        int status = con.getResponseCode();
        System.out.println("HTTP status code: " + status);

        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(),
                        StandardCharsets.UTF_8))) {
            StringBuilder response = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                response.append(line);
            }
            System.out.println("Response body: " + response);

            if (status == 200) {
                JsonObject jsonObj = new Gson().fromJson(response.toString(), JsonObject.class);
                return jsonObj.getAsJsonObject("output").get("voice").getAsString();
            }
            throw new IOException("Failed to create voice: " + status + " - " + response);
        }
    }

    public static void call() throws Exception {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
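                // API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
                // If the environment variable is not set, replace the next line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model(TARGET_MODEL)
                .text("今天天氣怎麼樣？")
                .parameter("voice", createVoice()) // Use the custom voice produced by voice cloning as the voice parameter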
                .build();
        MultiModalConversationResult result = conv.call(param);
        String audioUrl = result.getOutput().getAudio().getUrl();
        System.out.print(audioUrl);

        // Download the audio file to the local disk
        try (InputStream in = new URL(audioUrl).openStream();
             FileOutputStream out = new FileOutputStream("downloaded_audio.wav")) {
            byte[] buffer = new byte[1024];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
            System.out.println("\nAudio file downloaded to: downloaded_audio.wav");
        } catch (Exception e) {
            System.out.println("\nError while downloading the audio file: " + e.getMessage());
        }
    }
    public static void main(String[] args) {
        try {
            // The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1
            Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
            call();
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

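API reference

When you call different APIs, make sure you use the same account for all of them.

Create a voice

Upload the audio to clone and create a custom voice.

URL

Chinese mainland:

POST https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization

Request headers

Parameter	Type	Required	Description
Authorization	string	Yes	Authentication token in the form `Bearer <your_api_key>`. Replace `<your_api_key>` with your actual API key.
Content-Type	string	Yes	Media type of the data in the request body. Always `application/json`.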
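
The endpoint and headers above can be assembled in one place so that a missing API key fails early instead of sending `Bearer None`. This is a minimal sketch: the region labels `"cn"` and `"intl"` and the function name `build_request_target` are assumptions made for illustration, while the URLs, header names, and the DASHSCOPE_API_KEY environment variable come from this document.

```python
import os
from typing import Optional

ENDPOINTS = {
    # Hypothetical region labels mapped to the two documented endpoints
    "cn": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "intl": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}


def build_request_target(region: str, api_key: Optional[str] = None) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for the create-voice request."""
    if region not in ENDPOINTS:
        raise ValueError(f"unknown region: {region}")
    key = api_key or os.getenv("DASHSCOPE_API_KEY")
    if not key:
        raise RuntimeError("no API key: set DASHSCOPE_API_KEY or pass api_key")
    headers = {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
    return ENDPOINTS[region], headers
```

Because API keys are region-specific, the key passed in must belong to the same region as the selected endpoint.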

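Request body

The request body with all request parameters is shown below. Optional fields can be omitted as needed.

Important

Do not confuse the following parameters:

model: the voice cloning model, always qwen-voice-enrollment
target_model: the speech synthesis model that drives the voice; it must match the speech synthesis model used in the later synthesis call, otherwise synthesis fails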

{
    "model": "qwen-voice-enrollment",
    "input": {
        "action": "create",
        "target_model": "qwen3-tts-vc-realtime-2026-01-15",
        "preferred_name": "guanyu",
        "audio": {
            "data": "https://xxx.wav"
        },
        "text": "Optional. The text that corresponds to audio.data",
        "language": "Optional. The language of audio.data, for example zh"
    }
}

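Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	The voice cloning model. Always `qwen-voice-enrollment`.
action	string	-	Yes	The operation type. Always `create`.
target_model	string	-	Yes	The speech synthesis model that drives the voice. Supported models (two families): Qwen3-TTS-VC-Realtime (see Realtime speech synthesis - Qwen): qwen3-tts-vc-realtime-2026-01-15, qwen3-tts-vc-realtime-2025-11-27; Qwen3-TTS-VC (see Speech synthesis - Qwen): qwen3-tts-vc-2026-01-22. It must match the speech synthesis model used in the later synthesis call, otherwise synthesis fails.
preferred_name	string	-	Yes	An easily recognizable name for the voice (digits, uppercase and lowercase letters, and underscores only; at most 16 characters). Choose an identifier related to the character or scenario. This keyword appears in the cloned voice name; for example, with the keyword "guanyu" the final voice name is "qwen-tts-vc-guanyu-voice-20250812105009984-838b".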
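audio.data	string	-	Yes	The audio to clone (follow the recording guide when recording; the audio must meet the audio requirements). Submit the audio in one of two ways: (1) Data URL in the form `data:<mediatype>;base64,<data>`, where `<mediatype>` is the MIME type (WAV: `audio/wav`, MP3: `audio/mpeg`, M4A: `audio/mp4`) and `<data>` is the Base64-encoded audio. Base64 encoding increases the size, so keep the original file small enough that the encoded data stays under 10 MB. Example: `data:audio/wav;base64,SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjU4LjI5LjEwMAAAAAAAAAAAAAAA//PAxABQ/BXRbMPe4IQAhl9`. (2) Audio URL (uploading the audio to OSS is recommended). The file must not exceed 10 MB, and the URL must be publicly accessible without authentication.
text	string	-	No	The text that matches the content of the `audio.data` audio. When this parameter is provided, the server compares the audio with the text; if they differ too much, Audio.PreprocessError is returned.
language	string	-	No	The language of the `audio.data` audio. Supported values: `zh` (Chinese), `en` (English), `de` (German), `it` (Italian), `pt` (Portuguese), `es` (Spanish), `ja` (Japanese), `ko` (Korean), `fr` (French), `ru` (Russian). If you set this parameter, the language must match the actual language of the audio being cloned.

Sample code for building the audio.data Data URL

Python

import base64, pathlib

# input.mp3 is the local audio file used for voice cloning. Replace it with the path of your own file and make sure it meets the audio requirements
file_path = pathlib.Path("input.mp3")
base64_str = base64.b64encode(file_path.read_bytes()).decode()
data_uri = f"data:audio/mpeg;base64,{base64_str}"

Java

import java.nio.file.*;
import java.util.Base64;

public class Main {
    /**
     * filePath is the local audio file used for voice cloning. Replace it with the path of your own file and make sure it meets the audio requirements
     */
    public static String toDataUrl(String filePath) throws Exception {
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        String encoded = Base64.getEncoder().encodeToString(bytes);
        return "data:audio/mpeg;base64," + encoded;
    }

    // Usage example
    public static void main(String[] args) throws Exception {
        System.out.println(toDataUrl("input.mp3"));
    }
}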
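
The constraints in the request parameter table (name pattern, MIME type per format, encoded size) can be enforced before the request is sent. The sketch below is one possible way to do that; `to_data_url` and `build_create_payload` are illustrative names rather than SDK functions, and the 10 MB check on the encoded string follows the Base64 note in the audio.data description.

```python
import base64
import re
from typing import Optional

MIME_TYPES = {".wav": "audio/wav", ".mp3": "audio/mpeg", ".m4a": "audio/mp4"}
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,16}")  # digits, letters, underscore; max 16 chars
MAX_ENCODED_BYTES = 10 * 1024 * 1024              # encoded data must stay under 10 MB


def to_data_url(audio_bytes: bytes, suffix: str) -> str:
    """Encode raw audio bytes as a Data URL, choosing the MIME type by file suffix."""
    mime = MIME_TYPES.get(suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported audio format: {suffix}")
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    if len(encoded) >= MAX_ENCODED_BYTES:
        raise ValueError("Base64-encoded audio is 10 MB or larger")
    return f"data:{mime};base64,{encoded}"


def build_create_payload(target_model: str,
                         preferred_name: str,
                         audio_data: str,
                         text: Optional[str] = None,
                         language: Optional[str] = None) -> dict:
    """Build the JSON body for the create-voice request."""
    if not NAME_PATTERN.fullmatch(preferred_name):
        raise ValueError("preferred_name must be 1-16 letters, digits, or underscores")
    request_input = {
        "action": "create",
        "target_model": target_model,
        "preferred_name": preferred_name,
        "audio": {"data": audio_data},  # Data URL or a public audio URL
    }
    # Optional fields are included only when provided
    if text is not None:
        request_input["text"] = text
    if language is not None:
        request_input["language"] = language
    return {"model": "qwen-voice-enrollment", "input": request_input}
```

The resulting dict can be passed as the `json=` argument of `requests.post`, as the samples in this document do.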

Response parameters

Sample response:

{
    "output": {
        "voice": "yourVoice",
        "target_model": "qwen3-tts-vc-realtime-2026-01-15"
    },
    "usage": {
        "count": 1
    },
    "request_id": "yourRequestId"
}

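The key parameters are as follows:

Parameter	Type	Description
voice	string	The voice name. It can be used directly as the `voice` parameter of the speech synthesis API.
target_model	string	The speech synthesis model that drives the voice. Supported models (two families): Qwen3-TTS-VC-Realtime (see Realtime speech synthesis - Qwen): qwen3-tts-vc-realtime-2026-01-15, qwen3-tts-vc-realtime-2025-11-27; Qwen3-TTS-VC (see Speech synthesis - Qwen): qwen3-tts-vc-2026-01-22. It must match the speech synthesis model used in the later synthesis call, otherwise synthesis fails.
request_id	string	The request ID.
count	integer	The number of "create voice" operations actually billed for this request. The fee for this request is $ $\text{count} \times 0.01$. When creating a voice, count is always 1.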
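
A response in the shape shown above can be unpacked as follows. This is a small sketch rather than SDK code: `parse_create_response` is an illustrative name, and the per-creation price constant simply restates the count description, so the currency and amount should be confirmed against the current billing page.

```python
import json

PRICE_PER_CREATION = 0.01  # fee per billed "create voice" operation, per the count description


def parse_create_response(body: str) -> dict:
    """Extract the fields of interest from a create-voice response body."""
    data = json.loads(body)
    try:
        output = data["output"]
        return {
            "voice": output["voice"],                # pass this to the synthesis API
            "target_model": output["target_model"],  # synthesis must use this model
            "request_id": data["request_id"],
            "fee": data["usage"]["count"] * PRICE_PER_CREATION,
        }
    except KeyError as exc:
        raise ValueError(f"unexpected response, missing field: {exc}") from exc
```

Keeping `target_model` next to `voice` makes it easy to pass the matching model to the later synthesis call.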

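Sample code

Important

Do not confuse the following parameters:

model: the voice cloning model, always qwen-voice-enrollment
target_model: the speech synthesis model that drives the voice; it must match the speech synthesis model used in the later synthesis call, otherwise synthesis fails

cURL

If the API key is not stored in an environment variable, replace $DASHSCOPE_API_KEY in the sample with your actual API key.

# ======= Important =======
# The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
# API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# === Delete these comments before running ===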

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-voice-enrollment",
    "input": {
        "action": "create",
        "target_model": "qwen3-tts-vc-realtime-2026-01-15",
        "preferred_name": "guanyu",
        "audio": {
            "data": "https://xxx.wav"
        }
    }
}'

Python

import os
import requests
import base64, pathlib

target_model = "qwen3-tts-vc-realtime-2026-01-15"
preferred_name = "guanyu"
audio_mime_type = "audio/mpeg"

file_path = pathlib.Path("input.mp3")
base64_str = base64.b64encode(file_path.read_bytes()).decode()
data_uri = f"data:{audio_mime_type};base64,{base64_str}"

# API keys differ between the Singapore and Beijing regions. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# If the environment variable is not set, replace the next line with your Model Studio API key: api_key = "sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY")
# The URL below is for the Singapore region. For models in the Beijing region, use: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"

payload = {
    "model": "qwen-voice-enrollment",  # Do not change this value
    "input": {
        "action": "create",
        "target_model": target_model,
        "preferred_name": preferred_name,
        "audio": {
            "data": data_uri
        }
    }
}

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

# Send the POST request
resp = requests.post(url, json=payload, headers=headers)

if resp.status_code == 200:
    data = resp.json()
    voice = data["output"]["voice"]
    print(f"Generated voice parameter: {voice}")
else:
    print("Request failed:", resp.status_code, resp.text)

Java

import com.google.gson.Gson;
import com.google.gson.JsonObject;

import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.*;
import java.util.Base64;

public class Main {
    private static final String TARGET_MODEL = "qwen3-tts-vc-realtime-2026-01-15";
    private static final String PREFERRED_NAME = "guanyu";
    private static final String AUDIO_FILE = "input.mp3";
    private static final String AUDIO_MIME_TYPE = "audio/mpeg";

    public static String toDataUrl(String filePath) throws Exception {
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        String encoded = Base64.getEncoder().encodeToString(bytes);
        return "data:" + AUDIO_MIME_TYPE + ";base64," + encoded;
    }

    public static void main(String[] args) {
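        // API Keys differ between the Singapore and Beijing regions. Get an API Key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API Key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        // The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization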
        String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";

        try {
            // Build the JSON request body (inner quotes must be escaped)
            String jsonPayload =
                    "{"
                            + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
                            + "\"input\": {"
                            +     "\"action\": \"create\","
                            +     "\"target_model\": \"" + TARGET_MODEL + "\","
                            +     "\"preferred_name\": \"" + PREFERRED_NAME + "\","
                            +     "\"audio\": {"
                            +         "\"data\": \"" + toDataUrl(AUDIO_FILE) + "\""
                            +     "}"
                            + "}"
                            + "}";

            HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Authorization", "Bearer " + apiKey);
            con.setRequestProperty("Content-Type", "application/json");
            con.setDoOutput(true);

            // Send the request body
            try (OutputStream os = con.getOutputStream()) {
                os.write(jsonPayload.getBytes("UTF-8"));
            }

            int status = con.getResponseCode();
            InputStream is = (status >= 200 && status < 300)
                    ? con.getInputStream()
                    : con.getErrorStream();

            StringBuilder response = new StringBuilder();
            try (BufferedReader br = new BufferedReader(new InputStreamReader(is, "UTF-8"))) {
                String line;
                while ((line = br.readLine()) != null) {
                    response.append(line);
                }
            }

            System.out.println("HTTP status code: " + status);
            System.out.println("Response body: " + response.toString());

            if (status == 200) {
                // Parse the JSON
                Gson gson = new Gson();
                JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
                String voice = jsonObj.getAsJsonObject("output").get("voice").getAsString();
                System.out.println("Generated voice parameter: " + voice);
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

List voices

Query the list of created voices, page by page.

URL

Chinese mainland:

POST https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization

Request headers

Parameter	Type	Required	Description
Authorization	string	Yes	Authentication token in the format `Bearer <your_api_key>`; replace `<your_api_key>` with your actual API Key.
Content-Type	string	Yes	Media type of the data sent in the request body. Fixed at `application/json`.

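Message body
The message body below contains all request parameters; optional fields can be omitted as your use case requires.
Important
model: the voice cloning model, fixed at qwen-voice-enrollment. Do not change it.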
```
{
    "model": "qwen-voice-enrollment",
    "input": {
        "action": "list",
        "page_size": 2,
        "page_index": 0
    }
}
```

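Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice cloning model. Fixed at `qwen-voice-enrollment`.
action	string	-	Yes	Operation type. Fixed at `list`.
page_index	integer	0	No	Page index. Valid range: [0, 1000000].
page_size	integer	10	No	Number of entries per page. Valid range: [0, 1000000].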
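
The paging parameters above can be combined into a loop that walks every page. The following is a minimal sketch rather than an official client: `fetch_page` is a hypothetical callable that sends the payload and returns the parsed JSON response, `iter_voices` and `build_list_payload` are illustrative names, and the stop condition (a page with fewer entries than `page_size` is the last page) is an assumption that this document does not confirm. A stub stands in for the network call so the sketch runs offline.

```python
from typing import Callable, Dict, Iterator


def build_list_payload(page_index: int, page_size: int = 10) -> Dict:
    """Build the message body for one page of the voice list."""
    return {
        "model": "qwen-voice-enrollment",  # fixed value, per this document
        "input": {
            "action": "list",
            "page_size": page_size,
            "page_index": page_index,
        },
    }


def iter_voices(fetch_page: Callable[[Dict], Dict], page_size: int = 10) -> Iterator[Dict]:
    """Yield every voice entry, requesting pages until a short page arrives.

    ASSUMPTION: a page with fewer than page_size entries is the last page.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1 to make progress")
    page_index = 0
    while True:
        response = fetch_page(build_list_payload(page_index, page_size))
        voices = response.get("output", {}).get("voice_list", [])
        yield from voices
        if len(voices) < page_size:
            break
        page_index += 1


def stub_fetch_page(payload: Dict) -> Dict:
    """Hypothetical stand-in for the HTTP call: serves five fake voices."""
    all_voices = [{"voice": f"v{i}"} for i in range(5)]
    size = payload["input"]["page_size"]
    start = payload["input"]["page_index"] * size
    return {"output": {"voice_list": all_voices[start:start + size]}}


names = [item["voice"] for item in iter_voices(stub_fetch_page, page_size=2)]
print(names)
```

In real use, `fetch_page` could wrap `requests.post(url, json=payload, headers=headers).json()` with the URL and headers shown in the sample code of this section.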

Response parameters

Sample response:

{
    "output": {
        "voice_list": [
            {
                "voice": "yourVoice1",
                "gmt_create": "2025-08-11 17:59:32",
                "target_model": "qwen3-tts-vc-realtime-2026-01-15"
            },
            {
                "voice": "yourVoice2",
                "gmt_create": "2025-08-11 17:38:10",
                "target_model": "qwen3-tts-vc-realtime-2026-01-15"
            }
        ]
    },
    "usage": {
        "count": 0
    },
    "request_id": "yourRequestId"
}

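Key parameters:

Parameter	Type	Description
voice	string	Voice name; pass it directly as the `voice` parameter of the speech synthesis API.
gmt_create	string	Time the voice was created.
target_model	string	Speech synthesis model that drives the voice. Supported models fall into two families: Qwen3-TTS-VC-Realtime (see Realtime speech synthesis - Qwen): qwen3-tts-vc-realtime-2026-01-15, qwen3-tts-vc-realtime-2025-11-27; Qwen3-TTS-VC (see Speech synthesis - Qwen): qwen3-tts-vc-2026-01-22. It must match the model used in later calls to the speech synthesis API, otherwise synthesis fails.
request_id	string	Request ID.
count	integer	Number of "create voice" operations actually billed for this request; the cost of this request is $\text{count} \times 0.01$ USD. Listing voices is free, so `count` is always 0.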
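
Because a voice only works with the synthesis model named in its `target_model`, it can help to filter the list result before synthesis. The sketch below assumes the response shape shown in the sample response; `voices_for_model` is a hypothetical helper name, not part of any SDK.

```python
from typing import Dict, List


def voices_for_model(list_response: Dict, synthesis_model: str) -> List[str]:
    """Return the voice names whose target_model equals synthesis_model."""
    voice_list = list_response.get("output", {}).get("voice_list", [])
    return [
        item["voice"]
        for item in voice_list
        if item.get("target_model") == synthesis_model
    ]


# Response shape copied from the sample response in this section.
sample_response = {
    "output": {
        "voice_list": [
            {
                "voice": "yourVoice1",
                "gmt_create": "2025-08-11 17:59:32",
                "target_model": "qwen3-tts-vc-realtime-2026-01-15",
            },
            {
                "voice": "yourVoice2",
                "gmt_create": "2025-08-11 17:38:10",
                "target_model": "qwen3-tts-vc-realtime-2026-01-15",
            },
        ]
    },
    "usage": {"count": 0},
    "request_id": "yourRequestId",
}

realtime_voices = voices_for_model(sample_response, "qwen3-tts-vc-realtime-2026-01-15")
non_realtime_voices = voices_for_model(sample_response, "qwen3-tts-vc-2026-01-22")
print(realtime_voices)
print(non_realtime_voices)
```

An empty result for a given model means a voice would have to be created with that model as `target_model` before it can be used for synthesis.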

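Sample code

Important

model: the voice cloning model, fixed at qwen-voice-enrollment. Do not change it.

cURL

If the API Key is not configured as an environment variable, replace $DASHSCOPE_API_KEY in the sample with your actual API Key.

# ======= Important =======
# The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
# API Keys differ between the Singapore and Beijing regions. Get an API Key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# === Delete this comment before running ===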

curl --location --request POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen-voice-enrollment",
    "input": {
        "action": "list",
        "page_size": 10,
        "page_index": 0
    }
}'

Python

import os
import requests

# API Keys differ between the Singapore and Beijing regions. Get an API Key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# If the environment variable is not set, replace the next line with your Model Studio API Key: api_key = "sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY")
# The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"

payload = {
    "model": "qwen-voice-enrollment", # Do not change this value
    "input": {
        "action": "list",
        "page_size": 10,
        "page_index": 0
    }
}

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print("HTTP status code:", response.status_code)

if response.status_code == 200:
    data = response.json()
    voice_list = data["output"]["voice_list"]

    print("Voices found:")
    for item in voice_list:
        print(f"- Voice: {item['voice']}  Created: {item['gmt_create']}  Model: {item['target_model']}")
else:
    print("Request failed:", response.text)

Java

import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class Main {
    public static void main(String[] args) {
        // API Keys differ between the Singapore and Beijing regions. Get an API Key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API Key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        // The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
        String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";

        // JSON request body (older Java versions have no """ text blocks)
        String jsonPayload =
                "{"
                        + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
                        + "\"input\": {"
                        +     "\"action\": \"list\","
                        +     "\"page_size\": 10,"
                        +     "\"page_index\": 0"
                        + "}"
                        + "}";

        try {
            HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Authorization", "Bearer " + apiKey);
            con.setRequestProperty("Content-Type", "application/json");
            con.setDoOutput(true);

            try (OutputStream os = con.getOutputStream()) {
                os.write(jsonPayload.getBytes("UTF-8"));
            }

            int status = con.getResponseCode();
            BufferedReader br = new BufferedReader(new InputStreamReader(
                    status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(), "UTF-8"));

            StringBuilder response = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                response.append(line);
            }
            br.close();

            System.out.println("HTTP status code: " + status);
            System.out.println("Response JSON: " + response.toString());

            if (status == 200) {
                Gson gson = new Gson();
                JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
                JsonArray voiceList = jsonObj.getAsJsonObject("output").getAsJsonArray("voice_list");

                System.out.println("\nVoices found:");
                for (int i = 0; i < voiceList.size(); i++) {
                    JsonObject voiceItem = voiceList.get(i).getAsJsonObject();
                    String voice = voiceItem.get("voice").getAsString();
                    String gmtCreate = voiceItem.get("gmt_create").getAsString();
                    String targetModel = voiceItem.get("target_model").getAsString();

                    System.out.printf("- Voice: %s  Created: %s  Model: %s\n",
                            voice, gmtCreate, targetModel);
                }
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

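Delete voice

Delete the specified voice and release the quota it occupies.

URL

Chinese mainland:

POST https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization

Request headers

Parameter	Type	Required	Description
Authorization	string	Yes	Authentication token in the format `Bearer <your_api_key>`; replace `<your_api_key>` with your actual API Key.
Content-Type	string	Yes	Media type of the data sent in the request body. Fixed at `application/json`.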

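Message body
The message body below contains all request parameters; optional fields can be omitted as your use case requires:
Important
model: the voice cloning model, fixed at qwen-voice-enrollment. Do not change it.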
```
{
    "model": "qwen-voice-enrollment",
    "input": {
        "action": "delete",
        "voice": "yourVoice"
    }
}
```

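Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice cloning model. Fixed at `qwen-voice-enrollment`.
action	string	-	Yes	Operation type. Fixed at `delete`.
voice	string	-	Yes	The voice to delete.

Response parameters

Sample response: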

{
    "usage": {
        "count": 0
    },
    "request_id": "yourRequestId"
}

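Key parameters:

Parameter	Type	Description
request_id	string	Request ID.
count	integer	Number of "create voice" operations actually billed for this request; the cost of this request is $\text{count} \times 0.01$ USD. Deleting a voice is free, so count is always 0.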

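Sample code

Important

model: the voice cloning model, fixed at qwen-voice-enrollment. Do not change it.

cURL

If the API Key is not configured as an environment variable, replace $DASHSCOPE_API_KEY in the sample with your actual API Key.

# ======= Important =======
# The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
# API Keys differ between the Singapore and Beijing regions. Get an API Key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# === Delete this comment before running ===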

curl --location --request POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen-voice-enrollment",
    "input": {
        "action": "delete",
        "voice": "yourVoice"
    }
}'

Python

import os
import requests

# API Keys differ between the Singapore and Beijing regions. Get an API Key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
# If the environment variable is not set, replace the next line with your Model Studio API Key: api_key = "sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY")
# The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"

voice_to_delete = "yourVoice"  # Voice to delete (replace with a real value)

payload = {
    "model": "qwen-voice-enrollment", # Do not change this value
    "input": {
        "action": "delete",
        "voice": voice_to_delete
    }
}

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print("HTTP status code:", response.status_code)

if response.status_code == 200:
    data = response.json()
    request_id = data["request_id"]

    print("Deleted successfully")
    print(f"Request ID: {request_id}")
else:
    print("Request failed:", response.text)

Java

import com.google.gson.Gson;
import com.google.gson.JsonObject;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class Main {
    public static void main(String[] args) {
        // API Keys differ between the Singapore and Beijing regions. Get an API Key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API Key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        // The URL below is for the Singapore region. To use models in the Beijing region, replace it with: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
        String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";
        String voiceToDelete = "yourVoice"; // Voice to delete (replace with a real value)

        // Build the JSON request body (string concatenation, compatible with Java 8)
        String jsonPayload =
                "{"
                        + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
                        + "\"input\": {"
                        +     "\"action\": \"delete\","
                        +     "\"voice\": \"" + voiceToDelete + "\""
                        + "}"
                        + "}";

        try {
            // Open the POST connection
            HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Authorization", "Bearer " + apiKey);
            con.setRequestProperty("Content-Type", "application/json");
            con.setDoOutput(true);

            // Send the request body
            try (OutputStream os = con.getOutputStream()) {
                os.write(jsonPayload.getBytes("UTF-8"));
            }

            int status = con.getResponseCode();
            BufferedReader br = new BufferedReader(new InputStreamReader(
                    status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(), "UTF-8"));

            StringBuilder response = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                response.append(line);
            }
            br.close();

            System.out.println("HTTP status code: " + status);
            System.out.println("Response JSON: " + response.toString());

            if (status == 200) {
                Gson gson = new Gson();
                JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
                String requestId = jsonObj.get("request_id").getAsString();

                System.out.println("Deleted successfully");
                System.out.println("Request ID: " + requestId);
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

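Speech synthesis

To synthesize personalized speech with the custom voice produced by voice cloning, see Quick start: from cloning to synthesis.

The speech synthesis models used with voice cloning (such as qwen3-tts-vc-realtime-2026-01-15) are dedicated models: they support only voices produced by cloning and do not support system voices such as Chelsie, Serena, Ethan, or Cherry.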

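Voice quota and automatic cleanup

Total limit: 1,000 voices per account
The API does not currently provide a way to query the number of voices; call the List voices API and count the voices yourself
Automatic cleanup: a voice that has not been used in any speech synthesis request during the past year is deleted automatically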
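
Since the voice count has to be derived from the list results, a small helper can turn it into the remaining quota. This is a minimal sketch: `remaining_voice_quota` is a hypothetical name, the input is assumed to be the parsed list responses for all pages, and only the limit of 1,000 voices per account comes from this document.

```python
from typing import Dict, Iterable

VOICE_QUOTA_PER_ACCOUNT = 1000  # limit stated in this document


def remaining_voice_quota(
    list_responses: Iterable[Dict], quota: int = VOICE_QUOTA_PER_ACCOUNT
) -> int:
    """Count voices across list-response pages and return the quota left."""
    used = sum(
        len(page.get("output", {}).get("voice_list", []))
        for page in list_responses
    )
    return max(quota - used, 0)


# Two illustrative pages holding three voices in total.
pages = [
    {"output": {"voice_list": [{"voice": "yourVoice1"}, {"voice": "yourVoice2"}]}},
    {"output": {"voice_list": [{"voice": "yourVoice3"}]}},
]
remaining = remaining_voice_quota(pages)
print(remaining)
```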

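Billing

Voice cloning and speech synthesis are billed separately:

Voice cloning: creating a voice is billed at $0.01 per voice; failed creations are not billed
Note
Free quota (available only in the China site Beijing region and the international site Singapore region):
- Within 90 days of activating Alibaba Cloud Model Studio, you get 1,000 free voice creations.
- Failed creations do not consume the free quota.
- Deleting a voice does not restore the free quota.
- After the free quota is used up or the 90-day validity period ends, creating a voice is billed at $0.01 per voice.
Speech synthesis with a cloned voice: billed by usage (number of text characters); for details, see Realtime speech synthesis - Qwen or Speech synthesis - Qwen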
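
The voice creation rules above can be expressed as a short cost estimate. This is an illustrative sketch under stated assumptions, not an official billing calculator: `estimate_creation_cost` and its parameters are hypothetical names, the free quota is assumed to remain usable through day 90 inclusive, and it applies only in the regions that offer a free quota. Speech synthesis charges are not covered.

```python
from decimal import Decimal

FREE_PERIOD_DAYS = 90                    # validity period stated in this document
PRICE_PER_VOICE_USD = Decimal("0.01")    # price stated in this document


def estimate_creation_cost(
    successful_creations: int, free_remaining: int, days_since_activation: int
) -> Decimal:
    """Estimate the USD charge for a batch of successful voice creations.

    Failed creations are not billed, so only successful ones are counted.
    """
    if min(successful_creations, free_remaining, days_since_activation) < 0:
        raise ValueError("all arguments must be non-negative")
    # ASSUMPTION: the free quota is usable through day 90 inclusive.
    if days_since_activation <= FREE_PERIOD_DAYS:
        free_available = free_remaining
    else:
        free_available = 0
    billable = max(successful_creations - free_available, 0)
    return billable * PRICE_PER_VOICE_USD


within_period = estimate_creation_cost(5, free_remaining=3, days_since_activation=10)
after_period = estimate_creation_cost(5, free_remaining=3, days_since_activation=120)
print(within_period, after_period)
```

`Decimal` is used instead of `float` so that amounts in cents are represented exactly.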

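Copyright and legality

You are responsible for the ownership of, and the legal right to use, the voice you provide. Please read the service agreement.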

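Recording guide

Recording equipment

Use a microphone with noise reduction, or record at close range with a phone in a quiet environment, to keep the source audio clean.

Recording environment

Venue

Record in a small enclosed space of no more than 10 square meters.
Prefer rooms with sound-absorbing materials (such as acoustic foam, carpet, or curtains).
Avoid highly reverberant places such as empty halls, meeting rooms, and classrooms.

Noise control

Outdoor noise: close doors and windows to keep out traffic, construction, and other interference.
Indoor noise: turn off air conditioners, fans, fluorescent lamp ballasts, and similar devices; to find hidden noise sources, record the ambient sound with a phone and play it back at high volume.

Reverberation control

Reverberation blurs the sound and reduces clarity.
Reduce reflections from smooth surfaces: draw the curtains, open wardrobe doors, and cover desks and cabinet tops with clothing or sheets.
Use irregular objects (such as bookshelves and upholstered furniture) to diffuse sound waves.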

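Recording script

The script content is flexible; ideally it matches the target scenario (for example, a customer service dialogue style for customer service use). It must not contain sensitive or illegal wording (such as political, pornographic, or violent content), otherwise cloning fails.
Avoid short utterances (such as "hello" or "yes"); use complete sentences.
Keep the content semantically coherent and avoid frequent pauses when reading (at least 3 seconds of uninterrupted speech is recommended).
You may convey the target emotion (such as friendly or serious), but avoid exaggerated, theatrical delivery and keep the intonation natural.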

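Practical tips

Using an ordinary bedroom as an example:

Close doors and windows to block outside noise.
Turn off air conditioners, fans, and other appliances.
Draw the curtains to reduce reflections from glass.
Lay clothing or a blanket on the desk to reduce reflections from its surface.
Read through the script in advance, settle on the character's tone, and deliver it naturally.
Keep about 10 cm from the recording device to avoid plosive pops or a signal that is too weak.

Error messages

If you encounter an error, see Error messages for troubleshooting.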