Customize hotwords

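If speech recognition performs poorly on words or phrases specific to your business domain, you can add them to a custom vocabulary. Terms in the vocabulary are prioritized during recognition, which improves accuracy.

Important

This document applies only to the China (Beijing) region. To use the models, you need an API key from the China (Beijing) region.

Overview

In the SDK (Software Development Kit), a custom vocabulary is represented as a list of terms. The list is a JSON array, and each term is an object with the following fields.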

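Field	Type	Required	Description
text	string	Yes	The term to add to the custom vocabulary. Each term cannot exceed 15 Chinese characters or 7 English words. If a term contains both Chinese and English, the total number of Chinese characters and English letters cannot exceed 15.
weight	int	Yes	The weight of the term. It must be an integer from 1 to 5. A typical value is 4. If recognition accuracy does not improve, you can increase the weight. However, a weight that is too high may cause other words to be misrecognized.
lang	string	No	The language code. Use it to boost a term for a specific language supported by the automatic speech recognition (ASR) model. For the supported languages and their codes, see the API reference of the model. For the boost to take effect, you must specify the same language in the `language_hints` parameter of the recognition request. Terms in other languages are ignored.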
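
The field rules in the table can be checked on the client before a vocabulary is submitted. The following is a minimal sketch, not part of the SDK: `validate_term` is a hypothetical helper, and it assumes that a "Chinese character" is a code point in the CJK Unified Ideographs block, that an "English letter" is an ASCII letter, and that English words are separated by whitespace. The service may count differently.

```python
import re

# Assumption: "Chinese character" means one CJK unified ideograph.
_CJK = re.compile(r"[\u4e00-\u9fff]")
# Assumption: "English letter" means one ASCII letter.
_LATIN = re.compile(r"[A-Za-z]")


def validate_term(term: dict) -> list:
    """Return a list of problems found in one term; an empty list means it looks valid."""
    problems = []
    text = term.get("text")
    weight = term.get("weight")

    if not isinstance(text, str) or not text.strip():
        problems.append("text is required and must be a non-empty string")
    else:
        cjk = len(_CJK.findall(text))
        latin = len(_LATIN.findall(text))
        if cjk and latin:
            # Mixed term: Chinese characters plus English letters share one budget of 15.
            if cjk + latin > 15:
                problems.append("mixed text exceeds 15 characters and letters")
        elif cjk:
            if cjk > 15:
                problems.append("text exceeds 15 Chinese characters")
        elif len(text.split()) > 7:
            problems.append("text exceeds 7 English words")

    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(weight, int) or isinstance(weight, bool) or not 1 <= weight <= 5:
        problems.append("weight must be an integer from 1 to 5")

    # lang is optional, but must be a string when present.
    if "lang" in term and not isinstance(term["lang"], str):
        problems.append("lang must be a string when present")

    return problems
```

Running the check over every term before calling the service turns a rejected request into a local, readable error list.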

Use case

To improve recognition accuracy for movie titles, add the following titles to a custom vocabulary.

[
    {"text": "赛德克巴莱", "weight": 4, "lang": "zh"},
    {"text": "Seediq Bale", "weight": 4, "lang": "en"},

    {"text": "夏洛特烦恼", "weight": 4, "lang": "zh"},
    {"text": "Goodbye Mr. Loser", "weight": 4, "lang": "en"},

    {"text": "阙里人家", "weight": 4, "lang": "zh"},
    {"text": "Confucius' Family", "weight": 4, "lang": "en"}
]
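
A list like the one above can also be generated from pairs of titles rather than typed by hand. The following is an illustrative sketch; `bilingual_terms` is a hypothetical helper, not an SDK function, and it assumes every title has one Chinese and one English form with the same weight.

```python
import json


def bilingual_terms(titles, weight=4):
    """Expand (Chinese title, English title) pairs into vocabulary terms."""
    terms = []
    for zh_title, en_title in titles:
        terms.append({"text": zh_title, "weight": weight, "lang": "zh"})
        terms.append({"text": en_title, "weight": weight, "lang": "en"})
    return terms


titles = [
    ("赛德克巴莱", "Seediq Bale"),
    ("夏洛特烦恼", "Goodbye Mr. Loser"),
    ("阙里人家", "Confucius' Family"),
]
vocabulary = bilingual_terms(titles)

# ensure_ascii=False keeps the Chinese titles readable in the printed JSON.
print(json.dumps(vocabulary, ensure_ascii=False, indent=4))
```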

Supported models

Real-time speech recognition: paraformer-realtime-v2, paraformer-realtime-8k-v2, fun-asr-realtime, fun-asr-realtime-2025-11-07, and fun-asr-realtime-2025-09-15
Batch speech recognition: paraformer-v2, paraformer-8k-v2, fun-asr, fun-asr-2025-11-07, and fun-asr-2025-08-25

Billing

Custom vocabularies are currently free of charge.

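Vocabulary limits

Each account can create up to 10 vocabularies. To raise this limit, submit a request.
The number of terms that each vocabulary can contain depends on the model.
- Fun-ASR series models:
  - fun-asr and fun-asr-2025-11-07: up to 10,000
  - Other models: up to 1,000
- Paraformer series models: up to 500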
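
The per-model limits above can be encoded in a small lookup so that an oversized vocabulary is caught before it is submitted. This is a sketch under one assumption: the model family is inferred from the model name prefix (`fun-asr` or `paraformer`). Both function names are hypothetical and are not part of the SDK.

```python
def max_terms_for_model(model: str) -> int:
    """Return the documented term limit per vocabulary for a target model."""
    if model in ("fun-asr", "fun-asr-2025-11-07"):
        return 10_000
    # Assumption: every other model whose name starts with "fun-asr" is a Fun-ASR model.
    if model.startswith("fun-asr"):
        return 1_000
    # Assumption: every model whose name starts with "paraformer" is a Paraformer model.
    if model.startswith("paraformer"):
        return 500
    raise ValueError(f"unknown model family: {model}")


def check_vocabulary_size(model: str, vocabulary: list) -> None:
    """Raise ValueError if the vocabulary has more terms than the model allows."""
    limit = max_terms_for_model(model)
    if len(vocabulary) > limit:
        raise ValueError(
            f"{len(vocabulary)} terms exceed the limit of {limit} for {model}"
        )
```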

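Prerequisites

Activate the service and obtain an API key. For details, see "Create and configure an API key".
To reduce the risk of leaks, set the API key as an environment variable. For details, see "Set the API key as an environment variable". You can also write the API key directly in your code, but this increases the risk of leaks.
Install the latest version of the SDK. For details, see "Install the SDK".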
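
Reading the key from the environment, as recommended above, might look like the following sketch. The variable name `DASHSCOPE_API_KEY` is the one used in the cURL examples in this document; `load_api_key` is a hypothetical helper, not an SDK function.

```python
import os


def load_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Return the API key stored in an environment variable, or fail with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

The returned value can then be assigned to `dashscope.api_key` or passed to `VocabularyService(api_key=...)` in place of a hard-coded string.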

Code examples

The following code examples show how to create a vocabulary and use it to recognize a local audio file with the `paraformer-realtime-v2` model.

Python

import dashscope
from dashscope.audio.asr import Recognition, VocabularyService


# Replace with your own API key.
dashscope.api_key = 'your-dashscope-api-key'
prefix = 'prefix'
target_model = "paraformer-realtime-v2"

my_vocabulary = [
    {"text": "Wu Yigong", "weight": 4, "lang": "zh"},
    {"text": "Confucius' Family", "weight": 4, "lang": "zh"},
]


# Create the vocabulary
service = VocabularyService()
vocabulary_id = service.create_vocabulary(
    prefix=prefix,
    target_model=target_model,
    vocabulary=my_vocabulary)

print(f"your vocabulary id is {vocabulary_id}")

# Recognize a file using the vocabulary.
# The language_hints parameter is supported only by the paraformer-v2 and paraformer-realtime-v2 models.
recognition = Recognition(model=target_model,
                          format='wav',
                          sample_rate=16000,
                          callback=None,
                          vocabulary_id=vocabulary_id,
                          language_hints=['zh'])
result = recognition.call('your-audio-file.wav')
print(result.output)

Java

package org.example.customization;

import com.alibaba.dashscope.audio.asr.recognition.Recognition;
import com.alibaba.dashscope.audio.asr.recognition.RecognitionParam;
import com.alibaba.dashscope.audio.asr.vocabulary.Vocabulary;
import com.alibaba.dashscope.audio.asr.vocabulary.VocabularyService;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;

import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class VocabularySampleCodes {
    public static String apiKey = "your-dashscope-apikey";

    public static void main(String[] args) throws NoApiKeyException, InputRequiredException {
        String targetModel = "paraformer-realtime-v2";
        // Prepare the vocabulary
        class Hotword {
            String text;
            int weight;
            String lang;

            public Hotword(String text, int weight, String lang) {
                this.text = text;
                this.weight = weight;
                this.lang = lang;
            }
        }
        JsonArray vocabulary = new JsonArray();
        List<Hotword> wordList = new ArrayList<>();
        wordList.add(new Hotword("Wu Yigong", 4, "zh"));
        wordList.add(new Hotword("Confucius' Family", 4, "zh"));

        for (Hotword word : wordList) {
            JsonObject jsonObject = new JsonObject();
            jsonObject.addProperty("text", word.text);
            jsonObject.addProperty("weight", word.weight);
            jsonObject.addProperty("lang", word.lang);
            vocabulary.add(jsonObject);
        }
        // Create the vocabulary
        VocabularyService service = new VocabularyService(apiKey);
        Vocabulary myVoc = service.createVocabulary(targetModel, "prefix", vocabulary);
        System.out.println("your vocabulary id is " + myVoc.getVocabularyId());
        // Recognize a file using the vocabulary
        Recognition recognizer = new Recognition();
        RecognitionParam param =
                RecognitionParam.builder()
                        .model(targetModel)
                        .format("wav")
                        .sampleRate(16000)
                        .apiKey(apiKey)
                        .vocabularyId(myVoc.getVocabularyId())
                        // The language_hints parameter is supported only by the paraformer-v2 and paraformer-realtime-v2 models.
                        .parameter("language_hints", new String[] {"zh"})
                        .build();
        String result = recognizer.call(param, new File("your-local-audio-file.wav"));
        System.out.println(result);
        System.exit(0);
    }
}

Manage vocabularies with the DashScope SDK

Initialize the object

Python

service = VocabularyService(api_key="your-dashscope-apikey")

Java

VocabularyService service = new VocabularyService("your-dashscope-apikey");

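Create a vocabulary

Python

def create_vocabulary(self, target_model: str, prefix: str, vocabulary: List[dict]) -> str:
    '''
    Creates a vocabulary.
    param: target_model The version of the speech recognition model that the vocabulary is for.
    param: prefix A custom prefix for the vocabulary. It must consist of lowercase letters and digits and be fewer than 10 characters long.
    param: vocabulary The list of terms, each given as a dictionary.
    return: The vocabulary identifier, vocabulary_id.
    '''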

Java

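/**
 * Creates a new vocabulary.
 *
 * @param targetModel The version of the speech recognition model that the vocabulary is for.
 * @param prefix A custom prefix for the vocabulary. It must consist of lowercase letters and digits and be fewer than 10 characters long.
 * @param vocabulary The list of terms.
 * @return The vocabulary object.
 * @throws NoApiKeyException If the API key is empty.
 * @throws InputRequiredException If a required parameter is empty.
 */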
public Vocabulary createVocabulary(String targetModel, String prefix, JsonArray vocabulary)
    throws NoApiKeyException, InputRequiredException
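
The prefix rule described for both SDKs (lowercase letters and digits, fewer than 10 characters) can be checked locally before the vocabulary is created. A minimal sketch, assuming that an empty prefix is not allowed; `is_valid_prefix` is a hypothetical helper, not an SDK function.

```python
import re

# 1 to 9 characters, lowercase ASCII letters and digits only.
# Assumption: the empty string is rejected.
_PREFIX = re.compile(r"[a-z0-9]{1,9}")


def is_valid_prefix(prefix: str) -> bool:
    """Return True if the prefix satisfies the documented format."""
    return _PREFIX.fullmatch(prefix) is not None
```

For example, `testpfx` passes, while `Prefix` (uppercase letter) and `abcdefghij` (10 characters) do not.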

Query all vocabularies

Python

def list_vocabularies(self, prefix=None, page_index: int = 0, page_size: int = 10) -> List[dict]:
    '''
    Queries all created vocabularies.
    param: prefix A custom prefix. If specified, only the identifiers of vocabularies with this prefix are returned.
    param: page_index The index of the page to query.
    param: page_size The page size.
    return: The list of vocabulary identifiers.
    '''

Java

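/**
 * Queries all created vocabularies. The default page index is 0 and the default page size is 10.
 *
 * @param prefix The custom prefix of the vocabularies.
 * @return An array of vocabulary objects.
 * @throws NoApiKeyException If the API key is empty.
 * @throws InputRequiredException If a required parameter is empty.
 */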
public Vocabulary[] listVocabulary(String prefix)
    throws NoApiKeyException, InputRequiredException

/**
 * Queries all created vocabularies.
 *
 * @param prefix The custom prefix of the vocabularies.
 * @param pageIndex The index of the page to query.
 * @param pageSize The page size.
 * @return An array of vocabulary objects.
 * @throws NoApiKeyException If the API key is empty.
 * @throws InputRequiredException If a required parameter is empty.
 */
public Vocabulary[] listVocabulary(String prefix, int pageIndex, int pageSize)
    throws NoApiKeyException, InputRequiredException

Response example:

[
    {
        "gmt_create": "2024-08-21 15:19:09",
        "vocabulary_id": "vocab-xxx-1f8b10e61ac54b1da86a8d5axxxxxxxx",
        "gmt_modified": "2024-08-21 15:19:09",
        "status": "OK"
    },
    {
        "gmt_create": "2024-08-27 11:17:04",
        "vocabulary_id": "vocab-xxx-24ee19fa8cfb4d52902170a0xxxxxxxx",
        "gmt_modified": "2024-08-27 11:17:04",
        "status": "OK"
    }
]
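
Because the query is paginated by `page_index` and `page_size`, collecting every vocabulary means requesting pages in a loop. The following sketch shows one way to do that. `iter_vocabularies` is a hypothetical helper, and it assumes that a page with fewer than `page_size` entries is the last one, which this document does not state.

```python
from typing import Callable, Iterator, List


def iter_vocabularies(fetch_page: Callable[[int, int], List[dict]],
                      page_size: int = 10) -> Iterator[dict]:
    """Yield every vocabulary entry by requesting pages until one comes back short."""
    page_index = 0
    while True:
        page = fetch_page(page_index, page_size)
        yield from page
        # Assumption: a page with fewer than page_size entries is the last one.
        if len(page) < page_size:
            return
        page_index += 1
```

With the Python SDK, the page fetcher could be `lambda i, n: service.list_vocabularies(prefix=None, page_index=i, page_size=n)`, which uses only the parameters shown in the signature above.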

Query a specific vocabulary

Python

def query_vocabulary(self, vocabulary_id: str) -> List[dict]:
    '''
    Gets the content of a vocabulary.
    param: vocabulary_id The vocabulary identifier.
    return: The vocabulary.
    '''

Java

/**
 * Queries a specific vocabulary.
 *
 * @param vocabularyId The identifier of the vocabulary to query.
 * @return The vocabulary object.
 * @throws NoApiKeyException If the API key is empty.
 * @throws InputRequiredException If a required parameter is empty.
 */
public Vocabulary queryVocabulary(String vocabularyId)
    throws NoApiKeyException, InputRequiredException

Response example:

{
    "gmt_create": "2024-08-21 15:19:09",
    "vocabulary": [
        {"weight": 4, "text": "Wu Yigong", "lang": "zh"},
        {"weight": 4, "text": "Confucius' Family", "lang": "zh"}
    ],
    "target_model": "paraformer-realtime-v2",
    "gmt_modified": "2024-08-21 15:19:09",
    "status": "OK"
}

Update a vocabulary

Python

def update_vocabulary(self, vocabulary_id: str, vocabulary: List[dict]) -> None:
    '''
    Replaces an existing vocabulary with a new one.
    param: vocabulary_id The identifier of the vocabulary to replace.
    param: vocabulary The new vocabulary.
    '''

Java

/**
 * Updates a vocabulary.
 *
 * @param vocabularyId The identifier of the vocabulary to update.
 * @param vocabulary The new list of terms.
 * @throws NoApiKeyException If the API key is empty.
 * @throws InputRequiredException If a required parameter is empty.
 */
public void updateVocabulary(String vocabularyId, JsonArray vocabulary)
    throws NoApiKeyException, InputRequiredException
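
Because an update replaces the whole vocabulary, adding or changing a few terms means sending the complete merged list. The following sketch merges changes into an existing term list. `merge_terms` is a hypothetical helper, and treating `(text, lang)` as the identity of a term is an assumption made for illustration.

```python
def merge_terms(existing: list, changes: list) -> list:
    """Merge new or changed terms into an existing term list.

    Terms are keyed by (text, lang); a change with the same key overrides the
    old entry in place, and unseen terms are appended at the end.
    """
    merged = {(term["text"], term.get("lang")): term for term in existing}
    for term in changes:
        merged[(term["text"], term.get("lang"))] = term
    return list(merged.values())
```

The existing terms would come from a prior query, assuming the result exposes them under a `vocabulary` key as in the query response example in this document. The merged list is then passed to `update_vocabulary`.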

Delete a vocabulary

Python

def delete_vocabulary(self, vocabulary_id: str) -> None:
    '''
    Deletes a vocabulary.
    param: vocabulary_id The identifier of the vocabulary to delete.
    '''

Java

/**
 * Deletes a vocabulary.
 *
 * @param vocabularyId The identifier of the vocabulary to delete.
 * @throws NoApiKeyException If the API key is empty.
 * @throws InputRequiredException If a required parameter is empty.
 */
public void deleteVocabulary(String vocabularyId)
    throws NoApiKeyException, InputRequiredException

Error handling

In the Python SDK, errors are raised as `VocabularyServiceException`. The exception carries a status code, an error code, and an error message.

class VocabularyServiceException(Exception):
    def __init__(self, status_code: int, code: str, error_message: str): ...

In the Java SDK, errors are thrown as `NoApiKeyException` and `InputRequiredException`.
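
The following sketch shows one way to wrap a vocabulary call so that a failure becomes a readable message. To stay runnable without the SDK, it defines a local stand-in class that mirrors the constructor shown above. Two things are assumptions: that the real exception exposes attributes named after the constructor parameters, and that the status code follows HTTP conventions. The error code in the usage note is a placeholder, not a real service code.

```python
class VocabularyServiceException(Exception):
    """Local stand-in mirroring the documented constructor (not the SDK class)."""

    def __init__(self, status_code: int, code: str, error_message: str):
        super().__init__(f"[{status_code}] {code}: {error_message}")
        # Assumption: attribute names match the constructor parameters.
        self.status_code = status_code
        self.code = code
        self.error_message = error_message


def describe_failure(exc: VocabularyServiceException) -> str:
    """Build a one-line description of the error for logging."""
    # Assumption: 4xx status codes indicate a problem with the request itself.
    kind = "client error" if 400 <= exc.status_code < 500 else "server or unknown error"
    return f"{kind}: status={exc.status_code} code={exc.code} message={exc.error_message}"


def safe_call(operation):
    """Run a vocabulary operation and return (result, error_description)."""
    try:
        return operation(), None
    except VocabularyServiceException as exc:
        return None, describe_failure(exc)
```

A call such as `safe_call(lambda: service.delete_vocabulary(vocabulary_id))` then returns either the result with `None`, or `None` with a description of the failure.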

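Manage vocabularies with the HTTP service

The custom vocabulary service uses the HTTPS protocol. You can create, read, update, and delete vocabularies through HTTP requests.

If you have not set the API key as an environment variable, replace `Authorization: Bearer $DASHSCOPE_API_KEY` in the cURL commands with `Authorization: Bearer your-api-key`, where `your-api-key` is your own API key.

Note

The `usage` field in the response is for billing purposes. Because the custom vocabulary service is free, you can ignore this field.

Create a vocabulary

cURL example: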

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/audio/asr/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "speech-biasing",
    "input": {
            "action": "create_vocabulary",
            "target_model": "paraformer-realtime-v2",
            "prefix": "testpfx",
            "vocabulary": [
              {"text": "Wu Yigong", "weight": 4, "lang": "zh"},
              {"text": "Confucius'\'' Family", "weight": 4, "lang": "zh"}
            ]
        }
}'

Response example:

{
  "output": {
    "task_status": "PENDING",
    "task_id": "c2e5d63b-96e1-4607-bb91-************"
  },
  "request_id": "77ae55ae-be17-97b8-9942-************"
}
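
The same request can be assembled in Python with only the standard library. The sketch below builds the request object without sending it. The endpoint, the `speech-biasing` model name, and the `action` values come from the cURL examples in this document, while `build_request` is a hypothetical helper.

```python
import json
import os
import urllib.request

ENDPOINT = "https://dashscope.aliyuncs.com/api/v1/services/audio/asr/customization"


def build_request(action: str, api_key: str, **input_fields) -> urllib.request.Request:
    """Build (but do not send) a POST request for one vocabulary action."""
    body = {"model": "speech-biasing", "input": {"action": action, **input_fields}}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "create_vocabulary",
    api_key=os.environ.get("DASHSCOPE_API_KEY", "your-api-key"),
    target_model="paraformer-realtime-v2",
    prefix="testpfx",
    vocabulary=[{"text": "Wu Yigong", "weight": 4, "lang": "zh"}],
)

# To send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because every action uses the same endpoint and differs only in the `input` object, the other operations can reuse the helper with a different `action` and fields.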

Query all vocabularies

cURL example:

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/audio/asr/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "speech-biasing",
    "input": {
                "action": "list_vocabulary",
                "prefix": null,
                "page_index": 0,
                "page_size": 10
            }
}'

Note

In this example, the `prefix` field is null, which means that all vocabularies are returned. You can change this value to a specific string as needed.

Response example:

{
	"output": {
		"vocabulary_list": [{
			"gmt_create": "2024-11-05 16:31:32",
			"vocabulary_id": "vocab-testpfx-6977ae49f65c4c3db054727cxxxxxxxx",
			"gmt_modified": "2024-11-05 16:31:32",
			"status": "OK"
		}]
	},
	"usage": {
		"count": 1
	},
	"request_id": "4e7df7c0-18a8-9f3e-bfc4-xxxxxxxxxxxx"
}
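
A response with this shape can be reduced to a plain list of identifiers. The following is a minimal sketch; `vocabulary_ids` is a hypothetical helper, and the `only_ok` filter assumes that a status other than `OK` marks a vocabulary that is not ready for use, which this document does not state.

```python
import json


def vocabulary_ids(response_text: str, only_ok: bool = True) -> list:
    """Extract vocabulary IDs from the body of a list_vocabulary response."""
    payload = json.loads(response_text)
    entries = payload.get("output", {}).get("vocabulary_list", [])
    return [
        entry["vocabulary_id"]
        for entry in entries
        # Assumption: only entries with status "OK" are usable.
        if not only_ok or entry.get("status") == "OK"
    ]
```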

Query a specific vocabulary

cURL example:

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/audio/asr/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "speech-biasing",
    "input": {
                "action": "query_vocabulary",
                "vocabulary_id": "vocab-testpfx-6977ae49f65c4c3db054727cxxxxxxxx"
            }
}'

Response example:

{
	"output": {
		"gmt_create": "2024-11-05 16:31:32",
		"vocabulary": [{
			"weight": 4,
			"text": "Wu Yigong",
			"lang": "zh"
		}, {
			"weight": 4,
			"text": "Confucius' Family",
			"lang": "zh"
		}],
		"target_model": "paraformer-realtime-v2",
		"gmt_modified": "2024-11-05 16:31:32",
		"status": "OK"
	},
	"usage": {
		"count": 1
	},
	"request_id": "b02d18a4-ff8d-9fd4-b4f0-xxxxxxxxxxxx"
}

Update a vocabulary

cURL example:

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/audio/asr/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "speech-biasing",
    "input": {
                "action": "update_vocabulary",
                "vocabulary_id": "vocab-testpfx-6977ae49f65c4c3db054727cxxxxxxxx",
                "vocabulary": [
                  {"text": "Wu Yigong", "weight": 4, "lang": "zh"}
                ]      
            }
}'

Response example:

{
	"output": {},
	"usage": {
		"count": 1
	},
	"request_id": "a51f3139-7aaa-941b-994f-xxxxxxxxxxxx"
}

Delete a vocabulary

cURL example:

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/audio/asr/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "speech-biasing",
    "input": {
                "action": "delete_vocabulary",
                "vocabulary_id": "vocab-testpfx-6977ae49f65c4c3db054727cxxxxxxxx"
            }
}'

Response example:

{
	"output": {},
	"usage": {
		"count": 1
	},
	"request_id": "d7499ee5-6c91-956c-a1aa-xxxxxxxxxxxx"
}

Error codes

For troubleshooting information, see "Error messages".