Image-to-image search - Alibaba Cloud Model Studio

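The image-to-image search tool lets the model search the internet for images that are visually similar to an input image, then analyze and reason over the results. Use it for tasks such as finding similar products or tracing the source of visual content.

Usage

Image-to-image search is called through the Responses API. Specify the tools parameter, add the image_search tool to it, and pass the image in input using the multimodal format.

The input must contain image content. Use the input_image type to pass the image URL. You can also add text with the input_text type to give the search more context.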

# Import the dependencies and create the client...
input_content = [
    {"type": "input_text", "text": "Find landscape images in a similar style to this image."},
    {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
]
response = client.responses.create(
    model="qwen3.5-plus",
    input=[{"role": "user", "content": input_content}],
    tools=[{"type": "image_search"}]
)

print(response.output_text)
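
The input structure described above can be assembled with a small helper. The following is a minimal sketch, not part of the Model Studio SDK: build_image_search_input is a hypothetical name, the example.com URL is a placeholder, and the scheme check is only a cheap local sanity test that cannot confirm the image is publicly reachable.

```python
from urllib.parse import urlparse


def build_image_search_input(image_url, text=None):
    """Build the `input` list for an image_search request (hypothetical helper)."""
    parsed = urlparse(image_url)
    # Cheap local check only: it cannot verify that the URL is publicly reachable.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an http(s) image URL, got: {image_url!r}")

    content = []
    if text:
        # Optional text that gives the search more context.
        content.append({"type": "input_text", "text": text})
    # The image itself is required.
    content.append({"type": "input_image", "image_url": image_url})
    return [{"role": "user", "content": content}]


# Placeholder URL for illustration; replace it with your own image URL.
request_input = build_image_search_input(
    "https://example.com/landscape.png",
    text="Find landscape images in a similar style to this image.",
)
print(request_input)
```

The returned list can be passed as the input argument of client.responses.create together with tools=[{"type": "image_search"}].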

Supported models

International

Qwen-Plus: qwen3.5-plus, qwen3.5-plus-2026-02-15
Qwen-Flash: qwen3.5-flash, qwen3.5-flash-2026-02-23
Open-source Qwen: qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b

Global

Qwen-Plus: qwen3.5-plus, qwen3.5-plus-2026-02-15
Qwen-Flash: qwen3.5-flash, qwen3.5-flash-2026-02-23
Open-source Qwen: qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b

Chinese Mainland

Qwen-Plus: qwen3.5-plus, qwen3.5-plus-2026-02-15
Qwen-Flash: qwen3.5-flash, qwen3.5-flash-2026-02-23
Open-source Qwen: qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b

This feature is available only through the Responses API.

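Quick start

Run the following code to call the image-to-image search tool through the Responses API. The tool searches for images that are visually similar or related to the input image.

Before you start, obtain an API key and set it as an environment variable.

Note: Replace image_url in the sample code with a publicly accessible image URL.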

Python

import os
import json
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, pass your API key directly instead: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
)

input_content = [
    {"type": "input_text", "text": "Find landscape images in a similar style to this image."},
    # Replace image_url with the publicly accessible URL of your own image.
    {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
]

response = client.responses.create(
    model="qwen3.5-plus",
    input=[{"role": "user", "content": input_content}],
    tools=[
        {
            "type": "image_search"
        }
    ]
)

# Iterate over the output items to display each step.
for item in response.output:
    if item.type == "image_search_call":
        print(f"[Tool call] Image-to-image search (status: {item.status})")
        # Parse and display the list of retrieved images.
        if item.output:
            images = json.loads(item.output)
            print(f"  Found {len(images)} images:")
            for img in images[:5]:  # Show the first 5 images.
                print(f"  [{img['index']}] {img['title']}")
                print(f"      {img['url']}")
            if len(images) > 5:
                print(f"  ... {len(images)} images in total.")
    elif item.type == "message":
        print("\n[Model response]")
        print(response.output_text)

# Display token usage and tool call statistics.
print(f"\n[Token usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")
if hasattr(response.usage, 'x_tools') and response.usage.x_tools:
    for tool_name, info in response.usage.x_tools.items():
        print(f"[Tool statistics] {tool_name} call count: {info.get('count', 0)}")

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    // If the environment variable is not set, pass your API key directly instead: apiKey: "sk-xxx",
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: [
            {
                role: "user",
                content: [
                    { type: "input_text", text: "Find landscape images in a similar style to this image." },
                    // Replace image_url with the publicly accessible URL of your own image.
                    { type: "input_image", image_url: "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png" }
                ]
            }
        ],
        tools: [
            { type: "image_search" }
        ]
    });

    // Iterate over the output items to display each step.
    for (const item of response.output) {
        if (item.type === "image_search_call") {
            console.log(`[Tool call] Image-to-image search (status: ${item.status})`);
            // Parse and display the list of retrieved images.
            if (item.output) {
                const images = JSON.parse(item.output);
                console.log(`  Found ${images.length} images:`);
                images.slice(0, 5).forEach(img => {
                    console.log(`  [${img.index}] ${img.title}`);
                    console.log(`      ${img.url}`);
                });
                if (images.length > 5) {
                    console.log(`  ... ${images.length} images in total.`);
                }
            }
        } else if (item.type === "message") {
            console.log(`\n[Model response]`);
            console.log(response.output_text);
        }
    }

    // Display token usage and tool call statistics.
    console.log(`\n[Token usage] Input: ${response.usage.input_tokens}, Output: ${response.usage.output_tokens}, Total: ${response.usage.total_tokens}`);
    if (response.usage && response.usage.x_tools) {
        for (const [toolName, info] of Object.entries(response.usage.x_tools)) {
            console.log(`[Tool statistics] ${toolName} call count: ${info.count || 0}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": [
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Find landscape images in a similar style to this image."},
                {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
            ]
        }
    ],
    "tools": [
        {"type": "image_search"}
    ]
}'

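Running the code returns a response similar to the following.

[Tool call] Image-to-image search (status: completed)
  Found 2 images:
  [1] 2024 Qingming Festival holiday notice
      https://www.healthcabin.net/blog/wp-content/uploads/2024/04/QingMing-Festival-Holiday-Notice-2024.jpg
  [2] Serene Asian landscape with a stone bridge reflected in misty water
      https://thumbs.dreamstime.com/b/serene-asian-landscape-stone-bridge-reflecting-misty-water-tranquil-illustration-traditional-arch-spanning-lake-style-376972039.jpg

[Model response]
Certainly. I found several landscape images in a similar style.

These images all have the feel of Chinese ink wash painting or traditional landscape painting, and share the following features:
*   **Traditional architecture**: for example, pavilions, pagodas, and arched bridges.
*   **Natural elements**: for example, distant mountains, lakes, willows, and lotus flowers.
*   **Artistic style**: refined colors and soft lines that create a calm, profound atmosphere.

...

[Token usage] Input: 2753, Output: 181, Total: 2934
[Tool statistics] image_search call count: 1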
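
The tool output shown above is a JSON string, so it can be turned into typed records before further processing. The sketch below is illustrative only: ImageHit and parse_image_hits are hypothetical names, and the index, title, and url fields are assumed from the sample code on this page rather than from a published schema, which is why missing fields fall back to defaults.

```python
import json
from dataclasses import dataclass


@dataclass
class ImageHit:
    index: int  # assumed to be numeric, based on the sample output
    title: str
    url: str


def parse_image_hits(raw_output):
    """Parse an image_search_call output string into ImageHit records."""
    if not raw_output:
        return []
    hits = []
    for entry in json.loads(raw_output):
        # Use .get() so that a missing field does not raise KeyError.
        hits.append(ImageHit(
            index=entry.get("index", len(hits) + 1),
            title=entry.get("title", ""),
            url=entry.get("url", ""),
        ))
    return hits


# Stand-in payload shaped like the sample output; not real API data.
sample_output = json.dumps([
    {"index": 1, "title": "First result", "url": "https://example.com/1.jpg"},
    {"index": 2, "title": "Second result", "url": "https://example.com/2.jpg"},
])
for hit in parse_image_hits(sample_output):
    print(f"[{hit.index}] {hit.title} -> {hit.url}")
```

In the quick start code, item.output of an image_search_call item would be the value passed to parse_image_hits.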

Streaming response

The image-to-image search tool can take a while to process. Enabling streaming output lets you receive intermediate results in real time.

Python

import os
import json
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, pass your API key directly instead: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
)

input_content = [
    {"type": "input_text", "text": "Find landscape images in a similar style to this image."},
    # Replace image_url with the publicly accessible URL of your own image.
    {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
]

stream = client.responses.create(
    model="qwen3.5-plus",
    input=[{"role": "user", "content": input_content}],
    tools=[{"type": "image_search"}],
    stream=True
)

for event in stream:
    # The tool call has started.
    if event.type == "response.output_item.added":
        if event.item.type == "image_search_call":
            print("[Tool call] Running image-to-image search...")
    # The tool call has finished. Parse and display the list of retrieved images.
    elif event.type == "response.output_item.done":
        if event.item.type == "image_search_call":
            print(f"[Tool call] Image-to-image search completed (status: {event.item.status})")
            if event.item.output:
                images = json.loads(event.item.output)
                print(f"  Found {len(images)} images:")
                for img in images[:5]:  # Show the first 5 images.
                    print(f"  [{img['index']}] {img['title']}")
                    print(f"      {img['url']}")
                if len(images) > 5:
                    print(f"  ... {len(images)} images in total.")
    # The model response has started.
    elif event.type == "response.content_part.added":
        print("\n[Model response]")
    # Stream the text output.
    elif event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
    # The response is complete. Display usage statistics.
    elif event.type == "response.completed":
        usage = event.response.usage
        print(f"\n\n[Token usage] Input: {usage.input_tokens}, Output: {usage.output_tokens}, Total: {usage.total_tokens}")
        if hasattr(usage, 'x_tools') and usage.x_tools:
            for tool_name, info in usage.x_tools.items():
                print(f"[Tool statistics] {tool_name} call count: {info.get('count', 0)}")

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    // If the environment variable is not set, pass your API key directly instead: apiKey: "sk-xxx",
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.5-plus",
        input: [
            {
                role: "user",
                content: [
                    { type: "input_text", text: "Find landscape images in a similar style to this image." },
                    // Replace image_url with the publicly accessible URL of your own image.
                    { type: "input_image", image_url: "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png" }
                ]
            }
        ],
        tools: [{ type: "image_search" }],
        stream: true
    });

    for await (const event of stream) {
        // The tool call has started.
        if (event.type === "response.output_item.added") {
            if (event.item.type === "image_search_call") {
                console.log("[Tool call] Running image-to-image search...");
            }
        }
        // The tool call has finished. Parse and display the list of retrieved images.
        else if (event.type === "response.output_item.done") {
            if (event.item && event.item.type === "image_search_call") {
                console.log(`[Tool call] Image-to-image search completed (status: ${event.item.status})`);
                if (event.item.output) {
                    const images = JSON.parse(event.item.output);
                    console.log(`  Found ${images.length} images:`);
                    images.slice(0, 5).forEach(img => {
                        console.log(`  [${img.index}] ${img.title}`);
                        console.log(`      ${img.url}`);
                    });
                    if (images.length > 5) {
                        console.log(`  ... ${images.length} images in total.`);
                    }
                }
            }
        }
        // The model response has started.
        else if (event.type === "response.content_part.added") {
            console.log(`\n[Model response]`);
        }
        // Stream the text output.
        else if (event.type === "response.output_text.delta") {
            process.stdout.write(event.delta);
        }
        // The response is complete. Display usage statistics.
        else if (event.type === "response.completed") {
            const usage = event.response.usage;
            console.log(`\n\n[Token usage] Input: ${usage.input_tokens}, Output: ${usage.output_tokens}, Total: ${usage.total_tokens}`);
            if (usage && usage.x_tools) {
                for (const [toolName, info] of Object.entries(usage.x_tools)) {
                    console.log(`[Tool statistics] ${toolName} call count: ${info.count || 0}`);
                }
            }
        }
    }
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "qwen3.5-plus",
    "input": [
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Find landscape images in a similar style to this image."},
                {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
            ]
        }
    ],
    "tools": [
        {"type": "image_search"}
    ],
    "stream": true
}'

Running the code returns a streaming response similar to the following.

[Tool call] Running image-to-image search...
[Tool call] Image-to-image search completed (status: completed)
  Found 3 images:
  [1] 2024 Qingming Festival holiday notice
      https://www.healthcabin.net/blog/wp-content/uploads/2024/04/QingMing-Festival-Holiday-Notice-2024.jpg
  [2] Serene Asian landscape with a stone bridge reflected in misty water
      https://thumbs.dreamstime.com/b/serene-asian-landscape-stone-bridge-reflecting-misty-water-...
  [3] ...

[Model response]
Certainly. I found several images in a similar style. These images all show the style of Chinese ink wash painting or meticulous-style painting...

[Token usage] Input: 5339, Output: 164, Total: 5503
[Tool statistics] image_search call count: 1
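
When the streamed text is needed as a single string rather than printed piece by piece, the delta events can be accumulated. The following is a minimal sketch under stated assumptions: collect_stream_text is a hypothetical helper, and the events are simulated with SimpleNamespace objects so that the example runs without network access. It relies only on the event types and attributes already used in the streaming samples on this page.

```python
from types import SimpleNamespace


def collect_stream_text(events):
    """Concatenate output_text deltas and return (text, usage)."""
    parts = []
    usage = None
    for event in events:
        if event.type == "response.output_text.delta":
            parts.append(event.delta)
        elif event.type == "response.completed":
            # Usage statistics arrive with the final event.
            usage = event.response.usage
    return "".join(parts), usage


# Simulated events for illustration only; a real stream comes from
# client.responses.create(..., stream=True).
fake_events = [
    SimpleNamespace(type="response.output_text.delta", delta="Found "),
    SimpleNamespace(type="response.output_text.delta", delta="similar images."),
    SimpleNamespace(
        type="response.completed",
        response=SimpleNamespace(usage=SimpleNamespace(total_tokens=5503)),
    ),
]
text, usage = collect_stream_text(fake_events)
print(text)
print(usage.total_tokens)
```

Event types the helper does not recognize are skipped, so it can be fed the same stream that also carries tool call events.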

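Billing

Billing for this feature has the following components:

Model call fee: The tool adds image search result information to the prompt, which increases input token usage. Tokens are billed at the standard rate of the model used. For pricing details, see "Model list".
Tool call fee: Billed per 1,000 calls. For International, Global, and Chinese Mainland deployments, the price is USD 0.40 per 1,000 calls.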
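
The two billing components can be combined into a rough estimate. This is a sketch only: estimate_cost_usd is a hypothetical helper, it assumes the tool fee is prorated linearly per call (the page only states the price per 1,000 calls), and the token prices in the example are placeholders, not real Model Studio prices. Take the actual rates from the model list.

```python
from decimal import Decimal

# USD per 1,000 image_search calls, as stated in the billing section.
TOOL_PRICE_PER_1000_CALLS = Decimal("0.40")


def estimate_cost_usd(tool_calls, input_tokens, output_tokens,
                      input_price_per_1k, output_price_per_1k):
    """Estimate the cost in USD, assuming simple linear proration."""
    tool_fee = Decimal(tool_calls) / 1000 * TOOL_PRICE_PER_1000_CALLS
    model_fee = (
        Decimal(input_tokens) / 1000 * Decimal(input_price_per_1k)
        + Decimal(output_tokens) / 1000 * Decimal(output_price_per_1k)
    )
    return tool_fee + model_fee


# Token counts come from the non-streaming sample output; the two token
# prices are placeholders, NOT real Model Studio prices.
cost = estimate_cost_usd(
    tool_calls=1, input_tokens=2753, output_tokens=181,
    input_price_per_1k="0.001", output_price_per_1k="0.002",
)
print(f"Estimated cost: {cost} USD")
```

Decimal is used instead of float so that small per-call amounts are not distorted by binary rounding.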

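FAQ

Q: What image formats and input methods are supported?

A: See "Image limits" and "File input methods".

The OpenAI SDK does not support passing local file paths.

Q: Can I pass multiple images in a single request?

A: The number of images you can pass is limited by the model's maximum input length. The total number of tokens for images and text must not exceed the limit the model supports. Each tool call searches for one image, and the tool can be called multiple times to process multiple images.

The model decides how many images to search for.

Q: How many images are returned in the search results?

A: The model decides how many images are returned, up to a maximum of 10. The exact number is not fixed.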