
Alibaba Cloud Model Studio: Deep research (Qwen-Deep-Research)

Last Updated: Mar 15, 2026

Qwen-Deep-Research automates complex research through planning, multiple rounds of web searches, and structured report generation. It gathers and synthesizes information without manual effort.

Note

This document applies only to the Chinese mainland (Beijing) region. To use the model, use an API key from the Chinese mainland (Beijing) region.

Getting started

Get an API key and export it as an environment variable. If you use an SDK to make calls, install the DashScope SDK.

The workflow has two steps: a follow-up question step, in which the model asks clarifying questions to define the research scope, and a deep research step, in which the model searches, analyzes, and generates a report.

Currently, the model does not support the DashScope SDK for Java or OpenAI-compatible API calls.
import os
import dashscope

# Configure the API key
# If the environment variable is not set, assign your Model Studio API key directly (format: sk-xxx)
API_KEY = os.getenv('DASHSCOPE_API_KEY')

def call_deep_research_model(messages, step_name):
    print(f"\n=== {step_name} ===")
    
    try:
        responses = dashscope.Generation.call(
            api_key=API_KEY,
            model="qwen-deep-research",
            messages=messages,
            # The qwen-deep-research model currently supports only streaming output
            stream=True
            # To receive incremental output, also pass incremental_output=True
        )
        
        return process_responses(responses, step_name)
        
    except Exception as e:
        print(f"An error occurred when calling the API: {e}")
        return ""


# Display phase content
def display_phase_content(phase, content, status):
    if content:
        print(f"\n[{phase}] {status}: {content}")
    else:
        print(f"\n[{phase}] {status}")

# Process the response
def process_responses(responses, step_name):
    current_phase = None
    phase_content = ""
    research_goal = ""
    web_sites = []
    references = []
    keepalive_shown = False  # Flag to check if the KeepAlive prompt has been shown

    for response in responses:
        # Check the response status code
        if hasattr(response, 'status_code') and response.status_code != 200:
            print(f"HTTP return code: {response.status_code}")
            if hasattr(response, 'code'):
                print(f"Error code: {response.code}")
            if hasattr(response, 'message'):
                print(f"Error message: {response.message}")
            print("For more information, see: https://www.alibabacloud.com/help/zh/model-studio/error-code")
            continue

        if hasattr(response, 'output') and response.output:
            message = response.output.get('message', {})
            phase = message.get('phase')
            content = message.get('content', '')
            status = message.get('status')
            extra = message.get('extra', {})

            # Phase change detection
            if phase != current_phase:
                if current_phase and phase_content:
                    # Display different completion descriptions based on phase and step names
                    if step_name == "Step 1: Model query confirmation" and current_phase == "answer":
                        print(f"\n Query confirmation phase completed")
                    else:
                        print(f"\n {current_phase} phase completed")
                current_phase = phase
                phase_content = ""
                keepalive_shown = False  # Reset KeepAlive prompt flag

                # Display different descriptions based on phase and step names
                if step_name == "Step 1: Model query confirmation" and phase == "answer":
                    print(f"\n Entering query confirmation phase")
                else:
                    print(f"\n Entering {phase} phase")
                    
            # Process reference information in the Answer phase
            if phase == "answer":
                if extra.get('deep_research', {}).get('references'):
                    new_references = extra['deep_research']['references']
                    if new_references and new_references != references:  # Avoid duplicate display
                        references = new_references
                        print(f"\n   References ({len(references)}):")
                        for i, ref in enumerate(references, 1):
                            print(f"     {i}. {ref.get('title', 'No title')}")
                            if ref.get('url'):
                                print(f"        URL: {ref['url']}")
                            if ref.get('description'):
                                print(f"        Description: {ref['description'][:100]}...")
                            print()

            # Process special information in the WebResearch phase
            # Note: The qwen-deep-research-2025-12-15 model uses the streamingThinking status
            # instead of streamingQueries and streamingWebResult
            if phase == "WebResearch":
                if extra.get('deep_research', {}).get('research'):
                    research_info = extra['deep_research']['research']

                    # Process streamingThinking (snapshot model) or streamingQueries (mainline model) status
                    if status in ("streamingThinking", "streamingQueries"):
                        if 'researchGoal' in research_info:
                            goal = research_info['researchGoal']
                            if goal:
                                research_goal += goal
                                print(f"\n   Research goal: {goal}", end='', flush=True)

                    # Process streamingWebResult status (mainline model)
                    # The snapshot model merges this status using streamingThinking
                    elif status == "streamingWebResult":
                        if 'webSites' in research_info:
                            sites = research_info['webSites']
                            if sites and sites != web_sites:  # Avoid duplicate display
                                web_sites = sites
                                print(f"\n   Found {len(sites)} relevant websites:")
                                for i, site in enumerate(sites, 1):
                                    print(f"     {i}. {site.get('title', 'No title')}")
                                    print(f"        Description: {site.get('description', 'No description')[:100]}...")
                                    print(f"        URL: {site.get('url', 'No link')}")
                                    if site.get('favicon'):
                                        print(f"        Icon: {site['favicon']}")
                                    print()

                    # Process WebResultFinished status
                    elif status == "WebResultFinished":
                        print(f"\n   Web search completed. Found {len(web_sites)} reference sources.")
                        if research_goal:
                            print(f"   Research goal: {research_goal}")

            # Accumulate and display content
            if content:
                phase_content += content
                # Display content in real-time
                print(content, end='', flush=True)

            # Display phase status changes
            if status and status != "typing":
                print(f"\n   Status: {status}")

                # Display status description
                if status == "streamingThinking":
                    print("   → Decomposing research tasks and summarizing web content (WebResearch phase)")
                elif status == "streamingQueries":
                    print("   → Generating research goals and search queries (WebResearch phase)")
                elif status == "streamingWebResult":
                    print("   → Performing searches, web page reading, and code execution (WebResearch phase)")
                elif status == "WebResultFinished":
                    print("   → Web search phase completed (WebResearch phase)")

            # When status is finished, display token consumption
            if status == "finished":
                if hasattr(response, 'usage') and response.usage:
                    usage = response.usage
                    print(f"\n    Token consumption statistics:")
                    print(f"      Input tokens: {usage.get('input_tokens', 0)}")
                    print(f"      Output tokens: {usage.get('output_tokens', 0)}")
                    print(f"      Request ID: {response.get('request_id', 'Unknown')}")

            if phase == "KeepAlive":
                # Only display the prompt the first time entering the KeepAlive phase
                if not keepalive_shown:
                    print("Current step completed. Preparing for the next step.")
                    keepalive_shown = True
                continue

    if current_phase and phase_content:
        if step_name == "Step 1: Model query confirmation" and current_phase == "answer":
            print(f"\n Query confirmation phase completed")
        else:
            print(f"\n {current_phase} phase completed")

    return phase_content

def main():
    # Check API key
    if not API_KEY:
        print("Error: DASHSCOPE_API_KEY environment variable not set")
        print("Set the environment variable or modify the API_KEY variable directly in the code")
        return
    
    print("User initiates conversation: Research the application of artificial intelligence in education")
    
    # Step 1: Model query confirmation
    # The model analyzes the user's question and asks clarifying questions to define the research direction
    messages = [{'role': 'user', 'content': 'Research the application of artificial intelligence in education'}]
    step1_content = call_deep_research_model(messages, "Step 1: Model query confirmation")

    # Step 2: Deep research
    # Based on the query confirmation from Step 1, the model performs the full research process
    messages = [
        {'role': 'user', 'content': 'Research the application of artificial intelligence in education'},
        {'role': 'assistant', 'content': step1_content},  # Includes the model's query confirmation content
        {'role': 'user', 'content': 'I mainly focus on personalized learning and intelligent assessment'}
    ]
    
    call_deep_research_model(messages, "Step 2: Deep research")
    print("\n Research completed!")

if __name__ == "__main__":
    main()
echo "Step 1: Model query confirmation"
curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation' \
--header 'X-DashScope-SSE: enable' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "input": {
        "messages": [
            {
                "content": "Research the application of artificial intelligence in education", 
                "role": "user"
            }
        ]
    },
    "model": "qwen-deep-research"
}'

echo -e "\n\n" 
echo "Step 2: Deep research"
curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation' \
--header 'X-DashScope-SSE: enable' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "input": {
        "messages": [
            {
                "content": "Research the application of artificial intelligence in education", 
                "role": "user"
            },
            {
                "content": "Tell me which specific application scenarios of artificial intelligence in education you want to focus on?", 
                "role": "assistant"
            },
            {
                "content": "I mainly focus on personalized learning", 
                "role": "user"
            }
        ]
    },
    "model": "qwen-deep-research"
}'

Specifications

Model                          | Context window (tokens) | Max input (tokens) | Max output (tokens)
qwen-deep-research             | 1,000,000               | 997,952            | 32,768
qwen-deep-research-2025-12-15  | 1,000,000               | 997,952            | 32,768

Note

qwen-deep-research is the mainline model and is continuously updated. qwen-deep-research-2025-12-15 is a snapshot version with improved research depth, report quality, and MCP tool calling. Both models support image input and are billed separately.

Core capabilities

Track progress through the phase field (the current task) and the status field (progress within that task).

Follow-up question and report generation (phase: "answer")

Analyzes your query, asks clarifying questions to define the scope, and generates final reports.

Status changes:

  • typing: Generating text content

  • finished: Text content generation completed

Research planning (phase: "ResearchPlanning")

Creates a research outline from your query.

Status changes:

  • typing: Generating the research plan

  • finished: Research plan completed

Web search (phase: "WebResearch")

Performs multiple rounds of web searches and content analysis. The status changes to WebResultFinished after each search round and to finished when the entire phase completes.

Status changes:

  • streamingThinking: Decomposing research tasks and summarizing web content (specific to qwen-deep-research-2025-12-15, replaces streamingQueries and streamingWebResult)

  • streamingQueries: Generating search queries (for qwen-deep-research only)

  • streamingWebResult: Performing web searches and analyzing web content (for qwen-deep-research only)

  • WebResultFinished: Search round completed

  • finished: Web search phase completed

Connection keepalive (phase: "KeepAlive")

Maintains the connection during long-running tasks. Content in this phase can be ignored.
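The phase and status fields described above arrive on every streaming chunk. A minimal sketch of a dispatcher over those two fields (the chunk dicts below are simulated for illustration, mirroring the output.message shape used in the full example):

```python
# Minimal phase/status dispatcher over the message dict carried by each
# streaming chunk. The example dicts below are simulated for illustration.
def dispatch(message):
    phase = message.get('phase')    # "answer", "ResearchPlanning", "WebResearch", "KeepAlive"
    status = message.get('status')  # "typing", "streamingQueries", "finished", ...
    if phase == "KeepAlive":
        return None  # heartbeat only; safe to ignore
    if status == "finished":
        return f"{phase} phase completed"
    return f"{phase}: {status}"

print(dispatch({'phase': 'ResearchPlanning', 'status': 'typing'}))  # ResearchPlanning: typing
print(dispatch({'phase': 'WebResearch', 'status': 'finished'}))     # WebResearch phase completed
print(dispatch({'phase': 'KeepAlive', 'status': 'typing'}))         # None
```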

Image input

The model supports image input: it analyzes image content and incorporates the findings into its research. Pass the content field as an array of image and text objects.

  • Supports JPEG, PNG, BMP, WEBP. Max 10 MB per image.

  • Up to 5 images per request. Supports public URLs and Base64 encoding.

  • Response format is identical to text-only requests. The model generates a report based on image content.

Request example

import os
import dashscope

API_KEY = os.getenv('DASHSCOPE_API_KEY')

messages = [
    {
        "role": "user",
        "content": [
            {"image": "https://example.aliyuncs.com/example.png"},
            {"text": "Analyze the data trends in this chart and conduct in-depth research on key findings"}
        ]
    }
]

responses = dashscope.Generation.call(
    api_key=API_KEY,
    model="qwen-deep-research",
    messages=messages,
    stream=True
)

for response in responses:
    if hasattr(response, 'output') and response.output:
        message = response.output.get('message', {})
        content = message.get('content', '')
        if content:
            print(content, end='', flush=True)
curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation' \
--header 'X-DashScope-SSE: enable' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "input": {
        "messages": [
            {
                "content": [
                    {"image": "https://example.aliyuncs.com/example.png"},
                    {"text": "Analyze the data trends in this chart and conduct in-depth research on key findings"}
                ],
                "role": "user"
            }
        ]
    },
    "model": "qwen-deep-research"
}'

MCP tool calling

Note

MCP tool calling is only supported by qwen-deep-research-2025-12-15. qwen-deep-research does not support this feature.

qwen-deep-research-2025-12-15 supports MCP tool calling through the research_tools parameter. The model calls the configured external tools during the WebResearch phase. The response format is identical to that of standard calls.

For details about research_tools and MCP tool specifications, see Qwen-Deep-Research.

Request example

import os
import dashscope

API_KEY = os.getenv('DASHSCOPE_API_KEY')

messages = [
    {
        "role": "user",
        "content": "Use the knowledge base to search for recently published product update announcements and compile them into a research report"
    }
]

responses = dashscope.Generation.call(
    api_key=API_KEY,
    model="qwen-deep-research-2025-12-15",
    messages=messages,
    stream=True,
    enable_feedback=False,
    research_tools=[{
        "type": "mcp",
        "server_label": "my-server",
        "server_url": "https://your-mcp-server.example.com/sse",
        "allowed_tools": ["search", "fetch"],
        "authentication": {
            "bearer": "your_jwt_token_here"
        }
    }]
)

for response in responses:
    if hasattr(response, 'output') and response.output:
        message = response.output.get('message', {})
        content = message.get('content', '')
        if content:
            print(content, end='', flush=True)
curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation' \
--header 'X-DashScope-SSE: enable' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "input": {
        "messages": [
            {
                "content": "Use the knowledge base to search for recently published product update announcements and compile them into a research report",
                "role": "user"
            }
        ]
    },
    "model": "qwen-deep-research-2025-12-15",
    "parameters": {
        "enable_feedback": false,
        "research_tools": [{
            "type": "mcp",
            "server_label": "my-server",
            "server_url": "https://your-mcp-server.example.com/sse",
            "allowed_tools": ["search", "fetch"],
            "authentication": {
                "bearer": "your_jwt_token_here"
            }
        }]
    }
}'

Billing

Model                          | Input cost (per 1K tokens) | Output cost (per 1K tokens) | Free quota
qwen-deep-research             | $0.007742                  | $0.023367                   | No free quota
qwen-deep-research-2025-12-15  | To be determined           | To be determined            | No free quota

Billing is based on input tokens (user messages and system prompts) and output tokens (follow-up questions, research plans, research goals, search queries, and the final report).
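As a worked example at the qwen-deep-research rates above, a request that consumes 10,000 input tokens and 5,000 output tokens costs about $0.19:

```python
# Cost estimate at the qwen-deep-research rates above (USD per 1,000 tokens).
INPUT_RATE = 0.007742
OUTPUT_RATE = 0.023367

def estimate_cost(input_tokens, output_tokens):
    return (input_tokens / 1000) * INPUT_RATE + (output_tokens / 1000) * OUTPUT_RATE

# 10,000 input tokens + 5,000 output tokens:
# 10 * 0.007742 + 5 * 0.023367 = 0.07742 + 0.116835 = 0.194255
print(f"${estimate_cost(10_000, 5_000):.2f}")  # $0.19
```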

Going live

Handle streaming output

The model supports only streaming output (stream=True). Track progress through the phase and status fields.

Handle errors

Check the response status code for non-200 errors and handle them appropriately.

Monitor token usage

When status is finished, retrieve token usage from response.usage (input tokens, output tokens, request ID).
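A minimal sketch of summarizing usage once status is finished (the usage dict below is simulated, mirroring the fields printed in the full example above):

```python
# Summarize token usage from the usage object of a finished chunk.
# The dict passed below is simulated for illustration.
def summarize_usage(usage):
    input_tokens = usage.get("input_tokens", 0)
    output_tokens = usage.get("output_tokens", 0)
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
    }

print(summarize_usage({"input_tokens": 1200, "output_tokens": 4800}))
# {'input_tokens': 1200, 'output_tokens': 4800, 'total_tokens': 6000}
```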

Handle connection keepalive

The KeepAlive phase maintains the connection between long-running steps. Ignore these chunks and continue processing.

FAQ

  • Why is the output field empty for some response chunks?

    Early chunks contain metadata only. Content arrives in subsequent chunks.

  • How do I determine if a phase is complete?

    A phase completes when status changes to "finished".

  • Does the model support OpenAI-compatible API calls?

    No. OpenAI-compatible API calls are not supported.

  • How are input and output tokens calculated?

    Input tokens: user messages + system prompts. Output tokens: follow-up questions, plans, goals, queries, final report.

  • What is the difference between qwen-deep-research and qwen-deep-research-2025-12-15?

    qwen-deep-research is the mainline model and is continuously updated. qwen-deep-research-2025-12-15 is a snapshot version with improved research depth, report quality, and MCP support. Both support image input and are billed separately.

  • How do I pass images for research?

    Pass the content field as an array containing {"image": "URL"} and {"text": "description"} objects. Both models support image input.

  • How do I skip the follow-up question and go straight to research?

    Set enable_feedback to false in the request parameters to skip the follow-up question step and start research immediately.
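For the HTTP API, enable_feedback belongs in the parameters object. A sketch of the corresponding request body (model name and message text are taken from the examples above):

```python
import json

# Request body that skips the follow-up question step.
# For the HTTP API, enable_feedback goes under "parameters";
# for the Python SDK, pass enable_feedback=False to Generation.call.
body = {
    "model": "qwen-deep-research",
    "input": {
        "messages": [
            {"role": "user",
             "content": "Research the application of artificial intelligence in education"}
        ]
    },
    "parameters": {"enable_feedback": False},
}

print(json.dumps(body, indent=2))
```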

API reference

For input and output parameters, see Qwen-Deep-Research.

Error codes

If the model call fails and returns an error message, see Error messages for resolution.

Rate limiting

See Rate limiting.