Alibaba Cloud Model Studio: Deep research (Qwen-Deep-Research)

Last Updated: Nov 05, 2025

Traditional manual searches for complex topics are time-consuming, and large models that incorporate web search often struggle with deep, systematic analysis. The Qwen-Deep-Research model automates this process by planning research steps, performing multiple rounds of in-depth searches, integrating information, and generating a structured research report.

Important

This document applies only to the China (Beijing) region. To use the model, you must use an API key from the China (Beijing) region.

Getting started

You must obtain an API key and set it as an environment variable. If you make calls through an SDK, you must also install the DashScope SDK.

The model uses a two-step workflow: a follow-up question to clarify the research scope, and deep research to perform searches and generate a report.

Currently, the model can be called only through the DashScope Python SDK or over HTTP. The Java DashScope SDK and OpenAI-compatible API calls are not supported.

Python
import os
import dashscope

# Configure the API key
# If you have not configured the environment variable, replace the following line with your Model Studio API key: API_KEY = "sk-xxx"
API_KEY = os.getenv('DASHSCOPE_API_KEY')

def call_deep_research_model(messages, step_name):
    print(f"\n=== {step_name} ===")
    
    try:
        responses = dashscope.Generation.call(
            api_key=API_KEY,
            model="qwen-deep-research",
            messages=messages,
            # The qwen-deep-research model currently supports only streaming output
            stream=True
            # To use incremental output, add the incremental_output=True parameter
        )
        
        return process_responses(responses, step_name)
        
    except Exception as e:
        print(f"An error occurred when calling the API: {e}")
        return ""


# Display the phase content
def display_phase_content(phase, content, status):
    if content:
        print(f"\n[{phase}] {status}: {content}")
    else:
        print(f"\n[{phase}] {status}")

# Process the responses
def process_responses(responses, step_name):
    current_phase = None
    phase_content = ""
    research_goal = ""
    web_sites = []
    references = []
    keepalive_shown = False  # A flag to check if the KeepAlive prompt has been displayed

    for response in responses:
        # Check the response status code
        if hasattr(response, 'status_code') and response.status_code != 200:
            print(f"HTTP return code: {response.status_code}")
            if hasattr(response, 'code'):
                print(f"Error code: {response.code}")
            if hasattr(response, 'message'):
                print(f"Error message: {response.message}")
            print("For more information, see https://www.alibabacloud.com/help/en/model-studio/error-code")
            continue

        if hasattr(response, 'output') and response.output:
            message = response.output.get('message', {})
            phase = message.get('phase')
            content = message.get('content', '')
            status = message.get('status')
            extra = message.get('extra', {})

            # Detect phase changes
            if phase != current_phase:
                if current_phase and phase_content:
                    # Display different completion descriptions based on the phase name and step name
                    if step_name == "Step 1: Model follow-up question for confirmation" and current_phase == "answer":
                        print(f"\n Model follow-up question phase completed")
                    else:
                        print(f"\n {current_phase} phase completed")
                current_phase = phase
                phase_content = ""
                keepalive_shown = False  # Reset the KeepAlive prompt flag

                # Display different descriptions based on the phase name and step name
                if step_name == "Step 1: Model follow-up question for confirmation" and phase == "answer":
                    print(f"\n Entering model follow-up question phase")
                else:
                    print(f"\n Entering {phase} phase")
                    
            # Process reference information for the Answer phase
            if phase == "answer":
                if extra.get('deep_research', {}).get('references'):
                    new_references = extra['deep_research']['references']
                    if new_references and new_references != references:  # Avoid duplicate display
                        references = new_references
                        print(f"\n   References ({len(references)}):")
                        for i, ref in enumerate(references, 1):
                            print(f"     {i}. {ref.get('title', 'No title')}")
                            if ref.get('url'):
                                print(f"        URL: {ref['url']}")
                            if ref.get('description'):
                                print(f"        Description: {ref['description'][:100]}...")
                            print()

            # Process special information for the WebResearch phase
            if phase == "WebResearch":
                if extra.get('deep_research', {}).get('research'):
                    research_info = extra['deep_research']['research']

                    # Process the streamingQueries status
                    if status == "streamingQueries":
                        if 'researchGoal' in research_info:
                            goal = research_info['researchGoal']
                            if goal:
                                research_goal += goal
                                print(f"\n   Research goal: {goal}", end='', flush=True)

                    # Process the streamingWebResult status
                    elif status == "streamingWebResult":
                        if 'webSites' in research_info:
                            sites = research_info['webSites']
                            if sites and sites != web_sites:  # Avoid duplicate display
                                web_sites = sites
                                print(f"\n   Found {len(sites)} relevant websites:")
                                for i, site in enumerate(sites, 1):
                                    print(f"     {i}. {site.get('title', 'No title')}")
                                    print(f"        Description: {site.get('description', 'No description')[:100]}...")
                                    print(f"        URL: {site.get('url', 'No link')}")
                                    if site.get('favicon'):
                                        print(f"        Icon: {site['favicon']}")
                                    print()

                    # Process the WebResultFinished status
                    elif status == "WebResultFinished":
                        print(f"\n   Web search completed. Found {len(web_sites)} reference sources.")
                        if research_goal:
                            print(f"   Research goal: {research_goal}")

            # Accumulate and display content
            if content:
                phase_content += content
                # Display content in real time
                print(content, end='', flush=True)

            # Display phase status changes
            if status and status != "typing":
                print(f"\n   Status: {status}")

                # Display status description
                if status == "streamingQueries":
                    print("   → Generating research goals and search queries (WebResearch phase)")
                elif status == "streamingWebResult":
                    print("   → Performing search, web page reading, and code execution (WebResearch phase)")
                elif status == "WebResultFinished":
                    print("   → Web search phase completed (WebResearch phase)")

            # When the status is 'finished', display token usage
            if status == "finished":
                if hasattr(response, 'usage') and response.usage:
                    usage = response.usage
                    print(f"\n    Token usage statistics:")
                    print(f"      Input tokens: {usage.get('input_tokens', 0)}")
                    print(f"      Output tokens: {usage.get('output_tokens', 0)}")
                    print(f"      Request ID: {response.get('request_id', 'Unknown')}")

            if phase == "KeepAlive":
                # Display the prompt only the first time the KeepAlive phase is entered
                if not keepalive_shown:
                    print("The current step is complete. Preparing for the next step.")
                    keepalive_shown = True
                continue

    if current_phase and phase_content:
        if step_name == "Step 1: Model follow-up question for confirmation" and current_phase == "answer":
            print(f"\n Model follow-up question phase completed")
        else:
            print(f"\n {current_phase} phase completed")

    return phase_content

def main():
    # Check the API key
    if not API_KEY:
        print("Error: The DASHSCOPE_API_KEY environment variable is not set.")
        print("Set the environment variable or directly modify the API_KEY variable in the code.")
        return
    
    print("User initiates conversation: Research the applications of artificial intelligence in education")
    
    # Step 1: Model follow-up question for confirmation
    # The model analyzes the user's question and asks follow-up questions to clarify the research direction.
    messages = [{'role': 'user', 'content': 'Research the applications of artificial intelligence in education'}]
    step1_content = call_deep_research_model(messages, "Step 1: Model follow-up question for confirmation")

    # Step 2: Deep research
    # Based on the content of the follow-up question in Step 1, the model executes the complete research process.
    messages = [
        {'role': 'user', 'content': 'Research the applications of artificial intelligence in education'},
        {'role': 'assistant', 'content': step1_content},  # Contains the model's follow-up question
        {'role': 'user', 'content': 'I am mainly interested in personalized learning and intelligent assessment.'}
    ]
    
    call_deep_research_model(messages, "Step 2: Deep research")
    print("\n Research complete!")

if __name__ == "__main__":
    main()
HTTP

echo "Step 1: Model follow-up question for confirmation"
curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation' \
--header 'X-DashScope-SSE: enable' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "input": {
        "messages": [
            {
                "content": "Research the applications of artificial intelligence in education", 
                "role": "user"
            }
        ]
    },
    "model": "qwen-deep-research"
}'

echo -e "\n\n" 
echo "Step 2: Deep research"
curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation' \
--header 'X-DashScope-SSE: enable' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "input": {
        "messages": [
            {
                "content": "Research the applications of artificial intelligence in education", 
                "role": "user"
            },
            {
                "content": "Which specific application scenarios of artificial intelligence in education would you like to focus on?", 
                "role": "assistant"
            },
            {
                "content": "I am mainly interested in personalized learning.", 
                "role": "user"
            }
        ]
    },
    "model": "qwen-deep-research"
}'

Model list

| Model | Context window (tokens) | Maximum input (tokens) | Maximum output (tokens) |
| --- | --- | --- | --- |
| qwen-deep-research | 1,000,000 | 997,952 | 32,768 |

Core capabilities

The model uses the phase and status fields to report the state of its workflow. The phase field indicates the current core task, and the status field indicates the progress of that task.
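These two fields can be read from each streaming chunk. As a minimal illustration (the chunk below is a hand-written dict shaped like a parsed response body, not a live API result):

```python
def describe_chunk(chunk):
    """Summarize one streaming chunk by its phase and status fields."""
    message = chunk.get("output", {}).get("message", {})
    phase = message.get("phase")    # current core task, e.g. "answer", "WebResearch"
    status = message.get("status")  # progress within that task, e.g. "typing"
    return f"{phase}/{status}"

# Illustrative chunk shaped like a parsed streaming response body
chunk = {"output": {"message": {"phase": "WebResearch", "status": "streamingQueries"}}}
print(describe_chunk(chunk))  # WebResearch/streamingQueries
```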

Follow-up question and report generation (phase: "answer")

The model analyzes the user's initial question and asks follow-up questions to confirm the research scope. This phase is reused when the final report is generated.

Status changes:

  • typing: Generating text content.

  • finished: Text generation is complete.

Research planning (phase: "ResearchPlanning")

This phase creates a research outline based on the user's requirements.

Status changes:

  • typing: Generating the research plan.

  • finished: The research plan is complete.

Web search (phase: "WebResearch")

In this phase, the model performs searches and processes information.

Status changes:

  • streamingQueries: Generating search queries.

  • streamingWebResult: Performing web searches and analyzing page content.

  • WebResultFinished: Web search and information extraction are complete.
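During streamingWebResult, the website list arrives under extra.deep_research.research.webSites, the same path the full example above reads. A minimal extraction sketch (the chunk literal is illustrative, not a live response):

```python
def extract_sites(chunk):
    """Pull the webSites list out of a WebResearch/streamingWebResult chunk."""
    message = chunk.get("output", {}).get("message", {})
    research = message.get("extra", {}).get("deep_research", {}).get("research", {})
    return [site.get("title", "No title") for site in research.get("webSites", [])]

chunk = {"output": {"message": {
    "phase": "WebResearch", "status": "streamingWebResult",
    "extra": {"deep_research": {"research": {"webSites": [
        {"title": "AI in education survey", "url": "https://example.com"}]}}}}}}
print(extract_sites(chunk))  # ['AI in education survey']
```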

Persistent connection (phase: "KeepAlive")

This message is sent between long-running tasks to maintain the connection. This phase does not contain business-related content and can be ignored.

Billing information

| Model | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) | Free quota |
| --- | --- | --- | --- |
| qwen-deep-research | $0.007742 | $0.023367 | No free quota |

Billing method: Billing is based on the total number of input and output tokens. Input tokens include the content of user messages and the model's built-in system prompts. Output tokens include all generated content, such as follow-up questions, research plans, research goals, search queries, and the final research report.
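As a sketch of the arithmetic (prices taken from the table above; the token counts below are made-up numbers for illustration):

```python
# Per-1,000-token prices for qwen-deep-research, from the billing table above
INPUT_PRICE_PER_1K = 0.007742
OUTPUT_PRICE_PER_1K = 0.023367

def estimate_cost(input_tokens, output_tokens):
    """Estimate the cost in USD of one request from its token usage."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# e.g. a request that consumed 12,000 input tokens and 8,000 output tokens
print(round(estimate_cost(12_000, 8_000), 6))  # 0.27984
```

The input and output counts to feed into such a calculation are the ones reported in response.usage when the status is finished.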

Going live

Handle streaming output

The model supports only streaming output (stream=True). When you process responses, you must parse the phase and status fields to determine the current phase and its completion status.

Handle errors

Check the status code of the response and handle any errors for non-200 statuses. In the early stages of a streaming response, some response chunks might contain only metadata. Subsequent chunks contain the actual content.

Monitor token consumption

When the status is finished, you can retrieve token consumption statistics from response.usage. These statistics include the number of input tokens, output tokens, and the request ID.

Manage connections

The model might send a KeepAlive phase response between long-running tasks to maintain the connection. You can ignore the content of this phase and continue to process subsequent responses.
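The points in this section can be combined into a small consumer loop. This is a sketch over plain dicts standing in for parsed response chunks, not the DashScope SDK objects themselves:

```python
def consume(chunks):
    """Collect text from a stream, skipping KeepAlive and metadata-only chunks."""
    text = ""
    for chunk in chunks:
        message = (chunk.get("output") or {}).get("message") or {}
        if not message or message.get("phase") == "KeepAlive":
            continue  # no business content in this chunk
        text += message.get("content") or ""
    return text

# A fake stream: a metadata-only chunk, a KeepAlive, then real content
stream = [
    {"request_id": "abc"},  # early chunk carrying only metadata
    {"output": {"message": {"phase": "KeepAlive", "content": ""}}},
    {"output": {"message": {"phase": "answer", "status": "typing", "content": "Hello"}}},
]
print(consume(stream))  # Hello
```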

FAQ

  • Why is the output field empty for some response chunks?

    In the early stages of a streaming response, some response chunks might contain only metadata. Subsequent chunks contain the actual content.

  • How can I determine if a phase is complete?

    A phase is complete when the status field changes to "finished".

  • Does the model support OpenAI-compatible API calls?

    No, the model does not currently support OpenAI-compatible API calls.

  • How are the numbers of input and output tokens calculated?

    Input tokens include the content of messages sent by the user and the model's built-in system prompts. Output tokens include all content generated by the model throughout the research process, such as the follow-up question, research plan, research goals, search queries, and the final research report.

API reference

For information about the input and output parameters of the Qwen-Deep-Research model, see Qwen API details.

Error codes

If a call fails, see Error messages for troubleshooting.

Rate limiting

For information about rate limiting, see Rate limits.