Alibaba Cloud Model Studio: Text rerank

Last Updated: Mar 15, 2026

The initial retrieval stage of a search system often returns candidates in an imprecise order. A rerank model re-scores these candidate documents against the query so that the most relevant results surface first.

Model overview

Singapore

| Model | Max number of documents | Max input tokens per item | Max input tokens per request | Supported languages | Price (per 1M tokens) | Free quota | Scenarios |
| --- | --- | --- | --- | --- | --- | --- | --- |
| qwen3-rerank | 500 | 4,000 | 120,000 | Over 100 major languages, such as Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian | $0.1 | 1 million tokens, valid for 90 days after activating Model Studio | Text semantic retrieval; RAG applications |

Beijing

| Model | Max number of documents | Max input tokens per item | Max input tokens per request | Supported languages | Price (per 1M tokens) | Free quota | Scenarios |
| --- | --- | --- | --- | --- | --- | --- | --- |
| qwen3-vl-rerank | 100 | 8,000 | 120,000 | 33 major languages, such as Chinese, English, Japanese, Korean, French, and German | Image: $0.258; Text: $0.1 | No free quota | Image clustering; Cross-modal search; Image retrieval; Video retrieval |
| gte-rerank-v2 | 500 | 4,000 | 30,000 | Over 50 languages, such as Chinese, English, Japanese, Korean, Thai, Spanish, French, Portuguese, German, Indonesian, and Arabic | $0.115 | No free quota | Text semantic retrieval; RAG applications |

  • Max input tokens per item: The maximum number of tokens allowed in the query or in any single document. Input that exceeds this limit is truncated, which may reduce ranking accuracy.

  • Max number of documents: The maximum number of documents allowed in a single request.

  • Max input tokens per request: The token count of a request is calculated as Query tokens × Number of documents + Total document tokens. This total must not exceed the per-request token limit.
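The three limits above can be checked before a request is sent. The following is a minimal sketch, not part of the official SDK: the helper names `request_tokens` and `check_request` are hypothetical, the limits are copied from the model overview tables, and the token counts are assumed to be supplied by the caller, because this page does not specify a tokenizer.

```python
# Limits copied from the model overview tables on this page.
MODEL_LIMITS = {
    "qwen3-rerank": {"max_docs": 500, "max_item": 4000, "max_request": 120000},
    "qwen3-vl-rerank": {"max_docs": 100, "max_item": 8000, "max_request": 120000},
    "gte-rerank-v2": {"max_docs": 500, "max_item": 4000, "max_request": 30000},
}


def request_tokens(query_tokens, doc_tokens):
    """Query tokens x Number of documents + Total document tokens."""
    return query_tokens * len(doc_tokens) + sum(doc_tokens)


def check_request(model, query_tokens, doc_tokens):
    """Return a list of limit violations for a planned request (empty if none).

    query_tokens and doc_tokens are token counts estimated by the caller
    (hypothetical inputs; the tokenizer is not documented here).
    """
    limits = MODEL_LIMITS[model]
    problems = []
    if len(doc_tokens) > limits["max_docs"]:
        problems.append(f"too many documents: {len(doc_tokens)} > {limits['max_docs']}")
    if max([query_tokens, *doc_tokens]) > limits["max_item"]:
        problems.append("at least one item exceeds the per-item limit and will be truncated")
    total = request_tokens(query_tokens, doc_tokens)
    if total > limits["max_request"]:
        problems.append(f"request too large: {total} > {limits['max_request']} tokens")
    return problems


if __name__ == "__main__":
    # A 20-token query against three documents of 100, 50, and 30 tokens.
    print(request_tokens(20, [100, 50, 30]))  # 20 * 3 + 180 = 240
    print(check_request("gte-rerank-v2", 20, [100] * 300))
```

Note that the query is counted once per document, so a long query is more expensive than its own length suggests when many documents are ranked.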

Input limitations

| Model | Image | Video |
| --- | --- | --- |
| qwen3-vl-rerank | JPEG, PNG, WEBP, BMP, TIFF, ICO, DIB, ICNS, and SGI (URL or Base64 supported) | MP4, AVI, and MOV (URL only) |

Prerequisites

Get an API key and set it as an environment variable (the examples below read it from DASHSCOPE_API_KEY). To call the model through the SDK, also install the DashScope SDK.

HTTP

POST https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank

Request

qwen3-rerank

curl --request POST \
  --url https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
        "model": "qwen3-rerank",
        "documents": [
                "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
                "Quantum computing is a cutting-edge field of computer science.",
                "The development of pre-trained language models has brought new advancements to rerank models."
        ],
        "query": "What is a rerank model?",
        "top_n": 2,
        "instruct": "Given a web search query, retrieve relevant passages that answer the query."
}'

qwen3-vl-rerank

Text query

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen3-vl-rerank",
    "input":{
         "query": {"text": "What is a rerank model?"},
         "documents": [
            {"text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."},
            {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
            {"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250107/lbcemt/new+video.mp4"}
         ]
    },
    "parameters": {
        "return_documents": true,
        "top_n": 2,
        "fps": 1.0
    }
}'

Image query

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen3-vl-rerank",
    "input":{
         "query": {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
         "documents": [
            {"text": "Text rerank models are widely used in search engines and recommendation systems to sort candidate captions based on text relevance."},
            {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
            {"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250107/lbcemt/new+video.mp4"}
         ]
    },
    "parameters": {
        "return_documents": true,
        "top_n": 2,
        "fps": 1.0
    }
}'

gte-rerank-v2

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "gte-rerank-v2",
    "input":{
         "query": "What is a rerank model?",
         "documents": [
         "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
         "Quantum computing is a cutting-edge field of computer science.",
         "The development of pre-trained language models has brought new advancements to rerank models."
         ]
    },
    "parameters": {
        "return_documents": true,
        "top_n": 2
    }
}'

Request headers

Content-Type string (Required)

The content type of the request. Must be application/json.

Authorization string (Required)

The authentication credentials using a Model Studio API key.

Example: Bearer sk-xxxx
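As a rough illustration of the two required headers, the sketch below assembles a request with Python's standard library instead of curl. The function name `build_rerank_request` and its `api_key` and `url` arguments are hypothetical. The default URL is the endpoint shown in the HTTP section above, and the key is read from the DASHSCOPE_API_KEY environment variable used by the curl examples.

```python
import json
import os
import urllib.request

# Endpoint copied from the HTTP section above; change it if your model or
# region uses a different URL (see the curl examples on this page).
RERANK_URL = "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank"


def build_rerank_request(body, api_key=None, url=RERANK_URL):
    """Build (but do not send) a POST request carrying the required headers."""
    key = api_key or os.environ["DASHSCOPE_API_KEY"]
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_rerank_request({"model": "gte-rerank-v2"}, api_key="sk-xxxx")
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would send it. That step is left out here so the sketch runs without network access or a valid key.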

Request body

model string (Required)

The model name. Supported models include qwen3-rerank, gte-rerank-v2, and qwen3-vl-rerank.

input object (Required)

The input content.

When you call qwen3-rerank without the input object, place query and documents at the top level of the request body, alongside model.

Properties

query string | object (Required)

The query text (max 4,000 tokens).

For qwen3-vl-rerank, query supports two formats:

  • String format: You can pass a text string directly. For example, "query": "What is a rerank model?".

  • Object format: You can pass a dictionary to specify the modality type and value. The format is {"modality type": "input content"}. The text and image modality types are supported.

    • Text query: "query": {"text": "What is a text rerank model?"}

    • Image query: "query": {"image": "Image URL or Base64-encoded string"}

documents array (Required)

Candidate documents to sort. Each element is a string.

For qwen3-vl-rerank, each element is a dictionary or string specifying content type and value: {"modality type": "text/image URL/video URL"}. Supported types: text, image, video.

  • Text: Key is text, value is a string. You can pass the string directly without a dictionary.

  • Image: Key is image, value is a URL or Base64 Data URI (data:image/{format};base64,{data}, where {format} is jpeg/png and {data} is the encoded string).

  • Video: Key is video, value must be a publicly accessible URL.
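The Base64 Data URI form described above can be produced with the standard library. This is a minimal sketch with hypothetical helper names (`image_data_uri`, `image_document`). It only formats the string and does not check the image against the supported formats.

```python
import base64


def image_data_uri(image_bytes, image_format="png"):
    """Encode raw image bytes as data:image/{format};base64,{data}."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{image_format};base64,{encoded}"


def image_document(path, image_format="png"):
    """Read a local image file and wrap it as a qwen3-vl-rerank document entry."""
    with open(path, "rb") as f:
        return {"image": image_data_uri(f.read(), image_format)}


if __name__ == "__main__":
    # The first bytes of a PNG file, used here only as stand-in data.
    print(image_data_uri(b"\x89PNG\r\n", "png"))
```

Videos cannot be passed this way, because the video value must be a publicly accessible URL.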

parameters object (Optional)

Optional parameters.

When you use qwen3-rerank, the parameters object is not required. In this case, the top_n and instruct parameters must be at the same level as the model parameter.

Properties

top_n int (Optional)

Number of top-ranked documents to return (default: all). If the value exceeds total documents, all are returned.

return_documents bool (Optional)

Whether to return document text in results (default: false to reduce network overhead). Supported models: gte-rerank-v2, qwen3-vl-rerank.

instruct string (Optional)

Custom instruction for sorting. Applies only to qwen3-rerank and qwen3-vl-rerank. Guides the model to apply different sorting policies. Examples:

  • Q&A retrieval task (default): "Given a web search query, retrieve relevant passages that answer the query."

    • Focus: Find answers. Model prioritizes whether a document answers the query.

    • Example: Query "How to prevent a cold?" - Document "Washing hands frequently prevents colds" scores high; "A cold is a common illness" scores low (topically relevant but doesn't answer).

  • Semantic similarity sorting task: "Retrieve semantically similar text."

    • Focus: Determine semantic equivalence. Model evaluates whether query and document have consistent core meaning, regardless of wording or structure.

    • Example: In FAQ scenarios, "How do I change my password?" and "What if I forget my password?" are semantically similar (high score). Model focuses on whether both reflect the same user intent.

Write instructions in English. If instruct is omitted, the Q&A retrieval instruction is used. For more instruction examples, see the model repository.

fps float (Optional)

Supported only by qwen3-vl-rerank. Controls the number of frames extracted from a video. Smaller values extract fewer frames. Range: 0 to 1 (default: 1.0).
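The two request layouts described above (flat for qwen3-rerank, nested input and parameters for the other models) can be captured in one helper. This is a sketch under stated assumptions: `build_rerank_body` and `SUPPORTED_PARAMS` are hypothetical names, and the per-model parameter sets are inferred from the parameter descriptions on this page rather than from a published schema.

```python
# Optional parameters per model, inferred from the descriptions above.
SUPPORTED_PARAMS = {
    "qwen3-rerank": {"top_n", "instruct"},
    "gte-rerank-v2": {"top_n", "return_documents"},
    "qwen3-vl-rerank": {"top_n", "return_documents", "instruct", "fps"},
}


def build_rerank_body(model, query, documents, **params):
    """Return the JSON body for a rerank request as a Python dict."""
    unsupported = set(params) - SUPPORTED_PARAMS[model]
    if unsupported:
        raise ValueError(f"{model} does not support: {sorted(unsupported)}")
    if model == "qwen3-rerank":
        # Flat layout: everything sits at the same level as "model",
        # matching the qwen3-rerank curl example.
        return {"model": model, "query": query, "documents": documents, **params}
    # Nested layout: query/documents under "input", options under "parameters".
    body = {"model": model, "input": {"query": query, "documents": documents}}
    if params:
        body["parameters"] = params
    return body


if __name__ == "__main__":
    docs = ["Rerank models sort candidate documents.", "Quantum computing is a field."]
    print(build_rerank_body("qwen3-rerank", "What is a rerank model?", docs, top_n=1))
    print(build_rerank_body("gte-rerank-v2", "What is a rerank model?", docs,
                            top_n=1, return_documents=True))
```

Rejecting unsupported options locally is a design choice for this sketch. The page does not state how the service responds to parameters that a model does not support.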

Response

Successful response

{
    "output": {
        "results": [
            {
                "document": {
                    "text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."
                },
                "index": 0,
                "relevance_score": 0.9334521178273196
            },
            {
                "document": {
                    "text": "The development of pre-trained language models has brought new advancements to rerank models."
                },
                "index": 2,
                "relevance_score": 0.34100082626411193
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    },
    "request_id": "85ba5752-1900-47d2-8896-23f99b13f6e1"
}

Failed response

If a request fails, code and message fields indicate the error cause.

{
    "code":"InvalidApiKey",
    "message":"Invalid API-key provided.",
    "request_id":"fb53c4ec-1c12-4fc4-a580-cdb7c3261fc1"
}

request_id string

Unique identifier for the request. Use for tracing and troubleshooting issues.

output object

Task output.

Properties

results array

Sorting results, ordered by relevance_score (descending).

Properties

document dict

Original document object. Returned only when return_documents is true. Structure: {"text": "Original document text"}.

index int

Document's original index in the input documents list.

relevance_score double

Semantic relevance score between document and query. Range: 0.0-1.0 (higher = stronger relevance).

Note

This score is relative to the current request and is meaningful only for sorting within that request. Scores cannot be compared across requests.

usage object

Usage statistics for the request.

Properties

total_tokens int

Total tokens consumed by the request.

code string

The error code. Returned only when the request fails. See error codes for details.

message string

Detailed error message. Returned only when the request fails. See error codes for details.
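The response fields above can be consumed roughly as follows. This is an illustrative sketch, not SDK code: `RerankError` and `ranked_documents` are hypothetical names, and the response is assumed to be already parsed into a Python dict. The index field is used to look up each original document, which is useful when return_documents is false.

```python
class RerankError(Exception):
    """Raised when the response body carries an error code."""


def ranked_documents(response, documents):
    """Return (relevance_score, original_document) pairs in ranked order.

    response  -- the parsed JSON body of a rerank call
    documents -- the list that was sent as the request's documents
    """
    if response.get("code"):
        # Failed responses carry code and message instead of output.
        raise RerankError(
            f"{response['code']}: {response.get('message', '')} "
            f"(request_id={response.get('request_id')})"
        )
    return [
        (item["relevance_score"], documents[item["index"]])
        for item in response["output"]["results"]
    ]


if __name__ == "__main__":
    docs = ["doc about rerank", "doc about quantum computing", "doc about language models"]
    response = {
        "output": {"results": [
            {"index": 0, "relevance_score": 0.9334521178273196},
            {"index": 2, "relevance_score": 0.34100082626411193},
        ]},
        "usage": {"total_tokens": 79},
        "request_id": "85ba5752-1900-47d2-8896-23f99b13f6e1",
    }
    for score, doc in ranked_documents(response, docs):
        print(f"{score:.3f}  {doc}")
```

Because scores are relative to a single request, this sketch only orders documents within one response and does not merge or compare results from several calls.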

Use the SDK

Example

Example: Call the rerank model API.

SDK parameter names match those of the HTTP API, but the structure differs: the HTTP API nests values in the input and parameters objects, while the SDK takes them as flat keyword arguments. Keep this difference in mind during development.

import dashscope

def text_rerank():
    resp = dashscope.TextReRank.call(
        model="gte-rerank-v2",
        query="What is a rerank model?",
        documents=[
            "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
            "Quantum computing is a cutting-edge field of computer science.",
            "The development of pre-trained language models has brought new advancements to rerank models."
        ],
        top_n=2,
        return_documents=True
    )
    print(resp)

if __name__ == '__main__':
    text_rerank()

Example: Use qwen3-vl-rerank for multimodal sorting with an image query.

import dashscope
from http import HTTPStatus
import json

def vl_rerank():
    resp = dashscope.TextReRank.call(
        model="qwen3-vl-rerank",
        query={"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
        documents=[
            {"text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."},
            {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
            {"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250107/lbcemt/new+video.mp4"}
        ],
        top_n=2,
        return_documents=True
    )
    if resp.status_code == HTTPStatus.OK:
        print(json.dumps(resp, default=str, ensure_ascii=False, indent=4))
    else:
        print(resp)


if __name__ == '__main__':
    vl_rerank()

Sample output

Note

The SDK wraps the HTTP response. For successful requests, the code and message fields are always empty strings.

{
    "status_code": 200,
    "request_id": "4b0805c0-6b36-490d-8bc1-4365f4c89905",
    "code": "",
    "message": "",
    "output": {
        "results": [
            {
                "index": 0,
                "relevance_score": 0.9334521178273196,
                "document": {
                    "text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."
                }
            },
            {
                "index": 2,
                "relevance_score": 0.34100082626411193,
                "document": {
                    "text": "The development of pre-trained language models has brought new advancements to rerank models."
                }
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    }
}

Error codes

If the model call fails and returns an error message, see Error messages for resolution.