Alibaba Cloud Model Studio: Text rerank

Last Updated: Jun 04, 2026

A rerank model re-scores documents returned by initial retrieval, surfacing the most relevant results at the top.

Model overview

Important

The gte-rerank model will be discontinued on May 30, 2026. Switch to qwen3-rerank.

Singapore

| Model | Max documents | Max input tokens per item | Max input tokens per request | Supported languages | Scenarios |
| --- | --- | --- | --- | --- | --- |
| qwen3-rerank | 500 | 4,000 | 120,000 | Over 100 major languages, such as Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian | Text semantic retrieval; RAG applications |

Beijing

| Model | Max documents | Max input tokens per item | Max input tokens per request | Supported languages | Scenarios |
| --- | --- | --- | --- | --- | --- |
| qwen3-vl-rerank | Text: 100; Image: 40; Video: 4 | 8,000 | 120,000 | 33 major languages, such as Chinese, English, Japanese, Korean, French, and German | Image clustering; Cross-modal search; Image retrieval; Video retrieval |
| gte-rerank-v2 | 500 | 4,000 | 30,000 | Over 50 languages, such as Chinese, English, Japanese, Korean, Thai, Spanish, French, Portuguese, German, Indonesian, and Arabic | Text semantic retrieval; RAG applications |

  • Max input tokens per item: Maximum tokens per query or document. Exceeding this limit triggers truncation, which may reduce ranking accuracy.

  • Max documents: Maximum documents per request. For qwen3-vl-rerank, the limit varies by document type (text, image, video, or mixed).

  • Max input tokens per request: Calculated as Query Tokens × Number of documents + Total document tokens. Must not exceed the per-request limit.
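The per-request formula above can be checked before sending a request. The following is a minimal sketch, assuming you already have token counts for the query and each document (the page does not say how to obtain them, so they are plain integers here); the function names are hypothetical.

```python
from typing import Sequence


def request_token_count(query_tokens: int, doc_tokens: Sequence[int]) -> int:
    """Query tokens x number of documents + total document tokens."""
    return query_tokens * len(doc_tokens) + sum(doc_tokens)


def fits_request_limit(query_tokens: int, doc_tokens: Sequence[int],
                       max_request_tokens: int) -> bool:
    """True if the request stays within the model's per-request token limit."""
    return request_token_count(query_tokens, doc_tokens) <= max_request_tokens


# Made-up token counts: a 20-token query and three documents.
total = request_token_count(20, [300, 450, 250])
print(total)  # 20 * 3 + 1000 = 1060
print(fits_request_limit(20, [300, 450, 250], 30_000))  # True (gte-rerank-v2 limit)
```

Because the query is counted once per document, a long query paired with many documents can exhaust the budget much faster than the document lengths alone suggest.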

Input limitations

| Model | Image | Video |
| --- | --- | --- |
| qwen3-vl-rerank | JPEG, PNG, WEBP, BMP, TIFF, ICO, DIB, ICNS, and SGI (URL or Base64 supported) | MP4, AVI, and MOV (URL only) |

Prerequisites

Create an API key and set it as an environment variable (the examples below read it from DASHSCOPE_API_KEY). To call the models through the SDK, also install the DashScope SDK.

HTTP

The models are served from two different endpoints:

  • qwen3-rerank: POST https://dashscope.aliyuncs.com/compatible-api/v1/reranks

  • qwen3-vl-rerank / gte-rerank-v2: POST https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank

The two APIs differ in request body structure and response format. See the request and response examples for each model.
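To make the structural difference concrete, here is a minimal sketch of a helper that picks the endpoint and shapes the body per model. The helper name `build_rerank_request` is hypothetical, the URLs are copied from the list above (other regions may use different hosts), and it assumes that for the nested API all optional settings go under `parameters`. It only builds the payload and does not send anything.

```python
from typing import Any, Optional

# URLs copied from the endpoint list above; other regions may use different hosts.
COMPATIBLE_URL = "https://dashscope.aliyuncs.com/compatible-api/v1/reranks"
NATIVE_URL = "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank"


def build_rerank_request(
    model: str,
    query: Any,
    documents: list,
    top_n: Optional[int] = None,
    instruct: Optional[str] = None,
    return_documents: Optional[bool] = None,
) -> tuple:
    """Hypothetical helper: return (url, body) shaped for the given model."""
    options = {"top_n": top_n, "instruct": instruct}
    if model == "qwen3-rerank":
        # Flat body: query, documents, and options sit next to "model".
        body = {"model": model, "query": query, "documents": documents}
        body.update({k: v for k, v in options.items() if v is not None})
        return COMPATIBLE_URL, body

    # Nested body: query/documents under "input", options under "parameters".
    # No per-model validation is done here (for example, instruct is passed through as given).
    options["return_documents"] = return_documents
    parameters = {k: v for k, v in options.items() if v is not None}
    body = {"model": model, "input": {"query": query, "documents": documents}}
    if parameters:
        body["parameters"] = parameters
    return NATIVE_URL, body


url, body = build_rerank_request(
    "gte-rerank-v2", "What is a rerank model?", ["doc A", "doc B"], top_n=1
)
print(url)
print(body)
```

The returned body can be serialized with `json.dumps` and sent with any HTTP client, together with the Authorization and Content-Type headers described below.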

Request

qwen3-rerank

# This is the Singapore region URL. Replace WorkspaceId with your actual workspace ID. URLs differ by region.
curl --request POST \
  --url https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1/reranks \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
        "model": "qwen3-rerank",
        "documents": [
                "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
                "Quantum computing is a cutting-edge field of computer science.",
                "The development of pre-trained language models has brought new advancements to rerank models."
        ],
        "query": "What is a rerank model?",
        "top_n": 2,
        "instruct": "Given a web search query, retrieve relevant passages that answer the query."
}'

qwen3-vl-rerank

Text query

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen3-vl-rerank",
    "input":{
         "query": {"text": "What is a rerank model?"},
         "documents": [
            {"text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."},
            {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
            {"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250107/lbcemt/new+video.mp4"}
         ]
    },
    "parameters": {
        "return_documents": true,
        "top_n": 2,
        "fps": 1.0
    }
}'

Image query

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen3-vl-rerank",
    "input":{
         "query": {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
         "documents": [
            {"text": "Text rerank models are widely used in search engines and recommendation systems to sort candidate captions based on text relevance."},
            {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
            {"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250107/lbcemt/new+video.mp4"}
         ]
    },
    "parameters": {
        "return_documents": true,
        "top_n": 2,
        "fps": 1.0
    }
}'

gte-rerank-v2

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "gte-rerank-v2",
    "input":{
         "query": "What is a rerank model?",
         "documents": [
         "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
         "Quantum computing is a cutting-edge field of computer science.",
         "The development of pre-trained language models has brought new advancements to rerank models."
         ]
    },
    "parameters": {
        "return_documents": true,
        "top_n": 2
    }
}'

Request headers

Content-Type string (Required)

The content type of the request. Must be application/json.

Authorization string (Required)

Authenticates the request with a Model Studio API key. Example: Bearer sk-xxxx.

Request body

model string (Required)

The model name. Supported values: qwen3-rerank, gte-rerank-v2, qwen3-vl-rerank.

input object (Required)

Input content.

qwen3-rerank does not use the input object. Place query and documents at the same level as model.

Properties

query string | object (Required)

Query text (max 4,000 tokens).

qwen3-vl-rerank supports two query formats:

  • String format: Pass a text string directly, e.g., "query": "What is a rerank model?".

  • Object format: Pass a dictionary specifying modality type and value as {"modality type": "input content"}. Supported types: text and image.

    • Text query: "query": {"text": "What is a text rerank model?"}

    • Image query: "query": {"image": "Image URL or Base64-encoded string"}

documents array (Required)

Candidate documents to sort. Each element is a string.

qwen3-vl-rerank accepts a dictionary or string per element: {"modality type": "text/image URL/video URL"}. Supported types: text, image, video.

  • Text: Key is text, value is a string. A plain string without a dictionary wrapper also works.

  • Image: Key is image, value is a URL or Base64 Data URI (data:image/{format};base64,{data}, where {format} is jpeg/png and {data} is the encoded string).

  • Video: Key is video, value must be a publicly accessible URL.
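For the Base64 option, the Data URI described above can be assembled with the standard library. This is a minimal sketch; the function name `image_data_uri` is hypothetical, and the placeholder bytes stand in for real image file contents (in practice you might read them with `pathlib.Path(...).read_bytes()`).

```python
import base64


def image_data_uri(image_bytes: bytes, image_format: str = "png") -> str:
    """Build data:image/{format};base64,{data} from raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{image_format};base64,{encoded}"


# Placeholder bytes (the PNG signature only), not a complete image.
png_signature = b"\x89PNG\r\n\x1a\n"
document = {"image": image_data_uri(png_signature, "png")}
print(document["image"][:22])  # data:image/png;base64,
```

The resulting dictionary can be used as one element of documents, or as the query value for an image query.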

parameters object (Optional)

Optional parameters.

For qwen3-rerank, the parameters object is not required. Place top_n and instruct at the same level as model.

Properties

top_n int (Optional)

Number of top-ranked documents to return. Default: all documents. If top_n exceeds the number of input documents, all documents are returned.

return_documents bool (Optional)

Whether to include document text in results. Default: false. Supported models: gte-rerank-v2, qwen3-vl-rerank.

instruct string (Optional)

Custom sorting instruction. Applies to qwen3-rerank and qwen3-vl-rerank. Guides the model to apply different sorting policies. Examples:

  • Q&A retrieval task (default): "Given a web search query, retrieve relevant passages that answer the query."

    • Focus: Find answers. Model prioritizes whether a document answers the query.

    • Example: Query "How to prevent a cold?" - Document "Washing hands frequently prevents colds" scores high; "A cold is a common illness" scores low (topically relevant but doesn't answer).

  • Semantic similarity sorting task: "Retrieve semantically similar text."

    • Focus: Determine semantic equivalence. Model evaluates whether query and document have consistent core meaning, regardless of wording or structure.

    • Example: In FAQ scenarios, "How do I change my password?" and "What if I forget my password?" are semantically similar (high score). Model focuses on whether both reflect the same user intent.

Write instructions in English. Default: Q&A retrieval task. More instructions are available in the model repository.

fps float (Optional)

qwen3-vl-rerank only. Controls how many frames are extracted from video input. Smaller values extract fewer frames. Range: 0-1. Default: 1.0.

Response

Successful response

qwen3-rerank

{
    "object": "list",
    "results": [
        {
            "index": 0,
            "relevance_score": 0.9334521178273196
        },
        {
            "index": 2,
            "relevance_score": 0.34100082626411193
        }
    ],
    "model": "qwen3-rerank",
    "id": "85ba5752-1900-47d2-8896-23f99b13f6e1",
    "usage": {
        "total_tokens": 79
    }
}

qwen3-vl-rerank / gte-rerank-v2

{
    "output": {
        "results": [
            {
                "document": {
                    "text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."
                },
                "index": 0,
                "relevance_score": 0.9334521178273196
            },
            {
                "document": {
                    "text": "The development of pre-trained language models has brought new advancements to rerank models."
                },
                "index": 2,
                "relevance_score": 0.34100082626411193
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    },
    "request_id": "85ba5752-1900-47d2-8896-23f99b13f6e1"
}

Failed response

If a request fails, code and message indicate the error cause.

{
    "code":"InvalidApiKey",
    "message":"Invalid API-key provided.",
    "request_id":"fb53c4ec-1c12-4fc4-a580-cdb7c3261fc1"
}

request_id string

Unique request identifier for tracing and troubleshooting.

output object

Task output.

For qwen3-rerank, the response does not contain the output object. The results array is at the top level.

Properties

results array

Sorting results, ordered by relevance_score descending.

Properties

document dict

Original document object. Returned only when return_documents is true. Format: {"text": "Original document text"}.

index int

Document index in the input documents array.

relevance_score double

Semantic relevance between document and query. Range: 0.0-1.0 (higher = more relevant).

Note

Scores are relative to the current request and cannot be compared across requests.
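Since index points back into the input documents array, results can be paired with the original documents even when return_documents is not set. The sketch below handles both response shapes shown above (results at the top level for qwen3-rerank, results under output for the other models); the function names are hypothetical and the response is treated as a plain dict parsed from JSON.

```python
def extract_results(response: dict) -> list:
    """Return the results array from either response shape."""
    if "output" in response:
        return response["output"]["results"]  # qwen3-vl-rerank / gte-rerank-v2
    return response["results"]                # qwen3-rerank


def rank_documents(response: dict, documents: list) -> list:
    """Pair each result with its input document, keeping the returned order."""
    return [
        (documents[item["index"]], item["relevance_score"])
        for item in extract_results(response)
    ]


documents = [
    "Rerank models are widely used in search engines and recommendation systems.",
    "Quantum computing is a cutting-edge field of computer science.",
    "The development of pre-trained language models has brought new advancements to rerank models.",
]
# Shortened version of the qwen3-rerank sample response above.
sample = {
    "object": "list",
    "results": [
        {"index": 0, "relevance_score": 0.9334521178273196},
        {"index": 2, "relevance_score": 0.34100082626411193},
    ],
}
for text, score in rank_documents(sample, documents):
    print(f"{score:.3f}  {text}")
```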

usage object

Token usage statistics.

Properties

total_tokens int

Total tokens consumed by the request.

code string

Error code. Returned only for failed requests. See Error codes.

message string

Detailed error message. Returned only for failed requests. See Error codes.
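A caller can use code and message to separate failures from successes. The following is a minimal sketch on a parsed response dict; `RerankError` and `raise_for_error` are hypothetical names, and an empty or missing code is treated as success because the SDK note later on this page says code and message are empty strings for successful requests.

```python
class RerankError(Exception):
    """Hypothetical exception carrying the fields of a failed response."""

    def __init__(self, code: str, message: str, request_id: str):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def raise_for_error(response: dict) -> dict:
    """Return the response unchanged, or raise if it carries an error code."""
    code = response.get("code")
    if code:  # a missing or empty code is treated as success
        raise RerankError(code, response.get("message", ""), response.get("request_id", ""))
    return response


failed = {
    "code": "InvalidApiKey",
    "message": "Invalid API-key provided.",
    "request_id": "fb53c4ec-1c12-4fc4-a580-cdb7c3261fc1",
}
try:
    raise_for_error(failed)
except RerankError as err:
    print(err.code, "|", err.request_id)
```

Logging request_id alongside the error code keeps the identifier available for tracing and troubleshooting.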

Use the SDK

Example

Call the rerank model API.

SDK parameter names match the HTTP API, but the structure differs. HTTP uses nested input and parameters objects; the SDK uses a flat structure.

import dashscope

def text_rerank():
    resp = dashscope.TextReRank.call(
        model="gte-rerank-v2",
        query="What is a rerank model?",
        documents=[
            "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
            "Quantum computing is a cutting-edge field of computer science.",
            "The development of pre-trained language models has brought new advancements to rerank models."
        ],
        top_n=2,
        return_documents=True
    )
    print(resp)

if __name__ == '__main__':
    text_rerank()

Use qwen3-vl-rerank for multimodal reranking with an image query.

import dashscope
from http import HTTPStatus
import json

def vl_rerank():
    resp = dashscope.TextReRank.call(
        model="qwen3-vl-rerank",
        query={"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
        documents=[
            {"text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."},
            {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
            {"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250107/lbcemt/new+video.mp4"}
        ],
        top_n=2,
        return_documents=True
    )
    if resp.status_code == HTTPStatus.OK:
        print(json.dumps(resp, default=str, ensure_ascii=False, indent=4))
    else:
        print(resp)


if __name__ == '__main__':
    vl_rerank()

Sample output

Note

The SDK wraps the HTTP response. For successful requests, code and message are always empty strings.

{
    "status_code": 200,
    "request_id": "4b0805c0-6b36-490d-8bc1-4365f4c89905",
    "code": "",
    "message": "",
    "output": {
        "results": [
            {
                "index": 0,
                "relevance_score": 0.9334521178273196,
                "document": {
                    "text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."
                }
            },
            {
                "index": 2,
                "relevance_score": 0.34100082626411193,
                "document": {
                    "text": "The development of pre-trained language models has brought new advancements to rerank models."
                }
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    }
}

Error codes

If the model call fails and returns an error message, see Error codes for resolution.