Alibaba Cloud Model Studio: Text rerank

Last Updated: Oct 28, 2025

Text retrieval systems often trade precision for speed, so the results returned in the retrieval phase may not be ordered by true relevance. A text rerank model performs a second, more precise sort of the retrieved documents so that the documents most relevant to the user's query rank highest, which improves the accuracy of the application.

Model overview

| Model | Max number of documents | Max input tokens per item | Max input tokens per request | Supported languages | Price (per million input tokens) | Scenarios |
| --- | --- | --- | --- | --- | --- | --- |
| gte-rerank-v2 | 500 | 4,000 | 30,000 | Over 50 languages, such as Chinese, English, Japanese, Korean, Thai, Spanish, French, Portuguese, German, Indonesian, and Arabic | $0.115 | Text semantic retrieval; RAG applications |

  • Max input tokens per item: The maximum number of tokens for each query or document is 4,000. If the input content exceeds this length, it is truncated. The API calculates the result based on the truncated content, which may lead to inaccurate sorting.

  • Max number of documents: The maximum number of documents in each request is 500.

  • Max input tokens per request: The total number of tokens for the query and all documents in a request cannot exceed 30,000.
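The three limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `check_rerank_limits` and `rough_token_count` are hypothetical helpers, and the whitespace-based count is only a stand-in for the model's real tokenizer, which this page does not describe.

```python
MAX_DOCUMENTS = 500              # max number of documents per request
MAX_TOKENS_PER_ITEM = 4_000      # max tokens for the query or any single document
MAX_TOKENS_PER_REQUEST = 30_000  # max tokens for the query plus all documents


def rough_token_count(text):
    """Crude whitespace-based estimate; NOT the model's actual tokenizer."""
    return len(text.split())


def check_rerank_limits(query, documents, count_tokens=rough_token_count):
    """Return a list of problems; an empty list means no limit is exceeded."""
    problems = []
    if len(documents) > MAX_DOCUMENTS:
        problems.append(f"too many documents: {len(documents)} > {MAX_DOCUMENTS}")

    # sizes[0] is the query; sizes[1:] are the documents in input order.
    sizes = [count_tokens(query)] + [count_tokens(d) for d in documents]
    if sizes[0] > MAX_TOKENS_PER_ITEM:
        problems.append("query exceeds the per-item limit and would be truncated")
    for i, size in enumerate(sizes[1:]):
        if size > MAX_TOKENS_PER_ITEM:
            problems.append(f"document {i} exceeds the per-item limit and would be truncated")

    if sum(sizes) > MAX_TOKENS_PER_REQUEST:
        problems.append(f"request total {sum(sizes)} > {MAX_TOKENS_PER_REQUEST}")
    return problems
```

Passing a real tokenizer as `count_tokens` would make the estimate more faithful. With the default it is only a rough guard against oversized requests.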

Prerequisites

You must get an API key and set the API key as an environment variable. If you use the SDK to make calls, you must also install the DashScope SDK.

HTTP

POST https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank

Request

Text rerank

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "gte-rerank-v2",
    "input":{
         "query": "What is a text rerank model",
         "documents": [
         "Text rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance",
         "Quantum computing is a cutting-edge field in computer science",
         "The development of pre-trained language models has brought new progress to text rerank models"
         ]
    },
    "parameters": {
        "return_documents": true,
        "top_n": 2,
        "instruct": "Given a web search query, retrieve relevant passages that answer the query."
    }
}'

Headers

Content-Type string (Required)

The content type of the request. Set this parameter to application/json.

Authorization string (Required)

The identity authentication credentials for the request. This API uses a Model Studio API key for identity authentication. Example: Bearer sk-xxxx.

Request body

model string (Required)

The model name. The value must be gte-rerank-v2.

input object (Required)

The input content.

Properties

query string (Required)

The query text. The maximum length is 4,000 tokens.

documents array (Required)

A list of candidate documents to sort. Each element is a string. The list can contain up to 500 documents, and each document can be up to 4,000 tokens long.

parameters object (Optional)

Optional parameters.

Properties

top_n int (Optional)

The number of top-ranked documents to return. By default, all documents are returned. If the specified value is greater than the total number of documents, all documents are returned.

return_documents bool (Optional)

Specifies whether to return the original text of the documents in the sorting results. The default value is false, which reduces network overhead.
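To show how the nested input and parameters objects fit together, here is a minimal Python sketch that builds the same kind of request body as the curl example and sends it with the standard library. The helper names `build_rerank_payload` and `send_rerank_request` are illustrative and not part of any SDK. Optional parameters are left out of the body when not set, so the documented defaults apply.

```python
import json
import os
import urllib.request

RERANK_URL = "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank"


def build_rerank_payload(query, documents, top_n=None, return_documents=None):
    """Build the request body: `input` holds the texts, `parameters` the options."""
    payload = {
        "model": "gte-rerank-v2",
        "input": {"query": query, "documents": list(documents)},
    }
    parameters = {}
    if top_n is not None:
        parameters["top_n"] = top_n
    if return_documents is not None:
        parameters["return_documents"] = return_documents
    if parameters:  # `parameters` is optional, so leave it out when empty
        payload["parameters"] = parameters
    return payload


def send_rerank_request(payload, api_key=None):
    """POST the payload and return the parsed JSON response (hypothetical helper)."""
    api_key = api_key or os.environ["DASHSCOPE_API_KEY"]
    request = urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for non-2xx status codes, so with this sketch a failed call surfaces as an exception rather than as a returned error body.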

Response

Successful response

{
    "output": {
        "results": [
            {
                "document": {
                    "text": "Text rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance"
                },
                "index": 0,
                "relevance_score": 0.9334521178273196
            },
            {
                "document": {
                    "text": "The development of pre-trained language models has brought new progress to text rerank models"
                },
                "index": 2,
                "relevance_score": 0.34100082626411193
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    },
    "request_id": "85ba5752-1900-47d2-8896-23f99b13f6e1"
}

Failed response

If a request fails, the top-level code and message fields in the response indicate the cause of the error.

{
    "code":"InvalidApiKey",
    "message":"Invalid API-key provided.",
    "request_id":"fb53c4ec-1c12-4fc4-a580-cdb7c3261fc1"
}

request_id string

The unique request ID. You can use this ID to trace and troubleshoot issues.

output object

The task output.

Properties

results array

A list of the sorting results. The results are sorted by relevance_score in descending order.

Properties

document object

The original document object. This is returned only when the return_documents request parameter is true. The structure is {"text": "Original document text"}.

index int

The original index of the corresponding document in the input documents list.

relevance_score double

The semantic relevance score between the document and the query. The value ranges from 0.0 to 1.0. A higher score indicates stronger relevance.

Note

This score is a relative value within the current request and is used primarily for sorting documents within this request. It cannot be used as an absolute value for comparison across different requests.
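Because return_documents defaults to false, a common pattern is to use index to look up each result in the original documents list. The following sketch assumes a parsed successful response in the shape shown above. `attach_documents` is a hypothetical helper name, and the short document strings are placeholders.

```python
def attach_documents(response, documents):
    """Pair each ranked result with its source text using the `index` field.

    Returns (text, relevance_score) tuples in the order the API returned them,
    which is already descending by relevance_score.
    """
    ranked = []
    for result in response["output"]["results"]:
        text = documents[result["index"]]
        ranked.append((text, result["relevance_score"]))
    return ranked


documents = ["doc about reranking", "doc about quantum computing", "doc about language models"]
response = {  # trimmed copy of the sample response, without the `document` field
    "output": {
        "results": [
            {"index": 0, "relevance_score": 0.9334521178273196},
            {"index": 2, "relevance_score": 0.34100082626411193},
        ]
    }
}
for text, score in attach_documents(response, documents):
    print(f"{score:.3f}  {text}")
```

The scores are only used here to display the order within one request, in line with the note above that they are not comparable across requests.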

usage object

Usage statistics for the request.

Properties

total_tokens int

The total number of tokens consumed by the request.

code string

The error code for a failed request. This field is not returned if the request is successful. For more information, see Error messages.

message string

The detailed information about a failed request. This field is not returned if the request is successful. For more information, see Error messages.
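A caller of the HTTP API can tell the successful and failed response shapes apart by checking for the code field. A minimal sketch, with `RerankError` and `unwrap_rerank_response` as hypothetical names:

```python
class RerankError(Exception):
    """Raised when the response body carries an error `code` (hypothetical class)."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def unwrap_rerank_response(body):
    """Return the `results` list of a parsed response, or raise RerankError."""
    if body.get("code"):  # missing or empty `code` is treated as success
        raise RerankError(body["code"], body.get("message", ""), body.get("request_id"))
    return body["output"]["results"]
```

Keeping request_id on the exception makes it easy to include in logs when troubleshooting a failed call.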

Use the SDK

Example

The following example shows how to call the text rerank model API.

The SDK parameter names are mostly the same as those in the HTTP API, but the structure differs: the HTTP API nests values under input and parameters, whereas the SDK takes them as flat keyword arguments.

# The SDK reads the API key from the DASHSCOPE_API_KEY environment variable.
import dashscope

def text_rerank():
    resp = dashscope.TextReRank.call(
        model="gte-rerank-v2",
        query="What is a text rerank model",
        documents=[
            "Text rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance",
            "Quantum computing is a cutting-edge field in computer science",
            "The development of pre-trained language models has brought new progress to text rerank models"
        ],
        top_n=2,
        return_documents=True
    )
    print(resp)

if __name__ == '__main__':
    text_rerank()

Sample output

Note

The SDK encapsulates the original HTTP response. For a successful request, the SDK always returns the code and message fields with empty strings as their values.

{
    "status_code": 200,
    "request_id": "4b0805c0-6b36-490d-8bc1-4365f4c89905",
    "code": "",
    "message": "",
    "output": {
        "results": [
            {
                "index": 0,
                "relevance_score": 0.9334521178273196,
                "document": {
                    "text": "Text rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance"
                }
            },
            {
                "index": 2,
                "relevance_score": 0.34100082626411193,
                "document": {
                    "text": "The development of pre-trained language models has brought new progress to text rerank models"
                }
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    }
}
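Because the SDK fills code and message with empty strings on success, checking whether those keys exist does not distinguish success from failure. The sketch below instead tests status_code and treats an empty code as success. It works on a plain dict shaped like the sample output. `sdk_call_succeeded` is a hypothetical helper, and converting the SDK response object to such a dict is assumed rather than shown.

```python
def sdk_call_succeeded(resp):
    """True when the dict-shaped SDK response reports HTTP 200 and an empty `code`."""
    return resp.get("status_code") == 200 and not resp.get("code")


sample = {
    "status_code": 200,
    "request_id": "4b0805c0-6b36-490d-8bc1-4365f4c89905",
    "code": "",
    "message": "",
    "output": {"results": []},
    "usage": {"total_tokens": 79},
}
print(sdk_call_succeeded(sample))
```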

Error codes

If a call fails, see Error messages for troubleshooting.