
Platform For AI: API feature description

Last Updated: Apr 23, 2025

Judge models allow you to call the algorithm services by using the OpenAI SDK for Python or over HTTP. This topic describes how to call the APIs and provides the API parameters and sample code for judge models.

Chat completions

Sample code

  1. Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        completion = client.chat.completions.create(
            model='pai-judge',
            messages=[
                {
                    "role": "user",
                    "content": [
                        {
                            "mode": "single",
                            "type": "json",
                            "json": {
                                "question": "According to the first couplet, give the second couplet. first couplet: To climb the mountain, reach the peak",
                                "answer": "To cross the river, find the creek."
                            }
                        }
                    ]
                }
            ]
        )
        print(completion.model_dump())
    
    
    if __name__ == '__main__':
        main()
    $ curl -X POST https://aiservice.cn-hangzhou.aliyuncs.com/v1/chat/completions \
      -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "pai-judge",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "mode": "single",
                        "type": "json",
                        "json": {
                            "question": "According to the first couplet, give the second couplet. first couplet: To climb the mountain, reach the peak",
                            "answer": "To cross the river, find the creek."
                        }
                    }
                ]
            }
        ]
    }'
  2. Response example

    {
        "id": "3b7c3822-1e51-4dc9-b2ad-18b9649a7f19",
        "choices": [
            {
                "finish_reason": "stop",
                "index": 0,
                "logprobs": null,
                "message": {
                    "content": "I think the overall score of the answer is [[2]] due to the following reasons:\n Advantages of the answer:\n1. Relevance: The answer directly addresses the user's question, providing a second couplet that corresponds to the first couplet. ThiS meets the relevance criteria.[[4]]\n2. Harmlessness: The answer does not contain offensive content. This meets the harmlessness criteria.[[5]]\n\n Disadvantages of the answer: \n1. Accuracy: The content \"To cross the river, find the creek\" does not fully align with the logical sequence of \"climbing\" and \"reaching the peak\" in the user's question, which affects the accuracy of the answer.[[2]]\n2. Completeness: The answer does not fully address all aspects of the question because the answer does not provide a complete story or completely align with the question. This affects the completeness of the answer.[[2]]\n3. Source reliability: The answer does not provide source information. Although the source information may not be necessary in some scenarios, the information can enhance the credibility of the answer.[[3]]\n4. Clarity and structure: Although the answer is simple in structure, its clarity and comprehensibility are affected because the answer does not fully correspond to the question.[[3]]\n5. Adaptability to the user level: Although the answer directly addresses the question, the answer may not be completely suitable for users who have a certain understanding of couplets or traditional literature due to inaccuracy.[[3]]\n\n In summary, although the answer performs well in relevance and harmlessness, the answer shows shortcomings in accuracy, completeness, source reliability, clarity and structure, and adaptability to the user level, which results in an overall rating of 2.",
                    "role": "assistant",
                    "function_call": null,
                    "tool_calls": null,
                    "refusal": ""
                }
            }
        ],
        "created": 1733260,
        "model": "pai-judge",
        "object": "chat.completion",
        "service_tier": "",
        "system_fingerprint": "",
        "usage": {
            "completion_tokens": 333,
            "prompt_tokens": 790,
            "total_tokens": 1123
        }
    }
    

Request parameters

The APIs of judge models are compatible with those of OpenAI. You can configure the parameters described in the following table; a sketch that sets several of these parameters follows the table. For information about other parameters, see the OpenAI documentation.

Note

If you cannot access the web page, you may need to configure a proxy and then try again.

Parameter

Type

Required

Default value

Description

model

string

Yes

None

The name of the model. Valid values:

  • pai-judge (standard edition)

  • pai-judge-plus (advanced edition)

For more information about the models, see Supported judge models.

messages

array

Yes

None

The content to be evaluated.

temperature

float

No

0.2

Controls the randomness and diversity of the answer provided by the model. Valid values: [0, 2).

top_p

float

No

None

The probability threshold of the nucleus sampling method used in the generation process.

stream

boolean

No

False

Specifies whether to enable the streaming output mode.

stream_options

object

No

None

Specifies whether to display the number of tokens used in the streaming output mode. This parameter takes effect only when the stream parameter is set to True. If you want to count the number of tokens in streaming output mode, set this parameter to stream_options={"include_usage":True}.

max_tokens

integer

No

2048

The maximum number of tokens that can be generated by the model.

frequency_penalty

float or null

No

0

Positive values penalize new tokens based on how frequently they already appear in the text, which reduces the possibility that the model repeats the same line of text word for word. Valid values: [-2.0, 2.0].

presence_penalty

float

No

0

Penalizes new tokens based on whether they have already appeared in the text. Larger values reduce the possibility that the model generates content that has already appeared. Valid values: [-2.0, 2.0].

seed

integer

No

None

The random seed used during content generation. This parameter controls the randomness of the generated content. Valid values: unsigned 64-bit integers.

stop

string or array

No

None

If you set this parameter to a string or a token ID, the model stops generating content before the string or token ID is generated. This parameter helps achieve precise control over the content generation process.

tools

array

No

None

The list of tools that can be called by the model. The model calls a tool from the tool list during each function call.

tool_choice

string or object

No

None

The tool that can be called by the model.

parallel_tool_calls

boolean

No

True

Specifies whether to enable the parallel function calling feature when a tool is used.

user

string

No

None

The user identifier.

logit_bias

map

No

None

Modifies the likelihood that the specified tokens appear in the generated content.

logprobs

boolean

No

False

Specifies whether to return the log probability of each output token. Valid values:

  • False

  • True

top_logprobs

integer

No

None

The maximum number of tokens that can be returned at each token position, each with an associated log probability. Valid values: 0 to 20. If you configure this parameter, you must set the logprobs parameter to True.

n

integer

No

1

The number of chat completion options that are generated for each input message.

response_format

object

No

{"type": "text"}

The format of the object that the model must return. Valid values:

  • {"type": "text"} (default): The model returns an evaluation response in natural language.

  • {"type": "json_object"}: The model returns an evaluation response in the JSON format. Example:

    {
      "Total score": "4",
      "Accuracy": {
        "Score": "5",
        "Reason": "The answer accurately provides the corresponding couplet \"To cross the river, find the creek\" to \"To climb the mountain, reach the peak\", conforming to the rules of couplets and demonstrating a high degree of accuracy."
      },
      "Relevance": {
        "Score": "3",
        "Reason": "The answer directly addresses the user's question without including any unnecessary information, ensuring high relevance."
      }
    }
    

service_tier

string

No

None

The latency tier that is used to process requests.
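
The following is a minimal sketch that passes several of the optional parameters from the preceding table through the OpenAI SDK for Python, including the streaming output mode with stream_options. The parameter values are examples only; adjust them to your scenario.

import os
from openai import OpenAI

# Credentials and endpoint, taken from the request example above.
client = OpenAI(
    api_key=f'Authorization: Bearer {os.getenv("JUDGE_MODEL_TOKEN")}',
    base_url="https://aiservice.cn-hangzhou.aliyuncs.com/v1"
)

messages = [{
    "role": "user",
    "content": [{
        "mode": "single",
        "type": "json",
        "json": {
            "question": "According to the first couplet, give the second couplet. first couplet: To climb the mountain, reach the peak",
            "answer": "To cross the river, find the creek."
        }
    }]
}]

# Non-streaming call that sets several optional parameters from the table.
completion = client.chat.completions.create(
    model="pai-judge",
    messages=messages,
    temperature=0.2,                          # randomness of the evaluation text
    max_tokens=1024,                          # upper limit on generated tokens
    response_format={"type": "json_object"}   # return the evaluation in the JSON format
)
print(completion.choices[0].message.content)

# Streaming call that also reports token usage in the final chunk.
stream = client.chat.completions.create(
    model="pai-judge",
    messages=messages,
    stream=True,
    stream_options={"include_usage": True}
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
    if chunk.usage:  # populated only in the final chunk when include_usage is set
        print("\ntotal tokens:", chunk.usage.total_tokens)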

messages parameter

The messages parameter specifies the information received by a judge model. The following sample code provides an example:

messages=[
    {
        "role": "user",
        "content": [
            {
                "mode": "single",
                "type": "json",
                "json": {
                    "question": "Provide the second couplet to \"To climb the mountain, reach the peak\"",
                    "answer": "To cross the river, find the creek"
                }
            }
        ]
    }
]

The messages parameter is an array. Each element is in the {"role": role, "content": content} format, where role specifies the role of the message sender (user) and content specifies the content to be evaluated by the model. content includes the following fields:

  • mode: the evaluation mode. single specifies single-model evaluation, and pairwise specifies dual-model competition. A pairwise request sketch is provided after the following parameter table.

  • type: Set the value to json.

  • json: the details of the content to be evaluated.

The following table describes the json parameter.

Parameter

Type

Required

Description

Default value

question

string

Yes

The question.

None

answer

string

Required for single-model evaluation

The answer.

None

answer1

string

Required for dual-model competition

The answer from Model 1.

None

answer2

string

Required for dual-model competition

The answer from Model 2.

None

ref_answer

string

No

The reference answer.

None

scene

string

No

The scenario name.

The name is automatically generated by the judge model. Example: Open questions.

scene_desc

string

No

The scenario description.

The description is automatically generated by the judge model. For example, a user asks an open question, and the answer is open, such as casual conversation, consultation advice, and recommendation.

metric

string

No

The scenario dimension.

The dimension is automatically generated by the judge model. Example:

  • Accuracy: Ensure the accuracy of the provided information, follow common sense and facts, and avoid misleading users.

  • Relevance: Address the question asked by the user, avoid irrelevant content, and ensure the relevance of the information.

  • Cultural sensitivity and harmlessness: Understand and respect the cultural background and differences of users, conform to ethics and morality, avoid cultural bias and insensitive expressions, and avoid including any potentially offensive content.

  • Information richness: Provide detailed information while ensuring accuracy, especially background information that the user may not explicitly request but is helpful for understanding the question.

  • Clarity: Use clear and easy-to-understand language to answer questions, and avoid technical terms or complex structures that may cause misunderstandings.

  • User engagement: Encourage users to further communicate, show attention and thinking about user questions, and promote communication by asking questions or providing feedback.

  • Empathy: Consider the emotional state of the user when answering, appropriately express empathy and understanding, especially when answering questions with emotional overtones.

  • Constructive feedback: Maintain a positive and constructive attitude even when facing critical or negative questions, and provide valuable responses and suggestions.

max_score

integer

No

The scoring range. Valid values: 2 to 10.

5

score_desc

string

No

The detailed description of each score. We recommend that you define the quality of answers that are scored from 1 to the value of the max_score parameter in sequence.

1: The answer has major flaws, completely deviates from the standards, and should not appear in practice.

2: Some content of the answer meets the standards and can be accepted, but as a whole, the quality of the answer is not qualified.

3: The answer has both advantages and disadvantages, and the advantages outweigh the disadvantages within the required evaluation criteria.

4: The quality of the answer is acceptable, the answer generally meets the standards, and there are small issues that can be improved. This level represents the quality of a reference answer.

5: The answer is perfect and strictly meets the standards in all aspects. When a reference answer is provided, this level represents the quality of the answer that is better than the reference answer.

steps

string

No

The evaluation steps.

  1. Recall the answer criteria of related Artificial Intelligence (AI) assistants, and carefully read and understand the answer to be evaluated.

  2. Identify the key criteria for the current question and answer from all criteria, including criteria for high-quality and low-quality answers.

  3. In addition to the given criteria, you can add other important criteria that you consider necessary to evaluate the current answer to the question.

  4. According to your final criteria, score the answer (1 to 5) in sequence, and provide the overall score of the answer after weighting and summing up the scores of all sub-items. Think carefully and then give your conclusion. The following section provides the evaluation templates. Take note that you must retain '[[' and ']]' in the templates.
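
The following is a minimal sketch of a dual-model competition (pairwise) request built from the fields in the preceding table. The two answers and the reference answer are illustrative values only; optional fields such as scene, max_score, and score_desc can be added to the json object in the same way.

import os
from openai import OpenAI

# Credentials and endpoint, taken from the request example above.
client = OpenAI(
    api_key=f'Authorization: Bearer {os.getenv("JUDGE_MODEL_TOKEN")}',
    base_url="https://aiservice.cn-hangzhou.aliyuncs.com/v1"
)

completion = client.chat.completions.create(
    model="pai-judge",
    messages=[{
        "role": "user",
        "content": [{
            "mode": "pairwise",  # dual-model competition
            "type": "json",
            "json": {
                "question": "Provide the second couplet to \"To climb the mountain, reach the peak\"",
                "answer1": "To cross the river, find the creek",          # answer from Model 1
                "answer2": "To sail the sea and chase the wave",          # answer from Model 2 (illustrative)
                "ref_answer": "To descend the valley, follow the stream"  # optional reference answer (illustrative)
            }
        }]
    }]
)
print(completion.choices[0].message.content)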

The fields in content are used to fill in and generate prompt templates. Single-model evaluation and dual-model competition requests are constructed from the following templates, and the values in content are automatically filled in to the corresponding positions of the template. An illustrative sketch of this substitution follows the templates.

Single-model evaluation request template

Your task is to evaluate the quality of an AI assistant's answer.

You clearly understand that when a user provides a question about the [${scene}] scenario (defined as: ${scene_desc}), an AI assistant's answer should meet the following criteria (listed in order of importance from highest to lowest):
[Start of criteria]
${metric}
[End of criteria]

The scoring system uses a ${max_score}-level scale (1-${max_score}), with each score level defined as follows:
[Start of score description]
${score_desc}
[End of score description]

We have collected the following answer from an AI assistant to a user question.
Please comprehensively evaluate the answer based on the criteria that you use for the current scenario. The following items describe the question and answer from an AI assistant:
[Start of data] 
***
[User question]: ${question}
***
[Answer]: ${answer}
***
[Reference answer]: ${ref_answer}
***
[End of data]

You must evaluate the preceding answer based on the following process:
${steps}

Think carefully and then provide your conclusion.

Dual-model competition request template

Your task is to evaluate the quality of AI assistants' answers.

You clearly understand that when a user provides a question about the [${scene}] scenario (defined as: ${scene_desc}), an AI assistant's answer should meet the following criteria (listed in order of importance from highest to lowest):
[Start of criteria]
${metric}
[End of criteria]

The scoring system uses a ${max_score}-level scale (1-${max_score}), with each score level defined as follows:
[Start of score description]
${score_desc}
[End of score description]

For a user question in the [${scene}] scenario, we have collected answers from two AI assistants.
Please comprehensively evaluate the answers and determine which answer is better or whether the quality of the two answers is the same based on the criteria that you use for the current scenario. The following items describe a question and the answers from the AI assistants:
[Start of data]
***
[User question]: ${question}
***
[Answer 1]: ${answer1}
***
[Answer 2]: ${answer2}
***
[Reference answer]: ${ref_answer}
***
[End of data]

You must evaluate and compare the two answers based on the following process:
${steps}

Think carefully and then provide your conclusion.

Note

If you leave the ${scene} parameter empty, the judge model automatically configures the scenario based on the ${question} parameter that you specify, and generates the corresponding scenario description specified by the ${scene_desc} parameter and the scenario dimension specified by the ${metric} parameter.
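
The templates are filled in by the service itself. For illustration only, the following sketch uses Python's string.Template to show how the fields in the json object map to the ${...} placeholders of the single-model evaluation template. The abbreviated template below is not the exact server-side template.

from string import Template

# Abbreviated single-model evaluation template with the same placeholders as above.
template = Template(
    "Your task is to evaluate the quality of an AI assistant's answer.\n"
    "Scenario: [${scene}] (defined as: ${scene_desc})\n"
    "Criteria:\n${metric}\n"
    "Scoring scale: 1-${max_score}\n${score_desc}\n"
    "[User question]: ${question}\n"
    "[Answer]: ${answer}\n"
    "[Reference answer]: ${ref_answer}\n"
    "Process:\n${steps}\n"
)

# Fields taken from the json object of the request. Fields that are not specified
# are filled in automatically by the judge model, as described in the preceding note.
fields = {
    "question": "Provide the second couplet to \"To climb the mountain, reach the peak\"",
    "answer": "To cross the river, find the creek",
    "max_score": 5
}

# safe_substitute keeps unresolved placeholders such as ${scene} intact.
print(template.safe_substitute(fields))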

Response parameters

Parameter

Type

Description

id

string

The ID that is automatically generated by the system to identify the model call.

model

string

The name of the model that is called.

system_fingerprint

string

The configuration version of the model that is called. This parameter is not supported and an empty string "" is returned.

choices

array

The details that are generated by the model.

choices[i].finish_reason

string

The model evaluation status.

  • null: The evaluation result is being generated.

  • stop: The evaluation stopped because the stop condition in the request parameter is triggered.

  • length: The evaluation stopped because the content generated by the model exceeds the length limit.

choices[i].message

object

The message returned by the model.

choices[i].message.role

string

The role of the message author. The value is assistant.

choices[i].message.content

string

The content generated by the model.

choices[i].index

integer

The sequence number of the content. Default value: 0.

created

integer

The timestamp of the generated content. Unit: seconds.

usage

object

The number of tokens that are consumed during the request.

usage.prompt_tokens

integer

The number of tokens that are converted from the input text.

usage.completion_tokens

integer

The number of tokens that are converted from the response generated by the model.

usage.total_tokens

integer

The sum of the values returned by the usage.prompt_tokens and usage.completion_tokens parameters.
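
The following sketch shows how the response parameters in the preceding table map to the fields of the response object returned by the OpenAI SDK for Python. The question and answer are illustrative values only.

import os
from openai import OpenAI

# Credentials and endpoint, taken from the request example above.
client = OpenAI(
    api_key=f'Authorization: Bearer {os.getenv("JUDGE_MODEL_TOKEN")}',
    base_url="https://aiservice.cn-hangzhou.aliyuncs.com/v1"
)

completion = client.chat.completions.create(
    model="pai-judge",
    messages=[{
        "role": "user",
        "content": [{
            "mode": "single",
            "type": "json",
            "json": {"question": "What is 2 + 2?", "answer": "2 + 2 equals 4."}
        }]
    }]
)

# Fields described in the response parameter table.
print("id:", completion.id)
print("model:", completion.model)
print("created:", completion.created)
print("finish_reason:", completion.choices[0].finish_reason)
print("evaluation:", completion.choices[0].message.content)
print("prompt tokens:", completion.usage.prompt_tokens)
print("completion tokens:", completion.usage.completion_tokens)
print("total tokens:", completion.usage.total_tokens)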

Status codes

Status code

Code

Error message

Description

200

OK

None

The request is successful.

400

MessagesError

"messages" not in body or type of "messages" is not list.

The messages parameter is missing from the request body, or its value is not in the list format.

400

ContentError

Content should be like: {"content": [{"type": "json", "mode": "[single / pairwise]", "json": {"question": "<question>", "answer": "<answer>" ...}}]

The content is incorrect. Configure the content settings based on the example:

{
  "content": [
    {
      "type": "json", 
      "mode": "[single / pairwise]", 
      "json": {
        "question": "<question>", 
        "answer": "<answer>",
        ...
      }
    }
  ]
}

400

ResponseFormatError

Response_format should be one of [{"type": "text"}, {"type": "json_object"}]

You must set the response_format parameter to one of the following values:

  • {"type": "text"}

  • {"type": "json_object"}

400

ModeError

Mode must be in [single, pairwise], mode: {mode}.

You must set the mode parameter to one of the following values:

  • single

  • pairwise

400

QuestionError

Question should not be empty

The question parameter cannot be left empty.

400

AnswerError

Answer should not be empty when mode=single.

The answer parameter cannot be left empty when the mode parameter is set to single.

400

AnswerError

Answer1 or answer2 should not be empty when mode=pairwise, answer1: {answer1}, answer2: {answer2}.

The answer1 and answer2 parameters cannot be left empty when the mode parameter is set to pairwise.

400

SceneError

Scene need to be specified a judge-native scece when scene_desc and metric is empty.

If the scene_desc and metric parameters are left empty, you must set the scene parameter to one of the following built-in scenarios:

  • Mathematical questions

  • Definitive questions

  • Open questions

  • Text rewriting

  • Creative style writing

  • Informational and professional writing

  • Practical style writing

  • Professional style writing

  • Translation

  • Reading comprehension and information extraction

  • Role-play

  • Code generation, modification, and analysis

400

SceneError

Scene_desc and metric need not be specified when scene is not empty and not a inner scene, scene_desc: {scene_desc}, metric: {metric}.

If you configure the scene parameter and the value is not a built-in scenario, you must configure the scene_desc and metric parameters.

400

SceneError

Scene_desc and metric need not to be specified when scene is empty, scene_desc: {scene_desc}, metric: {metric}.

If you leave the scene parameter empty, you must leave the scene_desc and metric parameters empty.

400

ScoreError

Score_desc need to be specified when max_score is not empty.

If you configure the max_score parameter, you must configure the score_desc parameter.

400

ScoreError

Score_desc need not to be specified when max_score is empty.

If you leave the max_score parameter empty, you must leave the score_desc parameter empty.

401

InvalidToken

Invalid Token provided.

The token is invalid.

402

InvalidBody

json load request body error

The request body is not in the JSON format.

403

GreenNetFilter

The output content contains high risk. risk_info: xxx

The output content has a high risk.

404

ModelNotFound

Model not found, model must in ['pai-judge', 'pai-judge-plus']

The model that you access does not exist.

500

ModelServiceFailed

Scenario_division, load error, request_id: xxx, errmsg: xxx

Failed to call the scene segmentation model.

500

ModelServiceFailed

Request_judge_model, load error, request_id: xxx, errmsg: xxx

Failed to call the judge model.

500

ModelServiceFailed

Request_judge_model_with_stream, load error, request_id: xxx, errmsg: xxx

Failed to call the judge model in streaming output mode.
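
When you call the APIs over HTTP, the preceding status codes are returned in the HTTP response. When you use the OpenAI SDK for Python, a non-200 status code is raised as an exception. The following is a minimal sketch of handling it, assuming the openai package version 1.x:

import os
import openai
from openai import OpenAI

# Credentials and endpoint, taken from the request example above.
client = OpenAI(
    api_key=f'Authorization: Bearer {os.getenv("JUDGE_MODEL_TOKEN")}',
    base_url="https://aiservice.cn-hangzhou.aliyuncs.com/v1"
)

try:
    completion = client.chat.completions.create(
        model="pai-judge",
        messages=[{
            "role": "user",
            "content": [{
                "mode": "single",
                "type": "json",
                # An empty question triggers the 400 QuestionError described above.
                "json": {"question": "", "answer": "2 + 2 equals 4."}
            }]
        }]
    )
    print(completion.choices[0].message.content)
except openai.APIStatusError as e:
    # e.status_code corresponds to the "Status code" column, for example 400 or 401.
    print("status code:", e.status_code)
    print("error body:", e.response.text)
except openai.APIConnectionError as e:
    print("network error:", e)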

Files

Upload a file: POST /v1/files

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        upload_files = client.files.create(
            file=open("/home/xxx/input.jsonl", "rb"),
            purpose="batch",
        )
        print(upload_files.model_dump_json(indent=4))
    
    
    if __name__ == '__main__':
        main()
    $ curl -XPOST https://aiservice.cn-hangzhou.aliyuncs.com/v1/files \
      -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}" \
      -F purpose="batch"  \
      -F file="@/home/xxx/input.jsonl"
    
  • Response example

    {
        "id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
        "object": "file",
        "bytes": 698,
        "created_at": 1742454203,
        "filename": "input.jsonl",
        "purpose": "batch"
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    file

    file

    Yes

    The file.

    purpose

    string

    Yes

    The expected usage of the file.

    • assistants: the assistant and message files.

    • vision: the assistant image file.

    • batch: the API file for batch processing.

    • fine-tune: the fine-tuning file.

  • Response parameters

    For more information, see File description.
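
  • Input file format

    The batch input file is a JSONL file with one request per line. The exact line format below is an assumption based on the OpenAI-compatible Batch API and the output example later in this topic (custom_id, method, url, and body); verify it against your batch workflow before use.

    import json

    # One batch request per line. custom_id identifies each request in the output file.
    requests = [
        {
            "custom_id": "request-1",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "pai-judge",
                "messages": [{
                    "role": "user",
                    "content": [{
                        "mode": "single",
                        "type": "json",
                        "json": {
                            "question": "What is 2 + 2?",
                            "answer": "2 + 2 equals 4."
                        }
                    }]
                }]
            }
        }
    ]

    with open("input.jsonl", "w", encoding="utf-8") as f:
        for request in requests:
            f.write(json.dumps(request, ensure_ascii=False) + "\n")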

Query a list of files: GET /v1/files

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        list_files = client.files.list(
            purpose="batch",
            order="desc",
            limit=10,
            after=""
        )
        print(list_files.model_dump_json(indent=4))
    
        
    if __name__ == '__main__':
        main()
    $ curl -XGET https://aiservice.cn-hangzhou.aliyuncs.com/v1/files \
      -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
  • Response example

    {
        "object": "list",
        "data": [
            {
                "id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
                "object": "file",
                "bytes": 698,
                "created_at": 1742454203,
                "filename": "input.jsonl",
                "purpose": "batch"
            },
            {
                "id": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
                "object": "file",
                "bytes": 1420,
                "created_at": 1742455638,
                "filename": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022_success.jsonl",
                "purpose": "batch_output"
            }
        ]
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    purpose

    string

    No

    Returns only files with the specified purpose. Only files for batch processing (batch) are supported.

    limit

    string

    No

    The maximum number of files that can be returned. Default value: 10000.

    order

    string

    No

    The order in which the created_at parameter is used to sort the query results.

    • asc

    • desc (default)

    after

    string

    No

    The cursor for pagination. This parameter defines an object ID that indicates a position in the query results. For example, if you send a request and receive 100 objects ending with obj_foo, you can set after to obj_foo in a subsequent call to obtain the query results after obj_foo.

  • Response parameters

    Parameter

    Type

    Description

    object

    string

    The object type.

    data

    array

    For more information, see File description.
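
  • Pagination example

    The following is a minimal sketch of using the after cursor to page through all files, assuming that the limit and after parameters behave as described in the preceding table:

    import os
    from openai import OpenAI

    # Credentials and endpoint, taken from the request example above.
    client = OpenAI(
        api_key=f'Authorization: Bearer {os.getenv("JUDGE_MODEL_TOKEN")}',
        base_url="https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    )

    page_size = 100
    after = ""
    while True:
        page = client.files.list(purpose="batch", order="desc", limit=page_size, after=after)
        if not page.data:
            break
        for file in page.data:
            print(file.id, file.filename, file.purpose)
        # Use the ID of the last file in the current page as the cursor for the next page.
        after = page.data[-1].id
        if len(page.data) < page_size:
            break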

Query a file: GET /v1/files/{file_id}

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        retrieve_files = client.files.retrieve(
            file_id="file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
        )
        print(retrieve_files.model_dump_json(indent=4))
    
    
    if __name__ == '__main__':
        main()
    $ curl -XGET https://aiservice.cn-hangzhou.aliyuncs.com/v1/files/file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713 \
        -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
  • Response example

    {
        "id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
        "object": "file",
        "bytes": 698,
        "created_at": 1742454203,
        "filename": "input.jsonl",
        "purpose": "batch"
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    file_id

    string

    Yes

    The ID of the file.

  • Response parameters

    For more information, see File description.

Query or download the file content: GET /v1/files/{file_id}/content

Note

Only files for batch processing are supported.

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        content_files = client.files.content(
            file_id="file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
        )
        print(content_files)
    
    
    if __name__ == '__main__':
        main()
    $ curl -XGET https://aiservice.cn-hangzhou.aliyuncs.com/v1/files/file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022/content \
        -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}" > output.jsonl
  • Response example

    {"id":"dcee3584-6f30-9541-a855-873a6d86b7d9","custom_id":"request-1","response":{"status_code":200,"request_id":"dcee3584-6f30-9541-a855-873a6d86b7d9","body":{"created":1737446797,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-dcee3584-6f30-9541-a855-873a6d86b7d9","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
    {"id":"dcee3584-6f30-9541-a855-873a6d86b7d9","custom_id":"request-2","response":{"status_code":200,"request_id":"dcee3584-6f30-9541-a855-873a6d86b7d9","body":{"created":1737446797,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-dcee3584-6f30-9541-a855-873a6d86b7d9","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
  • Request parameters

    Parameter

    Type

    Required

    Description

    file_id

    string

    Yes

    The ID of the file.

  • Response parameters

    For more information, see File description.
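
  • Parsing the downloaded content

    The SDK returns the raw file content. The following is a minimal sketch of saving the content to disk and reading the per-request results, assuming the JSONL output format shown in the preceding response example:

    import json
    import os
    from openai import OpenAI

    # Credentials and endpoint, taken from the request example above.
    client = OpenAI(
        api_key=f'Authorization: Bearer {os.getenv("JUDGE_MODEL_TOKEN")}',
        base_url="https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    )

    content = client.files.content(
        file_id="file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
    )
    content.write_to_file("output.jsonl")  # save the raw JSONL content to disk

    # Each line is one result. custom_id links the result back to the input request.
    with open("output.jsonl", encoding="utf-8") as f:
        for line in f:
            result = json.loads(line)
            body = result["response"]["body"]
            print(result["custom_id"], body["choices"][0]["message"]["content"])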

Delete a file: DELETE /v1/files/{file_id}

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        delete_files = client.files.delete(
            file_id="file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
        )
        print(delete_files)
    
    
    if __name__ == '__main__':
        main()
    $ curl -XDELETE https://aiservice.cn-hangzhou.aliyuncs.com/v1/files/file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022 \
        -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
  • Response example

    {
        "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022",
        "object": "file",
        "deleted": "true"
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    file_id

    string

    Yes

    The ID of the file.

  • Response parameters

    Parameter

    Type

    Description

    id

    string

    The ID of the file that was deleted.

    object

    string

    The type of the deleted file.

    deleted

    string

    Indicates whether the file was deleted.

File description

Parameter

Type

Description

id

string

The ID of the file.

object

string

The object type.

bytes

integer

The size of the file. Unit: bytes.

created_at

integer

The time when the file was created. Unit: seconds.

filename

string

The name of the file.

purpose

string

The expected usage of the file.

Batch

Create a batch task: POST /v1/batches

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        create_batches = client.batches.create(
            endpoint="/v1/chat/completions",
            input_file_id="file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
            completion_window="24h",
        )
        print(create_batches.model_dump_json(indent=4))
    
    
    if __name__ == '__main__':
        main()
    $ curl -XPOST https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches \
        -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}" \
        -d '{
            "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
            "endpoint": "/v1/chat/completions",
            "completion_window": "24h"
     }'
  • Response example

    {
        "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022",
        "object": "batch",
        "endpoint": "/v1/chat/completions",
        "errors": null,
        "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
        "completion_window": "24h",
        "status": "Creating",
        "output_file_id": null,
        "error_file_id": null,
        "created_at": 1742455213,
        "in_process_at": null,
        "expires_at": null,
        "FinalizingAt": null,
        "completed_at": null,
        "failed_at": null,
        "expired_at": null,
        "cancelling_at": null,
        "cancelled_at": null,
        "request_counts": {
            "total": 3,
            "completed": 0,
            "failed": 0
        },
        "metadata": null
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    input_file_id

    string

    Yes

    The ID of the uploaded input file for the batch task. The file must be in the JSONL format, can contain up to 50,000 requests, and can be up to 20 MB in size.

    endpoint

    string

    Yes

    The endpoint to use for all requests in a batch task. Only /v1/chat/completions is supported.

    completion_window

    string

    Yes

    The time range in which a batch of files are processed. Set the value to 24h.

    metadata

    object

    No

    The custom metadata for the batch task.

  • Response parameters

    For more information, see Batch task description.

Query a list of batch tasks: GET /v1/batches

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "http://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        list_batches = client.batches.list(
            after="batch_66f245a0-88d1-458c-8e1c-a819a5943022",
            limit=10,
        )
        print(list_batches.model_dump_json(indent=4))
    
    
    if __name__ == '__main__':
        main()
    $ curl -XGET https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches \
        -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
  • Response example

    {
        "object": "list",
        "data": [
            {
                "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022",
                "object": "batch",
                "endpoint": "/v1/chat/completions",
                "errors": null,
                "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
                "completion_window": "24h",
                "status": "Succeeded",
                "output_file_id": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
                "error_file_id": null,
                "created_at": 1742455213,
                "in_process_at": 1742455640,
                "expires_at": 1742455640,
                "FinalizingAt": 1742455889,
                "completed_at": 1742455889,
                "failed_at": null,
                "expired_at": null,
                "cancelling_at": null,
                "cancelled_at": null,
                "request_counts": {
                    "total": 3,
                    "completed": 3,
                    "failed": 0
                },
                "metadata": null
            }
        ],
        "first_id": "",
        "last_id": "",
        "has_more": false
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    purpose

    string

    No

    Returns only files with the specific purpose.

    limit

    string

    No

    The maximum number of batch tasks that can be returned.

    order

    string

    No

    The order in which the created_at parameter is used to sort the query results.

    • asc

    • desc (default)

    after

    string

    No

    The cursor for pagination. This parameter defines an object ID that indicates a position in the query results. For example, if you send a request and receive 100 objects ending with obj_foo, you can set after to obj_foo in a subsequent call to obtain the query results after obj_foo.

  • Response parameters

    Parameter

    Type

    Description

    object

    string

    The object type.

    data

    array

    The batch task description.

Query a batch task: GET /v1/batches/{batch_id}

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "http://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        retrieve_batches = client.batches.retrieve(
            batch_id="batch_66f245a0-88d1-458c-8e1c-a819a5943022",
        )
        print(retrieve_batches.model_dump_json(indent=4))
    
    
    if __name__ == '__main__':
        main()
    $ curl -XGET https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches/batch_66f245a0-88d1-458c-8e1c-a819a5943022 \
        -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
  • Response example

    {
        "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022",
        "object": "batch",
        "endpoint": "/v1/chat/completions",
        "errors": null,
        "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
        "completion_window": "24h",
        "status": "Succeeded",
        "output_file_id": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
        "error_file_id": null,
        "created_at": 1742455213,
        "in_process_at": 1742455640,
        "expires_at": 1742455640,
        "FinalizingAt": 1742455889,
        "completed_at": 1742455889,
        "failed_at": null,
        "expired_at": null,
        "cancelling_at": null,
        "cancelled_at": null,
        "request_counts": {
            "total": 3,
            "completed": 3,
            "failed": 0
        },
        "metadata": null
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    batch_id

    string

    Yes

    The ID of the batch task.

  • Response parameters

    For more information, see Batch task description.
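
  • Polling example

    The following is a minimal sketch of polling a batch task until it is complete and then downloading the output file. The Succeeded status is taken from the preceding response example; treating a non-empty failed_at, cancelled_at, or expired_at timestamp as a terminal state is an assumption.

    import os
    import time
    from openai import OpenAI

    # Credentials and endpoint, taken from the request example above.
    client = OpenAI(
        api_key=f'Authorization: Bearer {os.getenv("JUDGE_MODEL_TOKEN")}',
        base_url="https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    )

    batch_id = "batch_66f245a0-88d1-458c-8e1c-a819a5943022"

    while True:
        batch = client.batches.retrieve(batch_id=batch_id)
        completed = batch.request_counts.completed if batch.request_counts else 0
        print("status:", batch.status, "completed requests:", completed)
        if batch.status == "Succeeded":
            break
        if batch.failed_at or batch.cancelled_at or batch.expired_at:
            raise RuntimeError(f"batch task ended in status {batch.status}")
        time.sleep(30)

    if batch.output_file_id:
        client.files.content(file_id=batch.output_file_id).write_to_file("output.jsonl")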

Cancel a batch task: POST /v1/batches/{batch_id}/cancel

Cancels an in-progress batch task. The batch task can remain in the cancelling state for up to 10 minutes before it changes to the cancelled state. If the batch task has already generated outputs, they are available in the output file.

  • Request example

    import os
    from openai import OpenAI
    
    
    def main():
        base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
        judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    
        client = OpenAI(
            api_key=f'Authorization: Bearer {judge_model_token}',
            base_url=base_url
        )
        cancel_batches = client.batches.cancel(
            batch_id="batch_66f245a0-88d1-458c-8e1c-a819a5943022",
        )
        print(cancel_batches.model_dump_json(indent=4))
    
    
    if __name__ == '__main__':
        main()
    $ curl -XPOST https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches/batch_66f245a0-88d1-458c-8e1c-a819a5943022/cancel \
        -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
  • Response example

    {
        "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022",
        "object": "batch",
        "endpoint": "/v1/chat/completions",
        "errors": null,
        "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
        "completion_window": "24h",
        "status": "Stopping",
        "output_file_id": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
        "error_file_id": null,
        "created_at": 1742455213,
        "in_process_at": 1742455640,
        "expires_at": 1742455640,
        "FinalizingAt": 1742455889,
        "completed_at": 1742455889,
        "failed_at": null,
        "expired_at": null,
        "cancelling_at": null,
        "cancelled_at": null,
        "request_counts": {
            "total": 3,
            "completed": 3,
            "failed": 0
        },
        "metadata": null
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    batch_id

    string

    Yes

    The ID of the batch task.

  • Response parameters

    For more information, see Batch task description.

Delete a batch task: DELETE /v1/batches/{batch_id}

  • Request example

    $ curl -XDELETE https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches/batch_66f245a0-88d1-458c-8e1c-a819a5943022 \
        -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
  • Response example

    {
        "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022",
        "object": "batch",
        "deleted": "true"
    }
  • Request parameters

    Parameter

    Type

    Required

    Description

    batch_id

    string

    Yes

    The ID of the batch task.

  • Response parameters

    Parameter

    Type

    Description

    id

    string

    The ID of the batch task.

    object

    string

    The type of the deleted batch task.

    deleted

    string

    Indicates whether the batch task was deleted.

Batch task description

Parameter

Type

Description

id

string

The ID of the batch task.

object

string

The object type.

endpoint

string

The data endpoint.

errors

string

The error message.

input_file_id

string

The ID of the input file.

completion_window

string

The time window.

status

string

The running status.

output_file_id

string

The ID of the output file.

error_file_id

string

The ID of the error file.

created_at

integer

The time when the batch task was created.

in_process_at

integer

The time when the batch task started to be processed.

expires_at

integer

The time when the batch task expires.

finalizing_at

integer

The time when the batch task was finalized.

completed_at

integer

The time when the batch task was completed.

failed_at

integer

The time when the batch task failed.

expired_at

integer

The actual expiration time.

cancelling_at

integer

The time when the batch task started to be cancelled.

cancelled_at

integer

The time when the batch task was cancelled.

request_counts

object

The details of the request count.

request_counts.total

integer

The total number of requests.

request_counts.completed

integer

The number of successful requests.

request_counts.failed

integer

The number of failed requests.

metadata

object

The metadata.