
Alibaba Cloud Model Studio: OpenAI compatible - Batch

Last Updated:Nov 11, 2025

Alibaba Cloud Model Studio provides an OpenAI-compatible Batch API. You can use this API to submit file-based tasks for asynchronous execution, which lets you process large-scale data offline during off-peak hours. Results are returned after the task is complete or the maximum wait time is reached. Batch calls cost 50% of the equivalent real-time calls.

For information about how to perform this operation in the console, see Batch inference.

Prerequisites

  • Activate Alibaba Cloud Model Studio and obtain an API key.

    We recommend that you set the API key as an environment variable to reduce the risk of API key leakage.
  • If you use the OpenAI Python SDK to call the Batch API, run the following command to install the latest version of the OpenAI SDK.

    pip3 install -U openai

Scope

  • Supported region: International (Singapore)

  • Supported models: qwen-max, qwen-plus, and qwen-turbo

Billing

  • Unit price: The unit price for the input and output tokens of all successful requests is 50% of the real-time inference price for the corresponding model. For more information, see Model list.

  • Billing scope:

    • Only successfully executed requests in a task are billed.

    • No fees are charged for file parsing failures, task execution failures, or row-level request errors.

    • For canceled tasks, requests that were successfully completed before the cancellation are billed.

Important

Batch inference is billed as a separate item. It is not eligible for discounts such as subscriptions (savings plans) or the free quota, and it does not support features such as context cache.
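To make the 50% rule concrete, the following sketch estimates the cost of a batch task's successful requests. The unit prices used here are hypothetical placeholders, not actual Model Studio pricing (see Model list), and batch_cost is our own helper name:

```python
def batch_cost(prompt_tokens, completion_tokens, input_price_per_1k, output_price_per_1k):
    """Estimate the batch cost from the token counts of all SUCCESSFUL requests.

    input_price_per_1k and output_price_per_1k are the real-time unit prices
    per 1,000 tokens; batch inference is billed at 50% of those prices.
    """
    realtime_cost = (prompt_tokens / 1000) * input_price_per_1k + \
                    (completion_tokens / 1000) * output_price_per_1k
    return realtime_cost * 0.5

# Hypothetical prices: $0.004 per 1K input tokens, $0.012 per 1K output tokens.
print(batch_cost(200_000, 50_000, 0.004, 0.012))
```

Only tokens from successful requests enter the calculation; failed or unexecuted rows incur no fees, per the billing scope above.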

Getting started

Before you start a formal batch task, you can use the test model batch-test-model to perform a complete end-to-end test. This test includes verifying input data, creating a task, querying the task, and downloading the result file. Note the following:

  • The test file must meet the requirements for an input file. The file size cannot exceed 1 MB, and the number of lines cannot exceed 100.

  • Concurrency limit: The maximum number of parallel tasks is 2.

  • Resource usage: The test model does not perform the inference process, so no model inference fees are incurred.

The procedure is as follows:

  1. Prepare a test file

    • Download the sample file test_model.jsonl, which contains request information, to your local machine. Make sure that the file is in the same directory as the Python script that is described later in this topic.

    • Sample content: The model parameter is set to batch-test-model, and the url parameter is set to /v1/chat/ds-test.

      {"custom_id":"1","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
      {"custom_id":"2","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
  2. Run the script

    • Execute the following Python script.

      You can modify the code as needed to adjust the file path or other parameters.
      import os
      from pathlib import Path
      from openai import OpenAI
      import time
      
      # Initialize the client.
      client = OpenAI(
          # If the environment variable is not set, you can replace the following line with api_key="sk-xxx". 
          # However, we do not recommend hard-coding the API key in your code in a production environment to reduce the risk of leakage.
          api_key=os.getenv("DASHSCOPE_API_KEY"),
          base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # The base_url of Alibaba Cloud Model Studio.
      )
      
      def upload_file(file_path):
          print(f"Uploading the JSONL file that contains the request information...")
          file_object = client.files.create(file=Path(file_path), purpose="batch")
          print(f"File uploaded successfully. File ID: {file_object.id}\n")
          return file_object.id
      
      def create_batch_job(input_file_id):
          print(f"Creating a batch task based on the file ID...")
          # Note: The value of the endpoint parameter must be the same as the value of the url field in the input file. 
          # For the test model (batch-test-model), set this to /v1/chat/ds-test. For other models, set this to /v1/chat/completions.
          batch = client.batches.create(input_file_id=input_file_id, endpoint="/v1/chat/ds-test", completion_window="24h")
          print(f"Batch task created. Batch task ID: {batch.id}\n")
          return batch.id
      
      def check_job_status(batch_id):
          print(f"Checking the batch task status...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Batch task status: {batch.status}\n")
          return batch.status
      
      def get_output_id(batch_id):
          print(f"Getting the output file ID for successful requests in the batch task...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Output file ID: {batch.output_file_id}\n")
          return batch.output_file_id
      
      def get_error_id(batch_id):
          print(f"Getting the error file ID for failed requests in the batch task...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Error file ID: {batch.error_file_id}\n")
          return batch.error_file_id
      
      def download_results(output_file_id, output_file_path):
          print(f"Printing and downloading the results of successful requests in the batch task...")
          content = client.files.content(output_file_id)
          # Print some of the content for testing.
          print(f"Printing the first 1,000 characters of the successful results: {content.text[:1000]}...\n")
          # Save the result file to your local machine.
          content.write_to_file(output_file_path)
          print(f"The complete output results have been saved to the local file {output_file_path}\n")
      
      def download_errors(error_file_id, error_file_path):
          print(f"Printing and downloading the information of failed requests in the batch task...")
          content = client.files.content(error_file_id)
          # Print some of the content for testing.
          print(f"Printing the first 1,000 characters of the failure information: {content.text[:1000]}...\n")
          # Save the error information file to your local machine.
          content.write_to_file(error_file_path)
          print(f"The complete failure information has been saved to the local file {error_file_path}\n")
      
      def main():
          # File paths
          input_file_path = "test_model.jsonl"  # Replace with your input file path.
          output_file_path = "result.jsonl"  # Replace with your output file path.
          error_file_path = "error.jsonl"  # Replace with your error file path.
          try:
              # Step 1: Upload the JSONL file that contains the request information to get the input file ID.
              input_file_id = upload_file(input_file_path)
              # Step 2: Create a batch task based on the input file ID.
              batch_id = create_batch_job(input_file_id)
              # Step 3: Check the batch task status until it reaches a terminal state.
              status = check_job_status(batch_id)
              while status not in ["completed", "failed", "expired", "cancelled"]:
                  print("Waiting for the task to complete...")
                  time.sleep(10)  # Wait 10 seconds and query the status again.
                  status = check_job_status(batch_id)
              # If the task fails, print the error message and exit.
              if status == "failed":
                  batch = client.batches.retrieve(batch_id)
                  print(f"Batch task failed. Error message: {batch.errors}\n")
                  print(f"For more information, see Error codes: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
                  return
              # Step 4: Download the results. If an output file ID exists, print the first 1,000 characters of the successful results and download the complete results to a local output file.
              # If an error file ID exists, print the first 1,000 characters of the failure information and download the complete information to a local error file.
              output_file_id = get_output_id(batch_id)
              if output_file_id:
                  download_results(output_file_id, output_file_path)
              error_file_id = get_error_id(batch_id)
              if error_file_id:
                  download_errors(error_file_id, error_file_path)
                  print(f"For more information, see Error codes: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
          except Exception as e:
              print(f"An error occurred: {e}")
              print(f"For more information, see Error codes: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
      
      if __name__ == "__main__":
          main()
  3. Verify the test results

    • The task status is completed.

    • The result file result.jsonl contains the fixed response {"content":"This is a test result."}.

      {"id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","custom_id":"1","response":{"status_code":200,"request_id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-bca7295b-67c3-4b1f-8239-d78323bb669f","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
      {"id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","custom_id":"2","response":{"status_code":200,"request_id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-1e32a8ba-2b69-4dc4-be42-e2897eac9e84","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
      If an error occurs, see Error messages for a solution.

After you verify the test, perform the following steps to run a formal batch task:

  1. Prepare an input file that meets the requirements described in Input file. In the file, set the model parameter to a supported model and set the url parameter to /v1/chat/completions.

  2. Replace the endpoint in the Python script.

    Important

    Make sure that the endpoint in the script is the same as the url parameter in the input file.

  3. Run the script and wait for the task to complete. If the task is successful, an output file named result.jsonl is generated in the same directory.

    If the task fails, the program exits and prints an error message.
    If an error file ID exists, an error file named error.jsonl is generated in the same directory for you to review.
    Exceptions that occur during the process are caught and an error message is printed.

Data file format

Input file

Before you create a batch inference task, you must prepare a file that meets the following specifications:

  • Format: JSONL with UTF-8 encoding. Each line must be an independent JSON object.

  • Size limits: A single file can contain up to 50,000 requests and be up to 500 MB in size.

  • Line limit: Each JSON object can be up to 6 MB and cannot exceed the context length of the model.

  • Consistency: All requests in the same file must use the same model and, if applicable, the same thinking mode.

  • Unique identifier: Each request must contain a custom_id field that is unique within the file. This field is used for result matching.

Request examples

{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello!"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
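These requirements can be checked locally before you upload a file. The following is a minimal validation sketch; the limits come from this section, while the function name and the returned error format are our own:

```python
import json

MAX_REQUESTS = 50_000          # Maximum requests per file
MAX_LINE_BYTES = 6 * 1024**2   # Maximum size of a single JSON object (6 MB)

def validate_batch_lines(lines):
    """Return a list of (line_number, problem) tuples for a JSONL input.

    An empty list means the file passed these basic checks. Line number 0
    marks file-level problems.
    """
    problems = []
    seen_ids, models = set(), set()
    if len(lines) > MAX_REQUESTS:
        problems.append((0, f"file contains more than {MAX_REQUESTS} requests"))
    for n, line in enumerate(lines, start=1):
        if len(line.encode("utf-8")) > MAX_LINE_BYTES:
            problems.append((n, "line exceeds 6 MB"))
        try:
            req = json.loads(line)
        except json.JSONDecodeError:
            problems.append((n, "line is not valid JSON"))
            continue
        cid = req.get("custom_id")
        if not cid or cid in seen_ids:
            problems.append((n, "missing or duplicate custom_id"))
        seen_ids.add(cid)
        if req.get("method") != "POST":
            problems.append((n, "method must be POST"))
        models.add(req.get("body", {}).get("model"))
    if len(models) > 1:
        problems.append((0, "all requests must use the same model"))
    return problems
```

Note that this sketch does not check the model's context length, which depends on the model you choose.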

JSONL batch generation tool

You can use the JSONL batch generation tool to quickly generate JSONL files. To prevent performance issues, process no more than 10,000 lines at a time. For larger amounts of data, process them in batches.

Request parameters

custom_id (String, required)

A user-defined request ID. Each line represents one request, and each request must have a unique custom_id. After the batch job is complete, you can find the request result that corresponds to this custom_id in the output file.

method (String, required)

The request method. Currently, only POST is supported.

url (String, required)

The URL associated with the API. This must be the same as the endpoint field specified when you create a batch job.

  • For the test model batch-test-model, enter /v1/chat/ds-test.

  • For other models, enter /v1/chat/completions.

body (Object, required)

The request body for the model call. It includes all parameters required to call the model, such as model and messages.

The parameters in the request body are the same as those supported by the real-time inference API. For more information about the parameters, see OpenAI compatible API.

You can also add other parameters, such as max_tokens and temperature, to the body. Separate the parameters with commas.

Example:

{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-turbo-latest","stream":true,"enable_thinking":true,"thinking_budget":50,"messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Who are you?"}],"max_tokens": 1000,"temperature":0.7}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-turbo-latest","stream":true,"enable_thinking":true,"thinking_budget":50,"messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}],"max_tokens": 1000,"temperature":0.7}}

body.model (String, required)

The model used for this batch job.

Important

For a single job, all batch requests must use the same model. The thinking mode, if supported, must also be consistent across all requests.

body.messages (Array, required)

A list of messages. Example:

[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "What is 2+2?"}
]

Convert a CSV file to a JSONL file

If you have a CSV file with a request ID (`custom_id`) in the first column and content in the second, you can use the following Python script to quickly create a JSONL file that is formatted for batch tasks. The CSV file must be in the same folder as the Python script.

Alternatively, you can use the template file that is provided in this topic:

  1. Download the template file and place it in the same folder as the Python script.

  2. The template is a CSV file. The first column is for the request ID (`custom_id`) and the second column is for the content. Paste your content into this file.

Running the following Python script creates a JSONL file named input_demo.jsonl in the same folder. This file is formatted for batch tasks.

You can modify the file path or other parameters in the code as needed.
import csv
import json

def messages_builder_example(content):
    messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": content}]
    return messages

with open("input_demo.csv", "r", encoding="utf-8") as fin:
    with open("input_demo.jsonl", "w", encoding="utf-8") as fout:
        csvreader = csv.reader(fin)
        for row in csvreader:
            # row[0] is the custom_id and row[1] is the content.
            body = {"model": "qwen-turbo", "messages": messages_builder_example(row[1])}
            # For non-test models, the url is /v1/chat/completions.
            request = {"custom_id": row[0], "method": "POST", "url": "/v1/chat/completions", "body": body}
            fout.write(json.dumps(request, separators=(',', ':'), ensure_ascii=False) + "\n")

Output file

The output is a JSONL file. Each line is a JSON object that corresponds to a request result.

Response examples

Example of a single-line response:

{"id":"73291560-xxx","custom_id":"1","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}

Example of a multi-line response:

{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-max","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. I am here to help you with information queries, learning materials, problem-solving methods, or anything else you need. Just tell me how I can assist you."}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}

Response parameters

id (String, required)

The request ID.

custom_id (String, required)

The user-defined request ID.

response (Object, optional)

The request result.

error (Object, optional)

The error response.

error.code (String, optional)

The error code.

error.message (String, optional)

The error message.

completion_tokens (Integer, optional)

The number of tokens in the generated completion.

prompt_tokens (Integer, optional)

The number of tokens in the prompt.

model (String, optional)

The model used for inference in the task.

Convert a JSONL file to a CSV file

CSV files are ideal for automated scripts and batch tasks because they contain only data values, without the extra keys or metadata that are found in JSONL files. You can use the following Python script to convert the JSONL output file from a batch task to a CSV file.

Place the result.jsonl file in the same folder as the Python script. Running the script creates a CSV file named result.csv.

You can modify the code to adjust the file path or other parameters as needed.
import json
import csv

columns = ["custom_id",
           "model",
           "request_id",
           "status_code",
           "error_code",
           "error_message",
           "created",
           "content",
           "usage"]

def dict_get_string(dict_obj, path):
    # Walk the nested path and return None if any key or index is missing.
    obj = dict_obj
    try:
        for element in path:
            obj = obj[element]
        return obj
    except (KeyError, IndexError, TypeError):
        return None

with open("result.jsonl", "r", encoding="utf-8") as fin:
    # newline="" prevents blank rows when writing CSV files on Windows.
    with open("result.csv", "w", encoding="utf-8", newline="") as fout:
        rows = [columns]
        for line in fin:
            request_result = json.loads(line)
            row = [dict_get_string(request_result, ["custom_id"]),
                   dict_get_string(request_result, ["response", "body", "model"]),
                   dict_get_string(request_result, ["response", "request_id"]),
                   dict_get_string(request_result, ["response", "status_code"]),
                   # The error object uses the code and message fields (see Response parameters).
                   dict_get_string(request_result, ["error", "code"]),
                   dict_get_string(request_result, ["error", "message"]),
                   dict_get_string(request_result, ["response", "body", "created"]),
                   dict_get_string(request_result, ["response", "body", "choices", 0, "message", "content"]),
                   dict_get_string(request_result, ["response", "body", "usage"])]
            rows.append(row)
        writer = csv.writer(fout)
        writer.writerows(rows)
If the CSV file displays garbled text when opened in Excel, change the file encoding to UTF-8 with BOM (utf-8-sig) in a text editor, such as Sublime Text, and then open the file in Excel. Alternatively, create a new file in Excel and specify UTF-8 as the encoding when you import the data.

Procedure

1. Prepare and upload a file

Before you create a batch task, you must upload a JSONL file that meets the input file requirements through the file upload API, with the `purpose` parameter set to batch. After the upload is complete, retrieve the file_id from the response.

The maximum size for a single file that you can upload for a batch task is 500 MB. The Model Studio storage space under your Alibaba Cloud account supports a maximum of 10,000 files, with a total size not exceeding 100 GB. The files do not have an expiration date.

OpenAI Python SDK

Request example

import os
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx". 
    # Do not hard code the API key in production environments to reduce the risk of leaks.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)

# test.jsonl is a local sample file. The purpose must be batch.
file_object = client.files.create(file=Path("test.jsonl"), purpose="batch")

print(file_object.model_dump_json())

Content of the test file test.jsonl:

{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}

Response example

{
    "id": "file-batch-xxx",
    "bytes": 437,
    "created_at": 1742304153,
    "filename": "test.jsonl",
    "object": "file",
    "purpose": "batch",
    "status": "processed",
    "status_details": null
}

curl

Request example

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
--form 'file=@"test.jsonl"' \
--form 'purpose="batch"'

Content of the test file test.jsonl:

{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "qwen-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is 2+2?"}]}}

Response example

{
    "id": "file-batch-xxx",
    "bytes": 231,
    "created_at": 1729065815,
    "filename": "test.jsonl",
    "object": "file",
    "purpose": "batch",
    "status": "processed",
    "status_details": null
}

2. Create a batch task

You can create a batch task by passing the file ID returned by the Prepare and upload a file API to the input_file_id parameter.

The API rate limit is 1,000 calls per minute per Alibaba Cloud account. The maximum number of running tasks is 1,000. This includes all tasks that have not finished. If you exceed the maximum, you must wait for a task to finish before you can create another one.

OpenAI Python SDK

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx". 
    # Do not hard code the API key in production environments to reduce the risk of leaks.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)

batch = client.batches.create(
    input_file_id="file-batch-xxx",  # The ID returned after uploading the file
    endpoint="/v1/chat/completions",  # For the test model batch-test-model, enter /v1/chat/ds-test. For other models, enter /v1/chat/completions.
    completion_window="24h",
    metadata={'ds_name':"Task Name",'ds_description':'Task Description'} # Metadata. This is an optional field used to create a task name and description.
)
print(batch)

curl

Request example

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-batch-xxx",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata":{"ds_name":"Task Name","ds_description":"Task Description"}
  }'
Replace the value of input_file_id with the actual value.

Input parameter settings

input_file_id (String, body, required)

The ID of the file to use as the input file for the batch task.

Use the file ID returned by the Prepare and upload a file API, such as file-batch-xxx.

endpoint (String, body, required)

The access path. It must be consistent with the `url` field in the input file.

  • For the test model batch-test-model, enter /v1/chat/ds-test.

  • For other models, enter /v1/chat/completions.

completion_window (String, body, required)

The completion window for the task. The minimum is 24h and the maximum is 336h. Only integer values are supported.

The units "h" and "d" are supported, such as "24h" or "14d".

metadata (Map, body, optional)

Extended metadata for the task, attached as key-value pairs.

metadata.ds_name (String, body, optional)

The name of the task. Example: "ds_name":"Batch Task"

Limit: The length cannot exceed 100 characters.

If this field is defined multiple times, the last value passed is used.

metadata.ds_description (String, body, optional)

The description of the task. Example: "ds_description":"Test for batch inference task"

Limit: The length cannot exceed 200 characters.

If this field is defined multiple times, the last value passed is used.

Response example

{
    "id": "batch_xxx",
    "object": "batch",
    "endpoint": "/v1/chat/completions",
    "errors": null,
    "input_file_id": "file-batch-xxx",
    "completion_window": "24h",
    "status": "validating",
    "output_file_id": null,
    "error_file_id": null,
    "created_at": 1742367779,
    "in_progress_at": null,
    "expires_at": null,
    "finalizing_at": null,
    "completed_at": null,
    "failed_at": null,
    "expired_at": null,
    "cancelling_at": null,
    "cancelled_at": null,
    "request_counts": {
        "total": 0,
        "completed": 0,
        "failed": 0
    },
    "metadata": {
        "ds_name": "Task Name",
        "ds_description": "Task Description"
    }
}

Response parameters

id (String)

The batch task ID.

object (String)

The object type. The value is fixed to batch.

endpoint (String)

The access path.

errors (Map)

The error message.

input_file_id (String)

The input file ID.

completion_window (String)

The completion window for the task. The minimum is 24h and the maximum is 336h. Only integer values are supported. The units "h" and "d" are supported, such as "24h" or "14d".

status (String)

The status of the task. Valid values include validating, failed, in_progress, finalizing, completed, expired, cancelling, and cancelled.

output_file_id (String)

The ID of the output file for successfully executed requests.

error_file_id (String)

The ID of the output file for failed requests.

created_at (Integer)

The UNIX timestamp (in seconds) when the task was created.

in_progress_at (Integer)

The UNIX timestamp (in seconds) when the task started running.

expires_at (Integer)

The UNIX timestamp (in seconds) when the task expires.

finalizing_at (Integer)

The UNIX timestamp (in seconds) when the task started finalizing.

completed_at (Integer)

The UNIX timestamp (in seconds) when the task was completed.

failed_at (Integer)

The UNIX timestamp (in seconds) when the task failed.

expired_at (Integer)

The UNIX timestamp (in seconds) when the task expired.

cancelling_at (Integer)

The UNIX timestamp (in seconds) when the task was set to cancelling.

cancelled_at (Integer)

The UNIX timestamp (in seconds) when the task was cancelled.

request_counts (Map)

The number of requests in each state.

metadata (Map)

Additional information in key-value pairs.

metadata.ds_name (String)

The name of the current task.

metadata.ds_description (String)

The description of the current task.

3. Query and manage batch tasks

Query batch task details

You can query the details of a specific batch task by passing the batch task ID that was returned when you created the batch task. You can query only tasks that were created within the last 30 days.

The API rate limit is 1,000 calls per minute per Alibaba Cloud account. Because a batch task takes time to execute, we recommend that you call this query API once per minute after you create the task to retrieve its status.

OpenAI Python SDK

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx". 
    # Do not hard code the API key in production environments to reduce the risk of leaks.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.retrieve("batch_id")  # Replace batch_id with the ID of the batch task.
print(batch)

curl

Request example

curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.

Input parameter settings

batch_id (String, path, required)

The ID of the batch task to query. This is the ID returned when you created the batch task. The ID starts with `batch`, for example, `batch_xxx`.

Response example

For more information, see the response example for Create a batch task.

Response parameters

For more information, see the response parameters for Create a batch task.

You can retrieve the content of the files specified by output_file_id and error_file_id in the response parameters by downloading the batch result file.
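As a convenience, the retrieval and download steps can be combined. The following sketch reuses the batches.retrieve and files.content calls shown in the getting-started script; download_batch_output is our own helper name:

```python
import os

def download_batch_output(client, batch_id, output_path="result.jsonl", error_path="error.jsonl"):
    """Download the result and error files of a finished batch task, if present.

    Returns the list of local paths that were written.
    """
    batch = client.batches.retrieve(batch_id)
    saved = []
    if batch.output_file_id:
        # Results of successfully executed requests.
        client.files.content(batch.output_file_id).write_to_file(output_path)
        saved.append(output_path)
    if batch.error_file_id:
        # Information about failed requests.
        client.files.content(batch.error_file_id).write_to_file(error_path)
        saved.append(error_path)
    return saved

if __name__ == "__main__" and os.getenv("DASHSCOPE_API_KEY"):
    from openai import OpenAI  # Requires the openai package
    client = OpenAI(
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    )
    print(download_batch_output(client, "batch_xxx"))  # Replace batch_xxx with your task ID.
```

A task that finished without any successful requests has no output_file_id, so the helper returns only the error file path in that case.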

Query a list of batch tasks

You can use the batches.list() method to query a list of batch tasks. You can use paging to retrieve the complete list incrementally.

  • Use the after parameter: Pass the ID of the last batch task from the previous page to retrieve the next page of data.

  • Use the limit parameter: Set the number of tasks to return.

  • You can filter the query using parameters such as input_file_ids.

The API rate limit is 100 calls per minute per Alibaba Cloud account.

OpenAI Python SDK

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx". 
    # Do not hard code the API key in production environments to reduce the risk of leaks.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)
batches = client.batches.list(after="batch_xxx", limit=2, extra_query={'ds_name': 'Task Name', 'input_file_ids': 'file-batch-xxx,file-batch-xxx', 'status': 'completed,expired', 'create_after': '20250304000000', 'create_before': '20250306123000'})
print(batches)

curl

Request example

curl --request GET  'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches?after=batch_xxx&limit=2&ds_name=Batch&input_file_ids=file-batch-xxx,file-batch-xxx&status=completed,failed&create_after=20250303000000&create_before=20250320000000' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_xxx in after=batch_xxx with the actual batch task ID. Set the limit parameter to the number of tasks to return. For ds_name, enter a partial task name. For `input_file_ids`, you can enter multiple file IDs separated by commas. For status, you can enter multiple batch task statuses separated by commas. For create_after and create_before, enter specific points in time in the yyyyMMddHHmmss format.

Input parameter settings

Field

Type

Parameter passing

Required

Description

after

String

Query

No

A cursor for pagination. The value of the after parameter is a batch task ID. The query returns data that comes after this ID. For paged queries, you can set this parameter to the `last_id` from the previous response to get the next page of data.

For example, if the current query returns 20 rows of data and the last batch task ID (`last_id`) is `batch_xxx`, you can set after=batch_xxx in the subsequent query to get the next page of the list.

limit

Integer

Query

No

The number of batch tasks to return for each query. The range is [1, 100]. The default is 20.

ds_name

String

Query

No

Filters by task name using a partial match. Enter any part of a task name to match tasks that contain that string. For example, entering "Batch" matches "Batch Task" and "Batch Task_20240319".

input_file_ids

String

Query

No

Filters by multiple file IDs, separated by commas. You can enter up to 20 IDs. These can be file IDs returned by Prepare and upload a file.

status

String

Query

No

Filters by multiple statuses, separated by commas. Valid values include validating, failed, in_progress, finalizing, completed, expired, cancelling, and cancelled.

create_after

String

Query

No

Filters for tasks created after this point in time. Format: yyyyMMddHHmmss. For example, to filter for tasks created after 00:00:00 on March 4, 2025, enter 20250304000000.

create_before

String

Query

No

Filters for tasks created before this point in time. Format: yyyyMMddHHmmss. For example, to filter for tasks created before 12:30:00 on March 4, 2025, enter 20250304123000.
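The create_after and create_before filters take timestamps in the yyyyMMddHHmmss format, which corresponds to the strftime pattern %Y%m%d%H%M%S. A minimal sketch for building these filter values from datetime objects; the dict keys mirror the query parameters described above:

```python
from datetime import datetime

def to_batch_timestamp(dt: datetime) -> str:
    """Format a datetime as the yyyyMMddHHmmss string the list API expects."""
    return dt.strftime("%Y%m%d%H%M%S")

filters = {
    "status": "completed,expired",
    "create_after": to_batch_timestamp(datetime(2025, 3, 4, 0, 0, 0)),
    "create_before": to_batch_timestamp(datetime(2025, 3, 4, 12, 30, 0)),
}
print(filters["create_after"])   # 20250304000000
print(filters["create_before"])  # 20250304123000
```

The resulting dict can be passed as `extra_query` to `client.batches.list`, together with any of the other filters shown earlier.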

Response example

{
  "object": "list",
  "data": [
    {
      "id": "batch_xxx",
      "object": "batch",
      "endpoint": "/v1/chat/completions",
      "errors": null,
      "input_file_id": "file-batch-xxx",
      "completion_window": "24h",
      "status": "completed",
      "output_file_id": "file-batch_output-xxx",
      "error_file_id": null,
      "created_at": 1722234109,
      "in_progress_at": 1722234109,
      "expires_at": null,
      "finalizing_at": 1722234165,
      "completed_at": 1722234165,
      "failed_at": null,
      "expired_at": null,
      "cancelling_at": null,
      "cancelled_at": null,
      "request_counts": {
        "total": 100,
        "completed": 95,
        "failed": 5
      },
      "metadata": {}
    },
    { ... }
  ],
  "first_id": "batch_xxx",
  "last_id": "batch_xxx",
  "has_more": true
}

Response parameters

Field

Type

Description

object

String

The type. The value is fixed to `list`.

data

Array

An array of batch task objects. For more information, see the response parameters for creating a batch task.

first_id

String

The ID of the first batch task on the current page.

last_id

String

The ID of the last batch task on the current page.

has_more

Boolean

Indicates whether there is a next page.

Cancel a batch task

You can cancel a specific batch task by passing the batch task ID that was returned when you created the batch task.

The API rate limit is 1,000 calls per minute per Alibaba Cloud account.

OpenAI Python SDK

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx". 
    # Do not hard code the API key in production environments to reduce the risk of leaks.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.cancel("batch_id")  # Replace batch_id with the ID of the batch task.
print(batch)

curl

Request example

curl --request POST 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id/cancel' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.

Input parameter settings

Field

Type

Parameter passing

Required

Description

batch_id

String

Path

Yes

The ID of the batch task to cancel. The ID starts with `batch`, for example, `batch_xxx`.

Response example

For more information, see the response example for Create a batch task.

Response parameters

For more information, see the response parameters for Create a batch task.

4. Download the batch result file

After a batch inference task is complete, you can use the API to download the result file.

You can obtain the file_id of the file to download from the output_file_id parameter that is returned by Query batch task details or Query a list of batch tasks. The file_id starts with file-batch_output.

OpenAI Python SDK

You can retrieve the content of a batch task result file with the content method and save it to a local file with the write_to_file method.

Request example

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
content = client.files.content(file_id="file-batch_output-xxx")
# Print the content of the result file
print(content.text)
# Save the result file locally
content.write_to_file("result.jsonl")

Response example

{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-plus","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. Whether you need to look up information, find learning materials, solve problems, or need any other help, I am here to support you. Please tell me what you need help with."}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-plus","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}

curl

You can use the GET method and specify file_id in the URL to download the batch task result file.

Request example

curl -X GET https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files/file-batch_output-xxx/content \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" > result.jsonl

Parameter settings

Field

Type

Parameter passing

Required

Description

file_id

String

Path

Yes

The ID of the file to download, which is the value of output_file_id in the response parameters of Query batch task details or Query a list of batch tasks.

Response

A JSONL file that contains the batch task results. For more information about the format, see Output file.

Extended features

Error codes

If a call fails and returns an error message, see Error messages to resolve the issue.

FAQ

  1. Is there a basic rate limit for models that use batch pricing?

    No. Only real-time calls have a Requests Per Minute (RPM) limit. Batch calls do not have an RPM limit.

  2. Do I need to place an order to use batch calls? If so, where?

    No. Batch is a call method and does not require a separate order. This method uses a pay-as-you-go billing model, and you are billed directly for your batch API calls.

  3. How are submitted batch call requests processed? Are they executed in the order they are submitted?

    No. They are not processed in a queue. Instead, a scheduling mechanism is used. Batch request tasks are scheduled and executed based on resource availability.

  4. How long does it take for a submitted batch call request to complete?

    The running time of a batch task depends on the resource allocation of the system.

    If system resources are limited, tasks might not be completed within the configured maximum wait time.

    For scenarios that require fast model inference, use real-time calls. For scenarios that process large amounts of data and can tolerate some delay, use batch calls.