
Alibaba Cloud Model Studio: OpenAI compatible - Batch

Last Updated: Dec 04, 2025

Alibaba Cloud Model Studio provides an OpenAI-compatible Batch API that lets you submit batch tasks using files for asynchronous execution. This service processes large-scale data offline during off-peak hours and returns the results after a task is complete or the maximum waiting time is reached, at 50% of the cost of real-time calls.

To perform this operation in the console, see Batch processing.

Prerequisites

  • Activate Model Studio and create an API key.

    Export the API key as an environment variable to reduce the risk of API key leaks.
  • If you use the OpenAI Python SDK to call the Batch API, run the following command to install the latest version of the OpenAI SDK.

    pip3 install -U openai
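
    For example, in a Unix-like shell you can export the key for the current session (replace the placeholder value with your own key):

    ```shell
    # Make the API key available to the SDK without hard-coding it (Unix-like shells).
    export DASHSCOPE_API_KEY="sk-xxx"  # replace sk-xxx with your Model Studio API key

    # Verify that the variable is set:
    echo "$DASHSCOPE_API_KEY"
    ```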

Availability

Beijing region

Supported models:

  • Text generation models: Stable and some latest versions of Qwen Max, Plus, Flash, and Long. Also supports the QwQ series (qwq-plus) and third-party models such as deepseek-r1 and deepseek-v3.

  • Multimodal models: Stable and some latest versions of Qwen VL Max, Plus, and Flash. Also supports the Qwen OCR model.

  • Text embedding models: The text-embedding-v4 model.

List of supported model names

Singapore region

Supported models: qwen-max, qwen-plus, and qwen-turbo.

Getting started

Before you run a production batch job, use the test model batch-test-model to perform an end-to-end test of the entire workflow: validating input data, creating a job, querying the job, and downloading the result file. Note the following:

  • The test file must meet the input file requirements. It must also not exceed 1 MB in size or contain more than 100 lines.

  • Concurrency limit: A maximum of 2 parallel jobs.

  • Resource usage: The test model does not go through the inference process and does not incur model inference fees.
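
The size and line limits above can be checked locally before you upload anything. The following sketch (the helper name is illustrative; the limits come from the note above) flags a test file that is too large, too long, or inconsistent with the test model:

```python
import json
import os

MAX_BYTES = 1 * 1024 * 1024  # test files must stay under 1 MB
MAX_LINES = 100              # and contain at most 100 lines

def validate_test_file(path):
    """Return a list of problems found in a batch-test JSONL file."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file exceeds 1 MB")
    with open(path, encoding="utf-8") as f:
        lines = [ln for ln in f if ln.strip()]
    if len(lines) > MAX_LINES:
        problems.append("file has more than 100 lines")
    for i, line in enumerate(lines, 1):
        try:
            req = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {i}: not valid JSON")
            continue
        if req.get("body", {}).get("model") != "batch-test-model":
            problems.append(f"line {i}: model is not batch-test-model")
        if req.get("url") != "/v1/chat/ds-test":
            problems.append(f"line {i}: url is not /v1/chat/ds-test")
    return problems
```

An empty return value means the file passes these local checks.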

The steps are as follows:

  1. Prepare the test file

    • Download the sample file test_model.jsonl, which contains request information, to your local machine. Make sure that it is in the same directory as the Python script below.

    • Sample content: Set the model parameter to batch-test-model and the url parameter to /v1/chat/ds-test.

      {"custom_id":"1","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
      {"custom_id":"2","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
  2. Run the script

    • Run this Python script.

      To adjust the file path or other parameters, modify the code as needed.
      import os
      from pathlib import Path
      from openai import OpenAI
      import time
      
      # Initialize the client
      client = OpenAI(
          # If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
          # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
          api_key=os.getenv("DASHSCOPE_API_KEY"),
          # The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
          base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # The base_url for the Alibaba Cloud Model Studio service
      )
      
      def upload_file(file_path):
          print(f"Uploading the JSONL file that contains request information...")
          file_object = client.files.create(file=Path(file_path), purpose="batch")
          print(f"File uploaded successfully. File ID: {file_object.id}\n")
          return file_object.id
      
      def create_batch_job(input_file_id):
          print(f"Creating a batch job based on the file ID...")
          # Note: The value of the endpoint parameter here must be the same as the url field in the input file. For the test model (batch-test-model), enter /v1/chat/ds-test. For text embedding models, enter /v1/embeddings. For other models, enter /v1/chat/completions.
          batch = client.batches.create(input_file_id=input_file_id, endpoint="/v1/chat/ds-test", completion_window="24h")
          print(f"Batch job created successfully. Batch job ID: {batch.id}\n")
          return batch.id
      
      def check_job_status(batch_id):
          print(f"Checking the batch job status...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Batch job status: {batch.status}\n")
          return batch.status
      
      def get_output_id(batch_id):
          print(f"Getting the output file ID for successful requests in the batch job...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Output file ID: {batch.output_file_id}\n")
          return batch.output_file_id
      
      def get_error_id(batch_id):
          print(f"Getting the error file ID for failed requests in the batch job...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Error file ID: {batch.error_file_id}\n")
          return batch.error_file_id
      
      def download_results(output_file_id, output_file_path):
          print(f"Printing and downloading the results of successful requests from the batch job...")
          content = client.files.content(output_file_id)
          # Print some content for testing
          print(f"Printing the first 1,000 characters of the successful results: {content.text[:1000]}...\n")
          # Save the result file to your local machine
          content.write_to_file(output_file_path)
          print(f"The complete output results have been saved to the local output file result.jsonl\n")
      
      def download_errors(error_file_id, error_file_path):
          print(f"Printing and downloading the information for failed requests from the batch job...")
          content = client.files.content(error_file_id)
          # Print some content for testing
          print(f"Printing the first 1,000 characters of the failed request information: {content.text[:1000]}...\n")
          # Save the error information file to your local machine
          content.write_to_file(error_file_path)
          print(f"The complete failed request information has been saved to the local error file error.jsonl\n")
      
      def main():
          # File paths
          input_file_path = "test_model.jsonl"  # Replace with your input file path
          output_file_path = "result.jsonl"  # Replace with your output file path
          error_file_path = "error.jsonl"  # Replace with your error file path
          try:
              # Step 1: Upload the JSONL file containing request information to get the input file ID
              input_file_id = upload_file(input_file_path)
              # Step 2: Create a batch job based on the input file ID
              batch_id = create_batch_job(input_file_id)
              # Step 3: Check the batch job status until it is finished
              status = ""
              while status not in ["completed", "failed", "expired", "cancelled"]:
                  status = check_job_status(batch_id)
                  print(f"Waiting for the job to complete...")
                  time.sleep(10)  # Wait 10 seconds before checking the status again
              # If the job fails, print the error message and exit
              if status == "failed":
                  batch = client.batches.retrieve(batch_id)
                  print(f"Batch job failed. Error information: {batch.errors}\n")
                  print(f"For more information, see the error codes documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
                  return
              # Step 4: Download the results. If an output file ID exists, print the first 1,000 characters of the successful results and download the complete results to a local output file.
              # If an error file ID exists, print the first 1,000 characters of the failed request information and download the complete information to a local error file.
              output_file_id = get_output_id(batch_id)
              if output_file_id:
                  download_results(output_file_id, output_file_path)
              error_file_id = get_error_id(batch_id)
              if error_file_id:
                  download_errors(error_file_id, error_file_path)
                  print(f"For more information, see the error codes documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
          except Exception as e:
              print(f"An error occurred: {e}")
              print(f"For more information, see the error codes documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
      
      if __name__ == "__main__":
          main()
  3. Verify the test results

    • The job status shows completed.

    • Result file result.jsonl: Contains the fixed response {"content":"This is a test result."}.

      {"id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","custom_id":"1","response":{"status_code":200,"request_id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-bca7295b-67c3-4b1f-8239-d78323bb669f","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
      {"id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","custom_id":"2","response":{"status_code":200,"request_id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-1e32a8ba-2b69-4dc4-be42-e2897eac9e84","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
      If an error occurs, see Error messages to resolve it.

After the test passes, follow these steps to run a production batch job.

  1. Prepare the input file according to the input file requirements. Set the model parameter in the file to a supported model and the url parameter to /v1/chat/completions.

  2. In the Python script above, replace the endpoint value with the url used in your input file, such as /v1/chat/completions.

    Important

    Make sure that the endpoint in the script matches the url parameter in the input file.

  3. Run the script and wait for the job to complete. If the job is successful, an output file named result.jsonl is generated in the same directory.

    If the job fails, the program exits and prints an error message.
    If an error file ID exists, an error file named error.jsonl is generated in the same directory for you to review.
    Exceptions that occur during the process are caught, and an error message is printed.
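
To guard against a mismatch between the script and the file, you can read the url field from the input file and compare it with the endpoint you pass to batches.create. A minimal sketch (the helper name is illustrative):

```python
import json

def first_request_url(jsonl_path):
    """Return the url field of the first request in a batch input file."""
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                return json.loads(line).get("url")
    return None
```

Before calling batches.create(..., endpoint=endpoint), assert that first_request_url(path) == endpoint.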

Data file format

Input file

Prepare a UTF-8 encoded .jsonl file that meets the following requirements:

  • Format: One JSON object per line, each describing an individual request.

  • Size limit: Up to 50,000 requests per file and no larger than 500 MB.

  • Line limit: Each JSON object can be up to 6 MB, and each request must stay within the model's context window.

  • Consistency: All requests in a file must target the same API endpoint (url) and use the same model (body.model).

  • Unique identifier: Each request requires a custom_id unique within the file, which can be used to reference results after completion.
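
These requirements can be checked locally before upload. The following sketch (the helper name is illustrative; the limits are the ones listed above) verifies custom_id uniqueness, url and model consistency, and the request-count and per-line size limits:

```python
import json

def validate_batch_input(path):
    """Check a batch input .jsonl file against the documented limits.

    Returns a list of problems; an empty list means the file looks valid.
    """
    problems, ids, urls, models = [], set(), set(), set()
    count = 0
    with open(path, "rb") as f:
        for n, raw in enumerate(f, 1):
            if not raw.strip():
                continue
            count += 1
            if len(raw) > 6 * 1024 * 1024:
                problems.append(f"line {n}: request larger than 6 MB")
            req = json.loads(raw)
            cid = req.get("custom_id")
            if cid in ids:
                problems.append(f"line {n}: duplicate custom_id {cid!r}")
            ids.add(cid)
            urls.add(req.get("url"))
            models.add(req.get("body", {}).get("model"))
    if count > 50000:
        problems.append("more than 50,000 requests")
    if len(urls) > 1:
        problems.append(f"multiple url values: {sorted(urls)}")
    if len(models) > 1:
        problems.append(f"multiple models: {sorted(models)}")
    return problems
```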

Request example

{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello!"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}

JSONL batch generation tool

Use this tool to quickly generate JSONL files. To avoid performance issues, do not process more than 10,000 rows at a time. If you have a large data volume, process the data in batches.


Request parameters

custom_id (String, required)

  A custom request ID. Each line represents one request, and each request must have a custom_id that is unique within the file. After the batch job is complete, use the custom_id to find the corresponding result in the result file.

method (String, required)

  The request method. Only POST is supported.

url (String, required)

  The API path. It must be the same as the endpoint field used when creating the batch job.

  • For the test model batch-test-model, enter /v1/chat/ds-test.

  • For other models, enter /v1/chat/completions.

body (Object, required)

  The request body for the model call. It contains all the parameters required to call the model, such as model and messages. These parameters are the same as those supported by the real-time inference API. For more information, see OpenAI compatible API.

  You can also add more parameters, such as max_tokens and temperature, to the body. Separate the parameters with commas.

  Example:

  {"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-turbo-latest","stream":true,"enable_thinking":true,"thinking_budget":50,"messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Who are you?"}],"max_tokens": 1000,"temperature":0.7}}
  {"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-turbo-latest","stream":true,"enable_thinking":true,"thinking_budget":50,"messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}],"max_tokens": 1000,"temperature":0.7}}

body.model (String, required)

  The model used for this batch job.

  Important: All requests in the same job must use the same model. The thinking mode, if supported, must also be consistent.

body.messages (Array, required)

  A list of messages. Example:

  [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"}
  ]

Convert a CSV file to a JSONL file

If you have a CSV file where the first column is the request ID (custom_id) and the second column is the content, use the following Python code to quickly create a JSONL file for a batch job. The CSV file must be in the same directory as the Python script below.

You can also use the template file provided in this topic. The steps are as follows:

  1. Download the template file to your local machine and place it in the same directory as the Python script below.

  2. In this CSV template file, the first column is the request ID (custom_id) and the second column is the content. You can paste your business questions into this file.

After you run the following Python script, a JSONL file named input_demo.jsonl for a batch job is generated in the same directory.

To adjust the file path or other parameters, modify the code as needed.
import csv
import json

def messages_builder_example(content):
    messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": content}]
    return messages

with open("input_demo.csv", "r", encoding="utf-8") as fin:
    with open("input_demo.jsonl", "w", encoding="utf-8") as fout:
        csvreader = csv.reader(fin)
        for row in csvreader:
            body = {"model": "qwen-turbo", "messages": messages_builder_example(row[1])}
            # The url defaults to /v1/chat/completions.
            request = {"custom_id": row[0], "method": "POST", "url": "/v1/chat/completions", "body": body}
            fout.write(json.dumps(request, separators=(',', ':'), ensure_ascii=False) + "\n")

Output file

A JSONL file. Each line is a JSON object that corresponds to a request result.

Response example

A single-line content example:

{"id":"73291560-xxx","custom_id":"1","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}

A multi-line content example:

{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-max","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. I am here to support you, whether you need to query information, find learning materials, get solutions to problems, or need any other help. Please tell me what you need help with."}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}

Response parameters

id (String, always returned)

  The request ID.

custom_id (String, always returned)

  The custom request ID.

response (Object, optional)

  The request result.

error (Object, optional)

  The error result for a failed request.

error.code (String, optional)

  The error code.

error.message (String, optional)

  The error message.

completion_tokens (Integer, optional)

  The number of tokens in the generated completion.

prompt_tokens (Integer, optional)

  The number of tokens in the prompt.

model (String, optional)

  The model used for inference in this job.

Convert a JSONL file to a CSV file

Compared to JSONL files, CSV files usually contain only the necessary data values without extra keys or metadata, making them ideal for automated scripts and batch jobs. To convert the JSONL output from a batch job to a CSV file, use the following Python code.

Make sure that the result.jsonl file is in the same directory as the Python script below. After you run the script, a CSV file named result.csv is generated.

To adjust the file path or other parameters, modify the code as needed.
import json
import csv
columns = ["custom_id",
           "model",
           "request_id",
           "status_code",
           "error_code",
           "error_message",
           "created",
           "content",
           "usage"]

def dict_get_string(dict_obj, path):
    obj = dict_obj
    try:
        for element in path:
            obj = obj[element]
        return obj
    except (KeyError, IndexError, TypeError):
        return None

with open("result.jsonl", "r") as fin:
    with open("result.csv", 'w', encoding='utf-8') as fout:
        rows = [columns]
        for line in fin:
            request_result = json.loads(line)
            row = [dict_get_string(request_result, ["custom_id"]),
                   dict_get_string(request_result, ["response", "body", "model"]),
                   dict_get_string(request_result, ["response", "request_id"]),
                   dict_get_string(request_result, ["response", "status_code"]),
                   dict_get_string(request_result, ["error", "error_code"]),
                   dict_get_string(request_result, ["error", "error_message"]),
                   dict_get_string(request_result, ["response", "body", "created"]),
                   dict_get_string(request_result, ["response", "body", "choices", 0, "message", "content"]),
                   dict_get_string(request_result, ["response", "body", "usage"])]
            rows.append(row)
        writer = csv.writer(fout)
        writer.writerows(rows)
If a CSV file contains Chinese characters and appears garbled when you open it in Excel, Excel interpreted the file with the wrong encoding. Either use a text editor such as Sublime Text to convert the CSV file's encoding to GBK and then open it in Excel, or create a new Excel workbook and explicitly specify UTF-8 as the encoding when you import the data.
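
Another option is to write a copy of the file with a UTF-8 byte-order mark, which Excel uses to detect the encoding automatically. A small sketch (the helper name is illustrative):

```python
def add_excel_bom(src_path, dst_path):
    """Copy a UTF-8 CSV file, prepending a byte-order mark so that Excel
    detects the encoding automatically instead of garbling the text."""
    with open(src_path, encoding="utf-8") as src:
        data = src.read()
    with open(dst_path, "w", encoding="utf-8-sig") as dst:
        dst.write(data)
```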

Procedure

1. Prepare and upload the file

Before you create a batch job, prepare a JSONL file that meets the input file requirements. Upload the file using the file upload API operation to obtain a file_id. Use the `purpose` parameter to specify the file's purpose as batch.

The maximum size of a single file that you can upload for a batch job is 500 MB. The maximum number of files allowed in your Model Studio storage space under your Alibaba Cloud account is 10,000, and the total size cannot exceed 100 GB. Files do not currently have an expiration date.

OpenAI Python SDK

Request example

import os
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    # If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
    # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)

# test.jsonl is a local sample file. The purpose must be batch.
file_object = client.files.create(file=Path("test.jsonl"), purpose="batch")

print(file_object.model_dump_json())

Content of the test file test.jsonl:

{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}

Response example

{
    "id": "file-batch-xxx",
    "bytes": 437,
    "created_at": 1742304153,
    "filename": "test.jsonl",
    "object": "file",
    "purpose": "batch",
    "status": "processed",
    "status_details": null
}

curl

Request example

# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/files
# === Delete this comment before running ===
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
--form 'file=@"test.jsonl"' \
--form 'purpose="batch"'

Content of the test file test.jsonl:

{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "qwen-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is 2+2?"}]}}

Response example

{
    "id": "file-batch-xxx",
    "bytes": 231,
    "created_at": 1729065815,
    "filename": "test.jsonl",
    "object": "file",
    "purpose": "batch",
    "status": "processed",
    "status_details": null
}

2. Create a batch

Create a batch job by setting the input_file_id parameter to the file ID returned by the Prepare and upload the file API operation.

API rate limit: Each Alibaba Cloud account can make up to 1,000 calls per minute. The maximum number of running jobs is 1,000 (including all unfinished jobs). If you exceed the maximum, you must wait for a job to finish before creating a new one.

OpenAI Python SDK

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
    # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)

batch = client.batches.create(
    input_file_id="file-batch-xxx",  # The ID returned after uploading the file
    endpoint="/v1/chat/completions",  # For the test model batch-test-model, enter /v1/chat/ds-test. For other models, enter /v1/chat/completions.
    completion_window="24h",
    metadata={'ds_name':"Job Name",'ds_description':'Job Description'} # Metadata, an optional field, used to create a job name and description
)
print(batch)

curl

Request example

# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/batches
# === Delete this comment before running ===
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-batch-xxx",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata":{"ds_name":"Job Name","ds_description":"Job Description"}
  }'
Replace the value of input_file_id with the actual value.

Input parameter settings

input_file_id (String, Body, required)

  The ID of the file to use as the input for the batch job.

  Use the file ID returned by the Prepare and upload the file API operation, such as file-batch-xxx.

endpoint (String, Body, required)

  The access path. It must be the same as the url field in the input file.

  • For the test model batch-test-model, enter /v1/chat/ds-test.

  • For other models, enter /v1/chat/completions.

completion_window (String, Body, required)

  The maximum waiting time. The minimum is 24h, and the maximum is 336h. Only integer values are supported. The supported units are "h" and "d", such as "24h" or "14d".

metadata (Map, Body, optional)

  Extended metadata for the job, attached as key-value pairs.

metadata.ds_name (String, Body, optional)

  The name of the job, up to 100 characters. Example: "ds_name":"Batch Job". If this field is defined multiple times, the last value passed is used.

metadata.ds_description (String, Body, optional)

  The description of the job, up to 200 characters. Example: "ds_description":"Batch inference job test". If this field is defined multiple times, the last value passed is used.
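
The completion_window format can be checked locally before creating a job. A small sketch (the function name is illustrative) that converts a window to hours and enforces the documented 24h-336h range:

```python
import re

def window_hours(window):
    """Convert a completion_window such as "24h" or "14d" to hours,
    raising ValueError if it is malformed or outside 24h-336h."""
    m = re.fullmatch(r"(\d+)([hd])", window)
    if not m:
        raise ValueError(f"bad completion_window: {window!r}")
    hours = int(m.group(1)) * (24 if m.group(2) == "d" else 1)
    if not 24 <= hours <= 336:
        raise ValueError(f"completion_window must be 24h-336h, got {hours}h")
    return hours
```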

Response example

{
    "id": "batch_xxx",
    "object": "batch",
    "endpoint": "/v1/chat/completions",
    "errors": null,
    "input_file_id": "file-batch-xxx",
    "completion_window": "24h",
    "status": "validating",
    "output_file_id": null,
    "error_file_id": null,
    "created_at": 1742367779,
    "in_progress_at": null,
    "expires_at": null,
    "finalizing_at": null,
    "completed_at": null,
    "failed_at": null,
    "expired_at": null,
    "cancelling_at": null,
    "cancelled_at": null,
    "request_counts": {
        "total": 0,
        "completed": 0,
        "failed": 0
    },
    "metadata": {
        "ds_name": "Job Name",
        "ds_description": "Job Description"
    }
}

Response parameters

id (String)

  The batch job ID.

object (String)

  The object type. The value is fixed to batch.

endpoint (String)

  The access path.

errors (Map)

  The error information.

input_file_id (String)

  The input file ID.

completion_window (String)

  The maximum waiting time. The minimum is 24h, and the maximum is 336h. Only integer values are supported. The supported units are "h" and "d", such as "24h" or "14d".

status (String)

  The job status: validating, failed, in_progress, finalizing, completed, expired, cancelling, or cancelled.

output_file_id (String)

  The output file ID for successful requests.

error_file_id (String)

  The output file ID for failed requests.

created_at (Integer)

  The UNIX timestamp (in seconds) when the job was created.

in_progress_at (Integer)

  The UNIX timestamp (in seconds) when the job started running.

expires_at (Integer)

  The UNIX timestamp (in seconds) when the job expires.

finalizing_at (Integer)

  The UNIX timestamp (in seconds) when the job entered the finalizing state.

completed_at (Integer)

  The UNIX timestamp (in seconds) when the job was completed.

failed_at (Integer)

  The UNIX timestamp (in seconds) when the job failed.

expired_at (Integer)

  The UNIX timestamp (in seconds) when the job timed out.

cancelling_at (Integer)

  The UNIX timestamp (in seconds) when the job was set to cancelling.

cancelled_at (Integer)

  The UNIX timestamp (in seconds) when the job was cancelled.

request_counts (Map)

  The number of requests in each state.

metadata (Map)

  Additional information as key-value pairs.

metadata.ds_name (String)

  The name of the current job.

metadata.ds_description (String)

  The description of the current job.

3. Query and manage batch jobs

Query batch job details

Pass the batch job ID returned by Create a batch job to query information about a specific batch job. You can only query batch jobs created within the last 30 days.

API rate limit: Each Alibaba Cloud account can make up to 1,000 calls per minute. Because a batch job takes some time to execute, we recommend calling this query API operation once per minute to retrieve job information after you create a batch job.

OpenAI Python SDK

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
    # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.retrieve("batch_id")  # Replace batch_id with the ID of the batch job
print(batch)

curl

Request example

# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/batches/batch_id
# === Delete this comment before running ===
curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.

Input parameter settings

Field

Type

Parameter passing method

Required

Description

batch_id

String

Path

Yes

The ID of the batch job to query (the batch job ID returned by Create a batch job), which starts with "batch", for example, "batch_xxx".

Response example

For more information, see the response example for Create a batch job.

Response parameters

For more information, see the response parameters for Create a batch job.

The content of output_file_id and error_file_id in the response parameters can be retrieved using Download the batch result file.

Query the batch job list

You can use the batches.list() method to query the batch job list and use the paging mechanism to retrieve the complete job list.

  • Use the after parameter: Pass the ID of the last batch job from the previous page to retrieve the next page of data.

  • Use the limit parameter: Set the number of jobs to return.

  • You can filter queries using parameters such as input_file_ids.

API rate limit: Each Alibaba Cloud account can make up to 100 calls per minute.

OpenAI Python SDK

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
    # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)
batches = client.batches.list(
    after="batch_xxx",
    limit=2,
    extra_query={
        "ds_name": "Job Name",
        "input_file_ids": "file-batch-xxx,file-batch-xxx",
        "status": "completed,expired",
        "create_after": "20250304000000",
        "create_before": "20250306123000",
    },
)
print(batches)

curl

Request example

# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/batches and keep the query parameters unchanged
# === Delete this comment before running ===
curl --request GET  'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches?after=batch_xxx&limit=2&ds_name=Batch&input_file_ids=file-batch-xxx,file-batch-xxx&status=completed,failed&create_after=20250303000000&create_before=20250320000000' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_xxx in after=batch_xxx with an actual batch job ID. Set limit to the number of jobs to return. Set ds_name to a fragment of the job name, input_file_ids to a comma-separated list of file IDs, status to a comma-separated list of batch job statuses, and create_after and create_before to points in time in the yyyyMMddHHmmss format.

Input parameter settings

Field

Type

Parameter passing method

Required

Description

after

String

Query

No

A cursor for paging. The value of the after parameter is a batch job ID, which indicates that data after this ID should be queried. When performing a paged query, assign the last batch job ID (last_id) from the returned results to this parameter to get the next page of data.

For example, if the current query returns 20 rows of data and the last batch job ID (last_id) is batch_xxx, set after=batch_xxx in the subsequent query to get the next page of the list.

limit

Integer

Query

No

The number of batch jobs to return for each query. The range is [1, 100]. The default value is 20.

ds_name

String

Query

No

Performs a fuzzy search based on the job name. Enter any continuous character fragment to match job names that contain it. For example, entering "Batch" can match "Batch Job" and "Batch Job_20240319".

input_file_ids

String

Query

No

Filters by multiple file IDs, separated by commas. You can enter up to 20 IDs. Use the file IDs returned by Prepare and upload the file.

status

String

Query

No

Filters by multiple statuses, separated by commas. The statuses include validating, failed, in_progress, finalizing, completed, expired, cancelling, and cancelled.

create_after

String

Query

No

Filters for jobs created after this point in time. The format is yyyyMMddHHmmss. For example, to filter for jobs created after 00:00:00 on March 4, 2025, enter 20250304000000.

create_before

String

Query

No

Filters for jobs created before this point in time. The format is yyyyMMddHHmmss. For example, to filter for jobs created before 12:30:00 on March 4, 2025, enter 20250304123000.
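The create_after and create_before filters expect the yyyyMMddHHmmss format. A small helper, sketched here with only the Python standard library, can produce it from a datetime:

```python
from datetime import datetime


def to_batch_time(dt):
    """Format a datetime as the yyyyMMddHHmmss string used by create_after/create_before."""
    return dt.strftime("%Y%m%d%H%M%S")


print(to_batch_time(datetime(2025, 3, 4, 12, 30, 0)))  # → 20250304123000
```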

Response example

{
  "object": "list",
  "data": [
    {
      "id": "batch_xxx",
      "object": "batch",
      "endpoint": "/v1/chat/completions",
      "errors": null,
      "input_file_id": "file-batch-xxx",
      "completion_window": "24h",
      "status": "completed",
      "output_file_id": "file-batch_output-xxx",
      "error_file_id": null,
      "created_at": 1722234109,
      "in_progress_at": 1722234109,
      "expires_at": null,
      "finalizing_at": 1722234165,
      "completed_at": 1722234165,
      "failed_at": null,
      "expired_at": null,
      "cancelling_at": null,
      "cancelled_at": null,
      "request_counts": {
        "total": 100,
        "completed": 95,
        "failed": 5
      },
      "metadata": {}
    },
    { ... }
  ],
  "first_id": "batch_xxx",
  "last_id": "batch_xxx",
  "has_more": true
}

Response parameters

Field

Type

Description

object

String

The type. The value is fixed to list.

data

Array

The list of batch job objects. For more information, see the response parameters for creating a batch job.

first_id

String

The ID of the first batch job on the current page.

last_id

String

The ID of the last batch job on the current page.

has_more

Boolean

Indicates whether there is a next page.

Cancel a batch job

Pass the batch job ID returned by Create a batch job to cancel the specified batch job.

API rate limit: Each Alibaba Cloud account can make up to 1,000 calls per minute.

OpenAI Python SDK

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
    # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.cancel("batch_id")  # Replace batch_id with the ID of the batch job
print(batch)

curl

Request example

# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/batches/batch_id/cancel
# === Delete this comment before running ===
curl --request POST 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id/cancel' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.

Input parameter settings

Field

Type

Parameter passing method

Required

Description

batch_id

String

Path

Yes

The ID of the batch job to cancel, which starts with "batch", for example, "batch_xxx".

Response example

For more information, see the response example for Create a batch job.

Response parameters

For more information, see the response parameters for Create a batch job.

4. Download the batch result file

After the batch inference job is complete, use the API operation to download the result file.

Get the file_id for the download from the output_file_id field in the response of the Query batch job details or Query the batch job list API. Only files whose file_id starts with file-batch_output can be downloaded.

OpenAI Python SDK

Use the content method to retrieve the content of the batch job result file and use the write_to_file method to save it to your local machine.

Request example

import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
    # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
content = client.files.content(file_id="file-batch_output-xxx")
# Print the content of the result file
print(content.text)
# Save the result file to your local machine
content.write_to_file("result.jsonl")

Response example

{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-plus","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. I am here to support you, whether you need to query information, find learning materials, get solutions to problems, or need any other help. Please tell me what you need help with."}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-plus","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
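Each line of the result file is an independent JSON record. A small helper, sketched here assuming only the fields shown in the response example above, can map each custom_id back to its model output or error:

```python
import json


def parse_batch_results(path):
    """Map each custom_id in a batch result JSONL file to its model
    output, or to its error object for failed requests."""
    results = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if record.get("error") is not None:
                results[record["custom_id"]] = {"ok": False, "error": record["error"]}
            else:
                body = record["response"]["body"]
                results[record["custom_id"]] = {
                    "ok": True,
                    "content": body["choices"][0]["message"]["content"],
                }
    return results


# Example: results = parse_batch_results("result.jsonl")
```

Because the order of lines in the result file is not guaranteed to match the input file, keying by custom_id is the reliable way to join results back to the original requests.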

curl

Use the GET method and specify the file_id in the URL to download the batch job result file.

Request example

# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/files/file_id/content
# === Delete this comment before running ===
curl -X GET https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files/file-batch_output-xxx/content \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" > result.jsonl

Input parameter settings

Field

Type

Parameter passing method

Required

Description

file_id

String

Path

Yes

The ID of the file to download. This is the value of the output_file_id parameter returned by Query batch job details or Query the batch job list.

Returned result

A JSONL file of the batch job results. For more information about the format, see Output file.

Extended features

Billing

  • Unit price: The input and output tokens of successful requests are billed at 50% of the price of the corresponding real-time (synchronous) API for the model. Pricing details: Models.

  • Scope:

    • Only successfully executed requests in a task are billed.

    • File parsing failures, execution failures, or row-level request errors do not incur charges.

    • For canceled tasks, requests that successfully completed before the cancellation are still billed.
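Because only successful requests are billed, the billable token counts for a finished job can be tallied from the downloaded result file. This is a sketch that assumes the usage fields shown in the download example above:

```python
import json


def billable_tokens(path):
    """Sum prompt and completion tokens for successful requests only;
    failed rows in the result file do not incur charges."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            response = record.get("response") or {}
            if record.get("error") is None and response.get("status_code") == 200:
                usage = response["body"]["usage"]
                totals["prompt_tokens"] += usage["prompt_tokens"]
                totals["completion_tokens"] += usage["completion_tokens"]
    return totals


# Example: billable_tokens("result.jsonl")
```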

Important

Batch calls are billed separately and do not support savings plans, free quotas for new users, or features such as context cache.

Error codes

If the call fails and an error message is returned, see Error messages to resolve the issue.

FAQ

  1. Do I need to place an order to use batch calls? If so, where?

    A: Batch is a calling method and does not require an additional order. It uses pay-as-you-go billing: you pay directly for each batch API call.

  2. How are submitted batch call requests processed in the background? Are they executed in the order they are submitted?

    A: Batch jobs are not processed by a first-in, first-out queue but by a scheduling mechanism: jobs are scheduled and executed based on resource availability, so they are not necessarily executed in the order they are submitted.

  3. How long does it actually take for a submitted batch call request to be completed?

    A: The execution time of a batch job depends on the system's resource allocation.

    When system resources are tight, a job may not be fully completed within the set maximum waiting time.

    Therefore, for scenarios with strict requirements on the timeliness of model inference, we recommend using real-time calls. For scenarios that involve processing large-scale data and have a certain tolerance for timeliness, we recommend using batch calls.