
Alibaba Cloud Model Studio: OpenAI compatible - Completions

Last Updated: Nov 11, 2025

The Completions API is designed for text completion scenarios, such as code completion and content continuation.

Important

This document applies only to the China (Beijing) region. To use the model, you must use an API key from the China (Beijing) region.

Supported models

Currently, the following Qwen Coder models are supported:

qwen2.5-coder-0.5b-instruct, qwen2.5-coder-1.5b-instruct, qwen2.5-coder-3b-instruct, qwen2.5-coder-7b-instruct, qwen2.5-coder-14b-instruct, qwen2.5-coder-32b-instruct, qwen-coder-turbo-0919, qwen-coder-turbo-latest, qwen-coder-turbo

Prerequisites

You must obtain an API key and configure it as an environment variable. If you use the OpenAI SDK, you must also install the SDK.

Get started

You can use the Completions API for text completion. The following scenarios are supported:

  1. Generate text that continues from a given prefix.

  2. Generate intermediate content based on a given prefix and suffix.

The API does not support generating content before a given suffix.

Generate text that continues from a given prefix

You can pass information, such as the function name, input parameters, and usage instructions, in the prefix. The Completions API then returns the generated code.

The prompt template is as follows:

<|fim_prefix|>{prefix_content}<|fim_suffix|>

In the template, {prefix_content} is the prefix that you provide.

Python

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.getenv("DASHSCOPE_API_KEY")
)

completion = client.completions.create(
  model="qwen2.5-coder-32b-instruct",
  prompt="<|fim_prefix|>Write a Python quick sort function, def quick_sort(arr):<|fim_suffix|>",
)

print(completion.choices[0].text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI(
    {
        // If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx",
        apiKey: process.env.DASHSCOPE_API_KEY,
        baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
    }
);

async function main() {
    const completion = await openai.completions.create({
        model: "qwen2.5-coder-32b-instruct",
        prompt: "<|fim_prefix|>Write a Python quick sort function, def quick_sort(arr):<|fim_suffix|>",
    });
    console.log(completion.choices[0].text)
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen2.5-coder-32b-instruct",
    "prompt": "<|fim_prefix|>Write a Python quick sort function, def quick_sort(arr):<|fim_suffix|>"
}'
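With the prefix-only template, the model continues writing from the end of the prefix, so the final text is simply your prefix followed by the returned completion. A minimal stitching sketch (the completion string below is a hypothetical stand-in, not real model output):

```python
# Build the prefix-only prompt from the documented template.
prefix = "def quick_sort(arr):\n"
prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>"

# Hypothetical value of completion.choices[0].text returned by the API:
completion_text = "    if len(arr) <= 1:\n        return arr\n"

# The full generated code is the prefix followed by the completion.
full_code = prefix + completion_text
print(full_code)
```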

Generate intermediate content based on prefix and suffix

The Completions API lets you generate intermediate content based on a given prefix and suffix. You can pass information, such as the function name, input parameters, and usage instructions, in the prefix, and information, such as the function's return parameters, in the suffix. The Completions API then returns the generated code.

The prompt template is as follows:

<|fim_prefix|>{prefix_content}<|fim_suffix|>{suffix_content}<|fim_middle|>

In the template, {prefix_content} is the prefix and {suffix_content} is the suffix.

Python

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.getenv("DASHSCOPE_API_KEY")
)

prefix_content = """def reverse_words_with_special_chars(s):
'''
Reverses each word in a string (preserving the position of non-alphabetic characters) while maintaining the word order.
    Example:
    reverse_words_with_special_chars("Hello, world!") -> "olleH, dlrow!"
    Parameters:
        s (str): The input string (may contain punctuation).
    Returns:
        str: The processed string with words reversed but non-alphabetic characters in their original positions.
'''
"""

suffix_content = "return result"

completion = client.completions.create(
  model="qwen2.5-coder-32b-instruct",
  prompt=f"<|fim_prefix|>{prefix_content}<|fim_suffix|>{suffix_content}<|fim_middle|>",
)

print(completion.choices[0].text)

Node.js

import OpenAI from 'openai';


const client = new OpenAI({
  baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1",
  apiKey: process.env.DASHSCOPE_API_KEY
});

const prefixContent = `def reverse_words_with_special_chars(s):
'''
Reverses each word in a string (preserving the position of non-alphabetic characters) while maintaining the word order.
    Example:
    reverse_words_with_special_chars("Hello, world!") -> "olleH, dlrow!"
    Parameters:
        s (str): The input string (may contain punctuation).
    Returns:
        str: The processed string with words reversed but non-alphabetic characters in their original positions.
'''
`;

const suffixContent = "return result";

async function main() {
  const completion = await client.completions.create({
    model: "qwen2.5-coder-32b-instruct",
    prompt: `<|fim_prefix|>${prefixContent}<|fim_suffix|>${suffixContent}<|fim_middle|>`
  });

  console.log(completion.choices[0].text);
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen2.5-coder-32b-instruct",
    "prompt": "<|fim_prefix|>def reverse_words_with_special_chars(s):\n\"\"\"\nReverses each word in a string (preserving the position of non-alphabetic characters) while maintaining the word order.\n    Example:\n    reverse_words_with_special_chars(\"Hello, world!\") -> \"olleH, dlrow!\"\n    Parameters:\n        s (str): The input string (may contain punctuation).\n    Returns:\n        str: The processed string with words reversed but non-alphabetic characters in their original positions.\n\"\"\"\n<|fim_suffix|>return result<|fim_middle|>"
}'
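For fill-in-the-middle requests, choices[0].text contains only the middle segment, so the complete function is the prefix, the generated middle, and the suffix concatenated in order. A minimal sketch (the middle string below is a hypothetical stand-in, not real model output):

```python
prefix_content = "def add(a, b):\n"
suffix_content = "    return result\n"

# Build the fill-in-the-middle prompt from the documented template.
prompt = f"<|fim_prefix|>{prefix_content}<|fim_suffix|>{suffix_content}<|fim_middle|>"

# Hypothetical value of completion.choices[0].text returned by the API:
middle = "    result = a + b\n"

# Reassemble the full function: prefix + generated middle + suffix.
full_code = prefix_content + middle + suffix_content
print(full_code)
```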

Request and response parameters

Request parameters

model (string, required)

The name of the model to call.

prompt (string, required)

The prompt for which to generate completions.

max_tokens (integer, optional)

The maximum number of tokens to return in the response.

The max_tokens setting does not affect the model's generation process; if the model generates more tokens than max_tokens, the response is truncated.

temperature (float, optional)

The sampling temperature, which controls the diversity of the generated text. A higher value produces more diverse text; a lower value produces more deterministic text.

Valid values: [0, 2.0).

Because both `temperature` and `top_p` control text diversity, set only one of them.

top_p (float, optional)

The probability threshold for nucleus sampling, which controls the diversity of the generated text. A higher value produces more diverse text; a lower value produces more deterministic text.

Valid values: (0, 1.0].

Because both `temperature` and `top_p` control text diversity, set only one of them.

stream (boolean, optional)

Specifies whether to use streaming output for the response. Valid values:

  • false (default): The result is returned after all the content is generated.

  • true: The output is generated and streamed incrementally. A chunk is sent as soon as each part of the content is generated.

stream_options (object, optional)

When streaming output is enabled, set this parameter to {"include_usage": true} to report the number of tokens used in the last line of the output.

stop (string or array, optional)

The model stops generating text when it is about to output a string or token ID specified in `stop`. Specify stop sequences to filter out unwanted content.

seed (integer, optional)

Makes the text generation process more deterministic. This is typically used to make the model produce consistent results across runs: pass the same `seed` value in each call and keep the other parameters unchanged, and the model attempts to return the same result.

Valid values: 0 to 2³¹ − 1.

presence_penalty (float, optional)

Controls the degree of content repetition in the generated text.

Valid values: [-2.0, 2.0]. A positive value reduces repetition; a negative value increases it.
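The request parameters above can be combined in a single call. A sketch of a typical parameter set, with illustrative values chosen for demonstration (note that temperature and top_p should not both be set):

```python
# Illustrative request parameters for the Completions API; the specific
# values here are assumptions for demonstration, not recommendations.
params = {
    "model": "qwen2.5-coder-32b-instruct",
    "prompt": "<|fim_prefix|>def quick_sort(arr):<|fim_suffix|>",
    "max_tokens": 256,          # truncate the response after 256 tokens
    "temperature": 0.7,         # set temperature OR top_p, not both
    "stop": ["<|endoftext|>"],  # illustrative stop sequence
    "seed": 42,                 # same seed + same params -> consistent results
    "stream": True,
    "stream_options": {"include_usage": True},  # usage reported in the final chunk
}

# With an OpenAI client configured as in the examples above, the call would be:
# completion = client.completions.create(**params)
```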

Response parameters

id (string)

The unique identifier for the call.

choices (array)

An array of content generated by the model.

choices[0].text (string)

The content generated for the request.

choices[0].finish_reason (string)

The reason why the model stopped generating content.

choices[0].index (integer)

The index of the current element in the array. The value is always 0.

choices[0].logprobs (object)

This parameter is always empty.

created (integer)

The Unix timestamp when the request was created.

model (string)

The name of the model used for the request.

system_fingerprint (string)

This parameter is always empty.

object (string)

The object type, which is always "text_completion".

usage (object)

The usage statistics for the request.

usage.prompt_tokens (integer)

The number of tokens in the prompt.

usage.completion_tokens (integer)

The number of tokens in the generated content (choices[0].text).

usage.total_tokens (integer)

The sum of usage.prompt_tokens and usage.completion_tokens.
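Putting the response fields together, a non-streaming response can be read as follows. This is a sketch over a hypothetical response payload; the field values are made up for illustration:

```python
# Hypothetical response payload matching the fields documented above.
response = {
    "id": "cmpl-xxxxxxxx",
    "object": "text_completion",
    "created": 1731302400,
    "model": "qwen2.5-coder-32b-instruct",
    "choices": [
        {
            "text": "    return sorted(arr)\n",
            "finish_reason": "stop",
            "index": 0,
            "logprobs": None,
        }
    ],
    "usage": {"prompt_tokens": 18, "completion_tokens": 8, "total_tokens": 26},
}

# Extract the generated text, the stop reason, and the token usage.
text = response["choices"][0]["text"]
finish_reason = response["choices"][0]["finish_reason"]
usage = response["usage"]

print(text)
print(finish_reason)          # e.g. "stop", or "length" if truncated by max_tokens
print(usage["total_tokens"])  # prompt_tokens + completion_tokens
```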

Error codes

If a call fails, see Error messages for troubleshooting.