
Alibaba Cloud Model Studio: OpenAI compatible - Completions

Last Updated: Nov 11, 2025

The Completions API is designed for text completion scenarios, such as code completion and content continuation.

Important

This document applies only to the China (Beijing) region. To use the model, you must use an API key from the China (Beijing) region.

Supported models

Currently, the following Qwen Coder models are supported:

qwen2.5-coder-0.5b-instruct, qwen2.5-coder-1.5b-instruct, qwen2.5-coder-3b-instruct, qwen2.5-coder-7b-instruct, qwen2.5-coder-14b-instruct, qwen2.5-coder-32b-instruct, qwen-coder-turbo-0919, qwen-coder-turbo-latest, qwen-coder-turbo

Prerequisites

You must obtain an API key and configure it as an environment variable. If you use the OpenAI SDK, you must also install the SDK.

Get started

You can use the Completions API for text completion. The following scenarios are supported:

  1. Generate text that continues from a given prefix.

  2. Generate intermediate content based on a given prefix and suffix.

The API does not support generating content before a given suffix.

Generate text that continues from a given prefix

You can pass information, such as the function name, input parameters, and usage instructions, in the prefix. The Completions API then returns the generated code.

The prompt template is as follows:

<|fim_prefix|>{prefix_content}<|fim_suffix|>

In the template, {prefix_content} is the prefix that you provide.

Python

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.getenv("DASHSCOPE_API_KEY")
)

completion = client.completions.create(
  model="qwen2.5-coder-32b-instruct",
  prompt="<|fim_prefix|>Write a Python quick sort function, def quick_sort(arr):<|fim_suffix|>",
)

print(completion.choices[0].text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI(
    {
        // If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx",
        apiKey: process.env.DASHSCOPE_API_KEY,
        baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
    }
);

async function main() {
    const completion = await openai.completions.create({
        model: "qwen2.5-coder-32b-instruct",
        prompt: "<|fim_prefix|>Write a Python quick sort function, def quick_sort(arr):<|fim_suffix|>",
    });
    console.log(completion.choices[0].text)
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen2.5-coder-32b-instruct",
    "prompt": "<|fim_prefix|>Write a Python quick sort function, def quick_sort(arr):<|fim_suffix|>"
}'
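With the prefix-only template, the model continues writing from the end of the prefix, so the final text is simply your prefix followed by the returned completion. A minimal stitching sketch (the completion string below is a hypothetical stand-in, not real model output):

```python
# Build the prefix-only prompt from the documented template.
prefix = "def quick_sort(arr):\n"
prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>"

# Hypothetical value of completion.choices[0].text returned by the API:
completion_text = "    if len(arr) <= 1:\n        return arr\n"

# The full generated code is the prefix followed by the completion.
full_code = prefix + completion_text
print(full_code)
```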

Generate intermediate content based on prefix and suffix

The Completions API lets you generate intermediate content based on a given prefix and suffix. You can pass information, such as the function name, input parameters, and usage instructions, in the prefix, and information, such as the function's return parameters, in the suffix. The Completions API then returns the generated code.

The prompt template is as follows:

<|fim_prefix|>{prefix_content}<|fim_suffix|>{suffix_content}<|fim_middle|>

In the template, {prefix_content} is the prefix and {suffix_content} is the suffix.

Python

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.getenv("DASHSCOPE_API_KEY")
)

prefix_content = """def reverse_words_with_special_chars(s):
'''
Reverses each word in a string (preserving the position of non-alphabetic characters) while maintaining the word order.
    Example:
    reverse_words_with_special_chars("Hello, world!") -> "olleH, dlrow!"
    Parameters:
        s (str): The input string (may contain punctuation).
    Returns:
        str: The processed string with words reversed but non-alphabetic characters in their original positions.
'''
"""

suffix_content = "return result"

completion = client.completions.create(
  model="qwen2.5-coder-32b-instruct",
  prompt=f"<|fim_prefix|>{prefix_content}<|fim_suffix|>{suffix_content}<|fim_middle|>",
)

print(completion.choices[0].text)

Node.js

import OpenAI from 'openai';


const client = new OpenAI({
  baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1",
  apiKey: process.env.DASHSCOPE_API_KEY
});

const prefixContent = `def reverse_words_with_special_chars(s):
'''
Reverses each word in a string (preserving the position of non-alphabetic characters) while maintaining the word order.
    Example:
    reverse_words_with_special_chars("Hello, world!") -> "olleH, dlrow!"
    Parameters:
        s (str): The input string (may contain punctuation).
    Returns:
        str: The processed string with words reversed but non-alphabetic characters in their original positions.
'''
`;

const suffixContent = "return result";

async function main() {
  const completion = await client.completions.create({
    model: "qwen2.5-coder-32b-instruct",
    prompt: `<|fim_prefix|>${prefixContent}<|fim_suffix|>${suffixContent}<|fim_middle|>`
  });

  console.log(completion.choices[0].text);
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen2.5-coder-32b-instruct",
    "prompt": "<|fim_prefix|>def reverse_words_with_special_chars(s):\n\"\"\"\nReverses each word in a string (preserving the position of non-alphabetic characters) while maintaining the word order.\n    Example:\n    reverse_words_with_special_chars(\"Hello, world!\") -> \"olleH, dlrow!\"\n    Parameters:\n        s (str): The input string (may contain punctuation).\n    Returns:\n        str: The processed string with words reversed but non-alphabetic characters in their original positions.\n\"\"\"\n<|fim_suffix|>return result<|fim_middle|>"
}'
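For fill-in-the-middle requests, choices[0].text contains only the middle segment, so the complete function is the prefix, the generated middle, and the suffix concatenated in order. A minimal sketch (the middle string below is a hypothetical stand-in, not real model output):

```python
prefix_content = "def add(a, b):\n"
suffix_content = "    return result\n"

# Build the fill-in-the-middle prompt from the documented template.
prompt = f"<|fim_prefix|>{prefix_content}<|fim_suffix|>{suffix_content}<|fim_middle|>"

# Hypothetical value of completion.choices[0].text returned by the API:
middle = "    result = a + b\n"

# Reassemble the full function: prefix + generated middle + suffix.
full_code = prefix_content + middle + suffix_content
print(full_code)
```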

Request and response parameters

Request parameters

model (string, required)

The name of the model to call.

prompt (string, required)

The prompt for which to generate completions.

max_tokens (integer, optional)

The maximum number of tokens to return in the response.

The max_tokens setting does not affect the model's generation process; if the model generates more tokens than max_tokens, the response is truncated.

temperature (float, optional)

The sampling temperature, which controls the diversity of the generated text. A higher value produces more diverse text; a lower value produces more deterministic text.

Valid values: [0, 2.0).

Because both `temperature` and `top_p` control text diversity, set only one of them.

top_p (float, optional)

The probability threshold for nucleus sampling, which controls the diversity of the generated text. A higher value produces more diverse text; a lower value produces more deterministic text.

Valid values: (0, 1.0].

Because both `temperature` and `top_p` control text diversity, set only one of them.

stream (boolean, optional)

Specifies whether to use streaming output for the response. Valid values:

  • false (default): The result is returned after all the content is generated.

  • true: The output is generated and streamed incrementally. A chunk is sent as soon as each part of the content is generated.

stream_options (object, optional)

When streaming output is enabled, set this parameter to {"include_usage": true} to report the number of tokens used in the last line of the output.

stop (string or array, optional)

The model stops generating text when it is about to output a string or token ID specified in `stop`. Specify stop sequences to filter out unwanted content.

seed (integer, optional)

Makes the text generation process more deterministic. This is typically used to make the model produce consistent results across runs: pass the same `seed` value in each call and keep the other parameters unchanged, and the model attempts to return the same result.

Valid values: 0 to 2³¹ − 1.

presence_penalty (float, optional)

Controls the degree of content repetition in the generated text.

Valid values: [-2.0, 2.0]. A positive value reduces repetition; a negative value increases it.
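The request parameters above can be combined in a single call. A sketch of a typical parameter set, with illustrative values chosen for demonstration (note that temperature and top_p should not both be set):

```python
# Illustrative request parameters for the Completions API; the specific
# values here are assumptions for demonstration, not recommendations.
params = {
    "model": "qwen2.5-coder-32b-instruct",
    "prompt": "<|fim_prefix|>def quick_sort(arr):<|fim_suffix|>",
    "max_tokens": 256,          # truncate the response after 256 tokens
    "temperature": 0.7,         # set temperature OR top_p, not both
    "stop": ["<|endoftext|>"],  # illustrative stop sequence
    "seed": 42,                 # same seed + same params -> consistent results
    "stream": True,
    "stream_options": {"include_usage": True},  # usage reported in the final chunk
}

# With an OpenAI client configured as in the examples above, the call would be:
# completion = client.completions.create(**params)
```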

Response parameters

id (string)

The unique identifier for the call.

choices (array)

An array of content generated by the model.

choices[0].text (string)

The content generated for the request.

choices[0].finish_reason (string)

The reason why the model stopped generating content.

choices[0].index (integer)

The index of the current element in the array. The value is always 0.

choices[0].logprobs (object)

This parameter is always empty.

created (integer)

The Unix timestamp when the request was created.

model (string)

The name of the model used for the request.

system_fingerprint (string)

This parameter is always empty.

object (string)

The object type, which is always "text_completion".

usage (object)

The usage statistics for the request.

usage.prompt_tokens (integer)

The number of tokens in the prompt.

usage.completion_tokens (integer)

The number of tokens in the generated content (choices[0].text).

usage.total_tokens (integer)

The sum of usage.prompt_tokens and usage.completion_tokens.
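Putting the response fields together, a non-streaming response can be read as follows. This is a sketch over a hypothetical response payload; the field values are made up for illustration:

```python
# Hypothetical response payload matching the fields documented above.
response = {
    "id": "cmpl-xxxxxxxx",
    "object": "text_completion",
    "created": 1731302400,
    "model": "qwen2.5-coder-32b-instruct",
    "choices": [
        {
            "text": "    return sorted(arr)\n",
            "finish_reason": "stop",
            "index": 0,
            "logprobs": None,
        }
    ],
    "usage": {"prompt_tokens": 18, "completion_tokens": 8, "total_tokens": 26},
}

# Extract the generated text, the stop reason, and the token usage.
text = response["choices"][0]["text"]
finish_reason = response["choices"][0]["finish_reason"]
usage = response["usage"]

print(text)
print(finish_reason)          # e.g. "stop", or "length" if truncated by max_tokens
print(usage["total_tokens"])  # prompt_tokens + completion_tokens
```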

Error codes

If a call fails, see Error messages for troubleshooting.