The ai-proxy plug-in implements an AI proxy feature based on the OpenAI API contract. The plug-in supports AI service providers such as OpenAI, Azure OpenAI, Moonshot, and Qwen.
Enable this plug-in for routes that process only AI traffic. For requests that do not comply with the OpenAI API specifications, the plug-in returns the HTTP 404 status code.
If the suffix of a request path matches /v1/chat/completions, the request body is parsed based on the text-to-text protocol of OpenAI and then converted to comply with the text-to-text protocol of the related large language model (LLM) provider. If the suffix of a request path matches /v1/embeddings, the request body is parsed based on the text vectorization protocol of OpenAI and then converted to comply with the text vectorization protocol of the related LLM provider.
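For example, a route with this plug-in enabled can accept a standard OpenAI chat completion request body such as the following minimal sketch. The model name gpt-3 is a placeholder and is typically rewritten by the modelMapping configuration that is described in the following sections.
{
  "model": "gpt-3",
  "messages": [
    {
      "role": "user",
      "content": "Hello, who are you?"
    }
  ]
}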
Running attributes
Plug-in execution stage: default stage. Plug-in execution priority: 100.
Configuration items
Basic configuration items
Name | Data type | Required | Default value | Description |
provider | object | Yes | - | The information about the AI service provider. |
The following table describes the fields in the provider configuration item.
Name | Data type | Required | Default value | Description |
type | string | Yes | - | The name of the AI service provider. |
apiTokens | array of string | No | - | The token that is used for authentication when the plug-in accesses the AI service. If multiple tokens are configured, the plug-in randomly selects a token when it requests access to the AI service. Specific AI service providers support only one token. |
timeout | number | No | - | The timeout period for accessing the AI service. Unit: milliseconds. Default value: 120000, which is equivalent to 2 minutes. |
modelMapping | map of string | No | - | The AI model mapping table, which is used to map the model name in the request to the model name that is supported by the service provider. |
protocol | string | No | - | The API contract provided by the plug-in. Valid values: openai and original. openai indicates that the API contract of OpenAI is used. This is the default value. original indicates that the original API contract of the service provider is used. |
context | object | No | - | The AI context information. |
customSettings | array of customSetting | No | - | The parameters to be overwritten or filled in for the AI request. |
The following table describes the fields in the context configuration item.
Name | Data type | Required | Default value | Description |
fileUrl | string | Yes | - | The URL of the file in which the AI context is saved. Only plaintext file content is supported. |
serviceName | string | Yes | - | The full name of the Higress backend service that corresponds to the URL. |
servicePort | number | Yes | - | The access port of the Higress backend service that corresponds to the URL. |
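The following configuration is a minimal sketch of how the context fields can be combined with a provider. The file URL, service name, and service port are placeholders and must point to an actual Higress backend service that hosts the plaintext context file.
provider:
  type: qwen
  apiTokens:
    - "YOUR_QWEN_API_TOKEN"
  context:
    # Placeholders: the file must be reachable through the referenced Higress backend service.
    fileUrl: "http://file.example.com/ai/context.txt"
    serviceName: "file.dns"
    servicePort: 80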
The following table describes the fields in the customSettings configuration item.
Name | Data type | Required | Default value | Description |
name | string | Yes | - | The name of the parameter you want to manage. Example: max_tokens. |
value | string/int/float/bool | Yes | - | The value of the parameter that you want to manage. Example: 0. |
mode | string | No | "auto" | The parameter configuration mode. Valid values: "auto" and "raw". If you set this parameter to "auto", the parameter name is automatically rewritten based on protocols. If you set this parameter to "raw", the parameter name is not rewritten based on protocols, and no restriction check is performed on the parameter name. |
overwrite | bool | No | true | If you set this parameter to false, the specified value is filled in only when the parameter is not already specified in the request. Otherwise, the parameter value that is specified in the request is overwritten. |
The following table describes the parameter name rewriting rules for customSettings. The parameter names that you specify by using the name parameter are rewritten based on protocols. You must use the values in the settingName column to specify the name parameter. For example, if you set name to max_tokens, the parameter name is rewritten to max_tokens based on the OpenAI protocol and to maxOutputTokens based on the Gemini protocol. none indicates that the protocol does not support this parameter. When you set the name parameter to a value that is not listed in this table or a value that is not supported by the protocol, the configuration does not take effect if the raw mode is disabled.
settingName | openai | baidu | spark | qwen | gemini | hunyuan | claude | minimax |
max_tokens | max_tokens | max_output_tokens | max_tokens | max_tokens | maxOutputTokens | none | max_tokens | tokens_to_generate |
temperature | temperature | temperature | temperature | temperature | temperature | Temperature | temperature | temperature |
top_p | top_p | top_p | none | top_p | topP | TopP | top_p | top_p |
top_k | none | none | top_k | none | topK | none | top_k | none |
seed | seed | none | none | seed | none | none | none | none |
If the raw mode is enabled, the name and value that you specify in customSettings are directly used to rewrite the JSON content in the request. You do not need to modify parameter names. For most protocols, you can use customSettings to modify or fill in parameters in the root path of the JSON content. For the Qwen protocol, you can use the ai-proxy plug-in to configure parameters in the parameters subpath of the JSON content. For the Gemini protocol, you can use the ai-proxy plug-in to configure parameters in the generation_config subpath of the JSON content.
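The following configuration is a minimal sketch of the customSettings behavior described above. In auto mode, the max_tokens entry is rewritten to the parameter name of the target protocol, for example maxOutputTokens for Gemini. The raw entry is written into the request JSON without name rewriting; the parameter name used in it is only an illustration.
provider:
  type: qwen
  apiTokens:
    - "YOUR_QWEN_API_TOKEN"
  customSettings:
    - name: "max_tokens"
      value: 1024
      mode: "auto"       # rewritten to the protocol-specific parameter name
      overwrite: true    # replace the value if the request already specifies it
    - name: "enable_search"
      value: true
      mode: "raw"        # written into the request JSON as-is, without name rewriting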
Configuration items specific to each service provider
OpenAI
The value of type for OpenAI is openai. The following table describes the configuration items that are specific to this service provider.
Name | Data type | Required | Default value | Description |
openaiCustomUrl | string | No | - | The custom URL of a backend service that is based on the OpenAI protocol. |
responseJsonSchema | object | No | - | The JSON schema that OpenAI responses must conform to. Only specific models support this configuration item. |
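The following configuration is a minimal sketch that forwards requests to a self-hosted, OpenAI-compatible backend through the openaiCustomUrl field listed above. The URL is a placeholder for your own endpoint.
provider:
  type: openai
  apiTokens:
    - "YOUR_API_TOKEN"
  # Placeholder endpoint of an OpenAI-compatible backend service.
  openaiCustomUrl: "api.example.com/v1/chat/completions"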
Azure OpenAI
The value of type for Azure OpenAI is azure. The following table describes the configuration items specific to this service provider.
Name | Data type | Required | Default value | Description |
azureServiceUrl | string | Yes | - | The URL of the Azure OpenAI service. The value must include the api-version query parameter. |
Azure OpenAI supports only one API token.
Moonshot
The value of type for Moonshot is moonshot. The following table describes the configuration items that are specific to this service provider.
Name | Data type | Required | Default value | Description |
moonshotFileId | string | No | - | The ID of the file that is uploaded to Moonshot by using the file interface. The file content is used as the context of the AI conversation. This field cannot be used together with the context field. |
Qwen
The value of type for Qwen is qwen. The following table describes the configuration items that are specific to this service provider.
Name | Data type | Required | Default value | Description |
qwenEnableSearch | boolean | No | - | Specifies whether to enable the built-in Internet search feature of Qwen. |
qwenFileIds | array of string | No | - | The IDs of the files that are uploaded to DashScope by using the file interface. The file content is used as the context of the AI conversation. This field cannot be used together with the context field. |
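The following configuration is a minimal sketch that uses files uploaded to DashScope as the conversation context. The file IDs are placeholders returned by the DashScope file interface, and the model mapping to qwen-long is only an illustration.
provider:
  type: qwen
  apiTokens:
    - "YOUR_QWEN_API_TOKEN"
  modelMapping:
    '*': "qwen-long"
  qwenFileIds:
    # Placeholder file IDs returned by the DashScope file interface.
    - "file-fe-xxx"
    - "file-fe-yyy"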
Baichuan AI
The value of type for Baichuan AI is baichuan. No specific configuration items are required.
Yi
The value of type for Yi is yi. No specific configuration items are required.
Zhipu AI
The value of type for Zhipu AI is zhipuai. No specific configuration items are required.
DeepSeek
The value of type for DeepSeek is deepseek. No specific configuration items are required.
Groq
The value of type for Groq is groq. No specific configuration items are required.
Baidu
The value of type for Baidu is baidu. No specific configuration items are required.
360 Brain
The value of type for 360 is ai360. No specific configuration items are required.
Mistral
The value of type for Mistral is mistral. No specific configuration items are required.
MiniMax
The value of type for MiniMax is minimax. The following table describes the configuration item that is specific to this service provider.
Name | Data type | Required | Default value | Description |
minimaxGroupId | string | Required when the abab6.5-chat, abab6.5s-chat, abab5.5s-chat, or abab5.5-chat model is used | - | The group ID of MiniMax. ChatCompletion Pro is used when one of these models is specified, and the group ID must be configured. |
Anthropic Claude
The value of type for Anthropic Claude is claude. The following table describes the configuration item that is specific to this service provider.
Name | Data type | Required | Default value | Description |
claudeVersion | string | No | - | The API version of the Anthropic Claude service. Default value: 2023-06-01. |
Ollama
The value of type for Ollama is ollama. The following table describes the configuration items that are specific to this service provider.
Name | Data type | Required | Default value | Description |
ollamaServerHost | string | Yes | - | The host IP address of the Ollama server. |
ollamaServerPort | number | Yes | - | The port number of the Ollama server. Default value: 11434. |
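The following configuration is a minimal sketch for Ollama. The host, port, and model name are placeholders; the port 11434 matches the default value listed above.
provider:
  type: ollama
  # Placeholders: point these at your Ollama server.
  ollamaServerHost: "YOUR_OLLAMA_SERVER_HOST"
  ollamaServerPort: 11434
  modelMapping:
    '*': "llama3"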
Hunyuan
The value of type for Hunyuan is hunyuan. The following table describes the configuration items that are specific to this service provider.
Name | Data type | Required | Default value | Description |
hunyuanAuthId | string | Yes | - | The ID of Hunyuan that is used for v3 authentication. |
hunyuanAuthKey | string | Yes | - | The key of Hunyuan that is used for v3 authentication. |
Stepfun
The value of type for Stepfun is stepfun. No specific configuration items are required.
Cloudflare Workers AI
The value of type for Cloudflare Workers AI is cloudflare. The following table describes the configuration item that is specific to this service provider.
Name | Data type | Required | Default value | Description |
cloudflareAccountId | string | Yes | - | The Cloudflare account ID. For more information, see Cloudflare account ID. |
Spark
The value of type for Spark is spark. No specific configuration items are required.
The value of the apiTokens field of iFLYTEK Spark is in the APIKey:APISecret format. You must replace APIKey and APISecret with your API key and API secret. Separate the API key and API secret with a colon (:).
Gemini
The value of type for Gemini is gemini. The following table describes the configuration item that is specific to this service provider.
Name | Data type | Required | Default value | Description |
geminiSafetySetting | map of string | No | - | The content filtering and security settings of Gemini. For more information, see Safety settings. |
DeepL
The value of type for DeepL is deepl. The following table describes the configuration item that is specific to this service provider.
Name | Data type | Required | Default value | Description |
targetLang | string | Yes | - | The target language that you specify when you use the DeepL translation service. |
Cohere
The value of type for Cohere is cohere. No specific configuration items are required.
Examples
Use the Azure OpenAI service by using the OpenAI protocol
Use the most basic Azure OpenAI service without the need to configure context.
Configuration information
provider:
  type: azure
  apiTokens:
    - "YOUR_AZURE_OPENAI_API_TOKEN"
  azureServiceUrl: "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-02-15-preview"
Use the Qwen service by using the OpenAI protocol
Use the Qwen service and configure the model mapping from the OpenAI LLM to the Qwen service.
Configuration information
provider:
  type: qwen
  apiTokens:
    - "YOUR_QWEN_API_TOKEN"
  modelMapping:
    'gpt-3': "qwen-turbo"
    'gpt-35-turbo': "qwen-plus"
    'gpt-4-turbo': "qwen-max"
    'gpt-4-*': "qwen-max"
    'gpt-4o': "qwen-vl-plus"
    'text-embedding-v1': 'text-embedding-v1'
    '*': "qwen-turbo"
Use the Alibaba Cloud Model Studio service by using the original protocol
Configuration information
provider:
  type: qwen
  apiTokens:
    - "YOUR_DASHSCOPE_API_TOKEN"
  protocol: original
Use the Doubao service by using the OpenAI protocol
Configuration information
provider:
  type: doubao
  apiTokens:
    - YOUR_DOUBAO_API_KEY
  modelMapping:
    '*': YOUR_DOUBAO_ENDPOINT
  timeout: 1200000
Use the Moonshot service based on the content of a file
Upload a file to the Moonshot service in advance. Then, use the Moonshot service together with the content of the file as the context.
Configuration information
provider:
  type: moonshot
  apiTokens:
    - "YOUR_MOONSHOT_API_TOKEN"
  moonshotFileId: "YOUR_MOONSHOT_FILE_ID"
  modelMapping:
    '*': "moonshot-v1-32k"
Use the Groq service by using the OpenAI protocol
Configuration information
provider:
  type: groq
  apiTokens:
    - "YOUR_GROQ_API_TOKEN"
Use the Anthropic Claude service by using the OpenAI protocol
Configuration information
provider:
  type: claude
  apiTokens:
    - "YOUR_CLAUDE_API_TOKEN"
  version: "2023-06-01"
Use the Hunyuan service by using the OpenAI protocol
Configuration information
provider:
  type: "hunyuan"
  hunyuanAuthKey: "<YOUR AUTH KEY>"
  apiTokens:
    - ""
  hunyuanAuthId: "<YOUR AUTH ID>"
  timeout: 1200000
  modelMapping:
    "*": "hunyuan-lite"
Use the Baidu service by using the OpenAI protocol
Configuration information
provider:
  type: baidu
  apiTokens:
    - "YOUR_BAIDU_API_TOKEN"
  modelMapping:
    'gpt-3': "ERNIE-4.0"
    '*': "ERNIE-4.0"
Use the MiniMax service by using the OpenAI protocol
Configuration information
provider:
  type: minimax
  apiTokens:
    - "YOUR_MINIMAX_API_TOKEN"
  modelMapping:
    "gpt-3": "abab6.5g-chat"
    "gpt-4": "abab6.5-chat"
    "*": "abab6.5g-chat"
  minimaxGroupId: "YOUR_MINIMAX_GROUP_ID"
Use the 360 Brain service by using the OpenAI protocol
Configuration information
provider:
  type: ai360
  apiTokens:
    - "YOUR_AI360_API_TOKEN"
  modelMapping:
    "gpt-4o": "360gpt-turbo-responsibility-8k"
    "gpt-4": "360gpt2-pro"
    "gpt-3.5": "360gpt-turbo"
    "text-embedding-3-small": "embedding_s1_v1.2"
    "*": "360gpt-pro"
Use the Cloudflare Workers AI service by using the OpenAI protocol
Configuration information
provider:
  type: cloudflare
  apiTokens:
    - "YOUR_WORKERS_AI_API_TOKEN"
  cloudflareAccountId: "YOUR_CLOUDFLARE_ACCOUNT_ID"
  modelMapping:
    "*": "@cf/meta/llama-3-8b-instruct"
Use the Spark service by using the OpenAI protocol
Configuration information
provider:
  type: spark
  apiTokens:
    - "APIKey:APISecret"
  modelMapping:
    "gpt-4o": "generalv3.5"
    "gpt-4": "generalv3"
    "*": "general"
Use the Gemini service by using the OpenAI protocol
Configuration information
provider:
  type: gemini
  apiTokens:
    - "YOUR_GEMINI_API_TOKEN"
  modelMapping:
    "*": "gemini-pro"
  geminiSafetySetting:
    "HARM_CATEGORY_SEXUALLY_EXPLICIT": "BLOCK_NONE"
    "HARM_CATEGORY_HATE_SPEECH": "BLOCK_NONE"
    "HARM_CATEGORY_HARASSMENT": "BLOCK_NONE"
    "HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_NONE"
Use the DeepL translation service by using the OpenAI protocol
Configuration information
provider:
  type: deepl
  apiTokens:
    - "YOUR_DEEPL_API_TOKEN"
  targetLang: "ZH"
Sample request
In the following sample request, model indicates the service type of DeepL. You can only set this value to Free or Pro. Enter the text that you want to translate in content. The content parameter in the role: system configuration can contain contextual information that helps improve translation accuracy but is not itself translated. For example, you can enter the description of a product as contextual information when you use the service to translate the product name. This may help improve the quality of the translation.
{
  "model": "Free",
  "messages": [
    {
      "role": "system",
      "content": "money"
    },
    {
      "content": "sit by the bank"
    },
    {
      "content": "a bank in China"
    }
  ]
}
Sample response
{
  "choices": [
    {
      "index": 0
    },
    {
      "index": 1
    }
  ],
  "created": 1722747752,
  "model": "Free",
  "object": "chat.completion",
  "usage": {}
}