Intelligent Media Services: AIAgentTemplateConfig

Last Updated: Mar 21, 2026

Parameter | Type | Description | Example

object

Agent template parameters.

VoiceChat

object

Voice chat parameters.

Greeting

string

The greeting message delivered when a user joins the session. If this parameter is omitted, the greeting configured in the agent template is used. Maximum length: 128 characters.

Good morning, my friend

LlmHistory

array

The LLM/MLLM conversation history.

object

Role

string

The role of the conversation participant. Valid values:

  • user: The user.

  • assistant: The AI assistant.

  • system: The system.

  • function: A function call.

  • plugin: A plugin.

  • tool: A tool.

user

Content

string

The text content of the message.

Hello
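To illustrate the shape of LlmHistory, the following is a minimal Python sketch that builds the array as a list of Role/Content objects. It assumes the configuration is assembled as a plain dictionary and serialized to JSON; the helper `make_history_entry` is hypothetical and not part of any SDK.

```python
import json

# Role values listed in the parameter description above.
VALID_ROLES = {"user", "assistant", "system", "function", "plugin", "tool"}


def make_history_entry(role: str, content: str) -> dict:
    """Build one LlmHistory item (hypothetical helper, not an SDK call)."""
    if role not in VALID_ROLES:
        raise ValueError(f"unsupported Role: {role!r}")
    return {"Role": role, "Content": content}


llm_history = [
    make_history_entry("user", "Hello"),
    make_history_entry("assistant", "Hi, how can I help?"),
]

# ensure_ascii=False keeps non-ASCII message text readable in the JSON output.
print(json.dumps(llm_history, ensure_ascii=False))
```

Rejecting unknown roles on the client side is a design choice of this sketch, not documented service behavior.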

WorkflowOverrideParams

string

Workflow override parameters. Default: empty.

{}

EnableIntelligentSegment

boolean

Specifies whether to enable intelligent segmentation. When enabled, speech fragments separated by pauses are merged into a single, complete sentence. Default: true.

true

AvatarUrlType

string

The type of the agent's avatar URL. Default: none.

USER

AvatarUrl

string

The URL of the agent's avatar for voice chat. Default: none.

http://example.com/a.jpg

VoiceIdList

array

A list of available voices.

string

A voice ID.

zhixiaoxia

CharBreak

boolean

EnableVoiceInterrupt

boolean

Specifies whether to enable voice interruption. Default: true.

true

VoiceprintId

string

The unique ID for voiceprint recognition. Default: not specified.

uniqueId

GracefulShutdown

boolean

Specifies whether to enable graceful shutdown. Default: false.

  • If enabled, the agent finishes its current speech (up to 10 seconds) before stopping.

false

InterruptWords

array

A list of specific words or phrases that trigger a conversation interruption.

string

A specific word or phrase that triggers a conversation interruption.

stop

UserOnlineTimeout

integer

The time in seconds that the agent waits for a user to join before closing the task. Default: 60.

60

AsrLanguageId

string

The language ID for Automatic Speech Recognition (ASR). Possible values:

  • zh_mandarin: Chinese

  • en: English

  • zh_en: Chinese-English

  • es: Spanish

  • jp: Japanese

zh_mandarin

UserOfflineTimeout

integer

The time in seconds that the agent waits after a user leaves before closing the task. Default: 5.

5

LlmSystemPrompt

string

The system prompt for the LLM, applied when the call starts.

You are a friendly and helpful assistant who focuses on giving users accurate information and advice.

BailianAppParams

string

Parameters for Alibaba Cloud Bailian. For details, see Bailian App Params.

{}

VadLevel

integer

The interruption sensitivity threshold. A higher value makes it more difficult to interrupt the agent. Range: 0 to 11. Default: 11.

  • 0: Disables Voice Activity Detection (VAD).

  • 1 to 10: A higher value makes it more difficult to interrupt the agent.

  • 11: Offers lower audio distortion and stronger resistance to interference.

11

LlmHistoryLimit

integer

The maximum number of conversation turns to retain in the LLM/MLLM history. Default: 10.

10

AsrMaxSilence

integer

The maximum duration of silence in milliseconds before a sentence break is detected. Range: 200 to 1,200. Default: 400.

400
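The integer ranges stated for VadLevel and AsrMaxSilence can be checked before a request is sent. The following is a minimal sketch under that assumption; `RANGES` and `check_ranges` are hypothetical names, and how the service itself handles out-of-range values is not described here.

```python
# Inclusive ranges taken from the parameter descriptions above.
RANGES = {
    "VadLevel": (0, 11),
    "AsrMaxSilence": (200, 1200),  # milliseconds
}


def check_ranges(chat_params: dict) -> list:
    """Return one message per out-of-range value; omitted parameters are skipped."""
    errors = []
    for name, (low, high) in RANGES.items():
        if name in chat_params and not low <= chat_params[name] <= high:
            errors.append(f"{name}={chat_params[name]} is outside {low}..{high}")
    return errors


print(check_ranges({"VadLevel": 11, "AsrMaxSilence": 400}))
print(check_ranges({"VadLevel": 12, "AsrMaxSilence": 100}))
```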

WakeUpQuery

string

An initial user query that the agent addresses immediately when the call starts.

What's the weather like today?

Volume

integer

The speaking volume of the agent.

  • If omitted, the system uses adaptive volume mode.

  • If specified, the valid range is 0 to 400. The output volume is calculated as: the workflow's output volume × (Volume / 100). For example:

  1. If Volume is 0, the output is silent.

  2. If Volume is 100, the output volume equals the original volume.

  3. If Volume is 200, the output volume is twice the original volume.

100
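The volume formula can be worked through in a few lines. This is a sketch only: the function name `output_volume` and the numeric scale of the workflow volume are assumptions made for illustration.

```python
from typing import Optional


def output_volume(workflow_volume: float, volume: Optional[int]) -> Optional[float]:
    """Apply the documented scaling; None stands for adaptive volume mode."""
    if volume is None:
        return None  # Volume omitted: the system adapts the volume itself
    if not 0 <= volume <= 400:
        raise ValueError("Volume must be in the range 0 to 400")
    return workflow_volume * (volume / 100)


# With an assumed workflow volume of 50 (the unit is hypothetical):
for v in (0, 100, 200):
    print(v, output_volume(50, v))
```

The three values in the loop correspond to the silent, unchanged, and doubled cases described above.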

VoiceId

string

The ID of the Text-to-Speech (TTS) voice. Changes take effect on the next utterance. If omitted, the default voice from the agent template is used. This parameter applies only to preset TTS voices. Maximum length: 64 characters. For available values, see Intelligent voice effect samples.

zhixiaoxia

UseVoiceprint

boolean

Specifies whether to use voiceprint recognition. Default: false.

false

MaxIdleTime

integer

The maximum idle time in seconds with no interaction before the agent goes offline. Default: 600.

600

AsrHotWords

array

A list of hot words to improve ASR accuracy. A maximum of 128 words is supported.

string

A hot word. Must be 1 to 10 characters long.

check
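The two AsrHotWords limits (at most 128 entries, each 1 to 10 characters long) lend themselves to a client-side check. The following is a minimal sketch; `validate_hot_words` is a hypothetical helper, and counting characters with `len()` is an assumption because the exact counting rule is not stated.

```python
from typing import List

MAX_HOT_WORDS = 128   # list limit stated above
MAX_WORD_LENGTH = 10  # per-word limit stated above


def validate_hot_words(words: List[str]) -> List[str]:
    """Return the list unchanged if it satisfies the documented limits."""
    if len(words) > MAX_HOT_WORDS:
        raise ValueError(f"at most {MAX_HOT_WORDS} hot words are supported")
    for word in words:
        # len() counts Unicode code points; the service's counting rule is an assumption.
        if not 1 <= len(word) <= MAX_WORD_LENGTH:
            raise ValueError(f"hot word {word!r} must be 1 to {MAX_WORD_LENGTH} characters long")
    return words


asr_hot_words = validate_hot_words(["check", "检查"])
```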

EnablePushToTalk

boolean

Specifies whether to enable Push-to-Talk mode. Default: false.

false
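Putting the VoiceChat parameters together, a configuration might be assembled as shown below. This is a sketch under stated assumptions: the values come from the Example column or the listed valid values, the string-typed WorkflowOverrideParams is assumed to carry JSON text (as its `{}` example suggests), and how the resulting JSON is passed to an API operation is outside the scope of this reference.

```python
import json

config = {
    "VoiceChat": {
        "Greeting": "Good morning, my friend",
        "AsrLanguageId": "en",
        "VoiceId": "zhixiaoxia",
        "EnableVoiceInterrupt": True,
        "InterruptWords": ["stop"],
        "VadLevel": 11,
        "AsrMaxSilence": 400,       # milliseconds
        "LlmHistoryLimit": 10,
        "UserOnlineTimeout": 60,    # seconds
        "UserOfflineTimeout": 5,    # seconds
        "MaxIdleTime": 600,         # seconds
        "EnablePushToTalk": False,
        # Assumption: this string parameter holds JSON text.
        "WorkflowOverrideParams": json.dumps({}),
    }
}

# ensure_ascii=False keeps any non-ASCII greeting or prompt text readable.
template_config = json.dumps(config, ensure_ascii=False)
print(template_config)
```

Parameters left out of the dictionary fall back to the defaults listed in their descriptions or to the values configured in the agent template.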

VisionChat

object

Vision agent parameters.

Greeting

string

The greeting message delivered when a user joins the session. If this parameter is omitted, the greeting configured in the agent template is used. Maximum length: 128 characters.

Good morning, my friend!

LlmHistory

array

The LLM/MLLM conversation history.

object

Role

string

The role of the conversation participant. Valid values:

  • user: The user.

  • assistant: The AI assistant.

  • system: The system.

  • function: A function call.

  • plugin: A plugin.

  • tool: A tool.

user

Content

string

The text content of the message.

Hello

WorkflowOverrideParams

string

Workflow override parameters. Default: empty.

{}

EnableIntelligentSegment

boolean

Specifies whether to enable intelligent segmentation. When enabled, speech fragments separated by pauses are merged into a single, complete sentence. Default: true.

true

VoiceIdList

array

A list of available voices.

string

A voice ID.

zhixiaoxia

CharBreak

boolean

EnableVoiceInterrupt

boolean

Specifies whether to enable voice interruption. Default: true.

true

VoiceprintId

string

The unique ID for voiceprint recognition. Default: not specified.

uniqueId

GracefulShutdown

boolean

Specifies whether to enable graceful shutdown. Default: false.

  • If enabled, the agent finishes its current speech (up to 10 seconds) before stopping.

false

InterruptWords

array

A list of specific words or phrases that trigger a conversation interruption.

string

A specific word or phrase that triggers a conversation interruption.

stop

UserOnlineTimeout

integer

The time in seconds that the agent waits for a user to join before closing the task. Default: 60.

60

AsrLanguageId

string

The language ID for Automatic Speech Recognition (ASR). Possible values:

  • zh_mandarin: Chinese

  • en: English

  • zh_en: Chinese-English

  • es: Spanish

  • jp: Japanese

zh_mandarin

UserOfflineTimeout

integer

The time in seconds that the agent waits after a user leaves before closing the task. Default: 5.

5

LlmSystemPrompt

string

The system prompt for the LLM, applied when the call starts.

You are a friendly and helpful assistant who focuses on giving users accurate information and advice.

BailianAppParams

string

Parameters for Alibaba Cloud Bailian. For details, see Bailian App Params.

{}

VadLevel

integer

The interruption sensitivity threshold. A higher value makes it more difficult to interrupt the agent. Range: 0 to 11. Default: 11.

  • 0: Disables VAD.

  • 1 to 10: A higher value makes it more difficult to interrupt the agent.

  • 11: Offers lower audio distortion and stronger resistance to interference.

0

LlmHistoryLimit

integer

The maximum number of conversation turns to retain in the LLM/MLLM history. Default: 10.

10

AsrMaxSilence

integer

The maximum duration of silence in milliseconds before a sentence break is detected. Range: 200 to 1,200. Default: 400.

400

WakeUpQuery

string

An initial user query that the agent addresses immediately when the call starts.

What's the weather like today?

Volume

integer

The speaking volume of the agent.

  • If omitted, the system uses adaptive volume mode.

  • If specified, the valid range is 0 to 400. The output volume is calculated as: the workflow's output volume × (Volume / 100). For example:

  1. If Volume is 0, the output is silent.

  2. If Volume is 100, the output volume equals the original volume.

  3. If Volume is 200, the output volume is twice the original volume.

100

VoiceId

string

The ID of the Text-to-Speech (TTS) voice. Changes take effect on the next utterance. If omitted, the default voice from the agent template is used. This parameter applies only to preset TTS voices. Maximum length: 64 characters. For available values, see Intelligent voice effect samples.

zhixiaoxia

UseVoiceprint

boolean

Specifies whether to use voiceprint recognition. Default: false.

false

MaxIdleTime

integer

The maximum idle time in seconds with no interaction before the agent goes offline. Default: 600.

600

AsrHotWords

array

A list of hot words to improve ASR accuracy. A maximum of 128 words is supported.

string

A hot word. Must be 1 to 10 characters long.

check

EnablePushToTalk

boolean

Specifies whether to enable Push-to-Talk mode. Default: false.

false

AvatarChat3D

object

3D avatar parameters.

Greeting

string

The greeting message delivered when a user joins the session. If this parameter is omitted, the greeting configured in the agent template is used. Maximum length: 128 characters.

Good morning, my friend!

LlmHistory

array

The LLM/MLLM conversation history.

object

Role

string

The role of the conversation participant. Valid values:

  • user: The user.

  • assistant: The AI assistant.

  • system: The system.

  • function: A function call.

  • plugin: A plugin.

  • tool: A tool.

user

Content

string

The text content of the message.

Hello

WorkflowOverrideParams

string

Workflow override parameters. Default: empty.

{}

EnableIntelligentSegment

boolean

Specifies whether to enable intelligent segmentation. When enabled, speech fragments separated by pauses are merged into a single, complete sentence. Default: true.

true

VoiceIdList

array

A list of available voices.

string

A voice ID.

zhixiaoxia

AvatarId

string

The ID of the avatar model.

1231

CharBreak

boolean

EnableVoiceInterrupt

boolean

Specifies whether to enable voice interruption. Default: true.

true

VoiceprintId

string

The unique ID for voiceprint recognition. Default: not specified.

uniqueId

GracefulShutdown

boolean

Specifies whether to enable graceful shutdown. Default: false.

  • If enabled, the agent finishes its current speech (up to 10 seconds) before stopping.

false

InterruptWords

array

A list of specific words or phrases that trigger a conversation interruption.

string

A specific word or phrase that triggers a conversation interruption.

stop

UserOnlineTimeout

integer

The time in seconds that the agent waits for a user to join before closing the task. Default: 60.

60

AsrLanguageId

string

The language ID for Automatic Speech Recognition (ASR). Possible values:

  • zh_mandarin: Chinese

  • en: English

  • zh_en: Chinese-English

  • es: Spanish

  • jp: Japanese

zh_mandarin

UserOfflineTimeout

integer

The time in seconds that the agent waits after a user leaves before closing the task. Default: 5.

5

LlmSystemPrompt

string

The system prompt for the LLM, applied when the call starts.

You are a friendly and helpful assistant who focuses on giving users accurate information and advice.

BailianAppParams

string

Parameters for Alibaba Cloud Bailian. For details, see Bailian App Params.

{}

VadLevel

integer

The interruption sensitivity threshold. A higher value makes it more difficult to interrupt the agent. Range: 0 to 11. Default: 11.

  • 0: Disables VAD.

  • 1 to 10: A higher value makes it more difficult to interrupt the agent.

  • 11: Offers lower audio distortion and stronger resistance to interference.

0

LlmHistoryLimit

integer

The maximum number of conversation turns to retain in the LLM/MLLM history. Default: 10.

10

AsrMaxSilence

integer

The maximum duration of silence in milliseconds before a sentence break is detected. Range: 200 to 1,200. Default: 400.

400

WakeUpQuery

string

An initial user query that the agent addresses immediately when the call starts.

What's the weather like today?

Volume

integer

The speaking volume of the agent.

  • If omitted, the system uses adaptive volume mode.

  • If specified, the valid range is 0 to 400. The output volume is calculated as: the workflow's output volume × (Volume / 100). For example:

  1. If Volume is 0, the output is silent.

  2. If Volume is 100, the output volume equals the original volume.

  3. If Volume is 200, the output volume is twice the original volume.

100

VoiceId

string

The ID of the Text-to-Speech (TTS) voice. Changes take effect on the next utterance. If omitted, the default voice from the agent template is used. This parameter applies only to preset TTS voices. Maximum length: 64 characters. For available values, see Intelligent voice effect samples.

zhixiaoxia

UseVoiceprint

boolean

Specifies whether to use voiceprint recognition. Default: false.

false

MaxIdleTime

integer

The maximum idle time in seconds with no interaction before the agent goes offline. Default: 600.

600

AsrHotWords

array

A list of hot words to improve ASR accuracy. A maximum of 128 words is supported.

string

A hot word. Must be 1 to 10 characters long.

check

EnablePushToTalk

boolean

Specifies whether to enable Push-to-Talk mode. Default: false.

false
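Parameters such as VoiceId, EnableVoiceInterrupt, and MaxIdleTime appear in all three objects in this reference, while AvatarId appears only in AvatarChat3D. The sketch below shows one way to reuse a shared set of values for a 3D avatar configuration; the variable names are hypothetical, and the values are taken from the Example column.

```python
import json

# Settings that appear in VoiceChat, VisionChat, and AvatarChat3D alike.
shared_settings = {
    "VoiceId": "zhixiaoxia",
    "EnableVoiceInterrupt": True,
    "MaxIdleTime": 600,  # seconds
}

# AvatarChat3D additionally accepts AvatarId, the ID of the avatar model.
avatar_config = {"AvatarChat3D": {**shared_settings, "AvatarId": "1231"}}
avatar_config_json = json.dumps(avatar_config)
print(avatar_config_json)
```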