Read this topic to learn about the data types used in the iOS SDK.
Data Structure Overview
Deprecated parameters and methods exist in older SDK versions. Upgrade to the latest SDK version. For more information, see iOS User Guide.
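Structure Type | Data Type | Description |
Enum | ARTCAICallAgentType | Agent type |
Enum | ARTCAICallAgentState | Agent state |
Enum | ARTCAICallAudioProfile | Audio encoding configuration |
Enum | ARTCAICallAudioScenario | Audio scenario configuration |
Enum | ARTCAICallAgentViewMode | Agent view rendering mode |
Enum | ARTCAICallAgentViewMirrorMode | Agent view mirror mode |
Enum | ARTCAICallAgentViewRotationMode | Agent view rotation mode |
Enum | ARTCAICallNetworkQuality | Network status |
Enum | ARTCAICallSpeakingInterruptedReason | Reason the agent’s speech was interrupted |
Enum | ARTCAICallVoiceprintResult | VAD result |
Enum | ARTCAICallErrorCode | Error code |
Enum | ARTCAICallConnectionStatus | Network connection status during a call |
Enum | ARTCAICallTurnDetectionMode | Method to detect when user speech ends |
Class | ARTCAICallAgentInfo | Agent runtime information |
Class | ARTCAICallAudioConfig | Call audio configuration |
Class | ARTCAICallViewConfig | Agent view configuration. Use this class to configure rendering for agents that require it, such as digital humans. |
Class | ARTCAICallVisionConfig | Runtime configuration for vision understanding agents |
Class | ARTCAICallVisionCustomCaptureRequest | Request model to enable custom frame capture for vision understanding agents |
Class | ARTCAICallSendTextToAgentRequest | Request model for sending a text message to an agent |
Class | ARTCAICallConfig | Configuration to start an agent call |
Class | ARTCAICallTemplateConfig | TemplateConfig parameter used to start a call (deprecated) |
Class | ARTCAICallChatSyncConfig | Chat agent session configuration parameters |
Class | ARTCAICallAgentShareConfig | Agent sharing configuration information |
Class | ARTCAICallVideoConfig | Local video configuration for calls |
Class | ARTCAICallAgentConfig | Agent startup and runtime configuration for calls |
Class | ARTCAICallAgentAsrConfig | Speech recognition configuration |
Class | ARTCAICallAgentTtsConfig | Speech synthesis configuration |
Class | ARTCAICallAgentLlmConfig | Large Language Model (LLM) configuration |
Class | ARTCAICallAgentAvatarConfig | Digital human configuration |
Class | ARTCAICallAgentInterruptConfig | Interrupt configuration |
Class | ARTCAICallAgentVoiceprintConfig | Voiceprint denoising configuration |
Class | ARTCAICallAgentTurnDetectionConfig | Turn detection configuration |
Class | ARTCAICallAgentVcrResult | VCR detection result |
Class | ARTCAICallAgentVcrConfig | VCR configuration |
Class | ARTCAICallAgentVcrBaseConfig | Base VCR detection configuration |
Class | ARTCAICallAgentVcrFrameMotionConfig | VCR video frame detection configuration |
Class | ARTCAICallExperimentalConfig | Experimental parameters used to control specific logic policies |
Class | ARTCAICallAgentAmbientConfig | Call environment parameters |
Class | ARTCAICallAgentAutoSpeechContent | Agent speech content for auto-speech scenarios, such as acknowledgments and proactive questions |
Class | ARTCAICallAgentAutoSpeechLlmPending | Auto-speech configuration for cases where the LLM response is delayed |
Class | ARTCAICallAgentAutoSpeechUserIdle | Configuration for agent questions when the user is silent |
Class | ARTCAICallAgentBackChanneling | Configuration module for back-channeling. When enabled, the agent randomly plays short acknowledgments at specific trigger points. |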
Data Structure Details
Enum
ARTCAICallAgentType
Agent type
Enumeration Value | Value | Description |
VoiceAgent | 0 | Voice-only interaction with no visual representation |
AvatarAgent | 1 | Visual representation with support for voice and visual interaction |
VisionAgent | 2 | Focuses on visual information understanding and analysis |
VideoAgent | 3 | Bidirectional video call between the user and the agent |
ARTCAICallAgentState
Agent state
Enumeration Value | Value | Description |
Listening | 1 | Listening |
Thinking | 2 | Thinking |
Speaking | 3 | Speaking |
ARTCAICallAudioProfile
Audio encoding configuration
Enumeration Value | Value | Description |
LowQualityMode | 0x0000 | Low-quality audio mode. Default sample rate: 8000 Hz. Mono channel. Maximum encoding bitrate: 12 kbps |
BasicQualityMode | 0x0001 | Standard-quality audio mode. Default sample rate: 16000 Hz. Mono channel. Maximum encoding bitrate: 24 kbps |
HighQualityMode | 0x0010 | (Default) High-quality audio mode. Default sample rate: 48000 Hz. Mono channel. Maximum encoding bitrate: 64 kbps |
StereoHighQualityMode | 0x0011 | Stereo high-quality audio mode. Default sample rate: 48000 Hz. Stereo channel. Maximum encoding bitrate: 80 kbps |
SuperHighQualityMode | 0x0012 | Super-high-quality audio mode. Default sample rate: 48000 Hz. Mono channel. Maximum encoding bitrate: 96 kbps |
StereoSuperHighQualityMode | 0x0013 | Stereo super-high-quality audio mode. Default sample rate: 48000 Hz. Stereo channel. Maximum encoding bitrate: 128 kbps |
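The table above can be read as a lookup from profile to sample rate, channel count, and maximum bitrate. The following is a minimal sketch of that lookup, assuming a local stand-in enum named AudioProfileSketch and a helper named cheapestProfile (both hypothetical, not the SDK's own declarations); only the raw values and figures are taken from the table.

```swift
import Foundation

// Hypothetical local mirror of the ARTCAICallAudioProfile table.
// This is NOT the SDK declaration; only the values come from the table above.
enum AudioProfileSketch: Int, CaseIterable {
    case lowQuality = 0x0000
    case basicQuality = 0x0001
    case highQuality = 0x0010
    case stereoHighQuality = 0x0011
    case superHighQuality = 0x0012
    case stereoSuperHighQuality = 0x0013

    /// Sample rate (Hz), channel count, and maximum bitrate (kbps) as listed in the table.
    var spec: (sampleRate: Int, channels: Int, maxKbps: Int) {
        switch self {
        case .lowQuality: return (8000, 1, 12)
        case .basicQuality: return (16000, 1, 24)
        case .highQuality: return (48000, 1, 64)
        case .stereoHighQuality: return (48000, 2, 80)
        case .superHighQuality: return (48000, 1, 96)
        case .stereoSuperHighQuality: return (48000, 2, 128)
        }
    }
}

/// Returns the lowest-bitrate profile that meets the requested sample rate and channel layout.
func cheapestProfile(minSampleRate: Int, stereo: Bool) -> AudioProfileSketch? {
    return AudioProfileSketch.allCases
        .filter { $0.spec.sampleRate >= minSampleRate && ($0.spec.channels == 2) == stereo }
        .min(by: { $0.spec.maxKbps < $1.spec.maxKbps })
}
```

For example, asking for at least 16000 Hz in mono selects the standard-quality entry, while asking for 48000 Hz in stereo selects the stereo high-quality entry.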
ARTCAICallAudioScenario
Audio scenario configuration
Enumeration Value | Value | Description |
DefaultMode | 0x0000 | Recommended for general real-time communication scenarios |
MusicMode | 0x0300 | High-fidelity music mode. Recommended for music instruction or other scenarios requiring high-quality music reproduction |
ARTCAICallAgentViewMode
Agent view rendering mode
Enumeration Value | Value | Description |
Auto | 0 | Auto mode |
Stretch | 1 | Stretch mode |
Fill | 2 | Fill mode |
Crop | 3 | Crop mode |
ARTCAICallAgentViewMirrorMode
Agent view mirror mode
Enumeration Value | Value | Description |
OnlyFrontCameraPreviewEnabled | 0 | Mirror only the front camera preview. Do not mirror other views. |
AllEnabled | 1 | Enable mirroring for all views |
AllDisabled | 2 | Disable mirroring for all views |
ARTCAICallAgentViewRotationMode
Agent view rotation mode
Enumeration Value | Value | Description |
Rotation_0 | 0 | Video view rotation angle: 0 degrees |
Rotation_90 | 1 | Video view rotation angle: 90 degrees |
Rotation_180 | 2 | Video view rotation angle: 180 degrees |
Rotation_270 | 3 | Video view rotation angle: 270 degrees |
ARTCAICallNetworkQuality
Network Status
Enumeration Value | Value | Description |
Excellent | 0 | Excellent network quality. Video and audio are smooth and clear |
Good | 1 | Good network quality. Smoothness and clarity are nearly identical to excellent |
Poor | 2 | Poor network quality. Minor issues with smoothness and clarity. Communication remains unaffected |
Bad | 3 | Bad network quality. Severe video stuttering. Audio remains usable for communication |
VeryBad | 4 | Very poor network quality. Communication is nearly impossible |
Disconnect | 5 | Network disconnected |
Unknow | 6 | Unknown |
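An app might translate these quality levels into a simple hint about which media are still usable. The sketch below is one possible reading of the descriptions in the table; NetworkQualitySketch, MediaHint, and mediaHint(for:) are hypothetical names, and the thresholds are an interpretation rather than SDK behavior.

```swift
// Hypothetical local mirror of the ARTCAICallNetworkQuality values in the table above.
enum NetworkQualitySketch: Int {
    case excellent = 0
    case good = 1
    case poor = 2
    case bad = 3
    case veryBad = 4
    case disconnect = 5
    case unknown = 6
}

// Illustrative result type: which media the table describes as still usable.
struct MediaHint: Equatable {
    let audioUsable: Bool
    let videoUsable: Bool
}

/// Maps a quality level to a usability hint, following the table's descriptions.
/// Returns nil when the quality is unknown.
func mediaHint(for quality: NetworkQualitySketch) -> MediaHint? {
    switch quality {
    case .excellent, .good, .poor:
        // Communication is described as smooth or unaffected.
        return MediaHint(audioUsable: true, videoUsable: true)
    case .bad:
        // Severe video stuttering, audio still usable.
        return MediaHint(audioUsable: true, videoUsable: false)
    case .veryBad, .disconnect:
        return MediaHint(audioUsable: false, videoUsable: false)
    case .unknown:
        return nil
    }
}
```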
ARTCAICallSpeakingInterruptedReason
Reason the agent’s speech was interrupted
Enumeration Value | Value | Description |
unknown | 0 | Unknown reason |
byWords | 1 | Specific words were detected |
byVoice | 2 | Voice interruption |
byInterruptSpeaking | 3 | The interruptSpeaking API was called |
bySpeechBroadCast | 4 | Interrupted by a voice broadcast |
byLlmQuery | 5 | Interrupted by an active LLM query |
ARTCAICallVoiceprintResult
VAD result
Enumeration Value | Value | Description |
Off | 0 | Voiceprint denoising VAD is disabled. AIVAD is also disabled |
Unregister | 1 | Voiceprint denoising VAD is enabled but voiceprint registration is incomplete |
DetectedSpeaker | 2 | Voiceprint denoising VAD is enabled and the main speaker is identified |
UndetectedSpeaker | 3 | Voiceprint denoising VAD is enabled but the main speaker is not identified |
DetectedSpeakerWithAIVad | 4 | AIVAD is enabled and the main speaker is identified |
UndetectedSpeakerWithAIVad | 5 | AIVAD is enabled but the main speaker is not identified |
Unknown | 100 | Unknown |
ARTCAICallErrorCode
Error code
Enumeration Value | Value | Description |
None | 0 | Success |
InvalidAction | -1 | Invalid action |
InvalidParames | -2 | Invalid parameter |
NetworkError | -3 | Network error |
InternalError | -4 | Internal error |
BeginCallFailed | -10000 | Failed to start the call |
ConnectionFailed | -10001 | Connection issue |
PublishFailed | -10002 | Failed to ingest the stream |
SubscribeFailed | -10003 | Failed to pull the stream |
TokenExpired | -10004 | Call authentication expired |
KickedByUserReplace | -10005 | Call failed due to same-name login |
KickedBySystem | -10006 | Call failed because the system kicked the user out |
KickedByChannelTerminated | -10007 | Call failed because the channel was destroyed |
LocalDeviceException | -10008 | Call failed due to local device issues |
AgentLeaveChannel | -10101 | The agent left the channel (call ended) |
AgentPullFailed | -10102 | Failed to pull the stream for the agent |
AgentASRFailed | -10103 | Agent ASR failed |
AvatarServiceFailed | -10201 | Failed to start the digital agent service |
AvatarRoutesExhausted | -10202 | Exceeded the maximum number of concurrent digital agent routes |
AgentSubscriptionRequired | -10203 | Call initiation exceeded the daily free trial quota |
AgentNotFound | -10204 | Agent not found (agent ID does not exist) |
ChatTextMessageSendFailed | -10301 | Failed to send the text message |
ChatTextMessageReceiveFailed | -10302 | Failed to receive the text message |
ChatVoiceRecordFailed | -10310 | Failed to record the voice message |
ChatVoiceMessageSendFailed | -10311 | Failed to send the voice message |
ChatVoiceMessageReceiveFailed | -10312 | Failed to receive the voice message |
ChatPlayMessageReceiveFailed | -10321 | Failed to receive the playback message |
ChatLogNotFound | -10331 | Chat log not found |
ChatAttachmentUploading | -10332 | The attachment is still uploading. Wait until upload completes before sending the message |
UnknowError | -40000 | Unknown error |
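The error codes above fall into contiguous numeric blocks, which makes it easy to group them when deciding how to react. The sketch below classifies a raw code by block. The category names and the category(of:) function are assumptions made for illustration; the SDK itself is not known to expose such a grouping.

```swift
// Illustrative grouping of the numeric blocks in the error code table.
// The category names are hypothetical and not part of the SDK.
enum ErrorCategorySketch {
    case success        // 0
    case general        // -1 to -4
    case call           // -10000 to -10008
    case agent          // -10101 to -10103
    case agentService   // -10201 to -10204
    case chat           // -10301 to -10332
    case unknown        // -40000 and anything not listed
}

/// Classifies a raw error code by the numeric block it belongs to.
func category(of code: Int) -> ErrorCategorySketch {
    switch code {
    case 0: return .success
    case -4 ... -1: return .general
    case -10008 ... -10000: return .call
    case -10103 ... -10101: return .agent
    case -10204 ... -10201: return .agentService
    case -10332 ... -10301: return .chat
    default: return .unknown
    }
}
```

With this grouping, TokenExpired (-10004) is handled as a call-level failure, while ChatAttachmentUploading (-10332) is handled as a chat-level condition.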
ARTCAICallTurnDetectionMode
Method to detect when user speech ends
Enumeration Value | Value | Description |
Normal | 0 | Normal mode. Does not use AI for semantic analysis. Uses ASR silence duration to detect speech end |
Semantic | | Semantic mode. Uses AI to analyze context and semantics to detect speech end |
ARTCAICallConnectionStatus
Network connection status during a call
Enumeration Value | Value | Description |
Init | 0 | Initialization complete |
Disconnected | 1 | Network connection disconnected |
Connecting | 2 | Establishing network connection |
Connected | 3 | Network connected |
Reconnecting | 4 | Reconnecting to the network |
Failed | 5 | Network connection failed |
Class
ARTCAICallAgentInfo
Agent runtime information
Property Name | Type | Description |
agentId | String | Current agent ID |
agentType | ARTCAICallAgentType | Agent type |
channelId | String | RTC channel ID where the agent resides |
uid | String | Unique identifier for the agent joining the RTC channel |
instanceId | String | Instance ID for the current agent runtime |
requestId | String | Request ID for starting the current agent |
region | String? | Region where the agent resides |
ARTCAICallAudioConfig
Specifies the audio configuration for a call.
Property Name | Type | Description |
audioProfile | ARTCAICallAudioProfile | Audio encoding configuration. Default: HighQualityMode |
audioScenario | ARTCAICallAudioScenario | Audio scenario configuration. Default: ARTCAICallAudioSceneMusicMode |
ARTCAICallViewConfig
This class provides agent view configuration, allowing you to configure rendering for agents that require it, such as digital humans.
Property Name | Type | Description |
view | UIView | Rendering view |
viewMode | ARTCAICallAgentViewMode | Image rendering mode |
viewMirrorMode | ARTCAICallAgentViewMirrorMode | Image mirror mode |
viewRotationMode | ARTCAICallAgentViewRotationMode | Image rotation mode |
ARTCAICallVisionConfig
Specifies the runtime configuration for visual understanding agents.
Property Name | Type | Description |
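preview | UIView? | Preview view. If empty, no preview is shown and only stream ingestion takes place |
viewMode | ARTCAICallAgentViewMode | Preview image rendering mode |
viewMirrorMode | ARTCAICallAgentViewMirrorMode | Preview image mirror mode |
viewRotationMode | ARTCAICallAgentViewRotationMode | Preview image rotation mode |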
dimensions | CGSize | Stream ingestion resolution |
frameRate | Int | Stream ingestion frame rate |
bitrate | Int | Stream ingestion bitrate |
keyFrameInterval | Int | Stream ingestion keyframe interval (milliseconds) |
useHighQualityPreview | Bool | Use high-definition preview. Otherwise, the SDK adjusts automatically |
cameraCaptureFrameRate | Int | Camera capture frame rate (default: 15 fps) |
ARTCAICallVisionCustomCaptureRequest
A request model that enables custom frame capture for vision understanding agents
Property Name | Type | Description |
text | String | Text parameter for multimodal large model requests |
enableASR | Bool | Pass ASR results as input to the large model |
isSingle | Bool | Single-frame capture |
eachDuration | UInt | Frame capture interval (seconds) |
num | UInt | Number of images per frame capture |
duration | UInt | Duration of continuous frame capture (seconds). Applies only for continuous capture. |
userData | String? | JSON string containing custom business information |
ARTCAICallSendTextToAgentRequest
A request model for sending text messages to an agent.
Property Name | Type | Description |
text | String | Text message to ask the agent, for example: "What is this?" |
ARTCAICallConfig
Specifies the configuration for starting an agent call.
Property Name | Type | Description |
agentId | String | Agent ID |
agentType | ARTCAICallAgentType | Agent type. Must match the agent ID’s type. Otherwise, agent startup fails |
agentUserId | String? | Agent UID. If empty, the service assigns one |
region | String | Region where the agent service resides. Must match the region of the agent ID. Otherwise, agent startup fails |
userId | String | Current user ID |
userJoinToken | String | Current user’s join token |
userData | [String: Any]? | User-defined information passed to the agent |
agentConfig | ARTCAICallAgentConfig | agentConfig parameter used to start the call |
audioConfig | ARTCAICallAudioConfig | Local audio configuration |
videoConfig | ARTCAICallVideoConfig | Local video configuration. Applies only for VisionAgent or VideoAgent |
chatSyncConfig | ARTCAICallChatSyncConfig | Associated chat agent configuration |
templateConfig | ARTCAICallTemplateConfig | Deprecated. Use agentConfig (ARTCAICallAgentConfig) instead |
ARTCAICallTemplateConfig (deprecated)
The TemplateConfig parameter is used to start a call.
This class is deprecated in versions 2.5 and later. Use ARTCAICallAgentConfig instead.
Property Name | Type | Description |
agentGreeting | String? | Agent greeting. Empty uses the agent’s default value. Maximum length: 100 characters |
userOnlineTimeout | Int32 | Time for the agent to wait before ending the task if the user does not join. Negative values use the server default: 60 seconds |
userOfflineTimeout | Int32 | Time for the agent to wait before ending the task after the user leaves. Negative values use the server default: 5 seconds |
workflowOverrideParams | [String: Any]? | Workflow override parameters |
bailianAppParams | [String: Any]? | Alibaba Cloud Model Studio application center parameters |
asrMaxSilence | Int32 | Voice segmentation threshold. Range: 200–1200 ms. Negative values use the server default: 400 ms |
volume | Int32 | Agent speech volume. Range: 0–400. Output volume = workflow speech output volume × volume ÷ 100. Negative values use the server default: 100 |
vadLevel | Int32 | VAD sensitivity setting. Default: 11. Valid range: [0, 11] |
enableVoiceInterrupt | Bool | Enable intelligent interruption |
agentVoiceId | String? | Agent voice ID. Empty uses the agent’s default value |
enableIntelligentSegment | Bool | Enable intelligent sentence segmentation and merging |
useVoiceprint | Bool | Whether to apply voiceprint recognition with denoising to the current utterance. |
voiceprintId | String? | Voiceprint ID. Non-empty enables voiceprint denoising for this call |
agentMaxIdleTime | Int32 | Maximum idle time for the agent (seconds). Negative values use the server default: 600 seconds |
llmHistoryLimit | Int32 | Maximum history turns retained for LLM/multimodal LLM conversations. Negative values use the server default: 10 |
enablePushToTalk | Bool | Enable push-to-talk mode |
agentGracefulShutdown | Bool | Enable graceful shutdown: finish speaking the current sentence before stopping |
agentAvatarId | String? | Digital human model ID. Empty uses the agent’s default value |
asrLanguageId | String? | ASR language ID. Empty uses the agent’s default value. Options: |
wakeUpQuery | String? | User command before call start. Used for immediate agent response after call starts |
llmSystemPrompt | String? | LLM system prompt, for example: “You are a friendly and helpful assistant…” Note: Not supported for LLM nodes using Alibaba Cloud Model Studio workflows |
asrHotWords | [String]? | ASR hotword list. Limit: up to 500 words. Each word: up to 10 characters |
interruptWords | [String]? | Specific words or phrases that trigger interruption, for example: “Hold on” or “I know” |
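Two rules recur in the table above: a negative value means "use the server default", and the effective output volume is the workflow speech output volume × volume ÷ 100. The sketch below works through both. The function names are hypothetical, and clamping to the documented 0–400 range is an assumption about how out-of-range values might be handled, not documented SDK behavior.

```swift
/// Resolves a TemplateConfig-style integer: negative values fall back to the server default.
func resolved(_ value: Int32, serverDefault: Int32) -> Int32 {
    return value < 0 ? serverDefault : value
}

/// Output volume = workflow speech output volume × volume ÷ 100.
/// The volume is clamped to the documented range 0–400 (clamping is an assumption).
func outputVolume(workflowVolume: Double, volume: Int32) -> Double {
    let effective = min(resolved(volume, serverDefault: 100), 400)
    return workflowVolume * Double(effective) / 100.0
}
```

For instance, a workflow output volume of 80 with volume set to 150 yields 120, and a negative volume leaves the workflow volume unchanged because the server default of 100 applies.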
ARTCAICallChatSyncConfig
Configuration parameters for the associated chat agent session.
Property Name | Type | Description |
sessionId | String | Associated chat agent session ID |
agentId | String | Associated chat agent ID (must be in the same account and region) |
receiverId | String | User ID for the associated chat agent session |
ARTCAICallAgentShareConfig
Configuration information for agent sharing
Property Name | Type | Description |
shareId | String? | Agent share ID |
agentType | ARTCAICallAgentType | Agent workload type |
expireTime | Date? | Time-to-live (TTL) |
region | String? | Region where the agent resides |
templateConfig | String? | Template configuration (JSON string) |
userData | [String: Any]? | User-defined information passed to the agent |
ARTCAICallVideoConfig
Configuration for the local video in a call
Property Name | Type | Description |
dimensions | CGSize | Stream ingestion resolution |
frameRate | Int | Stream ingestion frame rate |
bitrate | Int | Stream ingestion bitrate |
keyFrameInterval | Int | Stream ingestion keyframe interval (milliseconds) |
useHighQualityPreview | Bool | Use high-definition preview. Otherwise, the SDK adjusts automatically based on stream ingestion resolution |
cameraCaptureFrameRate | Int | Camera capture frame rate |
useFrontCameraDefault | Bool | Start with the front camera by default |
ARTCAICallAgentConfig
Configuration for starting and running the call agent.
Property Name | Type | Description |
agentGreeting | String? | Agent greeting. Empty uses the agent’s default value |
wakeUpQuery | String? | User command before call start. Used for immediate agent response after call starts |
agentMaxIdleTime | Int32 | Maximum idle time for the agent (seconds). The agent shuts down automatically after timeout. Default: 600 seconds |
userOnlineTimeout | Int32 | Time for the agent to wait before ending the task if the user does not join. Default: 60 seconds |
userOfflineTimeout | Int32 | Time for the agent to wait before ending the task after the user leaves. Default: 5 seconds |
enablePushToTalk | Bool | Enable push-to-talk mode |
agentGracefulShutdown | Bool | Enable graceful shutdown |
volume | Int32 | Agent speech volume. Range: 0–400. Default: 100 |
workflowOverrideParams | [String: Any]? | Workflow override parameters |
enableIntelligentSegment | Bool | Smart sentence segmentation switch |
asrConfig | ARTCAICallAgentAsrConfig | Speech recognition configuration |
ttsConfig | ARTCAICallAgentTtsConfig | Speech synthesis configuration |
llmConfig | ARTCAICallAgentLlmConfig | Large Language Model (LLM) configuration |
avatarConfig | ARTCAICallAgentAvatarConfig | Digital human configuration |
interruptConfig | ARTCAICallAgentInterruptConfig | Interrupt configuration |
voiceprintConfig | ARTCAICallAgentVoiceprintConfig | Voiceprint denoising configuration |
turnDetectionConfig | ARTCAICallAgentTurnDetectionConfig | Turn detection configuration |
experimentalConfig | ARTCAICallExperimentalConfig | Customized, non-production configuration |
vcrConfig | ARTCAICallAgentVcrConfig | VCR configuration |
preConnectAudioUrl | String? | Sound effect to play after connection and before the greeting. Supports URL input. The greeting plays after the sound effect. |
ambientConfig | ARTCAICallAgentAmbientConfig | Environment configuration |
backChannelingConfig | ARTCAICallAgentBackChanneling | Configuration module for back-channeling. When configured, the system randomly plays short acknowledgments at specific trigger points. |
autoSpeechForLlmPendingConfig | ARTCAICallAgentAutoSpeechLlmPending | Auto-speech configuration for cases where the LLM response is delayed. |
autoSpeechForUserIdleConfig | ARTCAICallAgentAutoSpeechUserIdle | Configuration for agent questions when the user is silent. |
ARTCAICallAgentAmbientConfig
Call environment parameters
Property Name | Type | Description |
volume | Int32 | Background sound volume. Default: 100 |
resourceId | String? | Resource ID of the background sound registered in the console. An empty string disables it. |
ARTCAICallAgentAsrConfig
Speech recognition configuration
Property Name | Type | Description |
asrLanguageId | String? | ASR language ID. Empty uses the agent's default value. |
asrMaxSilence | Int32 | Voice segmentation threshold. Silence exceeding this duration is considered a sentence break. Default: 400 ms. Range: 200–1200 ms. |
asrHotWords | [String]? | ASR hotword list. Limit: up to 500 words. Each word: up to 10 characters. |
vadLevel | Int32 | VAD sensitivity setting. Default: 11. Valid range: [0, 11] |
customParams | String? | Runtime parameters for custom ASR. Use URL parameter format, for example: "mode=fast&sample=16000&format=wav" |
vadDuration | Int32 | Minimum duration threshold for voice activity detection, used to adjust interruption sensitivity. Default: 0 (disabled). Valid range: 200–2000 ms. Common range: [200, 500], corresponding to 1 to 4 words. Negative values are not sent to the server (server default is disabled). |
asrMaxSilence | Int32 | Voice segmentation threshold. Silence exceeding this duration is considered a sentence break. Range: 200–1200 ms. Default: -1. Negative values use the agent's default configuration (console value). |
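The customParams string uses URL parameter format, and asrMaxSilence has a documented range of 200–1200 ms with negative values deferring to the agent's default. A minimal sketch of preparing both values follows. makeParamString and clampedMaxSilence are hypothetical helpers, and percent-encoding of special characters is an assumption about what the server accepts.

```swift
import Foundation

/// Builds a "key=value&key=value" string such as "mode=fast&sample=16000&format=wav".
/// Percent-encoding reserved characters is an assumption, not documented behavior.
func makeParamString(_ pairs: [(key: String, value: String)]) -> String {
    var allowed = CharacterSet.urlQueryAllowed
    allowed.remove(charactersIn: "&=+")
    func escape(_ s: String) -> String {
        return s.addingPercentEncoding(withAllowedCharacters: allowed) ?? s
    }
    return pairs.map { escape($0.key) + "=" + escape($0.value) }.joined(separator: "&")
}

/// Keeps negative values untouched (the agent's default applies) and clamps
/// everything else to the documented 200–1200 ms range.
func clampedMaxSilence(_ ms: Int32) -> Int32 {
    if ms < 0 { return ms }
    return min(max(ms, 200), 1200)
}
```

The resulting string could then be assigned to customParams, and the clamped value to asrMaxSilence, before the call starts.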
ARTCAICallAgentTtsConfig
Speech synthesis configuration
Property Name | Type | Description |
agentVoiceId | String? | Agent voice ID. Empty uses the agent's default value. |
pronunciationRules | [[String: Any]]? | Array of pronunciation rules. Up to 20 rules are supported. If nil or empty, no rules are used. Example: |
speechRate | Double | TTS playback speed. Supports all TTS types. Range: [0.5, 2.0]. Default: 1.0. Negative values are not sent to the server (uses console configuration). |
languageId | String? | TTS playback language code. Valid when TTS type is MiniMax. |
emotion | String? | TTS playback emotion type. Valid when TTS type is MiniMax. |
modelId | String? | TTS model ID. Currently only supports MiniMax. Options: speech-01-turbo, speech-02-turbo. |
speechRate | Double | TTS playback speed. Supports all TTS types. Range: [0.5, 2.0]. Default: -1. Negative values use the agent's default configuration (console value). |
ARTCAICallAgentLlmConfig
Large Language Model configuration
Property Name | Type | Description |
llmHistoryLimit | Int32 | Maximum history turns retained for LLM/multimodal LLM conversations. Default: -1. Negative values use the agent's default configuration (console value). |
llmSystemPrompt | String? | LLM system prompt. |
bailianAppParams | [String: Any]? | Parameters for the Model Studio Application Center. |
llmCompleteReply | Bool | Send the complete LLM result. Note: When enabled, the complete LLM result is returned via the onLLMReplyCompleted event callback after generation. |
openAIExtraQuery | String? | Additional query parameters for OpenAI protocol LLMs. Note: Parameters must be in key=value format, with multiple parameters joined by '&'. All values must be strings. |
outputMinLength | Int32 | Minimum text output length (characters). Text shorter than this is cached for concatenation. Range: [0, 100]. A value of 0 or less means no limit. Default: no limit. |
outputMaxDelay | Int32 | Maximum text output delay (milliseconds). Cached text is forcibly output after this time. Range: [1000, 10000]. A value of 0 or less means no limit. Default: no limit. |
historySyncWithTTS | Bool | Sync LLM message history with TTS playback content. Default: false. When enabled, the saved LLM message and TTS playback content are consistent, with minor discrepancies allowed. Note: When a user interrupts the agent, the
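The outputMinLength and outputMaxDelay parameters describe a buffering policy: short text is cached until it is long enough, and cached text is forced out once it has waited too long. The sketch below models that policy locally to make the interaction between the two limits concrete. OutputBufferSketch is a hypothetical type, and the real buffering happens on the service side, so its exact behavior may differ.

```swift
// Local model of the outputMinLength / outputMaxDelay policy described above.
// Hypothetical illustration only; not the service's implementation.
struct OutputBufferSketch {
    let minLength: Int     // 0 or less means no limit
    let maxDelayMs: Int    // 0 or less means no limit
    private var cached = ""
    private var firstCachedAtMs: Int? = nil

    init(minLength: Int, maxDelayMs: Int) {
        self.minLength = minLength
        self.maxDelayMs = maxDelayMs
    }

    /// Adds a fragment at time nowMs. Returns the text to output, or nil while still caching.
    /// This sketch only checks the delay when a fragment arrives; a timer would be
    /// needed to flush text when no further fragments come in.
    mutating func push(_ fragment: String, nowMs: Int) -> String? {
        if cached.isEmpty { firstCachedAtMs = nowMs }
        cached += fragment
        let longEnough = minLength <= 0 || cached.count >= minLength
        let waitedTooLong = maxDelayMs > 0 && nowMs - (firstCachedAtMs ?? nowMs) >= maxDelayMs
        guard longEnough || waitedTooLong else { return nil }
        let output = cached
        cached = ""
        firstCachedAtMs = nil
        return output
    }
}
```

With minLength 5 and maxDelayMs 1000, fragments accumulate until the cached text reaches five characters, or are flushed early once the oldest cached fragment is one second old.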
ARTCAICallAgentAvatarConfig
Digital human configuration
Property Name | Type | Description |
agentAvatarId | String? | Digital human model ID. Empty uses the agent's default value. |
ARTCAICallAgentInterruptConfig
Interrupt configuration
Property Name | Type | Description |
enableVoiceInterrupt | Bool | Enable intelligent interruption |
interruptWords | String? | Specific words or phrases that trigger interruption |
noInterruptMode | String? | Controls the ASR text processing policy for user speech when the agent is speaking and intelligent interruption is disabled. Valid values: |
ARTCAICallAgentVoiceprintConfig
Voiceprint denoising configuration
Property Name | Type | Description |
useVoiceprint | Bool | Whether voiceprint denoising detection is applied to the current sentence segmentation |
voiceprintId | String? | Voiceprint ID. Non-empty enables voiceprint denoising for this call. |
ARTCAICallAgentTurnDetectionConfig
Turn detection configuration
Property Name | Type | Description |
turnEndWords | [String]? | Specific words to end a turn, for example: "Done" or "I'm finished" |
mode | ARTCAICallTurnDetectionMode | Method to detect when user speech ends. Default: Semantic, which uses AI for semantic analysis. |
semanticWaitDuration | Int32 | Custom wait time for semantic segmentation (milliseconds). Range: [0, 10000]. Negative values are not sent to the server (uses server default of -1, where AI automatically determines the appropriate wait time). Note: The semanticWaitDuration field has no effect in ARTCAICallTurnDetectionMode.Normal mode. |
eagerness | [String]? | This parameter is only effective when |
ARTCAICallAgentVcrResult
VCR detection result
Property Name | Type | Description |
resultData | [String]? | All VCR detection results returned by the agent |
stillFrameMotionResult | FrameMotionResult? | VCR still frame detection result |
invalidFrameMotionResult | FrameMotionResult? | VCR invalid frame detection result |
peopleCountResult | PeopleCountResult? | VCR real-time people count detection result |
equipmentResult | EquipmentResult? | VCR electronic device detection result |
headMotionResult | HeadMotionResult? | VCR head motion detection result |
lookAwayResult | LookAwayResult? | VCR gaze aversion detection result |
LookAwayResult
VCR gaze aversion detection result
Property Name | Type | Description |
count | Int32 | Total number of gaze aversions up to the current frame |
duration | Int32 | Total duration of gaze aversions up to the current frame (milliseconds) |
ARTCAICallAgentVcrConfig
VCR configuration
Property Name | Type | Description |
data | [String]? | Caches the JSON object passed by the user. This object is used later to generate a JSON string, allowing for custom extensions. |
stillFrameMotion | VCR still frame detection configuration | |
invalidFrameMotion | VCR invalid frame detection configuration | |
peopleCount | VCR real-time people count detection configuration | |
equipment | VCR electronic device detection configuration | |
headMotion | VCR head motion detection configuration | |
lookAway | VCR gaze aversion detection configuration |
ARTCAICallAgentVcrBaseConfig
Base VCR detection configuration
Property Name | Type | Description |
enable | Bool | Enable this feature. Enabled by default. |
ARTCAICallAgentVcrFrameMotionConfig
VCR video frame detection configuration
Property Name | Type | Description |
callbackDelay | Int32 | Callback trigger delay in milliseconds. Default: 3000 ms |
ARTCAICallExperimentalConfig
Experimental parameters for controlling specific logic policies
Property Name | Type | Description |
rtcSdkParams | [String: Any]? | RTC SDK parameters |
commonParams | [String: Any]? | Common parameters |
ARTCAICallAgentAutoSpeechContent
Agent speech content for auto-speech scenarios (including acknowledgments, proactive questions, etc.)
Property Name | Type | Description |
probability | Double | Trigger probability. Range: 0.0–1.0 |
text | String | Prompt text, UTF-8 encoded. Example: "Are you still there?". Maximum length: 20 characters for acknowledgments, 100 characters for auto-replies. |
ARTCAICallAgentAutoSpeechLlmPending
Auto-speech configuration for cases where the LLM response is delayed
Property Name | Type | Description |
waitTime | Int32 | Wait time threshold in milliseconds. A prompt is triggered after this duration. Range: 500–10000 ms. Cannot be empty. |
messages | Collection of waiting prompts. Maximum 10 items. Each item ≤ 100 characters. Total probability must be 1.0. |
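The constraints on waitTime and messages can be checked before a call is started. The sketch below validates them against the limits stated above. AutoSpeechItemSketch and validationErrors are hypothetical names that stand in for the SDK's own types, and the floating-point tolerance is an assumption.

```swift
// Hypothetical stand-in for ARTCAICallAgentAutoSpeechContent (probability + text).
struct AutoSpeechItemSketch {
    let probability: Double
    let text: String
}

/// Checks the documented limits: waitTime 500–10000 ms, at most 10 items,
/// each item at most 100 characters, probabilities summing to 1.0.
func validationErrors(waitTimeMs: Int, messages: [AutoSpeechItemSketch]) -> [String] {
    var errors: [String] = []
    if !(500...10000).contains(waitTimeMs) {
        errors.append("waitTime must be within 500-10000 ms")
    }
    if messages.count > 10 {
        errors.append("at most 10 messages are allowed")
    }
    if messages.contains(where: { $0.text.count > 100 }) {
        errors.append("each message must be 100 characters or fewer")
    }
    let total = messages.reduce(0.0) { $0 + $1.probability }
    if abs(total - 1.0) > 1e-6 {   // small tolerance for floating-point sums (assumption)
        errors.append("probabilities must sum to 1.0")
    }
    return errors
}
```

An empty result means the configuration satisfies every limit listed in the table.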
ARTCAICallAgentAutoSpeechUserIdle
Configuration for agent questions when the user is silent
Property Name | Type | Description |
waitTime | Int32 | Silence duration threshold in milliseconds. A question is triggered after this duration. Range: 5000–600000 ms. Recommended: 10000. |
maxRepeats | Int32 | Maximum number of questions. Range: 0–10. Recommended: 5. After this limit is reached, no more questions are triggered and the call ends. |
messages | Collection of waiting prompts. Maximum 10 items. Each item ≤ 100 characters. Total probability must be 1.0. |
ARTCAICallAgentBackChanneling
Back-channeling configuration module
Property Name | Type | Description |
enable | Bool | Whether back-channeling is enabled |
triggerStage | String | Back-channeling trigger timing |
probability | Double | Trigger probability. Range: 0.0–1.0 |
words | Collection of acknowledgment phrases. Maximum 10 items. Each item ≤ 20 characters. Total probability must be 1.0. |
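Because each phrase carries a probability and the total must be 1.0, one natural way to read the configuration is as a weighted random choice. The sketch below shows cumulative-probability selection with the random roll passed in explicitly so the result is reproducible. The pick function is hypothetical, and how the service actually selects a phrase is not documented here.

```swift
/// Selects a phrase by walking the cumulative probabilities.
/// `roll` is expected to be in [0, 1); in real use it could come from Double.random(in: 0..<1).
/// Illustrative only: the service's actual selection logic is an assumption.
func pick(from items: [(text: String, probability: Double)], roll: Double) -> String? {
    var cumulative = 0.0
    for item in items {
        cumulative += item.probability
        if roll < cumulative { return item.text }
    }
    // Fall back to the last item if rounding leaves the roll just above the total.
    return items.last?.text
}
```

With phrases weighted 0.5, 0.3, and 0.2, a roll below 0.5 selects the first phrase, a roll between 0.5 and 0.8 selects the second, and anything higher selects the third.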