Intelligent Media Services: Data Structures

Last Updated: Mar 19, 2026

Read this topic to learn about the data types used in the Android SDK.

Data Structures Overview

Note

The legacy SDK version includes deprecated parameters and methods. Upgrade to the latest SDK version. For more information, see Android User Guide.

Structure Type

Data Type

Description

Enum

ARTCAICallAgentType

AI agent type

ARTCAICallRobotState

Current robot state

AICallErrorCode

An error occurred during the AI call

VoicePrintStatusCode

Voiceprint denoising and AIVAD status code

ARTCAICallSpeakingInterruptedReason

Reason the agent speech was interrupted

ARTCAICallAudioProfile

Audio encoding configuration

ARTCAICallAudioScenario

Audio scenario configuration

ARTCAICallVideoRenderMode

Video rendering mode

ARTCAICallVideoRotationMode

Video rotation angle

ARTCAICallVideoRenderMirrorMode

Video rendering mirror mode

ARTCAICallConnectionStatus

Network connection status during the call

ARTCAICallTurnDetectionMode

Method to detect when user speech ends

Class

ARTCAICallConfig

Call configuration object

ARTCAICallVideoConfig

Video configuration parameters

ARTCAICallAudioConfig

Audio configuration parameters

ARTCAICallAgentTemplateConfig (deprecated)

Configurable parameters for the call

ARTCAICallChatSyncConfig

Configuration for synchronizing chat agent conversation history

ARTCAICallAgentInfo

Agent runtime information

ARTCAICallSendTextToAgentRequest

Request object to send text to the agent

ARTCAICallVisionCustomCaptureRequest

Request object for custom frame capture by a multimodal large language model

ARTCAICallVideoCanvas

Video rendering configuration object

ARTCAICallAgentConfig

Configurable parameters for the call

ARTCAICallAgentAsrConfig

Speech recognition configuration

ARTCAICallAgentTtsConfig

Speech synthesis configuration

ARTCAICallAgentLlmConfig

Large Language Model (LLM) configuration

ARTCAICallAgentAvatarConfig

Digital human configuration

ARTCAICallAgentInterruptConfig

Interrupt configuration

ARTCAICallAgentVoiceprintConfig

Voiceprint denoising configuration

ARTCAICallAgentTurnDetectionConfig

Turn detection configuration

ARTCAICallAgentVcrResult

VCR detection result

FrameMotionResult

Video frame detection result from VCR

PeopleCountResult

People count detection result from VCR

EquipmentResult

Electronic device detection result from VCR

HeadMotionResult

Head motion detection result from VCR

ARTCAICallAgentVcrConfig

VCR configuration

ARTCAICallAgentVcrBaseConfig

Base detection configuration for VCR

ARTCAICallAgentVcrFrameMotionConfig

Video frame detection configuration for VCR

ARTCAICallExperimentalConfig

Experimental parameters for controlling specific logic policies

ARTCAICallAgentAmbientConfig

Call environment parameters

ARTCAICallAgentAutoSpeechContent

Agent speech content for auto-speech scenarios, such as acknowledgments or proactive questions

ARTCAICallAgentAutoSpeechLlmPending

Auto-speech configuration for the agent when LLM response is delayed

ARTCAICallAgentAutoSpeechUserIdle

Configuration for agent queries when the user is silent

ARTCAICallAgentBackChanneling

Configuration module for acknowledgment speech. When enabled, the agent randomly speaks short acknowledgments at specific trigger times

Data Structure Details

Enum

ARTCAICallAgentType

The AI agent type.

Enumeration Name

Description

VoiceAgent

Voice-only call

AvatarAgent

Digital human call

VisionAgent

Visual understanding call

VideoAgent

Video call

ChatBot

Chat message interaction

ARTCAICallRobotState

The current state of the robot.

Enumeration Name

Description

Listening

Listening

Thinking

Thinking

Speaking

Speaking

AICallErrorCode

Indicates an error that occurred during the AI call.

Enumeration Name

Description

None

No error

InvalidAction

Invalid API call

InvalidParams

Invalid parameter passed to the API

StartFailed

Failed to start the call

AgentSubscriptionRequired

Call initiation failed due to exceeding the daily free trial quota

AgentNotFund

Agent not found

TokenExpired

Call authentication token expired

ConnectionFailed

Connection failed. Call disconnected

KickedByUserReplace

Same user ID joined the same call on different devices, preventing the call from proceeding

KickedBySystem

Kicked out by the system, preventing the call from proceeding

LocalDeviceException

Local device issue prevented the call from proceeding

AgentLeaveChannel

Agent left the channel. Call ended

AgentConcurrentLimit

Digital human agent reached concurrent limit

AgentAudioSubscribeFailed

Failed to subscribe to agent audio

AiAgentAsrUnavailable

Third-party ASR service failed to start

AvatarAgentUnavailable

Digital human service unavailable

ChatLogNotFound

Chat log not found

InternalError

Internal error

VoicePrintStatusCode

The status code for voiceprint denoising and AIVAD.

Enumeration Name

Description

Disable

Voiceprint denoising VAD disabled and AIVAD disabled

EnableWithoutRegister

Voiceprint denoising VAD enabled but voiceprint registration incomplete

SpeakerRecognized

Voiceprint denoising VAD enabled and speaker recognized

SpeakerNotRecognized

Voiceprint denoising VAD enabled but speaker not recognized

DetectedSpeakerWithAIVad

AIVAD enabled and speaker detected

UndetectedSpeakerWithAIVad

AIVAD enabled but speaker not detected

Unknown

Unknown status

ARTCAICallSpeakingInterruptedReason

The reason why the agent's speech was interrupted.

Enumeration Name

Description

UnKnown

Unknown reason

ByWorks

Specific interrupt words detected

ByVoice

Voice interruption detected

ByInterruptSpeaking

Interrupted by calling the interruptSpeaking() API

BySpeechBroadCast

Interrupted by proactive speech broadcast

ByLlmQuery

Interrupted by proactive LLM query

ARTCAICallAudioProfile

The audio encoding configuration.

Enumeration Name

Description

ARTCAICallAudioLowQualityMode

Low-quality audio mode. Default sample rate: 8000 Hz. Mono. Max bitrate: 12 kbps

ARTCAICallAudioBasicQualityMode

Standard-quality audio mode. Default sample rate: 16000 Hz. Mono. Max bitrate: 24 kbps

ARTCAICallAudioHighQualityMode

(Default) High-quality audio mode. Default sample rate: 48000 Hz. Mono. Max bitrate: 64 kbps

ARTCAICallAudioStereoHighQualityMode

Stereo high-quality audio mode. Default sample rate: 48000 Hz. Stereo. Max bitrate: 80 kbps

ARTCAICallAudioSuperHighQualityMode

Super-high-quality audio mode. Default sample rate: 48000 Hz. Mono. Max bitrate: 96 kbps

ARTCAICallAudioStereoSuperHighQualityMode

Stereo super-high-quality audio mode. Default sample rate: 48000 Hz. Stereo. Max bitrate: 128 kbps
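
The profile table above can be captured as data when an application needs to reason about sample rate, channel count, or bitrate. The following is a minimal sketch using a local stand-in enum; `AudioProfileSketch` and `cheapestFor` are hypothetical names for illustration and are not part of the SDK. Only the numbers come from the table.

```java
/** Local stand-in for the documented audio profiles; not the SDK's own enum. */
enum AudioProfileSketch {
    LOW_QUALITY(8000, 1, 12),
    BASIC_QUALITY(16000, 1, 24),
    HIGH_QUALITY(48000, 1, 64),            // documented default
    STEREO_HIGH_QUALITY(48000, 2, 80),
    SUPER_HIGH_QUALITY(48000, 1, 96),
    STEREO_SUPER_HIGH_QUALITY(48000, 2, 128);

    final int sampleRateHz;
    final int channels;
    final int maxBitrateKbps;

    AudioProfileSketch(int sampleRateHz, int channels, int maxBitrateKbps) {
        this.sampleRateHz = sampleRateHz;
        this.channels = channels;
        this.maxBitrateKbps = maxBitrateKbps;
    }

    /**
     * Returns the profile with the lowest maximum bitrate that matches the
     * requested channel layout and reaches the requested sample rate,
     * or null if no documented profile qualifies.
     */
    static AudioProfileSketch cheapestFor(boolean stereo, int minSampleRateHz) {
        AudioProfileSketch best = null;
        for (AudioProfileSketch p : values()) {
            boolean layoutMatches = (p.channels == 2) == stereo;
            if (!layoutMatches || p.sampleRateHz < minSampleRateHz) {
                continue;
            }
            if (best == null || p.maxBitrateKbps < best.maxBitrateKbps) {
                best = p;
            }
        }
        return best;
    }
}
```

The selected constant would still have to be mapped to the corresponding ARTCAICallAudioProfile value before being assigned to audioProfile.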

ARTCAICallAudioScenario

The audio scenario configuration.

Enumeration Name

Description

ARTCAICallAudioSceneDefaultMode

Recommended for general real-time communication scenarios

ARTCAICallAudioSceneMusicMode

High-fidelity music audio quality. Recommended for music instruction and other scenarios requiring high music quality

ARTCAICallVideoRenderMode

The video rendering mode.

Enumeration Name

Description

ARTCAICallVideoRenderModeAuto

Automatic mode

ARTCAICallVideoRenderModeStretch

Stretch tiling mode. Stretches input video to match the stream ingest aspect ratio. May distort the image

ARTCAICallVideoRenderModeFill

Cropping mode. Crops input video width or height to match the stream ingest aspect ratio. May lose image content

ARTCAICallVideoRenderModeNoChange

No change

ARTCAICallVideoRotationMode

The video rotation angle.

Enumeration Name

Description

ARTCAICallVideoRotationMode_0

0 degrees

ARTCAICallVideoRotationMode_90

90 degrees

ARTCAICallVideoRotationMode_180

180 degrees

ARTCAICallVideoRotationMode_270

270 degrees

ARTCAICallVideoRenderMirrorMode

The video rendering mirror mode.

Enumeration Name

Description

ARTCAICallVideoRenderMirrorModeOnlyFront

Mirror only front camera preview. No mirroring for others

ARTCAICallVideoRenderMirrorModeAllEnabled

Mirror all

ARTCAICallVideoRenderMirrorModeAllDisable

Do not mirror any

ARTCAICallTurnDetectionMode

The method used to detect the end of a user's speech.

Enumeration Name

Description

ARTCAICallTurnDetectionNormalMode

Normal mode. Does not use AI for semantic analysis. Uses ASR silence duration to detect speech end

ARTCAICallTurnDetectionSemanticMode

Semantic mode. Uses AI to analyze context and semantics to detect speech end

ARTCAICallConnectionStatus

The network connection status during the call.

Enumeration Name

Value

Description

ARTCAICallConnectionStatusInit

0

Initialization complete

ARTCAICallConnectionStatusDisconnected

1

Network connection disconnected

ARTCAICallConnectionStatusConnecting

2

Establishing network connection

ARTCAICallConnectionStatusConnected

3

Network connected

ARTCAICallConnectionStatusReconnecting

4

Re-establishing network connection

ARTCAICallConnectionStatusFailed

5

Network connection failed
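
When a status arrives as a raw integer (for example in logs), the values in the table above can be mapped back to names. The sketch below uses a local stand-in enum; `ConnectionStatusSketch`, `fromValue`, and `isTransient` are hypothetical helpers, and treating Connecting and Reconnecting as "transient" is an interpretation, not something the SDK defines.

```java
/** Local stand-in mirroring the documented values; not the SDK's own enum. */
enum ConnectionStatusSketch {
    INIT(0),
    DISCONNECTED(1),
    CONNECTING(2),
    CONNECTED(3),
    RECONNECTING(4),
    FAILED(5);

    final int value;

    ConnectionStatusSketch(int value) {
        this.value = value;
    }

    /** Returns the status for a raw value, or null if the value is not in the table. */
    static ConnectionStatusSketch fromValue(int raw) {
        for (ConnectionStatusSketch s : values()) {
            if (s.value == raw) {
                return s;
            }
        }
        return null;
    }

    /** True while a connection attempt is in progress (assumed grouping). */
    boolean isTransient() {
        return this == CONNECTING || this == RECONNECTING;
    }
}
```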

Class

ARTCAICallConfig

The call configuration object.

Parameter

Type

Description

agentId

String

Agent ID

agentType

ARTCAICallAgentType

Agent type. Must match the type of agentId. Otherwise, agent startup fails

agentUserId

String

User ID for the agent. If empty, the agent service assigns a UID

region

String

Region where the agent service resides. Must match the region of agentId. Otherwise, agent startup fails

userData

String

User-defined information. Passed to the agent

enableAudioDelayInfo

boolean

Enable audio delay statistics. Enabled by default

agentConfig

ARTCAICallAgentConfig

agentConfig parameter used to start the call

audioConfig

ARTCAICallAudioConfig

Local audio configuration

videoConfig

ARTCAICallVideoConfig

Local video configuration. Applies only to VisionAgent or VideoAgent

chatSyncConfig

ARTCAICallChatSyncConfig

Associated chat agent configuration. If set, call history syncs to the chat agent during the call

mAiCallVideoConfig

ARTCAICallVideoConfig

Video-related configuration (deprecated starting with version 2.5)

mAliCallAudioConfig

ARTCAICallAudioConfig

Audio-related configuration (deprecated starting with version 2.5)

mAiCallAgentTemplateConfig

ARTCAICallAgentTemplateConfig (deprecated)

Configurable parameters for AI audio-video calls (deprecated starting with version 2.5)

mAiCallChatSyncConfig

ARTCAICallChatSyncConfig

Chat synchronization configuration. Integrates call agent and chat agent conversation content into one session (deprecated starting with version 2.5)

ARTCAICallVideoConfig

The video configuration parameters.

Parameter

Type

Description

useHighQualityPreview

boolean

Use local high-definition preview

useFrontCameraDefault

boolean

Start front camera by default

cameraCaptureFrameRate

int

Camera capture frame rate

useSurfaceView

boolean

Use SurfaceView for rendering. If false, use TextureView

videoEncoderWidth

int

Video encoding width

videoEncoderHeight

int

Video encoding height

videoEncoderFrameRate

int

Video encoding frame rate

videoEncoderBitRate

int

Video encoding bitrate

videoEncoderKeyFrameInterval

int

Keyframe interval in milliseconds

isCameraMute

boolean

Mute local video. Not muted by default

ARTCAICallAudioConfig

The audio configuration parameters.

Parameter

Type

Description

audioProfile

ARTCAICallAudioProfile

Audio encoding configuration. Default: ARTCAICallAudioHighQualityMode. Set this parameter to specify sample rate and number of sound channels

audioScenario

ARTCAICallAudioScenario

Audio scenario configuration. Default: ARTCAICallAudioSceneMusicMode. Use ARTCAICallAudioSceneDefaultMode for Bluetooth audio capture

enableSpeaker

boolean

Play audio through speaker or earpiece. Speaker enabled by default. Set to false to switch to earpiece

isMicrophoneOn

boolean

Enable microphone. Enabled by default

ARTCAICallAgentTemplateConfig (deprecated)

The configurable parameters for the call.

Important

This type is deprecated starting with version 2.5. Use ARTCAICallAgentConfig instead.

Parameter

Type

Description

aiAgentId

String

Agent ID. Set this field when initiating an agent call using the AICallKit call interface.

Note

Agent ID must be set and cannot be empty.

aiAgentRegion

String

Region where the agent service resides. Default: cn-shanghai.

Note

Set the region based on your agent.

aiAgentUserId

String

User ID associated with the agent. If not set, the AI server generates a random user ID

userExtendData

String

Business extension information. Must be a JSON string. Passed to the LLM.

aiAgentGreeting

String

Agent greeting. Spoken by the AI agent when the user joins the call

aiAgentUserOnlineTimeout

int

Time in seconds before the agent closes the task if the user does not join the call. Default: 60 seconds

aiAgentUserOfflineTimeout

int

Time in seconds before the agent closes the task after the user leaves the call. Default: 5 seconds

aiAgentWorkflowOverrideParams

String

Workflow override parameters. Default: none

aiAgentBailianAppParams

String

Bailian application center parameters. For more information, see Passing User Information to Bailian.

aiAgentAsrMaxSilence

int

Maximum silence duration for speech recognition in milliseconds. Range: 200–1200 ms. Default: 400 ms

aiAgentVolume

int

Agent speaking volume. Range: -1 to 100. Default: -1. If unset, uses Alibaba Cloud's recommended adaptive volume mode

enableVoiceInterrupt

boolean

Enable voice interruption. Default: true

enableIntelligentSegment

boolean

Enable intelligent sentence segmentation. Default: true

enableVoicePrint

boolean

Enable voiceprint recognition. Default: false. To enable, set enableVoicePrint to true and provide voiceprintId

voiceprintId

String

Voiceprint ID. If enableVoicePrint is true and voiceprintId is non-empty, voiceprint denoising is enabled. Empty means voiceprint denoising is disabled

aiAgentVoiceId

String

Agent voice ID

aiAgentMaxIdleTime

int

Maximum idle time in seconds before the agent goes offline automatically. Set to -1 to disable automatic logout. Default: 600 seconds

llmHistoryLimit

int

Maximum number of LLM/multimodal LLM conversation history turns to retain. Less than zero uses the server default: 10 turns

aiAgentGracefulShutdown

boolean

Enable graceful shutdown. Default: false

Note

Graceful shutdown means the agent finishes speaking its current sentence before stopping. Maximum duration: 10 seconds

enablePushToTalk

boolean

Enable push-to-talk mode. Default: false

aiAgentAvatarId

String

Digital human model ID. Specify when the agent type is AvatarAgent

asrLanguageId

String

ASR language ID. Empty uses the agent's default configuration. Options:

  • zh_mandarin: Chinese

  • en: English

  • zh_en: Chinese-English mixed

  • es: Spanish

  • jp: Japanese

wakeUpQuery

String

Wake-up phrase. Spoken by the user before the call starts. The agent responds immediately after the call starts. Example: "What's the weather like today?"

llmSystemPrompt

String

LLM system prompt. Example: "You are a friendly and helpful assistant focused on providing accurate information and advice."

Note

Not supported for LLM nodes configured as Bailian workflows.

interruptWords

List<String>

Trigger words for interrupting the conversation. Examples: "Hold on", "I know"

aiAgentLlmHistoryLimit

int

Maximum number of LLM/multimodal LLM conversation history turns to retain. Default: 10 turns

aiAgentVadLevel

int

VAD sensitivity setting for AIVAD. Higher values increase resistance to voice interference. Range: [0–11]. Default: 11

  • 0 disables VAD.

  • 1–10: Higher numbers make interruption harder.

  • 11 significantly improves voice preservation and noise resistance.

ARTCAICallChatSyncConfig

The configuration for synchronizing the chat agent's conversation history.

Parameter

Type

Description

sessionId

String

Unique identifier for a user-agent conversation. Default: empty

chatBotAgentId

String

Agent ID for the associated chat bot

receiverId

String

Receiver ID for the chat bot agent. This is the user's user ID

ARTCAICallAgentInfo

The agent's runtime information.

Parameter

Type

Description

agentId

String

Current agent ID

agentType

ARTCAICallAgentType

Agent type

agentUserId

String

User ID for the agent in the RTC channel

channelId

String

RTC channel ID where the agent resides

instanceId

String

Instance ID for the current agent. Assigned by the system when the agent starts. Used to identify and track the agent's full lifecycle and runtime status

requestId

String

Request ID for the current agent

ARTCAICallSendTextToAgentRequest

The request object for sending text to the agent.

Parameter

Type

Description

text

String

Text message sent to the agent. Example: "What is this?"

ARTCAICallVisionCustomCaptureRequest

The request object for custom frame capture by a multimodal large language model.

Parameter

Type

Description

text

String

Text parameter for the multimodal large language model request

enableASR

boolean

Include ASR results from user speech as input to the large language model. If true, use ASR results and captured frames. If false, use the text field and captured frames

Note

Applies only during continuous frame capture

isSingle

boolean

When set to true, initiates a single-frame capture and immediately exits the custom frame-capture state.

A value of false enables continuous frame capturing, which automatically stops after the specified duration.

eachDuration

int

Frame capture interval in seconds

num

int

Number of images captured in each frame capture

duration

int

Duration for continuous frame capture in seconds. Ignored if isSingle=true. Custom frame capture ends when this duration expires

userData

String

JSON string containing custom business information. Sent to the large language model along with text and captured frames for enterprise processing

ARTCAICallVideoCanvas

The video rendering configuration object.

Parameter

Type

Description

renderMode

ARTCAICallVideoRenderMode

Rendering mode. Default: ARTCAICallVideoRenderModeAuto

mirrorMode

ARTCAICallVideoRenderMirrorMode

Mirror mode. Default: ARTCAICallVideoRenderMirrorModeOnlyFront

rotationMode

ARTCAICallVideoRotationMode

Rotation angle. Default: ARTCAICallVideoRotationMode_0

zOrderOnTop

boolean

Set whether SurfaceView should appear above all other windows. Default: true

zOrderMediaOverlay

boolean

Set whether SurfaceView should appear above MediaPlayer, Camera, and similar windows. Default: true

ARTCAICallAgentConfig

The configurable parameters for the call.

Parameter

Type

Description

agentGreeting

String

Agent greeting. Empty uses the agent's configuration value. Maximum length: 100 characters

wakeUpQuery

String

Instruction spoken by the user before the call starts. The agent responds immediately after the call starts

agentMaxIdleTime

int

Maximum idle time in seconds before the agent goes offline automatically. Default: 600 seconds

userOnlineTimeout

int

Time in seconds before the agent closes the task if the user does not join the call. Default: 60 seconds

userOfflineTimeout

int

Time in seconds before the agent closes the task after the user leaves the call. Default: 5 seconds

enablePushToTalk

boolean

Enable push-to-talk mode

agentGracefulShutdown

boolean

Enable graceful shutdown: finish speaking the current sentence before stopping

volume

int

Agent speaking volume. Range: 0–400. Default: 100

workflowOverrideParams

String

Workflow override parameters

enableIntelligentSegment

boolean

Enable intelligent sentence segmentation

asrConfig

ARTCAICallAgentAsrConfig

Speech recognition configuration

ttsConfig

ARTCAICallAgentTtsConfig

Speech synthesis configuration

llmConfig

ARTCAICallAgentLlmConfig

Large Language Model (LLM) configuration

avatarConfig

ARTCAICallAgentAvatarConfig

Digital human configuration

interruptConfig

ARTCAICallAgentInterruptConfig

Interrupt configuration

voiceprintConfig

ARTCAICallAgentVoiceprintConfig

Voiceprint denoising configuration

turnDetectionConfig

ARTCAICallAgentTurnDetectionConfig

Turn detection configuration

experimentalConfig

ARTCAICallExperimentalConfig

Non-production customization configuration

ARTCAICallAgentVcrConfig

ARTCAICallAgentVcrConfig

VCR configuration

preConnectAudioUrl

String

Audio effect played after connection and before the greeting. Supports URL input. Greeting still plays after the audio effect

ambientConfig

ARTCAICallAgentAmbientConfig

Ambient sound configuration

backChannelingConfig

List<ARTCAICallAgentBackChanneling>

Configuration module for acknowledgment speech. After configuration, the system randomly speaks short acknowledgments at specific trigger times

autoSpeechForLlmPendingConfig

ARTCAICallAgentAutoSpeechLlmPending

Auto-speech configuration for the agent when LLM response is delayed

autoSpeechForUserIdleConfig

ARTCAICallAgentAutoSpeechUserIdle

Configuration for agent queries when the user is silent

ARTCAICallAgentAsrConfig

The speech recognition configuration.

Parameter

Type

Description

asrLanguageId

String

ASR language ID. Empty uses the agent's configuration value

asrMaxSilence

int

Speech segmentation threshold. Speech ends if silence exceeds this duration. Default: 400 ms. Range: 200–1200 ms

asrHotWords

List<String>

ASR hot word list. Limit: up to 500 words. Each word: up to 10 characters

vadLevel

int

VAD sensitivity setting for AIVAD. Default: 11. Range: [0–11]

  • 0 disables VAD.

  • 1–10: Higher numbers make interruption harder.

  • 11 differs significantly from previous versions, reducing voice distortion during preprocessing and providing greater resistance to interference.

customParams

String

Runtime parameters for custom ASR integration. Use URL parameter format. Example: "mode=fast&sample=16000&format=wav"

vadDuration

int

Minimum duration threshold for voice activity detection. Controls interruption sensitivity. Default: 0 (disabled). Valid range: 200–2000 ms. Typical values: 200–500 ms, roughly the time needed to speak 1–4 characters

asrMaxSilence

Int32

Speech segmentation threshold. Speech ends if silence exceeds this duration. Range: 200–1200 ms. Default: -1 (uses agent's default configuration from the console)
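
Two of the fields above have a fixed shape that is easy to get wrong: customParams uses URL parameter format, and asrMaxSilence has a bounded range. The following sketch shows one way to build and check them before assigning the values. `AsrConfigSketch` and its methods are hypothetical helpers, not SDK APIs, and the sketch assumes keys and values need no percent-encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class AsrConfigSketch {

    /** Joins entries as key=value pairs separated by '&', in insertion order. */
    static String toCustomParams(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    /** Accepts the documented 200 to 1200 ms range, plus -1 for the console default. */
    static boolean isValidMaxSilence(int ms) {
        return ms == -1 || (ms >= 200 && ms <= 1200);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("mode", "fast");
        params.put("sample", "16000");
        params.put("format", "wav");
        System.out.println(toCustomParams(params)); // prints mode=fast&sample=16000&format=wav
        System.out.println(isValidMaxSilence(400));  // prints true
    }
}
```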

ARTCAICallAgentTtsConfig

The speech synthesis configuration.

Parameter

Type

Description

agentVoiceId

String

Agent voice ID. Empty uses the agent's configuration value

pronunciationRules

List

Pronunciation rule array. Supports up to 20 rules. If null or empty, no rules apply. Example:

[
  {
    "Word": "overlap",               // Target word
    "Pronunciation": "chǒng dié",    // Replacement pronunciation
    "Type": "replacement"            // Homograph rule
  },
  {
    "Word": "action",
    "Pronunciation": "háng dòng",
    "Type": "replacement"
  }
]

speechRate

double

TTS speech rate. Applies to all TTS types. Range: [0.5, 2.0]. Default: 1.0. Values less than 0 use console configuration

languageId

String

TTS language code. Valid only for MiniMax TTS type

emotion

String

TTS emotion type. Valid only for MiniMax TTS type

modelId

String

TTS model ID. Currently supports only minimax. Options: speech-01-turbo / speech-02-turbo

speechRate

Double

TTS speech rate. Applies to all TTS types. Range: [0.5, 2.0]. Default: -1. Values less than 0 use agent's default configuration (console configuration)
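
Because pronunciationRules accepts at most 20 rules, it can help to enforce that limit while the list is assembled. The sketch below is a local illustration only: `PronunciationRulesSketch` and `Rule` are hypothetical types whose fields mirror the "Word", "Pronunciation", and "Type" keys in the example, and the element type the SDK actually expects in the list is not stated in this topic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PronunciationRulesSketch {

    static final int MAX_RULES = 20; // documented upper bound

    /** One rule; fields mirror the keys used in the documented example. */
    static final class Rule {
        final String word;
        final String pronunciation;
        final String type;

        Rule(String word, String pronunciation, String type) {
            this.word = word;
            this.pronunciation = pronunciation;
            this.type = type;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Adds a "replacement" rule and rejects any rule beyond the 20th. */
    PronunciationRulesSketch addReplacement(String word, String pronunciation) {
        if (rules.size() >= MAX_RULES) {
            throw new IllegalStateException("at most " + MAX_RULES + " pronunciation rules are supported");
        }
        rules.add(new Rule(word, pronunciation, "replacement"));
        return this;
    }

    int size() {
        return rules.size();
    }

    /** Read-only view of the collected rules. */
    List<Rule> asList() {
        return Collections.unmodifiableList(rules);
    }
}
```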

ARTCAICallAgentLlmConfig

The Large Language Model (LLM) configuration.

Parameter

Type

Description

llmHistoryLimit

int

Maximum number of LLM/multimodal LLM conversation history turns to retain. Default: -1. Values less than 0 use agent's default configuration (console configuration)

llmSystemPrompt

String

LLM system prompt

bailianAppParams

String

Bailian application center parameters

llmCompleteReply

boolean

Send complete LLM results

Note

When enabled, the full LLM result is returned via the onLLMReplyCompleted event callback after LLM generation completes

openAIExtraQuery

String

Additional OpenAI protocol LLM query parameters

Note

Parameters must use key=value format. Separate multiple parameters with &. All values must be strings

outputMinLength

int

Minimum text output length in characters. Text shorter than this is cached for concatenation. Range: [0, 100]. Values ≤ 0 mean no limit. Default: no limit

outputMaxDelay

int

Maximum text output delay in milliseconds. Force output cached text after this time. Range: [1000, 10000]. Values ≤ 0 mean no limit. Default: no limit

historySyncWithTTS

boolean

Sync large model message history with TTS playback content. Default: false. When enabled, message history matches TTS playback content with minor tolerance

Note

When the user interrupts the agent, the next message sent to the large model inserts the <ims_agent_interrupted> tag at the interruption point. Example:

[
  {"role": "user", "content": "Tell me a story."},
  {"role": "assistant", "content": "Sure, I'll tell you a story from Romance of the Three Kingdoms. Do you<ims_agent_interrupted> want to hear it?"},
  {"role": "user", "content": "Let's pick another one."}
]
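
If your own LLM service receives this history, it may want to separate the text before the tag from the text after it. The following is a minimal sketch; `InterruptedReplySketch` is a hypothetical helper, and reading the text before the tag as "already spoken" and the text after it as "not played" is an interpretation of "interruption point", not a documented guarantee.

```java
final class InterruptedReplySketch {

    static final String TAG = "<ims_agent_interrupted>";

    /** True if the assistant message carries the interruption tag. */
    static boolean wasInterrupted(String content) {
        return content.contains(TAG);
    }

    /** Text before the tag, or the whole content if no tag is present. */
    static String beforeInterruption(String content) {
        int i = content.indexOf(TAG);
        return i < 0 ? content : content.substring(0, i);
    }

    /** Text after the tag, or an empty string if no tag is present. */
    static String afterInterruption(String content) {
        int i = content.indexOf(TAG);
        return i < 0 ? "" : content.substring(i + TAG.length());
    }

    public static void main(String[] args) {
        String content = "Sure, I'll tell you a story from Romance of the Three Kingdoms. "
                + "Do you<ims_agent_interrupted> want to hear it?";
        System.out.println(beforeInterruption(content));
        System.out.println(afterInterruption(content));
    }
}
```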

ARTCAICallAgentAvatarConfig

The digital human configuration.

Parameter

Type

Description

agentAvatarId

String

Digital human model ID. Empty uses the agent's configuration value

ARTCAICallAgentInterruptConfig

The interrupt configuration.

Parameter

Type

Description

enableVoiceInterrupt

boolean

Enable intelligent interruption

interruptWords

List<String>

Specific words or phrases that trigger conversation interruption

noInterruptMode

String

ASR text handling strategy when the agent speaks and intelligent interruption is disabled. Options:

  • cache: Cache ASR text. Process it after the current turn ends

  • discard: Discard ASR text immediately

  • Other values (including empty): Use server default configuration

ARTCAICallAgentVoiceprintConfig

The voiceprint denoising configuration.

Parameter

Type

Description

useVoiceprint

boolean

Whether to use voiceprint denoising detection for sentence segmentation

voiceprintId

String

Voiceprint ID. Non-empty means voiceprint denoising is enabled

ARTCAICallAgentTurnDetectionConfig

The turn detection configuration.

Parameter

Type

Description

turnEndWords

List<String>

Words that signal speech end. Examples: "Done", "I'm finished"

mode

ARTCAICallTurnDetectionMode

Method to detect when user speech ends. Default: ARTCAICallTurnDetectionSemanticMode (semantic mode), which uses AI to analyze context and semantics to detect speech end

semanticWaitDuration

int

Custom wait time for semantic speech end detection in milliseconds. Range: [0, 10000]. Default: -1 (AI selects optimal wait time)

Note

Ignored if mode is ARTCAICallTurnDetectionNormalMode

eagerness

String

Applies only when mode is ARTCAICallTurnDetectionSemanticMode. Overrides semanticWaitDuration. Controls how quickly the AI responds after detecting a user pause:

  • Low: Wait up to 6 seconds. Reduces false interruptions

  • Medium: Wait up to 4 seconds. Balanced for most scenarios

  • High: Wait up to 2 seconds. Faster interaction but higher false interruption risk

  • Other values (including empty): Use server default configuration
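
The eagerness levels listed above translate directly into an upper bound on waiting time, which can be useful for UI hints such as a "still listening" indicator. The sketch below treats eagerness as a plain string and assumes exact, case-sensitive matching; `EagernessSketch` is a hypothetical helper and -1 is a local convention for "server default".

```java
final class EagernessSketch {

    /**
     * Returns the documented maximum wait in seconds for a level,
     * or -1 when the value falls back to the server default.
     */
    static int maxWaitSeconds(String eagerness) {
        if (eagerness == null) {
            return -1;
        }
        switch (eagerness) {
            case "Low":
                return 6;
            case "Medium":
                return 4;
            case "High":
                return 2;
            default:
                return -1;
        }
    }
}
```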

ARTCAICallAgentVcrResult

The VCR detection result.

Parameter

Type

Description

resultData

Object

All VCR detection results returned by the agent

stillFrameMotionResult

FrameMotionResult

VCR still frame detection result

invalidFrameMotionResult

FrameMotionResult

VCR invalid frame detection result

peopleCountResult

PeopleCountResult

VCR real-time people count detection result

equipmentResult

EquipmentResult

VCR electronic device detection result

headMotionResult

HeadMotionResult

VCR head motion detection result

lookAwayResult

LookAwayResult

VCR gaze deviation detection result

LookAwayResult

The VCR gaze deviation detection result.

Property Name

Type

Description

count

int

Total number of gaze deviations detected up to the current frame

duration

int

Total duration of gaze deviation up to the current frame in milliseconds

FrameMotionResult

The VCR video frame detection result.

Parameter

Type

Description

duration

int

Time elapsed since sending this frame in milliseconds

PeopleCountResult

The VCR people count detection result.

Parameter

Type

Description

count

int

Number of people detected by VCR

EquipmentResult

The VCR electronic device detection result.

Parameter

Type

Description

mobilePhoneCount

int

Number of mobile phones

watchCount

int

Number of watches

headPhoneCount

int

Number of headphones

HeadMotionResult

The VCR head motion detection result.

Parameter

Type

Description

nodDetected

boolean

Whether a nod was detected

shakeDetected

boolean

Whether a head shake was detected

ARTCAICallAgentVcrConfig

The VCR configuration.

Parameter

Type

Description

data

JSONObject

JSON object passed by the user. Cached for later use when generating JSON strings. Enables custom extension

stillFrameMotion

ARTCAICallAgentVcrFrameMotionConfig

VCR still frame detection configuration

invalidFrameMotion

ARTCAICallAgentVcrFrameMotionConfig

VCR invalid frame detection configuration

peopleCount

ARTCAICallAgentVcrBaseConfig

VCR real-time people count detection configuration

equipment

ARTCAICallAgentVcrBaseConfig

VCR electronic device detection configuration

headMotion

ARTCAICallAgentVcrBaseConfig

VCR head motion detection configuration

lookAway

ARTCAICallAgentVcrBaseConfig

VCR gaze deviation detection configuration

ARTCAICallAgentVcrBaseConfig

The VCR base detection configuration.

Parameter

Type

Description

enable

boolean

Whether to enable this detection

ARTCAICallAgentVcrFrameMotionConfig

The VCR video frame detection configuration.

Parameter

Type

Description

callbackDelay

int

Delay in milliseconds before triggering the callback

ARTCAICallExperimentalConfig

The experimental parameters for controlling specific logic policies.

Parameter

Type

Description

rtcSdkParams

JSONObject

RTC SDK parameters

commonParams

JSONObject

Common parameters

ARTCAICallAgentAmbientConfig

The call environment parameters.

Property Name

Type

Description

volume

int

Ambient sound volume. Default: 100

resourceId

String

Resource ID registered for ambient sound in the console. Empty string disables ambient sound

ARTCAICallAgentAutoSpeechContent

The agent speech content for auto-speech scenarios, such as acknowledgments or proactive questions.

Property Name

Type

Description

probability

Double

Trigger probability. Range: 0.0–1.0

text

String

Prompt text. UTF-8 encoded. Example: "Are you still there?" Maximum length: 20 characters for acknowledgment speech. Maximum length: 100 characters for auto-reply scenarios

ARTCAICallAgentAutoSpeechLlmPending

The auto-speech configuration for the agent when the LLM response is delayed.

Property Name

Type

Description

waitTime

int

Wait threshold in milliseconds. Trigger prompt after exceeding this duration. Range: 500–10000 ms. Required

messages

List<ARTCAICallAgentAutoSpeechContent>

Collection of waiting prompts. Maximum: 10 items. Each item ≤ 100 characters. Total probability must equal 1.0
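
The constraints on messages (at most 10 items, a per-item length limit, probabilities that sum to 1.0) can be checked on the client before the configuration is sent. The following is a minimal sketch with local stand-in types; `AutoSpeechSketch` and `Content` are hypothetical, the length is counted in Unicode code points, and the small tolerance on the sum is an assumption to absorb floating-point rounding.

```java
import java.util.ArrayList;
import java.util.List;

final class AutoSpeechSketch {

    /** Local stand-in with the two documented fields: text and probability. */
    static final class Content {
        final String text;
        final double probability;

        Content(String text, double probability) {
            this.text = text;
            this.probability = probability;
        }
    }

    /** Returns the rule violations found; an empty list means all checks passed. */
    static List<String> validate(List<Content> messages, int maxTextLength) {
        List<String> problems = new ArrayList<>();
        if (messages.size() > 10) {
            problems.add("more than 10 items");
        }
        double total = 0.0;
        for (Content c : messages) {
            if (c.probability < 0.0 || c.probability > 1.0) {
                problems.add("probability out of range for: " + c.text);
            }
            if (c.text.codePointCount(0, c.text.length()) > maxTextLength) {
                problems.add("text too long: " + c.text);
            }
            total += c.probability;
        }
        if (Math.abs(total - 1.0) > 1e-9) {
            problems.add("probabilities sum to " + total + ", expected 1.0");
        }
        return problems;
    }
}
```

Passing 100 as maxTextLength matches the limit stated for waiting prompts; acknowledgment speech uses 20.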

ARTCAICallAgentAutoSpeechUserIdle

The configuration for agent queries when the user is silent.

Property Name

Type

Description

waitTime

int

Silence threshold in milliseconds. Trigger query after exceeding this duration. Range: 5000–600000 ms. Recommended: 10000 ms

messages

List<ARTCAICallAgentAutoSpeechContent>

Collection of waiting prompts. Maximum: 10 items. Each item ≤ 100 characters. Total probability must equal 1.0

maxRepeats

int

Maximum number of queries. Range: 0–10. Recommended: 5. After exceeding this, stop querying and end the call

ARTCAICallAgentBackChanneling

The configuration module for acknowledgment speech.

Property Name

Type

Description

enable

boolean

Whether acknowledgment speech is enabled

triggerStage

String

Stage at which acknowledgment speech is triggered

probability

double

Trigger probability. Range: 0.0–1.0

words

List<ARTCAICallAgentAutoSpeechContent>

Collection of acknowledgment speech content. Maximum: 10 items. Each item ≤ 20 characters. Total probability must equal 1.0

IARTCAICallService Details

generateAIAgentShareCall

Requests to start a shared agent call.

/**
 * Request to start a shared agent call
 * @param userId User ID of the currently logged-in user
 * @param aiAgentId Agent ID
 * @param aiAgentType Agent type
 * @param artcaiCallConfig Agent configuration
 * @param callback Callback for the request
 */
void generateAIAgentShareCall(String userId, String aiAgentId, ARTCAICallEngine.ARTCAICallAgentType aiAgentType, ARTCAICallEngine.ARTCAICallConfig artcaiCallConfig, IARTCAICallServiceCallback callback);

ARTCAIAgentUtil Details

parseAiAgentShareInfo

Parses shared agent information.

/**
 * Parse shared agent information
 * @param shareInfoText Shared agent information text
 * @return Structured configuration for the shared agent
 */
public static ARTCAIAgentShareInfo parseAiAgentShareInfo(String shareInfoText);

parseAiAgentInfo

Parses agent startup response information.

/**
 * Parse agent startup response information
 * @param jsonObject Agent startup response information
 * @return Structured information for the agent startup response
 */
public static ARTCAIAgentInfo parseAiAgentInfo(JSONObject jsonObject);