Intelligent Media Services: Data structures

Last Updated: Nov 13, 2025

This topic describes the data structures used by the Android software development kit (SDK).

Data structure overview

Note

Older versions of the SDK contain deprecated parameters and methods. We recommend that you upgrade to the latest version. For more information, see Android User Guide.

Structure type

Data type

Description

Enum

ARTCAICallAgentType

AI agent type

ARTCAICallRobotState

The current state of the robot

AICallErrorCode

The error code for the current AI call.

VoicePrintStatusCode

The status code returned by automatic speech recognition (ASR).

ARTCAICallSpeakingInterruptedReason

The reason why the agent's speech was interrupted.

ARTCAICallAudioProfile

Audio encoding configuration

ARTCAICallAudioScenario

Audio scenario configuration

ARTCAICallVideoRenderMode

Video rendering mode

ARTCAICallVideoRotationMode

Video rotation angle

ARTCAICallVideoRenderMirrorMode

Video rendering mirror mode

ARTCAICallTurnDetectionMode

The mode for determining whether the user has finished speaking.

Class

ARTCAICallConfig

Call configuration object

ARTCAICallVideoConfig

Video configuration parameters

ARTCAICallAudioConfig

Audio configuration parameters

ARTCAICallAgentTemplateConfig (Deprecated)

Configurable parameters for a call

ARTCAICallChatSyncConfig

Configuration for synchronizing agent chat records in a message-based conversation.

ARTCAICallAgentInfo

Agent runtime information

ARTCAICallSendTextToAgentRequest

Request object for sending text to an agent

ARTCAICallVisionCustomCaptureRequest

Request object for custom frame capture for the large vision model.

ARTCAICallVideoCanvas

Video rendering configuration object

ARTCAICallAgentConfig

Configurable parameters for a call

ARTCAICallAgentAsrConfig

Speech recognition configuration

ARTCAICallAgentTtsConfig

Speech synthesis configuration

ARTCAICallAgentLlmConfig

Large language model configuration

ARTCAICallAgentAvatarConfig

Digital human configuration

ARTCAICallAgentInterruptConfig

Interruption configuration

ARTCAICallAgentVoiceprintConfig

Voiceprint-based noise reduction configuration

ARTCAICallAgentTurnDetectionConfig

Turn detection configuration

ARTCAICallAgentVcrResult

VCR detection result

FrameMotionResult

Video frame detection result from VCR

PeopleCountResult

People count detection result from VCR

EquipmentResult

Electronic device detection result from VCR

HeadMotionResult

Head motion detection result from VCR

ARTCAICallAgentVcrConfig

VCR configuration

ARTCAICallAgentVcrBaseConfig

Basic VCR detection configuration

ARTCAICallAgentVcrFrameMotionConfig

VCR video frame detection configuration

ARTCAICallExperimentalConfig

Experimental parameters used to control specific logic policies.

Data structure details

Enum

ARTCAICallAgentType

The type of AI agent.

Enumeration name

Description

VoiceAgent

Voice-only call

AvatarAgent

Digital human call

VisionAgent

Visual understanding call

VideoAgent

Video call

ChatBot

Message-based conversation

ARTCAICallRobotState

The current state of the robot.

Enumeration name

Description

Listening

The agent is listening to the user.

Thinking

The agent is processing the input.

Speaking

The agent is speaking.

AICallErrorCode

An error code for the current AI call.

Enumeration name

Description

None

No error.

InvalidAction

Invalid API call

InvalidParams

The parameters passed to the API are invalid.

StartFailed

Failed to start the call.

AgentSubscriptionRequired

Failed to start the call. The daily free trial quota is exceeded.

AgentNotFund

The agent was not found.

TokenExpired

The call authentication token expired.

ConnectionFailed

The connection failed and the call was interrupted.

KickedByUserReplace

The call failed because different devices using the same user ID joined the same call.

KickedBySystem

The call failed because the user was removed by the system.

LocalDeviceException

The call failed due to a local device issue.

AgentLeaveChannel

The agent left the channel and the call ended.

AgentConcurrentLimit

The concurrency limit for the digital human agent is reached.

AgentAudioSubscribeFailed

Failed to subscribe to the agent's audio.

AiAgentAsrUnavailable

Failed to start the third-party ASR service.

AvatarAgentUnavailable

The digital human service is unavailable.

ChatLogNotFound

The chat log was not found.

InternalError

An internal error occurred.
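
As a minimal sketch of how an app might react to these codes, the following example groups some of them into retryable and non-retryable errors and maps them to user-facing messages. The `ErrorCode` enum is a local stand-in that mirrors a few names from the table, and both the grouping and the message texts are assumptions for illustration, not SDK behavior.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical helper that classifies call errors for UI handling. Not part of the SDK. */
final class CallErrorPolicy {

    /** Local stand-in mirroring some AICallErrorCode names from the table above. */
    enum ErrorCode {
        None, InvalidParams, StartFailed, TokenExpired, ConnectionFailed,
        KickedByUserReplace, KickedBySystem, AgentLeaveChannel, InternalError
    }

    // Assumption: these errors are worth retrying after refreshing the token or network state.
    private static final Set<ErrorCode> RETRYABLE =
            EnumSet.of(ErrorCode.TokenExpired, ErrorCode.ConnectionFailed, ErrorCode.StartFailed);

    /** Returns true if the app may offer an automatic or manual retry. */
    static boolean isRetryable(ErrorCode code) {
        return RETRYABLE.contains(code);
    }

    /** Maps an error code to a short message; an empty string means nothing to show. */
    static String userMessage(ErrorCode code) {
        switch (code) {
            case None:
                return "";
            case TokenExpired:
                return "Your session expired. Reconnecting...";
            case ConnectionFailed:
                return "Connection lost. Please check your network.";
            case KickedByUserReplace:
                return "You joined this call from another device.";
            case AgentLeaveChannel:
                return "The agent ended the call.";
            default:
                // Fall back to the enumeration name so unknown cases stay diagnosable.
                return "The call failed (" + code.name() + ").";
        }
    }
}
```

An error callback could call `CallErrorPolicy.isRetryable(code)` to decide whether to show a retry button, and `CallErrorPolicy.userMessage(code)` to pick the text to display.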

VoicePrintStatusCode

The status code returned by ASR.

Enumeration name

Description

Disable

Voiceprint-based noise reduction VAD is not enabled, and AIVad is disabled.

EnableWithoutRegister

Voiceprint-based noise reduction VAD is enabled, but voiceprint registration is not complete.

SpeakerRecognized

Voiceprint-based noise reduction VAD is enabled, and the main speaker is recognized.

SpeakerNotRecognized

Voiceprint-based noise reduction VAD is enabled, but the main speaker is not recognized.

DetectedSpeakerWithAIVad

AIVad is enabled, and the main speaker is recognized.

UndetectedSpeakerWithAIVad

AIVad is enabled, but the main speaker is not recognized.

Unknown

Unknown state

ARTCAICallSpeakingInterruptedReason

The reason why the agent's speech was interrupted.

Enumeration name

Description

UnKnown

Unknown reason

ByWorks

Interrupted because specific interruption words were recognized.

ByVoice

A speech interruption was detected.

ByInterruptSpeaking

Interrupted because the interruptSpeaking() API method was called.

BySpeechBroadCast

Interrupted by a proactive voice broadcast.

ByLlmQuery

Interrupted by a proactive large language model (LLM) query.

ARTCAICallAudioProfile

Audio encoding configuration.

Enumeration name

Description

ARTCAICallAudioLowQualityMode

Low-quality audio mode. Default audio sampling rate: 8000 Hz. Single sound channel. Maximum encoding bitrate: 12 kbps.

ARTCAICallAudioBasicQualityMode

Standard-quality audio mode. Default audio sampling rate: 16000 Hz. Single sound channel. Maximum encoding bitrate: 24 kbps.

ARTCAICallAudioHighQualityMode

(Default) High-quality audio mode. Default audio sampling rate: 48000 Hz. Single sound channel. Maximum encoding bitrate: 64 kbps.

ARTCAICallAudioStereoHighQualityMode

Stereo high-quality audio mode. Default audio sampling rate: 48000 Hz. Dual sound channels. Maximum encoding bitrate: 80 kbps.

ARTCAICallAudioSuperHighQualityMode

Super high-quality audio mode. Default audio sampling rate: 48000 Hz. Single sound channel. Maximum encoding bitrate: 96 kbps.

ARTCAICallAudioStereoSuperHighQualityMode

Stereo super high-quality audio mode. Default audio sampling rate: 48000 Hz. Dual sound channels. Maximum encoding bitrate: 128 kbps.
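
The sampling rates, channel counts, and bitrates above can be restated as data, which makes it easier to pick a profile for a given bandwidth budget. The sketch below uses a local `AudioProfileSpec` enum with shortened names; it is not the SDK enum, and the selection rule in `bestFor` is an assumption for illustration.

```java
/** Local stand-in that restates the audio profile table; not the SDK enum. */
enum AudioProfileSpec {
    LowQuality(8000, 1, 12),
    BasicQuality(16000, 1, 24),
    HighQuality(48000, 1, 64),          // documented default
    StereoHighQuality(48000, 2, 80),
    SuperHighQuality(48000, 1, 96),
    StereoSuperHighQuality(48000, 2, 128);

    final int sampleRateHz;
    final int channels;
    final int maxBitrateKbps;

    AudioProfileSpec(int sampleRateHz, int channels, int maxBitrateKbps) {
        this.sampleRateHz = sampleRateHz;
        this.channels = channels;
        this.maxBitrateKbps = maxBitrateKbps;
    }

    /**
     * Returns the profile with the highest maximum bitrate that has the requested
     * channel count and fits within the budget, or null if none fits.
     */
    static AudioProfileSpec bestFor(int channels, int budgetKbps) {
        AudioProfileSpec best = null;
        for (AudioProfileSpec p : values()) {
            boolean fits = p.channels == channels && p.maxBitrateKbps <= budgetKbps;
            if (fits && (best == null || p.maxBitrateKbps > best.maxBitrateKbps)) {
                best = p;
            }
        }
        return best;
    }
}
```

For example, `AudioProfileSpec.bestFor(1, 70)` selects `HighQuality`, because the 96 kbps mono profile exceeds the budget.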

ARTCAICallAudioScenario

Audio scenario configuration.

Enumeration name

Description

ARTCAICallAudioSceneDefaultMode

Recommended for general Real-Time Communication scenarios.

ARTCAICallAudioSceneMusicMode

Recommended for scenarios that require high-fidelity music quality, such as musical instrument instruction.

ARTCAICallVideoRenderMode

Video rendering mode.

Enumeration name

Description

ARTCAICallVideoRenderModeAuto

Automatic mode

ARTCAICallVideoRenderModeStretch

Stretch tile mode. If the aspect ratio of the input video is different from the aspect ratio set for stream ingest, the input video is stretched to match the stream ingest ratio, which distorts the image.

ARTCAICallVideoRenderModeFill

Crop mode. If the aspect ratio of the input video is different from the aspect ratio set for stream ingest, the width or height of the input video is cropped, which causes a loss of image content.

ARTCAICallVideoRenderModeNoChange

No change
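
The difference between the stretch and crop modes comes down to how the scale factors are chosen. The following sketch works through the arithmetic under the usual geometric interpretation: stretch scales each axis independently, while crop (fill) uses one uniform scale that covers the target and discards the overflow. The class and method names are hypothetical, and the SDK's internal computation may differ.

```java
/** Hypothetical helper illustrating stretch vs. crop scaling; not part of the SDK. */
final class RenderScale {

    /** Stretch: returns {scaleX, scaleY}. The image is distorted when the two values differ. */
    static double[] stretch(int srcW, int srcH, int dstW, int dstH) {
        return new double[] { (double) dstW / srcW, (double) dstH / srcH };
    }

    /** Crop (fill): one uniform scale, large enough to cover the whole target. */
    static double fillScale(int srcW, int srcH, int dstW, int dstH) {
        return Math.max((double) dstW / srcW, (double) dstH / srcH);
    }

    /** Fraction of the source frame area that remains visible after cropping (1.0 means no loss). */
    static double fillVisibleFraction(int srcW, int srcH, int dstW, int dstH) {
        double s = fillScale(srcW, srcH, dstW, dstH);
        double visibleW = dstW / s;   // visible width in source pixels
        double visibleH = dstH / s;   // visible height in source pixels
        return (visibleW * visibleH) / ((double) srcW * srcH);
    }
}
```

For a 1280x720 frame shown in a 720x720 target, stretch yields scale factors of 0.5625 and 1.0 (a visibly squeezed image), while crop keeps a uniform scale of 1.0 and shows 56.25% of the frame.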

ARTCAICallVideoRotationMode

Video rotation angle.

Enumeration name

Description

ARTCAICallVideoRotationMode_0

0 degrees

ARTCAICallVideoRotationMode_90

90 degrees

ARTCAICallVideoRotationMode_180

180 degrees

ARTCAICallVideoRotationMode_270

270 degrees

ARTCAICallVideoRenderMirrorMode

The mirroring mode for video rendering.

Enumeration name

Description

ARTCAICallVideoRenderMirrorModeOnlyFront

Only the front camera preview is mirrored. Other views are not mirrored.

ARTCAICallVideoRenderMirrorModeAllEnabled

All views are mirrored.

ARTCAICallVideoRenderMirrorModeAllDisable

No views are mirrored.

ARTCAICallTurnDetectionMode

The mode for determining whether the user has finished speaking.

Enumeration name

Description

ARTCAICallTurnDetectionNormalMode

Normal mode. AI semantic analysis is not used to determine whether the user has finished speaking. The ASR silence duration is used instead.

ARTCAICallTurnDetectionSemanticMode

Semantic mode. AI is used to determine whether the user has finished speaking based on contextual semantics.

Class

ARTCAICallConfig

The call configuration object.

Parameter

Type

Description

agentId

String

The agent ID.

agentType

ARTCAICallAgentType

The agent type. This must match the type of the specified agentId. Otherwise, an error occurs when the agent starts.

agentUserId

String

The user ID of the agent. If you leave this empty, the agent service assigns a user ID.

region

String

The region where the agent service is located. This must be the same region as the specified agentId. Otherwise, an error occurs when the agent starts.

userData

String

Custom user information. This information is passed to the agent.

enableAudioDelayInfo

boolean

Specifies whether to enable statistics on conversation latency. Default value: true.

agentConfig

ARTCAICallAgentConfig

The agentConfig parameter used to start the call.

audioConfig

ARTCAICallAudioConfig

Local audio configuration

videoConfig

ARTCAICallVideoConfig

Local video configuration. This parameter takes effect only when the agent type is VisionAgent or VideoAgent.

chatSyncConfig

ARTCAICallChatSyncConfig

The configuration of the associated chat agent. If you configure this parameter, call records are synchronized to the chat agent during the call.

mAiCallVideoConfig

ARTCAICallVideoConfig

Video-related configuration (deprecated in versions 2.5 and later).

mAliCallAudioConfig

ARTCAICallAudioConfig

Audio-related configuration (deprecated in versions 2.5 and later).

mAiCallAgentTemplateConfig

ARTCAICallAgentTemplateConfig (Deprecated)

Configurable parameters for creating an AI audio or video call (deprecated in versions 2.5 and later).

mAiCallChatSyncConfig

ARTCAICallChatSyncConfig

Chat synchronization configuration. It is used to integrate the chat content of call-based agents and message-based agents into a single session (deprecated in versions 2.5 and later).
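
Two rules from the table above lend themselves to a quick app-side check before a call starts: videoConfig only takes effect for VisionAgent or VideoAgent, and agentId, agentType, and region must all agree with the agent being started. The sketch below uses a hypothetical stand-in class that mirrors a few ARTCAICallConfig fields. It is not the SDK class, and the pre-check in `missingFields` is an assumption (the SDK itself reports mismatches when the agent starts).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in mirroring a few ARTCAICallConfig fields; not the SDK class. */
final class CallConfigSketch {

    /** Local copy of the agent type names listed in ARTCAICallAgentType. */
    enum AgentType { VoiceAgent, AvatarAgent, VisionAgent, VideoAgent, ChatBot }

    String agentId;
    AgentType agentType;
    String region;
    String userData;
    boolean enableAudioDelayInfo = true; // documented default

    /** videoConfig takes effect only for VisionAgent or VideoAgent. */
    static boolean videoConfigTakesEffect(AgentType type) {
        return type == AgentType.VisionAgent || type == AgentType.VideoAgent;
    }

    /** Assumption: flags values that are still unset, so the app can fail fast. */
    List<String> missingFields() {
        List<String> missing = new ArrayList<>();
        if (agentId == null || agentId.isEmpty()) missing.add("agentId");
        if (agentType == null) missing.add("agentType");
        if (region == null || region.isEmpty()) missing.add("region");
        return missing;
    }
}
```

A client cannot verify locally that the type and region really match the agent, so a check like this only catches values that were never set.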

ARTCAICallVideoConfig

Video configuration parameters.

Parameter

Type

Description

useHighQualityPreview

boolean

Specifies whether to use high-definition local preview.

useFrontCameraDefault

boolean

Specifies whether to start the front camera by default.

cameraCaptureFrameRate

int

The camera capture frame rate.

useSurfaceView

boolean

Specifies whether to use SurfaceView for rendering. If you set this to false, TextureView is used for rendering.

videoEncoderWidth

int

The video encoding width.

videoEncoderHeight

int

The video encoding height.

videoEncoderFrameRate

int

The video encoding frame rate.

videoEncoderBitRate

int

The video encoding bitrate.

videoEncoderKeyFrameInterval

int

The Group of Pictures (GOP) interval. Unit: milliseconds.

isCameraMute

boolean

Specifies whether to mute the local video. Default value: false.

ARTCAICallAudioConfig

Audio configuration parameters.

Parameter

Type

Description

audioProfile

ARTCAICallAudioProfile

The audio encoding configuration. The default value is ARTCAICallAudioHighQualityMode. You can set this parameter to specify the audio sample rate and the number of sound channels.

audioScenario

ARTCAICallAudioScenario

The audio scenario configuration. The default value is ARTCAICallAudioSceneMusicMode. To specify Bluetooth for audio capture, use ARTCAICallAudioSceneDefaultMode.

enableSpeaker

boolean

Specifies whether to use the speaker or the earpiece for playback. By default, the speaker is used. Set this to false to switch to the earpiece.

isMicrophoneOn

boolean

Specifies whether to enable the microphone. Default value: true.

ARTCAICallAgentTemplateConfig (Deprecated)

Configurable parameters for a call.

Important

This type is deprecated in versions 2.5 and later. Use ARTCAICallAgentConfig instead.

Parameter

Type

Description

aiAgentId

String

The agent ID. Set this parameter when you call the call method of AICallKit to initiate an agent call.

Note

The agent ID is required and cannot be empty.

aiAgentRegion

String

The region where the agent service is located. Default value: cn-shanghai.

Note

Specify a region based on your agent.

aiAgentUserId

String

The user ID corresponding to the agent. If you do not set this, the AI Server generates a random user ID for the agent.

userExtendData

String

Business extension information in a JSON string. This information is passed to the LLM.

aiAgentGreeting

String

The agent's welcome message. The AI agent speaks this message after the user joins the call.

aiAgentUserOnlineTimeout

int

The timeout period for the agent to shut down the task if the user does not join the call. Unit: seconds. Default value: 60.

aiAgentUserOfflineTimeout

int

The timeout period for the agent to shut down the task after the user leaves the call. Unit: seconds. Default value: 5.

aiAgentWorkflowOverrideParams

String

Workflow override parameters. Default value: none.

aiAgentBailianAppParams

String

The parameters of the application center of Alibaba Cloud Model Studio. For more information, see Pass user information through to Alibaba Cloud Model Studio.

aiAgentAsrMaxSilence

int

The maximum duration of silence for speech recognition. Unit: milliseconds. Valid values: 200 to 1200. Default value: 400.

aiAgentVolume

int

The agent's speaking volume. Valid values: -1 to 100. Default value: -1. If you do not set this parameter, the recommended automatic volume mode is used.

enableVoiceInterrupt

boolean

Specifies whether to support voice interruption. Default value: true.

enableIntelligentSegment

boolean

Specifies whether to enable intelligent sentence segmentation. Default value: true.

enableVoicePrint

boolean

Specifies whether to use voiceprint recognition. Default value: false. To enable voiceprint recognition, set enableVoicePrint to true and specify voiceprintId.

voiceprintId

String

The voiceprint ID. If you set enableVoicePrint to true and specify a non-empty voiceprintId, voiceprint-based noise reduction is enabled for the current call. If this is empty, voiceprint-based noise reduction is not enabled.

aiAgentVoiceId

String

The voice ID for the agent's speech.

aiAgentMaxIdleTime

int

The maximum idle time for the agent. Unit: seconds. If the time is exceeded, the agent automatically goes offline. Set this to -1 to prevent the agent from exiting when idle. Default value: 600.

llmHistoryLimit

int

The maximum number of conversation rounds to retain in the LLM/MLLM history. If the value is less than 0, the server's default value of 10 is used.

aiAgentGracefulShutdown

boolean

Specifies whether to enable graceful shutdown. Default value: false.

Note

Graceful shutdown means that when the agent is stopped, for example, when the call is hung up, it finishes speaking the current sentence before stopping. This process can last for a maximum of 10 seconds.

enablePushToTalk

boolean

Specifies whether to enable push-to-talk mode. Default value: false.

aiAgentAvatarId

String

The digital human model ID. You can specify this when the agent type is AvatarAgent.

asrLanguageId

String

The ASR language ID. If you leave this empty, the agent's default configuration is used. Valid values:

  • zh_mandarin: Chinese

  • en: English

  • zh_en: Chinese-English mixed

  • es: Spanish

  • jp: Japanese

wakeUpQuery

String

A wake-up query. This is an instruction from the user before the call starts, which the agent responds to immediately after the call begins. Example: "What's the weather like today?".

llmSystemPrompt

String

The system prompt for the LLM. Example: "You are a friendly and helpful assistant focused on providing users with accurate information and advice."

Note

This is not supported if the LLM node is a Model Studio workflow type.

interruptWords

List<String>

Trigger words for interrupting the conversation. Examples: "Let me interrupt" and "I see."

aiAgentLlmHistoryLimit

int

The maximum number of conversation rounds to retain in the LLM/MLLM history. Default value: 10.

aiAgentVadLevel

int

The sensitivity parameter for AIVad. This parameter helps resist human voice interference. Valid values: 0 to 11. AIVad is enabled on the client by default, and the value is 11.

  • 0: Disables the VAD feature.

  • 1 to 10: A higher value makes interruption more difficult.

  • 11: Significantly different from other values. It causes less damage to the pre-processed conversation audio and provides stronger interference resistance.

ARTCAICallChatSyncConfig

Configuration for synchronizing agent chat records in a message-based chat.

Parameter

Type

Description

sessionId

String

A unique identifier for a conversation between a user and an agent. Default value: empty.

chatBotAgentId

String

The ID of the agent associated with the message-based conversation.

receiverId

String

The receiver ID for the message-based agent. This is the user's user ID.

ARTCAICallAgentInfo

Agent runtime information.

Parameter

Type

Description

agentId

String

The current agent ID.

agentType

ARTCAICallAgentType

The agent type.

agentUserId

String

The user ID of the agent in the RTC channel.

channelId

String

The RTC channel ID where the agent is located.

instanceId

String

The instance ID of the currently running agent. When an agent starts, the system assigns a unique instance ID to identify and track its entire lifecycle and running state.

requestId

String

The request ID of the currently running agent.

ARTCAICallSendTextToAgentRequest

A request object for sending text to an agent.

Parameter

Type

Description

text

String

The text message to send to the agent. Example: "What is this?".

ARTCAICallVisionCustomCaptureRequest

A request object for custom frame capture for the large vision model.

Parameter

Type

Description

text

String

The text to send to the multimodal large model.

enableASR

boolean

Specifies whether to use the ASR result of the human voice as input for the large model. If true, the ASR result and the captured frame are sent to the large model. Otherwise, the text field and the captured frame are sent to the large model.

Note

This takes effect only during continuous frame capture.

isSingle

boolean

If true, it indicates a single frame capture. The custom frame capture state is exited immediately after the frame is captured.

If false, it indicates continuous frame capture. The custom frame capture state is automatically exited after the specified duration.

eachDuration

int

The frame capture interval. Unit: seconds.

num

int

The number of images to capture each time.

duration

int

The duration of continuous frame capture. Unit: seconds. If isSingle is true, this is ignored. Otherwise, the custom frame capture state ends when this duration is reached.

userData

String

A JSON string for custom business information. It is passed to the large model along with the text and image frames for business processing.
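
The interaction between isSingle, enableASR, and duration is easy to misread, so the following sketch spells it out. `VisionCaptureSketch` is a hypothetical stand-in that mirrors four of the fields above; it is not the SDK class, and the helper methods only restate the rules from the table.

```java
/** Hypothetical stand-in mirroring ARTCAICallVisionCustomCaptureRequest fields; not the SDK class. */
final class VisionCaptureSketch {
    final String text;
    final boolean enableASR;
    final boolean isSingle;
    final int duration; // seconds, used for continuous capture only

    VisionCaptureSketch(String text, boolean enableASR, boolean isSingle, int duration) {
        this.text = text;
        this.enableASR = enableASR;
        this.isSingle = isSingle;
        this.duration = duration;
    }

    /** enableASR takes effect only during continuous frame capture. */
    boolean asrInputApplies() {
        return enableASR && !isSingle;
    }

    /** duration is ignored for single-frame capture; 0 stands for "not applicable" here. */
    int effectiveDurationSeconds() {
        return isSingle ? 0 : duration;
    }

    /** Names which input accompanies the captured frames when they reach the model. */
    String modelInput() {
        return asrInputApplies() ? "ASR result" : "text field";
    }
}
```

With `new VisionCaptureSketch("What is this?", true, true, 30)`, the request is a single capture, so the text field is sent and the 30-second duration has no effect.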

ARTCAICallVideoCanvas

Video rendering configuration object.

Parameter

Type

Description

renderMode

ARTCAICallVideoRenderMode

The rendering mode. Default value: ARTCAICallVideoRenderModeAuto.

mirrorMode

ARTCAICallVideoRenderMirrorMode

The mirror mode. Default value: ARTCAICallVideoRenderMirrorModeOnlyFront.

rotationMode

ARTCAICallVideoRotationMode

The rotation angle. Default value: ARTCAICallVideoRotationMode_0.

zOrderOnTop

boolean

Specifies whether the SurfaceView should be placed on a display layer on top of all other windows. Default value: true.

zOrderMediaOverlay

boolean

Specifies whether the SurfaceView should be placed on a display layer on top of windows such as MediaPlayer and Camera. Default value: true.

ARTCAICallAgentConfig

Configurable parameters for a call.

Parameter

Type

Description

agentGreeting

String

The agent's welcome message. If you leave this empty, the agent's configured value is used. Maximum length: 100 characters.

wakeUpQuery

String

An instruction from the user before the call starts, which the agent responds to immediately after the call begins.

agentMaxIdleTime

int

The maximum idle time for the agent. Unit: seconds. If the time is exceeded, the agent automatically goes offline. Default value: 600.

userOnlineTimeout

int

The timeout period for the agent to shut down the task if the user does not join the call. Default value: 60.

userOfflineTimeout

int

The timeout period for the agent to shut down the task after the user leaves the call. Default value: 5.

enablePushToTalk

boolean

Specifies whether to enable push-to-talk mode.

agentGracefulShutdown

boolean

Specifies whether to enable graceful shutdown. This allows the agent to finish speaking the current sentence before stopping.

volume

int

The agent's speaking volume. Valid values: 0 to 400. Default value: 100.

workflowOverrideParams

String

Workflow override parameters

enableIntelligentSegment

boolean

Specifies whether to enable intelligent sentence segmentation.

asrConfig

ARTCAICallAgentAsrConfig

Speech recognition configuration

ttsConfig

ARTCAICallAgentTtsConfig

Speech synthesis configuration

llmConfig

ARTCAICallAgentLlmConfig

Large language model configuration

avatarConfig

ARTCAICallAgentAvatarConfig

Digital human configuration

interruptConfig

ARTCAICallAgentInterruptConfig

Interruption configuration

voiceprintConfig

ARTCAICallAgentVoiceprintConfig

Voiceprint-based noise reduction configuration

turnDetectionConfig

ARTCAICallAgentTurnDetectionConfig

Turn detection configuration

experimentalConfig

ARTCAICallExperimentalConfig

Experimental configuration for customized behavior that is not part of the standard product features.

ARTCAICallAgentVcrConfig

ARTCAICallAgentVcrConfig

VCR configuration

ARTCAICallAgentAsrConfig

Speech recognition configuration.

Parameter

Type

Description

asrLanguageId

String

The ASR language ID. If you leave this empty, the agent's configured value is used.

asrMaxSilence

int

The threshold for detecting sentence breaks in speech. A silence duration exceeding this threshold is considered a sentence break. Default value: 400 ms. Valid values: 200 to 1200 ms.

asrHotWords

List<String>

A list of ASR hot words. Limit: 500 words or fewer. Each word can have a maximum of 10 characters.

vadLevel

int

The sensitivity parameter for AIVad. Default value: 11. Valid values: 0 to 11.

  • 0: Disables the VAD feature.

  • 1 to 10: A higher value makes interruption more difficult.

  • 11: Significantly different from other values. It causes less damage to the pre-processed conversation audio and provides stronger interference resistance.

customParams

String

When using self-managed ASR, pass runtime parameters in URL parameter format. Example: "mode=fast&sample=16000&format=wav".

vadDuration

int

The minimum duration threshold for voice activity detection, used to control interruption sensitivity. The default value of 0 disables this feature. Valid range: 200 to 2000 milliseconds. A common range is 200 to 500, corresponding to 1 to 4 characters.
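
The value ranges in this table can be checked before a configuration is applied. The following is a minimal validation sketch: `AsrConfigCheck` is a hypothetical helper that is not part of the SDK, and counting "characters" as Unicode code points is an assumption, since the table does not say how characters are counted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical validator for the documented ASR value ranges; not part of the SDK. */
final class AsrConfigCheck {

    static List<String> validate(int asrMaxSilence, List<String> hotWords,
                                 int vadLevel, int vadDuration) {
        List<String> errors = new ArrayList<>();
        if (asrMaxSilence < 200 || asrMaxSilence > 1200) {
            errors.add("asrMaxSilence must be 200 to 1200 ms");
        }
        if (hotWords.size() > 500) {
            errors.add("asrHotWords allows at most 500 words");
        }
        for (String word : hotWords) {
            // Assumption: one character is one Unicode code point.
            if (word.codePointCount(0, word.length()) > 10) {
                errors.add("hot word longer than 10 characters: " + word);
            }
        }
        if (vadLevel < 0 || vadLevel > 11) {
            errors.add("vadLevel must be 0 to 11");
        }
        // 0 disables the feature; any other value must fall in the valid range.
        if (vadDuration != 0 && (vadDuration < 200 || vadDuration > 2000)) {
            errors.add("vadDuration must be 0 or 200 to 2000 ms");
        }
        return errors;
    }
}
```

An empty result means every value is within its documented range. Returning all violations at once, rather than throwing on the first one, lets a settings screen show every problem in a single pass.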

ARTCAICallAgentTtsConfig

Speech synthesis configuration.

Parameter

Type

Description

agentVoiceId

String

The voice ID for the agent's speech. If you leave this empty, the agent's configured value is used.

pronunciationRules

List

An array of pronunciation rules. A maximum of 20 rules are supported. If the array is null or empty, no rules are used. Example:

[
  {
    "Word": "overlap",            // Target word
    "Pronunciation": "chongdie",  // Replacement pronunciation
    "Type": "replacement"         // Polyphone rule
  },
  {
    "Word": "action",
    "Pronunciation": "hangdong",
    "Type": "replacement"
  }
]

speechRate

double

The speech rate for text-to-speech (TTS) playback. This is supported for all TTS types. Valid values: 0.5 to 2.0. Default value: 1.0. If the value is less than 0, it is not sent to the server, and the configuration in the console is used.

languageId

String

The language code for TTS playback. This is valid when the TTS type is MiniMax.

emotion

String

The emotion type for TTS playback. This is valid when the TTS type is MiniMax.

modelId

String

The TTS model ID. Currently, only MiniMax is supported. Valid values: speech-01-turbo, speech-02-turbo.
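
The limits on pronunciationRules and speechRate can be sketched in code as follows. The table gives the parameter type only as List, so representing each rule as a `Map<String, String>` with the keys from the example is an assumption, and `TtsConfigSketch` is a hypothetical helper rather than an SDK class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for the documented TTS limits; not part of the SDK. */
final class TtsConfigSketch {
    static final int MAX_RULES = 20;

    /** Builds one rule with the keys shown in the pronunciationRules example. */
    static Map<String, String> rule(String word, String pronunciation) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("Word", word);
        r.put("Pronunciation", pronunciation);
        r.put("Type", "replacement");
        return r;
    }

    /** Returns a copy of the rules, rejecting lists that exceed the documented maximum. */
    static List<Map<String, String>> rules(List<Map<String, String>> candidates) {
        if (candidates.size() > MAX_RULES) {
            throw new IllegalArgumentException("At most " + MAX_RULES + " rules are supported");
        }
        return new ArrayList<>(candidates);
    }

    /**
     * Returns false for negative values, which are not sent to the server (the console
     * configuration is used). Otherwise the value must be within 0.5 to 2.0.
     */
    static boolean speechRateIsSent(double speechRate) {
        if (speechRate < 0) {
            return false;
        }
        if (speechRate < 0.5 || speechRate > 2.0) {
            throw new IllegalArgumentException("speechRate must be 0.5 to 2.0");
        }
        return true;
    }
}
```

For example, `TtsConfigSketch.rule("overlap", "chongdie")` reproduces the first entry of the example above, and `speechRateIsSent(-1.0)` returns false because the console value applies.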

ARTCAICallAgentLlmConfig

Large language model configuration.

Parameter

Type

Description

llmHistoryLimit

int

The maximum number of conversation rounds to retain in the history. Default value: 10.

llmSystemPrompt

String

The system prompt for the LLM.

bailianAppParams

String

Parameters for Model Studio Application Center.

llmCompleteReply

boolean

Specifies whether to send the complete LLM result.

Note

If enabled, the complete LLM result is returned through the onLLMReplyCompleted event callback after the result is generated.

openAIExtraQuery

String

Additional query parameters for the OpenAI protocol-based LLM.

Note

Parameters must be in the key=value format. Use the ampersand (&) to separate multiple parameters. All values must be strings.
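
A small builder helps keep openAIExtraQuery in the required key=value form. The sketch below joins entries with an ampersand and rejects keys or values that would break the format. `ExtraQueryBuilder` is a hypothetical helper, and rejecting reserved characters instead of URL-encoding them is an assumption, because the table does not state whether encoded values are accepted.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical builder for a key=value&key=value string; not part of the SDK. */
final class ExtraQueryBuilder {

    static String build(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            String key = entry.getKey();
            String value = entry.getValue();
            // Reject characters that would make the result ambiguous to split.
            if (key.isEmpty() || key.contains("&") || key.contains("=") || value.contains("&")) {
                throw new IllegalArgumentException("Invalid parameter: " + key);
            }
            joiner.add(key + "=" + value);
        }
        return joiner.toString();
    }
}
```

Passing a `LinkedHashMap` keeps the parameters in insertion order. All values are strings, as the note above requires, so numbers must be converted before they are added.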

ARTCAICallAgentAvatarConfig

Digital human configuration.

Parameter

Type

Description

agentAvatarId

String

The digital human model ID. If you leave this empty, the agent's configured value is used.

ARTCAICallAgentInterruptConfig

Interruption configuration.

Parameter

Type

Description

enableVoiceInterrupt

boolean

Specifies whether to enable intelligent interruption.

interruptWords

List<String>

Specific words or phrases that trigger a conversation interruption.

ARTCAICallAgentVoiceprintConfig

Voiceprint-based noise reduction configuration.

Parameter

Type

Description

useVoiceprint

boolean

Specifies whether to use voiceprint-based noise reduction for the current sentence break.

voiceprintId

String

The voiceprint ID. If this is not empty, voiceprint-based noise reduction is enabled for the current call.

ARTCAICallAgentTurnDetectionConfig

Turn detection configuration.

Parameter

Type

Description

turnEndWords

List<String>

Specific words that indicate the end of a sentence. Examples: "Over" and "I'm done talking."

mode

ARTCAICallTurnDetectionMode

The mode for determining whether the user has finished speaking. Default value: ARTCAICallTurnDetectionSemanticMode, in which AI determines whether the user has finished speaking based on contextual semantics.

semanticWaitDuration

int

The custom wait time for semantic sentence segmentation. Unit: milliseconds. Valid values: 0 to 10000. The default value of -1 indicates that the AI automatically determines the appropriate wait time.

Note

The semanticWaitDuration parameter does not take effect in ARTCAICallTurnDetectionNormalMode.
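
The rules for semanticWaitDuration can be summarized in a few lines of code. In the sketch below, `TurnDetectionSketch` and its `Mode` enum are hypothetical stand-ins for the SDK types, and returning an empty `OptionalInt` for "no custom wait time applies" is a design choice made for this example.

```java
import java.util.OptionalInt;

/** Hypothetical helper restating the semanticWaitDuration rules; not part of the SDK. */
final class TurnDetectionSketch {

    /** Local stand-in for ARTCAICallTurnDetectionMode. */
    enum Mode { Normal, Semantic }

    /** Documented default: the AI chooses the wait time automatically. */
    static final int AUTO = -1;

    /**
     * Returns the custom wait time in milliseconds, or an empty value when none applies:
     * either the mode is Normal (the parameter has no effect) or the value is AUTO.
     */
    static OptionalInt effectiveSemanticWait(Mode mode, int semanticWaitDuration) {
        if (mode == Mode.Normal || semanticWaitDuration == AUTO) {
            return OptionalInt.empty();
        }
        if (semanticWaitDuration < 0 || semanticWaitDuration > 10000) {
            throw new IllegalArgumentException("semanticWaitDuration must be -1 or 0 to 10000 ms");
        }
        return OptionalInt.of(semanticWaitDuration);
    }
}
```

For example, `effectiveSemanticWait(Mode.Semantic, 800)` yields 800, while the same value in `Mode.Normal` yields an empty result because sentence breaks are then decided by the ASR silence duration.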

ARTCAICallAgentVcrResult

VCR detection result.

Parameter

Type

Description

resultData

Object

All VCR detection results returned by the agent.

stillFrameMotionResult

FrameMotionResult

The still frame detection result from VCR.

invalidFrameMotionResult

FrameMotionResult

The invalid frame detection result from VCR.

peopleCountResult

PeopleCountResult

The real-time people count detection result from VCR.

equipmentResult

EquipmentResult

The electronic device detection result from VCR.

headMotionResult

HeadMotionResult

The head motion detection result from VCR.

FrameMotionResult

The VCR result for video frame detection.

Parameter

Type

Description

duration

int

How long ago it was sent. Unit: milliseconds.

PeopleCountResult

The VCR result for people count detection.

Parameter

Type

Description

count

int

The number of people detected by VCR.

EquipmentResult

The VCR result for electronic device detection.

Parameter

Type

Description

mobilePhoneCount

int

The number of mobile phones.

watchCount

int

The number of watches.

headPhoneCount

int

The number of headphones.

HeadMotionResult

The VCR result for head motion detection.

Parameter

Type

Description

nodDetected

boolean

Nodding detected

shakeDetected

boolean

Shaking head detected

ARTCAICallAgentVcrConfig

VCR configuration.

Parameter

Type

Description

data

JSONObject

When a user passes a JSON object, it is cached. This object is then used to generate a JSON string, allowing for custom extensions.

stillFrameMotion

ARTCAICallAgentVcrFrameMotionConfig

The still frame detection configuration for VCR.

invalidFrameMotion

ARTCAICallAgentVcrFrameMotionConfig

The invalid frame detection configuration for VCR.

peopleCount

ARTCAICallAgentVcrBaseConfig

The real-time people count detection configuration for VCR.

equipment

ARTCAICallAgentVcrBaseConfig

The electronic device detection configuration for VCR.

headMotion

ARTCAICallAgentVcrBaseConfig

The head motion detection configuration for VCR.

ARTCAICallAgentVcrBaseConfig

Basic VCR detection configuration.

Parameter

Type

Description

enable

boolean

Specifies whether to enable the feature.

ARTCAICallAgentVcrFrameMotionConfig

VCR video frame detection configuration.

Parameter

Type

Description

callbackDelay

int

The delay after which the callback is triggered. Unit: milliseconds.

ARTCAICallExperimentalConfig

Experimental parameters that control specific logic.

Parameter

Type

Description

rtcSdkParams

JSONObject

RTC SDK parameters

commonParams

JSONObject

Common parameters

IARTCAICallService details

generateAIAgentShareCall

Starts a shared agent call.

/**
 * Starts a shared agent call.
 * @param userId The ID of the currently logged-on user.
 * @param aiAgentId The agent ID.
 * @param aiAgentType The agent type.
 * @param artcaiCallConfig The agent configuration.
 * @param callback The request callback.
 */
void generateAIAgentShareCall(String userId, String aiAgentId, ARTCAICallEngine.ARTCAICallAgentType aiAgentType, ARTCAICallEngine.ARTCAICallConfig artcaiCallConfig, IARTCAICallServiceCallback callback);

ARTCAIAgentUtil details

parseAiAgentShareInfo

Parses the information of a shared agent.

/**
 * Parses the information of a shared agent.
 * @param shareInfoText The shared agent information text to parse.
 * @return The structured configuration of the shared agent.
 */
public static ARTCAIAgentShareInfo parseAiAgentShareInfo(String shareInfoText);

parseAiAgentInfo

Parses the response information for starting an agent.

/**
 * Parses the response information for starting an agent.
 * @param jsonObject The response information for starting the agent.
 * @return The structured information in the response for starting the agent.
 */
public static ARTCAIAgentInfo parseAiAgentInfo(JSONObject jsonObject);