
Intelligent Media Services: API operation details

Last Updated: Dec 05, 2025

This topic describes the details of the API operations in the iOS SDK.

API overview

Note

Older SDK versions contain deprecated parameters and methods. Upgrade the software development kit (SDK) to the latest version. For more information, see iOS usage guide.

Class/Protocol

API

Description

ARTCAICallEngineInterface

Engine interface definition

userId

Gets the User ID of the current call.

isOnCall

Indicates whether a call is in progress.

agentInfo

Gets the information about the current agent.

agentState

Gets the current state of the agent.

delegate

Sets and gets callback events.

call[1/2]

Starts a call.

call[2/2]

Starts a call using call configurations.

handup

Hangs up a call.

audioConfig

Audio configurations. This includes encoding settings (such as sample rate, the number of sound channels, and bitrate) and scenario settings (such as default and music scenarios).

videoConfig

Video configurations.

setLocalViewConfig

Sets the rendering view and configurations for the local camera.

setAgentViewConfig

Sets the rendering view configurations for the agent. This is required when the agent has image rendering. This operation is valid only for AvatarAgent and VideoAgent.

interruptSpeaking

Interrupts the agent's speech.

enableVoiceInterrupt

Enables or disables smart interruption.

switchVoiceId

Switches the voice.

enableSpeaker

Enables or disables the speaker.

enablePushToTalk

Enables or disables push-to-talk mode.

startPushToTalk

In push-to-talk mode, starts speaking.

finishPushToTalk

In push-to-talk mode, finishes speaking.

cancelPushToTalk

In push-to-talk mode, cancels the current speech.

muteMicrophone

Mutes or unmutes the microphone.

muteAgentAudioPlaying

Stops or resumes the playback of the agent's audio stream.

visionConfig

Parameter settings for visual understanding calls.

muteLocalCamera

Disables or enables the camera.

switchCamera

Switches between the front and rear cameras.

parseShareAgentCall

Parses shared agent information.

generateShareAgentCall

Starts a shared agent call.

getRTCInstance

Gets the RTC engine.

sendTextToAgent

Sends a text message to the agent.

sendCustomMessageToServer

Sends a custom message to the server. This must be called after the call is connected.

updateLlmSystemPrompt

Updates the system prompt for the LLM. This must be called after the call is connected.

updateBailianAppParams

Updates Model Studio application center parameters.

updateVcrConfig

Updates the VCR configuration.

startVisionCustomCapture

For a visual understanding agent, starts custom frame capture. After starting, you cannot talk to the agent through voice. This must be called after the call is connected.

stopVisionCustomCapture

For a visual understanding agent, ends custom frame capture. This must be called after the call is connected.

destroy

Releases resources.

ARTCAICallEngineDelegate

Engine callback events

onErrorOccurs

An error occurred.

onAgentStarted

The call agent has started.

onCallBegin

The call started.

onCallEnd

The call ended.

onAgentVideoAvailable

Indicates whether the agent's video is available.

onAgentAudioAvailable

Indicates whether the agent's audio is available.

onRTCEngineCreated

The Alibaba Real-Time Communication (ARTC) engine is created.

onPushToTalk

Indicates whether push-to-talk mode is enabled for the current call.

onAgentWillLeave

The current agent is about to leave (end the current call).

onReceivedAgentCustomMessage

A custom message is received from the current agent.

onAgentStateChanged

The agent state changed.

onNetworkStatusChanged

The network status changed.

onVoiceVolumeChanged

The volume changed.

onUserSubtitleNotify

A notification for the result of the agent's recognition of the user's question.

onVoiceAgentSubtitleNotify

A notification for the agent's answer.

onLLMReplyCompleted

The LLM has finished replying in the current call.

onVoiceIdChanged

The voice for the current call changed.

onVoiceInterrupted

Indicates whether voice interruption is enabled for the current call.

onSpeakingInterrupted

The agent's current speech is interrupted. This callback is supported only for interruptions by specific words.

onVisionCustomCapture

Indicates whether custom frame capture mode is enabled for the current visual understanding call.

onAgentAvatarFirstFrameDrawn

The first frame of the agent's digital human is rendered.

onHumanTakeoverWillStart

A human is about to take over from the current agent.

onHumanTakeoverConnected

The human takeover is connected.

onAgentEmotionNotify

A notification for the agent's emotion result.

onAgentDataChannelAvailable

A callback for the availability of the agent's message channel.

onReceivedAgentVcrResult

A VCR result is received from the current agent.

onConnectionStatusChange

The connection status changed during the call.

onAudioDelayInfo

Audio loopback latency.

onAudioAccompanyStateChanged

Triggered when the playback state changes for music accompaniment that is played through the RTC instance during the current call.

ARTCAICallEngineFactory

Engine creation factory

createEngine

Creates a default engine instance.

API details

ARTCAICallEngineInterface details

userId

Retrieves the user ID of the current call.

var userId: String? {get}

isOnCall

Indicates whether a call is in progress. The value is true from the time the call is connected until it is hung up or an error occurs. Otherwise, the value is false.

var isOnCall: Bool { get }

agentInfo

Retrieves information about the current agent. This includes the agent type, channel ID, UID of the agent in the channel, and the running instance ID of the agent. For more information, see ARTCAICallAgentInfo.

var agentInfo: ARTCAICallAgentInfo? { get }

agentState

Retrieves the current state of the agent. States include listening, thinking, and speaking. For more information, see ARTCAICallAgentState.

var agentState: ARTCAICallAgentState { get }

delegate

Sets and retrieves callback events. For more information about callback events, see ARTCAICallEngineDelegate.

weak var delegate: ARTCAICallEngineDelegate? { get set }
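Because delegate is declared weak, the engine does not keep the delegate object alive. The sketch below illustrates that consequence with stand-in types (DemoDelegate, DemoEngine, and CallEventLogger are hypothetical and are not part of the SDK): callbacks arrive only while your own code holds a strong reference to the delegate.

```swift
// Stand-in types for illustration only; they are NOT the SDK's classes.
protocol DemoDelegate: AnyObject {
    func onCallBegin()
}

final class DemoEngine {
    // Mirrors the documented `weak var delegate` declaration.
    weak var delegate: DemoDelegate?
}

final class CallEventLogger: DemoDelegate {
    private(set) var events: [String] = []
    func onCallBegin() { events.append("onCallBegin") }
}

let engine = DemoEngine()

// Keep a strong reference for as long as callbacks are needed.
var logger: CallEventLogger? = CallEventLogger()
engine.delegate = logger
engine.delegate?.onCallBegin()
let deliveredCount = logger?.events.count ?? 0

// Once the last strong reference goes away, the weak delegate becomes nil
// and later callbacks are silently dropped.
logger = nil
let delegateIsGone = (engine.delegate == nil)
```

In an app, the usual choice is to store the delegate in a property of the object that owns the call screen, so that both share the same lifetime.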

call[1/2]

Starts a call.

Note

Use this operation when the call with the agent is initiated from the server side. For more information, see Initiate a call using server-side interfaces.

func call(userId: String, token: String, agentInfo: ARTCAICallAgentInfo, completed:((_ error: NSError?) -> Void)?)

Parameter details:

Parameter

Type

Description

userId

String

The UID of the current user.

token

String

The token for joining the call.

agentInfo

ARTCAICallAgentInfo

Agent information.

completed

((_ error: NSError?) -> Void)?

The completion callback.

call[2/2]

Starts a call using call configurations.

Note

This API operation is called by the client to call the agent. This is the default calling method. For more information, see iOS usage guide.

func call(config: ARTCAICallConfig) -> Bool

Parameter details:

Parameter

Type

Description

config

ARTCAICallConfig

Call configurations.

handup

Exits the call.

func handup(_ stopAIAgent: Bool)

Parameter details:

Parameter

Type

Description

stopAIAgent

Bool

Specifies whether to end the current agent task at the same time.

audioConfig

Specifies the audio configurations. This includes encoding settings, such as sample rate, the number of sound channels, and bitrate, and scenario settings, such as default and music scenarios. For more information, see ARTCAICallAudioConfig. This parameter takes effect only when it is set before the call.

var audioConfig: ARTCAICallAudioConfig { set get }

videoConfig

Specifies the video configurations. This parameter takes effect only when it is set before the call. This is valid only for VisionAgent and VideoAgent. For more information, see ARTCAICallVideoConfig.

var videoConfig: ARTCAICallVideoConfig { set get }

setLocalViewConfig

Sets the rendering view and configurations for the local camera. This is valid only for VisionAgent and VideoAgent. For more information, see ARTCAICallViewConfig.

func setLocalViewConfig(viewConfig: ARTCAICallViewConfig?)

Parameter details:

Parameter

Type

Description

viewConfig

ARTCAICallViewConfig?

The view configuration. If this is empty, no rendering view is needed.

setAgentViewConfig

Sets the rendering view configurations for the agent. This is required when the agent performs image rendering. This is valid only for AvatarAgent and VideoAgent. For more information, see ARTCAICallViewConfig.

func setAgentViewConfig(viewConfig: ARTCAICallViewConfig?)

Parameter details:

Parameter

Type

Description

viewConfig

ARTCAICallViewConfig?

The rendering view configuration. This includes the rendering view, rendering mode, image mode, and rotation mode.

interruptSpeaking

Interrupts the agent's speech.

func interruptSpeaking() -> Bool

enableVoiceInterrupt

Enables or disables smart interruption.

func enableVoiceInterrupt(enable: Bool) -> Bool

Parameter details:

Parameter

Type

Description

enable

Bool

Enables or disables the feature.

switchVoiceId

Switches the voice.

func switchVoiceId(voiceId: String) -> Bool

Parameter details:

Parameter

Type

Description

voiceId

String

The voice ID.

enableSpeaker

Enables or disables the speaker.

func enableSpeaker(enable: Bool) -> Bool

Parameter details:

Parameter

Type

Description

enable

Bool

Enables or disables the speaker.

enablePushToTalk

Enables or disables push-to-talk mode. In push-to-talk mode, the agent reports the result only after finishPushToTalk is called.

func enablePushToTalk(enable: Bool) -> Bool

Parameter details:

Parameter

Type

Description

enable

Bool

Specifies whether to enable push-to-talk mode.

startPushToTalk

In push-to-talk mode, starts speaking.

func startPushToTalk() -> Bool

finishPushToTalk

In push-to-talk mode, finishes speaking.

func finishPushToTalk() -> Bool

cancelPushToTalk

In push-to-talk mode, cancels the current speech.

func cancelPushToTalk() -> Bool
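The three push-to-talk calls map naturally onto a hold-to-talk button. The following is a minimal sketch under stated assumptions: DemoPushToTalkEngine is a stand-in protocol that copies the documented method shapes, and HoldToTalkButton and RecordingEngine are hypothetical helpers, not SDK types.

```swift
// Stand-in protocol that copies the four documented method shapes.
protocol DemoPushToTalkEngine {
    func enablePushToTalk(enable: Bool) -> Bool
    func startPushToTalk() -> Bool
    func finishPushToTalk() -> Bool
    func cancelPushToTalk() -> Bool
}

/// Hypothetical helper that maps button gestures onto the engine calls.
final class HoldToTalkButton {
    private let engine: DemoPushToTalkEngine
    private var isRecording = false

    init(engine: DemoPushToTalkEngine) { self.engine = engine }

    /// Finger down: start capturing the user's speech.
    func pressDown() { isRecording = engine.startPushToTalk() }

    /// Finger up inside the button: submit the speech to the agent.
    func releaseInside() {
        guard isRecording else { return }
        _ = engine.finishPushToTalk()
        isRecording = false
    }

    /// Finger dragged away before release: discard the captured speech.
    func releaseOutside() {
        guard isRecording else { return }
        _ = engine.cancelPushToTalk()
        isRecording = false
    }
}

/// Fake engine that only records which methods were called.
final class RecordingEngine: DemoPushToTalkEngine {
    private(set) var calls: [String] = []
    func enablePushToTalk(enable: Bool) -> Bool { calls.append("enable(\(enable))"); return true }
    func startPushToTalk() -> Bool { calls.append("start"); return true }
    func finishPushToTalk() -> Bool { calls.append("finish"); return true }
    func cancelPushToTalk() -> Bool { calls.append("cancel"); return true }
}

let pttEngine = RecordingEngine()
_ = pttEngine.enablePushToTalk(enable: true)

let button = HoldToTalkButton(engine: pttEngine)
button.pressDown()
button.releaseInside()    // speech is submitted
button.pressDown()
button.releaseOutside()   // speech is discarded
button.releaseInside()    // ignored: nothing is being recorded
```

Tracking isRecording locally keeps a stray release gesture from calling finishPushToTalk or cancelPushToTalk when no speech was started.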

muteMicrophone

Mutes or unmutes the microphone.

func muteMicrophone(mute: Bool) -> Bool

Parameter details:

Parameter

Type

Description

mute

Bool

Mutes or unmutes the microphone.

muteAgentAudioPlaying

Stops or resumes the playback of the agent's audio stream.

func muteAgentAudioPlaying(mute: Bool) -> Bool

Parameter details:

Parameter

Type

Description

mute

Bool

Specifies whether to mute.

visionConfig

This must be set when you use a vision agent and takes effect only when it is set before the call. The visual configurations include resolution and frame rate. For more information, see ARTCAICallVisionConfig.

var visionConfig: ARTCAICallVisionConfig { set get }

muteLocalCamera

Disables or enables the camera.

func muteLocalCamera(mute: Bool) -> Bool

Parameter details:

Parameter

Type

Description

mute

Bool

Specifies whether to disable the camera.

switchCamera

Switches between the front and rear cameras.

func switchCamera() -> Bool

parseShareAgentCall

Parses shared agent information. If the parsing is successful, an instance of the ARTCAICallAgentShareConfig type is returned. You can use this instance to start a shared agent call.

func parseShareAgentCall(shareInfo: String) -> ARTCAICallAgentShareConfig?

Parameter details:

Parameter

Type

Description

shareInfo

String

The shared agent information. You can generate this in the console.

generateShareAgentCall

Starts a shared agent call.

func generateShareAgentCall(shareConfig: ARTCAICallAgentShareConfig, userId: String, completed: ((_ rsp: ARTCAICallAgentInfo?, _ token: String?, _ error: NSError?, _ reqId: String) -> Void)?)

Parameter details:

Parameter

Type

Description

shareConfig

ARTCAICallAgentShareConfig

The shared agent configuration information. This includes the share ID, agent type, time-to-live (TTL), template configuration, and region. You can view the definition in the SDK.

userId

String

The ID of the currently logged-on user.

completed

((_ rsp: ARTCAICallAgentInfo?, _ token: String?, _ error: NSError?, _ reqId: String) -> Void)?

The completion callback. It returns the agent information, the token, an error if one occurred, and the request ID.
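The completion callback returns agent information and a token, which are the same inputs that call[1/2] accepts. The sketch below chains the three steps on that assumption; the source does not state that this is the required sequence. All Demo and Fake types are stand-ins that copy the documented method shapes, and the shareId and channelId fields are hypothetical.

```swift
import Foundation

// Stand-in types for illustration only; the real ones come from the SDK.
struct DemoShareConfig { let shareId: String }
struct DemoAgentInfo { let channelId: String }

// Copies the three documented method shapes, using the stand-in types.
protocol DemoShareEngine {
    func parseShareAgentCall(shareInfo: String) -> DemoShareConfig?
    func generateShareAgentCall(shareConfig: DemoShareConfig, userId: String,
                                completed: ((_ rsp: DemoAgentInfo?, _ token: String?, _ error: NSError?, _ reqId: String) -> Void)?)
    func call(userId: String, token: String, agentInfo: DemoAgentInfo,
              completed: ((_ error: NSError?) -> Void)?)
}

/// Parses the share string, requests agent info and a token, then joins the call.
/// Reports `true` only if every step succeeds.
func startSharedCall(engine: DemoShareEngine, shareInfo: String, userId: String,
                     done: @escaping (Bool) -> Void) {
    guard let shareConfig = engine.parseShareAgentCall(shareInfo: shareInfo) else {
        done(false)   // The share string could not be parsed.
        return
    }
    engine.generateShareAgentCall(shareConfig: shareConfig, userId: userId) { rsp, token, error, _ in
        guard error == nil, let agentInfo = rsp, let token = token else {
            done(false)
            return
        }
        engine.call(userId: userId, token: token, agentInfo: agentInfo) { callError in
            done(callError == nil)
        }
    }
}

/// Fake engine that succeeds for any non-empty share string.
struct FakeShareEngine: DemoShareEngine {
    func parseShareAgentCall(shareInfo: String) -> DemoShareConfig? {
        return shareInfo.isEmpty ? nil : DemoShareConfig(shareId: shareInfo)
    }
    func generateShareAgentCall(shareConfig: DemoShareConfig, userId: String,
                                completed: ((_ rsp: DemoAgentInfo?, _ token: String?, _ error: NSError?, _ reqId: String) -> Void)?) {
        completed?(DemoAgentInfo(channelId: "demo-channel"), "demo-token", nil, "demo-request")
    }
    func call(userId: String, token: String, agentInfo: DemoAgentInfo,
              completed: ((_ error: NSError?) -> Void)?) {
        completed?(nil)
    }
}

var sharedCallResults: [Bool] = []
startSharedCall(engine: FakeShareEngine(), shareInfo: "demo-share", userId: "user-1") { sharedCallResults.append($0) }
startSharedCall(engine: FakeShareEngine(), shareInfo: "", userId: "user-1") { sharedCallResults.append($0) }
```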

getRTCInstance

Retrieves the ARTC engine.

func getRTCInstance() -> AnyObject?

sendTextToAgent

Sends a text message to the agent.

func sendTextToAgent(req: ARTCAICallSendTextToAgentRequest) -> Bool

Parameter details:

Parameter

Type

Description

req

ARTCAICallSendTextToAgentRequest

The message struct to send.

sendCustomMessageToServer

Sends a custom message to the server. This method must be called after the call is connected.

func sendCustomMessageToServer(msg: String) -> Bool

Parameter details:

Parameter

Type

Description

msg

String

The content to send.

updateLlmSystemPrompt

Updates the system prompt for the Large Language Model (LLM). This method must be called after the call is connected.

func updateLlmSystemPrompt(prompt: String) -> Bool

Parameter details:

Parameter

Type

Description

prompt

String

The prompt.
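Because this method must be called after the call is connected, one option is to check isOnCall first. The following is a minimal sketch, assuming a stand-in protocol (DemoPromptEngine) that copies the documented property and method; updatePromptIfConnected and FakePromptEngine are hypothetical helpers.

```swift
// Stand-in protocol that copies the documented property and method shapes.
protocol DemoPromptEngine {
    var isOnCall: Bool { get }
    func updateLlmSystemPrompt(prompt: String) -> Bool
}

/// Returns false without touching the engine when no call is connected.
func updatePromptIfConnected(_ engine: DemoPromptEngine, prompt: String) -> Bool {
    guard engine.isOnCall else { return false }
    return engine.updateLlmSystemPrompt(prompt: prompt)
}

/// Fake engine that remembers the last prompt it was given.
final class FakePromptEngine: DemoPromptEngine {
    var isOnCall = false
    private(set) var lastPrompt: String?
    func updateLlmSystemPrompt(prompt: String) -> Bool {
        lastPrompt = prompt
        return true
    }
}

let promptEngine = FakePromptEngine()
let beforeConnect = updatePromptIfConnected(promptEngine, prompt: "Answer briefly.")
promptEngine.isOnCall = true
let afterConnect = updatePromptIfConnected(promptEngine, prompt: "Answer briefly.")
```

The same guard applies to the other methods that are documented as valid only after the call is connected, such as sendCustomMessageToServer.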

updateBailianAppParams

Updates Model Studio application parameters.

func updateBailianAppParams(params: [String: Any]) -> Bool

Parameter details:

Parameter

Type

Description

params

[String: Any]

Model Studio application center parameters.

updateVcrConfig

Updates the VCR configuration.

func updateVcrConfig(vcrConfig: ARTCAICallAgentVcrConfig) -> Bool

Parameter details:

Parameter

Type

Description

vcrConfig

ARTCAICallAgentVcrConfig

The VCR configuration.

startVisionCustomCapture

For a visual understanding agent, starts custom frame capture. After this is started, you cannot talk to the agent using voice. This method must be called after the call is connected.

func startVisionCustomCapture(req: ARTCAICallVisionCustomCaptureRequest) -> Bool

Parameter details:

Parameter

Type

Description

req

ARTCAICallVisionCustomCaptureRequest

Configuration information.

stopVisionCustomCapture

For a visual understanding agent, ends custom frame capture. This method must be called after the call is connected.

func stopVisionCustomCapture() -> Bool

destroy

Releases resources.

func destroy()

ARTCAICallEngineDelegate details

onErrorOccurs

An error occurred during the current call.

@objc optional func onErrorOccurs(code: ARTCAICallErrorCode)

Parameter details:

Parameter

Type

Description

code

ARTCAICallErrorCode

The error type.

onAgentStarted

The call agent has started.

@objc optional func onAgentStarted()

onCallBegin

The call has started.

@objc optional func onCallBegin()

onCallEnd

The call has ended.

@objc optional func onCallEnd()

onAgentVideoAvailable

Indicates whether the agent's video is available.

@objc optional func onAgentVideoAvailable(available: Bool)

Parameter details:

Parameter

Type

Description

available

Bool

Indicates whether the video is available.

onAgentAudioAvailable

Indicates whether the agent's audio is available.

@objc optional func onAgentAudioAvailable(available: Bool)

Parameter details:

Parameter

Type

Description

available

Bool

Indicates whether the audio is available.

onRTCEngineCreated

The ARTC engine is created. You can call getRTCInstance in this callback to retrieve the ARTC engine instance.

@objc optional func onRTCEngineCreated()

onPushToTalk

Indicates whether push-to-talk mode is enabled for the current call.

@objc optional func onPushToTalk(enable: Bool)

Parameter details:

Parameter

Type

Description

enable

Bool

Indicates whether the mode is enabled.

onAgentWillLeave

The current agent is about to leave, which ends the current call.

@objc optional func onAgentWillLeave(reason: Int32, message: String)

Parameter details:

Parameter

Type

Description

reason

Int32

The reason for leaving: 2001 (idle exit) or 0 (other).

message

String

A description of the reason for leaving.
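The two documented reason codes can be wrapped in a small type so that UI code does not compare raw integers. This is a sketch only: AgentLeaveReason and leaveBanner are hypothetical names, and any code other than 2001 or 0 is treated as unrecognized.

```swift
/// Reason codes listed for onAgentWillLeave: 2001 (idle exit) and 0 (other).
enum AgentLeaveReason: Equatable {
    case idleExit
    case other
    case unrecognized(Int32)

    init(code: Int32) {
        switch code {
        case 2001: self = .idleExit
        case 0: self = .other
        default: self = .unrecognized(code)
        }
    }
}

/// Builds the text to show when the agent is about to leave.
func leaveBanner(reason: Int32, message: String) -> String {
    switch AgentLeaveReason(code: reason) {
    case .idleExit:
        return "The call ended because it was idle."
    case .other:
        return message.isEmpty ? "The agent left the call." : message
    case .unrecognized(let code):
        return "The agent left the call (code \(code))."
    }
}

let idleBanner = leaveBanner(reason: 2001, message: "idle timeout")
let otherBanner = leaveBanner(reason: 0, message: "")
```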

onReceivedAgentCustomMessage

A custom message is received from the current agent.

@objc optional func onReceivedAgentCustomMessage(data: [String: Any]?)

Parameter details:

Parameter

Type

Description

data

[String: Any]?

The message content.

onAgentStateChanged

The agent state has changed.

@objc optional func onAgentStateChanged(state: ARTCAICallAgentState)

Parameter details:

Parameter

Type

Description

state

ARTCAICallAgentState

The current state of the agent: listening, thinking, or speaking.

onNetworkStatusChanged

The network status has changed.

@objc optional func onNetworkStatusChanged(uid: String, quality: ARTCAICallNetworkQuality)

Parameter details:

Parameter

Type

Description

uid

String

The ID of the current speaker.

quality

ARTCAICallNetworkQuality

The network quality. Values include excellent, good, fair, poor, very poor, disconnected, and unknown.

onVoiceVolumeChanged

A notification is sent when the volume changes.

@objc optional func onVoiceVolumeChanged(uid: String, volume: Int32)

Parameter details:

Parameter

Type

Description

uid

String

The UID of the current speaker.

volume

Int32

The volume, from 0 to 255.
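A volume meter usually needs a fraction rather than the raw 0 to 255 value. The helper below is a hypothetical convenience (meterLevel is not an SDK function); it clamps out-of-range input defensively before scaling.

```swift
/// Converts the documented 0...255 volume into a 0.0...1.0 meter level.
func meterLevel(fromVolume volume: Int32) -> Double {
    let clamped = min(max(volume, 0), 255)   // guard against out-of-range input
    return Double(clamped) / 255.0
}

let silentLevel = meterLevel(fromVolume: 0)
let fullLevel = meterLevel(fromVolume: 255)
let clampedLevel = meterLevel(fromVolume: 300)
```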

onUserSubtitleNotify

A notification that contains the result of the agent's recognition of the user's question.

@objc optional func onUserSubtitleNotify(text: String, isSentenceEnd: Bool, sentenceId: Int)

Parameter details:

Parameter

Type

Description

text

String

The question text recognized by the agent.

isSentenceEnd

Bool

Indicates whether the current text is the final result for this sentence.

sentenceId

Int

The ID of the sentence to which the current text belongs.
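The three parameters are enough to maintain a live transcript. The sketch below rests on an assumption that the source does not confirm: each notification carries the full text recognized so far for its sentence, so a newer notification replaces the older one. UserSubtitleBuffer is a hypothetical helper, not an SDK type.

```swift
/// Collects user subtitles: partial text per sentence, plus finished sentences.
struct UserSubtitleBuffer {
    private(set) var finished: [String] = []
    private(set) var pending: [Int: String] = [:]

    /// Feed this from onUserSubtitleNotify.
    /// Assumption: `text` is the whole sentence so far, not an increment.
    mutating func receive(text: String, isSentenceEnd: Bool, sentenceId: Int) {
        if isSentenceEnd {
            pending[sentenceId] = nil    // the sentence is final
            finished.append(text)
        } else {
            pending[sentenceId] = text   // replace the earlier partial result
        }
    }
}

var subtitles = UserSubtitleBuffer()
subtitles.receive(text: "What is", isSentenceEnd: false, sentenceId: 1)
subtitles.receive(text: "What is the weather", isSentenceEnd: true, sentenceId: 1)
```

If the SDK turns out to deliver incremental fragments instead, the non-final branch would need to append to the pending text rather than replace it.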

onVoiceAgentSubtitleNotify

A notification that contains the agent's answer.

@objc optional func onVoiceAgentSubtitleNotify(text: String, isSentenceEnd: Bool, userAsrSentenceId: Int)

Parameter details:

Parameter

Type

Description

text

String

The text of the agent's answer.

isSentenceEnd

Bool

Indicates whether the current text is the last sentence of this answer.

userAsrSentenceId

Int

The ID of the user's recognized question sentence that this answer responds to.

onLLMReplyCompleted

The LLM has finished replying in the current call.

@objc optional func onLLMReplyCompleted(text: String, userAsrSentenceId: Int)

Parameter details:

Parameter

Type

Description

text

String

The text output by the LLM.

userAsrSentenceId

Int

The ID of the user's recognized question sentence that this reply responds to.

onVoiceIdChanged

The voice for the current call has changed.

@objc optional func onVoiceIdChanged(voiceId: String)

Parameter details:

Parameter

Type

Description

voiceId

String

The current voice ID.

onVoiceInterrupted

Indicates whether voice interruption is enabled for the current call.

@objc optional func onVoiceInterrupted(enable: Bool)

Parameter details:

Parameter

Type

Description

enable

Bool

Indicates whether it is enabled.

onSpeakingInterrupted

The agent's current speech is interrupted.

@objc optional func onSpeakingInterrupted(reason: ARTCAICallSpeakingInterruptedReason)

Parameter details:

Parameter

Type

Description

reason

ARTCAICallSpeakingInterruptedReason

The reason, such as interruption by a specific word.

onVisionCustomCapture

Indicates whether custom frame capture mode is enabled for the current visual understanding call.

@objc optional func onVisionCustomCapture(enable: Bool)

Parameter details:

Parameter

Type

Description

enable

Bool

Indicates whether it is enabled.

onAgentAvatarFirstFrameDrawn

The first frame of the agent's digital human is rendered.

@objc optional func onAgentAvatarFirstFrameDrawn()

onHumanTakeoverWillStart

A human is about to take over from the current agent.

@objc optional func onHumanTakeoverWillStart(takeoverUid: String, takeoverMode: Int)

Parameter details:

Parameter

Type

Description

takeoverUid

String

The UID of the human.

takeoverMode

Int

1: The human's voice is used for output. 0: The agent's voice is used for output.

onHumanTakeoverConnected

The human takeover is connected.

@objc optional func onHumanTakeoverConnected(takeoverUid: String)

Parameter details:

Parameter

Type

Description

takeoverUid

String

The UID of the human.

onAgentEmotionNotify

A notification that contains the agent's emotion result.

@objc optional func onAgentEmotionNotify(emotion: String, userAsrSentenceId: Int)

Parameter details:

Parameter

Type

Description

emotion

String

The emotion label, such as neutral, happy, angry, or sad.

userAsrSentenceId

Int

The ID of the user's recognized question sentence that this emotion result relates to.

onAgentDataChannelAvailable

A callback that indicates the availability of the agent's message channel. After this callback is triggered, you can send messages to the agent.

@objc optional func onAgentDataChannelAvailable()

onReceivedAgentVcrResult

A VCR result is received from the current agent. For more information, see ARTCAICallAgentVcrResult.

@objc optional func onReceivedAgentVcrResult(result: ARTCAICallAgentVcrResult)

onConnectionStatusChange

The connection status has changed during the call.

@objc optional func onConnectionStatusChange(status: ARTCAICallConnectionStatus, reason: Int32)

Parameter details:

Parameter

Type

Description

status

ARTCAICallConnectionStatus

The status code:

  • Init: Initialization is complete.

  • Disconnected: The network connection is disconnected.

  • Connecting: A network connection is being established.

  • Connected: The network is connected.

  • Reconnecting: The network connection is being re-established.

  • Failed: The network connection failed.

reason

Int32

The reason for the status change.
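The six statuses can drive a simple connection banner. This is a sketch only: DemoConnectionStatus is a stand-in enum whose case names are hypothetical (the SDK's actual case names for ARTCAICallConnectionStatus may differ), and the banner texts are examples.

```swift
// Stand-in for the six documented statuses; case names are hypothetical.
enum DemoConnectionStatus {
    case initial, disconnected, connecting, connected, reconnecting, failed
}

/// Returns the banner text to show, or nil when no banner is needed.
func connectionBanner(for status: DemoConnectionStatus, reason: Int32) -> String? {
    switch status {
    case .initial, .connected:
        return nil
    case .connecting:
        return "Connecting..."
    case .reconnecting:
        return "Connection lost, reconnecting..."
    case .disconnected:
        return "Disconnected (reason \(reason))."
    case .failed:
        return "Connection failed (reason \(reason))."
    }
}

let connectedBanner = connectionBanner(for: .connected, reason: 0)
let failedBanner = connectionBanner(for: .failed, reason: 3)
```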

onAudioDelayInfo

Provides the audio loopback latency.

@objc optional func onAudioDelayInfo(sentenceId: Int32, delayMs: Int64)

Parameter details:

Parameter

Type

Description

sentenceId

Int32

The sentence ID.

delayMs

Int64

The audio loopback latency in milliseconds.

onAudioAccompanyStateChanged

Triggered when the playback state changes for music accompaniment that is played through the RTC instance during the current call.

@objc optional func onAudioAccompanyStateChanged(state: ARTCAICallAudioAccompanyState, errorCode: ARTCAICallAudioAccompanyErrorCode)

Parameter details:

Parameter

Type

Description

state

ARTCAICallAudioAccompanyState

The playback state of the music accompaniment:
• ARTCAICallAudioAccompanyStarted (100): Playback started.
• ARTCAICallAudioAccompanyStopped (101): Playback stopped.
• ARTCAICallAudioAccompanyPaused (102): Playback paused.
• ARTCAICallAudioAccompanyResumed (103): Playback resumed.
• ARTCAICallAudioAccompanyEnded (104): Playback finished.
• ARTCAICallAudioAccompanyBuffering (105): Buffering.
• ARTCAICallAudioAccompanyBufferingEnd (106): Buffering ended.
• ARTCAICallAudioAccompanyFailed (107): Playback failed.

errorCode

ARTCAICallAudioAccompanyErrorCode

The playback error code:
• ARTCAICallAudioAccompanyNoError (0): No error.
• ARTCAICallAudioAccompanyUnknowError (-1): An unknown error occurred.
• ARTCAICallAudioAccompanyOpenFailed (-100): Failed to open the file.
• ARTCAICallAudioAccompanyDecodeFailed (-101): Failed to decode the file.
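The numeric values listed above can be mirrored in local enums for logging or testing. In the sketch below, DemoAccompanyState and DemoAccompanyError are stand-ins that reuse only the documented raw values; their case names and the accompanyLogLine helper are hypothetical.

```swift
// Stand-in enums that reuse the documented raw values.
enum DemoAccompanyState: Int {
    case started = 100, stopped = 101, paused = 102, resumed = 103
    case ended = 104, buffering = 105, bufferingEnd = 106, failed = 107
}

enum DemoAccompanyError: Int {
    case noError = 0, unknownError = -1, openFailed = -100, decodeFailed = -101
}

/// Builds a log line; the error code is only reported when playback failed.
func accompanyLogLine(stateCode: Int, errorCode: Int) -> String {
    guard let state = DemoAccompanyState(rawValue: stateCode) else {
        return "accompaniment: unrecognized state \(stateCode)"
    }
    if state == .failed {
        // Codes outside the documented list fall back to unknownError.
        let error = DemoAccompanyError(rawValue: errorCode) ?? .unknownError
        return "accompaniment failed: \(error)"
    }
    return "accompaniment: \(state)"
}

let startedLine = accompanyLogLine(stateCode: 100, errorCode: 0)
let failedLine = accompanyLogLine(stateCode: 107, errorCode: -100)
```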

ARTCAICallEngineFactory details

createEngine

Creates a default engine instance.

public static func createEngine() -> ARTCAICallEngineInterface