This topic describes how to integrate AICallKit SDK into your iOS app.
Environment requirements
Xcode 16.0 or later. We recommend that you use the latest official version.
CocoaPods 1.9.3 or later.
A physical device that runs iOS 10.0 or later.
Integrate the SDK
target 'Your target' do
  # Integrate ApsaraVideo MediaBox SDK for Alibaba Real-Time Communication (ARTC). You can integrate AliVCSDK_ARTC, AliVCSDK_Standard, or AliVCSDK_InteractiveLive.
  pod 'AliVCSDK_ARTC', '~> x.x.x'
  # Integrate AICallKit SDK.
  pod 'ARTCAICallKit', '~> 1.6.0'
  ...
end
You can download the latest version of ARTC SDK from the official website.
Configure the project
Open the Info.plist file of your project and add the NSMicrophoneUsageDescription and NSCameraUsageDescription keys, each with a short description of why your app needs the permission.
In the project settings, enable Background Modes on the Signing & Capabilities tab. We recommend this setting because calls cannot continue after the app is switched to the background without it. If you do not enable Background Modes, call the handup method to end the call when the app is switched to the background.
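A missing usage description is easy to overlook until the app fails at the first permission prompt. The following is a minimal sketch of a debug-time check for the two keys named above. The function name missingUsageDescriptions is hypothetical and is not part of AICallKit SDK. It uses only Foundation.

```swift
import Foundation

/// The two Info.plist keys that this topic asks you to add.
let requiredUsageKeys = ["NSMicrophoneUsageDescription", "NSCameraUsageDescription"]

/// Returns the required keys that are absent or whose description is blank.
/// `info` is the Info.plist content as a dictionary.
/// Hypothetical helper, not part of the SDK.
func missingUsageDescriptions(in info: [String: Any]) -> [String] {
    return requiredUsageKeys.filter { key in
        guard let text = info[key] as? String else { return true }
        return text.trimmingCharacters(in: .whitespaces).isEmpty
    }
}
```

In an app, you could pass Bundle.main.infoDictionary ?? [:] to this function in a debug build and log any keys that it returns.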
Sample code for using the SDK
// Import the SDK.
import ARTCAICallKit

// Create an engine instance.
var engine: ARTCAICallEngineInterface = {
    return ARTCAICallEngineFactory.createEngine()
}()

// Configure callback events.
self.engine.delegate = self

// Start a call after an intelligent agent is started.
let agentInfo = ARTCAICallAgentInfo(agentType: workflow_type, channelId: channel_id, uid: ai_agent_user_id, instanceId: agent_instance_id)
self.engine.call(userId: self.userId, token: rtc_auth_token, agentInfo: agentInfo) { [weak self] error in
    if let error = error {
        // Handle the error.
    } else {
        // The call is successful.
    }
}

// End the call. Pass true to also stop the intelligent agent.
self.engine.handup(true)

// For more information about how to call other methods, see the "API reference" section of this topic.
// Callback events. Only some of the callbacks are shown here. For the full list, see ARTCAICallEngineDelegate.
public func onErrorOccurs(code: ARTCAICallErrorCode) {
    // An error occurred. End the call and stop the intelligent agent.
    self.engine.handup(true)
}

public func onCallBegin() {
    // The call starts.
}

public func onCallEnd() {
    // The call ends.
}

public func onAgentStateChanged(state: ARTCAICallAgentState) {
    // The status of the intelligent agent changes.
}

public func onUserSubtitleNotify(text: String, isSentenceEnd: Bool, sentenceId: Int) {
    // The question of the user is recognized by the intelligent agent.
}

public func onVoiceAgentSubtitleNotify(text: String, isSentenceEnd: Bool, userAsrSentenceId: Int) {
    // The intelligent agent returns an answer.
}

public func onVoiceIdChanged(voiceId: String) {
    // The voice of the current call changes.
}

public func onVoiceInterrupted(enable: Bool) {
    // Indicates whether voice interruption is enabled for the current call.
}
API reference
API overview
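Class or protocol | API | Description |
ARTCAICallEngineInterface (an engine instance) | userId | Queries the ID of the user on the current call. |
ARTCAICallEngineInterface | isOnCall | Indicates whether a call is in progress. |
ARTCAICallEngineInterface | agentInfo | Queries the information about the current intelligent agent. |
ARTCAICallEngineInterface | agentState | Queries the status of the current intelligent agent. |
ARTCAICallEngineInterface | delegate | Configures and queries callback events. |
ARTCAICallEngineInterface | call | Starts a call. |
ARTCAICallEngineInterface | handup | Ends a call. |
ARTCAICallEngineInterface | setAgentViewConfig | Configures the rendering view for the intelligent agent. This setting is valid only for the digital human agent. |
ARTCAICallEngineInterface | interruptSpeaking | Interrupts the speech of the intelligent agent. |
ARTCAICallEngineInterface | enableVoiceInterrupt | Enables or disables intelligent interruption. |
ARTCAICallEngineInterface | switchVoiceId | Changes the voice. |
ARTCAICallEngineInterface | enableSpeaker | Enables or disables the speaker. |
ARTCAICallEngineInterface | enablePushToTalk | Enables or disables intercom mode. |
ARTCAICallEngineInterface | startPushToTalk | Starts speaking in intercom mode. |
ARTCAICallEngineInterface | finishPushToTalk | Finishes speaking in intercom mode. |
ARTCAICallEngineInterface | cancelPushToTalk | Cancels speaking in intercom mode. |
ARTCAICallEngineInterface | muteMicrophone | Mutes or unmutes the microphone. |
ARTCAICallEngineInterface | visionConfig | The parameter configuration for visual understanding calls. |
ARTCAICallEngineInterface | muteLocalCamera | Turns on or off a camera. |
ARTCAICallEngineInterface | switchCamera | Switches between the front and rear cameras. |
ARTCAICallEngineInterface | parseShareAgentCall | Parses the information about a shared intelligent agent. |
ARTCAICallEngineInterface | generateShareAgentCall | Starts a call with a shared intelligent agent. |
ARTCAICallEngineInterface | getRTCInstance | Queries the information about a Real-Time Communication (RTC) engine instance. |
ARTCAICallEngineInterface | destroy | Releases resources. |
ARTCAICallEngineDelegate (the callback events of an engine instance) | onErrorOccurs | An error occurred. |
ARTCAICallEngineDelegate | onCallBegin | A call starts. |
ARTCAICallEngineDelegate | onCallEnd | A call ends. |
ARTCAICallEngineDelegate | onAgentVideoAvailable | Indicates whether the ingested video stream of the intelligent agent is available. |
ARTCAICallEngineDelegate | onAgentAudioAvailable | Indicates whether the ingested audio stream of the intelligent agent is available. |
ARTCAICallEngineDelegate | onPushToTalk | Indicates whether intercom mode is enabled for the current call. |
ARTCAICallEngineDelegate | onAgentWillLeave | The intelligent agent is about to end the current call. |
ARTCAICallEngineDelegate | onReceivedAgentCustomMessage | A custom message is received from the current intelligent agent. |
ARTCAICallEngineDelegate | onAgentStateChanged | The status of the intelligent agent changes. |
ARTCAICallEngineDelegate | onNetworkStatusChanged | The network status changes. |
ARTCAICallEngineDelegate | onVoiceVolumeChanged | The volume changes. |
ARTCAICallEngineDelegate | onUserSubtitleNotify | The intelligent agent recognizes the question of the user. |
ARTCAICallEngineDelegate | onVoiceAgentSubtitleNotify | The intelligent agent returns an answer. |
ARTCAICallEngineDelegate | onVoiceIdChanged | The voice of the intelligent agent changes. |
ARTCAICallEngineDelegate | onVoiceInterrupted | Indicates whether voice interruption is enabled for the current call. |
ARTCAICallEngineDelegate | onAgentAvatarFirstFrameDrawn | The first video frame of the digital human agent is rendered. |
ARTCAICallEngineDelegate | onHumanTakeoverWillStart | A human agent is stepping in to take over from the current intelligent agent. |
ARTCAICallEngineDelegate | onHumanTakeoverConnected | A human agent has taken over from the current intelligent agent. |
ARTCAICallEngineDelegate | onAgentEmotionNotify | The intelligent agent detects an emotion. |
ARTCAICallEngineFactory (an engine factory) | createEngine | Creates a default engine instance. |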
ARTCAICallEngineInterface
userId
Queries the ID of the user on the current call.
var userId: String? {get}
isOnCall
Indicates whether a call is in progress. Valid values: true and false.
var isOnCall: Bool { get }
agentInfo
Queries the information about the current intelligent agent. The information includes the type, channel ID, UID of the intelligent agent in the channel, and instance ID of the intelligent agent.
var agentInfo: ARTCAICallAgentInfo? { get }
agentState
The status of the intelligent agent. The agent can be listening, thinking, or speaking.
var agentState: ARTCAICallAgentState { get }
delegate
Configures and queries callback events.
weak var delegate: ARTCAICallEngineDelegate? { get set }
call
Starts a call.
func call(userId: String, token: String, agentInfo: ARTCAICallAgentInfo, completed:((_ error: NSError?) -> Void)?)
Parameters
Parameter | Type | Description |
userId | String | The UID of the current user. |
token | String | The authentication token that is used to join the call. |
agentInfo | ARTCAICallAgentInfo | The information about the intelligent agent. |
completed | ((_ error: NSError?) -> Void)? | The callback that is invoked once the process is finished. |
handup
Ends a call.
func handup(_ stopAIAgent: Bool)
Parameters
Parameter | Type | Description |
stopAIAgent | Bool | Specifies whether to stop the intelligent agent. |
setAgentViewConfig
Configures the rendering view for the intelligent agent. This setting is valid only for the digital human agent.
func setAgentViewConfig(viewConfig: ARTCAICallViewConfig?)
Parameters
Parameter | Type | Description |
viewConfig | ARTCAICallViewConfig? | The rendering view settings, including the rendering view, rendering mode, mirror mode, and rotation mode. |
interruptSpeaking
Interrupts the speech of the intelligent agent.
func interruptSpeaking() -> Bool
enableVoiceInterrupt
Enables or disables intelligent interruption.
func enableVoiceInterrupt(enable: Bool) -> Bool
Parameters
Parameter | Type | Description |
enable | Bool | Specifies whether to enable intelligent interruption. |
switchVoiceId
Changes the voice.
func switchVoiceId(voiceId: String) -> Bool
Parameters
Parameter | Type | Description |
voiceId | String | The voice ID. |
enableSpeaker
Enables or disables the speaker.
func enableSpeaker(enable: Bool) -> Bool
Parameters
Parameter | Type | Description |
enable | Bool | Specifies whether to enable the speaker. |
enablePushToTalk
Enables or disables intercom mode. In intercom mode, the intelligent agent returns a result only after the finishPushToTalk method is called.
func enablePushToTalk(enable: Bool) -> Bool
Parameters
Parameter | Type | Description |
enable | Bool | Specifies whether to enable intercom mode. |
startPushToTalk
Starts speaking in intercom mode.
func startPushToTalk() -> Bool
finishPushToTalk
Finishes speaking in intercom mode.
func finishPushToTalk() -> Bool
cancelPushToTalk
Cancels speaking in intercom mode.
func cancelPushToTalk() -> Bool
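The three intercom methods above map naturally onto a hold-to-talk button: start on press, finish on release, cancel if the user slides away. The following is a minimal sketch of that mapping, not the SDK's own implementation. PushToTalkEngine is a stand-in protocol that copies the documented signatures so that the code compiles without the SDK, and PushToTalkButton is a hypothetical helper. The sketch assumes that a return value of true means the request was accepted, which this topic does not state. It also assumes that enablePushToTalk was already called to enable intercom mode.

```swift
// Stand-in for the intercom subset of ARTCAICallEngineInterface.
// The signatures are copied from this topic; the protocol itself is hypothetical.
protocol PushToTalkEngine {
    func startPushToTalk() -> Bool
    func finishPushToTalk() -> Bool
    func cancelPushToTalk() -> Bool
}

/// Hypothetical helper that maps button gestures onto the three calls.
final class PushToTalkButton {
    private let engine: PushToTalkEngine
    private(set) var isTalking = false

    init(engine: PushToTalkEngine) {
        self.engine = engine
    }

    /// Finger down: begin a speech segment.
    /// Assumption: a return value of true means the segment started.
    func pressDown() {
        guard !isTalking else { return }
        isTalking = engine.startPushToTalk()
    }

    /// Finger up: send the segment if released inside the button, otherwise discard it.
    func release(insideButton: Bool) {
        guard isTalking else { return }
        isTalking = false
        _ = insideButton ? engine.finishPushToTalk() : engine.cancelPushToTalk()
    }
}
```

Keeping the isTalking flag in one place prevents a stray release event from calling finishPushToTalk when no segment was started.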
muteMicrophone
Mutes or unmutes the microphone.
func muteMicrophone(mute: Bool) -> Bool
Parameters
Parameter | Type | Description |
mute | Bool | Specifies whether to mute the microphone. |
visionConfig
The visual configuration, including the resolution and frame rate. For more information, see ARTCAICallVisionConfig. The configuration is required if you use a visual intelligent agent. The configuration takes effect only if you set it before the call.
var visionConfig: ARTCAICallVisionConfig { set get }
muteLocalCamera
Turns the camera on or off.
func muteLocalCamera(mute: Bool) -> Bool
Parameters
Parameter | Type | Description |
mute | Bool | Specifies whether to turn off the camera. A value of true turns off the camera. A value of false turns on the camera. |
switchCamera
Switches between the front and rear cameras.
func switchCamera() -> Bool
parseShareAgentCall
Parses the information about a shared intelligent agent. If the parsing is successful, an instance of the ARTCAICallAgentShareConfig type is returned, which can be used to start a call with the intelligent agent.
func parseShareAgentCall(shareInfo: String) -> ARTCAICallAgentShareConfig?
Parameters
Parameter | Type | Description |
shareInfo | String | The shared intelligent agent information, which can be generated in the console. |
generateShareAgentCall
Starts a call with a shared intelligent agent.
func generateShareAgentCall(shareConfig: ARTCAICallAgentShareConfig, userId: String, completed: ((_ rsp: ARTCAICallAgentInfo?, _ token: String?, _ error: NSError?, _ reqId: String) -> Void)?)
Parameters
Parameter | Type | Description |
shareConfig | ARTCAICallAgentShareConfig | The share configurations, such as the share ID, intelligent agent type, expiration time, template settings, and region. You can use the SDK to view their definitions. |
userId | String | The ID of the logon user. |
completed | ((_ rsp: ARTCAICallAgentInfo?, _ token: String?, _ error: NSError?, _ reqId: String) -> Void)? | The callback that is invoked once the process is finished. |
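The shared-agent methods are used in sequence: parse the shared information, generate the agent information and token, then start the call. The following is a minimal sketch of that chain. It assumes that the agent information and token returned in the generateShareAgentCall callback are meant to be passed to the call method, which the signatures suggest but this topic does not state explicitly. ShareConfig, AgentInfo, SharedCallEngine, startSharedCall, and the error domain are hypothetical stand-ins that allow the code to compile without the SDK.

```swift
import Foundation

// Hypothetical stand-ins for ARTCAICallAgentShareConfig and ARTCAICallAgentInfo.
struct ShareConfig { let shareId: String }
struct AgentInfo { let instanceId: String }

// Stand-in protocol that mirrors the three documented method signatures.
protocol SharedCallEngine {
    func parseShareAgentCall(shareInfo: String) -> ShareConfig?
    func generateShareAgentCall(shareConfig: ShareConfig, userId: String,
                                completed: ((_ rsp: AgentInfo?, _ token: String?, _ error: NSError?, _ reqId: String) -> Void)?)
    func call(userId: String, token: String, agentInfo: AgentInfo,
              completed: ((_ error: NSError?) -> Void)?)
}

/// Parse, generate, then call. Reports nil on success or the first error.
/// Assumption: the generated agent information and token feed the call method.
func startSharedCall(engine: SharedCallEngine, shareInfo: String, userId: String,
                     completion: @escaping (NSError?) -> Void) {
    // Step 1: parsing returns nil if the shared information is invalid.
    guard let config = engine.parseShareAgentCall(shareInfo: shareInfo) else {
        completion(NSError(domain: "SharedCallSketch", code: -1, userInfo: nil))
        return
    }
    // Step 2: obtain the agent information and token.
    engine.generateShareAgentCall(shareConfig: config, userId: userId) { agentInfo, token, error, _ in
        guard error == nil, let agentInfo = agentInfo, let token = token else {
            completion(error ?? NSError(domain: "SharedCallSketch", code: -2, userInfo: nil))
            return
        }
        // Step 3: start the call with the generated values.
        engine.call(userId: userId, token: token, agentInfo: agentInfo, completed: completion)
    }
}
```

Funneling all three steps into one completion handler gives the caller a single place to show an error or switch to the in-call UI.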
getRTCInstance
Queries the information about an RTC engine instance.
func getRTCInstance() -> AnyObject?
destroy
Releases resources.
func destroy()
ARTCAICallEngineDelegate
onErrorOccurs
An error occurred during the current call.
@objc optional func onErrorOccurs(code: ARTCAICallErrorCode)
Parameters
Parameter | Type | Description |
code | ARTCAICallErrorCode | The error code. |
onCallBegin
A call starts.
@objc optional func onCallBegin()
onCallEnd
A call ends.
@objc optional func onCallEnd()
onAgentVideoAvailable
Indicates whether the ingested video stream of the intelligent agent is available.
@objc optional func onAgentVideoAvailable(available: Bool)
Parameters
Parameter | Type | Description |
available | Bool | Indicates whether the ingested video stream of the intelligent agent is available. |
onAgentAudioAvailable
Indicates whether the ingested audio stream of the intelligent agent is available.
@objc optional func onAgentAudioAvailable(available: Bool)
Parameters
Parameter | Type | Description |
available | Bool | Indicates whether the ingested audio stream of the intelligent agent is available. |
onPushToTalk
Indicates whether intercom mode is enabled for the current call.
@objc optional func onPushToTalk(enable: Bool)
Parameters
Parameter | Type | Description |
enable | Bool | Indicates whether intercom mode is enabled for the current call. |
onAgentWillLeave
The intelligent agent is about to end the current call.
@objc optional func onAgentWillLeave(reason: Int32, message: String)
Parameters
Parameter | Type | Description |
reason | Int32 | The reason why the agent is leaving. A value of 2001 indicates idle timeout. A value of 0 indicates other reasons. |
message | String | The description of the reason. |
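Because reason is a plain Int32, it can help to convert it to a typed value before branching on it. The following sketch covers only the two values documented above, 2001 and 0, and treats every other value like 0. AgentLeaveReason and the display strings are hypothetical and are not part of the SDK.

```swift
/// Hypothetical typed view of the `reason` value in onAgentWillLeave.
enum AgentLeaveReason: Equatable {
    case idleTimeout            // The documented value 2001.
    case other(code: Int32)     // 0 or any value that this sketch does not know.

    init(reason: Int32) {
        self = (reason == 2001) ? .idleTimeout : .other(code: reason)
    }

    /// Placeholder UI text. Falls back to the message that the callback supplies.
    func displayText(message: String) -> String {
        switch self {
        case .idleTimeout:
            return "The call ended because of inactivity."
        case .other:
            return message.isEmpty ? "The agent ended the call." : message
        }
    }
}
```

Inside onAgentWillLeave, you could build the value with AgentLeaveReason(reason: reason) and show displayText(message: message) to the user.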
onReceivedAgentCustomMessage
A custom message is received from the current intelligent agent.
@objc optional func onReceivedAgentCustomMessage(data: [String: Any]?)
Parameters
Parameter | Type | Description |
data | [String: Any]? | The message content. |
onAgentStateChanged
The status of the intelligent agent changes.
@objc optional func onAgentStateChanged(state: ARTCAICallAgentState)
Parameters
Parameter | Type | Description |
state | ARTCAICallAgentState | The status of the intelligent agent. The agent can be listening, thinking, or speaking. |
onNetworkStatusChanged
The network status changes.
@objc optional func onNetworkStatusChanged(uid: String, quality: ARTCAICallNetworkQuality)
Parameters
Parameter | Type | Description |
uid | String | The ID of the current speaker. |
quality | ARTCAICallNetworkQuality | The network quality. Valid values: Excellent, Good, Moderate, Poor, Extremely Poor, Interrupted, and Unknown. |
onVoiceVolumeChanged
The volume changes.
@objc optional func onVoiceVolumeChanged(uid: String, volume: Int32)
Parameters
Parameter | Type | Description |
uid | String | The UID of the current speaker. |
volume | Int32 | The volume. Valid values: 0 to 255. |
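A level meter or waveform view usually expects a value between 0 and 1 rather than 0 to 255. The following is a small sketch of that conversion. The function name normalizedVolume is hypothetical, and the clamping of out-of-range input is a defensive choice of this sketch, not documented SDK behavior.

```swift
/// Maps the documented 0 to 255 volume range onto 0.0 to 1.0.
/// Values outside the documented range are clamped.
func normalizedVolume(_ volume: Int32) -> Double {
    let clamped = min(max(volume, 0), 255)
    return Double(clamped) / 255.0
}
```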
onUserSubtitleNotify
The intelligent agent recognizes the question of the user.
@objc optional func onUserSubtitleNotify(text: String, isSentenceEnd: Bool, sentenceId: Int)
Parameters
Parameter | Type | Description |
text | String | The text that is recognized by the intelligent agent. |
isSentenceEnd | Bool | Indicates whether the current text is the end of the sentence. |
sentenceId | Int | The ID of the sentence to which the current text belongs. |
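The sentenceId and isSentenceEnd parameters make it possible to keep one subtitle line per sentence and update it as recognition progresses. The following is a minimal sketch under two assumptions that this topic does not confirm: each callback carries the full text recognized so far for that sentence (if the SDK delivers increments instead, append rather than replace), and sentence IDs increase over time. SubtitleBuffer is a hypothetical helper.

```swift
/// Hypothetical helper that keeps one line of text per sentence ID.
struct SubtitleBuffer {
    struct Line: Equatable {
        var text: String
        var isFinal: Bool
    }

    private(set) var lines: [Int: Line] = [:]

    /// Call from onUserSubtitleNotify.
    /// Assumption: `text` is the full text so far, so the line is replaced.
    mutating func update(text: String, isSentenceEnd: Bool, sentenceId: Int) {
        // Ignore a late partial result for a sentence that has already ended.
        if let existing = lines[sentenceId], existing.isFinal, !isSentenceEnd {
            return
        }
        lines[sentenceId] = Line(text: text, isFinal: isSentenceEnd)
    }

    /// Sentence texts in ascending ID order, for display.
    var transcript: [String] {
        return lines.keys.sorted().compactMap { lines[$0]?.text }
    }
}
```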
onVoiceAgentSubtitleNotify
The intelligent agent returns an answer.
@objc optional func onVoiceAgentSubtitleNotify(text: String, isSentenceEnd: Bool, userAsrSentenceId: Int)
Parameters
Parameter | Type | Description |
text | String | The text of the answer provided by the intelligent agent. |
isSentenceEnd | Bool | Indicates whether the current text is the last sentence of the answer. |
userAsrSentenceId | Int | The ID of the sentence to which the question of the user recognized by the intelligent agent belongs. |
onVoiceIdChanged
The voice of the intelligent agent changes.
@objc optional func onVoiceIdChanged(voiceId: String)
Parameters
Parameter | Type | Description |
voiceId | String | The voice ID. |
onVoiceInterrupted
Indicates whether voice interruption is enabled for the current call.
@objc optional func onVoiceInterrupted(enable: Bool)
Parameters
Parameter | Type | Description |
enable | Bool | Indicates whether intelligent interruption is enabled for the current call. |
onAgentAvatarFirstFrameDrawn
The first video frame of the digital human agent is rendered.
@objc optional func onAgentAvatarFirstFrameDrawn()
onHumanTakeoverWillStart
A human agent is stepping in to take over from the current intelligent agent.
@objc optional func onHumanTakeoverWillStart(takeoverUid: String, takeoverMode: Int)
Parameters
Parameter | Type | Description |
takeoverUid | String | The UID of the human agent. |
takeoverMode | Int | The takeover mode. A value of 1 indicates that the voice of the human agent is used. A value of 0 indicates that the voice of the intelligent agent is used. |
onHumanTakeoverConnected
A human agent has taken over from the current intelligent agent.
@objc optional func onHumanTakeoverConnected(takeoverUid: String)
Parameters
Parameter | Type | Description |
takeoverUid | String | The UID of the human agent. |
onAgentEmotionNotify
The intelligent agent detects an emotion.
@objc optional func onAgentEmotionNotify(emotion: String, userAsrSentenceId: Int)
Parameters
Parameter | Type | Description |
emotion | String | The emotion tag. Valid values: neutral, happy, angry, and sad. |
userAsrSentenceId | Int | The ID of the sentence to which the question of the user recognized by the intelligent agent belongs. |
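Because the emotion tag arrives as a string, converting it to an enum keeps typos out of the UI code. The following sketch covers the four documented tags. AgentEmotion is hypothetical, and the fallback to neutral for unknown tags and the case-insensitive match are choices of this sketch, not documented SDK behavior.

```swift
/// Hypothetical enum over the four documented emotion tags.
enum AgentEmotion: String, CaseIterable {
    case neutral, happy, angry, sad

    /// Falls back to .neutral for a tag that this sketch does not know.
    init(tag: String) {
        self = AgentEmotion(rawValue: tag.lowercased()) ?? .neutral
    }
}
```

Inside onAgentEmotionNotify, you could switch on AgentEmotion(tag: emotion) to pick an avatar expression or an icon.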
ARTCAICallEngineFactory
createEngine
Creates a default engine instance.
public static func createEngine() -> ARTCAICallEngineInterface