This topic describes how to integrate an audio/video agent into your Android application using the AICallKit software development kit (SDK).
Prerequisites
Android Studio plugin version 4.1.3
Gradle 7.0.2
JDK 11, which comes with Android Studio
Flowchart
Your app obtains an RTC token from your AppServer and then calls the call(token) method to start a call. During the call, you can use AICallKit APIs to implement interactive features such as live subtitles and interrupting the AI agent. AICallKit depends on real-time audio and video capabilities, so the AICallKit SDK already integrates the features of ApsaraVideo Real-Time Communication (ARTC). If your business scenario also requires live streaming and VOD capabilities, consider using ApsaraVideo MediaBox SDK. For more information, see Select and download SDKs.
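The sequence described above can be sketched in plain Java. This is a minimal sketch under assumptions: TokenSource, CallStarter, and CallFlow are hypothetical names, not AICallKit types. In a real app, TokenSource would wrap the request to your AppServer and CallStarter would wrap the engine call shown in Step 5.

```java
// Hypothetical interface: stands in for the request that asks your AppServer for an RTC token.
interface TokenSource {
    String fetchRtcToken(String userId) throws Exception;
}

// Hypothetical interface: stands in for the engine method that starts the call.
interface CallStarter {
    void call(String token);
}

final class CallFlow {
    private CallFlow() {
    }

    // Obtains a token for the user, then starts the call with it.
    // Returns false without starting a call if no usable token was obtained.
    static boolean start(String userId, TokenSource tokens, CallStarter engine) {
        String token;
        try {
            token = tokens.fetchRtcToken(userId);
        } catch (Exception e) {
            return false;
        }
        if (token == null || token.isEmpty()) {
            return false;
        }
        engine.call(token);
        return true;
    }
}
```

On Android, the token request would have to run off the main thread, so a helper like this would typically be invoked from a background executor.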
Integrate the SDK
Add the Alibaba Cloud Maven repository to the project-level build.gradle file.
allprojects {
    repositories {
        google()
        jcenter()
        maven { url 'https://maven.aliyun.com/repository/google' }
        maven { url 'https://maven.aliyun.com/repository/public' }
    }
}
In the corresponding build.gradle file, add the ARTCAICallKit dependency.
dependencies {
    // Replace x.x.x with the version that is compatible with your project.
    implementation 'com.aliyun.aio:AliVCSDK_ARTC:x.x.x'
    implementation 'com.aliyun.auikits.android:ARTCAICallKit:x.x.x'
    implementation 'com.alivc.live.component:PluginAEC:2.0.0'
}
Note
Latest ARTC SDK version: 7.9.1.
Latest AICallKit SDK version: 2.9.1.
SDK development guide
Step 1: Request audio and video permissions for the app
Check whether the microphone and camera permissions have been granted, and prompt the user for authorization if they have not. Your application must implement this logic itself. For sample code, see PermissionUtils.java.
PermissionX.init(this)
        .permissions(PermissionUtils.getPermissions())
        .request((allGranted, grantedList, deniedList) -> {
            // Continue with the call setup only if allGranted is true.
        });
Step 2: Create and initialize the engine
You can create and initialize the ARTCAICallEngine. The following code provides an example:
String userId = "123"; // Use the user ID from your app's logon system.
ARTCAICallEngineImpl engine = new ARTCAICallEngineImpl(this, userId);

// The view IDs below (R.id.avatar_layer, R.id.preview_layer) are placeholders.
// Replace them with the containers defined in your own layout.
if (aiAgentType == AvatarAgent) {
    // If the agent is a digital human, configure the view container for the digital human.
    ViewGroup avatarLayer = findViewById(R.id.avatar_layer);
    engine.setAgentView(
            avatarLayer,
            new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                    ViewGroup.LayoutParams.MATCH_PARENT)
    );
} else if (aiAgentType == VisionAgent) {
    // If the agent is a visual understanding agent, configure the view container for the local video preview.
    ViewGroup previewLayer = findViewById(R.id.preview_layer);
    engine.setLocalView(previewLayer,
            new FrameLayout.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                    ViewGroup.LayoutParams.MATCH_PARENT)
    );
} else if (aiAgentType == VideoAgent) {
    // If the agent is a video call agent, configure both the agent view and the local video preview.
    ARTCAICallEngine.ARTCAICallVideoCanvas remoteCanvas = new ARTCAICallEngine.ARTCAICallVideoCanvas();
    remoteCanvas.zOrderOnTop = false;
    remoteCanvas.zOrderMediaOverlay = false;
    ViewGroup avatarLayer = findViewById(R.id.avatar_layer);
    engine.setAgentView(
            avatarLayer,
            new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                    ViewGroup.LayoutParams.MATCH_PARENT), remoteCanvas
    );
    ViewGroup previewLayer = findViewById(R.id.preview_layer);
    engine.setLocalView(previewLayer,
            new FrameLayout.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                    ViewGroup.LayoutParams.MATCH_PARENT)
    );
}
Step 3: Implement callback methods
You can implement engine callbacks as needed. For more information about the engine callback API operations, see API operation details.
protected ARTCAICallEngine.IARTCAICallEngineCallback mCallEngineCallback = new ARTCAICallEngine.IARTCAICallEngineCallback() {
    @Override
    public void onErrorOccurs(ARTCAICallEngine.AICallErrorCode errorCode) {
        // An error occurred. End the call.
        engine.handup();
    }

    @Override
    public void onCallBegin() {
        // The call starts (user joins the session).
    }

    @Override
    public void onCallEnd() {
        // The call ends (user leaves the session).
    }

    @Override
    public void onAICallEngineRobotStateChanged(ARTCAICallEngine.ARTCAICallRobotState oldRobotState, ARTCAICallEngine.ARTCAICallRobotState newRobotState) {
        // Agent state synchronization.
    }

    @Override
    public void onUserSpeaking(boolean isSpeaking) {
        // Callback for when the user is speaking.
    }

    @Override
    public void onUserAsrSubtitleNotify(String text, boolean isSentenceEnd, int sentenceId, VoicePrintStatusCode voicePrintStatusCode) {
        // Sync the text recognized from the user's speech (ASR).
    }

    @Override
    public void onAIAgentSubtitleNotify(String text, boolean end, int userAsrSentenceId) {
        // Sync the agent's response.
    }

    @Override
    public void onNetworkStatusChanged(String uid, ARTCAICallEngine.ARTCAICallNetworkQuality quality) {
        // Callback for network status.
    }

    @Override
    public void onVoiceVolumeChanged(String uid, int volume) {
        // Volume change.
    }

    @Override
    public void onVoiceIdChanged(String voiceId) {
        // The voice timbre for the current call has changed.
    }

    @Override
    public void onVoiceInterrupted(boolean enable) {
        // The voice interruption setting for the current call has changed.
    }

    @Override
    public void onAgentVideoAvailable(boolean available) {
        // Whether the agent's video is available (stream ingest).
    }

    @Override
    public void onAgentAudioAvailable(boolean available) {
        // Whether the agent's audio is available (stream ingest).
    }

    @Override
    public void onAgentAvatarFirstFrameDrawn() {
        // The first video frame of the digital human is rendered.
    }

    @Override
    public void onUserOnLine(String uid) {
        // Callback for when a user comes online.
    }
};
engine.setEngineCallback(mCallEngineCallback);
Step 4: Create and initialize ARTCAICallConfig
For more information about ARTCAICallConfig, see ARTCAICallConfig.
ARTCAICallEngine.ARTCAICallConfig artcaiCallConfig = new ARTCAICallEngine.ARTCAICallConfig();
artcaiCallConfig.agentId = "XXX"; // The agent ID. This parameter is required.
artcaiCallConfig.region = "cn-shanghai"; // The agent region. This parameter is required.
artcaiCallConfig.agentType = VoiceAgent; // The agent type: voice-only, digital human, visual understanding, or video call.
engine.init(artcaiCallConfig);
Region name | Region ID
China (Hangzhou) | cn-hangzhou
China (Shanghai) | cn-shanghai
China (Beijing) | cn-beijing
China (Shenzhen) | cn-shenzhen
Singapore | ap-southeast-1
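Because region is a required field, it can help to check the value before it is passed to init(). The following is a minimal sketch, not part of the SDK: the AgentRegions class and its method names are hypothetical, and its entries are copied from the table above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of AICallKit: maps the region names
// from the table above to their region IDs.
final class AgentRegions {
    private static final Map<String, String> REGION_IDS = new LinkedHashMap<>();

    static {
        REGION_IDS.put("China (Hangzhou)", "cn-hangzhou");
        REGION_IDS.put("China (Shanghai)", "cn-shanghai");
        REGION_IDS.put("China (Beijing)", "cn-beijing");
        REGION_IDS.put("China (Shenzhen)", "cn-shenzhen");
        REGION_IDS.put("Singapore", "ap-southeast-1");
    }

    private AgentRegions() {
    }

    // Returns the region ID for a region name, or null if the name is not in the table.
    static String idForName(String regionName) {
        return REGION_IDS.get(regionName);
    }

    // Returns true if the given string is one of the region IDs listed in the table.
    static boolean isKnownRegionId(String regionId) {
        return REGION_IDS.containsValue(regionId);
    }
}
```

With such a helper, the configuration could use AgentRegions.idForName("China (Shanghai)") instead of a hand-typed string, and reject values for which isKnownRegionId returns false.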
Step 5: Initiate a call to the agent
Call the call() method with an ARTC authentication token to initiate a call to the agent. To obtain the token, see Generate an ARTC authentication token.
engine.call(token);
// After the call is connected, the following callback is triggered.
public void onCallBegin() {
    // The call starts (user joins the session).
}
Step 6: Implement in-call features
After the call starts, you can process captions, interrupt the agent, and perform other actions as needed. For more information, see Implement features.
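As one illustration of caption handling, the updates delivered to onUserAsrSubtitleNotify in Step 3 could be collected per sentence before they are shown in the UI. The following is a minimal sketch: SubtitleBuffer is a hypothetical class, not part of the SDK, and it assumes that each update carries the full text recognized so far for its sentence ID. If updates turn out to be incremental fragments, the text would need to be appended instead of replaced.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of AICallKit: collects live subtitle updates per sentence.
// Assumption: each update carries the full text recognized so far for that sentence ID.
final class SubtitleBuffer {
    private final Map<Integer, String> pending = new LinkedHashMap<>();
    private final List<String> finished = new ArrayList<>();

    // Intended to be called with the text, isSentenceEnd, and sentenceId
    // arguments received in onUserAsrSubtitleNotify.
    void onUpdate(String text, boolean isSentenceEnd, int sentenceId) {
        if (isSentenceEnd) {
            pending.remove(sentenceId);
            finished.add(text);
        } else {
            pending.put(sentenceId, text);
        }
    }

    // Returns the text of a sentence that is still in progress, or "" if there is none.
    String pendingText(int sentenceId) {
        String text = pending.get(sentenceId);
        return text == null ? "" : text;
    }

    // Returns a copy of the completed sentences in the order they ended.
    List<String> finishedSentences() {
        return new ArrayList<>(finished);
    }
}
```

The engine callbacks may not arrive on the UI thread, so a real implementation would likely post the buffered text to the main thread before updating any views.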
Step 7: End the call
Call the handup() method to end the call with the agent.
engine.handup();
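Because handup() can be reached from more than one place, for example from a hang-up button and from the onErrorOccurs callback in Step 3, it can be useful to make sure the call is ended only once. The following is a minimal sketch: HangUpGuard is a hypothetical class, not part of the SDK, and the Runnable stands in for the call to engine.handup(). Whether the SDK itself tolerates repeated handup() calls is not stated in this topic.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not part of AICallKit: runs the hang-up action at most once per call.
final class HangUpGuard {
    private final AtomicBoolean ended = new AtomicBoolean(false);
    private final Runnable hangUpAction;

    // hangUpAction would typically wrap engine.handup().
    HangUpGuard(Runnable hangUpAction) {
        this.hangUpAction = hangUpAction;
    }

    // Returns true if this invocation ended the call, false if it had already been ended.
    boolean hangUpOnce() {
        if (ended.compareAndSet(false, true)) {
            hangUpAction.run();
            return true;
        }
        return false;
    }
}
```

A guard could be created with new HangUpGuard(() -> engine.handup()) when the call starts, and hangUpOnce() could then be used wherever the call may need to end.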