This topic describes how to integrate an audio and video agent into your Android application using the AICallKit software development kit (SDK).
Environment requirements
Android Studio plug-in version 4.1.3
Gradle 7.0.2
JDK 11, which is included with Android Studio
Flowchart
Your app obtains an RTC token from your AppServer and then passes it to the call() method to start a call. During the call, you can call AICallKit APIs to implement interactive features such as live subtitles and interruptions for the AI agent.
AICallKit depends on real-time audio and video capabilities, so the features of ApsaraVideo Real-Time Communication (ARTC) are integrated into the AICallKit SDK. If your business scenario also requires live streaming or VOD capabilities, consider using the ApsaraVideo MediaBox SDK. For more information, see Select and download SDKs.
Integrate the SDK
Add the Alibaba Cloud Maven repository to the project-level build.gradle file.
allprojects {
    repositories {
        google()
        jcenter()
        maven { url 'https://maven.aliyun.com/repository/google' }
        maven { url 'https://maven.aliyun.com/repository/public' }
    }
}
Add the ARTCAICallKit dependency to the corresponding build.gradle file.
dependencies {
    // Replace the version number with the one compatible with your project.
    implementation 'com.aliyun.aio:AliVCSDK_ARTC:7.5.0'
    implementation 'com.aliyun.auikits.android:ARTCAICallKit:2.8.0'
}
Note: For the latest compatible version number of the ApsaraVideo Real-Time Communication (ARTC) SDK, see Download and integrate the SDK.
SDK development guide
Step 1: Request audio and video permissions for the app
Your application must check for microphone and camera permissions. If the permissions are not granted, a dialog box must appear to prompt the user for authorization. For sample code, see PermissionUtils.java.
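The check described above amounts to comparing the permissions the app needs with those already granted. The following is a minimal sketch in plain Java, not part of AICallKit: PermissionCheck and missing are hypothetical names, and the granted set is assumed to come from your own queries to the Android permission APIs. Only the two permission strings are standard Android constants.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of AICallKit.
final class PermissionCheck {
    // Standard Android permission names for microphone and camera access.
    static final List<String> REQUIRED = Collections.unmodifiableList(Arrays.asList(
            "android.permission.RECORD_AUDIO",
            "android.permission.CAMERA"));

    private PermissionCheck() {}

    // Returns the required permissions that are absent from the granted set,
    // in the same order as REQUIRED.
    static List<String> missing(Set<String> granted) {
        List<String> result = new ArrayList<>();
        for (String permission : REQUIRED) {
            if (!granted.contains(permission)) {
                result.add(permission);
            }
        }
        return result;
    }
}
```

An empty result means the app can proceed to engine setup. Otherwise, the returned entries are the ones to request from the user.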
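PermissionX.init(this)
    .permissions(PermissionUtils.getPermissions())
    .request((allGranted, grantedList, deniedList) -> {
        // allGranted is true only if the user granted every requested permission.
        // Continue with engine setup if it is true. Otherwise, handle the
        // permissions listed in deniedList, for example by explaining why they are needed.
    });
Step 2: Create and initialize the engine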
Create and initialize the ARTCAICallEngine instance. The following code provides an example:
String userId = "123"; // Use the ID of the user who has logged on to your app as the userId.
ARTCAICallEngineImpl engine = new ARTCAICallEngineImpl(this, userId);
// If the agent is a digital human, configure the view container for displaying the digital human.
if (aiAgentType == AvatarAgent) {
    ViewGroup avatarlayer; // Assign the container from your own layout before use.
    engine.setAgentView(
        avatarlayer,
        new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
            ViewGroup.LayoutParams.MATCH_PARENT)
    );
}
// If the agent is for visual understanding, configure the view container for displaying the local video preview.
else if (aiAgentType == VisionAgent) {
    ViewGroup previewLayer; // Assign the container from your own layout before use.
    engine.setLocalView(
        previewLayer,
        new FrameLayout.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
            ViewGroup.LayoutParams.MATCH_PARENT)
    );
}
// If the agent is for video calls, configure both the agent view and the local video preview.
else if (aiAgentType == VideoAgent) {
    ARTCAICallEngine.ARTCAICallVideoCanvas remoteCanvas = new ARTCAICallEngine.ARTCAICallVideoCanvas();
    remoteCanvas.zOrderOnTop = false;
    remoteCanvas.zOrderMediaOverlay = false;
    ViewGroup avatarlayer; // Assign the container from your own layout before use.
    engine.setAgentView(
        avatarlayer,
        new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
            ViewGroup.LayoutParams.MATCH_PARENT),
        remoteCanvas
    );
    ViewGroup previewLayer; // Assign the container from your own layout before use.
    engine.setLocalView(
        previewLayer,
        new FrameLayout.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
            ViewGroup.LayoutParams.MATCH_PARENT)
    );
}
Step 3: Implement callback methods
Implement the necessary engine callback methods. For more information about the engine callback API operations, see API reference.
protected ARTCAICallEngine.IARTCAICallEngineCallback mCallEngineCallback = new ARTCAICallEngine.IARTCAICallEngineCallback() {
    @Override
    public void onErrorOccurs(ARTCAICallEngine.AICallErrorCode errorCode) {
        // An error occurred. End the call.
        engine.handup();
    }

    @Override
    public void onCallBegin() {
        // The call starts (the user joins the session).
    }

    @Override
    public void onCallEnd() {
        // The call ends (the user leaves the session).
    }

    @Override
    public void onAICallEngineRobotStateChanged(ARTCAICallEngine.ARTCAICallRobotState oldRobotState, ARTCAICallEngine.ARTCAICallRobotState newRobotState) {
        // The agent (robot) state has changed from oldRobotState to newRobotState.
    }

    @Override
    public void onUserSpeaking(boolean isSpeaking) {
        // Whether the user is currently speaking.
    }

    @Override
    public void onUserAsrSubtitleNotify(String text, boolean isSentenceEnd, int sentenceId, VoicePrintStatusCode voicePrintStatusCode) {
        // Synchronize the text recognized from the user's speech (ASR).
    }

    @Override
    public void onAIAgentSubtitleNotify(String text, boolean end, int userAsrSentenceId) {
        // Synchronize the text of the agent's response.
    }

    @Override
    public void onNetworkStatusChanged(String uid, ARTCAICallEngine.ARTCAICallNetworkQuality quality) {
        // The network quality has changed.
    }

    @Override
    public void onVoiceVolumeChanged(String uid, int volume) {
        // The volume has changed.
    }

    @Override
    public void onVoiceIdChanged(String voiceId) {
        // The voice (timbre) used in the current call has changed.
    }

    @Override
    public void onVoiceInterrupted(boolean enable) {
        // The voice interruption setting for the current call has changed.
    }

    @Override
    public void onAgentVideoAvailable(boolean available) {
        // Whether the agent's video is available (the agent is publishing a video stream).
    }

    @Override
    public void onAgentAudioAvailable(boolean available) {
        // Whether the agent's audio is available (the agent is publishing an audio stream).
    }

    @Override
    public void onAgentAvatarFirstFrameDrawn() {
        // The first video frame of the digital human has been rendered.
    }

    @Override
    public void onUserOnLine(String uid) {
        // The user is online.
    }
};
engine.setEngineCallback(mCallEngineCallback);
Step 4: Create and initialize ARTCAICallConfig
For more information about ARTCAICallConfig, see ARTCAICallConfig.
ARTCAICallEngine.ARTCAICallConfig artcaiCallConfig = new ARTCAICallEngine.ARTCAICallConfig();
artcaiCallConfig.agentId = "XXX"; // The agent ID. This parameter is required.
artcaiCallConfig.region = "cn-shanghai"; // The region ID of the agent. This parameter is required.
artcaiCallConfig.agentType = VoiceAgent; // The agent type: voice-only, digital human, visual understanding, or video call.
engine.init(artcaiCallConfig);
The following table lists region names and the corresponding region IDs.
Region name | Region ID |
China (Hangzhou) | cn-hangzhou |
China (Shanghai) | cn-shanghai |
China (Beijing) | cn-beijing |
China (Shenzhen) | cn-shenzhen |
Singapore | ap-southeast-1 |
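Because agentId and region are both required, it can help to validate them before the configuration is passed to init. The sketch below is an illustration in plain Java under stated assumptions: AgentConfigCheck and validate are hypothetical names rather than AICallKit APIs, and the set of region IDs simply mirrors the table above, so it may need updating if more regions become available.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper for illustration; not part of AICallKit.
final class AgentConfigCheck {
    // Region IDs copied from the table in this topic (assumed complete for this sketch).
    static final Set<String> KNOWN_REGIONS = Collections.unmodifiableSet(
            new LinkedHashSet<>(Arrays.asList(
                    "cn-hangzhou", "cn-shanghai", "cn-beijing", "cn-shenzhen", "ap-southeast-1")));

    private AgentConfigCheck() {}

    // Returns an error message if a required value is missing or the region is
    // not in KNOWN_REGIONS; returns an empty Optional if both values look valid.
    static Optional<String> validate(String agentId, String region) {
        if (agentId == null || agentId.trim().isEmpty()) {
            return Optional.of("agentId is required");
        }
        if (region == null || region.trim().isEmpty()) {
            return Optional.of("region is required");
        }
        if (!KNOWN_REGIONS.contains(region)) {
            return Optional.of("unknown region: " + region);
        }
        return Optional.empty();
    }
}
```

Running a check like this before engine.init lets the app report a configuration mistake locally instead of discovering it when the call fails.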
Step 5: Initiate a call to the agent
Call the call() method with an authentication token to initiate a call to the agent. To obtain a token, see Generate an ARTC authentication token.
engine.call(token);
// After the call is connected, the following callback is triggered.
public void onCallBegin() {
    // The call starts (the user joins the session).
}
Step 6: Implement in-call features
After the call starts, you can handle captions, interrupt the agent's speech, and perform other actions as needed. For more information, see Implement features.
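As one example of handling captions, the updates delivered to onUserAsrSubtitleNotify can be collected into displayable lines. The following is a minimal sketch in plain Java, not an AICallKit API: SubtitleBuffer and its methods are hypothetical names, and it assumes that each update carries the full text of the sentence so far and that isSentenceEnd marks the final update for a given sentenceId. Verify both assumptions against the API reference before relying on them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of AICallKit.
final class SubtitleBuffer {
    // Latest text per sentence ID, kept in the order sentences were first seen.
    private final Map<Integer, String> sentences = new LinkedHashMap<>();
    // IDs of sentences that have received their final update.
    private final Set<Integer> finished = new HashSet<>();

    // Intended to be called from onUserAsrSubtitleNotify.
    // Assumption: text is the full sentence so far, not an incremental fragment.
    void onSubtitle(String text, boolean isSentenceEnd, int sentenceId) {
        if (finished.contains(sentenceId)) {
            return; // Ignore late updates for a sentence that already ended.
        }
        sentences.put(sentenceId, text);
        if (isSentenceEnd) {
            finished.add(sentenceId);
        }
    }

    // Returns a snapshot of the current subtitle lines, one per sentence.
    List<String> lines() {
        return new ArrayList<>(sentences.values());
    }

    // Returns true if the sentence has received its final update.
    boolean isFinished(int sentenceId) {
        return finished.contains(sentenceId);
    }
}
```

This sketch is not thread-safe. If the callbacks and the UI run on different threads, hand the updates over to the UI thread or add synchronization.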
Step 7: End the call
Call the handup() method, spelled as in the sample code, to end the call with the agent.
engine.handup();
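Steps 5 through 7 imply a simple call lifecycle: the app starts a call, the call begins, and the call ends. A small state tracker can guard against starting a second call or ending a call that is not active. The sketch below is plain Java under stated assumptions: CallLifecycle and its method names are hypothetical and not part of AICallKit, and the comments indicate where each method would be invoked from the calls and callbacks shown in this topic.

```java
// Hypothetical helper for illustration; not part of AICallKit.
final class CallLifecycle {
    enum State { IDLE, CALLING, IN_CALL, ENDED }

    private State state = State.IDLE;

    State state() {
        return state;
    }

    // Invoke right before engine.call(token).
    // Returns false if a call is already being set up or is in progress.
    boolean markCalling() {
        if (isActive()) {
            return false;
        }
        state = State.CALLING;
        return true;
    }

    // Invoke from onCallBegin. Has an effect only while a call is being set up.
    void markBegun() {
        if (state == State.CALLING) {
            state = State.IN_CALL;
        }
    }

    // Invoke from onCallEnd, or after ending the call in onErrorOccurs.
    void markEnded() {
        if (state != State.IDLE) {
            state = State.ENDED;
        }
    }

    // True while there is a call to end, that is, between call() and the end of the call.
    boolean isActive() {
        return state == State.CALLING || state == State.IN_CALL;
    }
}
```

With such a tracker, the app would end the call only when isActive() returns true and start a new call only when markCalling() returns true.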