
Intelligent Media Services: Initiate a call using a server-side API

Last Updated: Dec 15, 2025

This topic describes how to initiate an AI agent call using a server-side API.

Feature introduction

When your business requires real-time monitoring or recording of every call, you can use the server-side GenerateAIAgentCall API to initiate calls. This API must be called from your server, which then passes the result to the client. The client uses this information to join the call.

Workflow

(Workflow diagram)

After your server starts the AI agent, the client application can call call() to join the call. During the call, you can use the AICallKit APIs to implement interactive features such as live subtitles and interruption. AICallKit depends on ARTC capabilities and includes the functionality of the AliVCSDK_ARTC SDK. If your business also requires live streaming or VOD capabilities, you can use a composite SDK such as AliVCSDK_Standard or AliVCSDK_InteractiveLive. For specific recommendations, see Select and download SDKs.

Server-side process

The server provides the endpoint generateAIAgentCall to start the call. Its core function is to receive a request from the client, call the GenerateAIAgentCall API, and then pass the result back to the client.

// Example of calling GenerateAIAgentCall
// This file is auto-generated, don't edit it. Thanks.
package com.aliyun.rtc;

import com.alibaba.fastjson.JSON;
import com.aliyun.ice20201109.models.GenerateAIAgentCallResponse;
import com.aliyun.tea.*;

public class SampleGenerateaiagentcall {

    /**
     * <b>description</b> :
     * <p>Initialize the client using an AccessKey pair.</p>
     * @return Client
     *
     * @throws Exception
     */
    public static com.aliyun.ice20201109.Client createClient() throws Exception {
        // Leaking your source code may expose your AccessKey pair and compromise the security of all your resources.
        // The following code is for reference only. We recommend using a more secure method like STS. For more information, see https://help.alibabacloud.com/document_detail/378657.html.
        com.aliyun.teaopenapi.models.Config config = new com.aliyun.teaopenapi.models.Config()
                // Required. Make sure the ALIBABA_CLOUD_ACCESS_KEY_ID environment variable is set.
                .setAccessKeyId(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"))
                // Required. Make sure the ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variable is set.
                .setAccessKeySecret(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
        // The service endpoint. Refer to https://api.alibabacloud.com/product/ICE.
        config.endpoint = "ice.cn-shanghai.aliyuncs.com";
        return new com.aliyun.ice20201109.Client(config);
    }



    public static void main(String[] args) throws Exception {
        generateAIAgentCall();
    }


    public static void generateAIAgentCall() throws Exception {
        com.aliyun.ice20201109.Client client = createClient();
        com.aliyun.ice20201109.models.AIAgentTemplateConfig.AIAgentTemplateConfigVoiceChat AIAgentTemplateConfigVoiceChat = new com.aliyun.ice20201109.models.AIAgentTemplateConfig.AIAgentTemplateConfigVoiceChat()
                .setGreeting("Hello!");

        com.aliyun.ice20201109.models.AIAgentTemplateConfig AIAgentTemplateConfig = new com.aliyun.ice20201109.models.AIAgentTemplateConfig()
                .setVoiceChat(AIAgentTemplateConfigVoiceChat);

        com.aliyun.ice20201109.models.GenerateAIAgentCallRequest generateAIAgentCallRequest = new com.aliyun.ice20201109.models.GenerateAIAgentCallRequest()
                .setAIAgentId("YOUR_AIAGENTID")
                .setTemplateConfig(AIAgentTemplateConfig);
        com.aliyun.teautil.models.RuntimeOptions runtime = new com.aliyun.teautil.models.RuntimeOptions();
        try {
            // In a real application, you would handle the API's return value.
            GenerateAIAgentCallResponse resp = client.generateAIAgentCallWithOptions(generateAIAgentCallRequest, runtime);
            System.out.println(JSON.toJSONString(resp));
        } catch (TeaException error) {
            // This is for demonstration only. In production, handle exceptions carefully.
            // Error message
            System.out.println(error.getMessage());
            // Diagnostic address
            System.out.println(error.getData().get("Recommend"));
            com.aliyun.teautil.Common.assertAsString(error.message);
        } catch (Exception _error) {
            TeaException error = new TeaException(_error.getMessage(), _error);
            // This is for demonstration only. In production, handle exceptions carefully.
            // Error message
            System.out.println(error.getMessage());
            // Diagnostic address
            System.out.println(error.getData().get("Recommend"));
            com.aliyun.teautil.Common.assertAsString(error.message);
        }
    }
}
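
For completeness, the following sketch wraps the call above in an HTTP endpoint that returns the result to the client. This is only a minimal illustration: Spring Boot, the /generateAIAgentCall path, the aiAgentId request parameter, and the AIAgentCallController class name are assumptions for demonstration purposes, so adapt them to your own app server framework and request format.

// A minimal endpoint sketch. Spring Boot and the request shape are assumptions; adapt them to your app server.
package com.aliyun.rtc;

import com.alibaba.fastjson.JSON;
import com.aliyun.ice20201109.models.GenerateAIAgentCallRequest;
import com.aliyun.ice20201109.models.GenerateAIAgentCallResponse;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AIAgentCallController {

    @PostMapping("/generateAIAgentCall")
    public String generateAIAgentCall(@RequestParam String aiAgentId) throws Exception {
        // Reuse the client initialization from the sample above.
        com.aliyun.ice20201109.Client client = SampleGenerateaiagentcall.createClient();
        GenerateAIAgentCallRequest request = new GenerateAIAgentCallRequest()
                .setAIAgentId(aiAgentId);
        GenerateAIAgentCallResponse resp = client.generateAIAgentCallWithOptions(
                request, new com.aliyun.teautil.models.RuntimeOptions());
        // Return the response body (instance ID, channel ID, user ID, and RTC token) to the client as JSON.
        return JSON.toJSONString(resp.getBody());
    }
}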

Client-side process

Step 1: Create and initialize the call engine

The following code shows how to create and initialize the call engine using the AICallKit SDK:

Android
ARTCAICallEngine mARTCAICallEngine = null;

// Create an engine instance.
void initEngine(Context context, String userId) {
    // Initialize the engine
    // context -> Android Context
    // userId -> The user ID for joining the RTC channel
    mARTCAICallEngine = new ARTCAICallEngineImpl(context, userId);

    // Specify the agent type: Voice, Avatar, Vision, or Video.
    ARTCAICallEngine.ARTCAICallAgentType aiAgentType = VoiceAgent;
    mARTCAICallEngine.setAICallAgentType(aiAgentType);

    // For Avatar agents, configure the view container for the avatar.
    if (aiAgentType == AvatarAgent) {
        ViewGroup avatarlayer;
        mARTCAICallEngine.setAgentView(
                avatarlayer,
                new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                        ViewGroup.LayoutParams.MATCH_PARENT));
    }
    // For Vision agents, configure the view container for the local video preview.
    else if (aiAgentType == VisionAgent) {
        ViewGroup previewLayer;
        mARTCAICallEngine.setLocalView(previewLayer,
                new FrameLayout.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                        ViewGroup.LayoutParams.MATCH_PARENT));
    } else if (aiAgentType == VideoAgent) {
        ARTCAICallEngine.ARTCAICallVideoCanvas remoteCanvas = new ARTCAICallEngine.ARTCAICallVideoCanvas();
        remoteCanvas.zOrderOnTop = false;
        remoteCanvas.zOrderMediaOverlay = false;
        ViewGroup avatarlayer;
        mARTCAICallEngine.setAgentView(
                avatarlayer,
                new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                        ViewGroup.LayoutParams.MATCH_PARENT),
                remoteCanvas);

        ViewGroup previewLayer;
        mARTCAICallEngine.setLocalView(previewLayer,
                new FrameLayout.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                        ViewGroup.LayoutParams.MATCH_PARENT));
    }
}

// Set callbacks.
void initCallback() {
    mARTCAICallEngine.setEngineCallback(mCallEngineCallbackWrapper);
}

// Handle callbacks (example of core callbacks only).
ARTCAICallEngine.IARTCAICallEngineCallback mCallEngineCallbackWrapper = new ARTCAICallEngine.IARTCAICallEngineCallback() {
    @Override
    public void onErrorOccurs(ARTCAICallEngine.AICallErrorCode errorCode) {
        // An error occurred. End the call.
        mARTCAICallEngine.handup();
    }

    @Override
    public void onCallBegin() {
        // Call has started (joined the channel)
    }

    @Override
    public void onCallEnd() {
        // Call has ended (left the channel)
    }

    @Override
    public void onAICallEngineRobotStateChanged(ARTCAICallEngine.ARTCAICallRobotState oldRobotState,
            ARTCAICallEngine.ARTCAICallRobotState newRobotState) {
        // Agent state changed
    }

    @Override
    public void onUserSpeaking(boolean isSpeaking) {
        // User speaking status changed
    }

    @Override
    public void onUserAsrSubtitleNotify(String text, boolean isSentenceEnd, int sentenceId) {
        // ASR result for the user's speech
    }

    @Override
    public void onAIAgentSubtitleNotify(String text, boolean end, int userAsrSentenceId) {
        // Subtitle for the agent's response
    }

    @Override
    public void onNetworkStatusChanged(String uid, ARTCAICallEngine.ARTCAICallNetworkQuality quality) {
        // Network status changed
    }

    @Override
    public void onVoiceVolumeChanged(String uid, int volume) {
        // Voice volume changed
    }

    @Override
    public void onVoiceIdChanged(String voiceId) {
        // The voice ID for the current call has changed
    }

    @Override
    public void onVoiceInterrupted(boolean enable) {
        // Voice interruption setting for the current call has changed
    }

    @Override
    public void onAgentVideoAvailable(boolean available) {
        // Agent's video stream availability changed
    }

    @Override
    public void onAgentAudioAvailable(boolean available) {
        // Agent's audio stream availability changed
    }

    @Override
    public void onAgentAvatarFirstFrameDrawn() {
        // First frame of the avatar's video has been rendered
    }

    @Override
    public void onUserOnLine(String uid) {
        // User came online
    }

};
iOS
// Create an engine instance
let engine = ARTCAICallEngineFactory.createEngine()
let agentType: ARTCAICallAgentType

// Initialize the engine instance
public func setup() {
    // Set the delegate
    self.engine.delegate = self

    // For Avatar agents, configure the view for the avatar
    if self.agentType == .AvatarAgent {
        let agentViewConfig = ARTCAICallViewConfig(view: self.avatarAgentView)
        self.engine.setAgentViewConfig(viewConfig: agentViewConfig)
    }
    // For Vision agents, configure the view for the local camera preview
    else if self.agentType == .VisionAgent {
        let cameraViewConfig = ARTCAICallViewConfig(view: self.cameraView)
        self.engine.setLocalViewConfig(viewConfig: cameraViewConfig)
    }
    // For Video agents, configure both the avatar view and the local camera preview
    else if self.agentType == .VideoAgent {
        let agentViewConfig = ARTCAICallViewConfig(view: self.avatarAgentView)
        self.engine.setAgentViewConfig(viewConfig: agentViewConfig)

        let cameraViewConfig = ARTCAICallViewConfig(view: self.cameraView)
        self.engine.setLocalViewConfig(viewConfig: cameraViewConfig)
    }
}

// Handle callbacks (example of core callbacks only)
public func onErrorOccurs(code: ARTCAICallErrorCode) {
    // An error occurred
    self.engine.handup()
}

public func onCallBegin() {
    // Call has started
}

public func onCallEnd() {
    // Call has ended
}

public func onAgentStateChanged(state: ARTCAICallAgentState) {
    // Agent state changed
}

public func onUserSubtitleNotify(text: String, isSentenceEnd: Bool, sentenceId: Int) {
    // Notification with ASR result of user's speech
}

public func onVoiceAgentSubtitleNotify(text: String, isSentenceEnd: Bool, userAsrSentenceId: Int) {
    // Notification with agent's response subtitle
}

public func onVoiceIdChanged(voiceId: String) {
    // The voice ID for the current call has changed
}

public func onVoiceInterrupted(enable: Bool) {
    // Voice interruption setting for the current call has changed
}
Web
// Import the SDK
import AICallEngine, { AICallErrorCode, AICallAgentState, AICallAgentType } from 'aliyun-auikit-aicall';

// Create an engine instance
const engine = new AICallEngine();

// For other method calls, refer to the API reference.

// Handle callbacks (example of core callbacks only)
engine.on('errorOccurred', (code) => {
  // An error occurred
  engine.handup();
});

engine.on('callBegin', () => {
  // Call has started
});

engine.on('callEnd', () => {
  // Call has ended
});

engine.on('agentStateChanged', (state) => {
  // Agent state changed
});

engine.on('userSubtitleNotify', (subtitle) => {
  // Notification with ASR result of user's speech
});

engine.on('agentSubtitleNotify', (subtitle) => {
  // Notification with agent's response subtitle
});

engine.on('voiceIdChanged', (voiceId) => {
  // The voice ID for the current call has changed
});

engine.on('voiceInterruptChanged', (enable) => {
  // Voice interruption setting for the current call has changed
});

// Initialize the engine. agentType specifies the agent type; a voice agent is used here as an example. For other types, see AICallAgentType.
const agentType = AICallAgentType.VoiceAgent;
await engine.init(agentType);

Step 2: Start a call

The following code example shows how to start a call using the AICallKit SDK:

Android
// Start a call after an agent is started
void call() {
    // Configure the startup parameters for the engine
    ARTCAICallEngine.ARTCAICallConfig artcaiCallConfig = new ARTCAICallEngine.ARTCAICallConfig();
    // Specify a temporary agent ID, if required
    artcaiCallConfig.agentId = aiAgentId;
    artcaiCallConfig.region = "cn-shanghai"; // The agent region. Required.
    artcaiCallConfig.agentType = VoiceAgent; // The agent type, such as a voice, avatar, or vision agent.
    mARTCAICallEngine.init(artcaiCallConfig);

    // Obtain aIAgentInstanceId, rtcAuthToken, aIAgentUserId, and channelId from the server
    String aIAgentInstanceId = "XXX";
    String rtcAuthToken = "XXX";
    String aIAgentUserId = "XXX";
    String channelId = "XXX";
    mARTCAICallEngine.call(rtcAuthToken, aIAgentInstanceId, 
                           aIAgentUserId, channelId);
}
iOS
public func call() {
    // Obtain agent_instance_id, rtc_auth_token, ai_agent_user_id, and channel_id from the server
    let agentInfo = ARTCAICallAgentInfo(agentType: self.agentType, channelId: channel_id, uid: ai_agent_user_id, instanceId: agent_instance_id)
    self.engine.call(userId: self.userId, token: rtc_auth_token, agentInfo: agentInfo) { [weak self] error in
        if let error = error {
            // Handle the error
        }
        else {
            // The call is successful
        }
    }
}
Web
const userId = 'xxx';  // The ID of the user who initiates the call

// Obtain the agent information from the server
// You need to deploy your own App Server and call GenerateAIAgentCall to generate a call instance
// The return value is agentInfo
// agentInfo.instanceId: the ID of the agent instance
// agentInfo.channelId: the ID of the ARTC channel
// agentInfo.userId: the ID of the ARTC user
// agentInfo.rtcToken: the ARTC token
// agentInfo.reqId: the request ID
const agentInfo = await fetchAgentInfo();
// You can perform this initialization together with fetchAgentInfo. agentType indicates the agent type. For more information, see AICallAgentType.
await engine.init(agentType);

try {
  // Start the call after the agent is started
  await engine.call(userId, agentInfo);
} catch (error) {
  // Handle the error, for example, prompt the user that the call failed to start
}

Step 3: End a call

The following code example shows how to end a call using the AICallKit SDK:

Android
// End the call
void handup() {
    mARTCAICallEngine.handup();
}
iOS
public func handup() {
    // End the call
    self.engine.handup()
}
Web
// End the call
engine.handup();

Reference

After starting a call and before hanging up, you can handle features such as subtitles and agent interruption based on your business needs. For more information, see Basic features.
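
For example, the following minimal Android sketch handles agent interruption through the call engine. The interruptSpeaking() and enableVoiceInterrupt() method names are assumptions inferred from the callbacks shown above; verify them against the AICallKit API reference before use.

// A minimal sketch (Android). The method names below are assumptions; check the AICallKit API reference.
void interruptAgent() {
    // Manually interrupt the agent while it is speaking.
    mARTCAICallEngine.interruptSpeaking();
}

void setVoiceInterrupt(boolean enable) {
    // Enable or disable intelligent voice interruption for the current call.
    // The change is reported through the onVoiceInterrupted callback.
    mARTCAICallEngine.enableVoiceInterrupt(enable);
}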