
ApsaraVideo Live: Common audio operations and configurations

Last Updated: Dec 17, 2025

This topic describes common audio operations and configurations in the ARTC SDK.

Feature introduction

The ARTC SDK provides various audio configuration and operation features, such as setting audio encoding and scenario modes, managing local audio capture and playback, controlling remote audio playback, applying in-ear monitoring, and setting the audio route.

Sample code

Android: Android/ARTCExample/BasicUsage/src/main/java/com/aliyun/artc/api/basicusage/AudioBasicUsage/AudioBasicUsageActivity.java

iOS: iOS/ARTCExample/BasicUsage/AudioBasicUsage/AudioBasicUsageVC.swift

Before you begin

Make sure you meet the following requirements:

Implementation

1. Set audio encoding and scenario modes (before joining a channel)


The ARTC SDK provides the setAudioProfile API to set different audio encoding modes and scenario modes. This allows developers to fine-tune audio quality based on different scenarios and user needs.

Note
  • setAudioProfile can only be called before joining a channel and cannot be reset after joining.

  • We recommend using AliRtcEngineHighQualityMode (high-quality audio encoding mode) and AliRtcSceneMusicMode (music scenario).

1.1. Audio encoding modes (AliRtcAudioProfile)

Note
  • High-quality mode (AliRtcEngineHighQualityMode) is recommended.

  • If interoperability with the Web is required, choose a mode with a 48 kHz sample rate.

  • For stereo sound, set the mode to AliRtcEngineStereoHighQualityMode.

| Enumeration | Description | Sample rate | Channels | Max bitrate |
| --- | --- | --- | --- | --- |
| AliRtcEngineLowQualityMode | Low-quality audio mode | 8000 Hz | Mono | 12 kbps |
| AliRtcEngineBasicQualityMode | Standard-quality audio mode | 16000 Hz | Mono | 24 kbps |
| AliRtcEngineHighQualityMode | High-quality audio mode | 48000 Hz | Mono | 64 kbps |
| AliRtcEngineStereoHighQualityMode | Stereo high-quality audio mode | 48000 Hz | Stereo | 80 kbps |
| AliRtcEngineSuperHighQualityMode | Super-high-quality audio mode | 48000 Hz | Mono | 96 kbps |
| AliRtcEngineStereoSuperHighQualityMode | Stereo super-high-quality audio mode | 48000 Hz | Stereo | 128 kbps |
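As a sanity check on the table, the modes can be mirrored in plain Java. This enum is illustrative only and not part of the SDK; the names and numeric properties are taken from the table above:

```java
// Plain-Java mirror of the encoding-mode table (illustrative, not SDK code).
enum AudioProfile {
    LOW_QUALITY(8000, 1, 12),
    BASIC_QUALITY(16000, 1, 24),
    HIGH_QUALITY(48000, 1, 64),
    STEREO_HIGH_QUALITY(48000, 2, 80),
    SUPER_HIGH_QUALITY(48000, 1, 96),
    STEREO_SUPER_HIGH_QUALITY(48000, 2, 128);

    final int sampleRateHz;
    final int channels;
    final int maxBitrateKbps;

    AudioProfile(int sampleRateHz, int channels, int maxBitrateKbps) {
        this.sampleRateHz = sampleRateHz;
        this.channels = channels;
        this.maxBitrateKbps = maxBitrateKbps;
    }

    // Web interoperability requires a 48 kHz sample rate.
    boolean webCompatible() { return sampleRateHz == 48000; }

    // Stereo modes use two channels.
    boolean stereo() { return channels == 2; }
}
```

A table like this lets an app verify at a glance that, for example, any stereo or web-compatible requirement narrows the choice to the 48 kHz modes.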

1.2. Audio scenario modes (AliRtcAudioScenario)

| Enumeration | Description |
| --- | --- |
| AliRtcSceneDefaultMode | Uses the hardware 3A algorithm and allows audio capture from Bluetooth devices. Set this mode if you need Bluetooth capture. |
| AliRtcSceneMusicMode | (Recommended) Music scenario. Uses the software 3A algorithm and captures audio from the phone's microphone. |

1.3. Sample code

The following provides example settings for audio encoding and scenario modes for common scenarios.

For Bluetooth capture

Android

// Must be set to AliRtcSceneDefaultMode scenario
mAliRtcEngine.setAudioProfile(AliRtcEngineHighQualityMode, AliRtcSceneDefaultMode);

iOS

// Must be set to AliRtcSceneDefaultMode scenario.
engine.setAudioProfile(AliRtcAudioProfile.engineHighQualityMode, audio_scene: AliRtcAudioScenario.sceneDefaultMode)

For interoperability with Web

Android

// Must be set to an encoding mode with a 48k sample rate, such as AliRtcEngineHighQualityMode.
mAliRtcEngine.setAudioProfile(AliRtcEngineHighQualityMode, AliRtcSceneMusicMode);

iOS

// Must be set to an encoding mode with a 48k sample rate, such as AliRtcEngineHighQualityMode.
engine.setAudioProfile(AliRtcAudioProfile.engineHighQualityMode, audio_scene: AliRtcAudioScenario.sceneMusicMode)

For stereo sound

Android

// Set to a mode with Stereo, such as AliRtcEngineStereoHighQualityMode.
mAliRtcEngine.setAudioProfile(AliRtcEngineStereoHighQualityMode, AliRtcSceneMusicMode);

iOS

// Set to a mode with Stereo, such as AliRtcEngineStereoHighQualityMode.
engine.setAudioProfile(AliRtcAudioProfile.engineStereoHighQualityMode, audio_scene: AliRtcAudioScenario.sceneMusicMode)

2. Configure local audio capture

This section describes how to control local audio capture, such as muting the microphone or stopping microphone capture. The main APIs and their differences are as follows:

| API | muteLocalMic | stopAudioCapture / startAudioCapture |
| --- | --- | --- |
| How it works | Sends silent frames. | Stops or starts microphone capture. |
| When to call | Before or after joining a channel. | After joining a channel. |
| Releases microphone resources? | No | Yes |
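The distinction in the table can be reduced to a simple rule of thumb. This helper is an illustrative sketch, not an SDK API:

```java
// Illustrative decision helper (not SDK code): choose between the two
// approaches based on whether the microphone hardware must be released,
// for example to turn off the OS microphone-in-use indicator.
final class MicControl {
    enum Action { MUTE_LOCAL_MIC, STOP_AUDIO_CAPTURE }

    static Action choose(boolean mustReleaseMicrophone) {
        // Releasing the device requires stopAudioCapture; otherwise
        // muteLocalMic keeps capture running and resumption is instant.
        return mustReleaseMicrophone ? Action.STOP_AUDIO_CAPTURE
                                     : Action.MUTE_LOCAL_MIC;
    }
}
```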

2.1. Mute the microphone


ARTC provides the muteLocalMic API to mute the microphone and external audio input. This API can be called before or after joining a channel.

Note

Unlike stopAudioCapture, calling the muteLocalMic API does not release microphone resources. The microphone capture and encoding modules continue to run, but they send silent frames at a very low bitrate.

The supported AliRtcMuteLocalAudioMode modes are as follows:

| Mode | Description |
| --- | --- |
| AliRtcMuteAudioModeDefault | Default mode. Behaves the same as AliRtcMuteAllAudioMode. |
| AliRtcMuteAllAudioMode | Mutes all audio. Stops publishing audio from both microphone capture and external PCM input. |
| AliRtcMuteOnlyMicAudioMode | Mutes the microphone only. Stops publishing audio from microphone capture only. |

Sample code:

// Mute all
mAliRtcEngine.muteLocalMic(true, AliRtcEngine.AliRtcMuteLocalAudioMode.AliRtcMuteAllAudioMode);
// Unmute all
mAliRtcEngine.muteLocalMic(false, AliRtcEngine.AliRtcMuteLocalAudioMode.AliRtcMuteAllAudioMode);
// Mute only the microphone
mAliRtcEngine.muteLocalMic(true, AliRtcEngine.AliRtcMuteLocalAudioMode.AliRtcMuteOnlyMicAudioMode);

2.2. Stop/Resume microphone capture

Call sequence: the client creates the ARTC engine, calls joinChannel to join a channel and publish the audio stream, calls stopAudioCapture to disable microphone capture, and later calls startAudioCapture to resume it.

By default, the SDK enables microphone capture when joining a channel. To disable it, you can call stopAudioCapture, which releases microphone resources and stops audio capture. To resume microphone capture, call the startAudioCapture API.

// Stop microphone capture
mAliRtcEngine.stopAudioCapture();
// Resume microphone capture
mAliRtcEngine.startAudioCapture();

3. Configure remote audio playback

This section describes how to control the playback of remote users' audio.

3.1. Mute a remote user

ARTC provides the muteRemoteAudioPlaying API to stop or resume playback of a specific remote user's audio. The API is defined as follows:

public abstract int muteRemoteAudioPlaying(String uid, boolean mute);
  • Muting here does not affect audio stream pulling or decoding and can be set before or after joining a channel.

  • This only affects the playback of the remote user's audio on the local device and does not affect the remote user's capture.
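Because muting is per-user, an app typically keeps its own per-uid mute state so the UI stays in sync. The following bookkeeping class is illustrative, not SDK code; each returned value would be forwarded to muteRemoteAudioPlaying(uid, mute) on the real engine:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative per-user mute bookkeeping (not SDK code).
final class RemoteMuteState {
    private final Map<String, Boolean> muted = new HashMap<>();

    // Flips the state for uid and returns the new value, which the caller
    // would pass as the `mute` argument to muteRemoteAudioPlaying.
    boolean toggle(String uid) {
        boolean next = !muted.getOrDefault(uid, false);
        muted.put(uid, next);
        return next;
    }

    boolean isMuted(String uid) {
        return muted.getOrDefault(uid, false);
    }
}
```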

3.2. Set the playback volume for a specified remote user

ARTC provides the setPlayoutVolume API to control the local playback volume.

/**
 * @brief Sets the playback volume.
 * @param volume The playback volume. The value ranges from 0 to 400.
 * - 0: Mute
 * - <100: Decrease the volume.
 * - >100: Increase the volume.
 * @return
 * - 0: Success
 * - A non-zero value: Failure
 */
public abstract int setPlayoutVolume(int volume);

ARTC provides the setRemoteAudioVolume API to control the playback volume of a specific remote user. Passing a volume parameter of 0 is equivalent to calling the muteRemoteAudioPlaying API.

/**
 * @brief Adjusts the volume of a specified remote user's audio played on the local client.
 * @param uid The user ID. This is a unique identifier assigned by the app server.
 * @param volume The playback volume. The value ranges from 0 to 100. 0 means mute. 100 means the original volume.
 * @return
 * - 0: Success
 * - A non-zero value: Failure
 */
public abstract int setRemoteAudioVolume(String uid, int volume);
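Note that the two APIs accept different ranges (0-400 for setPlayoutVolume, 0-100 for setRemoteAudioVolume). Clamping app-supplied values before the call avoids failure returns; these helpers are illustrative and not part of the SDK:

```java
// Illustrative range clamps (not SDK code) for the documented volume ranges.
final class VolumeRanges {
    // setPlayoutVolume accepts 0-400; 100 keeps the original volume.
    static int clampPlayout(int volume) {
        return Math.max(0, Math.min(400, volume));
    }

    // setRemoteAudioVolume accepts 0-100; 0 mutes, 100 is the original volume.
    static int clampRemote(int volume) {
        return Math.max(0, Math.min(100, volume));
    }
}
```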

4. Use in-ear monitoring

In-ear monitoring allows you to listen to the sound captured by the microphone through your headphones.

4.1. Enable in-ear monitoring

You can call the enableEarBack API before or after joining a channel to enable the feature. To disable it, call enableEarBack again with the parameter set to false.

Note

Enable in-ear monitoring only when wearing headphones.

Android

rtcEngine.enableEarBack(true);

iOS

engine.enableEarBack(true)

4.2. Set the in-ear monitoring volume

Call the setEarBackVolume API to adjust the volume. The volume parameter represents the volume level, ranging from 0 to 100, where 0 is mute and 100 is the normal volume. The default value is 100.

Android

rtcEngine.setEarBackVolume(60);

iOS

engine.setEarBackVolume(60)

5. User volume and active speaker callbacks


ARTC provides callbacks for user volume and the current active speaker, allowing your application to be aware of users' speaking status in real time.

Note

This feature is disabled by default. You need to call the enableAudioVolumeIndication API to enable it. When enabled, the system periodically reports the real-time volume of each user and the current speaker at the set frequency, which developers can use for UI interactions.

5.1. Enable callbacks

Call enableAudioVolumeIndication to enable the feature and set parameters such as frequency and smoothing factor:

  • interval: The callback interval in milliseconds. A value of 300 to 500 ms is recommended. The minimum value is 10 ms. A negative value disables this feature.

  • smooth: The smoothing factor. A higher value results in more smoothing, while a lower value provides better real-time performance. A value of 3 is recommended. The range is [0-9].

  • reportVad: The active speaker detection switch. 0 means disabled, 1 means enabled.

Android

mAliRtcEngine.enableAudioVolumeIndication(500, 3, 1);

iOS

// User volume callback and active speaker detection
engine.enableAudioVolumeIndication(500, smooth: 3, reportVad: 1)
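For robustness, the parameter constraints listed above can be checked before the call. This validation is an illustrative sketch, not SDK code, and encodes the documented rules: interval is at least 10 ms (a negative value disables the feature), smooth is in [0, 9], and reportVad is 0 or 1:

```java
// Illustrative parameter validation (not SDK code) for
// enableAudioVolumeIndication, based on the documented constraints.
final class VolumeIndicationParams {
    static boolean valid(int intervalMs, int smooth, int reportVad) {
        // Negative interval means "disable the feature"; otherwise >= 10 ms.
        boolean intervalOk = intervalMs < 0 || intervalMs >= 10;
        boolean smoothOk = smooth >= 0 && smooth <= 9;
        boolean vadOk = reportVad == 0 || reportVad == 1;
        return intervalOk && smoothOk && vadOk;
    }
}
```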

5.2. Implement and register the relevant callbacks

Call the registerAudioVolumeObserver API to register the callbacks. The system then triggers them at the set interval:

  • The onAudioVolume callback provides periodic audio volume information to track the speaking intensity of each user. The system reports the volume levels of all detected users (both local and remote) at regular intervals. Developers can use this for UI feedback such as soundwave animations, volume indicators, or mute detection. When mUserId is "0", the entry represents the local capture volume; when it is "1", the entry represents the mixed volume of all remote users. Any other value is the volume of a specific user. totalVolume represents the overall mixed volume of all remote users.

  • onActiveSpeaker is the active speaker callback, triggered by Voice Activity Detection (VAD). When the system detects that a user has become the most active speaker (their speaking volume and duration exceed a threshold), this callback notifies the application. Developers can use this event to implement interactive experiences such as speaker focus, for example enlarging the speaker's video window in a conference scenario.

Android

private final AliRtcEngine.AliRtcAudioVolumeObserver mAliRtcAudioVolumeObserver = new AliRtcEngine.AliRtcAudioVolumeObserver() {
    // User volume callback
    @Override
    public void onAudioVolume(List<AliRtcEngine.AliRtcAudioVolume> speakers, int totalVolume){
        handler.post(() -> {
            if(!speakers.isEmpty()) {
                for(AliRtcEngine.AliRtcAudioVolume volume : speakers) {
                    if("0".equals(volume.mUserId)) {
                        // Volume of the current local user

                    } else if ("1".equals(volume.mUserId)) {
                        // Overall volume of remote users

                    } else {
                        // Volume of a remote user

                    }
                }
            }
        });
    }

    // Active speaker detection callback
    @Override
    public void onActiveSpeaker(String uid){
        // Active speaker
        handler.post(() -> {
            String mag = "onActiveSpeaker uid:" + uid;
            ToastHelper.showToast(AudioBasicUsageActivity.this, mag, Toast.LENGTH_SHORT);
        });
    }
};
// Register the callback
mAliRtcEngine.registerAudioVolumeObserver(mAliRtcAudioVolumeObserver);

iOS

Note

On iOS, you do not need to call an API to register callbacks. You only need to implement the relevant callback methods:

  • onAudioVolumeCallback

  • onActiveSpeaker

func onAudioVolumeCallback(_ array: [AliRtcUserVolumeInfo]?, totalVolume: Int32) {
    // User volume callback
    "onAudioVolumeCallback, totalVolume: \(totalVolume)".printLog()
}

func onActiveSpeaker(_ uid: String) {
    // Active speaker callback
    "onActiveSpeaker, uid: \(uid)".printLog()
}
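The mUserId convention described above ("0" for local capture, "1" for the remote mix, anything else for a specific user) can be captured in a small classifier. This is an illustrative helper, not SDK code:

```java
// Illustrative classifier (not SDK code) for the mUserId convention in
// onAudioVolume: "0" = local capture, "1" = remote mix, other = a user.
final class VolumeSource {
    enum Kind { LOCAL, REMOTE_MIX, REMOTE_USER }

    static Kind classify(String userId) {
        if ("0".equals(userId)) return Kind.LOCAL;
        if ("1".equals(userId)) return Kind.REMOTE_MIX;
        return Kind.REMOTE_USER;
    }
}
```

Branching on the classifier keeps the callback body free of magic strings.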

6. Set audio routing

Audio routing is important for audio device management. It determines and manages the audio device used for sound playback during a call. The main device types are as follows:

  • Built-in playback devices: These usually include the speaker and the earpiece.

    • When the audio is routed to the speaker, the sound is played at a high volume. You can hear it without holding the phone to your ear. This provides a hands-free experience.

    • When the audio is routed to the earpiece, the sound is played at a low volume. You must hold the phone close to your ear to hear clearly. This provides better privacy and is suitable for phone calls.

  • External devices: These include external audio devices such as wired headphones and Bluetooth headsets, along with professional audio interfaces such as external sound cards.

The SDK has a predefined priority for audio routes and automatically switches routes based on the connection status of peripherals.
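As a minimal sketch of route selection: the SDK's actual priority order is internal, so the ordering assumed here (wired headset over Bluetooth over the built-in default route) is only an assumption made to show the shape of the logic:

```java
// Minimal sketch (not SDK code). Assumes wired headset > Bluetooth > default
// built-in route; the SDK's real priority order is internal to the SDK.
final class AudioRoutePicker {
    enum Route { WIRED_HEADSET, BLUETOOTH, SPEAKER, EARPIECE }

    static Route pick(boolean wiredConnected, boolean bluetoothConnected,
                      boolean defaultToSpeaker) {
        if (wiredConnected) return Route.WIRED_HEADSET;
        if (bluetoothConnected) return Route.BLUETOOTH;
        // No external device: fall back to the configured default route.
        return defaultToSpeaker ? Route.SPEAKER : Route.EARPIECE;
    }
}
```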

6.1. Default audio route

The default audio route is used to set the default audio playback device (either earpiece or speaker) before joining a channel. If not set, the speaker is used by default.

Note
  • When other peripherals such as Bluetooth or wired headsets are disconnected, the device set by this function is used for playback.

  • When no external device is connected and the user has not set the current device, the SDK's default setting is used. The SDK defaults to the speaker. To change this default, call setDefaultAudioRoutetoSpeakerphone.

/**
 * @brief Sets whether the default audio output is the speaker. The default is the speaker.
 * @param defaultToSpeakerphone
 * - true: Speaker (default)
 * - false: Earpiece
 * @return
 * - 0: Success
 * - <0: Failure
 */
public int setDefaultAudioRoutetoSpeakerphone(boolean defaultToSpeakerphone);

6.2. Current audio route

The current audio route is used to set the current playback device (either earpiece or speaker) during a call. If not set, the device specified by the default audio route is used.

Note

This method has no effect when a wired or Bluetooth headset is connected.

When no external device is connected, call enableSpeakerphone to set whether to use the speaker. Setting it to false uses the earpiece. To check if the current audio device is the speaker or earpiece, call isSpeakerOn.

/**
 * @brief Sets the audio output to the earpiece or speaker.
 * @param enable   true: Speaker (default); false: Earpiece
 * @return
 * - 0: Success
 * - <0: Failure
 */
public int enableSpeakerphone(boolean enable);
/**
 * @brief Gets whether the current audio output is the earpiece or speaker.
 * @return
 * - true: Speaker
 * - false: Earpiece
 */
public boolean isSpeakerOn();

6.3. Audio route device change callback

To receive callbacks when the audio playback device changes, you need to register and listen for the following callback.

public abstract class AliRtcEngineEventListener {
    /**
     * @brief Warning notification.
     * @details If a warning occurs in the engine, the app is notified through this callback.
     * @param warn The warning type.
     * @param message The warning message.
     */
    public void onOccurWarning(int warn, String message);
}

The table below shows the mapping between the warn return value and the device type:

| Return value | Device |
| --- | --- |
| 1 | Wired headset with microphone |
| 2 | Earpiece |
| 3 | Wired headset without microphone |
| 4 | Speaker |
| 6 | SCO Bluetooth device |
| 7 | A2DP Bluetooth device |
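For logging inside onOccurWarning, the table can be turned into a lookup. This mapping is illustrative, not SDK code, and only reproduces the values listed above:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative mapping (not SDK code) of the warn values in the table above
// to readable device names, e.g. for logging inside onOccurWarning.
final class RouteWarnCodes {
    private static final Map<Integer, String> DEVICES = new HashMap<>();
    static {
        DEVICES.put(1, "Wired headset with microphone");
        DEVICES.put(2, "Earpiece");
        DEVICES.put(3, "Wired headset without microphone");
        DEVICES.put(4, "Speaker");
        DEVICES.put(6, "SCO Bluetooth device");
        DEVICES.put(7, "A2DP Bluetooth device");
    }

    static String deviceName(int warn) {
        return DEVICES.getOrDefault(warn, "Unknown device (" + warn + ")");
    }
}
```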