This topic describes common audio operations and configurations in the ARTC SDK.
Feature introduction
The ARTC SDK provides various audio configuration and operation features, such as setting audio encoding and scenario modes, managing local audio capture and playback, controlling remote audio playback, applying in-ear monitoring, and setting the audio route.
Sample code
Android: Android/ARTCExample/BasicUsage/src/main/java/com/aliyun/artc/api/basicusage/AudioBasicUsage/AudioBasicUsageActivity.java
iOS: iOS/ARTCExample/BasicUsage/AudioBasicUsage/AudioBasicUsageVC.swift
Before you begin
Make sure you meet the following requirements:
Create an ARTC application and obtain the AppID and AppKey from the ApsaraVideo Live console.
Integrate the ARTC SDK into your project and implement basic audio and video call features.
Implementation
1. Set audio encoding and scenario modes (before joining a channel)
The ARTC SDK provides the setAudioProfile API to set different audio encoding modes and scenario modes. This allows developers to fine-tune audio quality based on different scenarios and user needs.
setAudioProfile can only be called before joining a channel and cannot be reset after joining. We recommend AliRtcEngineHighQualityMode (high-quality audio encoding mode) together with AliRtcSceneMusicMode (music scenario).
1.1. Audio encoding modes (AliRtcAudioProfile)
High-quality mode (AliRtcEngineHighQualityMode) is recommended. If interoperability with the Web is required, choose a mode with a 48 kHz sample rate.
For stereo sound, set the mode to AliRtcEngineStereoHighQualityMode.
Enumeration | Description | Sample rate | Channels | Max bitrate |
AliRtcEngineLowQualityMode | Low-quality audio mode | 8000 Hz | Mono | 12 kbps |
AliRtcEngineBasicQualityMode | Standard-quality audio mode | 16000 Hz | Mono | 24 kbps |
AliRtcEngineHighQualityMode | High-quality audio mode | 48000 Hz | Mono | 64 kbps |
AliRtcEngineStereoHighQualityMode | Stereo high-quality audio mode | 48000 Hz | Stereo | 80 kbps |
AliRtcEngineSuperHighQualityMode | Super-high-quality audio mode | 48000 Hz | Mono | 96 kbps |
AliRtcEngineStereoSuperHighQualityMode | Stereo super-high-quality audio mode | 48000 Hz | Stereo | 128 kbps |
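For quick reference in application code, the table above can be mirrored as a small enum. This is an illustrative helper, not an SDK type; the enum names, fields, and methods below are our own.

```java
public class AudioProfiles {
    /** Illustrative mirror of the AliRtcAudioProfile table above; not an SDK type. */
    public enum Spec {
        LOW_QUALITY(8000, 1, 12),
        BASIC_QUALITY(16000, 1, 24),
        HIGH_QUALITY(48000, 1, 64),
        STEREO_HIGH_QUALITY(48000, 2, 80),
        SUPER_HIGH_QUALITY(48000, 1, 96),
        STEREO_SUPER_HIGH_QUALITY(48000, 2, 128);

        public final int sampleRateHz;
        public final int channels;
        public final int maxBitrateKbps;

        Spec(int sampleRateHz, int channels, int maxBitrateKbps) {
            this.sampleRateHz = sampleRateHz;
            this.channels = channels;
            this.maxBitrateKbps = maxBitrateKbps;
        }

        /** Web interoperability requires a 48 kHz sample rate. */
        public boolean webCompatible() { return sampleRateHz == 48000; }

        public boolean stereo() { return channels == 2; }
    }
}
```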
1.2. Audio scenario modes (AliRtcAudioScenario)
Enumeration | Description |
AliRtcSceneDefaultMode | Uses hardware 3A algorithm and allows audio capture from Bluetooth devices. Set to this mode if you need Bluetooth capture. |
AliRtcSceneMusicMode | (Recommended) Music scenario. Uses software 3A algorithm and captures audio from the mobile phone. |
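Taken together, the recommendations in the two tables reduce to a small decision rule: stereo output needs a stereo profile, and Bluetooth capture needs the default scenario. The helper below is hypothetical (it returns the SDK enum names as plain strings) and simply encodes that guidance.

```java
public class AudioConfigChooser {
    /** Hypothetical helper encoding the guidance above; returns SDK enum names as strings. */
    public static String[] choose(boolean needBluetoothCapture, boolean needStereo) {
        // Both recommended profiles use a 48 kHz sample rate, so they remain Web-compatible.
        String profile = needStereo ? "AliRtcEngineStereoHighQualityMode"
                                    : "AliRtcEngineHighQualityMode";
        // Bluetooth capture requires the default scenario (hardware 3A);
        // otherwise the recommended music scenario is used.
        String scenario = needBluetoothCapture ? "AliRtcSceneDefaultMode"
                                               : "AliRtcSceneMusicMode";
        return new String[] { profile, scenario };
    }
}
```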
1.3. Sample code
The following provides example settings for audio encoding and scenario modes for common scenarios.
For Bluetooth capture
Android
// Must be set to AliRtcSceneDefaultMode scenario
mAliRtcEngine.setAudioProfile(AliRtcEngineHighQualityMode, AliRtcSceneDefaultMode);
iOS
// Must be set to AliRtcSceneDefaultMode scenario.
engine.setAudioProfile(AliRtcAudioProfile.engineHighQualityMode, audio_scene: AliRtcAudioScenario.sceneDefaultMode)
For interoperability with Web
Android
// Must be set to an encoding mode with a 48k sample rate, such as AliRtcEngineHighQualityMode.
mAliRtcEngine.setAudioProfile(AliRtcEngineHighQualityMode, AliRtcSceneMusicMode);
iOS
// Must be set to an encoding mode with a 48k sample rate, such as AliRtcEngineHighQualityMode.
engine.setAudioProfile(AliRtcAudioProfile.engineHighQualityMode, audio_scene: AliRtcAudioScenario.sceneMusicMode)
For stereo sound
Android
// Set to a mode with Stereo, such as AliRtcEngineStereoHighQualityMode.
mAliRtcEngine.setAudioProfile(AliRtcEngineStereoHighQualityMode, AliRtcSceneMusicMode);
iOS
// Set to a mode with Stereo, such as AliRtcEngineStereoHighQualityMode.
engine.setAudioProfile(AliRtcAudioProfile.engineStereoHighQualityMode, audio_scene: AliRtcAudioScenario.sceneMusicMode)
2. Configure local audio capture
This section describes how to control local audio capture, such as muting the microphone or stopping microphone capture. The main APIs and their differences are as follows:
API | muteLocalMic | stopAudioCapture/startAudioCapture |
How it works | Sends silent frames. | Stops/starts microphone capture. |
When to call | Can be called before or after joining a channel. | Called after joining a channel. |
Releases microphone resources? | No | Yes |
2.1. Mute the microphone
ARTC provides the muteLocalMic API to mute the microphone and external audio input. This API can be called before or after joining a channel.
Unlike stopAudioCapture, calling the muteLocalMic API does not release microphone resources. The microphone capture and encoding modules continue to run, but they send silent frames at a very low bitrate.
The supported AliRtcMuteLocalAudioMode modes are as follows:
Enumeration | Description |
AliRtcMuteAudioModeDefault | Default mode. Behaves the same as AliRtcMuteAllAudioMode. |
AliRtcMuteAllAudioMode | Mutes all audio. Stops publishing audio from both microphone capture and external PCM input. |
AliRtcMuteOnlyMicAudioMode | Mutes microphone only. Stops publishing audio from microphone capture only. |
Sample code:
// Mute all
mAliRtcEngine.muteLocalMic(true, AliRtcEngine.AliRtcMuteLocalAudioMode.AliRtcMuteAllAudioMode);
// Unmute all
mAliRtcEngine.muteLocalMic(false, AliRtcEngine.AliRtcMuteLocalAudioMode.AliRtcMuteAllAudioMode);
// Mute only the microphone
mAliRtcEngine.muteLocalMic(true, AliRtcEngine.AliRtcMuteLocalAudioMode.AliRtcMuteOnlyMicAudioMode);
2.2. Stop/Resume microphone capture
@startuml
autonumber
actor "Developer Client" as userA #cyan
participant "ARTC SDK" as artcsdk #orange
userA -> artcsdk: Create the ARTC engine
userA -> artcsdk: Call joinChannel to join a channel and push the audio stream
userA -[#red]> artcsdk: <color:#red>Call stopAudioCapture to disable microphone capture
userA -[#red]> artcsdk: <color:#red>Call startAudioCapture to resume microphone capture
@enduml
By default, the SDK enables microphone capture when joining a channel. To disable it, call stopAudioCapture, which stops audio capture and releases microphone resources. To resume microphone capture, call startAudioCapture.
// Stop microphone capture
mAliRtcEngine.stopAudioCapture();
// Resume microphone capture
mAliRtcEngine.startAudioCapture();
3. Configure remote audio playback
This section describes how to control the playback of remote users' audio.
3.1. Mute a remote user
ARTC provides the muteRemoteAudioPlaying API to stop or resume playback of a specific remote user's audio. The API is defined as follows:
public abstract int muteRemoteAudioPlaying(String uid, boolean mute);
Muting does not affect audio stream pulling or decoding, and it can be set before or after joining a channel.
This only affects the playback of the remote user's audio on the local device and does not affect the remote user's capture.
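Because muteRemoteAudioPlaying is a per-user switch, apps typically keep their own record of which remote users are muted locally. The sketch below shows one way to do that; RemoteMuteTracker and its Engine interface are illustrative wrappers (not SDK types), with the SDK call reduced to a single-method interface so the bookkeeping logic stays testable.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative per-user mute bookkeeping around muteRemoteAudioPlaying; not an SDK type. */
public class RemoteMuteTracker {
    /** Minimal stand-in for the single SDK call this sketch needs. */
    public interface Engine {
        int muteRemoteAudioPlaying(String uid, boolean mute); // 0 means success
    }

    private final Engine engine;
    private final Set<String> muted = new HashSet<>();

    public RemoteMuteTracker(Engine engine) { this.engine = engine; }

    /** Toggles local playback of one remote user; returns the new muted state. */
    public boolean toggle(String uid) {
        boolean mute = !muted.contains(uid);
        if (engine.muteRemoteAudioPlaying(uid, mute) == 0) { // only record on success
            if (mute) muted.add(uid); else muted.remove(uid);
        }
        return muted.contains(uid);
    }

    public boolean isMuted(String uid) { return muted.contains(uid); }
}
```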
3.2. Set the playback volume for a specified remote user
ARTC provides the setPlayoutVolume API to control the local playback volume.
/**
* @brief Sets the playback volume.
* @param volume The playback volume. The value ranges from 0 to 400.
* - 0: Mute
* - <100: Decrease the volume.
* - >100: Increase the volume.
* @return
* - 0: Success
* - A non-zero value: Failure
*/
public abstract int setPlayoutVolume(int volume);
ARTC provides the setRemoteAudioVolume API to control the playback volume of a specific remote user. Passing a volume of 0 is equivalent to calling the muteRemoteAudioPlaying API.
/**
* @brief Adjusts the volume of a specified remote user's audio played on the local client.
* @param uid The user ID. This is a unique identifier assigned by the app server.
* @param volume The playback volume. The value ranges from 0 to 100. 0 means mute. 100 means the original volume.
* @return
* - 0: Success
* - A non-zero value: Failure
*/
public abstract int setRemoteAudioVolume(String uid, int volume);
4. Use in-ear monitoring
In-ear monitoring allows you to listen to the sound captured by the microphone through your headphones.
4.1. Enable in-ear monitoring
You can call the enableEarBack API before or after joining a channel to enable the feature. To disable it, call enableEarBack again with the parameter set to false.
Enable in-ear monitoring only when wearing headphones.
Android
rtcEngine.enableEarBack(true);
iOS
engine.enableEarBack(true)
4.2. Set the in-ear monitoring volume
Call the setEarBackVolume API to adjust the volume. The volume parameter represents the volume level, ranging from 0 to 100, where 0 is mute and 100 is the normal volume. The default value is 100.
Android
rtcEngine.setEarBackVolume(60);
iOS
engine.setEarBackVolume(60)
5. User volume and active speaker callbacks
ARTC provides callbacks for user volume and the current active speaker, allowing your application to be aware of users' speaking status in real time.
This feature is disabled by default. You need to call the enableAudioVolumeIndication API to enable it. When enabled, the system periodically reports the real-time volume of each user and the current speaker at the set frequency, which developers can use for UI interactions.
5.1. Enable callbacks
Call enableAudioVolumeIndication to enable the feature and set parameters such as frequency and smoothing factor:
interval: The callback interval in milliseconds. A value of 300 to 500 ms is recommended. The minimum value is 10 ms. A negative value disables this feature.
smooth: The smoothing factor. A higher value results in more smoothing, while a lower value provides better real-time performance. A value of 3 is recommended. The range is [0, 9].
reportVad: The active speaker detection switch. 0 means disabled, 1 means enabled.
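As a sanity check before calling enableAudioVolumeIndication, the documented parameter ranges can be validated up front. The helper below is hypothetical, not part of the SDK; it only encodes the ranges stated above.

```java
public class VolumeIndicationParams {
    /** Hypothetical pre-check mirroring the documented ranges; the SDK call itself is unchanged. */
    public static boolean isValid(int intervalMs, int smooth, int reportVad) {
        // A negative interval disables the feature; otherwise the minimum is 10 ms.
        boolean intervalOk = intervalMs < 0 || intervalMs >= 10;
        boolean smoothOk = smooth >= 0 && smooth <= 9;        // smoothing factor range [0, 9]
        boolean vadOk = reportVad == 0 || reportVad == 1;     // 0 = disabled, 1 = enabled
        return intervalOk && smoothOk && vadOk;
    }
}
```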
Android
mAliRtcEngine.enableAudioVolumeIndication(500, 3, 1);
iOS
// User volume callback and active speaker detection
engine.enableAudioVolumeIndication(500, smooth: 3, reportVad: 1)
5.2. Implement and register the relevant callbacks
Call the registerAudioVolumeObserver API to register the callbacks. The system then triggers them at the set interval:
The onAudioVolume callback provides periodic audio volume information so you can track the speaking intensity of each user. The system reports the volume levels of all detected users (both local and remote) at regular intervals. Developers can use this for UI feedback such as soundwave animations, volume indicators, or mute detection. When mUserId is 0, the entry represents the local capture volume; when it is 1, it represents the mixed volume of all remote users. Other values identify a specific user. totalVolume represents the overall mixed volume of all remote users.
The onActiveSpeaker callback reports the active speaker and is triggered by Voice Activity Detection (VAD). When the system detects that a user has become the most active speaker (their speaking volume and duration exceed a threshold), this callback notifies the application. Developers can use this event to implement interactive experiences such as speaker focus, where the speaker's video window is enlarged in a conference scenario.
Android
private final AliRtcEngine.AliRtcAudioVolumeObserver mAliRtcAudioVolumeObserver = new AliRtcEngine.AliRtcAudioVolumeObserver() {
// User volume callback
@Override
public void onAudioVolume(List<AliRtcEngine.AliRtcAudioVolume> speakers, int totalVolume){
handler.post(() -> {
if(!speakers.isEmpty()) {
for(AliRtcEngine.AliRtcAudioVolume volume : speakers) {
if("0".equals(volume.mUserId)) {
// Volume of the current local user
} else if ("1".equals(volume.mUserId)) {
// Overall volume of remote users
} else {
// Volume of a remote user
}
}
}
});
}
// Active speaker detection callback
@Override
public void onActiveSpeaker(String uid){
// Active speaker
handler.post(() -> {
String msg = "onActiveSpeaker uid:" + uid;
ToastHelper.showToast(AudioBasicUsageActivity.this, msg, Toast.LENGTH_SHORT);
});
}
};
// Register the callback
mAliRtcEngine.registerAudioVolumeObserver(mAliRtcAudioVolumeObserver);
iOS
On iOS, you do not need to call an API to register callbacks. You only need to implement the relevant callback methods:
onAudioVolumeCallback
onActiveSpeaker
func onAudioVolumeCallback(_ array: [AliRtcUserVolumeInfo]?, totalVolume: Int32) {
// User volume callback
"onAudioVolumeCallback, totalVolume: \(totalVolume)".printLog()
}
func onActiveSpeaker(_ uid: String) {
// Active speaker callback
"onActiveSpeaker, uid: \(uid)".printLog()
}
6. Set audio routing
Audio routing is important for audio device management. It determines and manages the audio device used for sound playback during a call. The main device types are as follows:
Built-in playback devices: These usually include the speaker and the earpiece.
When the audio is routed to the speaker, the sound is played at a high volume. You can hear it without holding the phone to your ear. This provides a hands-free experience.
When the audio is routed to the earpiece, the sound is played at a low volume. You must hold the phone close to your ear to hear clearly. This provides better privacy and is suitable for phone calls.
External devices: These include external audio devices such as wired headphones and Bluetooth headsets, along with professional audio interfaces such as external sound cards.
The SDK has a predefined priority for audio routes and automatically switches routes based on the connection status of peripherals.
6.1. Default audio route
The default audio route is used to set the default audio playback device (either earpiece or speaker) before joining a channel. If not set, the speaker is used by default.
When other peripherals such as Bluetooth or wired headsets are disconnected, the device set by this function is used for playback.
When no external device is connected and the user has not set the current device, the SDK's default setting is used. The SDK defaults to the speaker. To change this default, call setDefaultAudioRoutetoSpeakerphone.
/**
* @brief Sets whether the default audio output is the speaker. The default is the speaker.
* @param defaultToSpeakerphone
* - true: Speaker (default)
* - false: Earpiece
* @return
* - 0: Success
* - <0: Failure
*/
public int setDefaultAudioRoutetoSpeakerphone(boolean defaultToSpeakerphone);
6.2. Current audio route
The current audio route is used to set the current playback device (either earpiece or speaker) during a call. If not set, the device specified by the default audio route is used.
This method has no effect when a wired or Bluetooth headset is connected.
When no external device is connected, call enableSpeakerphone to set whether to use the speaker. Setting it to false uses the earpiece. To check if the current audio device is the speaker or earpiece, call isSpeakerOn.
/**
* @brief Sets the audio output to the earpiece or speaker.
* @param enable true: Speaker (default); false: Earpiece
* @return
* - 0: Success
* - <0: Failure
*/
public int enableSpeakerphone(boolean enable);
/**
* @brief Gets whether the current audio output is the earpiece or speaker.
* @return
* - true: Speaker
* - false: Earpiece
*/
public boolean isSpeakerOn();
6.3. Audio route device change callback
To receive callbacks when the audio playback device changes, you need to register and listen for the following callback.
public abstract class AliRtcEngineEventListener {
/**
* @brief Warning notification.
* @details If a warning occurs in the engine, the app is notified through this callback.
* @param warn The warning type.
* @param message The warning message.
*/
public void onOccurWarning(int warn, String message);
}
The table below shows the mapping between the warn return value and the device type:
Return value | Device |
1 | Wired headset with microphone |
2 | Earpiece |
3 | Wired headset without microphone |
4 | Speaker |
6 | SCO Bluetooth device |
7 | A2DP Bluetooth device |
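When handling onOccurWarning, the table above can be turned into a simple lookup. The class below is illustrative, not part of the SDK; the integer keys are the documented warn values, not SDK constants.

```java
import java.util.Map;

/** Illustrative lookup mirroring the warn-value table above; not an SDK type. */
public class AudioRouteWarn {
    private static final Map<Integer, String> DEVICES = Map.of(
        1, "Wired headset with microphone",
        2, "Earpiece",
        3, "Wired headset without microphone",
        4, "Speaker",
        6, "SCO Bluetooth device",
        7, "A2DP Bluetooth device");

    /** Returns the device name for a documented warn value, or "Unknown". */
    public static String deviceFor(int warn) {
        return DEVICES.getOrDefault(warn, "Unknown");
    }
}
```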