ApsaraVideo Live: Overview

Last Updated: Jun 13, 2025

Traditional live streaming focuses on one-way content delivery, resulting in low audience engagement and conversion rates. ApsaraVideo Real-time Communication (ARTC) transforms passive viewers into active participants through interactive features, such as voice chat, video co-streaming, and game interaction.

Architecture

Based on advanced technical architecture and algorithms, ARTC provides developers and enterprises with efficient, stable, and easy-to-use SDKs and APIs. It supports seamless integration across platforms, including iOS, Android, Web, and Windows. Additionally, you can combine ARTC with other Alibaba Cloud services to meet real-time communication needs in diverse business scenarios.

Scenarios

Voice chat

ARTC supports up to 50 simultaneous speakers with latency of less than 300 ms. Available audio features include voice changer, reverberation, and voice enhancement. For content compliance, ARTC provides content moderation services and supports integration with third-party moderation. For scenario-based use cases, see Voice chat room, Online karaoke, and One-on-one audio and video calls.

Co-streaming

ARTC supports collaborative live streaming: viewers can chat with streamers alongside live content, and streamers from different rooms can engage in popularity-based battles. End-to-end latency is less than 300 ms, so viewers can join and leave the live stream seamlessly. ARTC also supports both standard streaming and Real-Time Streaming (RTS), enabling more than 100,000 concurrent viewers. For more information, see Co-streaming.

Real-time Conversational AI

Real-time Conversational AI helps enterprises quickly build applications for audio and video interactions between AI agents and end users. You can build a dedicated agent within 10 minutes by following the instructions on the GUI. The agent can interact with end users in real time over the Global Realtime Transport Network (GRTN). For more information, see What is AI real-time interaction.

Voice call

Avatar call

Vision call

Features

| Feature | Description | Scenarios | Billing |
| --- | --- | --- | --- |
| Video call | Supports one-on-one or group video calls at resolutions from 480P to 1080P. | Personal calls, conferences, video customer service | Billing of ARTC |
| Voice call | Supports one-on-one or group audio calls. | Personal calls, group chat, voice chat | Billing of ARTC |
| Video interaction | Supports multi-person video interaction at resolutions from 480P to 1080P, with end-to-end latency of less than 300 ms. | RTS, cross-channel streamer battle | Billing of ARTC |
| Voice interaction | Supports high-quality audio at a 48 kHz sample rate. | Voice chat room, online karaoke, co-streaming | Billing of ARTC |
| Recording | Records audio and video streams and stores them in Object Storage Service (OSS) or ApsaraVideo VOD. | Archiving, compliance review | Billing of live stream recording |
| Transcoding | Transcodes streams so that audio and video content can be transmitted and played smoothly across platforms without compromising quality. | Recording format conversion | Billing of live stream transcoding |
| Stream mixing and relay | Mixes multiple streams into a single stream based on specific rules. The mixed stream can then be relayed to ApsaraVideo Live or a third-party platform. | Multi-view live streaming, large-scale multi-party conferences, multi-teacher collaborative teaching | Billing of stream relay |
| Audio moderation | Reviews audio content by using the audio moderation capability provided by Alibaba Cloud or a third party. | Business security checks, content compliance | Billing of automated review |
| Video moderation | Reviews video content by using the video moderation capability provided by Alibaba Cloud or a third party. | Business security checks, content compliance | Billing of automated review |
| Face retouching | Provides multiple retouching effects. | Video calls, interactive streaming, online classes | Billing of Queen SDK |
| Reverberation | Supports various reverberation effects, such as hallway, church, recording studio, basement, and concert hall. | Voice calls, video calls, voice chat rooms, online karaoke | Free |
| Voice changer | Supports various effects, such as electric sound, old man voice, husky male voice, and lively female voice. | Online karaoke rooms, voice chat rooms | Free |
| Smart noise reduction | Removes ambient noise, suppresses sudden noise, and eliminates device buzzing while preserving high-fidelity human voice. | Voice calls, multi-person conferences | Free |
| Low-latency in-ear monitoring | Feeds a user's voice back to them through headphones (or other audio output devices) with minimal delay during audio capture, processing, and playback. | Interactive streaming, online karaoke, recording room | Free |
| Audio 3A processing | Supports Acoustic Echo Cancellation (AEC), Automatic Noise Suppression (ANS), and Automatic Gain Control (AGC). | Voice-related scenarios | Free |
| Screen sharing | Shares the desktop, a window, or a specific screen area with other users, and supports simultaneous display with the camera view. | Online classes, remote assistance | Free |
| Spatial audio | Simulates sound propagation in three-dimensional space, creating an immersive audio experience with spatial and directional awareness. | Online karaoke rooms, voice chat rooms | Free |
| Custom audio/video input | Supports input of external audio and video stream data. | Custom beauty effects, custom sound effects | Free |

Benefits

High-quality service worldwide

ApsaraVideo Live boasts an extensive global presence, with:

  • 9 live centers: China (Beijing), China (Shenzhen), China (Shanghai), China (Qingdao), Singapore, Germany (Frankfurt), Japan (Tokyo), Indonesia (Jakarta), and Saudi Arabia (Riyadh) regions

  • 3 stream relay hubs: China (Shanghai), Singapore, and Saudi Arabia (Riyadh) regions

  • Over 3200 nodes worldwide

This footprint supports reliable, highly available services around the globe.

Security compliance

ARTC complies with global regulations that govern calling services and adheres to stringent privacy protection standards.

Diverse product combinations

ARTC provides a one-stop solution that leverages diverse Alibaba Cloud products and services, including ECS, OSS, security services, live streaming, video-on-demand, avatars, and AI.

Easy to use

  • Scenario-based API integration: Encapsulates underlying API operations based on business scenarios to simplify development. For more information, see Development reference.

  • Multi-scenario practices: Covers various scenarios, such as one-on-one calls, co-streaming, voice chat rooms, and online karaoke. For more information, see Scenarios.

Concepts

The following table lists concepts related to ARTC.

| Concept | Description |
| --- | --- |
| SDKAppID | ARTC uses an SDKAppID as the unique identifier of an application. Create a separate SDKAppID for each service so that services and their configurations are isolated from one another. |
| ChannelID | A channel, identified by a ChannelID, is an audio and video space defined by ARTC. Users in the same channel can interact with each other through audio and video. For specific scenarios, Alibaba Cloud also provides cross-channel audio and video interaction capabilities. |
| UserID | In ARTC, a UserID uniquely identifies a user in an application. |
| Token | A token is a security signature designed by Alibaba Cloud to prevent malicious parties from accessing your cloud service resources. You need to provide the SDKAppID, UserID, ChannelID, timestamp, and token in the login function of the corresponding SDK. |
| Publish | Publish refers to the operation of uploading local audio and video data to Alibaba Cloud servers. This operation is equivalent to stream ingest. |
| Subscribe | Subscribe refers to the operation of pulling audio and video data from Alibaba Cloud servers to local devices. This operation is equivalent to stream pulling. |
| Role | In ARTC, there are two kinds of roles: streamer and viewer. A streamer can publish or subscribe to audio and video streams. A viewer can only subscribe to audio and video streams. Users can switch between the roles during a call. |
| Stream mixing and relay | The feature allows you to mix multiple audio and video streams, configure encoding settings for the streams, and then relay the processed streams to ApsaraVideo Live or a third-party live streaming platform. If you relay the streams to ApsaraVideo Live, you can transcode, record, and play the streams by using capabilities that are provided by ApsaraVideo Live. |
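To make the Token, Role, Publish, and Subscribe concepts more concrete, the following Python sketch models them in isolation. It is not the ARTC SDK and does not use the real ARTC token algorithm. The names `JoinParams`, `Channel`, and `sign_demo_token`, the HMAC-SHA256 signature, and the reading of the timestamp as an expiry time in Unix seconds are all assumptions made for illustration. Refer to the ARTC token and SDK references for the actual signing scheme and API.

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    STREAMER = "streamer"  # may publish and subscribe
    VIEWER = "viewer"      # may only subscribe


@dataclass(frozen=True)
class JoinParams:
    """The five fields the Token concept lists for the login call."""
    sdk_app_id: str
    user_id: str
    channel_id: str
    timestamp: int  # ASSUMPTION: token expiry, in Unix seconds
    token: str


def sign_demo_token(app_key: str, sdk_app_id: str, channel_id: str,
                    user_id: str, timestamp: int) -> str:
    """HYPOTHETICAL signature, not the ARTC algorithm.

    Shows only how a signature can bind a token to one application,
    channel, user, and timestamp.
    """
    message = "\n".join([sdk_app_id, channel_id, user_id, str(timestamp)])
    return hmac.new(app_key.encode(), message.encode(),
                    hashlib.sha256).hexdigest()


@dataclass
class Channel:
    """Toy server-side view of one channel and its members."""
    channel_id: str
    app_key: str
    members: dict = field(default_factory=dict)  # user_id -> Role

    def join(self, params: JoinParams, role: Role, now: int) -> bool:
        # Recompute the signature from the presented fields and compare.
        expected = sign_demo_token(self.app_key, params.sdk_app_id,
                                   params.channel_id, params.user_id,
                                   params.timestamp)
        valid = (params.channel_id == self.channel_id
                 and params.timestamp > now
                 and hmac.compare_digest(expected, params.token))
        if valid:
            self.members[params.user_id] = role
        return valid

    def switch_role(self, user_id: str, role: Role) -> None:
        # Users can switch between streamer and viewer during a call.
        if user_id not in self.members:
            raise KeyError(user_id)
        self.members[user_id] = role

    def can_publish(self, user_id: str) -> bool:
        # Only streamers may upload (publish) streams.
        return self.members.get(user_id) is Role.STREAMER

    def can_subscribe(self, user_id: str) -> bool:
        # Every member of the channel may pull (subscribe to) streams.
        return user_id in self.members
```

In this model, a viewer who joins with a valid token can subscribe but cannot publish until `switch_role` promotes them to streamer. A token that was signed for a different user, or whose timestamp has passed, is rejected at `join`.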