ApsaraVideo Live: Overview

Last Updated: Feb 25, 2025

In traditional live streaming, content flows one way from streamer to viewer, so user engagement and lead conversion are low. ApsaraVideo Real-time Communication (ARTC) adds interactive elements such as voice chat, video co-streaming, and game interaction. Viewers become participants rather than a passive audience, which strengthens the social connection. This topic describes the architecture, scenarios, features, benefits, and concepts of ARTC.

Architecture

[Figure: ARTC architecture diagram]

Item

Description

Third-party content moderation platform

Audio streams can be reviewed by using the content moderation feature provided by ApsaraVideo Live or by using a third-party content moderation platform.

Media Center

Various media processing capabilities such as recording, content moderation, stream mixing, and transcoding are provided.

Co-streamers

Up to 50 co-streamers can turn on their mics and chat at the same time. The room management component is required.

Global Realtime Transport Network (GRTN)

The Alibaba Cloud GRTN supports the transmission of live streams, video-on-demand (VOD) content, WebRTC data, and signaling information.

Ordinary viewers

Ordinary viewers have the same latency as co-streamers. The room management component is required.

CDN viewers

The audio and video streams of users are relayed to ApsaraVideo Live. Viewers can pull the streams over Real-Time Streaming (RTS), Flash Video (FLV), Real-Time Messaging Protocol (RTMP), and HTTP Live Streaming (HLS). The number of viewers is not limited, and the room management component is not required.

Room management component

This is a hosted channel component that provides customers with virtual channel management to facilitate development.
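
The access rules in the preceding table can be summarized in a short sketch. This is only an illustration of the table: the names `AccessPlan` and `plan_access` and the participant labels are hypothetical and are not part of any ARTC SDK.

```python
from dataclasses import dataclass

# Limit stated in the table above: up to 50 co-streamers at the same time.
MAX_CO_STREAMERS = 50


@dataclass(frozen=True)
class AccessPlan:
    """Hypothetical summary of how one participant type joins a live room."""
    path: str                    # "ARTC" (real-time path) or "CDN" (relayed stream)
    needs_room_management: bool  # whether the room management component is required
    can_use_mic: bool            # whether the participant can turn on a mic
    pull_protocols: tuple = ()   # playback protocols, listed only for CDN viewers


def plan_access(kind: str, active_co_streamers: int = 0) -> AccessPlan:
    """Map a participant type to the access rules described in the table."""
    if kind == "co_streamer":
        if active_co_streamers >= MAX_CO_STREAMERS:
            raise ValueError("co-streamer limit reached")
        return AccessPlan("ARTC", True, True)
    if kind == "ordinary_viewer":
        # Same latency as co-streamers, but listen-only in this sketch.
        return AccessPlan("ARTC", True, False)
    if kind == "cdn_viewer":
        # Unlimited audience that pulls the relayed stream from ApsaraVideo Live.
        return AccessPlan("CDN", False, False, ("RTS", "FLV", "RTMP", "HLS"))
    raise ValueError(f"unknown participant type: {kind}")
```

For example, `plan_access("cdn_viewer")` returns a plan that does not need the room management component, whereas `plan_access("co_streamer", 50)` raises an error because the room is already at the 50 co-streamer limit.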

Scenarios

Co-streaming

With co-streaming, streamers and viewers can interact with each other, a streamer can battle another streamer, and multiple people can co-stream, which makes live streaming more engaging. ApsaraVideo Live keeps the end-to-end latency below 300 ms, so viewers can turn their mics on and off smoothly. Co-streaming works seamlessly with standard streaming and RTS, and a live room supports more than 100,000 concurrent viewers.

Voice chat

Up to 50 people can co-stream at the same time with an end-to-end latency of less than 300 ms, and they can turn their mics on and off smoothly. Audio effects such as voice change, reverberation, and voice beautification are provided. To keep audio content compliant, you can review audio by using the content moderation feature provided by ApsaraVideo Live or a third-party content moderation platform. This helps you launch solutions quickly.

Features

Feature

Description

Video interaction

Multiple people can interact with each other in a video stream that has an end-to-end latency of less than 300 ms. Supported video resolutions include 480p, 720p, and 1080p. For example, a streamer in a live room can interact with the viewers, or streamers can have a battle with each other across rooms.

Voice interaction

High-quality audio with a 48 kHz sampling rate is supported. The end-to-end latency is less than 300 ms. This feature is suitable for scenarios such as voice chat rooms, radio rooms, and customer service.

Stream mixing and relay

Multiple streams can be mixed into a single stream based on specific rules. The single stream can then be relayed to ApsaraVideo Live or a third party.

CDN acceleration

Standard streaming and RTS seamlessly apply to the co-streaming feature. In this case, a live room can support more than 100,000 concurrent viewers.

Cloud-based recording

You can record audio and video streams and store them in Object Storage Service (OSS) or ApsaraVideo VOD.

Cloud-based transcoding

Cloud-based transcoding is supported.

Reverberation and voice change

  • Reverberation: Effects such as corridor, church, recording studio, basement, and concert hall are supported.

  • Voice change: Effects such as electric sound, old man voice, husky male voice, and lively female voice are supported.

Intelligent noise reduction

Ambient noise and device hum are eliminated and sudden noise is suppressed, while the human voice is preserved with high fidelity.

In-ear monitoring

Low-latency in-ear monitoring is supported.

Video retouching

Multiple types of retouching effects are provided.

Audio moderation

You can review audio by using the built-in content moderation feature of ApsaraVideo Live or a third-party content moderation platform.

Video moderation

You can review videos by using the built-in content moderation feature of ApsaraVideo Live or a third-party content moderation platform.
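
The stream mixing feature in the preceding table combines multiple streams into one "based on specific rules". As an illustration of what such a rule might look like, the following sketch computes a simple grid layout. The grid rule, the `Pane` type, and the `grid_layout` function are assumptions made for this example and do not describe the layouts that ARTC actually provides.

```python
import math
from typing import List, NamedTuple


class Pane(NamedTuple):
    """Position and size of one participant's video on the mixed canvas."""
    user_id: str
    x: int
    y: int
    width: int
    height: int


def grid_layout(user_ids: List[str], canvas_w: int = 1280, canvas_h: int = 720) -> List[Pane]:
    """Place each stream in a near-square grid, filling rows left to right."""
    n = len(user_ids)
    if n == 0:
        return []
    # Integer form of ceil(sqrt(n)); avoids floating-point rounding.
    cols = math.isqrt(n - 1) + 1
    rows = math.ceil(n / cols)
    pane_w, pane_h = canvas_w // cols, canvas_h // rows
    return [
        Pane(uid, (i % cols) * pane_w, (i // cols) * pane_h, pane_w, pane_h)
        for i, uid in enumerate(user_ids)
    ]
```

With three streams on a 1280 x 720 canvas, this rule produces a 2 x 2 grid of 640 x 360 panes and leaves the last cell empty.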

Benefits

  • Multi-network integration: GRTN uses more than 3,200 points of presence (POPs) deployed around the world to facilitate resource sharing between Content Delivery Network (CDN) streaming and Web Real-time Communication (WebRTC) networks. This ensures reliable and low-latency global communications.

  • Rich media processing capabilities: Capabilities such as recording, stream mixing, transcoding, and content moderation are provided.

  • Ease of access: Mature best practices for integrating interactive streaming are provided.

Concepts

The following table lists concepts related to ApsaraVideo Real-time Communication.

Concept

Description

SDKAppID

ARTC uses an SDKAppID as the unique identifier of an application. Create a separate SDKAppID for each service and use it to isolate services and configurations.

ChannelID

A channel, indicated by a ChannelID, is an audio and video space defined by ARTC. Users in the same channel can interact with each other in the form of audio and video. For specific scenarios, Alibaba Cloud also provides users with cross-channel audio and video interaction capabilities. 

UserID

In ARTC, a UserID uniquely identifies a user in an application. 

Token

A token is a security signature designed by Alibaba Cloud to prevent malicious parties from accessing your cloud service resources. You need to provide information including the SDKAppID, UserID, ChannelID, timestamp, and token in the login function of the corresponding SDK.

Publish

Publish refers to the operation of uploading local audio and video data to Alibaba Cloud servers. This operation is equivalent to stream ingest.

Subscription

Subscription refers to the operation of pulling audio and video data from Alibaba Cloud servers to local devices. This operation is equivalent to stream pulling.

Role

In ARTC, there are two kinds of roles: streamer and viewer. A streamer can publish or subscribe to audio and video streams. A viewer can only subscribe to audio and video streams. Users can switch between the roles during a call. 

Stream mixing and relay

The feature allows you to mix multiple audio and video streams, configure encoding settings for the streams, and then relay the processed streams to ApsaraVideo Live or a third-party live streaming platform. 

If you relay the streams to ApsaraVideo Live, you can transcode, record, and play the streams by using capabilities that are provided by ApsaraVideo Live.
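
To make the concepts in the preceding table more concrete, the following sketch models the two roles and the set of values that are passed to a login function. The signing scheme shown here (HMAC-SHA256 over the identifiers, keyed with a hypothetical `app_key`) is a generic pattern chosen for illustration and is not the ARTC token algorithm. The function and parameter names are also assumptions. Refer to the ARTC token documentation for the actual format.

```python
import hashlib
import hmac
from enum import Enum


class Role(Enum):
    STREAMER = "streamer"
    VIEWER = "viewer"


def can_publish(role: Role) -> bool:
    """Only streamers can publish (ingest) audio and video streams."""
    return role is Role.STREAMER


def can_subscribe(role: Role) -> bool:
    """Both streamers and viewers can subscribe to (pull) streams."""
    return role in (Role.STREAMER, Role.VIEWER)


def sign_login(app_key: bytes, sdk_app_id: str, channel_id: str,
               user_id: str, timestamp: int) -> str:
    """Illustrative signature only; NOT the real ARTC token algorithm."""
    message = "\n".join([sdk_app_id, channel_id, user_id, str(timestamp)]).encode()
    return hmac.new(app_key, message, hashlib.sha256).hexdigest()


def build_login_params(app_key: bytes, sdk_app_id: str, channel_id: str,
                       user_id: str, timestamp: int) -> dict:
    """Collect the values that the table says a login function needs."""
    return {
        "SDKAppID": sdk_app_id,
        "ChannelID": channel_id,
        "UserID": user_id,
        "timestamp": timestamp,
        "token": sign_login(app_key, sdk_app_id, channel_id, user_id, timestamp),
    }


def verify_login(app_key: bytes, params: dict) -> bool:
    """Server-side check: recompute the signature and compare in constant time."""
    expected = sign_login(app_key, params["SDKAppID"], params["ChannelID"],
                          params["UserID"], params["timestamp"])
    return hmac.compare_digest(expected, params["token"])
```

In a design like this, the token is generated on your application server so that the signing key never reaches the client, and changing any of the signed fields, such as the ChannelID, invalidates the token.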