This topic provides a use case to describe how to use Realtime Compute to monitor core video metrics.

Background

With the development of Internet technology, the concept of live streaming has been expanded. More and more people are attracted by online live streaming, especially the ecological chain of live streaming. As long as you have an Internet connection, you can watch sports games, major events, and news that you are interested in online and experience the joy and convenience of live streaming anytime, anywhere.

In this experience-oriented era, any poor experience will lead to a large-scale churn of users. The operator of a live streaming website platform must attach importance to the experience of users (including casters and fans). The key system metrics that affect user experience include the audio or video frame freezing rate, latency, and packet loss rate. To meet the timeliness requirements of live streaming, the platform operator must detect faults of peripheral systems in real time and rectify the faults before user experience is affected. The platform operator must also track and understand the overall website operating status and hot video products in a timely manner.

The following use case describes how a live streaming website platform uses Realtime Compute to monitor and display the system stability and operating status in real time.

Business description

To build a highly interactive user community and cover more live streaming scenarios to realize more profit, a platform operator usually hires multiple casters at their live streaming website. Each caster broadcasts to fans in a channel. Fans can watch the video broadcast by the caster in a channel and hear the caster's voice. The caster can have a private chat with multiple fans in the channel.

The live streaming app used by the caster and fans on their devices sends tracked logs to the server every 10 seconds. The server saves the logs to local disks and sends the logs to Log Service through the log collector of Log Service. Then Realtime Compute subscribes to the logs to compute the video playback status on the clients in real time.

The following figure shows the overall process.



Business objectives

Obtain the following metrics based on the tracked logs sent from the client app:

  • Channel faults, including frame freezing, frame drop, and out of synchronization between the audio and video signals
  • Average E2E latency collected by region
  • Overall frame freezing rate collected in real time (Number of online users suffering from frame freezing/Total number of online users × 100%. This metric can be used to measure the scope of users currently suffering from frame freezing.)
  • Number of frame freezing times per user (Total number of online frame freezing times/Total number of online users. This metric can be used to measure the overall severity of frame freezing based on frame freezing times.)

We use Realtime Compute to compute the preceding data in real time, save the result to RDS, and display online data through reports and dashboards. Alerts triggered upon monitoring are also displayed.

Data format

The following table describes the data format of the tracked logs that the client app sends to the server.

Field name Description
ip The IP address of the client.
agent The device type of the client.
roomid The ID of the channel.
userid The ID of the user.
abytes The audio bitrate.
afcnt The number of audio frames.
adrop The number of dropped audio frames.
afts The audio timestamp.
alat The E2E latency of audio frames.
vbytes The video bitrate.
vfcnt The number of video frames.
vdrop The number of dropped video frames.
vfts The video timestamp.
vlat The E2E latency of video frames.
ublock The number of upstream frame freezing times.
dblock The number of downstream frame freezing times.
timestamp The timestamp when a log was generated.
region The region where live streaming is performed.

Log Service uses semi-structured storage and displays the preceding fields in the following log format:

{
    "ip": "ip",
    "agent": "agent",
    "roomid": "123456789",
    "userid": "123456789",
    "abytes": "123456",
    "afcnt": "34",
    "adrop": "3",
    "afts": "1515922566",
    "alat": "123",
    "vbytes": "123456",
    "vfcnt": "34",
    "vdrop": "4",
    "vfts": "1515922566",
    "vlat": "123",
    "ublock": "1",
    "dblock": "2",
    "timestamp": "15151922566",
    "region": "hangzhou"
}
			

SQL statements

  • Data cleansing

    Declare a source table in Realtime Compute.

    CREATE TABLE app_heartbeat_stream_source (
        ip VARCHAR,
        agent VARCHAR,
        roomid VARCHAR,
        userid VARCHAR,
        abytes VARCHAR,
        afcnt VARCHAR,
        adrop VARCHAR,
        afts VARCHAR,
        alat VARCHAR,
        vbytes VARCHAR,
        vfcnt VARCHAR,
        vdrop VARCHAR,
        vfts VARCHAR,
        vlat VARCHAR,
        ublock VARCHAR,
        dblock VARCHAR,
        `timestamp` VARCHAR,
        app_ts AS TO_TIMESTAMP(CAST(`timestamp` AS BIGINT)), // Specify the fields for generating a watermark.
        WATERMARK FOR app_ts AS withOffset(app_ts, 10000) // Add an offset of 10 seconds to define a watermark.
    ) WITH (
        type ='sls',
        endPoint ='http://cn-hangzhou-corp.sls.aliyuncs.com',
        accessId ='yourAccessId',
        accessKey ='yourAccessSecret',
        project ='yourProjectName',
        logStore ='app_heartbeat_stream_source',
    );

    For business convenience, we process all data as the VARCHAR type in the source table. To facilitate subsequent processing, we cleanse data in the source table for the following purposes:

    1. Format conversion: Convert the format of some data from VARCHAR to BIGINT.
    2. Business data supplement: For example, enter the region information.

    Example:

    CREATE VIEW view_app_heartbeat_stream AS
    SELECT
        ip,
        agent,
        CAST(roomid AS BIGINT),
        CAST(userid AS BIGINT),
        CAST(abytes AS BIGINT),
        CAST(afcnt AS BIGINT),
        CAST(adrop AS BIGINT),
        CAST(afts AS BIGINT),
        CAST(alat AS BIGINT),
        CAST(vbytes AS BIGINT),
        CAST(vfcnt AS BIGINT),
        CAST(vdrop AS BIGINT),
        CAST(vfts AS BIGINT),
        CAST(vlat AS BIGINT),
        CAST(ublock AS BIGINT),
        CAST(dblock AS BIGINT),
        app_ts,    
        region
    FROM
        app_heartbeat_stream_source;
  • Channel fault statistics

    Use a new window every 10 minutes to collect statistics on channel faults, including frame freezing, frame drop, and out of synchronization between the audio and video signals.

    CREATE VIEW room_error_statistics_10min AS
    SELECT
        CAST(TUMBLE_START(app_ts, INTERVAL '10' MINUTE) as VARCHAR) as app_ts,
        roomid,
        SUM(ublock) as ublock, // Collect statistics on the number of upstream frame freezing times in the last 10 minutes.
        SUM(dblock) as dblock, // Collect statistics on the number of downstream frame freezing times in the last 10 minutes.
        SUM(adrop) as adrop, // Collect statistics on the number of audio packets dropped in the last 10 minutes.
        SUM(vdrop) as vdrop, // Collect statistics on the number of video packets dropped in the last 10 minutes.
        SUM(alat) as alat, // Collect statistics on the audio latency in the last 10 minutes.
        SUM(vlat) as vlat, // Collect statistics on the video latency in the last 10 minutes.
    FROM
        view_app_heartbeat_stream
    GROUP BY
        TUMBLE(app_ts, INTERVAL '10' MINUTE), roomid
  • Latency statistics collected by region

    Collect statistics on the average E2E latency of audio and video data by region every 10 minutes.

    CREATE VIEW region_lat_statistics_10min AS
    SELECT 
        CAST(TUMBLE_START(app_ts, INTERVAL '10' MINUTE) as VARCHAR) as app_ts,
        region,
        SUM(alat)/COUNT(alat) as alat,
        SUM(vlat)/COUNT(vlat) as vlat,
    FROM
        view_app_heartbeat_stream
    GROUP BY
        TUMBLE(app_ts, INTERVAL '10' MINUTE), region;
  • Overall frame freezing rate collected in real time

    Compute the overall frame freezing rate according to the following formula: Number of online users suffering from frame freezing/Total number of online users × 100%. This metric can be used to measure the scope of users currently suffering from frame freezing.

    CREATE VIEW block_total_statistics_10min AS
    SELECT
        CAST(TUMBLE_START(app_ts, INTERVAL '10' MINUTE) as VARCHAR) as app_ts,
        SUM(IF(ublock <> 0 OR dblock <> 0, 1, 0)) / CAST(COUNT(DISTINCT userid) AS DOUBLE) as block_rate, // COUNT(DISTINCT) is supported only in Realtime Compute V1.4.4 and later.
    FROM
        view_app_heartbeat_stream
    GROUP BY
        TUMBLE(app_ts, INTERVAL '10' MINUTE);
  • Number of frame freezing times per user

    Compute the number of frame freezing times per user according to the following formula: Total number of online frame freezing times/Total number of online users. This metric can be used to measure the overall severity of frame freezing based on frame freezing times.

    CREATE VIEW block_peruser_statistics_10min AS
    SELECT
        CAST(TUMBLE_START(app_ts, INTERVAL '10' MINUTE) as VARCHAR) as app_ts,
        SUM(ublock+dblock) / CAST(COUNT(DISTINCT userid) AS DOUBLE) as block_peruser, // COUNT(DISTINCT) is supported only in Realtime Compute V1.4.4 and later.
    FROM
        view_app_heartbeat_stream
    GROUP BY
        TUMBLE(app_ts, INTERVAL '10' MINUTE);

Demo code and source code

The Alibaba Cloud team has created demo code that contains a complete link applicable to the core video metric monitoring solution described above.

  • Upload a CSV file to a DataHub instance as the source table.
  • Create an RDS result table.

Refer to the complete demo code to register input and output data to develop your own core video metric monitoring solution. Click Attachment to download the demo code.