This topic provides a use case to describe how to use Realtime Compute for Apache Flink to implement digital operations for live streaming.

Digital operations

This topic focuses on digital operations. You can use Realtime Compute for Apache Flink to monitor the operating status of streaming channels, such as popular videos and user trends, at your live streaming website in real time.

Solutions

  • Business objectives
    • Collect statistics on the total number of users and user trend at your website.
    • Collect statistics on the total number of users and user trend of a channel.
    • Collect statistics on the top 10 popular channels at your website, and top 10 popular channels in each category.
  • Data format

    Use the logs tracked in the client app as the raw data to collect statistics.

    The following table describes the data format of the tracked logs that the client app sends to the server.
    Field name Description
    ip The IP address of the client.
    agent The type of the client.
    roomid The ID of the channel.
    userid The ID of the user.
    abytes The audio bitrate.
    afcnt The number of audio frames.
    adrop The number of dropped audio frames.
    afts The audio timestamp.
    alat The end-to-end latency of audio frames.
    vbytes The video bitrate.
    vfcnt The number of video frames.
    vdrop The number of dropped video frames.
    vfts The video timestamp.
    vlat The end-to-end latency of video frames.
    ublock The number of upstream frame freezing times.
    dblock The number of downstream frame freezing times.
    timestamp The timestamp when a log was generated.
    region The region where live streaming is performed.
    Log Service uses semi-structured storage and displays the preceding fields in the following log format:
    {
        "ip": "ip",
        "agent": "agent",
        "roomid": "123456789",
        "userid": "123456789",
        "abytes": "123456",
        "afcnt": "34",
        "adrop": "3",
        "afts": "1515922566",
        "alat": "123",
        "vbytes": "123456",
        "vfcnt": "34",
        "vdrop": "4",
        "vfts": "1515922566",
        "vlat": "123",
        "ublock": "1",
        "dblock": "2",
            "timestamp": "1515922566",
        "region": "hangzhou"
    }                    
  • SQL statements
    • Collect statistics on the total number of users and user trend at your website.
      Use a new window every minute to collect statistics on the user trend at your website. The statistical result of the last minute in the trend is the current total number of users at your website.
      CREATE VIEW view_app_total_visit_1min AS
      SELECT
          CAST(TUMBLE_START(app_ts, INTERVAL '1' MINUTE) as VARCHAR) as app_ts,
          COUNT(DISTINCT userid) as app_total_user_cnt
      FROM
          view_app_heartbeat_stream
      GROUP BY
          TUMBLE(app_ts, INTERVAL '1' MINUTE);                         
    • Collect statistics on the total number of users and user trend of a channel.
      Similarly, use a new window every minute to collect statistics on the user trend of a channel. The statistical result of the last minute in the trend is the current total number of users of the channel.
      CREATE VIEW view_app_room_visit_1min AS
      SELECT
          CAST(TUMBLE_START(app_ts, INTERVAL '1' MINUTE) as VARCHAR) as app_ts,
          roomid as room_id,
          COUNT(DISTINCT userid) as app_room_user_cnt
      FROM
          view_app_heartbeat_stream
      GROUP BY
          TUMBLE(app_ts, INTERVAL '1' MINUTE), roomid;                         
    • Rank the top 10 popular channels at your website.
      Collect statistics on the top 10 popular channels and promote these channels on the homepage to increase your website traffic and clicks.
      CREATE VIEW view_app_room_visit_top10 AS
      SELECT
        app_ts,
        room_id,
        app_room_user_cnt,
        rangking
      FROM
      (
          SELECT 
              app_ts,
              room_id,
              app_room_user_cnt,
              ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY app_room_user_cnt desc) AS ranking
          FROM
              view_app_room_visit_1min
      ) WHERE ranking <= 10;
                                      
    • Rank the top 10 popular channels in each category.

      To build a highly interactive social community and cover more live streaming scenarios to realize more profit, a platform operator can establish diversified channels at their live streaming website to meet the requirements of different user groups. For example, Taobao Live covers multiple categories such as makeup, men's wear, automotive, and fitness.

      The category and channel relationship table is a mapping table that is stored in an ApsaraDB RDS database.
      CREATE TABLE dim_category_room ( 
          id    BIGINT,
          category_id BIGINT,
          category_name VARCHAR,
          room_id    BIGINT
            PRIMARY KEY (room_id), 
            PERIOD FOR SYSTEM_TIME --The identifier of the dimension table.  
       ) WITH ( 
           type= 'rds', 
           url = 'xxxx', --The URL of your database.  
           tableName = 'xxx', /--The name of your table. 
           userName = 'xxx', --Your username.  
           password = 'xxx' --Your password.  
       );                    
      Join the dim_category_room table based on the channel ID and compute the rankings of channels in each category.
      CREATE VIEW view_app_room_visit_1min AS
      SELECT
          CAST(TUMBLE_START(app_ts, INTERVAL '1' MINUTE) as VARCHAR) as app_ts,
          roomid as room_id,
          COUNT(DISTINCT userid) as app_room_user_cnt
      FROM
          view_app_heartbeat_stream
      GROUP BY
          TUMBLE(app_ts, INTERVAL '1' MINUTE), roomid;
      
      // Join the dim_category_room table based on the channel ID. 
      CREATE VIEW view_app_category_visit_1min AS
      SELECT 
          r.app_ts,
          r.room_id,
          d.category_id,
          d.category_name, 
          r.app_room_user_cnt
      FROM
          view_app_room_visit_1min r
      JOIN
          dim_category_room d
      ON
          r.room_id = d.room_id;
      
      
      // Compute the rankings of channels in each category. 
      CREATE VIEW view_app_category_visit_top10 AS
      SELECT
        app_ts,
        category_id,
        category_name,
        app_room_user_cnt,
        rangking
      FROM
      (
          SELECT 
              app_ts,
              room_id,
              category_id,
              category_name,
              app_room_user_cnt,
              ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY app_room_user_cnt desc) AS ranking
          FROM
              view_app_category_visit_1min
      ) WHERE ranking <= 10;                

Demo code and source code

Alibaba Cloud provides a complete demo. You can refer to the demo code to register upstream and downstream storage data and to develop your own digital operations solution for live streaming. When you use the demo, take note of the following items for the upstream and downstream storage:
  • Upload a CSV file as a source table to DataHub.
  • Create an ApsaraDB RDS dimension table.
  • Create an ApsaraDB RDS result table.