Real-Time Streaming (RTS) is built on Web Real-Time Communication (WebRTC) signaling. RTS delivers ultra-low-latency live streaming by using Alibaba Cloud Content Delivery Network (CDN) nodes deployed worldwide together with its scheduling algorithms. This topic describes the WebRTC signaling specifications that are used to access Global Realtime Transport Network (GRTN). It is intended for developers who have a basic knowledge of WebRTC.

Background information

Live streams over Transmission Control Protocol (TCP) have a latency of 3 to 6 seconds. To reduce the latency, ApsaraVideo Live provides RTS as a value-added feature that lets you live stream over User Datagram Protocol (UDP). RTS provides an easy-to-access, high-definition, and smooth live streaming service with latency in the millisecond range, and it can handle tens of millions of concurrent requests. RTS is designed around an open, standard ecosystem. In addition to the RTS SDKs provided by ApsaraVideo Live, you can use your own clients to push streams to or pull streams from CDN nodes by using a signaling method similar to WebRTC. ApsaraVideo Live provides worldwide CDN nodes and scheduling algorithms to help you manage and use RTS at scale.

GRTN is designed to perform well under poor network conditions.

Prerequisites

Signaling process

The following figure shows the signaling process.


Signaling process

  1. The client sends a request with a Session Description Protocol (SDP) offer.
    1. Create an RTCPeerConnection object on the client, specify whether to receive or send audio and video signals, and then create an SDP offer.
      // Specify whether to receive or send audio and video signals.
      { offerToReceiveVideo: true, offerToReceiveAudio: true }
    2. Construct the stream pulling request that the client sends to ApsaraVideo Live by using the HTTPS POST method. The request body is a JSON string. For more information about the request parameters, see the Definition of the RTS signaling protocol section of this topic.
      • The version parameter specifies the version of the RTS signaling protocol. Set the value to 2.
      • The sdk_version parameter specifies the version of RTS SDK. You can set the parameter as needed.
    3. Send the constructed request to ApsaraVideo Live based on the signaling URL by using the POST method. Specify the source URL in the JSON-formatted request body.
      POST /app/streamname?auth=xxx HTTP/1.1
      Host: domain
      Connection: keep-alive
      Content-Length: 2205
      Content-Type: application/json
      Note: A signaling URL is the same as the corresponding source URL except for the protocol header. The following URLs provide examples:
      • Signaling URL: https://domain/app/streamname?auth=xxx
      • Source URL: artc://domain/app/streamname?auth=xxx
  2. The server returns a response with an SDP answer.

    After the server of ApsaraVideo Live verifies the request, the server generates an SDP answer and returns a response that contains the information about the live streaming node to the client. For more information about the response parameters, see the Definition of the RTS signaling protocol section of this topic.

  3. The client initiates Interactive Connectivity Establishment (ICE).
    1. After the client receives the response with an SDP answer, specify the session description in the RTCPeerConnection object.
      peerConnection.setRemoteDescription(new RTCSessionDescription(answer.jsep));
    2. Use the RTCPeerConnection object to initiate ICE and Datagram Transport Layer Security (DTLS) encryption. After the signaling channel is established, the client can pull streams from ApsaraVideo Live. This way, you can implement stream pulling and playback based on the standards of WebRTC.
  4. The client initiates a disconnection.

    The client sends a DTLS alert message that initiates a disconnection to stop stream ingest or playback.

    Disconnect
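The relationship between the signaling URL and the source URL described in step 1 can be sketched as follows. This is a minimal illustration that assumes the two URLs differ only in the protocol header, as the note states. The function names toSignalingUrl and toSourceUrl are hypothetical and are not part of any RTS SDK.

```javascript
// Hypothetical helpers (not part of any RTS SDK): convert between the
// artc:// source URL and the https:// signaling URL by swapping only
// the protocol header.
function toSignalingUrl(sourceUrl) {
  const prefix = "artc://";
  if (!sourceUrl.startsWith(prefix)) {
    throw new Error("Expected a source URL that starts with artc://");
  }
  return "https://" + sourceUrl.slice(prefix.length);
}

function toSourceUrl(signalingUrl) {
  const prefix = "https://";
  if (!signalingUrl.startsWith(prefix)) {
    throw new Error("Expected a signaling URL that starts with https://");
  }
  return "artc://" + signalingUrl.slice(prefix.length);
}

// Placeholder URL taken from the note in step 1.
console.log(toSignalingUrl("artc://domain/app/streamname?auth=xxx"));
```

The path and the query string, including the auth parameter, are carried over unchanged.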
Sample code for the HTML5 player
// Create peer connection and local offer sdp.
peerConnection = new RTCPeerConnection();
peerConnection.onicecandidate = iceCandidateCallback;
peerConnection.ontrack = remoteStreamCallback;
peerConnection.createOffer({ offerToReceiveVideo: true, offerToReceiveAudio: true })
      .then(signaling_pull).catch(errorHandler);


// CDN live post pull stream request.
function signaling_pull(offer_sdp) {
  console.log('local offer sdp', offer_sdp);

  peerConnection.setLocalDescription(offer_sdp).then(function() {
    // Get pull stream url.
    var stream_url = $("#stream_url").val();
    console.log("stream url:" , stream_url);

    // Add sdk and protocol versions.
    var protocol_version = 2;
    var sdk_version = "0.0.1";

    $.ajax({url: stream_url, data: JSON.stringify({
          mode: "live",
          version: protocol_version,
          sdk_version: sdk_version,
          jsep: offer_sdp, // The local offer that is passed to signaling_pull.
      }),
      type: "post",
      success:function(result){
          var signal = JSON.parse(result);
          peerConnection.setRemoteDescription(new RTCSessionDescription(signal.jsep)).then(function() {
              console.log("get remote answer sdp: ", signal.jsep.sdp);
          }).catch(errorHandler);
      }});
  }).catch(errorHandler);
}
                

Definition of the RTS signaling protocol

The RTS signaling protocol establishes a short-lived connection based on HTTPS. The protocol uses messages in the JSON format. Sample code:

{
    "version":2,
    "sdk_version":"0.0.1",
    "mode":"live",
    "pull_streams":[
        {
            "url":"artc://your.domain.com/live/testname",
            "amsid":[
                "rts audio"
            ],
            "vmsid":[
                "rts video"
            ]
        }
    ],
    "jsep":{
        "type":"offer",
        "sdp":"v=0\r\no=- 6839248142876176651 2 IN IP4 127.0.0.1\r\ns=-\r\n Omitted content"
    }
}
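The request body shown above could be assembled on the client as in the following sketch. The function name buildPullRequest and its default arguments are assumptions for illustration. The field names and the values 2, live, rts audio, and rts video come from the sample in this topic.

```javascript
// Hypothetical helper: build the JSON body of a stream pulling request.
// sourceUrl is the artc:// source URL, and offerSdp is the SDP string
// of the local offer.
function buildPullRequest(sourceUrl, offerSdp, sdkVersion = "0.0.1") {
  if (!sourceUrl.startsWith("artc://")) {
    throw new Error("The source URL must start with artc://");
  }
  return {
    version: 2,              // Version of the RTS signaling protocol.
    sdk_version: sdkVersion, // Optional. Set it as needed.
    mode: "live",
    pull_streams: [
      { url: sourceUrl, amsid: ["rts audio"], vmsid: ["rts video"] },
    ],
    jsep: { type: "offer", sdp: offerSdp },
  };
}

// The SDP here is a truncated placeholder, not a usable offer.
const body = JSON.stringify(
  buildPullRequest("artc://your.domain.com/live/testname", "v=0\r\n")
);
console.log(body);
```

JSON.stringify escapes the CRLF line endings of the SDP as \r\n, which is the form used in the sample.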

Playback

  • Protocol description about playback
    Table 1. Request parameters
    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | mode | string | Yes | The mode of the stream. In this example, set the parameter to live. |
    | version | int | Yes | The version of the protocol. In this example, set the parameter to 2. |
    | push_stream | string | No | The ingest URL. |
    | pull_streams | []object | No | The streams that you want to pull. You can pull multiple streams at a time. For more information about the attributes of each object, see Table 2. |
    | sdk_version | string | No | The version of the SDK. |
    | jsep.type | string | Yes | The type of the SDP message. In this example, set the parameter to offer. |
    | jsep.sdp | string | Yes | The description of the SDP message. |
    Table 2. Attributes of each object in the pull_streams parameter
    | Attribute | Type | Required | Description |
    | --- | --- | --- | --- |
    | url | string | Yes | The source URL that starts with artc://. |
    | amsid | []string | Yes | The media stream ID (MSID) of the audio stream that you want to pull. In this example, set the parameter to rts audio. |
    | vmsid | []string | Yes | The MSID of the video stream that you want to pull. In this example, set the parameter to rts video. |
    Table 3. Response parameters
    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | code | int | Yes | The HTTP status code. If the request is successful, the code 200 is returned. For more information about status codes, see the "Status codes" section. |
    | trace_id | string | Yes | The globally unique ID (GUID) of the request. The GUID is generated by Alibaba Cloud CDN and can be used to troubleshoot issues. Record the GUID for later use. |
    | jsep.type | string | Yes | The type of the SDP message. In this example, the value answer is returned. |
    | jsep.sdp | string | Yes | The description of the SDP message that is generated when CDN nodes pull streams from the origin. |
  • Playback request example
    Request:
    {
        "version":2,
        "sdk_version":"0.0.1",
        "mode":"live",
        "pull_streams":[
            {
                "url":"artc://your.domain.com/live/testname",
                "amsid":[
                    "rts audio"
                ],
                "vmsid":[
                    "rts video"
                ]
            }
        ],
        "jsep":{
            "type":"offer",
            "sdp":"v=0\r\no=- 6839248142876176651 2 IN IP4 127.0.0.1\r\ns=-\r\n Omitted content"
        }
    }
    
    Response:
    {
        "trace_id":"2_1591173296_101.227.0.169_702080732320_dec327eb6eed0e0b07b349c8a5653eca",
        "code":200,
        "jsep":{
            "type":"answer",
            "sdp":"v=0\r\no=- 1591173291 2 IN IP4 127.0.0.1\r\n Omitted content"
        }
    }

    sps-pps-idr-in-keyframe: We recommend that you perform the following operations to prevent screen flickers when the browser plays live streams under poor network conditions.

    After an SDP offer is created for a subscribed stream for HTML5 playback and before you call setLocalDescription, modify the SDP message. Find a line similar to the following one about H.264 video attributes:
    a=fmtp:127 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f
    Add sps-pps-idr-in-keyframe=1. The line becomes:
    a=fmtp:127 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f;sps-pps-idr-in-keyframe=1

    GRTN responds and turns on the feature for the browser.

  • Error handling

    If the stream pulling request succeeds, the status code 200 is returned. Otherwise, handle the error based on the status code in the code parameter of the JSON-formatted response body. The following example and tables describe the code and message parameters in the response:

    Response:
    {
       "code": 200, // A value of 200 indicates that the request is successful. For more information about status codes, see the "Status codes" section.
       "message": "success" // The returned message.
    }
    Table 4. Response parameters
    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | The HTTP status code. For more information, see the "Status codes" section. |
    | message | string | The returned message. |
    Table 5. Status codes
    | Status code | Description |
    | --- | --- |
    | 403 | Indicates that the authentication failed. |
    | 404 | Indicates that the stream does not exist. |
    | 611 | Indicates that the client must play the stream over TCP. |
    | 302 | Indicates that the client must send the request to a new address. |
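The sps-pps-idr-in-keyframe modification described in the playback request example can be sketched as a plain string transformation on the SDP offer, applied before setLocalDescription is called. The function name is hypothetical. Applying the change to every H.264 payload type, rather than to a single fmtp line, is an assumption of this sketch.

```javascript
// Hypothetical helper: append sps-pps-idr-in-keyframe=1 to the fmtp
// line of every H.264 payload type in an SDP offer.
function enableSpsPpsIdrInKeyframe(sdp) {
  const lines = sdp.split("\r\n");

  // Collect the payload types that are mapped to H.264.
  const h264PayloadTypes = new Set();
  for (const line of lines) {
    const match = /^a=rtpmap:(\d+) H264\/90000/.exec(line);
    if (match) h264PayloadTypes.add(match[1]);
  }

  return lines
    .map((line) => {
      const match = /^a=fmtp:(\d+) (.*)$/.exec(line);
      if (!match || !h264PayloadTypes.has(match[1])) return line;
      // Leave the line alone if the parameter is already present.
      if (/(?:^|;)sps-pps-idr-in-keyframe=/.test(match[2])) return line;
      return line + ";sps-pps-idr-in-keyframe=1";
    })
    .join("\r\n");
}

// Shortened sample offer for illustration only.
const sampleOffer =
  "m=video 9 UDP/TLS/RTP/SAVPF 127 121\r\n" +
  "a=rtpmap:127 H264/90000\r\n" +
  "a=fmtp:127 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f\r\n" +
  "a=rtpmap:121 rtx/90000\r\n" +
  "a=fmtp:121 apt=127\r\n";
console.log(enableSpsPpsIdrInKeyframe(sampleOffer));
```

In a browser, the returned string would replace the sdp property of the offer before the offer is passed to setLocalDescription and sent in the signaling request.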

Stream ingest

  • Protocol description about stream ingest
    Table 6. Request parameters
    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | mode | string | Yes | The mode of the stream. In this example, set the parameter to live. |
    | version | int | Yes | The version of the protocol. In this example, set the parameter to 2. |
    | push_stream | string | No | The ingest URL. |
    | sdk_version | string | No | The version of the SDK. |
    | jsep.type | string | Yes | The type of the SDP message. In this example, set the parameter to offer. |
    | jsep.sdp | string | Yes | The description of the SDP message. |
    Table 7. Response parameters
    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | code | int | Yes | The HTTP status code. If the request is successful, the code 200 is returned. For more information about status codes, see the "Status codes" section. |
    | trace_id | string | Yes | The GUID of the request. The GUID is generated by Alibaba Cloud CDN and can be used to troubleshoot issues. Record the GUID for later use. |
    | jsep.type | string | Yes | The type of the SDP message. In this example, the value answer is returned. |
    | jsep.sdp | string | Yes | The description of the SDP message that is generated when CDN nodes pull streams from the origin. |
  • Stream ingest request example
    Request:
    {
        "version":2,
        "sdk_version":"0.0.1",
        "mode":"rtc",
        "push_stream":"artc://host/app/name",
        "jsep":{
            "type":"offer",
            "sdp":"v=0\r\no=- 1385856200224536561 2 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\na=group:BUNDLE 0 1 2\r\na=extmap-allow-mixed\r\na=msid-semantic: WMS rts\r\nm=audio 9 UDP/TLS/RTP/SAVPF 111 63 103 104 9 0 8 106 105 13 110 112 113 126\r\nc=IN IP4 0.0.0.0\r\na=rtcp:9 IN IP4 0.0.0.0\r\na=ice-ufrag:iQyM\r\na=ice-pwd:D3GXKCcUGvW9djaAozff5ppT\r\na=ice-options:trickle\r\na=fingerprint:sha-256 20:50:72:9B:A2:C0:D8:50:AD:D0:EF:A7:62:8F:EF:C3:AB:86:D5:B6:3E:17:22:69:79:5B:CE:E8:42:33:B5:E4\r\na=setup:actpass\r\na=mid:0\r\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\r\na=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01\r\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=sendrecv\r\na=msid:rts audio\r\na=rtcp-mux\r\na=rtpmap:111 opus/48000/2\r\na=rtcp-fb:111 transport-cc\r\na=rtcp-fb:111 nack\r\na=fmtp:111 minptime=10;useinbandfec=1\r\na=rtpmap:63 red/48000/2\r\na=fmtp:63 111/111\r\na=rtpmap:103 ISAC/16000\r\na=rtpmap:104 ISAC/32000\r\na=rtpmap:9 G722/8000\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:106 CN/32000\r\na=rtpmap:105 CN/16000\r\na=rtpmap:13 CN/8000\r\na=rtpmap:110 telephone-event/48000\r\na=rtpmap:112 telephone-event/32000\r\na=rtpmap:113 telephone-event/16000\r\na=rtpmap:126 telephone-event/8000\r\na=ssrc:3411287802 cname:s4eR7OuKnnPL0vKS\r\na=ssrc:3411287802 msid:rts audio\r\na=ssrc:3411287802 mslabel:rts\r\na=ssrc:3411287802 label:audio\r\nm=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 127 121 125 107 108 109 124 120 123 119 35 36 41 42 114 115 116 117 118\r\nc=IN IP4 0.0.0.0\r\na=rtcp:9 IN IP4 0.0.0.0\r\na=ice-ufrag:iQyM\r\na=ice-pwd:D3GXKCcUGvW9djaAozff5ppT\r\na=ice-options:trickle\r\na=fingerprint:sha-256 20:50:72:9B:A2:C0:D8:50:AD:D0:EF:A7:62:8F:EF:C3:AB:86:D5:B6:3E:17:22:69:79:5B:CE:E8:42:33:B5:E4\r\na=setup:actpass\r\na=mid:1\r\na=extmap:14 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:2 
http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:13 urn:3gpp:video-orientation\r\na=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01\r\na=extmap:5 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay\r\na=extmap:6 http://www.webrtc.org/experiments/rtp-hdrext/video-content-type\r\na=extmap:7 http://www.webrtc.org/experiments/rtp-hdrext/video-timing\r\na=extmap:8 http://www.webrtc.org/experiments/rtp-hdrext/color-space\r\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=extmap:10 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id\r\na=extmap:11 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id\r\na=sendrecv\r\na=msid:rts video\r\na=rtcp-mux\r\na=rtcp-rsize\r\na=rtpmap:96 VP8/90000\r\na=rtcp-fb:96 goog-remb\r\na=rtcp-fb:96 transport-cc\r\na=rtcp-fb:96 ccm fir\r\na=rtcp-fb:96 nack\r\na=rtcp-fb:96 nack pli\r\na=rtpmap:97 rtx/90000\r\na=fmtp:97 apt=96\r\na=rtpmap:98 VP9/90000\r\na=rtcp-fb:98 goog-remb\r\na=rtcp-fb:98 transport-cc\r\na=rtcp-fb:98 ccm fir\r\na=rtcp-fb:98 nack\r\na=rtcp-fb:98 nack pli\r\na=fmtp:98 profile-id=0\r\na=rtpmap:99 rtx/90000\r\na=fmtp:99 apt=98\r\na=rtpmap:100 VP9/90000\r\na=rtcp-fb:100 goog-remb\r\na=rtcp-fb:100 transport-cc\r\na=rtcp-fb:100 ccm fir\r\na=rtcp-fb:100 nack\r\na=rtcp-fb:100 nack pli\r\na=fmtp:100 profile-id=2\r\na=rtpmap:101 rtx/90000\r\na=fmtp:101 apt=100\r\na=rtpmap:127 H264/90000\r\na=rtcp-fb:127 goog-remb\r\na=rtcp-fb:127 transport-cc\r\na=rtcp-fb:127 ccm fir\r\na=rtcp-fb:127 nack\r\na=rtcp-fb:127 nack pli\r\na=fmtp:127 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f\r\na=rtpmap:121 rtx/90000\r\na=fmtp:121 apt=127\r\na=rtpmap:125 H264/90000\r\na=rtcp-fb:125 goog-remb\r\na=rtcp-fb:125 transport-cc\r\na=rtcp-fb:125 ccm fir\r\na=rtcp-fb:125 nack\r\na=rtcp-fb:125 nack pli\r\na=fmtp:125 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=42001f\r\na=rtpmap:107 rtx/90000\r\na=fmtp:107 apt=125\r\na=rtpmap:108 H264/90000\r\na=rtcp-fb:108 
goog-remb\r\na=rtcp-fb:108 transport-cc\r\na=rtcp-fb:108 ccm fir\r\na=rtcp-fb:108 nack\r\na=rtcp-fb:108 nack pli\r\na=fmtp:108 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f\r\na=rtpmap:109 rtx/90000\r\na=fmtp:109 apt=108\r\na=rtpmap:124 H264/90000\r\na=rtcp-fb:124 goog-remb\r\na=rtcp-fb:124 transport-cc\r\na=rtcp-fb:124 ccm fir\r\na=rtcp-fb:124 nack\r\na=rtcp-fb:124 nack pli\r\na=fmtp:124 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=42e01f\r\na=rtpmap:120 rtx/90000\r\na=fmtp:120 apt=124\r\na=rtpmap:123 H264/90000\r\na=rtcp-fb:123 goog-remb\r\na=rtcp-fb:123 transport-cc\r\na=rtcp-fb:123 ccm fir\r\na=rtcp-fb:123 nack\r\na=rtcp-fb:123 nack pli\r\na=fmtp:123 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=4d001f\r\na=rtpmap:119 rtx/90000\r\na=fmtp:119 apt=123\r\na=rtpmap:35 H264/90000\r\na=rtcp-fb:35 goog-remb\r\na=rtcp-fb:35 transport-cc\r\na=rtcp-fb:35 ccm fir\r\na=rtcp-fb:35 nack\r\na=rtcp-fb:35 nack pli\r\na=fmtp:35 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=4d001f\r\na=rtpmap:36 rtx/90000\r\na=fmtp:36 apt=35\r\na=rtpmap:41 AV1/90000\r\na=rtcp-fb:41 goog-remb\r\na=rtcp-fb:41 transport-cc\r\na=rtcp-fb:41 ccm fir\r\na=rtcp-fb:41 nack\r\na=rtcp-fb:41 nack pli\r\na=rtpmap:42 rtx/90000\r\na=fmtp:42 apt=41\r\na=rtpmap:114 H264/90000\r\na=rtcp-fb:114 goog-remb\r\na=rtcp-fb:114 transport-cc\r\na=rtcp-fb:114 ccm fir\r\na=rtcp-fb:114 nack\r\na=rtcp-fb:114 nack pli\r\na=fmtp:114 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=64001f\r\na=rtpmap:115 rtx/90000\r\na=fmtp:115 apt=114\r\na=rtpmap:116 red/90000\r\na=rtpmap:117 rtx/90000\r\na=fmtp:117 apt=116\r\na=rtpmap:118 ulpfec/90000\r\na=ssrc-group:FID 4075787827 945566690\r\na=ssrc:4075787827 cname:s4eR7OuKnnPL0vKS\r\na=ssrc:4075787827 msid:rts video\r\na=ssrc:4075787827 mslabel:rts\r\na=ssrc:4075787827 label:video\r\na=ssrc:945566690 cname:s4eR7OuKnnPL0vKS\r\na=ssrc:945566690 msid:rts video\r\na=ssrc:945566690 
mslabel:rts\r\na=ssrc:945566690 label:video\r\nm=application 9 UDP/DTLS/SCTP webrtc-datachannel\r\nc=IN IP4 0.0.0.0\r\na=ice-ufrag:iQyM\r\na=ice-pwd:D3GXKCcUGvW9djaAozff5ppT\r\na=ice-options:trickle\r\na=fingerprint:sha-256 20:50:72:9B:A2:C0:D8:50:AD:D0:EF:A7:62:8F:EF:C3:AB:86:D5:B6:3E:17:22:69:79:5B:CE:E8:42:33:B5:E4\r\na=setup:actpass\r\na=mid:2\r\na=sctp-port:5000\r\na=max-message-size:262144\r\n"
        }
    }
    
    Response:
    {
        "trace_id":"...",
        "code":200,
        "jsep":{
            "type":"answer",
            "sdp":"v=0\r\no=- 1657264764 2 IN IP4 127.0.0.1 Omitted content"
        }
    }
                        
    In the preceding signaling request, push_stream indicates the ingest URL, which is similar to stream ingest over Real-Time Messaging Protocol (RTMP). Take note of the following points:
    • You must specify an MSID in the SDP message. GRTN uses the MSID to identify the media stream.
    • Codec negotiation is not supported for media streams. Specify exactly one codec for each MSID, because GRTN does not choose among multiple codecs. Supported codecs include Advanced Audio Coding (AAC), Opus, H.264, and H.265.
    • Sample code for audio (figures: audio1, audio2)
    • Sample code for video (figures: video1, video2)
  • Error handling

    If the stream ingest request succeeds, the status code 200 is returned. Otherwise, handle the error based on the status code in the code parameter of the JSON-formatted response body. The following example and tables describe the code and message parameters in the response:

    Response:
    {
       "code": 200, // A value of 200 indicates that the request is successful. For more information about status codes, see the "Status codes" section.
       "message": "success" // The returned message.
    }
    Table 8. Response parameters
    | Parameter | Type | Description |
    | --- | --- | --- |
    | code | int | The HTTP status code. For more information, see the "Status codes" section. |
    | message | string | The returned message. |
    Table 9. Status codes
    | Status code | Description |
    | --- | --- |
    | 403 | Indicates that the authentication failed. |
    | 611 | Indicates that the client must play the stream over TCP. |
    | 302 | Indicates that the client must send the request to a new address. |
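A client could map the status codes listed for playback and stream ingest to follow-up actions as in the following sketch. The function name and the action labels are hypothetical. Only the codes and their meanings come from the tables in this topic. How the new address for code 302 is delivered is not described here, so the sketch does not try to read it.

```javascript
// Hypothetical helper: translate the code parameter of a signaling
// response into an action label that the calling code can act on.
function interpretSignalingResponse(response) {
  switch (response.code) {
    case 200:
      return { ok: true, action: "proceed" };
    case 302:
      return { ok: false, action: "resend-to-new-address" };
    case 403:
      return { ok: false, action: "authentication-failed" };
    case 404:
      return { ok: false, action: "stream-not-found" };
    case 611:
      return { ok: false, action: "fall-back-to-tcp" };
    default:
      // Codes that are not listed in this topic are reported as unknown.
      return { ok: false, action: "unknown", message: response.message };
  }
}

console.log(interpretSignalingResponse({ code: 611, message: "" }));
```

Logging the trace_id of a failed response alongside the action makes later troubleshooting easier.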

Enhanced SDP negotiation

Messages are exchanged in the SDP format during signaling. SDP negotiation is generally based on RFC 4566. RTS extends SDP with additional semantics to suit the characteristics of the live streaming industry, and supports more audio and video formats and more communications protocols. This resolves the issue that WebRTC supports only the Opus format for audio and does not support B-frames, and it lets RTS accommodate a growing number of streaming protocols.

AAC supported for stream ingest and pulling

RTS can transmit audio in various AAC formats over RTMP. The AAC formats include AAC-LC, HE-AACv1, and HE-AACv2.

RTS can transmit AAC audio in the Low-overhead MPEG-4 Audio Transport Multiplex (LATM) container format. Depending on whether the audio data itself contains the encoding information, LATM transmits that information in in-band or out-of-band mode. In-band transmission sends the encoding information with each audio frame. Out-of-band transmission sends the encoding information only once. The muxconfigPresent parameter in an AudioMuxElement specifies whether the information in AudioSpecificConfig is transmitted in in-band or out-of-band mode, which makes LATM more flexible than Audio Data Transport Stream (ADTS). If the information in AudioSpecificConfig remains unchanged, the information in StreamMuxConfig can be transmitted in advance in an SDP message.
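The following sketch shows one way to wrap an AudioSpecificConfig in a StreamMuxConfig for the config parameter. It assumes the simplest StreamMuxConfig layout: audioMuxVersion 0, allStreamsSameTimeFraming 1, one subframe, one program, one layer, frameLengthType 0, latmBufferFullness 0xFF, and no other data or CRC. The function names are hypothetical. Under these assumptions the sketch reproduces the AudioSpecificConfig and config pairs listed in the stream pulling examples of this topic. Other senders may fill the trailing fields differently.

```javascript
// Convert a hex string (an optional 0x prefix is allowed) to a string
// of '0' and '1' characters.
function hexToBits(hex) {
  return Array.from(hex.replace(/^0x/i, ""))
    .map((digit) => parseInt(digit, 16).toString(2).padStart(4, "0"))
    .join("");
}

// Convert a bit string to hex, zero-padding to a whole number of bytes.
function bitsToHex(bits) {
  const padded = bits.padEnd(Math.ceil(bits.length / 8) * 8, "0");
  let hex = "";
  for (let i = 0; i < padded.length; i += 4) {
    hex += parseInt(padded.slice(i, i + 4), 2).toString(16);
  }
  return hex;
}

// Hypothetical helper: build a StreamMuxConfig hex string from an
// AudioSpecificConfig hex string under the assumptions stated above.
function streamMuxConfigFromAsc(ascHex) {
  const header =
    "0" +      // audioMuxVersion = 0
    "1" +      // allStreamsSameTimeFraming = 1
    "000000" + // numSubFrames = 0 (one subframe)
    "0000" +   // numProgram = 0 (one program)
    "000";     // numLayer = 0 (one layer)
  const trailer =
    "000" +      // frameLengthType = 0
    "11111111" + // latmBufferFullness = 0xFF
    "0" +        // otherDataPresent = 0
    "0";         // crcCheckPresent = 0
  return bitsToHex(header + hexToBits(ascHex) + trailer);
}

console.log(streamMuxConfigFromAsc("0x1210")); // prints 400024203fc0
```

The header occupies 15 bits, so the AudioSpecificConfig is not byte-aligned inside the result. This is why the config value cannot be produced by simple string concatenation.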

  • AAC supported for stream ingest

    You can inform the server of the AAC audio format by carrying relevant information in the SDP offer. In addition, you can add config=StreamMuxConfig, which is obtained from AudioSpecificConfig of the ingested stream, in the fmtp attribute. This way, the AudioSpecificConfig parameter can be carried to the server to generate an AAC header.

    SDP offer
    a=rtpmap:125 MP4A-LATM/48000/2
    a=fmtp:125 config=4000232000;cpresent=0;object=2;profile-level-id=1
  • AAC supported for stream pulling

    During signaling, RTS parses the audio encoding information of the ingested stream and returns the parsed information in the negotiation response, as shown in the following code.

    SDP offer:
    m=audio 9 UDP/RTP/AVPF 120 96
    a=rtpmap:120 MP4A-LATM/44100/2

    SDP answer, depending on the AAC format of the ingested stream:
    • AAC-LC (AudioSpecificConfig = 0x1210)
      a=rtpmap:120 MP4A-LATM/44100/2
      a=fmtp:120 cpresent=0;profile-level-id=1;object=2;config=400024203fc0
    • HE-AACv1 (AudioSpecificConfig = 2b920800)
      a=rtpmap:120 MP4A-LATM/44100/2
      a=fmtp:120 cpresent=0;profile-level-id=1;object=2;config=4000572410003fc0;SBR-enabled=1
    • HE-AACv2 (AudioSpecificConfig = eb8a0800)
      a=rtpmap:120 MP4A-LATM/44100/2
      a=fmtp:120 cpresent=0;object=2;profile-level-id=1;config=4001d71410003fc0;PS-enabled=1;SBR-enabled=1

    If SBR-enabled=1 is added to the fmtp attribute of MP4A-LATM, the AAC format is HE-AACv1. If both SBR-enabled=1 and PS-enabled=1 are added, the AAC format is HE-AACv2. Because the AAC format evolved from AAC-LC through HE-AACv1 to HE-AACv2, the SBR and PS fields in the fmtp attribute are enough to indicate the AAC format. In addition, config=StreamMuxConfig is added to the fmtp attribute. StreamMuxConfig is obtained from AudioSpecificConfig of the ingested stream and contains the detailed encoding parameters. The client can read these details as needed.
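The rule above for telling the AAC formats apart can be sketched as a small parser for the fmtp line of the SDP answer. The function names are hypothetical. Treating a line that has neither flag as AAC-LC is an assumption based on the three examples in this topic.

```javascript
// Hypothetical helper: parse an a=fmtp line into its payload type and
// a map of parameters.
function parseFmtp(line) {
  const match = /^a=fmtp:(\d+) (.*)$/.exec(line.trim());
  if (!match) return null;
  const params = {};
  for (const part of match[2].split(";")) {
    const [key, value = ""] = part.trim().split("=");
    if (key) params[key] = value;
  }
  return { payloadType: Number(match[1]), params };
}

// Hypothetical helper: classify the AAC format from the SBR and PS flags.
function aacFormatFromFmtp(params) {
  const sbr = params["SBR-enabled"] === "1";
  const ps = params["PS-enabled"] === "1";
  if (sbr && ps) return "HE-AACv2";
  if (sbr) return "HE-AACv1";
  return "AAC-LC";
}

const fmtp = parseFmtp(
  "a=fmtp:120 cpresent=0;profile-level-id=1;object=2;config=4000572410003fc0;SBR-enabled=1"
);
console.log(aacFormatFromFmtp(fmtp.params)); // prints HE-AACv1
```

The parsed config parameter still holds the StreamMuxConfig as a hex string, so a client that needs the sampling rate or channel layout can decode it separately.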


H.265 supported for stream ingest and pulling

  • H.265 supported for stream ingest

    You can inform the server of the H.265 video format by carrying relevant information in the SDP offer.

    SDP offer:
    a=rtpmap:102 H265/90000 
  • H.265 supported for stream pulling

    During the signaling process of stream pulling, the server obtains the video codec of the source stream, such as H.264 or H.265, and performs negotiation based on that information.

    SDP offer:
    a=rtpmap:102 H265/90000

    SDP answer:
    a=rtpmap:122 H265/90000
    a=fmtp:122
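After the SDP answer arrives, a client can check which codec the server negotiated by reading the rtpmap lines. The following sketch lists the rtpmap entries of an SDP string and tests for H.265. The function names are hypothetical.

```javascript
// Hypothetical helper: list the payload type, codec name, and clock
// rate of every a=rtpmap line in an SDP string.
function rtpmapEntries(sdp) {
  const entries = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=rtpmap:(\d+) ([^/\s]+)\/(\d+)/.exec(line);
    if (match) {
      entries.push({
        payloadType: Number(match[1]),
        codec: match[2],
        clockRate: Number(match[3]),
      });
    }
  }
  return entries;
}

// Hypothetical helper: report whether an SDP string contains an H.265 entry.
function hasH265(sdp) {
  return rtpmapEntries(sdp).some((entry) => entry.codec.toUpperCase() === "H265");
}

// Fragment based on the SDP lines in the H.265 example of this topic.
const answerFragment = "a=rtpmap:122 H265/90000\r\na=fmtp:122\r\n";
console.log(rtpmapEntries(answerFragment), hasH265(answerFragment));
```

A player that cannot decode H.265 can use such a check to stop early and report the problem instead of waiting for decoding to fail.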

B-frames supported for stream ingest and pulling

  • B-frames supported for stream ingest

    B-frames are supported in the ingested stream.

  • B-frames supported for stream pulling

    During signaling, the client can add a field to the SDP offer to specify whether it can decode videos that contain B-frames. For example, if the client adds BFrame-enabled=1 to the fmtp attribute, the client declares that it can decode videos that contain B-frames. In this case, the RTP timestamp is set to the PTS, and the client decodes the frames in order of increasing sequence number. If the client does not support videos that contain B-frames, RTS can transcode the source streams to remove B-frames.

    In addition, the server can return a composition timestamp (CTS). This allows the client to calculate the decoding timestamp (DTS) based on the following formula: Presentation timestamp (PTS) = DTS + CTS, that is, DTS = PTS - CTS. If an SDP offer contains a=extmap:{$id} uri:webrtc:rtc:rtp-hdrext:video:CompositionTime, RTS adds an RTP header extension whose identifier is {$id} to the first Real-time Transport Protocol (RTP) packet of each video frame. The value of the id variable is determined by the SDP offer that is sent by the client.

    RTS lets the client decide whether to decode videos that contain B-frames and whether to receive CTS information. This reflects the general-purpose approach to communication that RTS has followed from the start.
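The CTS handling described above can be sketched in two steps: find the extension identifier that the offer assigned to the CompositionTime URI, and derive the DTS from the PTS and the CTS. The function names are hypothetical. The sketch assumes that the CTS is expressed in the same units as the RTP timestamp and that timestamps wrap at 32 bits, as RTP timestamps do. The byte layout of the extension payload is not described in this topic, so the sketch does not parse it.

```javascript
const COMPOSITION_TIME_URI = "uri:webrtc:rtc:rtp-hdrext:video:CompositionTime";

// Hypothetical helper: return the extmap identifier that the SDP offer
// assigned to the CompositionTime extension, or null if it is absent.
function compositionTimeExtensionId(offerSdp) {
  for (const line of offerSdp.split(/\r?\n/)) {
    const match = /^a=extmap:(\d+)(?:\/\S+)? (\S+)/.exec(line);
    if (match && match[2] === COMPOSITION_TIME_URI) return Number(match[1]);
  }
  return null;
}

// DTS = PTS - CTS, computed modulo 2^32 so that a PTS that has just
// wrapped around still yields a valid 32-bit timestamp.
function dtsFromPts(pts, cts) {
  return (pts - cts) >>> 0;
}

// Shortened sample offer. The identifier 9 is an arbitrary example.
const offerFragment =
  "a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\n" +
  "a=extmap:9 uri:webrtc:rtc:rtp-hdrext:video:CompositionTime\r\n";
console.log(compositionTimeExtensionId(offerFragment)); // prints 9
console.log(dtsFromPts(9000, 3000)); // prints 6000
```

For a frame without B-frame reordering the CTS is 0, so the DTS equals the PTS.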

The following figures show the partial content of the SDP offer and the packet capture during stream pulling:

Partial content of the SDP offer

Signaling request carrying metadata supported for stream ingest

Ingested streams over RTMP carry metadata, which can be used to return callbacks and provide information for the stream pulling client. However, the WebRTC signaling process does not involve metadata. As a remedial measure, RTS allows metadata to be carried in the signaling request. This way, metadata can be generated in the ingested stream.

Add the metadata field to the body of the signaling request for the ingested stream:

{
    "version":2,
    "sdk_version":"0.0.1",
    "mode":"rtc",
    "push_stream":"artc://host/app/name",
    "jsep":{
        "type":"offer",
        "sdp":"..."
    },
    "metadata":{
        "framerate":"20",
        "platform":"iOS",
        "audiodatarate":"200",
        "videodatarate":"2000"
    }
}
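Attaching the metadata field to an ingest request could look like the following sketch. The function name withMetadata is hypothetical. Converting every value to a string is an assumption based on the sample above, where all metadata values are strings.

```javascript
// Hypothetical helper: return a copy of a signaling request body with
// a metadata field added. Values are converted to strings to match
// the sample request.
function withMetadata(request, metadata) {
  const stringified = {};
  for (const [key, value] of Object.entries(metadata)) {
    stringified[key] = String(value);
  }
  return { ...request, metadata: stringified };
}

// The jsep.sdp value is elided here, as in the sample request.
const ingestRequest = withMetadata(
  {
    version: 2,
    sdk_version: "0.0.1",
    mode: "rtc",
    push_stream: "artc://host/app/name",
    jsep: { type: "offer", sdp: "..." },
  },
  { framerate: 20, platform: "iOS", audiodatarate: 200, videodatarate: 2000 }
);
console.log(JSON.stringify(ingestRequest, null, 4));
```

The original request object is not modified, so the same base request can be reused with different metadata.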

MSID mechanism

For more information about MSID, see The Msid Mechanism. Pay attention to the following content.
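Because GRTN identifies media streams by MSID, it can help to check which MSID each media section of an offer declares before the offer is sent. The following sketch reads the a=msid lines per media section. The function name is hypothetical. The sample values rts audio and rts video come from the requests in this topic.

```javascript
// Hypothetical helper: return the media kind and the a=msid value of
// each media section (m= line) in an SDP string.
function msidsByMediaSection(sdp) {
  const sections = [];
  let current = null;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      current = { kind: line.slice(2).split(" ")[0], msid: null };
      sections.push(current);
    } else if (current && line.startsWith("a=msid:")) {
      current.msid = line.slice("a=msid:".length).trim();
    }
  }
  return sections;
}

// Shortened sample for illustration only.
const msidSample =
  "v=0\r\n" +
  "m=audio 9 UDP/TLS/RTP/SAVPF 111\r\n" +
  "a=msid:rts audio\r\n" +
  "m=video 9 UDP/TLS/RTP/SAVPF 127\r\n" +
  "a=msid:rts video\r\n";
console.log(msidsByMediaSection(msidSample));
```

A media section whose msid is null has no a=msid line, which a client can treat as a reason not to send the ingest request.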