Several NALUs per RTP packet - can browser's H.264 decoder cope with that?


Neil Young

Apr 4, 2022, 3:19:56 PM
to discuss-webrtc
I have an RTSP source (Parrot Anafi) that produces RTP packets containing several NALUs, for instance SPS, PPS and a non-IDR slice in one RTP packet.

Using a GStreamer webrtcbin pipeline, I was trying to relay these RTP packets "as is" into a WebRTC connection, but neither my media server (Kurento, OpenH264) nor any of the browsers I have access to (Safari, Chrome, Firefox) is able to display the video.

ICE connectivity is established, but then the video element just displays a spinning wheel.

Does that make sense?

The same pipeline has no problem with an RTSP source that delivers each NALU in a separate packet; forwarding that to the browsers works.

Neil Young

Apr 4, 2022, 4:55:29 PM
to discuss-webrtc
> Does that make sense? 

Yes

MANEs MAY convert single NAL unit packets into one aggregation packet, convert an aggregation packet into several single NAL unit packets, or mix both concepts, in an RTP translator. The RTP translator SHOULD take into account at least the following parameters: path MTU size, unequal protection mechanisms (e.g., through packet-based FEC according to RFC 2733 [18], especially for sequence and picture parameter set NAL units and coded slice data partition A NAL units), bearable latency of the system, and buffering capabilities of the receiver.
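
The conversion mentioned in that passage (one aggregation packet into several single NAL unit packets) can be sketched in Python. This is only a minimal illustration of the STAP-A layout from RFC 6184 (one header byte with type 24, then repeated 16-bit big-endian sizes each followed by a NAL unit). The function name `split_stap_a` and the sample bytes are made up for the example, and FU-A fragments (type 28) are not reassembled here.

```python
import struct
from typing import List

STAP_A = 24  # RTP payload NAL type for a single-time aggregation packet (RFC 6184)


def split_stap_a(payload: bytes) -> List[bytes]:
    """Split one RTP payload into NAL units if it is a STAP-A packet."""
    if not payload:
        raise ValueError("empty payload")
    if payload[0] & 0x1F != STAP_A:
        # Not an aggregation packet: hand it back unchanged.
        # (FU-A fragments would need reassembly, which this sketch skips.)
        return [payload]
    nalus = []
    offset = 1  # skip the STAP-A header byte
    while offset + 2 <= len(payload):
        (size,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        if size == 0 or offset + size > len(payload):
            raise ValueError("truncated or malformed STAP-A payload")
        nalus.append(payload[offset:offset + size])
        offset += size
    return nalus


# Hypothetical payload: SPS (type 7), PPS (type 8) and a non-IDR slice (type 1)
sps = bytes([0x67, 0x4D, 0x00, 0x1F])
pps = bytes([0x68, 0xEE, 0x3C, 0x80])
non_idr = bytes([0x41, 0x9A, 0x00])
payload = bytes([0x78]) + b"".join(
    struct.pack("!H", len(n)) + n for n in (sps, pps, non_idr)
)
units = split_stap_a(payload)
print([u[0] & 0x1F for u in units])  # NAL unit types found in the packet
```

Each returned unit starts with its own NAL header byte, so the low five bits of the first byte give the NAL unit type.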

Vitaly Ivanov

Apr 4, 2022, 10:23:02 PM
to discuss...@googlegroups.com
Did you try different packetization-mode values?


Neil Young

Apr 5, 2022, 7:43:20 AM
to discuss-webrtc
I don't have access to this part, I'm just consuming the RTPS stream, rtph264depaying it and rtph264paying it again with GStreamer. As said, this works with a Chinese SRICAM, but not with the ANAFI. And I suppose it is due to the use of STAP-A.

This video for a colleague demonstrates the difference in the inputs coming from both devices. Forwarding the repacketized NALUs "as is" works for the SRICAM, but not for the ANAFI: the browsers are unable to decode it.

I'm sure the payload from the ANAFI drone is OK, but somehow the browsers cannot make sense of it. The relay pipeline I'm using can be seen in the video. It works for the other device.

Neil Young

Apr 5, 2022, 7:45:10 AM
to discuss-webrtc
"I'm just consuming the *RTSP stream..." of course

Vitaly Ivanov

Apr 5, 2022, 8:11:44 AM
to discuss...@googlegroups.com
I don't see a single IDR slice in the ANAFI stream (SEI has nothing to do with IDR, btw), so it's not surprising that it's not playing back. I don't know if STAP-A is causing problems, but missing IDRs certainly do: you cannot start decoding a stream without an IDR.

Neil Young

Apr 5, 2022, 8:16:57 AM
to discuss-webrtc
Yes, I wasn't seeing one either. SPS and PPS and SEI and non-IDR... very strange.

However, this thing plays back when using the entire GStreamer chain, so rtph264depay -> h264parse -> avdec_h264 -> videoconvert -> autovideosink.

Chema Gonzalez

Apr 5, 2022, 2:08:37 PM
to discuss-webrtc
IIUC, the offending (second in the YT video) stream is broken: You have SPS/PPS, then SEI (receiver can ignore these and still present something), then non-IDRs (which should be ignored as they don't have an IDR to refer to), but you have no IDRs. Can you do a longer capture and check if they eventually show up?

-Chema

Neil Young

Apr 5, 2022, 2:21:37 PM
to discuss-webrtc
OK. I have added a trace to the GStreamer source code (gst-plugins-good, rtph264depay), since I assumed it had a problem decoding STAP-A. In fact, there is no IDR frame at all.

I see SPS and PPS and tons of non-IDR slices (not traced here), but if I correctly treat NALU type 5 as an IDR frame, then there is none.

Is it certain that the IDR frame is vital for starting video decoding on the browser side? Please confirm. It also looks as if the SPS/PPS sequence is sent every 2 seconds, which would match the common behaviour of sending a keyframe once every 2 seconds (at least this is what I have seen on a restreamer page).

0:01:00.050848151 20737 0x707092f0 WARN            rtph264depay gstrtph264depay.c:899:gst_rtp_h264_depay_handle_nal: SPS

0:01:00.051000703 20737 0x707092f0 WARN            rtph264depay gstrtph264depay.c:903:gst_rtp_h264_depay_handle_nal: PPS

0:01:00.131786525 20737  0x27afb50 INFO               webrtcbin gstwebrtcbin.c:5596:on_rtpbin_ssrc_active:<webrtcbin> session 0 ssrc 1 active

0:01:00.333446184 20737  0x27afb50 INFO               webrtcbin gstwebrtcbin.c:5596:on_rtpbin_ssrc_active:<webrtcbin> session 0 ssrc 1 active

0:01:00.533416996 20737  0x27afb50 INFO               webrtcbin gstwebrtcbin.c:5596:on_rtpbin_ssrc_active:<webrtcbin> session 0 ssrc 1 active

0:01:00.732810571 20737  0x27afb50 INFO               webrtcbin gstwebrtcbin.c:5596:on_rtpbin_ssrc_active:<webrtcbin> session 0 ssrc 1 active

0:01:00.950763602 20737 0x707092f0 WARN            rtph264depay gstrtph264depay.c:899:gst_rtp_h264_depay_handle_nal: SPS

0:01:00.950883080 20737 0x707092f0 WARN            rtph264depay gstrtph264depay.c:903:gst_rtp_h264_depay_handle_nal: PPS

0:01:01.134798019 20737  0x27afb50 INFO               webrtcbin gstwebrtcbin.c:5596:on_rtpbin_ssrc_active:<webrtcbin> session 0 ssrc 1 active

0:01:01.333826698 20737  0x27afb50 INFO               webrtcbin gstwebrtcbin.c:5596:on_rtpbin_ssrc_active:<webrtcbin> session 0 ssrc 1 active

0:01:01.534764433 20737  0x27afb50 INFO               webrtcbin gstwebrtcbin.c:5596:on_rtpbin_ssrc_active:<webrtcbin> session 0 ssrc 1 active

0:01:01.742393807 20737  0x27afb50 INFO               webrtcbin gstwebrtcbin.c:5596:on_rtpbin_ssrc_active:<webrtcbin> session 0 ssrc 1 active

0:01:01.854877261 20737 0x707092f0 WARN            rtph264depay gstrtph264depay.c:899:gst_rtp_h264_depay_handle_nal: SPS

0:01:01.855516581 20737 0x707092f0 WARN            rtph264depay gstrtph264depay.c:903:gst_rtp_h264_depay_handle_nal: PPS
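
The spacing of the SPS entries in a trace like this can be checked by differencing the timestamps. The Python sketch below does that for the three SPS timestamps copied from the excerpt above; the helper name `to_seconds` is made up for the example. In this short excerpt the spacing comes out at roughly 0.9 s, so a longer trace would be needed to say what the regular interval really is.

```python
# SPS timestamps copied from the rtph264depay trace above (h:mm:ss.nnnnnnnnn)
sps_times = ["0:01:00.050848151", "0:01:00.950763602", "0:01:01.854877261"]


def to_seconds(stamp: str) -> float:
    """Convert a GStreamer debug-log timestamp to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


secs = [to_seconds(s) for s in sps_times]
# Difference between each SPS and the one before it
intervals = [later - earlier for earlier, later in zip(secs, secs[1:])]
print([round(i, 3) for i in intervals])
```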

I'm now going to create a longer tcpdump in order to extract the H.264 and let a stream analyzer run over it.

  

But the most important question is: Is it true that WebRTC video decoding in browsers does not start before at least one IDR has been received? If this is true, I could try to go to Parrot and complain :)



Neil Young

Apr 5, 2022, 3:17:20 PM
to discuss-webrtc
Uploaded a trace of 1 minute running...

gst-launch-1.0 -v rtspsrc location=rtsp://192.168.42.1/live ! fakesink


I can't see any IDR frame. 

If you open this in Wireshark you will most likely just see RTP frames (or UDP in the worst case), unless you have the H.264 decoding filter enabled. If that is not yet set up, do this:

1) Mark a UDP packet, right click, "Decode as...RTP", then save. Packets should be shown as RTP now.
2) The dynamic payload type is 96, so go to "Wireshark/Preferences/Protocols", unfold the protocol list, select the H264 filter and enter the number 96 as "dynamic payload type".

Now you should see the packets as H.264.

I additionally tried to extract the H.264 NALUs using this LUA plugin (which usually works fine):


But the results were strange: I ran FFMPEG over the extracted H.264. FFMPEG produced a lot of errors ("Invalid NALU type 0" and others) and came up with a 39-second black MP4 video :) So either I can't really trust the extraction result, or the input is already garbage.

My next attempt will be to put a filesink at the end of "rtph264depay ! h264parse" and examine the output. Maybe this gives more info.

TIA

Kevin Wang

Apr 5, 2022, 4:24:18 PM
to discuss...@googlegroups.com
I had a tough time tuning x264 to work correctly with Chrome WebRTC. You might have to reencode your bitstream to get it to play back. For example, subtle issues like this one: https://groups.google.com/g/discuss-webrtc/c/3tLWL9yyjsA can prevent playback completely. Definitely getting a clean ffmpeg playback is a worthwhile first step.

Regarding your question, I haven't seen that an IDR frame is necessary, my playback always starts on SPS/PPS. Having several NALUs per packet also hasn't been an issue in my experience but I am not familiar enough with libwebrtc to say with certainty.

Neil Young

Apr 5, 2022, 5:05:37 PM
to discuss-webrtc
Next step: I added some parameters to rtph264depay in order to make it unpack the RTP into something I'm able to recognize (especially everything that starts with 00 00 00 01 :)):

This pipeline produces H.264, which can then easily be transformed to an MP4 and displayed:

gst-launch-1.0 -v rtspsrc location=rtsp://192.168.42.1/live ! rtph264depay ! video/x-h264,stream-format=byte-stream,alignment=nal ! h264parse ! filesink location=dump.h264

The "h264parse" step isn't really necessary, but doesn't hurt either.

Then I was using FFMPEG to make it a displayable MP4:

ffmpeg -framerate 30 -i dump.h264 -c copy output.mp4

The appearance of the video was funny: like in an old Western movie, when the scene is revealed through an enlarging pinhole. This again points to the suspicion that Anafi is NOT sending IDR frames. Moreover, viewing the image over time definitely doesn't show the known full-screen artefacts that appear when an IDR is displayed.
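
Whether a byte-stream dump such as dump.h264 contains any IDR slices can also be checked directly by counting NAL unit types. The following Python sketch scans for Annex B start codes (00 00 01, which also matches the 4-byte form) and reads the low five bits of the byte that follows. The function names and the synthetic sample bytes are made up for the example; the type numbers (1 non-IDR slice, 5 IDR slice, 6 SEI, 7 SPS, 8 PPS) are the standard H.264 ones.

```python
from collections import Counter
from typing import Iterator

NAL_NAMES = {1: "non-IDR slice", 5: "IDR slice", 6: "SEI", 7: "SPS", 8: "PPS"}


def iter_nal_types(data: bytes) -> Iterator[int]:
    """Yield nal_unit_type for every NAL unit in an Annex B byte stream."""
    pos = data.find(b"\x00\x00\x01")
    while pos != -1:
        header = pos + 3  # first byte after the start code is the NAL header
        if header >= len(data):
            break
        yield data[header] & 0x1F
        pos = data.find(b"\x00\x00\x01", header)


def count_nal_types(data: bytes) -> Counter:
    return Counter(iter_nal_types(data))


# Synthetic sample: SPS, PPS and two non-IDR slices, no IDR
sample = (
    b"\x00\x00\x00\x01\x67\x4d"
    b"\x00\x00\x00\x01\x68\xee"
    b"\x00\x00\x01\x41\x9a"
    b"\x00\x00\x01\x41\x9b"
)
counts = count_nal_types(sample)
for nal_type, n in sorted(counts.items()):
    print(nal_type, NAL_NAMES.get(nal_type, "other"), n)
print("IDR slices:", counts[5])
```

For a real dump, the sample bytes would be replaced by the file contents, for example `count_nal_types(open("dump.h264", "rb").read())`.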

However, I could create a valid MP4 from it and watch it. Perfect.

Now I "just" need to figure out, what rtph264pay parameters need to be applied, so that it appears in the browser.

@Kevin Wang: I know, I know... This is not my first attempt to marry H.264 and the browsers. The special new thing here is that all my tricks (SDP munging, profile-level-id faking and stuff) didn't help at all this time. And more: the video appears in NONE of the browsers I have access to: not in Chrome, Chromium or Safari, and not in Firefox either. Up to now I haven't seen such a degree of resistance :)

However, transcoding is not an option. I know that it works, but I'm supposed to run this on an edge device, and that is simply too weak for stuff like that.

If in the end it is just the IDR... well, then the ball would be in Parrot's court.

Neil Young

Apr 5, 2022, 5:13:31 PM
to discuss-webrtc
BTW: I wanted to add that this time it is not the case that "nothing" happens: webrtc-internals shows that there is a perfect bitstream flowing from my WebRTC proxy to the browser (5 Mbit/s, as the SDP orders), and all required stages of ICE connectivity have been passed; even the data channels are already open. So I feel I'm really, really close, all the more since I know that it works with other encoders this way (see the SRICAM RTSP). Also, all GStreamer logs tell me that video is sent out. It finally fails at a pretty high level.

It might be just a little step. Over the weekend I was desperately trying to feed YT with an RTMP stream I created on the fly from H.264 on a DJI drone. It didn't work. But it worked with FB and Twitter and some re-streamers. Just not with YT. Then I found a hidden notice in one github: YT RTMP stream expects AUDIO.... but my drone video didn't have audio. So in a last attempt I merged a silent audio stream via FFMPEG to my video and booom: It worked...

Sometimes it is just a bit.

Never give up :)

Vitaly Ivanov

Apr 5, 2022, 10:23:58 PM
to discuss...@googlegroups.com
Your description sounds very much like a technique called intra-refresh - no IDRs, but there's a rolling row(s) of intra-coded macroblocks from which you can gradually recreate the whole picture. This way you can avoid bit rate spikes when sending an IDR.
This intra-refresh thing is relatively exotic (though supported by x264) and very likely not supported by browsers. You see a stream coming in, but the video decoder is just dropping it all waiting for an IDR slice to start
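
To illustrate the idea, here is a toy Python model of a rolling intra band. It is a sketch under simplifying assumptions, not a description of how the Anafi or x264 actually schedules the refresh: the picture is split into a hypothetical number of bands, each frame intra-codes the next few bands, and a refreshed band is assumed to stay clean afterwards. A decoder that joins mid-stream then never receives one fully intra-coded picture, but has a complete image once the band has swept the whole frame.

```python
def frames_until_clean(total_bands: int, bands_per_frame: int, start_band: int = 0) -> int:
    """Count frames until every band has been intra-refreshed at least once."""
    if total_bands < 1 or bands_per_frame < 1:
        raise ValueError("total_bands and bands_per_frame must be positive")
    clean = [False] * total_bands  # nothing is decodable when the decoder joins
    band = start_band % total_bands
    frames = 0
    while not all(clean):
        # This frame carries intra-coded macroblocks for the next few bands.
        for _ in range(bands_per_frame):
            clean[band] = True
            band = (band + 1) % total_bands
        frames += 1
    return frames


# Hypothetical numbers: 45 bands (e.g. macroblock rows of a 720-line picture)
print(frames_until_clean(45, 1))                 # one band per frame
print(frames_until_clean(45, 2, start_band=10))  # two bands per frame, joining mid-sweep
```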

Neil Young

Apr 6, 2022, 2:06:51 AM
to discuss-webrtc
Interesting. I have read a bit about it. I'm not sure, though: the x264 settings page states that the first frame is always an IDR. (http://www.chaneru.com/Roku/HLS/X264_Settings.htm#intra-refresh)

I'm pretty sure that there is not a single IDR in this payload. I made some attempts to consume the stream with FFMPEG. It works perfectly if you just record it to a file and convert it to MP4 for display. It's just that the initial scene kind of "unfolds" from inside to outside for one or two seconds, like a '70s Western intro, as I tried to explain.

Also, a local re-stream of the NAL units using UDP, with a GStreamer pipeline on the other side that displays it, works absolutely fine. But I can't convince a browser to "eat" that.

Like this:

1) One process:

ffmpeg -i rtsp://192.168.42.1/live -f h264  -vcodec copy udp://127.0.0.1:1234

2) Another process

gst-launch-1.0 -v udpsrc port=1234 ! h264parse ! avdec_h264 ! videoconvert ! autovideosink

This works.

BTW: This is what FFMPEG tells about the H.264 internals:

Input #0, rtsp, from 'rtsp://192.168.42.1/live':
  Metadata:
    title           : live
    comment         : ANAFI-G134295
  Duration: N/A, start: 1.101100, bitrate: N/A
  Stream #0:0: Video: h264 (Main), yuv420p(tv, bt709, progressive), 1280x720 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 90k tbn
Output #0, h264, to 'udp://127.0.0.1:1234':
  Metadata:
    title           : live
    comment         : ANAFI-G134295
    encoder         : Lavf59.16.100
  Stream #0:0: Video: h264 (Main), yuv420p(tv, bt709, progressive), 1280x720 [SAR 1:1 DAR 16:9], q=2-31, 29.97 fps, 29.97 tbr, 29.97 tbn
Stream mapping:
  Stream #0:0 -> #0:0 (copy)


Astonishingly enough, I have also seen another encoder version in another attempt... This is a strange little thing.

I fear it will really not work w/o transcoding...:/

But thanks for the confirmation, that w/o IDR nothing will work.

Chema Gonzalez

Apr 6, 2022, 1:29:45 PM
to discuss-webrtc
> Your description sounds very much like a technique called intra-refresh - no IDRs, but there's a rolling row(s) of intra-coded macroblocks from which you can gradually recreate the whole picture. This way you can avoid bit rate spikes when sending an IDR.
> This intra-refresh thing is relatively exotic (though supported by x264) and very likely not supported by browsers. You see a stream coming in, but the video decoder is just dropping it all waiting for an IDR slice to start

Yeah. Something similar called GDR (gradual decoding refresh) was added in VVC/H.266.

-Chema

Chema Gonzalez

Apr 6, 2022, 1:30:43 PM
to discuss-webrtc
>  Just that the initial scene kind of "unfolds", from inside to outside for one or two seconds, as tried to explain like a 70th Western intro.
Can you please add a couple of images?

-Chema

Neil Young

Apr 6, 2022, 1:32:59 PM
to discuss...@googlegroups.com
I'll make a new short video.

Sent from my iPad

On 06.04.2022 at 19:30, Chema Gonzalez <che...@gmail.com> wrote:

>  Just that the initial scene kind of "unfolds", from inside to outside for one or two seconds, as tried to explain like a 70th Western intro.

Neil Young

Apr 6, 2022, 1:52:58 PM
to discuss-webrtc
OK, here we go: https://youtu.be/1W4nxA0pw6g

I just made a partial recording of the screen. Unfortunately my video overlay always appears that small; I didn't find a way to resize it by default.

Can you see the "unfolding" effect as well as the macro block sparkle?

Pipeline:

gst-launch-1.0 rtspsrc location=rtsp://192.168.42.1/live is-live=true connection-speed=3000 ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! autovideosink

Neil Young

Apr 6, 2022, 2:32:59 PM
to discuss-webrtc
I think I have now found a kind of "smoking gun", which seems to confirm everything that has been discussed here:

I fired up Chromium with logging enabled, and after a lot of output and a finally successful ICE and SDP negotiation, the browser cries endlessly:

[15032:48131:0406/202231.110771:WARNING:video_receive_stream2.cc(951)] No decodable frame in 200 ms, requesting keyframe.


Neil Young

Apr 7, 2022, 5:25:17 AM
to discuss-webrtc
A friendly contributor in the Anafi developer forum confirmed these observations. Would like to share that here in order to finish this thread.


It makes no sense to continue here. I will have to go with transcoding, or maybe wait until either Anafi moves closer to the browsers or the browsers implement tolerance for intra-refresh (whichever happens first).

Thanks to all contributors. Great discussion.