I picked another recording at random from our archive. The problem occurs there too, but it is slightly different: in this case the audio is ahead and the video lags behind. The problem is noticeable from the very beginning, but the cause is probably the same, since I can see a resolution change at 00:06. I attach the mjr files (test2 - guy talking):
ffmpeg info about the converted files:
Input #0, ogg, from 'audio.opus':
  Duration: 00:35:23.16, start: 0.000000, bitrate: 29 kb/s
Input #0, matroska,webm, from 'video.webm':
  Metadata:
    encoder         : Lavf54.20.4
  Duration: 00:36:36.43, start: 0.000000, bitrate: 172 kb/s
    Stream #0:0: Video: vp8, yuv420p, 1024x576, SAR 1:1 DAR 16:9, 1 fps, 25.42 tbr, 1k tbn, 1k tbc (default)
As you can see, the durations differ by more than 1 minute.
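For reference, the exact gap can be worked out from the two Duration values reported above. This is just a small sketch; `to_seconds` is a helper name I made up for illustration, and it assumes `awk` is available.

```shell
# Convert an ffmpeg-style HH:MM:SS.ss duration to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Duration values copied from the ffmpeg output above.
video=$(to_seconds 00:36:36.43)
audio=$(to_seconds 00:35:23.16)

# Subtract in awk, since plain shell arithmetic is integer-only.
diff=$(awk -v v="$video" -v a="$audio" 'BEGIN { printf "%.2f", v - a }')
echo "video is ${diff}s longer than audio"
```

So the video file is roughly 73 seconds longer than the audio file.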
The ffmpeg commands that I use (FYI):
ffmpeg -i "$VIDEO.webm" -c:v libx264 "${VIDEO}_converted.mp4"
ffmpeg -i "${VIDEO}_converted.mp4" -i "$AUDIO.opus" -c:v copy -c:a aac -strict experimental -y "${VIDEO}_merged.mp4"
As before, after re-encoding the video plays fine: there are no weird resize issues in Totem and no crashes in VLC, but audio and video are out of sync.
It seems that the resolution changes mess up the sync. Unfortunately, it is said that forcing WebRTC to use a fixed resolution is impossible, and that this can be overcome in post-processing:
But of course that doesn't help us, since we lose sync at the exact moment the resolution changes, and later manipulation won't fix it.
On the other hand, I recorded a test video using the demo (the other samples I provided were recorded with our config, where bitrates and resolutions were considerably higher), in which I did some clapping and finger snapping at the end, and that was also out of sync. The problem is that I couldn't find any moment where the resolution changes, and the video that was not re-encoded plays fine. That may lead to the conclusion that losing sync isn't related to resolution changes, which is kind of interesting, since before the sync was lost exactly when the resolution changed. I also attach these files (test3 - window and clapping):
Well, these findings contradict each other. I'm confused.
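To look for resolution changes more systematically than by watching the playback, something like the following could be used. It is only a sketch: `frame_sizes` and `resolution_changes` are names I made up, it assumes `ffprobe` is installed, and it assumes ffprobe prints the timestamp before width and height (older builds call the timestamp field `pkt_pts_time` instead of `pts_time`).

```shell
# Print the first frame and every frame whose size differs from the previous one.
# Input: CSV lines of the form "pts_time,width,height".
resolution_changes() {
    awk -F, '
        { size = $2 "x" $3 }
        size != prev { print $1 "s " size; prev = size }
    '
}

# Dump per-frame timestamp and size for the first video stream.
# Assumption: ffprobe is installed and emits the fields in this order.
frame_sizes() {
    ffprobe -v error -select_streams v:0 \
        -show_entries frame=pts_time,width,height \
        -of csv=p=0 "$1"
}

# Example: frame_sizes video.webm | resolution_changes
```

If this prints only one line for the test3 recording, the stream really has a single resolution throughout, which would support the idea that the sync loss there has a different cause.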