VDH can't download it? Maybe ffmpeg can.

Wild Willy

Nov 25, 2021, 4:21:31 AM

to Video DownloadHelper Q&A

To set the stage, I am doing this on Windows 7 64-bit, Firefox 94.0.2 64-bit, licensed VDH 7.6.3a1 beta, CoApp 1.6.3, VLC 3.0.16 Vetinari. The fact that I have a VDH license makes no difference, as you can learn by following references found in this thread:

https://groups.google.com/g/video-downloadhelper-q-and-a/c/BzPLK2YyL-s

Everything I'm showing here works the same even if you don't have a VDH license. But that's pretty much irrelevant anyway because I'm discussing how to do this when VDH doesn't get the video you want.

To start, let's go to this web page:

https://twitter.com/canadiensMTL

I am not a member of Twitter (or any social media site, for that matter) so being logged on there is not an issue. But I chose Twitter because many people have had trouble getting videos off Twitter. Don't get tied up in that. What I'm showing here is something that can apply to many other web sites as well.

Twitter pages tend to have many videos on them. I've scrolled down to one that is nice & short. But you'll notice that I've also scrolled the VDH menu & there isn't an entry for the one in the image, the one that's just 16 seconds long. Truth in advertising, it does appear further down the VDH menu. But depending on how far down the Twitter page your selected video might be, it could be like searching for a needle in a haystack to find it on the VDH menu. But there's a way to simplify this. It may not be the best way, but like I say, I'm not a member of Twitter so this is what I've discovered. If you have a better way, do feel free to post about it in this thread.

To isolate just the video I want, right-click (mouse button 2, or MB2) on the video. Don't play the video, just pop up the context menu.

#02.png

Not much of a context menu. Still, it's got what we need. Click that one item. This puts the URL of just this one video into the system clipboard. Now click in the address bar, paste in the URL, & hit Enter.

https://twitter.com/i/status/1463581536725475337

#03.png

Hmmmm....... The VDH menu shows a variant of length 34 seconds for one 16-second video. Maybe I ought to go ahead & click play on the video.

#04.png

Doesn't make any difference. I guess the information provided by Twitter to VDH isn't enough for VDH to get the duration right. OK. Let's assume this variant is our video. Let's go ahead & download it.

#05.png

I don't like that default file name that VDH has picked. I'll replace it with the name of the guy in the video.

#06.png

Now here's the VDH download progress status menu for this. Doesn't look promising.

#07.png

Wild Willy

Nov 25, 2021, 5:26:57 AM

to Video DownloadHelper Q&A

Indeed, it wasn't.

Time to go to plan B. Open the Firefox developer tools, which include the Network Monitor, by pressing the F12 key.

It is possible that you won't have exactly what I'm showing. You need to select the highlighted items Network and All. The other selections to the right of All might also be selected at the same time as All. Unselect them if necessary. Then type m3u8 into the Filter field.

What is m3u8? It is the file name extension conventionally used for manifest files (also called playlists) in the HLS streaming format. We are looking for any manifest file that might be present here. Why are we interested in manifests? This will become clear as we go along here.

So far, there isn't anything showing in the Network Monitor. Not too helpful. Hit F5 to reload the page.

#10.png

Good. We do have manifests. Some sites don't, notably YouTube. In that case, this whole discussion is of no help at all. Sorry. That's just the way it is. There is no single standard that every web site follows for presenting audio-visual content, so there are lots of sites that do not supply m3u8 manifests. I suspect that in some cases, this is intentionally meant to thwart downloaders such as VDH & people who use the technique I'm discussing here. But where there is a manifest, there is a way. In this case, we are lucky.

So now that we know we have a manifest, what good does it do us? Let's look inside one. On the assumption that the first one is the most important one, let's look at that one. Double click the first entry in the list. In this image, I am hovering the mouse over the entry before double clicking it.

#11.png

Note how the complete URL in the hover text includes the string shown in the Network Monitor. The Network Monitor shows only the last part of the URL. You have to hover the mouse over the item in the Network Monitor to see the complete URL. Also notice the string m3u8 slyly buried in the middle of all that. That's why you filter. With the filter field empty, the Network Monitor would fill up with all kinds of stuff over time. Try it yourself some time. You'll see what I mean.
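If you ever copy a batch of URLs out of the Network Monitor & want to pick out the manifests with a script instead of the Filter field, here is a minimal Python sketch. The function name is_m3u8 is made up for illustration. It looks only at the path part of the URL, so a trailing query string such as ?container=fmp4 does not hide the extension.

```python
from urllib.parse import urlsplit


def is_m3u8(url: str) -> bool:
    """Return True if the path part of the URL (ignoring any ?query) ends in .m3u8."""
    return urlsplit(url).path.lower().endswith(".m3u8")


# Two URLs taken from this thread: one manifest, one ordinary page.
urls = [
    "https://video.twimg.com/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4",
    "https://twitter.com/i/status/1463581536725475337",
]

manifests = [u for u in urls if is_m3u8(u)]
print(manifests)
```

This is stricter than typing m3u8 into the Filter field, which matches the string anywhere in the URL.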

So now I double click on that entry. This normally brings up a Firefox dialog offering to display or save or otherwise handle the file. I have long ago gone through the exercise of setting my text editor as the default handler for objects of type m3u8. You do whatever you need to do on your system to get the manifest to display in your text editor. I'm talking about Notepad or a replacement for it. I happen to use Notepad++ but there are others. You see, a manifest is just a plain text file. Nothing mysterious to it. It's plain text that anybody can read. Here is the manifest I got by double clicking that entry in the Network Monitor. The lines are long but I can't figure out how to stop Google from splitting & wrapping the lines. So I've inserted something to indicate where the lines really begin & end. I have attached the Master Manifest as a file so you can see what it really looks like. But that just doesn't serve my purposes. I just want you to understand that some of this text is broken into lines that aren't really lines in the manifest. I'm also showing this text with altered colors because I can't figure out how to tell Google to treat this as code or something like that to make it scroll horizontally. If it gets really confusing, just open the attached manifest in your text editor & follow along there. Turn off line wrapping to see the file as it really is.

#EXTM3U
<new line>
#EXT-X-VERSION:6
<new line>
#EXT-X-INDEPENDENT-SEGMENTS
<new line>
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=NO,FORCED=NO,URI="/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8",LANGUAGE="en",AUTOSELECT=YES,CHARACTERISTICS="twitter.show-text-when-muted"
<new line>
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=288000,BANDWIDTH=288000,RESOLUTION=336x270,CODECS="mp4a.40.2,avc1.4d001e",SUBTITLES="subs"
<new line>
/amplify_video/1463574264683110405/pl/336x270/7mThfhFrLetUC3CT.m3u8?container=fmp4
<new line>
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=832000,BANDWIDTH=832000,RESOLUTION=450x360,CODECS="mp4a.40.2,avc1.4d001e",SUBTITLES="subs"
<new line>
/amplify_video/1463574264683110405/pl/450x360/31cFuKED8bVckwfS.m3u8?container=fmp4
<new line>
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=2176000,BANDWIDTH=2176000,RESOLUTION=900x720,CODECS="mp4a.40.2,avc1.640020",SUBTITLES="subs"
<new line>
/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4

There's a lot of detail in there that will make you dizzy. I will point out the important things you need to focus on. This is a master manifest. It does not describe a stream directly. It contains references to other manifests. That's what makes it a master. Those other manifests contain definitions of streams. This will gradually make more sense as we go along.
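For anyone who prefers to let a script do the reading, here is a rough Python sketch that splits this master manifest into its subtitle entry & its 3 video entries. The names parse_attrs & parse_master are made up for illustration, & the parser handles only the tags that appear in this particular manifest. It is not a complete HLS parser.

```python
import re

MASTER = """\
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=NO,FORCED=NO,URI="/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8",LANGUAGE="en",AUTOSELECT=YES,CHARACTERISTICS="twitter.show-text-when-muted"
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=288000,BANDWIDTH=288000,RESOLUTION=336x270,CODECS="mp4a.40.2,avc1.4d001e",SUBTITLES="subs"
/amplify_video/1463574264683110405/pl/336x270/7mThfhFrLetUC3CT.m3u8?container=fmp4
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=832000,BANDWIDTH=832000,RESOLUTION=450x360,CODECS="mp4a.40.2,avc1.4d001e",SUBTITLES="subs"
/amplify_video/1463574264683110405/pl/450x360/31cFuKED8bVckwfS.m3u8?container=fmp4
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=2176000,BANDWIDTH=2176000,RESOLUTION=900x720,CODECS="mp4a.40.2,avc1.640020",SUBTITLES="subs"
/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4
"""

# KEY=value pairs; a value is either a quoted string (which may contain commas)
# or a run of characters up to the next comma.
ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attrs(text):
    """Turn 'A=1,B="x,y"' into {'A': '1', 'B': 'x,y'}."""
    return {key: value.strip('"') for key, value in ATTR.findall(text)}


def parse_master(text):
    """Return (media, variants) as two lists of attribute dictionaries."""
    media, variants = [], []
    pending = None  # attributes of a #EXT-X-STREAM-INF line still waiting for its URL line
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-MEDIA:"):
            media.append(parse_attrs(line.split(":", 1)[1]))
        elif line.startswith("#EXT-X-STREAM-INF:"):
            pending = parse_attrs(line.split(":", 1)[1])
        elif line and not line.startswith("#") and pending is not None:
            pending["URI"] = line  # the partial URL on the line after the parameters
            variants.append(pending)
            pending = None
    return media, variants


media, variants = parse_master(MASTER)
for v in variants:
    print(v["RESOLUTION"], v["BANDWIDTH"], v["URI"])
```

The regular expression treats a quoted value as one unit, which matters for CODECS because its value contains a comma.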

The first 3 lines are boilerplate. They are of no interest to us. You can ignore them.

#EXTM3U
<new line>
#EXT-X-VERSION:6
<new line>
#EXT-X-INDEPENDENT-SEGMENTS
<new line>

Each line that begins with #EXT-X-MEDIA or #EXT-X-STREAM-INF describes a stream of data that is stored on Twitter & that we can download. But we are not going to use VDH to do the download, obviously. We already tried that & it didn't work. The tool we are going to use is ffmpeg. You can get ffmpeg at ffmpeg.org.

#13.png

It doesn't take a degree in rocket science to figure out what to do next. When you do the obvious, you get this page:

#14.png

The 3 big icons on the left are for, left to right, Linux, Windows, macOS. When you hover your mouse over each one, the information displayed below the icons changes. I'm on Windows so that's what I'm showing. You select whichever platform you're on. Clicking one of the links below your icon takes you to another page where you can select what you need to download. Generally, this offers a range of selections. Ffmpeg is a rather complex piece of software that can be built with more or fewer features. Michel has included a trimmed down build of ffmpeg with VDH. I have tried to use that for what we're discussing here but it doesn't work for everything. So I have always gotten a full-featured version of ffmpeg. The page you download from will explain the details of each available package. If you're so inclined, you can even get the source code of ffmpeg. I have never been so inclined.

The ffmpeg package is just a zip file. There is no installer utility. You just unzip the zipfile somewhere on your disk space. Here's what mine looks like:

#15.png

Not much to it. You'll note on the left that there is a subdirectory named doc. There is ffmpeg documentation in there in HTML format. Feel free to increase your frustration level by trying to make heads or tails of any of it. Take it from me, it's not usually a rewarding effort. You'll also note that I have a rather weird directory name. That's just what came with the package. It has the benefit of including the build date within it. As you can see, I downloaded this on July 11, 2021. You'll get whatever is there on the day you do this.

When I get a new ffmpeg (maybe in a few months), I will unzip it into the directory you can see called ffmpeg. That way, I will have 2 versions of ffmpeg side by side. I will try out the newer version a few times until I'm satisfied that it works properly. Then I will simply delete the directory tree for the old version. You could add the ffmpeg bin directory to your PATH environment variable to integrate ffmpeg with your Windows system. I have not bothered. It's more flexible this way. But to each his own.

But this is preparatory work. Let's get back to the manifest. Look at the first stream definition:

#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=NO,FORCED=NO,URI="/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8",LANGUAGE="en",AUTOSELECT=YES,CHARACTERISTICS="twitter.show-text-when-muted"

Specifically, look for TYPE=SUBTITLES.

We are lucking out. This video has subtitles. Many don't. Look at the URI parameter, which I have extracted below to remove the unwanted line break:

URI="/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8"

First, notice that it ends with m3u8. So this stream definition points off to another manifest. Also, note that this is not a complete URL. Go back to the hover text in the Network Monitor. Note how that URL starts with:

https://video.twimg.com/amplify_video/

To get the full URL of the stream manifest for the subtitles stream, you put the two together:

https://video.twimg.com/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8

Remember this URL. I'll be using it later.
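Gluing the two pieces together by hand works, but it can also be done in a line of Python with the standard urljoin function. This is just a sketch. The base used here is the prefix quoted above from the hover text. Any URL on the same site would give the same result, because the partial URL starts with a slash.

```python
from urllib.parse import urljoin

# Prefix seen in the Network Monitor hover text.
base = "https://video.twimg.com/amplify_video/"

# Partial URLs copied from the master manifest.
subtitles_uri = "/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8"
video_uri = "/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4"

# A partial URL that begins with "/" replaces the whole path of the base,
# so only the scheme and host name are taken from the base.
subtitles_url = urljoin(base, subtitles_uri)
video_url = urljoin(base, video_uri)

print(subtitles_url)
print(video_url)
```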

Now look at this bit:

GROUP-ID="subs"

This is how a manifest gives a name to a stream description. (Strictly speaking, GROUP-ID names a group of related streams, but this group contains just the one subtitles stream.) You can see that not every stream description in the manifest has a GROUP-ID. But this one does & it will become apparent why as we go along. So "subs" is the name of this stream.

Following the subtitles definition in our master manifest we have 3 more stream descriptions. They are all quite similar. I want to focus on the last one:

#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=2176000,BANDWIDTH=2176000,RESOLUTION=900x720,CODECS="mp4a.40.2,avc1.640020",SUBTITLES="subs"
/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4

The first thing I want to point out is the part that says RESOLUTION=900x720.

This is the best resolution being offered here. The other 2 stream descriptions have lower resolutions.

Now focus on BANDWIDTH=2176000.

This gives a rough estimate of the quality of the video. Not the resolution, the quality. You could encounter manifests that show multiple stream descriptions of the same resolution. When that happens, just look at the BANDWIDTH parameter. It is likely that the value of BANDWIDTH will be different for each stream description of that particular resolution. The higher the BANDWIDTH value, the better quality the video will be at that resolution. Of course, the file you get will be larger as well. We don't happen to have that much choice in this case. That just makes things a tad simpler.
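The rule of thumb above (highest resolution first, then highest BANDWIDTH among streams of the same resolution) can be sketched in Python like this. The function name pick_best is made up, & the values are copied by hand from the 3 stream descriptions in the master manifest.

```python
# RESOLUTION and BANDWIDTH values copied from the master manifest in this thread.
VARIANTS = [
    {"RESOLUTION": "336x270", "BANDWIDTH": "288000"},
    {"RESOLUTION": "450x360", "BANDWIDTH": "832000"},
    {"RESOLUTION": "900x720", "BANDWIDTH": "2176000"},
]


def pick_best(variants):
    """Pick the variant with the most pixels; break ties with the higher BANDWIDTH."""
    def rank(variant):
        width, height = (int(n) for n in variant["RESOLUTION"].split("x"))
        return (width * height, int(variant["BANDWIDTH"]))
    return max(variants, key=rank)


best = pick_best(VARIANTS)
print(best["RESOLUTION"], best["BANDWIDTH"])
```

Comparing on the pair (pixel count, bandwidth) means BANDWIDTH only comes into play when two stream descriptions have the same resolution.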

Notice that this stream description takes up 2 lines. It may look like 3 lines, but that's just Google automatically splitting & wrapping the 2 lines. The real manifest has just 2 lines here:

#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=2176000,BANDWIDTH=2176000,RESOLUTION=900x720,CODECS="mp4a.40.2,avc1.640020",SUBTITLES="subs"
/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4

There's all the various parameters on the first line. Then there's the partial URL on the second line. Note how this URL has .m3u8 hiding in it, so this is another reference from our master manifest to a stream manifest. As I describe above, this partial URL can be turned into a complete URL like this:

https://video.twimg.com/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4

Just for fun, navigate to this last URL. As it happens, it's the third entry in the Network Monitor. Just double click that. You will see that this is another manifest that looks completely unlike our master manifest here. It is the description of the actual stream itself. I won't be discussing this at all. You don't need to know anything about what's in the stream manifest. You just need to know its URL. And given the bizarre URLs we're looking at here, which are typical of what you'll find around the web, we should all be grateful for copy/paste.

Now focus on this bit:

SUBTITLES="subs"

This is a reference back from this stream description to the earlier stream description.

#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=NO,FORCED=NO,URI="/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8",LANGUAGE="en",AUTOSELECT=YES,CHARACTERISTICS="twitter.show-text-when-muted"

#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=2176000,BANDWIDTH=2176000,RESOLUTION=900x720,CODECS="mp4a.40.2,avc1.640020",SUBTITLES="subs"
/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4

SUBTITLES="subs" in the last stream description refers back to GROUP-ID="subs" in the earlier stream description. The value "subs" is the same in both places. That is the connection. You'll note that all 3 #EXT-X-STREAM-INF stream descriptions refer back to the same subtitle stream description. That's not really surprising. The resolution is different but the video is the same, so all the streams share the same subtitles. You will encounter this reference structure fairly often if you have occasion to be looking at manifests with any regularity.
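The connection between SUBTITLES="subs" & GROUP-ID="subs" is a simple lookup. Here is a small Python sketch of it. The function name subtitles_for is made up, & the dictionaries contain only the few parameters from the manifest that matter for the lookup.

```python
# The subtitle entry (#EXT-X-MEDIA) from the master manifest, trimmed to the relevant parameters.
MEDIA = [
    {
        "TYPE": "SUBTITLES",
        "GROUP-ID": "subs",
        "NAME": "English",
        "URI": "/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8",
    },
]

# The 900x720 video entry (#EXT-X-STREAM-INF), also trimmed.
VARIANT = {"RESOLUTION": "900x720", "SUBTITLES": "subs"}


def subtitles_for(variant, media):
    """Return the subtitle entries whose GROUP-ID matches the variant's SUBTITLES value."""
    group = variant.get("SUBTITLES")
    return [
        entry for entry in media
        if entry.get("TYPE") == "SUBTITLES" and entry.get("GROUP-ID") == group
    ]


matches = subtitles_for(VARIANT, MEDIA)
print([m["URI"] for m in matches])
```

A video entry with no SUBTITLES parameter simply gets an empty list back.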

We have now identified everything we need to download this video and its subtitles. Let's get to it. I'm going to describe the process using basic steps. I don't do these things exactly like this. I have written everything into a script so I don't have to retype these complicated commands every time. I leave it to you to create a script convenient for you on your system.

Master Manifest.txt

Wild Willy

Nov 25, 2021, 6:12:12 AM

to Video DownloadHelper Q&A

Here is the command to download the video:

"G:\ffmpeg\ffmpeg-2021-07-11-git-79ebdbb9b9-full_build\bin\ffmpeg.exe" -protocol_whitelist file,crypto,data,http,https,tls,tcp -hwaccel auto -i "https://video.twimg.com/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4" -codec: copy "q:\VDH Testing\Ryan Poehling.mp4" 1>"q:\VDH Testing\Ryan Poehling.Err" 2>"q:\VDH Testing\Ryan Poehling.Log"

Let's break this down piece by piece. In general, the syntax for ffmpeg goes like this:

ffmpeg [parms for input file] -i [input file name] [parms for output file] [output file name]

The first piece is this:

"G:\ffmpeg\ffmpeg-2021-07-11-git-79ebdbb9b9-full_build\bin\ffmpeg.exe"

I surround the full thing with double quotation marks. This isn't strictly necessary here. But as you'll see below, I have spaces in my file specifications. Some of my directories have spaces in their names & some of my files do as well. These need quotation marks because of the spaces. So I'm just in the habit of always coding quotation marks so I don't accidentally forget.

You can use a simple process to get to this without having to type all of this in. First, in Windows Explorer (the file manager) go to the directory where ffmpeg.exe is, as you can see in an image I included above. Now press the key sequence Alt+d. This works for me on Windows 7. I believe you can find this documented if you do a Google search. The key sequence may be different on your platform. You'll have to figure that out for yourself. When I press Alt+d, it looks like this:

#16.png

Now press the Home key to move the cursor to the front of the area & type in cmd followed by a space:

#17.png

Now hit Enter.

#18.png

Poof. Magic. You get a command window already set to the ffmpeg directory. You'll want to do some copy/pasting of this path information into whatever script you write. The script can reside in any directory on your system as long as you execute ffmpeg by coding the full file specification as I show above.

Next we have the parameters that apply to the input file:

-protocol_whitelist file,crypto,data,http,https,tls,tcp -hwaccel auto

The whitelist is something that I figured out by trial & error. Over time as I have used ffmpeg, I encountered sites that would cause ffmpeg to generate an error message about a protocol that was not in the whitelist. That was all gibberish to me, so I scrounged around in the nearly incomprehensible documentation. I stumbled upon the -protocol_whitelist parameter. Over time, I kept encountering protocols that I needed to add to the list. This list has been unchanged for me for a while now. I'm sure ffmpeg will let me know with an error message if there's some new protocol I need to add to the list.

The -hwaccel parameter is something that's discussed over here:

https://groups.google.com/g/video-downloadhelper-q-and-a/c/uBknY74Q1SI

I won't explain it further. I have run with this parameter & without it. I suppose it's possible that in some situations it helps. It may not be doing anything for me. Hardware acceleration applies to decoding & encoding, & a straight copy like the one below does neither, so that would not be surprising. I haven't had a long enough video to download for which I had enough time to measure whether I'm getting any benefit. My sample video here is only 16 seconds long so I doubt it has much impact in this case. I have found that at least it doesn't cause any problems so I just run with it. You can make your own decision about this.

The next part is the input file:

-i "https://video.twimg.com/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4"

In this case, the input file is the stream manifest for the 900x720 video. This is a copy/paste out of the last stream definition in the master manifest, with the site name inserted in front so it's a complete URL.

Ffmpeg can take many different things as input: the URL of an MP4 on the web, the URL of a manifest on the web (as here), the file name on your system of a manifest, and plenty more. This tool is like a Swiss Army knife for audio-visual functions. I know the barest fraction of what it can do. The horrible documentation & fruitless Google searches have kept me from learning more. If you know more, do feel free to share it by posting here.

Next we have the parameters for the output file:

-codec: copy

In our case, we have only the one parameter. -codec: says the option applies to stream data of every supported media type: video, audio, subtitles, whatever is in the source input. You can code -codec:a, -codec:v, and other things. For our purposes, -codec: is all we need. The option here, copy, simply copies the input to the output. The input is coming off the Internet. The output is going into a file on my system. Ffmpeg does not decode or re-encode the audio & video. It just repackages what it reads from the input into the output file. Ffmpeg can do things like muxing & encoding & other things, which I mostly don't understand. Like I said, Swiss Army knife. I understand copying. That's easy. It's all I need to understand. It works for what I want to do.

The next part is the output file:

"q:\VDH Testing\Ryan Poehling.mp4"

It's just there at the end of the command. There's no -o or -switch or anything. Just the file name. Note the quotation marks because of the spaces.

I have some more things tacked on the end that are Windows tricks for capturing the output of ffmpeg.

1>"q:\VDH Testing\Ryan Poehling.Err"

This captures standard output (stdout). Ffmpeg writes its messages, including error messages, to standard error instead, so this file has always been empty when I run ffmpeg, but I keep it for completeness.

This is more important:

2>"q:\VDH Testing\Ryan Poehling.Log"

This captures standard error (stderr), which is where ffmpeg writes its important, interesting, educational output. I've attached the file below as Video Download Log.
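The same command can be put together from a script. Here is one possible Python sketch, not the script mentioned earlier in this thread. The function names build_ffmpeg_args & run_and_capture are made up. The ffmpeg options are exactly the ones in the command above, nothing added. The code only builds & prints the argument list. Nothing is downloaded unless you call run_and_capture yourself.

```python
import subprocess


def build_ffmpeg_args(ffmpeg_exe, input_url, output_path):
    """Assemble the argument list: [parms for input] -i [input] [parms for output] [output]."""
    return [
        ffmpeg_exe,
        "-protocol_whitelist", "file,crypto,data,http,https,tls,tcp",  # parms for input file
        "-hwaccel", "auto",
        "-i", input_url,                                               # input file
        "-codec:", "copy",                                             # parms for output file
        output_path,                                                   # output file
    ]


def run_and_capture(args, stdout_path, stderr_path):
    """Run the command, sending stdout and stderr to two files, like 1> and 2>."""
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        return subprocess.run(args, stdout=out, stderr=err).returncode


args = build_ffmpeg_args(
    r"G:\ffmpeg\ffmpeg-2021-07-11-git-79ebdbb9b9-full_build\bin\ffmpeg.exe",
    "https://video.twimg.com/amplify_video/1463574264683110405/pl/900x720/bHlh8rkjC1GzHuFS.m3u8?container=fmp4",
    r"q:\VDH Testing\Ryan Poehling.mp4",
)
print(args)

# To actually run it (this needs ffmpeg at that path and the q: drive to exist):
# run_and_capture(args, r"q:\VDH Testing\Ryan Poehling.Err", r"q:\VDH Testing\Ryan Poehling.Log")
```

Passing the command as a list means Python takes care of quoting the pieces that contain spaces.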

After I run a download, I like to run ffprobe. As you can see in earlier images, ffprobe comes with ffmpeg. It generates interesting output that confirms you've got the file you expected:

"G:\ffmpeg\ffmpeg-2021-07-11-git-79ebdbb9b9-full_build\bin\ffprobe.exe" -protocol_whitelist file,crypto,data,http,https,tls,tcp "q:\VDH Testing\Ryan Poehling.mp4" 1>>"q:\VDH Testing\Ryan Poehling.Err" 2>>"q:\VDH Testing\Ryan Poehling.Log"

The only additional thing worth noting here is the >> notation. On the ffmpeg command, I have > coded. This creates a new file. If the file already exists, > causes it to be overwritten. I use >> here so that the ffprobe output is added to the end of the file that already exists. This has nothing to do with ffmpeg. It's how Windows works. You can see the ffprobe output in the Video Download Log attachment. When it comes to other platforms, you're on your own figuring out how to capture the output of the command. Do please post that information here for everybody to benefit from your knowledge.
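If you capture the output from a script rather than with > & >>, the same distinction shows up as the file mode. A minimal Python sketch, using a throwaway temporary directory & made-up log text so it can be run anywhere:

```python
import os
import tempfile


def write_log(path, text, append=False):
    """Mode "w" behaves like > (create or overwrite); mode "a" behaves like >> (add to the end)."""
    mode = "a" if append else "w"
    with open(path, mode, encoding="utf-8") as f:
        f.write(text)


def demo():
    """Write a log the way the ffmpeg step does, then append the way the ffprobe step does."""
    with tempfile.TemporaryDirectory() as folder:
        log_path = os.path.join(folder, "example.Log")
        write_log(log_path, "stale text from an earlier run\n")
        write_log(log_path, "ffmpeg output\n")                 # like 2>  : replaces the file
        write_log(log_path, "ffprobe output\n", append=True)   # like 2>> : adds to the end
        with open(log_path, encoding="utf-8") as f:
            return f.read()


print(demo())
```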

Now let's get the subtitles. What kind of subtitles are these? I copied the URL of the subtitle manifest, which I describe above & said to remember:

https://video.twimg.com/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8

I pasted it into the address bar of Firefox, & hit Enter. This is what I got:

#EXTM3U
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:17
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:16.92,
/subtitles/amplify_video/1463574264683110405/0/ElDvtq3kN5eIVN50.vtt
#EXT-X-ENDLIST

So the subtitles are in WebVTT format. That's my preferred format. There are other formats & if you encounter them, you would be able to figure them out like I've done here. Given that the subtitles are in a .vtt file, I executed this invocation of ffmpeg:
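Reading the file extension out of a stream manifest like this one can also be scripted. A rough Python sketch, with made-up function names segment_uris & extension, that lists the non-comment lines of the manifest & reports their extensions:

```python
import posixpath
from urllib.parse import urlsplit

# The subtitle stream manifest shown above.
PLAYLIST = """\
#EXTM3U
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:17
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:16.92,
/subtitles/amplify_video/1463574264683110405/0/ElDvtq3kN5eIVN50.vtt
#EXT-X-ENDLIST
"""


def segment_uris(text):
    """Every non-blank line that does not start with # is the URL of a piece of the stream."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def extension(uri):
    """File extension of the path part of a URL, in lower case, ignoring any ?query."""
    return posixpath.splitext(urlsplit(uri).path)[1].lower()


uris = segment_uris(PLAYLIST)
print([(uri, extension(uri)) for uri in uris])
```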

"G:\ffmpeg\ffmpeg-2021-07-11-git-79ebdbb9b9-full_build\bin\ffmpeg.exe" -protocol_whitelist file,crypto,data,http,https,tls,tcp -hwaccel auto -i "https://video.twimg.com/amplify_video/1463574264683110405/pl/s0/Su37b71s0nn2C_-b.m3u8" -codec: copy "Q:\VDH Testing\Ryan Poehling.vtt" 1>"Q:\VDH Testing\Ryan Poehling.Err" 2>"Q:\VDH Testing\Ryan Poehling.Log"

Pretty much the same thing again as above. Note that the input URL after -i is now the one I described above for the subtitles stream. Also note the file extension for the output file is .vtt. Ffmpeg figures out a lot of what you want to do from the file extension you give on the output file. It figured out I wanted a target MP4 file earlier from the .mp4 file extension I coded. Here, it knows I want subtitles from the .vtt file extension I coded on the output file. I followed this with an invocation of ffprobe, too similar to the above discussion to go over it again. All that output is attached below in the file Subtitles Download Log.

I was careful to name my output files the same, except for the extensions .mp4 & .vtt. This allows VLC to automatically detect the subtitles & display them during playback.
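If you script your downloads, you can derive the subtitle file name from the video file name so the two can never drift apart. A small Python sketch with made-up function names. It uses PureWindowsPath so that it handles Windows-style paths no matter what platform the script runs on.

```python
from pathlib import PureWindowsPath


def subtitle_path_for(video_path):
    """Same directory and base name as the video, with the extension changed to .vtt."""
    return str(PureWindowsPath(video_path).with_suffix(".vtt"))


def same_base_name(video_path, subtitle_path):
    """True if both files sit in the same directory and differ only in their extension."""
    video = PureWindowsPath(video_path)
    subs = PureWindowsPath(subtitle_path)
    # PureWindowsPath compares directories without regard to case, so q: and Q: match.
    return video.parent == subs.parent and video.stem == subs.stem


video_file = r"q:\VDH Testing\Ryan Poehling.mp4"
print(subtitle_path_for(video_file))
print(same_base_name(video_file, r"Q:\VDH Testing\Ryan Poehling.vtt"))
```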

Simple? Not.

Easy? I hope once you have followed these instructions a few times, you'll find it easy, especially if you write a script to help you avoid typos.

Not clear? Tell me. Post here. Let's discuss it.

Subtitles Download Log.Txt

Video Download Log.Txt

Wild Willy

Nov 25, 2021, 6:25:35 AM

to Video DownloadHelper Q&A

I got some of my colored strings wrong. They're hard to read. I apologize. I would go back & fix them if I could edit my posts. But the edit function was taken away from us by Google a year or two ago & no amount of complaining since has given it back. The only way for me to fix it is to delete the post & completely reenter it, uploading the embedded images & attachments again, to say nothing of fiddling with the colors again. Way too much trouble for way too little return on the investment. Also, Google might think I'm trying to spam the group & I might be blocked from posting for a while. Again, not worth it. Some of the things that are hard to read because of their colors are clickable links, & clicking those might compensate for my fat-fingering.

Some of my images are a bit small. Sorry again. I think I accidentally had the small image setting turned on. You can get around this by using the Firefox function for displaying an image in a separate tab. When you click MB2 on any image, that function appears in the popup context menu. You probably want to do this anyway so you can properly see what I'm talking about.

I must add that the above approach is what you should try if you find you are getting separate video & audio content. Most of the time, you can deal with it directly with VDH. But once in a while you need to resort to this approach, especially if subtitles are involved.

jcv...@gmail.com

Nov 26, 2021, 1:14:11 AM

to Video DownloadHelper Q&A

impressive :)

thanks

jerome

Wild Willy

Nov 26, 2021, 10:31:00 AM

to Video DownloadHelper Q&A

Thanks, Jérôme. Maybe all the images & so on above are impressive but the impressive work was really done over here, in the big opera thread:

https://groups.google.com/g/video-downloadhelper-q-and-a/c/8V2cRB-bcK4

I was not alone in amassing the knowledge it takes to perform this kind of download.

Now, on to other things . . .

For the following discussion, I assume you've read & understood the above posts in this thread.

In a recent post a fellow user offered this example that helps to illustrate even better what is possible:

https://www.gaia.com/video/astral-projection?fullplayer=feature

You can see I've already opened the Network Monitor & discovered the manifests. I've also popped up the context menu on the embedded video player. It shows that this site is using the Brightcove video player. That's the same one we encountered at the Metropolitan Opera. This means pretty much everything here will be the same as what we used to do with the free nightly opera streams (they've stopped those now), right down to the site serving the video. The URLs in the manifest are from the same site as what we were getting from the Met.

I'm not totally satisfied with my experiment above with color-coded text. So in this discussion, I will instead show screenshots of my text editor with annotations. This will probably be clearer.

I got the master manifest the same way as detailed above. I've attached it as file Master Manifest #2. I must point out that in the images below I have turned off line wrapping in Notepad++. The URLs involved are ridiculously long, & showing the file with the lines wrapped makes it harder to see the important details among the mass of unimportant details. You can see this for yourself by downloading the attachment & pulling it up in Notepad or whatever you use instead of Notepad. I emphasize Notepad because you should absolutely not be looking at this in Word or Wordpad or anything like that. You want to be using a simple text editor, nothing fancier than that.

The structure of this manifest is a bit more elaborate than for the Twitter clip I used in my earlier example. There are more streams to choose from.

The first items I want to point out are the I-FRAME stream descriptions.

#102.jpg

It looks like each I-FRAME stream description is a duplicate of the video stream description right above it. The resolution & bandwidth of each I-FRAME stream description matches its partner right above it. But, despite appearances, they're actually not duplicates. I-FRAMEs are used to speed up skipping around in a video while you're streaming it off a server. If you were to download an I-FRAME stream (I have done it just once), you would find that VLC wouldn't play it. I discovered that it would play in ffplay (something that comes with ffmpeg & ffprobe). It turns out that it is a stream consisting of a single frame of the video extracted at regular intervals. It's just video without audio. The one time I tried this, it showed me one frame of the full video every 2 seconds. Basically it was unwatchable. There is a link in the big opera thread to my source for this information. So I will remove the I-FRAME items from the following images here.
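If you would rather not delete the I-FRAME lines by hand, a few lines of Python can do it. This is only a sketch. It assumes the I-FRAME entries are the single-line #EXT-X-I-FRAME-STREAM-INF tags that HLS manifests normally use, & the miniature manifest with its example.invalid URLs is made up for illustration, not taken from the real file.

```python
def drop_iframe_entries(manifest_text):
    """Return the manifest text without the I-FRAME stream descriptions."""
    kept = [
        line
        for line in manifest_text.splitlines()
        # An I-FRAME description is one line; its URL is in a URI="" parameter.
        if not line.startswith("#EXT-X-I-FRAME-STREAM-INF")
    ]
    return "\n".join(kept)


# Made-up sample, much shorter than a real master manifest.
SAMPLE = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=640x360,AUDIO="audio-0"
https://example.invalid/video-360/rendition.m3u8
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=640x360,URI="https://example.invalid/iframe-360/rendition.m3u8"
"""

if __name__ == "__main__":
    print(drop_iframe_entries(SAMPLE))
```

The regular video stream description & its URL line survive; only the I-FRAME line is dropped.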

Master Manifest #2.txt

Wild Willy

Nov 26, 2021, 10:55:43 AM

to Video DownloadHelper Q&A

Now, let's get down to the proper business of analyzing this manifest.

Before we go too deep into this analysis, I want to get an idea of what we're dealing with here. You can use ffprobe to greatly help in this task. I ran ffprobe (an example is above) using the URL of this master manifest as input. I've attached the results as file Astral Projection Manifest ffprobe. I'm not going to go into details on this file just yet. It will make more sense after you understand the discussion below. I will say that ffprobe is able to untangle the various relationships in a master manifest & display the attributes of the various streams. You can look at that file now but I recommend you wait until later. Just as a curiosity, I'll mention now that ffprobe had a problem with the subtitles stream manifest. I don't know why. It worked fine, something I'll get to.

So. Back to our master manifest. The first thing I want to point out is the audio stream descriptions:

There are 4 of them. The ffprobe output tells me these are audio streams without video. You can guess this from the absence of video resolution information from the stream descriptions.

#104.jpg

Their names are audio-0, audio-1, audio-2, audio-3, in order from top to bottom. These names can be anything at all. They could be Elina, Isabel, Michel, & Jérôme. The names are whatever the developers of the server web site choose. The names are just ways to refer back to those stream descriptions. There's nothing intrinsically important about the names. The important thing about them is that they exist.

We also have 7 video stream descriptions:

#105.jpg

These are video streams without audio. The ffprobe output says that definitively. But you can guess that without looking at the ffprobe output by the fact that there is video resolution information present in the stream descriptions. There is also a parameter named AUDIO present, but I'll get to that in a moment.

You'll notice that some of the resolutions appear more than once.

#106.jpg

You can tell that they are different streams from their bandwidths. The stream with the higher bandwidth value is a higher quality stream. Presumably, the one with the higher bandwidth value uses up more of the capacity of your Internet connection when you stream it in the player on the web page. The higher bandwidth value will also result in a larger file when you download it.
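Picking the higher-bandwidth entry for each resolution can be sketched in Python. This assumes the video stream descriptions are #EXT-X-STREAM-INF lines carrying BANDWIDTH & RESOLUTION parameters, each followed by its URL on the next line. The sample manifest & its URLs are invented for the example.

```python
import re

# KEY=value pairs; a value is either a quoted string or runs up to the next comma.
ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attrs(text):
    """Turn 'A=1,B="x,y"' into {'A': '1', 'B': 'x,y'}."""
    return {key: value.strip('"') for key, value in ATTR.findall(text)}


def best_per_resolution(manifest_text):
    """Map each RESOLUTION to (BANDWIDTH, URL) of its highest-bandwidth entry."""
    best = {}
    lines = manifest_text.splitlines()
    # Stop one line early so lines[i + 1] always exists.
    for i, line in enumerate(lines[:-1]):
        if not line.startswith("#EXT-X-STREAM-INF:"):
            continue
        attrs = parse_attrs(line.split(":", 1)[1])
        bandwidth = int(attrs["BANDWIDTH"])
        resolution = attrs.get("RESOLUTION", "unknown")
        url = lines[i + 1].strip()  # the URL sits on the line after the description
        if resolution not in best or bandwidth > best[resolution][0]:
            best[resolution] = (bandwidth, url)
    return best


# Made-up sample with 1280x720 appearing twice at different bandwidths.
SAMPLE = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720,AUDIO="audio-2"
https://example.invalid/720-low.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720,AUDIO="audio-3"
https://example.invalid/720-high.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080,AUDIO="audio-3"
https://example.invalid/1080.m3u8
"""

if __name__ == "__main__":
    for resolution, (bandwidth, url) in best_per_resolution(SAMPLE).items():
        print(resolution, bandwidth, url)
```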

There is no similar way to differentiate the various audio stream descriptions. I can imagine that different audio streams could also be of different qualities. This might show up as different sampling rates or different bit rates. It might show up as mono vs stereo vs surround, even 5.1 surround vs 6.1 surround vs 7.1 surround vs 7.2 surround. The ffprobe output shows different bit rates for the audio streams. But the embedded player on the web site doesn't give a tool for selecting audio quality, only video resolution. That isn't an issue, as will gradually become clear as we go along here. In the end, it's the same video at all the resolutions so the audio stream would be the same no matter the video resolution. Or so you might think. As we progress here, you'll learn how to pick the right audio stream to download for the video stream you will download.

Astral Projection Manifest ffprobe.txt

Wild Willy

Nov 26, 2021, 11:22:30 AM

to Video DownloadHelper Q&A

I've mentioned that the audio streams here have no video streams, & the video streams have no audio streams. But it would be silly if there were no way to pair up an audio stream with a video stream. That pairing information is present in this manifest.

We're finally getting to that naming thing I mentioned earlier. This image shows how the audio & video streams are related. The AUDIO parameter value on each video stream description matches the GROUP-ID parameter value on an earlier audio stream description. On an audio stream description, the parameter is named GROUP-ID. On a video stream description, the parameter is named AUDIO. Why can't they be the same? Beats me. I didn't design this syntax. We just have to put up with it. The important thing is the matching parameter values: audio-0, audio-1, audio-2, audio-3. The matching of the parameter values is what associates an audio stream with a video stream.

You can see that any given audio stream may have multiple video streams associated with it. The first 3 audio streams have only a single video stream each. But the fourth audio stream has 4 partner video streams.
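The matching of AUDIO values to GROUP-ID values can be sketched in Python. The assumptions: audio stream descriptions are #EXT-X-MEDIA lines with TYPE=AUDIO, video stream descriptions are #EXT-X-STREAM-INF lines, & each GROUP-ID is used by only one audio description, as it is in this manifest. The sample manifest is made up & much smaller than the real one.

```python
import re
from collections import defaultdict

# KEY=value pairs; a value is either a quoted string or runs up to the next comma.
ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attrs(text):
    return {key: value.strip('"') for key, value in ATTR.findall(text)}


def pair_streams(manifest_text):
    """Return {GROUP-ID: {"audio_url": ..., "videos": [(RESOLUTION, URL), ...]}}."""
    groups = defaultdict(lambda: {"audio_url": None, "videos": []})
    lines = manifest_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-MEDIA:"):
            attrs = parse_attrs(line.split(":", 1)[1])
            if attrs.get("TYPE") == "AUDIO":
                # On an audio description the name is in GROUP-ID.
                groups[attrs["GROUP-ID"]]["audio_url"] = attrs.get("URI")
        elif line.startswith("#EXT-X-STREAM-INF:") and i + 1 < len(lines):
            attrs = parse_attrs(line.split(":", 1)[1])
            # On a video description the same name is in AUDIO.
            groups[attrs.get("AUDIO")]["videos"].append(
                (attrs.get("RESOLUTION"), lines[i + 1].strip())
            )
    return dict(groups)


# Made-up sample: audio-0 has one partner video, audio-1 has two.
SAMPLE = """#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio-0",NAME="en (Main)",URI="https://example.invalid/a0.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio-1",NAME="en (Main)",URI="https://example.invalid/a1.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=640x360,AUDIO="audio-0"
https://example.invalid/v360.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720,AUDIO="audio-1"
https://example.invalid/v720.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080,AUDIO="audio-1"
https://example.invalid/v1080.m3u8
"""

if __name__ == "__main__":
    for group, info in pair_streams(SAMPLE).items():
        print(group, info["audio_url"], [res for res, _ in info["videos"]])
```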

Now that the structure is clear, we can choose which items to download.

I'm going to choose the highest video resolution, which means I will choose the audio stream associated with it.

I've highlighted the relevant URLs here. In the Twitter example upthread, the manifest contained only partial URLs & we had to insert https://etc. at the front of each partial URL that was in the manifest. Here, it's all done for us. So much simpler. Why don't all sites do it like this one? Might as well ask, "Why do I like opera but other people don't?" Like I said upthread, there's no international standard for this.

Notice that in an audio stream description, the URL is in a parameter of the form URI="". Not URL. URI. And there's double quotation marks around it. In a video stream description, the URL is on its own line, no parameter name with an equal sign in front of it, and no double quotation marks. Gee, isn't that consistent . . . NOT!!! Hey, they didn't ask me my opinion. It's dumb. I know it. You know it. We just have to live with it. Let's move on.

You'll note that in this image, my highlighting stops at the right edge. The URLs are really, really long, nearly 300 characters each. That's why I've turned off line wrapping in my text editor. It makes this easier to visualize what's going on here. But it also means you can't see the entirety of the URLs in the images. There's long chunks of URL off screen to the right. If you download the full manifest I've attached above, you'll be able to see the complete URLs.

At this point, it might be worth it for you to download & look at the ffprobe report I attached above. Look for how ffprobe has broken things down into Program 0, Program 1, Program 2, etc. Each Program consists of one audio stream & one video stream, paralleling the structure of the manifest as we've analyzed above. Note how ffprobe identifies each stream as Stream #0:0, Stream #0:1, Stream #0:2, etc. This is the ffmpeg way of identifying individual tracks of a stream. Note specifically how Stream #0:6 appears more than once, which is how it recognizes the audio stream "audio-3" & its 4 partner video streams. You can read the audio characteristics of each audio stream & see the differing bit rates, expressed as kb/s. You can also read the video characteristics, the resolution, frame rate, bit rate, & so on. Note how the bit rates shown in the ffprobe output match up with the BANDWIDTH values in the master manifest. Very handy tool, this ffprobe thing.
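If you want the Program breakdown in a form a script can read, ffprobe can also print its report as JSON. Here is a sketch in Python. The options -print_format json, -show_programs & -show_streams are real ffprobe options, but the JSON below is a hand-written, heavily trimmed illustration of the layout I would expect, not actual output for this manifest. The "0:" prefix assumes a single input, as in all the commands in this thread.

```python
import json


def ffprobe_command(manifest_url):
    """Argument list asking ffprobe for a JSON report of programs & streams."""
    return [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_programs", "-show_streams",
        manifest_url,
    ]


def streams_by_program(report):
    """Map each program id to short descriptions of its streams."""
    result = {}
    for program in report.get("programs", []):
        described = []
        for stream in program.get("streams", []):
            text = "0:{} {} {}".format(
                stream["index"], stream["codec_type"], stream.get("codec_name", "?")
            )
            if "width" in stream and "height" in stream:
                text += " {}x{}".format(stream["width"], stream["height"])
            described.append(text)
        result[program["program_id"]] = described
    return result


# Hand-written illustration only; a real report has many more fields.
SAMPLE_JSON = """
{
  "programs": [
    {"program_id": 0, "streams": [
      {"index": 0, "codec_type": "audio", "codec_name": "aac"},
      {"index": 1, "codec_type": "video", "codec_name": "h264", "width": 640, "height": 360}
    ]},
    {"program_id": 3, "streams": [
      {"index": 6, "codec_type": "audio", "codec_name": "aac"},
      {"index": 7, "codec_type": "video", "codec_name": "h264", "width": 1280, "height": 720}
    ]}
  ]
}
"""

if __name__ == "__main__":
    for program_id, streams in streams_by_program(json.loads(SAMPLE_JSON)).items():
        print("Program", program_id, streams)
```

To use it for real, you would run the list from ffprobe_command with subprocess.run, capture its standard output, & pass that through json.loads.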

Wild Willy

Nov 26, 2021, 11:35:37 AM

to Video DownloadHelper Q&A

One last item in our manifest is the subtitles or captions. You'll see both words used. In fact, they both appear at various places in our master manifest. They are synonyms. In order to see how they fit in, I have turned line wrapping back on in my display of the master manifest. The relevant bits were scrolled off to the right when I turned line wrapping off, and I couldn't get all the necessary pieces to display in one screen by just scrolling right. So bear with the line wrapping for a moment.

There's just the one subtitles stream.

The web site developers have decided to name the subtitles stream subtitles-0. Also note that these are Czech subtitles. Maybe my build of ffprobe doesn't specifically include support for Czech language subtitles & that's why it couldn't deal with them. I'm just guessing.

#111.jpg

Every video stream description apparently has no closed captions (if you can ignore the unfortunate line breaks).

Just as each video stream refers back to one of the audio streams, each video stream also refers back to the subtitles stream. In this case, they all refer back to the same subtitles stream, which is totally as expected.

Wild Willy

Nov 26, 2021, 11:50:36 AM

to Video DownloadHelper Q&A

The URL for the subtitles stream appears in the stream description here in the manifest. Let's go look at it. We need to find out what type of subtitles we've got here. Copy the URL out of the manifest & paste it into the address bar in the browser. This is the first screenload of what you'll get:

It turns out these subtitles are in WEBVTT format.

Now we've extracted everything we need to download these components with ffmpeg. Here are the commands I used. For a detailed breakdown of the parts of these commands, read the posts upthread dealing with the Twitter example.

"G:\ffmpeg\ffmpeg-2021-07-11-git-79ebdbb9b9-full_build\bin\ffmpeg.exe" -protocol_whitelist file,crypto,data,http,https,tls,tcp -hwaccel auto -i "https://manifest.prod.boltdns.net/manifest/v1/hls/v4/clear/1324209225001/6eb540f6-5111-41f2-938c-49780a4d50d7/d4e6b9a9-77c4-4e0e-96d9-7a591784a1c6/rendition.m3u8?fastly_token=NjFhMWYwNzRfOGFkMGM4YTNkNzdmMjY4YjFlN2NlOTYyNTI0Mzg3Yjk4OTc5OTEyYWYxYmRkZDNmMzZkNDNiZjc1OWM0OGE0OQDD" -codec: copy "Q:\VDH Testing\Astral Projection Video 20200504.vtt" 1>"Q:\VDH Testing\Get Example.Err" 2>"Q:\VDH Testing\Get Example.Log"

"G:\ffmpeg\ffmpeg-2021-07-11-git-79ebdbb9b9-full_build\bin\ffmpeg.exe" -protocol_whitelist file,crypto,data,http,https,tls,tcp -hwaccel auto -i "https://manifest.prod.boltdns.net/manifest/v1/hls/v4/aes128/1324209225001/6eb540f6-5111-41f2-938c-49780a4d50d7/d506cd77-473f-4fa9-9048-01d9824e11ed/10s/rendition.m3u8?fastly_token=NjFhMWYwNzRfOWQwYjhkN2YxYjU2NWVkZjYwYWFjYjQ0M2FjNDhiMGYxMzkzNjQwY2NlNmNkZDQyZDkwYmI2YzdjMTgxZmRhNwDD" -codec: copy "Q:\VDH Testing\Astral Projection Audio 20200504.mp4" 1>>"Q:\VDH Testing\Get Example.Err" 2>>"Q:\VDH Testing\Get Example.Log"

"G:\ffmpeg\ffmpeg-2021-07-11-git-79ebdbb9b9-full_build\bin\ffmpeg.exe" -protocol_whitelist file,crypto,data,http,https,tls,tcp -hwaccel auto -i "https://manifest.prod.boltdns.net/manifest/v1/hls/v4/aes128/1324209225001/6eb540f6-5111-41f2-938c-49780a4d50d7/72ad2b36-b37b-42ce-9755-cea2bda91823/10s/rendition.m3u8?fastly_token=NjFhMWYwNzRfZTQwNzRiNzY2ZGViNWU4MjBiMTBkOWZmMzJhMTU1NmIyM2EwMjEyNDgyMTBhNTQwOThlMWRmZGM0ZGExNTJiZADD" -codec: copy "Q:\VDH Testing\Astral Projection Video 20200504.mp4" 1>>"Q:\VDH Testing\Get Example.Err" 2>>"Q:\VDH Testing\Get Example.Log"

Each one of these 3 commands is a single line. Google has broken the lines & wrapped them to make them fit on the web page. But when I executed them, they were single long lines.
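One way to sidestep the line-wrapping & quoting headaches is to build the command as a list of separate arguments in a small Python script. This is a sketch, not the method used for the downloads above. The ffmpeg path, URL & file names in it are placeholders. It writes -codec copy, which as far as I know is the same option as -codec: copy without the trailing colon.

```python
import subprocess

WHITELIST = "file,crypto,data,http,https,tls,tcp"


def download_command(ffmpeg_exe, url, output_path):
    """Build the ffmpeg argument list; each item is one argument, so no quoting is needed."""
    return [
        ffmpeg_exe,
        "-protocol_whitelist", WHITELIST,
        "-hwaccel", "auto",
        "-i", url,
        "-codec", "copy",
        output_path,
    ]


def run(command, log_path):
    """Run ffmpeg, appending its log (which it writes to stderr) to log_path."""
    with open(log_path, "a", encoding="utf-8") as log:
        return subprocess.run(command, stderr=log).returncode


if __name__ == "__main__":
    # Placeholder values; substitute your own ffmpeg path, stream URL & target file.
    command = download_command(
        "ffmpeg",
        "https://example.invalid/audio/rendition.m3u8?token=abc&x=1",
        "Example Audio.mp4",
    )
    print(command)
```

Because the URL is passed as a single list item, the ampersands in it never reach the command processor.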

I've attached the ffmpeg log file from these downloads.

Here are the Windows Properties of the audio & the video file I downloaded.

#114.jpg

Note the asymmetry of what's reported for each file. The audio file has no video properties & the video file has no audio properties.

I played the 2 files synchronously in VLC.

#115.jpg

As you can see, I've got video with Czech captions. This happened automatically because I chose the same target file name for the captions file & the video file, changing only their extensions, .vtt & .mp4, respectively. You'll have to trust me that I've also got audio (in English).

Astral Projection ffmpeg Log.Txt

Wild Willy

Dec 9, 2021, 8:50:30 PM

to Video DownloadHelper Q&A

I've just encountered a rather curious case in connection with this thread:

https://groups.google.com/g/video-downloadhelper-q-and-a/c/PG5Nrok1_YI

I'm attaching the manifest from that one here. There's a number of interesting things
going on in this manifest.

First, look at the very last line. It shows AES & the URL ends with .key. Usually, this
indicates that we're looking at an encrypted stream. When I've hit things like this
before, ffmpeg wouldn't download them. Despite that, I did manage to get ffmpeg to
download this object, as I detail in the other thread.

Now look at the audio stream descriptions. They are similar to the ones I talk about
upthread here. But there's a catch. You'll notice that these audio streams do NOT
include URLs. So we're not looking at one of those cases of separate video & audio.
Each video stream here should include audio that does not need to be downloaded
separately. Note the names in the GROUP-ID parameters. The 160, 256, & 320 look like
bit rates, indicating different audio qualities. This appears to be borne out by my
results in the other thread.
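Checking whether the audio stream descriptions carry URLs can be sketched in Python. As I read the HLS specification, an audio description without a URI means the audio travels inside the video streams that refer to it, which matches what I found here. The sample lines & group names below are invented for illustration.

```python
import re

# KEY=value pairs; a value is either a quoted string or runs up to the next comma.
ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def audio_has_own_url(manifest_text):
    """Map each audio GROUP-ID to True if its description includes a URI parameter."""
    result = {}
    for line in manifest_text.splitlines():
        if not line.startswith("#EXT-X-MEDIA:"):
            continue
        attrs = dict(ATTR.findall(line.split(":", 1)[1]))
        if attrs.get("TYPE") == "AUDIO":
            result[attrs["GROUP-ID"].strip('"')] = "URI" in attrs
    return result


# Made-up sample: one group without a URL, one with.
SAMPLE = """#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud-160",NAME="English",DEFAULT=YES
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud-sep",NAME="English",URI="https://example.invalid/audio.m3u8"
"""

if __name__ == "__main__":
    for group, separate in audio_has_own_url(SAMPLE).items():
        print(group, "download separately" if separate else "included with the video")
```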

Look at each video stream description. Each one contains an AUDIO parameter that refers
back to the GROUP-ID parameter in one of the audio stream descriptions, just like I
discuss upthread. I find this a bit weird, since the audio stream descriptions don't
actually describe audio streams; there's no URLs for the audio streams. Nevertheless, it
looks like the higher video resolutions refer to an audio stream that is of a higher
quality.

In this manifest, all the I-FRAME stream descriptions are in a block. This is unlike the
example upthread in which the I-FRAME stream descriptions alternated with regular video
stream descriptions. That's not really significant. It just shows that the structure of
a manifest is flexible. Compare the BANDWIDTH value on each I-FRAME stream description
with the BANDWIDTH value on each video stream description of the corresponding
RESOLUTION. They don't match exactly, unlike in the sample manifest upthread. I didn't
expect this. I guess they don't always have to match. Live & learn.

Master Manifest.txt

Wild Willy

Dec 9, 2021, 9:11:39 PM

to Video DownloadHelper Q&A

Come to think of it, it makes sense for the I-FRAME streams to be of a much lower BANDWIDTH than their video stream counterparts. The I-FRAME streams are just snapshots of the video stream taken at certain intervals, maybe every 5 seconds or 10 seconds. Such a stream would not need as much bandwidth to retrieve as the actual video. I would think the I-FRAME BANDWIDTH value would decrease proportionately as the sampling rate gets slower. Lower BANDWIDTH for an I-FRAME stream taken every 10 seconds than one taken every 5 seconds.

Wild Willy

Dec 16, 2021, 8:26:50 PM

to Video DownloadHelper Q&A

Every so often when you're trying to download something with ffmpeg, you'll get an error saying either Access Denied or Forbidden. In such cases sometimes, I say SOMETIMES, not always, you can get past this by simply supplying your user ID & password on the URL you supply in the -i parameter of ffmpeg. The format is like this:

https://userid:password @ www.sitename/something/something/something

(As usual, I have to surround the @ with spaces to prevent Google from obscuring what it thinks is an E-mail address within this post. You would remove the spaces when you code the URL.)

This is not anything special to ffmpeg. This is a general rule of URLs. Do a web search on this search key:

supply username and password in url

I have found this works in some places & just fails in others so it's not a guaranteed solution.

Clearly, @ is a special character in this kind of URL. But what if your user ID on the site in question happens to be your E-mail address? Every E-mail address contains @ so you have to take evasive action to make that work. You would code this:

https://email%40isp:password @ www.etc. . . .

%40 is the code for @. You can find considerably more information about this here:

https://www.w3schools.com/html/html_urlencode.asp
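If you don't want to look up the codes by hand, Python's standard library will do the encoding for you. A small sketch; the user ID, password & site name are placeholders.

```python
from urllib.parse import quote


def url_with_credentials(user, password, host_and_path):
    """Build https://user:password@host/path with special characters percent-encoded."""
    # safe="" makes quote() encode every reserved character, including @ : and /
    return "https://{}:{}@{}".format(
        quote(user, safe=""), quote(password, safe=""), host_and_path
    )


if __name__ == "__main__":
    print(url_with_credentials("me@example.com", "p:w/1", "www.example.com/a/b"))
```

The @ inside the E-mail address comes out as %40, leaving only the one real @ that separates the credentials from the site name.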

Wild Willy

Dec 30, 2021, 7:52:15 PM

to Video DownloadHelper Q&A

I was looking around on the BBC web site to see if I could find the schedule listing for
the Vienna Philharmonic New Year's Concert 2022 coming up in less than 2 days. I ran
into a bit of a roadblock with something called a TV license. Apparently, all fine
British citizens who want to watch TV must pay an annual license fee of £159. So it's
ixnay on BBC TV for me. But BBC Radio appears to be free to anyone anywhere. BBC Radio 3
will also be carrying the concert. Although there won't be much to see in that
broadcast, it will sound perfectly lovely. I was thinking if I can't record a video
version of it, at least not live, maybe I can get an audio version. It appears that BBC Radio 3
has an archive of recent shows available. I picked one at random (a concert by the
Bournemouth Symphony Orchestra, which apparently was saved on the site on December 15 &
will be available for 14 days, which I think actually expired yesterday but it's still
there, Happy New Year). VDH couldn't see it even after I launched playback. It offered
only an MP4 of 640 bytes. No way that's a 2.5 hour concert.

So I started rummaging around in the Network Monitor following my formula described
upthread here. I was hoping to find a manifest but filtering on m3u8 gave no hits. So I
filtered on nothing & just scrolled what was there. I'm not going to give all kinds of
images for this. You should be able to do this yourself once you've got the hang of what
I explained above.

The one entry that I thought looked interesting was labeled as type dash. It was a file
with extension .mpd. I downloaded it & it turned out to be a plain text file of only 48
lines. It was an XML file. It had an early line that said it was an MPEG DASH Schema.
I don't pretend to really know or understand what was in there but it looked a bit like a
recipe for cooking a media download. There seemed to be lines that looked like templates
for URLs with certain parts of the URLs meant to be sequential numbers starting at 1 &
going up. Looking elsewhere in the Network Monitor, I did find some entries for MP4
chunks whose URLs seemed to match the recipe in this mpd file. I toyed with the idea of
trying to construct a sequence of ffmpeg commands using this recipe to see if I could
download this show, not so much because I was interested in listening to it but rather as
a dry run for recording the New Year's Day concert.

I gritted my teeth & opened the ffmpeg documentation. I didn't find anything that was
helpful for what I was thinking of doing. That's typical of nearly all of my forays into
that documentation. It is really quite reprehensibly impossible to learn from. But I
stumbled upon a short section, barely 3 lines, talking about a Dynamic Adaptive Streaming
over HTTP demuxer. DASH. This made me take a wild-ass guess that maybe .mpd files are
like manifests for DASH streams. Maybe I could just give ffmpeg the URL of this .mpd
file & it would download something.

Here's the command I executed:

ffmpeg -i "ugly URL" "file on my system"

I decided that for this test I would dispense with all the other parameters I usually
use. The ugly URL was the URL of the .mpd file, which you can get by clicking MB2 on the
entry in the Network Monitor & copying it using the context menu that pops up. The URL
contained a few ampersands so I needed to enclose it in quotation marks to prevent the
Windows command processor from interpreting those things as separate command parameters.
I also enclosed the file name on my system in quotation marks because I have spaces in my
directory & file names.

To my huge surprise, it started downloading something. There were a few errors coming
out of ffmpeg but it kept going in spite of them. It took maybe 20 minutes or so, after
which I had an audio-only MP4 of size 142M, duration 2:32:59. Like I said, I wasn't
interested in listening to the whole show. But skimming through it showed that I had a
recording of the radio broadcast that VDH could not even detect. So I should be able to
download from the BBC Radio 3 archive the audio at least of the New Year's Day concert within a
few days after the fact.

The mpd file appears to refer to 2 different qualities of audio recording. The one that
ffmpeg decided to get was the one at sample rate 48kHz. But there appears to also be one
at 96kHz. I haven't figured out how to make ffmpeg get that one instead. But 48kHz is a
pretty standard sample rate for content you'll find around the web. The audio bit rate
on this was 128kbps, also a common value. What I heard through my speakers sounded
perfectly decent. Still, if there's something else there that's a higher quality, I'd
like to be able to get it. Maybe somebody reading this has some ideas.
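Since the mpd file is XML, a short Python sketch can list what it offers. The assumptions here: the file uses the standard MPEG DASH namespace, & each quality is a Representation element with a bandwidth attribute (bits per second) & possibly an audioSamplingRate attribute (Hz), either on itself or on its parent AdaptationSet. The sample below is made up, not the BBC's file. Those 2 attributes are easy to mix up, so it may be worth checking which one a number like 96000 came from.

```python
import xml.etree.ElementTree as ET

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def list_representations(mpd_text):
    """Return id, bandwidth (bits/s) & audio sampling rate for each Representation."""
    root = ET.fromstring(mpd_text)
    found = []
    for adaptation in root.iterfind(".//mpd:AdaptationSet", NS):
        for rep in adaptation.iterfind("mpd:Representation", NS):
            # The sampling rate may be given on the Representation or on its parent.
            rate = rep.get("audioSamplingRate") or adaptation.get("audioSamplingRate")
            found.append({
                "id": rep.get("id"),
                "bandwidth": int(rep.get("bandwidth", 0)),
                "audioSamplingRate": rate,
            })
    return found


# Made-up sample: two bit rates, both at the same 48000 Hz sampling rate.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period>
    <AdaptationSet mimeType="audio/mp4" audioSamplingRate="48000">
      <Representation id="low" bandwidth="48000"/>
      <Representation id="high" bandwidth="96000"/>
    </AdaptationSet>
  </Period>
</MPD>"""

if __name__ == "__main__":
    for rep in list_representations(SAMPLE_MPD):
        print(rep)
```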

Wild Willy

Dec 30, 2021, 8:37:13 PM

to Video DownloadHelper Q&A

Somebody reading this would be me. I thought to give the mpd file to ffprobe. It showed me 2 streams in the program. Stream 0:0 looked like it was the 48kHz item & stream 0:1 was the 96kHz item. When I ran the ffmpeg download I mentioned above, it showed that it was mapping 0:0 -> 0:0 by default. So I tried this:

ffmpeg -i "ugly URL" -map 0:1 "file on my system"

This time, ffmpeg did say it was mapping 0:1 -> 0:0, which seemed encouraging. But what I got appeared to be the same file again, still 48kHz & 128kbps. The ffmpeg log from the start of the download claimed that it was using a variant bitrate of 96000. But the result was 48kHz despite that. Oh well. I say this is just what is on the BBC web site & their mpd file exhibits wishful thinking, not factual reporting.

Nonetheless, this may just be an idiosyncrasy of BBC content. In the general case, this is what you have to do. First, ffprobe the mpd to see what streams are on offer. Determine from that output what might be the best quality stream. Then use that to inform your choice of -map parameter value. If it appears that input stream 0:0 is your best choice, you can omit the -map parameter. Otherwise, code -map to select the "right" stream.
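That last step, adding -map only when you've picked a stream, can be sketched as a small Python helper. The URL & file name are placeholders, & the helper only builds the argument list; it doesn't run anything.

```python
def ffmpeg_args(manifest_url, output_path, stream=None):
    """Build ffmpeg arguments, adding -map only when a stream such as "0:1" is chosen."""
    args = ["ffmpeg", "-i", manifest_url]
    if stream is not None:
        args += ["-map", stream]
    args.append(output_path)
    return args


if __name__ == "__main__":
    # Without a choice, ffmpeg picks the stream itself.
    print(ffmpeg_args("https://example.invalid/show.mpd", "Show.mp4"))
    # With a choice taken from the ffprobe report.
    print(ffmpeg_args("https://example.invalid/show.mpd", "Show.mp4", stream="0:1"))
```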

Wild Willy

Dec 31, 2021, 12:15:57 AM

to Video DownloadHelper Q&A

By the most amazing coincidence, another user had a problem that put to good use what I learned earlier today on the BBC web site. That thread is over here:

https://groups.google.com/g/video-downloadhelper-q-and-a/c/yj0X6iZxBVo

Wild Willy

Jan 1, 2022, 12:12:38 AM

to Video Download Helper Google Group

Although what I describe above for BBC Radio 3 works great on their archived content, it
doesn't work for recording their livestream. When I set ffmpeg to try to record their
livestream using their mpd file, it read 2 or 3 chunks of the stream & then started
endlessly getting 404 Not Found errors. I suspect they've done something sneaky behind
the scenes to thwart downloading their livestreams. Oh well. As long as their archived
stuff can be downloaded, that's good enough, I suppose.

Wild Willy

Jan 11, 2022, 3:11:47 AM

to Video DownloadHelper Q&A

While I was hunting for something unrelated, I stumbled across this old thread:

https://groups.google.com/g/video-downloadhelper-q-and-a/c/GFF9rPfIlPc

I was inspired to add some content there on the subject of .mpd files. You might find it helpful.

Wild Willy

Jan 11, 2022, 5:38:04 AM

to Video Download Helper Google Group

Michel made some very helpful comments about HLS vs DASH in this thread:

https://groups.google.com/g/video-downloadhelper-q-and-a/c/yj0X6iZxBVo

Wild Willy

Jan 13, 2022, 12:32:27 PM

to Video Download Helper Google Group

As I go along, I get new ideas. This thread contains a couple of them:

https://groups.google.com/g/video-downloadhelper-q-and-a/c/XKE0kvBKVDI

That might simplify a few things.

Wild Willy

Jan 19, 2022, 8:49:39 AM

to Video Download Helper Google Group

It's easy to get wrapped up in the details of all this so let me try to explain this at a
higher level. I do assume you read what's above. I rely on your understanding &
performing certain of those tasks as I've already described them. This is not a
replacement for what's come before, just a clarification.

You should always start by trying to get VDH to download the item in question. It is by
far the easiest way to do things. But VDH makes no claims to being able to handle every
case so this approach should be something you go to only when VDH doesn't work.

There are 2 types of streams that this approach can handle. One is HTTP Live Streaming
(HLS). The other is Dynamic Adaptive Streaming over HTTP (DASH). In both cases, there
is a high-level file sent to your browser that describes the streams that can play in
some sort of video player in a web page you are surfing to. This high-level file is
called a manifest. You have to find the manifest as your first step in getting the
content you're having trouble downloading.

For HLS, the manifest is a file whose extension is .m3u8. For DASH, the manifest is a
file whose extension is .mpd. These are plain text files & you would do well to open
them in a text editor & look at them to get an idea of how the information inside is
structured. Remember, you're dealing with computers here. A computer is just a pile of
electronic circuitry that can't think. So a manifest is a rather simple-minded way of
describing stream data.

I describe upthread here how to find your manifest. Open the Network Monitor & filter on
either .m3u8 or .mpd. You don't know ahead of time which one will show up so you have to
try them both. We've even encountered situations lately in which both types of manifest
are used on the same web page. That means that you might have to look for a .m3u8, then
a .mpd, then maybe even a .m3u8 again, & maybe even a .mpd again. You just have to make
some guesses & be persistent. Sadly, there are also web sites that use neither .m3u8
nor .mpd. Sorry, I have no solution for those.
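If you copy a batch of URLs out of the Network Monitor, a short Python sketch can tell you which ones look like manifests. It checks only the file extension in the path, ignoring anything after the question mark. The URLs in the example are placeholders.

```python
from urllib.parse import urlparse


def manifest_kind(url):
    """Return "HLS" for .m3u8, "DASH" for .mpd, or None for anything else."""
    path = urlparse(url).path.lower()  # the path excludes the ?query part
    if path.endswith(".m3u8"):
        return "HLS"
    if path.endswith(".mpd"):
        return "DASH"
    return None


if __name__ == "__main__":
    for url in [
        "https://example.invalid/v1/master.m3u8?fastly_token=abc",
        "https://example.invalid/audio/show.mpd",
        "https://example.invalid/segment-1.mp4",
    ]:
        print(manifest_kind(url), url)
```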

Quite often, especially in the case of HLS streams, filtering on .m3u8 or .mpd will turn
up multiple manifests. Start working with the first one that is displayed. This is
usually the master manifest. It normally does not describe a single stream directly, but
rather it describes a collection of stream manifests. Often enough, its file name
contains the word "master." When you see multiple manifests, the most likely situation
is that the first one is the master manifest & the others are stream manifests. Those
stream manifests will have entries in the master manifest.

Once you've found a manifest, you want to do 2 things with it:
- Download it onto your system so you can open it in a text editor (like Notepad).
- Get its URL.
I describe how to do both of these tasks upthread. I've already said you should look
inside your manifest to get an idea of what's in there. But don't break your head trying
to understand it in full. You have a tool that interprets a manifest for you & reports
what is in there in a reasonably easy to read format. That tool is ffprobe. I say its
report is reasonably easy to read. Not totally easy, but if you look carefully at the
ffprobe output, it starts to make a certain kind of sense. I have never read any
documentation that explains what ffprobe reports. I just figured it out by looking at
it. No, I'm not some kind of wizard. Just look at the report & you'll see things that
are pretty obvious to understand. There's also a few things that I don't completely
understand but the important parts are pretty clear.

You need to understand that ffprobe can report what's in a .m3u8 manifest same as in
a .mpd manifest. It generates the same kind of report in both cases. All you have to do
is execute this command:

ffprobe "http://URL-of-the-master-manifest"

In order to illustrate what ffprobe is telling you, I ran ffprobe against some files I
already have on my system. That's like this:

ffprobe "x:\directory\directory\directory\file"

Here are some samples of ffprobe output for various file types I happen to have handy.
This is not a complete list of every type of file there is. It's just a few types that
you are likely to encounter. Also, I am extracting only a small portion of what ffprobe
reports so you can focus on the bits that you will use to decide what to download.
There's a lot of other detail in ffprobe output. It's certainly interesting but most of
it is not pertinent to downloading content.

Here's what ffprobe tells you about a regular old .mp4:

Stream #0:0(eng): Video: h264 (High) (avc1
Stream #0:1(eng): Audio: aac (LC) (mp4a

This .mp4 file consists of 2 streams. Streams. That's the terminology ffmpeg & ffprobe
use for these things. They are the video track & the audio track of the file. This file
has a Stream #0:0 that is a video track & a Stream #0:1 that is an audio track.
Sometimes the audio track comes before the video track. This is not significant. It
only matters that there is a video track & an audio track.

Here's what ffprobe tells you about a .mkv file:

Stream #0:0(eng): Audio: opus
Stream #0:1: Video: av1

Almost the same information. This file happens to have the audio before the video,
whereas my sample .mp4 had them in the other order. That's an insignificant difference.
See past that. The major difference I want to point out is that the .mp4 shows avc1 in
the video stream & the .mkv shows av1.

Here's what ffprobe tells you about a .webm file:

Stream #0:0(eng): Audio: opus
Stream #0:1(eng): Video: vp9

Here's what ffprobe tells you about a .mp3 file:

Stream #0:0: Audio: mp3

The .mp3 format is audio-only. There is only one stream, one track, in a .mp3.

Here's what ffprobe tells you about an audio-only .mp4 file:

Stream #0:0(und): Audio: aac (LC) (mp4a

I got this from one of the audio files I downloaded from the Metropolitan Opera. This is
the same information that I show above for the regular .mp4 file that consists of both
video & audio. This one has only the audio but the information ffprobe displays for the
audio stream is the same in both cases.

Here's what ffprobe tells you about a captions or subtitles file:

Stream #0:0: Subtitle: webvtt

You need to look at the ffprobe report in order to decide what file type you need on your
output file in the ffmpeg command that will download the object. You have to provide an
output file name with the right extension depending on what the input is. There's 2
parts of the ffmpeg command that interact & I will explain that here in a moment. One
part is the type of stream you will be taking as your input. Hang on, I'm getting to the
other part.

You need to look carefully at other details that ffprobe reports, details I'm not showing
here. But they are really obvious once you start looking at what ffprobe reports. On
the lines for video streams, you will see the video resolution: 640x480, 1280x720,
1920x1080, 3840x2160. Every manifest will have its own pattern of resolutions. Every
manifest is different. But the concept is the same. Other things you'll need to look
for are a bit less noticeable. These are numbers followed by kb/s. These are bit rates
that you can use to choose between 2 video streams that show the same resolution. Audio
streams also have bit rates & if the manifest has more than one audio stream, this can
help you decide which one to download. The higher the kb/s, the higher the quality of
the stream & the larger the file will be when you download it. One other detail to look
for in video streams is a number followed by fps. This is the frame rate, frames per
second, of the video, another factor that indicates the quality of the video.
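
For anyone who would rather let a script pick those details out, here is a rough Python sketch. It pulls the stream number, resolution, bit rate, & frame rate out of Stream lines with regular expressions. The two sample lines are copied from the ffmpeg log quoted later in this thread. The function name and the patterns are assumptions made for illustration, not anything ffprobe itself provides, and real reports can be laid out a little differently.

```python
import re

# Sample lines copied from the ffmpeg log quoted later in this thread.
SAMPLE_LINES = [
    "Stream #0:0: Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), "
    "1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 25 fps, 25 tbr, 90k tbn",
    "Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, "
    "322 kb/s (default)",
]

# Matches e.g. "Stream #0:0: Video: h264" or "Stream #0:0(eng): Audio: opus".
STREAM_RE = re.compile(r"Stream #(\d+:\d+)[^:]*: (Video|Audio|Subtitle): (\w+)")


def describe_stream(line):
    """Return a dict of the details found on one Stream line, or None."""
    match = STREAM_RE.search(line)
    if match is None:
        return None
    info = {
        "id": match.group(1),
        "type": match.group(2),
        "codec": match.group(3),
        "resolution": None,
        "kbps": None,
        "fps": None,
    }
    resolution = re.search(r",\s(\d{2,5})x(\d{2,5})\b", line)
    if resolution:
        info["resolution"] = (int(resolution.group(1)), int(resolution.group(2)))
    bitrate = re.search(r"(\d+) kb/s", line)
    if bitrate:
        info["kbps"] = int(bitrate.group(1))
    frame_rate = re.search(r"([\d.]+) fps", line)
    if frame_rate:
        info["fps"] = float(frame_rate.group(1))
    return info


for sample in SAMPLE_LINES:
    print(describe_stream(sample))
```

Anything the line does not mention comes back as None, so a video line without a kb/s figure simply has no bit rate recorded.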

The stream number provides a name for you to use when you decide which stream to
download. I have examples upthread here of manifests that include several streams. So
it's not always true that you just get streams 0:0 & 0:1. You have to look at the
ffprobe output to determine which one(s) is(are) video & which audio.

The simplified form of the ffmpeg command for downloading something is this:

ffmpeg parameters -i "URL of the master manifest" -codec: copy -map p:q -map r:s "x:\output directory\output file name.correct extension"

The -map parameters select the streams you want out of the manifest. If you study a
report from ffprobe & decide that, let's say, video stream 0:6 is the one you want, and
audio stream 0:9 is the one you want, you would code this:

ffmpeg parameters -i "URL of the master manifest" -codec: copy -map 0:6 -map 0:9 "x:\output directory\output file name.correct extension"

This will put the video stream into the output file in front of the audio stream.
Equally, you could do this:

ffmpeg parameters -i "URL of the master manifest" -codec: copy -map 0:9 -map 0:6 "x:\output directory\output file name.correct extension"

The streams would end up in the other order. That is insignificant. All video players
can handle files configured either way. The important idea here is that ffmpeg can put
separate video & audio streams into a single output file. If you do want to keep the 2
streams in separate files, just execute 2 ffmpeg commands, one with only -map 0:6, the
other with only -map 0:9.
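
If you would rather build these commands in a script than edit a long line by hand, the pattern can be sketched in Python. This is only an illustration: the function name, the example.com URL, and the output file names are all made up, and it builds the argument list without running anything.

```python
def build_ffmpeg_command(manifest_url, stream_ids, output_path, ffmpeg="ffmpeg"):
    """Return the argument list for a copy-only download of the chosen streams.

    The streams land in the output file in the order they are listed.
    """
    if not stream_ids:
        raise ValueError("choose at least one stream")
    command = [ffmpeg, "-i", manifest_url, "-codec:", "copy"]
    for stream_id in stream_ids:
        command += ["-map", stream_id]
    command.append(output_path)
    return command


# Hypothetical manifest URL, for illustration only.
URL = "https://example.com/master.m3u8"

# Video 0:6 in front of audio 0:9, both in one file.
print(build_ffmpeg_command(URL, ["0:6", "0:9"], "both.mp4"))

# The same two streams kept in separate files: two commands, one -map each.
print(build_ffmpeg_command(URL, ["0:6"], "video-only.mp4"))
print(build_ffmpeg_command(URL, ["0:9"], "audio-only.mp4"))
```

The resulting list can be handed to Python's subprocess.run when you actually want to start the download.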

This use of the -map parameter simplifies something I talk about upthread here. I
pointed out one manifest that had partial URLs & the complicated way you had to
reconstruct the full URLs. You can forget all of that. With the -map parameter, ffmpeg
figures it all out for you. Poof. Magic. I hadn't figured this out when I wrote that
stuff upthread.

All the ffmpeg examples I show in this thread include the output parameter -codec: copy.
This tells ffmpeg to copy the input to the output unchanged. This suppresses one of the
interesting & powerful functions of ffmpeg: encoding. I think that's the term. Maybe
it's transcoding. Converting. Hey, I've never pretended to be an expert on ffmpeg. I
know only the little bits I explain here & no more. You've read my rants about the lousy
ffmpeg documentation. I would know more but for that. Anyway, -codec: copy does no
processing of the input. That's why you have to look at the ffprobe report to know what
kind of stream you're getting. Streams aren't always .mp4. All my examples in this
thread have had .mp4 output files but that's just a bit lucky. We all know there's
plenty of other file types out there. So if you use -codec: copy, you have to make sure
your output file name has the right extension.
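
As a memory aid, the pairings shown in the ffprobe samples above can be written down as a small Python lookup. This is a sketch under loud assumptions: the function name is made up, the table only mirrors the examples in this thread (plus .vtt for WebVTT captions), and the fallback to .mkv is a guess based on Matroska accepting most codecs. It is not how ffmpeg itself decides anything.

```python
def guess_extension(codecs):
    """Guess an output extension from the codec names ffprobe reported.

    Heuristic only: it mirrors the sample reports in this thread.
    """
    names = {codec.lower() for codec in codecs}
    if not names:
        raise ValueError("no codecs given")
    if names == {"mp3"}:
        return ".mp3"
    if names == {"webvtt"}:
        return ".vtt"
    if names <= {"h264", "aac"}:
        return ".mp4"      # avc1 video and/or aac audio, as in the .mp4 samples
    if names <= {"vp9", "opus"}:
        return ".webm"     # as in the .webm sample
    return ".mkv"          # assumed fallback, e.g. the av1 + opus sample


print(guess_extension(["h264", "aac"]))
print(guess_extension(["opus", "av1"]))
```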

But if you leave off -codec: copy, ffmpeg does additional processing to encode the input,
not just copy it. Even if your output file name matches the type you figured out from
the manifest, ffmpeg still encodes the file as it gets written. If, on the other hand,
you figured out that the input is .mkv but you code an output file name with .mp4,
ffmpeg will convert the file during the download. Ffmpeg looks at the file
extension of your specified output file & generates output to match that file extension.
Be aware that this is a very CPU-intensive process. Expect ffmpeg to use about 100% of
your CPU & expect to hear your case fans blowing up a hurricane. This is not something
to be alarmed about, though. Case fans were invented to blow harder when the CPU works
harder. And ffmpeg using 100% of your CPU should not cause any undue problems of system
responsiveness. Operating systems are supposed to support multitasking. That means if
ffmpeg is using 100% of your CPU when you open some other application, the operating
system is supposed to cut ffmpeg's resource allocation so the other task can run. You
might see ffmpeg's CPU usage drop to 80% so your other task can use the 20% it needs. If
you want to run 3 or 4 other tasks, they should all coexist quite sociably. You should
not get the least bit disturbed because ffmpeg (or anything else for that matter) uses
100% CPU, assuming it isn't just having a problem & looping. So if you omit -codec:
copy, you are asking ffmpeg to do a lot of extra processing. I just want to make sure
you understand this possibility. Don't do it by accident & then be surprised.

Omitting -codec: copy should have no effect on the speed of your download. Modern CPUs &
disk drives operate at a much higher speed than even the fastest Internet connection
these days. So your system can still process & write whatever is coming in off the line
faster than ffmpeg can feed it in, even if you run some other tasks that take system
resources away from ffmpeg.

When it comes to -codec: copy it's like this. Omit it & you can code any name for your
ffmpeg output file; ffmpeg will convert the input to the type of the output file.
Include -codec: copy & you have to make sure the output file name has the right
extension. I haven't experimented a whole lot with doing conversions with ffmpeg but I
believe a mismatched output file extension when you code -codec: copy gives an error &
the download is not performed. Maybe somebody will encounter a situation in which you
either confirm or disprove what I'm saying. Do post here about that.

So here's the steps:
1. Find the master manifest, get it on your system, get its URL.
2. Run ffprobe on the manifest using its URL.
3. Read the ffprobe report, figure out what's being offered, choose what stream or streams you want to download.
4. Decide whether to include or omit -codec: copy, name your output file accordingly.
5. Download with ffmpeg using the stream identifiers in -map parameters.

mjs

May 31, 2022, 6:03:34 AM5/31/22

to Video DownloadHelper Q&A

I found another simplified way to use ffmpeg, someone posted this in a comment for a youtube video.

ffmpeg -i "m3u8manifestURL" -c copy fileName.mp4

If a video has more than one resolution and you want to choose the quality, select the resolution in the video player for example 720p

A new manifest url will appear in the developer tools / network. The manifest url will be for 720p resolution.

Wild Willy

May 31, 2022, 5:16:21 PM5/31/22

to Video Download Helper Google Group

That would actually work in every situation where you can get an HLS master manifest. No
need to select anything yourself to make another manifest appear. Just do what you
suggest using the URL of the master manifest. You just wouldn't have control over which
resolution you would get. There was a user on here (I would have to search for longer
than I am willing) who did exactly that & did not get the resolution he wanted. I have
also observed that in many cases, one & sometimes two of the subordinate stream manifests
get displayed in the Network Monitor without any further intervention from the user.
Again, these are choices made by the web site, which may or may not be the resolutions
you want. In such cases, using a gear icon or some other mechanism provided in the media
player in the web page may or may not cause a new manifest to appear because it may
already be displayed. I think in the most general case, you are better off doing ffprobe
on the master manifest & figuring it all out for yourself. I have always been leery of
defaults. There's been too many times I've let things (in the broadest sense, not just
restricted to downloading videos) take their default & it hasn't been what I wanted. So
I'm in the habit of making the most decisions myself that I possibly can.

Just a nit. -c is an abbreviation for -codec: . An example of how I don't like
abbreviations any more than I like defaults.

Another nit. Quality & resolution are not synonyms. A given resolution can have
multiple qualities, as evidenced by differing bit rates in the file properties. The
issue is confused by YouTube's unfortunate choice to use the term quality in their gear
icon for selecting resolution. That's a slovenly use of language & leads to perhaps a
confused understanding of certain issues.

And HLS manifests on YouTube? That I have to see. Please post an example.

Wild Willy

Jul 1, 2022, 2:06:42 AM7/1/22

to Video DownloadHelper Q&A

Since I opened this thread, I've learned some things. I'm not entirely satisfied that the above discussion is as clear as it could be. Plus I used an example drawn from Twitter. Twitter has since made itself most inhospitable to non-members. So I'm going to try again & you tell me if this is easier to understand.

I have explanations above for how to get ffmpeg & the purpose of a manifest. I assume you have read those bits & understand them. I also assume you have gotten ffmpeg & installed it on your system. Maybe "installed" is too elaborate a word. Just unzip the package you got from ffmpeg.org. That's all the installation there is. Also, I'm using Firefox for this demonstration. There may be slight variances here & there for the other browsers. I rely on you, as a user of either Chrome or Edge, to figure out how to do the equivalent things on your browser. Also, I am on Windows (Windows 7 64-bit, to be precise). Once again, if you are on Linux or Mac, I rely on you to figure out how to do the equivalent things on your system.

For my first example, I'm using this page:

https://www.medici.tv/en/concerts/kirill-petrenko-conducts-vasks-berio-silvestrovkarabyz-janacek-and-sibelius-elina-garanca/

This is a subscription site but you can see that I am not logged in there. Anybody can visit that page & do what I'm showing here if you'd like to try it yourself as an exercise.

Once you're on the page, press the F12 key. This opens the Web Developer Tools, where the Network Monitor lives. You can get the same thing by navigating menus: Tools -> Browser Tools -> Web Developer Tools.

Once you have the Network Monitor open, make sure you've selected Network & All as I show in the image. Then type the string .m3u8 into the filter field. I suppose you don't really need to include the leading period. I like to include it because I have found that some sites have other files that are not actually manifests but they do just happen to include the string m3u8 within their names. So I recommend including the leading period. Once you've done all of that, you may need to reload the page once or twice to make the manifests appear in the Network Monitor. A manifest of type .m3u8 is for an HLS stream.
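
To see why the leading period matters, here is a tiny Python sketch. It assumes the filter field does a plain substring match, and the request URLs are invented for the purpose. Nothing here talks to a browser.

```python
def filter_requests(urls, needle):
    """Keep the URLs that contain needle anywhere, like a plain substring filter."""
    return [url for url in urls if needle in url]


# Hypothetical request URLs, made up for illustration.
REQUESTS = [
    "https://example.com/video/master.m3u8?token=abc",
    "https://example.com/js/m3u8-player.js",
    "https://example.com/video/segment1.ts",
]

# Without the period, the script file sneaks in alongside the manifest.
print(filter_requests(REQUESTS, "m3u8"))

# With the period, only the manifest is left.
print(filter_requests(REQUESTS, ".m3u8"))
```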

Once you've got some manifests listed, assume the first one is the master manifest. This is usually a safe assumption. Get the URL of this master manifest into your system clipboard as I show in the image.

I babbled on upthread about how you really ought to copy the whole manifest to your system & look inside it with Notepad or some other text editor. That's not as important as I once thought it was. I do think you should still make some effort to look at a raw manifest to see what's in it. But we're going to use ffprobe to do the heavy lifting here. Trying to interpret a manifest yourself will usually give you a headache. Let ffprobe do the work.

Now, in order to execute ffprobe, I strongly recommend you create a one-line .bat file for the purpose. You'll be running ffprobe a lot & there's no sense in retyping everything every time you do this. Your one-line .bat file will consist of this one command:

ffprobe [a parameter] [the URL of the master manifest] [redirection to capture the ffprobe output to a file]

In my case, this is what the command looks like on my system:

"G:\ffmpeg\ffmpeg-2022-03-17-git-242c07982a-full_build\bin\ffprobe.exe"
-protocol_whitelist file,crypto,data,http,https,tls,tcp
"https://playout.prod.medicitv.fr/satie/live/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1cmwiOiJodHRwczovL3Nkbi1nbG9iYWwtc3RyZWFtaW5nLWNhY2hlLjNxc2RuLmNvbS85Mzc4L2ZpbGVzLzIyLzA1LzAyLzUzNzI2NDUvOTM3OC1uTmNKUUdkbTJ3eGdrM3EtZHJtLWFlcy5pc20vbWFuaWZlc3QubTN1OD9zdGFydD0wJmVuZD0xODAiLCJpYXQiOjE2NTY2MjQ3MDcsImV4cCI6MTY1NzIyOTUwN30.FqKXD4-j2eiIk7Oq1Qm0tyYQ474jKaFNSgVTkOKgiOI/m.m3u8"
2>"Q:\To Watch\ffprobe.txt"

I've broken that up into separate lines. Since it's such a long line, Google would wrap it anyway. So I've made the line breaks in more helpful places. In the original, it is a single long line of text. Let's take it piece by piece.

"G:\ffmpeg\ffmpeg-2022-03-17-git-242c07982a-full_build\bin\ffprobe.exe"

This invokes ffprobe. The G:\ffmpeg\etc part is the directory on my system where ffprobe resides. (Also ffmpeg.) You will most likely have some other directory path on your system. The important part is at the end: ffprobe.exe. This invokes the ffprobe utility.

-protocol_whitelist file,crypto,data,http,https,tls,tcp

This is a parameter that I have found is necessary, not only here but later on ffmpeg. I built it up piece by piece over time based on experience. The ffmpeg documentation would lead you to believe this parameter is not needed. My experience has been that it very much is needed. Don't question it. Just code it.

"https://playout.prod.medicitv.fr/blah blah blah"

This is the URL of the master manifest I copied out of the browser window above & pasted into my command invocation. It is typically ugly. Of course, it will be different from one web page to another, from one web site to another, even from one user to another & one day to another. The important thing is to get the URL & give it to ffprobe as input.

2>"Q:\To Watch\ffprobe.txt"

This is the redirection that captures the output of the command in a file. The particular file name within the quotation marks is not really important. You will almost certainly put it in a different partition than Q: & a different directory than "To Watch", and you can choose any file name other than ffprobe.txt. Or you could use ffprobe.txt. It's your choice. The names don't matter. You just need something valid here. Also, I've made it a .txt file because I want to attach it to my post here, & Google doesn't accept just any old file extension. Plus, an ffprobe report, any ffprobe report, is just a plain text file. So I've made it .txt to make it easy for you to copy it from this post to your system where you can look at it. Do look at it. It is critical to the next steps. Read along with me as I discuss the contents of this file. Look carefully for the things I mention. You may have to stare at the report for a bit to see what I'm talking about. Take the time. It is very much worth it.
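
For anyone not on Windows, or anyone who prefers a script to a .bat file, the same ffprobe invocation can be sketched in Python. The function names and the example.com URL are assumptions made for illustration. Passing a file as stderr plays the role of the 2> redirection, since that is where ffprobe writes its report.

```python
import subprocess

# Same whitelist as in the .bat file above.
PROTOCOLS = "file,crypto,data,http,https,tls,tcp"


def build_ffprobe_command(manifest_url, ffprobe="ffprobe"):
    """Return the ffprobe argument list for one manifest URL."""
    return [ffprobe, "-protocol_whitelist", PROTOCOLS, manifest_url]


def save_ffprobe_report(manifest_url, report_path, ffprobe="ffprobe"):
    """Run ffprobe and capture its stderr (the report) in report_path."""
    with open(report_path, "wb") as report:
        completed = subprocess.run(
            build_ffprobe_command(manifest_url, ffprobe),
            stderr=report,      # equivalent of 2>"report file"
            check=False,        # a failed probe still leaves a report to read
        )
    return completed.returncode


# Hypothetical URL; this only prints the command, it does not run ffprobe.
print(build_ffprobe_command("https://example.com/master.m3u8"))
```

Calling save_ffprobe_report with a real manifest URL and a file name of your choice would produce the same kind of text report as the .bat file. Pass the full path to ffprobe as the third argument if it is not on your PATH.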

The ffprobe report begins with about 60 lines (in this case) of stuff you can skip over. If you take the trouble to look inside the manifest yourself, you will see where the information in those 60 lines is coming from. It is an interesting exercise & will enhance your understanding of what's going on. But it's not really necessary to look at that section in order to do the download we are going to do here. The interesting part of any ffprobe report is always at the bottom of the file.

In this case, there are 6 Programs listed by ffprobe, numbered 0 through 5. Program is the jargon ffprobe uses to refer to a multimedia object, in this case an MP4 video. I discuss upthread how you can determine from an ffprobe report what type of media you have encountered. Here, we have an MP4.

Right before the line containing the title Program 0, notice the line that starts with Duration: 00:03:00.00. This says our video is 3 minutes long. We're not logged on. We don't have access to the full concert of whatever length it was, a couple of hours probably. We will be able to download only the 3-minute teaser, the free item. That's good enough for this demonstration.

Each Program consists of 2 Streams. Again, this is ffprobe jargon. Stream is the way it labels the tracks of the MP4. You can see the name of each Stream right after the word Stream. The names of the Streams here are 0:0, 0:1, 0:2, and so on. Each Program has an Audio stream followed by a Video stream. The order is not important. You'll encounter cases where the video comes before the audio. That's not important. What's important is that each program has an Audio Stream & a Video Stream.

You can see that the Video Streams are of various resolutions ranging from a low of 256x144 to a high of 1920x1080. This is a typical array of choices. Each one of the video streams shows a frame rate of 25fps. This is not necessarily what you will always encounter. It's possible that on any given site, the media object might have 25fps on its low resolution Streams but 30fps on higher resolution Streams & maybe even 60fps on its highest resolution Streams. That just happens not to be the case here. It is just something you should keep an eye out for.

In each Program in this report, there are parts that are labelled Metadata: variant_bitrate. The numbers there give a rough indication of how much bandwidth usage it will take to download the Streams of that Program. In this particular manifest, the 6 Programs show 6 different resolutions. But it is common to encounter cases in which a given resolution occurs more than once. The bitrate will almost certainly be different for the Streams of the same resolution. You can determine which Stream includes the higher quality video at that resolution. It will be the one with the higher bitrate value. Also, the resulting file you download will be larger for the larger bitrate. (I emphasize again as I have done many times in this forum. Quality and resolution are NOT synonyms.)

There is a similar consideration for the Audio Streams. In this particular case, all 6 Audio Streams use a 48kHz sampling rate. (That's shown in the report as 48000 Hz.) But this may not always be true. You could see Audio Streams with differing sampling rates. This would help you select the quality of Audio Stream you want to download. On the other hand, there are varying bit rates for these Audio Streams. We have 134kb/s, 165kb/s, 258kb/s, & 3 occurrences of 322kb/s. This also helps you to select which quality of Audio Stream you want to download.

This manifest shows a different Audio Stream with each Video Stream. We have 6 Programs here & the various Audio & Video streams are numbered 0:0 through 0:11. But you are likely to encounter cases in which several Video Streams are paired with a common Audio Stream. If this were true here, the last stream wouldn't be 0:11 but something lower, like maybe 0:8. You would see that maybe Audio Stream 0:4 would appear within a few Programs paired with different Video Streams. You might see 0:4 as the Audio Stream within the same Program as Video Stream 0:5. But then you might see 0:4 with 0:6, 0:4 with 0:7, 0:4 with 0:8. It's a possible manifest structure you might encounter. It's not true here, but I want you to keep an eye out for this possibility elsewhere.
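
Reading the Programs by eye works fine, but the same bookkeeping can be sketched in Python. Treat this as an illustration under assumptions: the sample report is abbreviated and partly made up (only the Program 5 stream numbers and its variant_bitrate of 5638000 come from this thread, the Program 0 figures are invented), and picking the highest variant_bitrate is just one way of choosing.

```python
import re

# Abbreviated, partly made-up report. Program 0's bitrate and pairing are invented.
SAMPLE_REPORT = """\
  Program 0
    Metadata:
      variant_bitrate : 400000
  Stream #0:0: Audio: aac (LC), 48000 Hz, stereo, fltp, 134 kb/s
    Metadata:
      variant_bitrate : 400000
  Stream #0:1: Video: h264, yuv420p, 256x144, 25 fps
    Metadata:
      variant_bitrate : 400000
  Program 5
    Metadata:
      variant_bitrate : 5638000
  Stream #0:10: Audio: aac (LC), 48000 Hz, stereo, fltp, 322 kb/s
    Metadata:
      variant_bitrate : 5638000
  Stream #0:11: Video: h264 (High), yuv420p, 1920x1080, 25 fps
    Metadata:
      variant_bitrate : 5638000
"""


def parse_programs(report):
    """Group the Stream lines of a report under the Program they follow."""
    programs = []
    current = None
    for line in report.splitlines():
        text = line.strip()
        match = re.match(r"Program (\d+)", text)
        if match:
            current = {"program": int(match.group(1)),
                       "variant_bitrate": None,
                       "streams": []}
            programs.append(current)
            continue
        if current is None:
            continue  # still in the header part of the report
        match = re.match(r"Stream #(\d+:\d+)[^:]*: (\w+):", text)
        if match:
            current["streams"].append((match.group(1), match.group(2)))
            continue
        match = re.match(r"variant_bitrate\s*:\s*(\d+)", text)
        if match and current["variant_bitrate"] is None:
            current["variant_bitrate"] = int(match.group(1))
    return programs


def best_program(programs):
    """Pick the Program with the highest variant_bitrate."""
    return max(programs, key=lambda program: program["variant_bitrate"] or 0)


print(best_program(parse_programs(SAMPLE_REPORT)))
```

The stream numbers in the result are exactly what goes after -map on the ffmpeg command.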

OK. We've studied this manifest long enough. Time to decide what we want to download. I like to get the best choice on offer. If you're getting a classical music video, you want the best sound with the best image. So I'm going to choose Program 5 here. That's Audio Stream 0:10 & Video Stream 0:11. To do that, we're going to execute ffmpeg. The structure of the command is this:

ffmpeg [input file parameters] -i [input file name] [output file parameters] [output file name] [windows redirections to capture the log file]

Again, you'd be doing yourself a favor to put this into a .bat file. Here's the actual command to do the download. Once again, this is just a single very long line. I've broken it up here to highlight the parts.

"G:\ffmpeg\ffmpeg-2022-03-17-git-242c07982a-full_build\bin\ffmpeg.exe"
-protocol_whitelist file,crypto,data,http,https,tls,tcp
-hwaccel auto
-i "https://playout.prod.medicitv.fr/satie/live/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1cmwiOiJodHRwczovL3Nkbi1nbG9iYWwtc3RyZWFtaW5nLWNhY2hlLjNxc2RuLmNvbS85Mzc4L2ZpbGVzLzIyLzA1LzAyLzUzNzI2NDUvOTM3OC1uTmNKUUdkbTJ3eGdrM3EtZHJtLWFlcy5pc20vbWFuaWZlc3QubTN1OD9zdGFydD0wJmVuZD0xODAiLCJpYXQiOjE2NTY2MjQ3MDcsImV4cCI6MTY1NzIyOTUwN30.FqKXD4-j2eiIk7Oq1Qm0tyYQ474jKaFNSgVTkOKgiOI/m.m3u8"
-codec: copy
-map 0:11 -map 0:10
"Q:\VDH Testing\Teaser.mp4"
1>"Q:\VDH Testing\Teasermp4.Err"
2>"Q:\VDH Testing\Teasermp4.Log"

Now let's go through the parts one at a time.

"G:\ffmpeg\ffmpeg-2022-03-17-git-242c07982a-full_build\bin\ffmpeg.exe"

This is just like what I did above for ffprobe. This time, I'm executing ffmpeg.

-protocol_whitelist file,crypto,data,http,https,tls,tcp

This is the same as for ffprobe.

-hwaccel auto

This is a parameter on ffmpeg that is discussed elsewhere in this forum. You can do a search for it if you're interested. I'm not sure to what extent it actually helps. But it doesn't seem to hurt. If you don't want to code it, don't.

-i "https://playout.prod.medicitv.fr/blah blah blah"

This is the same manifest URL as I gave to ffprobe above. But on the ffprobe command, you just put the URL there in the command. On the ffmpeg command, you have to prefix the manifest URL with the -i switch. It's just syntax. Nothing significant. This is the rule. Just do it like this.

-codec: copy

This tells ffmpeg to take whatever it is reading as input & transfer it to the output unchanged. In our case, we know from looking at the ffprobe report, that we have an Audio Stream & a Video Stream. So -codec: says to process all Stream types in the input. Other forms of coding this parameter are -codec:a for audio, -codec:v for video, & other possibilities. When you get more advanced with ffmpeg, you can scour the documentation for further options. We don't need those here & you're unlikely to need them all that often in practice. So -codec: is good enough. The copy that follows -codec: is the value of the parameter. The parameter is -codec: & the parameter value is copy. By coding this, you stop ffmpeg from doing a lot of very CPU-intensive converting. As I have said elsewhere, ffmpeg is like a Swiss Army knife. It does all manner of things if you know how to tell it to do them. I don't. This is what you will be using almost all the time. Once again, when you get advanced, you'll perhaps leave this parameter out. Most of the time, this is what you will want to code.

-map 0:11 -map 0:10

This is how you select individual Streams out of the manifest. This tells ffmpeg to ignore the first 5 Programs & download only the 6th Program. We determined the 2 Streams we were going to download. Now we're using those Stream names as parameter values for the -map parameter. You'll notice that I've been a bit sly here. The manifest said the Audio Stream comes before the Video Stream in each Program. I've swapped the order here. Why? Just to show that you can do this. It is of no significance whether audio comes before or after video in a file. All video players can handle it either way. You would get a playable result file by coding the -map parameters in the other order. You would not notice any difference when you play the file. But as I've done it here, the result file that gets downloaded will have the video track in front of the audio track.

"Q:\VDH Testing\Teaser.mp4"

This is the specification for the target file on your system. You will code whatever partition, directory, & file name you choose. The only important thing here is the .mp4 part. As we already determined, this object is an MP4. So you have to code your target file name with the extension .mp4. The rest of it is up to you.

1>"Q:\VDH Testing\Teasermp4.Err"
2>"Q:\VDH Testing\Teasermp4.Log"

These are the redirections for capturing the output of the ffmpeg command. The 1> redirection has always been empty for me. I include it just to be careful. I suppose you could leave it out. The 2> redirection is the important one. It is the log file created by ffmpeg. It's usually a sizeable file. In particular, for actual performances on Medici, things that last a couple of hours, this log file gets really huge, much larger than for example the Metropolitan Opera downloads. There are things logged that I wish I knew how to suppress. Maybe some day I will make an effort to figure out how to do that. For this little 3-minute clip, it's not an issue.

So here's what my results look like:

#02.jpg

You can see from the time stamps in that image that the download took a minute. One interesting thing to note is that the audio bit rate is only 319kbps whereas ffprobe advertised it as 322kb/s. This sort of minor discrepancy is common. I don't fret over it.

I have attached the log file to this post. I had to change the file extension to .txt to keep Google happy. You should download it to your system & read along in it as I discuss what's in it.

Let's go through the ffmpeg log & look at the interesting bits. You should be able to recognize that ffmpeg starts by issuing ffprobe against the input. The input to ffmpeg is the same manifest from earlier, so the start of the log is just a repeat of what we've already analyzed. Eventually, we come to this passage, about 125 lines into the log:

Output #0, mp4, to 'Q:\VDH Testing\Teaser.mp4':
Metadata:
encoder : Lavf59.20.100
Stream #0:0: Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 25 fps, 25 tbr, 90k tbn
Metadata:
variant_bitrate : 5638000
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 322 kb/s (default)
Metadata:
variant_bitrate : 5638000
comment : audio
Stream mapping:
Stream #0:11 -> #0:0 (copy)
Stream #0:10 -> #0:1 (copy)

This is showing what ffmpeg is going to create. Notice how it will put the Video Stream in front of the Audio Stream, like I already discussed. Notice the section under Stream mapping. This reinforces what I already explained. It shows which Streams will be selected from the input manifest. It shows which order those Streams will be placed in the output file. It notes the requested copy function from the -codec: parameter.

Just below this, you'll see some lines that read:

No longer receiving playlist x

The x runs from 0 through 4. This is ffmpeg's way of acknowledging that the Streams for those Programs were not selected for processing.

The rest of the log file consists of a pattern of 2 lines repeated over & over:

Opening 'ugly URL' for reading
frame= 49 fps=0.0 q=-1.0 size= 1280kB time=00:00:02.02 bitrate=5174.1kbits/s speed=3.06x

For the sharp-eyed among you, you'll notice that buried deep within the ugly URL is a number that starts at 1 & increases by 1 in each log line. The log line with the number 1 in it was actually generated by ffprobe when ffmpeg first started. In this section of the log, the first one has the number 2 & if you skip down to line 323, you'll see the last one has the number 90. That tells you that this 3-minute video was read in 90 chunks.

I don't completely understand everything in the second line here, the one that begins with frame=. I think the frame value is a count of the number of video frames that have been read so far. At the end, it appears that there were 4500 frames total in this clip. 4500 frames, 3-minute clip, 1500 frames per minute, divide by 60, 25 frames per second, sounds about right. The size value appears to be a cumulative count of the size of the file read so far. The time is the time index within the clip that has been read so far. In this line, it's processed just over 2 seconds of the input. The bit rate appears to be the same as the thing labelled Total bitrate in the Windows Properties in the Video section. You'll notice that each log line shows a different bitrate value. Videos generally don't need a fixed bit rate throughout. Different frames can be displayed at different bit rates, & this appears to be reflected here. The speed number indicates how fast ffmpeg is reading the input. In the line I'm showing here, 3.06x means that it is reading the input as if it were playing the clip at 3.06 times the normal speed at which you would sit & watch it. So a 3-minute clip played at a speed factor of 3.06 should take about 59 seconds to download. But if you scan all the log lines, you'll see this speed was not constant throughout the download for whatever reason. Going by the overall speed of 2.31x in the final log line, the actual download time was about 78 seconds.

The last lines of the ffmpeg log, lines 325 & 326 of the file, are these:

frame= 4500 fps= 58 q=-1.0 Lsize= 118517kB time=00:03:00.01 bitrate=5393.5kbits/s speed=2.31x
video:111380kB audio:7089kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.039939%

Pretty self explanatory.
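
For anyone who wants to check the arithmetic behind those numbers, here is a short Python sketch. The figures are copied from the log lines quoted in this post. Two things are assumptions: that ffmpeg's kB means 1024 bytes while its kbits means 1000 bits, and that the speed on the final line is an average over the whole run.

```python
# Figures copied from the log lines quoted in this post.
frames = 4500
duration_s = 3 * 60 + 0.01   # time=00:03:00.01
size_kb = 118517             # Lsize, assumed to be in units of 1024 bytes
final_speed = 2.31           # speed= on the last progress line
early_speed = 3.06           # speed= on the sample progress line

# 4500 frames over 3 minutes should come out at about 25 frames per second.
frame_rate = frames / duration_s

# Total bits divided by duration, in kbits/s (1 kbit assumed to be 1000 bits).
bitrate_kbits = size_kb * 1024 * 8 / duration_s / 1000

# Time needed to read 3 minutes of content at each speed factor.
time_at_final_speed = duration_s / final_speed
time_at_early_speed = duration_s / early_speed

print(f"frame rate           : {frame_rate:.2f} fps")
print(f"overall bit rate     : {bitrate_kbits:.1f} kbits/s")
print(f"time at final speed  : {time_at_final_speed:.0f} s")
print(f"time at early speed  : {time_at_early_speed:.0f} s")
```

The computed bit rate lands on the 5393.5kbits/s that ffmpeg printed, which suggests the unit assumptions are right. At the final average speed of 2.31x the transfer works out to roughly 78 seconds, against roughly 59 seconds if the early 3.06x had held throughout.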

After my ffmpeg download completed, I ran ffprobe against the downloaded file on my system. That's what follows at line 327 of the log. Here's how the ffprobe report connects to the Windows Properties.

#03.jpg

Now here's an exercise for you. Take a look at this page:

https://www.bbc.co.uk/sounds/play/collection:p0801f4y/m000qbgj

I have a feeling that page may disappear at some point. If I understand correctly, BBC Radio 3 does not archive their shows forever. No matter. Scrounge around on their site & look for other archived material.

Go through the steps as I've outlined them above . . . with one change. In the Network Monitor, instead of filtering on .m3u8, filter on .mpd. Manifests of type .mpd are for DASH streams instead of HLS streams. But everything else about the process is exactly the same.

So now you're wondering how do you know whether to look for HLS manifests or DASH manifests. You don't. You have to try them both. Sometimes you'll find .m3u8 files, sometimes you'll find .mpd files. We've even encountered one site (it's discussed in a thread somewhere in this forum) that used both types of manifest on the same web page. So try them both.
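
If you end up with a pile of URLs copied out of the Network Monitor, a few lines of Python can sort them by manifest type. This is a minimal sketch: the function name and the example.com URLs are made up, and it only looks at the file extension in the URL path, ignoring whatever follows the question mark.

```python
from urllib.parse import urlsplit

# Extension to streaming protocol, as described above.
MANIFEST_TYPES = {".m3u8": "HLS", ".mpd": "DASH"}


def manifest_type(url):
    """Return "HLS", "DASH", or None based on the extension in the URL path."""
    path = urlsplit(url).path.lower()
    for extension, kind in MANIFEST_TYPES.items():
        if path.endswith(extension):
            return kind
    return None


# Hypothetical URLs, for illustration only.
for url in [
    "https://example.com/show/master.m3u8?start=0&end=180",
    "https://example.com/radio/stream.mpd",
    "https://example.com/page.html",
]:
    print(manifest_type(url), url)
```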

I don't want to oversell this process. It works on lots of sites. But it fails on plenty of sites as well. Usually, the problem is that you can't find a manifest at all. Sometimes you find a manifest but ffprobe gives you the error 403 Forbidden - access denied. My beloved Golf Channel content downloads great in VDH but I get the 403 Forbidden when I try to ffprobe their manifests. I don't know what magic Michel does to make it work. In many of these cases, I speculate that the web site has intentionally taken measures to thwart downloaders like VDH & ffmpeg. In that case, resort to OBS if you really want your own copy of the content. That's not as good as downloading. For one thing, you have to let OBS record the item at playback speed. So a 3 hour video is going to take 3 hours to record. Furthermore, you're quite likely going to have to stop using your computer for anything else for however long it takes to record the item. But it might end up being the only solution. Still, it's always worth a shot to try first VDH, then ffmpeg.

Teasermp4Log.txt

ffprobe.txt

mjs

Jul 22, 2022, 11:08:06 PM7/22/22

to Video DownloadHelper Q&A

ffmpeg is a good option to have when vdh doesn't work, but there is one problem I find with it. It is very finicky about what file name you choose.

It'll say Unable to find a suitable output format for word Invalid argument

It is just easier to call the file name video.mp4 then rename it after the download has completed.

Wild Willy

Jul 23, 2022, 8:41:59 PM7/23/22

to Video Download Helper Google Group

I've seen that error but I've usually seen it when there's been some mismatch between the
type of input you're giving it & the type of output you're telling it to create. If your
input is not actually MP4 but you've specified MP4 as your output, AND you've specified
-codec: copy, it will complain. If you leave out -codec: copy, it will try to do a
conversion, which might not be what you expect, & the particular transformation from the
given input to the specified output might not be possible. Besides that, there are a lot
of characters ffmpeg doesn't process correctly. These are mostly accented characters: é,
ç, à, ř, Greek letters, Cyrillic letters. If you want to use such characters in your
input file name, you'll have to change them. If you want to use such characters in your
output file name, your advice, mjs, is spot on. I don't know if there are any "language
packs" for ffmpeg. I haven't particularly searched for such a thing. I've just adapted
by doing what you suggest. At least I have yet to encounter a manifest whose name has
any characters that ffmpeg has a problem with.
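
Along the lines of what mjs suggests, here is a minimal Python sketch of the "download under a plain name, rename afterwards" approach. The function name ascii_safe_name and the fallback name "video" are made up for illustration; nothing here comes from ffmpeg itself.

```python
import unicodedata


def ascii_safe_name(title, ext=".mp4"):
    """Return a plain ASCII file name to hand to ffmpeg as the output name."""
    # Split accented letters into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop everything that is not ASCII (combining marks, Greek, Cyrillic).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Keep letters, digits, hyphen and underscore; turn the rest into "_".
    cleaned = "".join(c if c.isalnum() or c in "-_" else "_" for c in ascii_only)
    cleaned = cleaned.strip("_")
    # If nothing survives (for example an all-Greek title), fall back to "video".
    return (cleaned or "video") + ext


print(ascii_safe_name("Café řeka"))
print(ascii_safe_name("Ωδή"))
```

Once ffmpeg has finished, the file can be renamed to the title you really wanted, for instance with os.replace.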

Wild Willy

Jul 25, 2022, 12:37:46 PM

to Video Download Helper Google Group

The learning continues. The Verbier Festival is underway & I have been recording a
number of the concerts from Medici TV (https://www.medici.tv/). These appear to be
available even if you do NOT have a full paying membership, which I don't. You generally
don't need a paying membership, just a free membership, to view their livestreams while
they are live. But quite often, after the livestream ends & they archive the show on the
site, they put it behind their paywall. So you have the odd situation where you have a
recording of a show that you made while it was live, but you can't view the archived
version of the same show if you aren't a paying member. But these Verbier concerts
appear to be available to non-paying members after the fact. I haven't looked at their
entire archive but I believe their archived copies of concerts from past years of the
Verbier Festival are behind their paywall. I am most appreciative of their making this
year's shows available for free.

In any case, I was wondering how I might get ffmpeg to record a complete livestream that
I might join in progress. Typically, they open their livestreams about 15 minutes before
show time. So if a show begins at 6:00 & I happen to join the livestream at 7:00, I
would want to tell ffmpeg to begin its recording at one hour & 15 minutes before the time
I launch ffmpeg. In other words, I wondered if there is a way to tell ffmpeg to rewind
to the beginning of a livestream. It appears that my Golf Channel recordings do that. I
was wondering if these Medici recordings might do the same.

I went hunting online for any advice. I found many search hits, a few of them
interesting, but none of them relevant. I finally searched with some search key I don't
remember & down at the bottom of one of the search pages I found this reference:
https://gist.github.com/s4y/46738a67c4bc842f1f02f09e1eaf23fd. This reference is no more
recent than March 2021, over a year ago now. I am always skeptical of anything I read
online about ffmpeg because it's been around so long & over the years it's changed a lot.
Plus my experience has been that not all the advice I try works. But I was willing to
give this one a shot.

The ffmpeg documentation (at least on Windows) is bundled in the zip file containing the
product. When you unzip the file you download from ffmpeg.org, you'll get a subdirectory
named doc. It contains a number of html files with file names that at least suggest what
might be in the files. So you're supposed to open the documentation in your web browser.
I'd prefer PDFs or plain text files, but it is what it is. At least you aren't reliant
on their web site being up in order to access their documentation. In the ffmpeg-all
file, they have what I have to guess is the most extensive documentation of ffmpeg. In
order to verify the advice I read at github, I did a string search in that web page for
-live_start_index. I got no hits, so I was beginning to doubt whether I had found good
advice. Every ffmpeg parameter starts with a hyphen. If you scan through that
particular page of documentation, you will see that a lot of parameters are documented
with the leading hyphen. But it took me a while to discover that I was looking at a case
that didn't follow the standard. This sort of inconsistent documentation makes me crazy.
I have complained many times in this forum about how lousy the ffmpeg documentation is.
Here's another reason to complain. Anyway, it took me a while but I finally tried
looking for that parameter without the leading hyphen. Bingo. It's there. Lesson:
don't always prefix your searches for parameters with the hyphen. If the search comes up
empty, try again without the hyphen.

So I finally located the documentation for -live_start_index (absent the hyphen) in
section 20.10. This is a surprisingly short section that documents the HLS demuxer. It
says there that -live_start_index specifies the segment index at which to start a live
stream. Is that to start creating one or start receiving one? This brilliant
documentation leaves the answer to that question to the reader's imagination. I believe
this is typical of all Linux documentation. The people writing this stuff are
intentionally trying to discourage newbies from learning whatever it is they're
documenting for the express purpose of building their egos & keeping theirs a closed
club. Here you have yet one more instance of that. In any case, the github advice says
to code -99999. This tells me that the author there believed the parameter would back up
that many segments from current time, the time you launch ffmpeg, whenever that happens
to be, & skip back some unreasonably large amount to reach start. But the documentation
says that negative numbers count back from the end of the stream. I pondered all of this
& decided -99999 was not really the correct way to go. The value you really want is
simply 0. That is 0 instead of 1 because ffmpeg usually numbers things, various things,
starting at 0. So I assumed 0 would be correct here. Assumed. Not explicitly
documented. I was guessing. Again. As I usually do when it comes to this execrable
documentation.

I tried -live_start_index 0 and it works. It causes ffmpeg recordings from Medici
livestreams to rewind to the start. Hooray! I was so happy to have finally found
something that worked. But I have given this some thought. It is quite likely that the
success of this parameter relies rather a great deal on the way the server presents the
livestream. On Medici, their livestreams rarely last more than a few hours. Skipping
back to 0 on Medici isn't such a long time. But suppose you're looking at the livestream
of your favorite public radio station, the one that broadcasts the Metropolitan Opera on
Saturday afternoons during their season. When is the beginning of their livestream?
That radio station streams live 24x7, most likely. The beginning of their livestream is
not within reach. What would you code for such a stream to rewind it only an hour or
two? I haven't figured that out. So this parameter is probably not universally
applicable. But you'll probably find plenty of cases in which it will be very useful.
Still, I am putting up the warning that it might not work on every site. I can imagine
the case of a site that might have been livestreaming something for 3 hours but
-live_start_index 0 might take you back only half an hour or an hour. On that 24x7 radio
site, -live_start_index 0 might go back a few hours. It could easily not work at all.
It all depends on the way the site presents the livestream. So keep this in mind.
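
One way to picture why the result depends on the site is the sketch below. It assumes the index is resolved only against the segments currently listed in the manifest, with out-of-range values clamped to the first or last listed segment. That clamping is an assumption made for illustration, not something the documentation quoted above spells out, and the function name resolve_start_segment is made up.

```python
def resolve_start_segment(live_start_index, listed_segments):
    """Pick a starting position among the segments the manifest currently lists.

    Assumed behavior: non-negative values count from the first listed
    segment, negative values count back from the last one, and anything
    out of range is clamped to what the manifest actually offers.
    """
    if listed_segments <= 0:
        raise ValueError("the manifest lists no segments")
    if live_start_index < 0:
        return max(listed_segments + live_start_index, 0)
    return min(live_start_index, listed_segments - 1)


# A manifest that still lists the whole 3-hour show in 6-second segments.
print(resolve_start_segment(0, 1800))
# A 24x7 station whose manifest only keeps the last 10 segments.
print(resolve_start_segment(0, 10))
```

Under that assumption, 0 can only reach back as far as the oldest segment the server still lists, which would explain a rewind of half an hour on one site and nothing at all on another.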

I've been a bit vague so far on where exactly to code -live_start_index. This is
intentional because the story is actually a bit complicated. In that section 20.10 of
the ffmpeg documentation, I noticed a couple of other parameters documented just below
-live_start_index. It so happens that when I have recorded livestreams from Medici, it
seems like the livestream ends but ffmpeg sits there reading the manifest over & over
1000 times before it finally decides to terminate its recording. During those 1000
reads, ffmpeg writes no blocks to the output file. But this takes about 40 minutes. All
of this activity is logged by ffmpeg. If you read the documentation in section 20.10 for
the next few parameters after -live_start_index, you'll see 2 of them have default values
of 1000. Plus the names of the parameters seem particularly relevant. I decided this
can't be a coincidence & started experimenting with them.

When I say experimenting, I mean trial & error, lots of trials & an embarrassingly large
number of errors. When I put those 2 parameters into my invocation of ffmpeg, I kept
getting the syntax error, "Option not found." In order to understand what did finally
work, I need to explain the basic structure of the ffmpeg command. I've kind of done
that upthread here but I need to revisit the subject. The ffmpeg command boils down to
this:

ffmpeg -i <input> <output>

<input> can be many things, a URL of a manifest online, a URL of an MP4 or some other
media type online, the file specification for a manifest on your system, the file
specification of a media file on your system, it all depends on the function you want to
perform.

<output> is usually a file specification of a media file you want ffmpeg to create.

So the command has 2 gaps in it. The first gap is between ffmpeg & -i. The second gap
is between <input> & <output>. The second gap is easy to explain. The ffmpeg parameters
you code in the second gap apply to <output>.

The first gap is a little more complicated. There are 2 classes of ffmpeg parameters
that you code in the first gap. The class on the left, the parameters you code first,
are global parameters that apply to the overall execution of the command. The class on
the right, the parameters you code second, are parameters that apply to -i <input>,
parameters that apply to the input file. Among my many errors, I found that if you have
an input parameter & you follow it with a global parameter, ffmpeg will give you the
syntax error "Option not found." It was a valid option, but I was coding it in the wrong
position, so ffmpeg couldn't find it. My experiments have led me to believe that the
-protocol_whitelist parameter that I have mentioned multiple times upthread here is an
input parameter. I think it is global in nature but it seems it is defined in the world
of ffmpeg as an input parameter. On the other hand, it seems that the -hwaccel parameter
you can find mentioned upthread is a global parameter, which at least matches my sense of
what it is. Further, it seems -live_start_index is an input parameter. So this is what
I have used to record Medici livestreams rewound to the beginning & not wasting 1000
useless reads of a manifest after the livestream has ended.

ffmpeg -m3u8_hold_counters 25 -max_reload 25 -hwaccel auto -protocol_whitelist file,crypto,data,http,https,tls,tcp -live_start_index 0 -i <input> outputparms <output>

That's all one line. Google may be displaying it wrapped & folded on your screen. It's
a single command line.
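
For anyone scripting this outside a .bat file, here is a minimal Python sketch that assembles the same command as a list of arguments, in the same order as above: everything meant for the input goes before -i, everything meant for the output goes after it. The function name, the example URL and the output file name are placeholders, and -codec copy stands in for whatever output parameters you actually use.

```python
def build_live_record_command(manifest_url, output_path,
                              output_options=(), idle_reads=25):
    """Assemble the ffmpeg arguments in the order used in the post above."""
    command = ["ffmpeg"]
    # First gap: parameters that must come before -i.
    command += ["-m3u8_hold_counters", str(idle_reads)]
    command += ["-max_reload", str(idle_reads)]
    command += ["-hwaccel", "auto"]
    command += ["-protocol_whitelist", "file,crypto,data,http,https,tls,tcp"]
    command += ["-live_start_index", "0"]
    command += ["-i", manifest_url]
    # Second gap: parameters that apply to the output file.
    command += list(output_options)
    command.append(output_path)
    return command


cmd = build_live_record_command(
    "https://example.invalid/live/master.m3u8",  # placeholder manifest URL
    "concert.mp4",
    output_options=["-codec", "copy"],
)
print(" ".join(cmd))
```

The list can be handed to subprocess.run(cmd) as is. Because no shell or .bat file is involved, spaces in file names need no extra quoting.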

I am not entirely sure which of -m3u8_hold_counters or -max_reload limits ffmpeg to
rereading the manifest only 25 times after the livestream ends. But I've got them both
coded & it works.

Among my many errors, I have found that you can't code these parameters on just any sort
of input. If you're just downloading a file, the parameters as I show them above cause
the "Option not found" error. So it seems ffmpeg figures out what type of input you're
dealing with & parses the parameters accordingly. I have taken special care to code
-m3u8_hold_counters, -max_reload, & -live_start_index ONLY when I'm recording a
livestream. Those parameters are relevant only for livestreams anyway. But I get syntax
errors for that exact same coding pattern when it's not a livestream. So be careful with
it.

Something I learned from my many errors is that ffmpeg appears to scan the command line
right to left. One thing that makes me think this is that the global parameters & input
file parameters appear to depend on the type of input file specified. It seems logical
that the only way it could treat those parameters differently depending on the type of
input file is to be scanning the command line right to left. A second piece of evidence
is the way it was generating error messages. When I was making errors with the
-m3u8_hold_counters & -max_reload parameters, the error messages I got led me to believe
it was scanning right to left. When I coded -m3u8_hold_counters followed by -max_reload,
it threw the error message on -max_reload, then terminated command processing. When I
coded -max_reload followed by -m3u8_hold_counters, it threw the error message on
-m3u8_hold_counters, then terminated the command processing. It was always the rightmost
parameter that got flagged. It never flagged them both, even though they were both in
error. I don't know whether this right to left processing is documented. It might be.
I haven't looked. It's probably not really important for successfully using the command
to do downloads. I just found it interesting.

When a livestream ends, the serving web site is supposed to delete the manifest. When
that happens, ffmpeg reads the manifest it has been reading since the livestream began &
gets a 404 Not Found error. It looks like it tries again maybe once or twice, then
finalizes the output file & terminates. I have observed this with some of my Medici
downloads. But sometimes the serving web site doesn't remove the manifest from its site.
The majority of my Medici livestreams are like this. Before I found the
-m3u8_hold_counters & -max_reload parameters, ffmpeg would read the manifest 1000 times
without writing any data to the output file. This appeared to take about 40 minutes.
That works out to about 2.5 seconds per read of the manifest. Once the 1000 reads were
completed, ffmpeg finalized the output file & terminated execution. When I added
-m3u8_hold_counters 25 & -max_reload 25 to my ffmpeg invocations, ffmpeg read the
manifest only 25 times before deciding the livestream was over. So the
-m3u8_hold_counters & -max_reload parameters are something of a safeguard against a
livestream that is not terminated properly. I have found that parameter values of 25
work well enough for Medici livestreams. That works out to a delay of a bit over 1
minute after the livestream ends before ffmpeg decides to stop recording. This might not
work well on all serving sites. You'll have to experiment with the values for these
parameters. Maybe 50 would work better for your case, maybe 100. You'll have to
experiment with it to figure it out for your case. My guess is that -m3u8_hold_counters
is the parameter that is actually taking effect to end my Medici livestreams. But I have
not experimented with coding only -m3u8_hold_counters without -max_reload or only
-max_reload without -m3u8_hold_counters to test that hypothesis. I figure I'm covering
all the bases by coding both parameters & I'll just leave it like that until I hit a case
that makes me re-evaluate what I've done.
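
The arithmetic behind those numbers, as a small Python sketch. The 40 minutes and 1000 reads are the rough figures from the post, so the results are only approximate (roughly 2.4 to 2.5 seconds per read). The function names are made up for illustration.

```python
def seconds_per_read(total_minutes, reads):
    """Average time between two reads of the manifest."""
    return total_minutes * 60 / reads


def idle_delay_seconds(reads, per_read_seconds):
    """How long ffmpeg keeps waiting after the stream stops delivering data."""
    return reads * per_read_seconds


# About 40 minutes for the default 1000 reads, as observed on Medici.
per_read = seconds_per_read(40, 1000)
for reads in (25, 50, 100, 1000):
    delay = idle_delay_seconds(reads, per_read)
    print(reads, "reads ->", round(delay), "seconds")
```

The same calculation gives a starting point for other sites: time how long the default 1000 reads take there, then scale the parameter values down to the delay you can live with.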

Now look again at that section 20.10 of the ffmpeg documentation. Do you see where they
explain that one parameter is an input parameter but the others are global parameters?
Do you see where they explain that you can't code these parameters on just any old ffmpeg
invocation? No. I had to suffer through countless trials & errors to learn these
things. Every day I look at this documentation my hatred for it grows. Maybe I've saved
you some trouble. I hope you find this latest update helpful.

Wild Willy

Aug 29, 2022, 11:35:08 PM

to Video DownloadHelper Q&A

I have taken it as an intellectual challenge that I haven't been able to download content from YouTube using ffmpeg. But I think I've managed to figure this out. And if you're thinking I've got entirely too much time on my hands, you're right. I'm doing this mainly for fun. You might find this useful. If not, that's fine. Read something else.

The YouTube example I am going to use is this:

https://www.youtube.com/watch?v=BszBccYHuAk

Before I visit YouTube, I always tell Firefox to refuse to play the ads. This is how I do that:

#01.jpg

I know. It's the scary about:config. Don't be scared. Click the button that warns you away & filter on the "autoplay" string. I show the relevant preference in that image. I normally have that one set to 1, but I always change it to 2 for YouTube. I found this advice a long time ago via a Google search. If you're interested, you can do the search & read about it.

I visited the YouTube page & selected the highest resolution on offer. Then I opened the Network Monitor & reloaded the page. YouTube content is generally offered in multiple resolutions, & entries in the Network Monitor can include items for multiple resolutions. So it's good to choose your resolution, then reload the page to make the Network Monitor show you stuff that is more likely to be for only your chosen resolution. More likely. Not guaranteed. Just more likely. The Network Monitor will also include some content that is generated by the ads. This can't be helped. But it can be mitigated. Here's how:

#02.jpg

You would think that you could filter on mp4 or webm but those don't appear in the Network Monitor entries as file extensions. They might appear as character strings within the URLs so filtering on them might work. I think it's safer to sort the entries. I clicked the column header with the title Type to sort the entries. That then groups the entries by their type. I'm showing the section of the display where the mp4 entries end & the webm entries begin. But there's lots of entries both above & below this point that might be relevant. Which entries? Here's where the fun begins. You can't know without some trial & error, some guessing. There's no avoiding it. This is not an exact, scientific approach. There's a certain amount of improvising you have to do here. Deal with it. Moreover, as the web page sits there idle, more & more entries appear in the Network Monitor. Again, deal with it. My intuition is that the entries we will be most interested in will be listed lower down in the Network Monitor. So I always start my hunt for useful entries from the bottom.

#03.jpg

The last mp4 here has this URL:

https://rr2---sn-vgqsrnlk.googlevideo.com/videoplayback?expire=1661837491&ei=UkwNY-HBNYOdigSUwZfICQ&ip=2603%3A6011%3Ac306%3Aa341%3A0%3A0%3A0%3A10a9&id=o-ALw37cIFXsGaozPb-KezK17GPDx22WFqef3qqx64IGty&itag=399&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278%2C394%2C395%2C396%2C397%2C398%2C399&source=youtube&requiressl=yes&mh=ol&mm=31%2C29&mn=sn-vgqsrnlk%2Csn-vgqsknlr&ms=au%2Crdu&mv=m&mvi=2&pl=32&initcwndbps=1658750&vprv=1&mime=video%2Fmp4&ns=aRv9YIwZlIyv2U7VHP8NYY4H&gir=yes&clen=707156308&dur=2965.629&lmt=1588813648848138&mt=1661815513&fvip=5&keepalive=yes&fexp=24001373%2C24007246&c=WEB&rbqsm=fr&txp=5531432&n=mxZf8jrnife3IA&sparams=expire%2Cei%2Cip%2Cid%2Caitags%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cgir%2Cclen%2Cdur%2Clmt&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRAIgEJvMO_cNeEMWS2bRdnqDb5qvHFfg8XLxu8zEyVS-F3ECICMidJsoQM2QBpaMWsozpW7jRyALutCA-AQHJvbvLaV5&alr=yes&sig=AOq0QJ8wRAIga2Km47Or8_4HCK-9WDLKIdmkTIkyxbB3ujnHQ-e3d4gCIClMW52IcyCfhjaFhQfx_DQWT2MgdyBAUSqHOYVrfYIo&cpn=ZxXJZNdnCPRecgCU&cver=2.20220829.00.00&range=14666906-16762263&rn=30&rbuf=65444&pot=D9480MCz9nrJ3gHRvYtm8oxH8lOXP6iYyfFaw6GQZzPvcn8bB23haWS7Li8C81Xsb-8Yv0at-x9-o9m2Tzsvr9lGS1xhshGv18xfzW8k5qho9R4cSIdpC4yS7UPhocbrJqAf7RabBUUqdQ==

Could not be uglier. But if you have really sharp eyes, you'll notice that the URL includes this little bit near, but not at, the end:

&range=14666906-16762263

At this point, I rely on a concept fellow user mjs pointed out a while ago in connection with an issue raised concerning content on another site entirely. But the concept applies here. You want to remove the &range= bit from the URL before you use it for anything:

https://rr2---sn-vgqsrnlk.googlevideo.com/videoplayback?expire=1661837491&ei=UkwNY-HBNYOdigSUwZfICQ&ip=2603%3A6011%3Ac306%3Aa341%3A0%3A0%3A0%3A10a9&id=o-ALw37cIFXsGaozPb-KezK17GPDx22WFqef3qqx64IGty&itag=399&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278%2C394%2C395%2C396%2C397%2C398%2C399&source=youtube&requiressl=yes&mh=ol&mm=31%2C29&mn=sn-vgqsrnlk%2Csn-vgqsknlr&ms=au%2Crdu&mv=m&mvi=2&pl=32&initcwndbps=1658750&vprv=1&mime=video%2Fmp4&ns=aRv9YIwZlIyv2U7VHP8NYY4H&gir=yes&clen=707156308&dur=2965.629&lmt=1588813648848138&mt=1661815513&fvip=5&keepalive=yes&fexp=24001373%2C24007246&c=WEB&rbqsm=fr&txp=5531432&n=mxZf8jrnife3IA&sparams=expire%2Cei%2Cip%2Cid%2Caitags%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cgir%2Cclen%2Cdur%2Clmt&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRAIgEJvMO_cNeEMWS2bRdnqDb5qvHFfg8XLxu8zEyVS-F3ECICMidJsoQM2QBpaMWsozpW7jRyALutCA-AQHJvbvLaV5&alr=yes&sig=AOq0QJ8wRAIga2Km47Or8_4HCK-9WDLKIdmkTIkyxbB3ujnHQ-e3d4gCIClMW52IcyCfhjaFhQfx_DQWT2MgdyBAUSqHOYVrfYIo&cpn=ZxXJZNdnCPRecgCU&cver=2.20220829.00.00&rn=30&rbuf=65444&pot=D9480MCz9nrJ3gHRvYtm8oxH8lOXP6iYyfFaw6GQZzPvcn8bB23haWS7Li8C81Xsb-8Yv0at-x9-o9m2Tzsvr9lGS1xhshGv18xfzW8k5qho9R4cSIdpC4yS7UPhocbrJqAf7RabBUUqdQ==

So the &cver= is immediately followed now by the &rn=. I am going to pass this URL to ffprobe to find out whether this is in fact a URL of something I'm interested in. Since I use Windows command scripts, .bat files, for everything I do, there is one more bit of processing that I must perform on this URL before ffprobe can use it. You'll notice it's full of percent signs. In the Windows command language, you can refer to environment variables like this: %PATH%, %APPDATA%, etc. The SET symbol is surrounded by % signs. In order to prevent the Windows command processor from mistakenly interpreting parts of this URL as references to environment variables, I have to double every % in the URL:

https://rr2---sn-vgqsrnlk.googlevideo.com/videoplayback?expire=1661837491&ei=UkwNY-HBNYOdigSUwZfICQ&ip=2603%%3A6011%%3Ac306%%3Aa341%%3A0%%3A0%%3A0%%3A10a9&id=o-ALw37cIFXsGaozPb-KezK17GPDx22WFqef3qqx64IGty&itag=399&aitags=133%%2C134%%2C135%%2C136%%2C137%%2C160%%2C242%%2C243%%2C244%%2C247%%2C248%%2C278%%2C394%%2C395%%2C396%%2C397%%2C398%%2C399&source=youtube&requiressl=yes&mh=ol&mm=31%%2C29&mn=sn-vgqsrnlk%%2Csn-vgqsknlr&ms=au%%2Crdu&mv=m&mvi=2&pl=32&initcwndbps=1658750&vprv=1&mime=video%%2Fmp4&ns=aRv9YIwZlIyv2U7VHP8NYY4H&gir=yes&clen=707156308&dur=2965.629&lmt=1588813648848138&mt=1661815513&fvip=5&keepalive=yes&fexp=24001373%%2C24007246&c=WEB&rbqsm=fr&txp=5531432&n=mxZf8jrnife3IA&sparams=expire%%2Cei%%2Cip%%2Cid%%2Caitags%%2Csource%%2Crequiressl%%2Cvprv%%2Cmime%%2Cns%%2Cgir%%2Cclen%%2Cdur%%2Clmt&lsparams=mh%%2Cmm%%2Cmn%%2Cms%%2Cmv%%2Cmvi%%2Cpl%%2Cinitcwndbps&lsig=AG3C_xAwRAIgEJvMO_cNeEMWS2bRdnqDb5qvHFfg8XLxu8zEyVS-F3ECICMidJsoQM2QBpaMWsozpW7jRyALutCA-AQHJvbvLaV5&alr=yes&sig=AOq0QJ8wRAIga2Km47Or8_4HCK-9WDLKIdmkTIkyxbB3ujnHQ-e3d4gCIClMW52IcyCfhjaFhQfx_DQWT2MgdyBAUSqHOYVrfYIo&cpn=ZxXJZNdnCPRecgCU&cver=2.20220829.00.00&rn=30&rbuf=65444&pot=D9480MCz9nrJ3gHRvYtm8oxH8lOXP6iYyfFaw6GQZzPvcn8bB23haWS7Li8C81Xsb-8Yv0at-x9-o9m2Tzsvr9lGS1xhshGv18xfzW8k5qho9R4cSIdpC4yS7UPhocbrJqAf7RabBUUqdQ==

This is easily done in your favorite text editor. I happen to use Notepad++ but there are others, including Notepad itself. So now this URL is ready for ffprobe to use it. I've attached the result as file ffprobe video MP4.txt. You'll note that all the doubled % signs in the input have been transformed back to single % signs once they have passed through the Windows command processor & arrived at ffprobe. There are several important things to note in this ffprobe report. First, the duration is 00:49:25.63, which matches the duration of our clip, so we've found something that we are going to want to download. Next, the resolution is 1920x1080, so this is the video of what we want to download. Finally, note that there is only a Stream 0:0. There is no Stream 0:1 or any other Stream. That means this is a video track without an audio track.
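
The two URL edits described above (dropping &range= and doubling every % for use in a .bat file) can also be sketched in Python. The function names are made up and the short URL is a stand-in for the real one. The sketch only handles &range= when it is not the very first parameter after the question mark, which matches the URL shown here.

```python
import re


def strip_range(url):
    """Remove the &range=start-end parameter, leaving the rest untouched."""
    return re.sub(r"&range=[^&]*", "", url)


def escape_for_bat(url):
    """Double every % so the Windows command processor passes it through."""
    return url.replace("%", "%%")


# Shortened, made-up URL with the same shape as the real one.
sample = ("https://example.invalid/videoplayback?itag=399&mime=video%2Fmp4"
          "&cver=2.20220829.00.00&range=14666906-16762263&rn=30")

cleaned = strip_range(sample)
print(cleaned)
print(escape_for_bat(cleaned))
```

The doubling is only for text that ends up inside a .bat file. A URL passed to ffprobe directly from a Python list of arguments keeps its single % signs.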

Now we have to hunt down the audio track. And hunt is the perfect word for what we will do. I started by going to the next entry above that one in the Network Monitor & pushed that through ffprobe. After removing the &range= and doubling all the % signs, it turned out to be the same video again. I ran through all 7 of the mp4 entries listed & every one was for the video. Fortunately, there weren't even more mp4 entries. I have messed around with cases that showed 30 or 40 entries I wanted to check. In the end, none of these 7 mp4s was an audio track. I suppose this reinforces the frequent reports we've had on here of users getting content from YouTube that has video but no audio. It appears that the mp4 version of this clip has no audio.
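
Checking 7 (or 30 or 40) entries by eye gets tedious, so here is a minimal Python sketch of the same check. It asks ffprobe for JSON output and looks at the codec_type of each stream. The function names are made up, and the two sample dictionaries are written by hand to mimic the shape of ffprobe's JSON; they are not real ffprobe output.

```python
import json
import subprocess


def probe(url):
    """Run ffprobe on a URL and return its report as a dictionary."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_format", "-show_streams",
         "-of", "json", url],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def classify(info):
    """Say whether a probe report holds video, audio, both or neither."""
    kinds = {stream.get("codec_type") for stream in info.get("streams", [])}
    has_video = "video" in kinds
    has_audio = "audio" in kinds
    if has_video and has_audio:
        return "video+audio"
    if has_video:
        return "video-only"
    if has_audio:
        return "audio-only"
    return "no media streams"


# Hand-written stand-ins for what probe() would return.
video_report = {"streams": [{"codec_type": "video", "codec_name": "av1",
                             "width": 1920, "height": 1080}]}
audio_report = {"streams": [{"codec_type": "audio", "codec_name": "opus"}]}

print(classify(video_report))
print(classify(audio_report))
```

In real use you would call classify(probe(url)) on each cleaned-up URL from the Network Monitor and keep the first video-only and the first audio-only entry.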

ffprobe video MP4.txt

Wild Willy

Aug 30, 2022, 12:02:20 AM

to Video DownloadHelper Q&A

OK. Drop back 5 & punt. Let's look at the webm entries.

Mercifully, there's only 2 of them. The ffprobe report on the second one is in attached file ffprobe audio WEBM.txt. It shows that it is the audio track without a video track. It's got the right duration. The bit rate of 138kb/s sounds like it's probably reasonable. The sampling rate is 48kHz, a typical number. At least it's stereo.

The other webm entry was also the audio track. Well. Isn't that interesting. We're going to have to merge an mp4 video track with a webm audio track. What sort of target file type will we choose? I'm arbitrarily picking mp4. I could have picked webm. I could have picked mkv. They all play fine in VLC. But I'm picking mp4. What this means is I can't use my preferred -codec: copy parameter on my ffmpeg invocation. So my ffmpeg invocation will look like this:

ffmpeg -hwaccel auto -protocol_whitelist file,crypto,data,http,https,tls,tcp -i "video mp4 URL" -i "audio webm URL" "target file.mp4" 1>"Q:\VDH Testing\You Tube Video.err" 2>"Q:\VDH Testing\You Tube Video.log"

This simultaneously downloads the 2 tracks, video-only & audio-only, & both merges & converts them to mp4. The video doesn't need converting but since the audio is a different type, the whole thing needs to be converted. This takes more CPU than if everything were mp4 (or webm or mkv) but I have to deal with what I'm given. If the inputs were the same type, I would code -codec: copy as a parameter for the output file & it would take a lot less CPU. That doesn't affect the download speed, only the processor usage on my system. I'm not showing the actual URLs again. They're ugly & long & would only obscure the important details of what I'm doing here. The 1> & 2> bits at the end capture the output of the execution of ffmpeg in the named files. The 1> file is actually empty, as it always is with ffmpeg. But I always code that bit for form, just in case something ever does end up there. The 2> bit is where ffmpeg writes its log file. I don't want to give the impression that I'm hiding anything. The log is just a long, repetitive file. I've condensed it somewhat & attached that as file ffmpeg.log.
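
Here is a minimal Python sketch of the same two-input invocation, with a switch for the -codec copy case. The function name and the placeholder URLs are made up. The idea that an .mkv target would let both tracks be copied without conversion is an assumption that was not tested in this thread.

```python
def build_merge_command(video_url, audio_url, output_path, copy_streams=False):
    """Assemble an ffmpeg command that merges one video and one audio input."""
    command = ["ffmpeg", "-hwaccel", "auto"]
    # Each input gets its own whitelist because the parameter sits before -i.
    for url in (video_url, audio_url):
        command += ["-protocol_whitelist", "file,crypto,data,http,https,tls,tcp"]
        command += ["-i", url]
    if copy_streams:
        # Only sensible when the target container accepts both codecs as-is.
        command += ["-codec", "copy"]
    command.append(output_path)
    return command


convert_cmd = build_merge_command("VIDEO_URL", "AUDIO_URL", "target file.mp4")
copy_cmd = build_merge_command("VIDEO_URL", "AUDIO_URL", "target file.mkv",
                               copy_streams=True)
print(convert_cmd)
print(copy_cmd)
```

Run through subprocess.run, the log that the .bat version captures with 2> can be captured by passing an open file as the stderr argument.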

While the download was running, this is what I was seeing in the Windows Resource Monitor:

#05.jpg

You'll note my horrible download speed of not even 500,000 bytes per second. Look down in the TCP Connections pane of the window. You'll see 2 entries for ffmpeg. The first entry is for the video track & the second is for the audio track. Look at that speed for the audio track. It's as bad as what you used to get with an old dial-up modem. Like I've said so many times, YouTube throttles their service. But I think I can tell you their formula. Whatever duration the clip is, cut that in half, & that's approximately the time it will take to download the clip. YouTube gives you enough bandwidth usage equivalent to playback at double speed or less. You can see evidence of this in the ffmpeg log. The lines end with a number that is more or less 1.35x, give or take. The numbers are a bit higher at the start of the log, but they eventually reach an equilibrium of about 1.35x for the majority of the download. That means ffmpeg is downloading the content at a speed factor of 1.35 compared to normal playback speed. So for our clip of duration 49:25 (2965 seconds), we can expect the download to take 2965/1.35 = 2196 seconds, which works out to just under 37 minutes. That's a ballpark figure since download speed is never perfectly constant for the duration of the download.
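
The estimate above as a small Python sketch, so you can plug in your own duration and the speed factor you see in your own log. The function name is made up for illustration.

```python
def estimated_download_seconds(duration_seconds, speed_factor):
    """Clip duration divided by the speed factor ffmpeg reports (e.g. 1.35x)."""
    return duration_seconds / speed_factor


seconds = estimated_download_seconds(2965, 1.35)
print(round(seconds), "seconds, about", round(seconds / 60, 1), "minutes")
```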

You may note a couple of lines showing a date & time at the very beginning & after the completion of the download. I have added those lines myself by coding this in my script:

date /t 1>>"Q:\VDH Testing\You Tube Video.log"
time /t 1>>"Q:\VDH Testing\You Tube Video.log"

I do that mainly to get a record of how long the download lasts. You'll see that my estimate of 37 minutes was quite accurate. I consider that a bit lucky. You'll also notice that the log file ends with an ffprobe report on the output file. That's just for interest & not entirely required.

Compare the ffprobe information at the start of the log with the ffprobe report at the end. The information for the video track is comparable. Not quite identical, but typical of mp4 video content both before & after. The input is av1 while the output is h264. The bit rate of the output is higher but I'm not convinced that's significant. The output can't be any better than the input. The conversion doesn't perform magic. Even at nearly 3000 kb/s, the video quality is not all that remarkable. I find that good quality 1920x1080 content has bit rates in the 5000 kb/s range, with very good quality more like 7000-8000 kb/s. So the input bit rate of about 1900 kb/s got raised to almost 3000 kb/s. I think that's simply an artifact of the conversion of the video from av1 to h264 & not something to get all excited about.

The audio track has undergone a complete change. The input is Opus but the output is AAC. This is a result of the conversion of the input to mp4. Also, the audio bit rate has dropped to 128kb/s, which is a common value for mp4 audio. Again, I don't think this is significant. Opus bit rate of 138 kb/s must be equivalent to AAC bit rate of 128 kb/s. It's a small change in absolute numbers & another thing to not get excited about.

ffmpeg.log

ffprobe audio WEBM.txt

Wild Willy

Aug 30, 2022, 12:21:40 AM

to Video DownloadHelper Q&A

These are the results of the download:

And here's the VLC screenshot of the very talented (and oh by the way very hot) Yuja playing the always enjoyable Brahms #2:

#07.jpg

You can see the video is just fine. You'll have to trust me that the audio is just as fine.

Now why go to all this trouble? Like I said, it was an intellectual challenge. It is in no way preferable to doing this with VDH. Just for comparison, here's what VDH offered:

#08.jpg

The VDH menu was long enough to require scrolling so it takes 2 images to show it all. VDH starts by showing some small mp3s. There's no mp3s in the Network Monitor. I don't know where VDH found those. I also don't know what they are. Toward the bottom, there's 2 entries that are obviously ads, durations 0:15 & 1:55. The concerto is also there but it's an mkv. There's no mkvs in the Network Monitor. This mkv is smaller than the mp4 I generated, about 75% of the size. Interesting. Perhaps that's an argument for preferring VDH. It would have been interesting to see what I would have gotten if I had told ffmpeg to generate an mkv as the output file. I'm not going back to do it again to find out.

In this case, I found some mp4s & some webms. In other cases, I have found just webms. I have only just recently figured this all out so I haven't done this with a lot of YouTube content. I also don't plan on doing it much. I don't know what is typical, or even if anything can really be characterized as typical when it comes to YouTube. We've seen so much inconsistency with YouTube content over the years. I think a large portion of what you would do in any other case would involve some improvising. I offer this as a possible alternative way to get YouTube content that is otherwise not possible to acquire. It is not very complicated but it is tedious & time-consuming. Like I said before, VDH is always preferable if it works.

mjs

Aug 31, 2022, 11:21:56 PM

to Video DownloadHelper Q&A

That's a lot to do just to be able to do the download in ffmpeg. I don't know how you got it to work in ffprobe. I copied the URL with the single percents and ffprobe shows an HTTP error 403 Forbidden. I copied the other one with the double percents but still got the same result.

Did you know you could just download the video and audio separately by opening the URL, with &range=numbers removed, in a new tab? Then right click on each one and save video/audio as. Maybe you already did know that and wanted a tougher challenge. The audio would still be webm though.
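For anyone who would rather not edit the URL by hand, here is a minimal Python sketch of the &range= removal described above. The function name strip_range and the example.invalid URL are made up for illustration. Everything else in the query string is left exactly as it was, on the assumption that the site checks the other parameters.

```python
import re

def strip_range(url: str) -> str:
    """Return url with any range=... query parameter removed."""
    # Case 1: range is not the first parameter (&range=...).
    url = re.sub(r"&range=[^&#]*", "", url)
    # Case 2: range is the first parameter (?range=...&...).
    url = re.sub(r"\?range=[^&#]*&?", "?", url)
    return url

# Hypothetical URL for illustration only, not a real media address.
example = (
    "https://example.invalid/videoplayback"
    "?expire=1661900000&range=0-4095&mime=video%2Fwebm"
)
print(strip_range(example))
```

The result can be pasted into a new tab. Whether the site then serves the whole file or answers 403 depends on the site and on how fresh the URL is.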

Wild Willy

Sep 1, 2022, 1:02:07 AM

to Video Download Helper Google Group

I suspect the URL I posted expired. There is a section of the URL that says ?expire= so
I suspect you have to go to the video page yourself & go through the process same as I
did to get fresh URLs that actually will be processed successfully by ffprobe today. I
was pretty sure when I posted the URL that nobody else was going to be able to use it. I
was even a bit worried I was taking so long that it was going to become invalid for me in
the middle of writing the tutorial. I posted the URL in order to show how ugly it is. I
also wanted to show it had the &range= thing we've encountered elsewhere. I have a
feeling &range= is not going to be what we see everywhere. It might be, I don't know,
&segment= or &chunk= or ?piece=, something else. You just have to be on the lookout for
something of this nature. I also wanted to show the business of doubling the percent
signs. That is very specific to Windows .bat files. If you're not using a .bat file
specifically in Windows, I don't know if that step is necessary. I don't think it's
necessary even on Windows if you're not using a .bat file, but in that case, you are a
masochist so you can probably deal with the associated pain. I have experience with
neither Linux nor Mac. There must be equivalent things that you do on those platforms.
If you (or anybody) are so inspired, I would welcome further tutorials here for those
environments. Maybe the % trick isn't applicable there. Maybe there are other tricks
involved. I have to rely on others to explain such things.
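The two URL details mentioned above, the doubled percent signs and the expire= field, can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and treating the expire= value as a Unix timestamp in seconds is an assumption, not something confirmed in this thread.

```python
import re
from datetime import datetime, timezone

def escape_for_bat(url: str) -> str:
    """Double every % so a Windows .bat file passes the URL through unchanged."""
    return url.replace("%", "%%")

def expiry_time(url: str):
    """Return the expire= value as a UTC datetime, or None if it is absent.

    ASSUMPTION: the value is a Unix timestamp in seconds.
    """
    match = re.search(r"[?&]expire=(\d+)", url)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)

# Hypothetical URL for illustration only.
url = "https://example.invalid/videoplayback?expire=1661900000&mime=video%2Fwebm"
print(escape_for_bat(url))
print(expiry_time(url))
```

If the second value printed is in the past, that would be consistent with ffprobe answering 403 Forbidden for a copied URL.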

I know about the deal of opening the object minus the &range= bit in a new tab & using
"Save Video As..." on the Firefox popup context menu. I've posted examples myself of
that in a few other threads in this forum. My goal here was specifically to use ffmpeg.
One reason was to prove it is feasible. Another was to show a way of merging separate
audio & video from the outset without the need for any post-processing, for example by
either VDH or ffmpeg.

Yes, it is a lot of trouble. That's what I said myself earlier. I'm probably going to
do something like this only in extreme cases. Pretty much all YouTube content I
encounter is perfectly amenable to VDH & that's what I will use pretty much all the time.

On the other hand, the approach is probably applicable to just about any site. There's
nothing particularly tied to YouTube in what I did. So I will be able to point to this
as advice to follow when a user finds a case in which there is no m3u8 HLS manifest & no
mpd DASH manifest. As long as there's no DRM protection on the thing & no specific
measures taken by the web site to hide their content (like purposely non-published
passwords, the dreaded 403 Forbidden - access denied case), this approach should be
generally applicable. In the end, any web site has to transmit its content to your
browser & that will show up in the Network Monitor. If other, more convenient (far more
convenient), approaches don't work, there's always the possibility of trying this.

Wild Willy

Mar 4, 2023, 1:24:03 AM

to Video DownloadHelper Q&A

I asked. I got an answer:

https://github.com/GyanD/codexffmpeg/issues/88

Code this as a parameter on ffprobe:

-strict experimental

You can put it before or after -protocol_whitelist, doesn't matter. If there are
captions, this will eliminate the error message in the ffprobe report. The captions will
show up as Streams within the Programs reported by ffprobe. I haven't tried it yet with
items that have captions for multiple languages. I expect those would show up
individually so that each Program would exhibit one Stream for each language of captions.
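As a rough way to see which caption Streams ffprobe reports, the JSON form of its output can be scanned with a short Python sketch. The sample data below is made up, and the function name caption_streams is invented for illustration. The ffprobe options in the comment (-print_format json, -show_streams) are standard ones, but this is not the command used in this thread.

```python
import json

def caption_streams(ffprobe_json: str):
    """Return (index, language) for every subtitle stream in ffprobe JSON output."""
    report = json.loads(ffprobe_json)
    found = []
    for stream in report.get("streams", []):
        if stream.get("codec_type") == "subtitle":
            # Streams without a language tag are reported as "und" here.
            language = stream.get("tags", {}).get("language", "und")
            found.append((stream["index"], language))
    return found

# Made-up sample shaped like the output of:
#   ffprobe -v error -strict experimental -print_format json -show_streams URL
sample = """
{"streams": [
  {"index": 0, "codec_type": "video"},
  {"index": 1, "codec_type": "audio", "tags": {"language": "en"}},
  {"index": 2, "codec_type": "subtitle", "tags": {"language": "en"}},
  {"index": 3, "codec_type": "subtitle", "tags": {"language": "fr"}}
]}
"""
print(caption_streams(sample))
```

The index values are what you would feed to -map when picking a caption language.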

The same -strict parameter needs to go on the ffmpeg command you use to download the
item. It needs to be a global parameter, not an input parameter. I discuss global vs
input upthread here. I have already implemented the -strict parameter on all my
invocations of both ffprobe & ffmpeg. If there are no captions, no harm done. At least,
that's my assumption at the moment. I may discover that indeed there are ramifications I
have not yet encountered. But I'm going to keep it in place until I discover it causes
problems.

You would have a choice how to download the captions:

1. Run ffmpeg with one -map parameter selecting the captions only, in your preferred
language, & target the output to a file with extension .vtt. This is what I have tested
& it works. It is also my personal preference since I like to have the option of editing
the captions. Run a second invocation of ffmpeg with 2 -map parameters selecting the
video & audio Streams & target this to a file with an extension of .mp4 or whatever
format the item is.

2. Run ffmpeg with THREE -map parameters selecting the Streams for video, audio, &
captions in your preferred language. Target your file to .mp4 or whatever the format of
the item is. This will embed the captions in the output file as a separate Stream.
Perhaps you see that as a convenience. I would miss being able to edit the captions.
Truth be told, you could always extract the captions from the .mp4 file at a later time,
edit the captions, & then put them back in. That's up to you.

I think you can select multiple language captions in a single ffmpeg invocation. This
would give you a multilingual video. You can use a control in VLC to cycle through the
various languages. I have not tried this so if you do, do post about it here.
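The two choices above can be sketched as argument lists in Python. This is a hedged illustration, not the commands actually used in this thread: the function names, the URL, the output file names, & the stream specifiers like 0:2 are all hypothetical, & the -c options are an assumption (copy for video & audio, mov_text for captions going into .mp4). -strict experimental sits directly ahead of -i here, which is the position the correction at the end of this thread calls for.

```python
def captions_only_cmd(url, captions, out_vtt):
    """Choice 1, first run: one -map, captions go to a .vtt file."""
    return ["ffmpeg", "-strict", "experimental", "-i", url,
            "-map", captions, out_vtt]

def av_only_cmd(url, video, audio, out_file):
    """Choice 1, second run: two -map parameters, video & audio only."""
    return ["ffmpeg", "-strict", "experimental", "-i", url,
            "-map", video, "-map", audio,
            "-c", "copy", out_file]  # ASSUMPTION: streams are copied as-is

def av_with_captions_cmd(url, video, audio, captions, out_mp4):
    """Choice 2: three -map parameters, captions kept inside the .mp4."""
    return ["ffmpeg", "-strict", "experimental", "-i", url,
            "-map", video, "-map", audio, "-map", captions,
            # ASSUMPTION: .mp4 wants captions converted to mov_text.
            "-c:v", "copy", "-c:a", "copy", "-c:s", "mov_text", out_mp4]

# Hypothetical values for illustration only.
url = "https://example.invalid/manifest.mpd"
for cmd in (captions_only_cmd(url, "0:2", "concert.vtt"),
            av_only_cmd(url, "0:0", "0:1", "concert.mp4"),
            av_with_captions_cmd(url, "0:0", "0:1", "0:2", "concert.mp4")):
    print(" ".join(cmd))
```

Each list can be handed to subprocess.run as-is. Selecting captions in several languages would mean adding one more -map per language to the last function.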

Wild Willy

Mar 16, 2023, 9:58:39 PM

to Video DownloadHelper Q&A

I have recently added a parameter to all my invocations of ffmpeg. This was a result of a corrupt packet causing my recording of a live opera to stop in the middle of the show. I now use this parameter for all my invocations of ffmpeg, both livestream & not.

-fflags +discardcorrupt

This is a global parameter to ffmpeg so you have to position it properly within the command string, which is to say before any input file parameters. I discuss global & input parameters upthread here in a post dated 2022/7/25.
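Since the position of the parameter matters, here is a small Python sketch that builds a command with -fflags +discardcorrupt ahead of the input & checks the ordering. The helper names, the URL, & the output file name are invented for illustration, & the -c copy is an assumption rather than something taken from the commands in this thread.

```python
def ffmpeg_cmd(url, out_file, leading_options):
    """Put leading_options ahead of -i, then the input, then the output."""
    return ["ffmpeg", *leading_options, "-i", url, "-c", "copy", out_file]

def precedes_input(cmd, option):
    """True if option appears before the first -i in cmd."""
    return option in cmd and cmd.index(option) < cmd.index("-i")

# Hypothetical livestream URL & file name.
cmd = ffmpeg_cmd("https://example.invalid/live.m3u8", "opera.mp4",
                 ["-fflags", "+discardcorrupt"])
print(" ".join(cmd))
print(precedes_input(cmd, "-fflags"))
```

A check like precedes_input is handy if the command string is assembled from several pieces, since it is easy for an option to drift to the wrong side of -i.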

Wild Willy

Mar 23, 2023, 9:40:11 AM

to Video DownloadHelper Q&A

I wish we had the ability to edit existing posts. But Google in their infinite wisdom withdrew that feature a few years ago. Thanks a whole lot . . . NOT.

I got it backwards above. The -strict parameter is an INPUT parameter, NOT a global parameter.
