FFV1 vs other formats for preservation

davstev

Sep 24, 2012, 1:20:56 PM
to archiv...@googlegroups.com
Hi,

As an increasingly digital archive, one of the questions that we are dealing with here at CCA in Montreal surrounds the ideal preservation encoding for video. We are not alone in this quandary. We are considering committing to JPEG2000 lossless compression, or AVI uncompressed. I noticed that Archivematica will normalize to FFV1 for preservation. This is slightly unusual in relation to what I have seen other archives going with.

Can you shed some light on your reasoning for choosing this format? It might open up another option for us to consider; I know that FFV1 has good elements that Motion JPEG2000 also has.

Anyone who can comment on what benefit or risk is inherent to FFV1 as a preservation choice over AVI uncompressed or JPEG2000 Lossless would be welcome.

Thanks very much.

David

David Stevenson
Restaurateur / Conservator
Centre Canadien d’Architecture / Canadian Centre for Architecture
1920, rue Baile, Montréal, Québec  H3H 2S6
T 514 939 7001 x 1204
F 514 939 7020
www.cca.qc.ca

Dave Rice

Sep 24, 2012, 4:01:52 PM
to archiv...@googlegroups.com
Hi David,
Firstly, I'm not speaking for Artefactual, but I have worked with both ffv1 and jpeg2000 in preservation environments and continue to use both. Comments follow inline.


On Sep 24, 2012, at 1:20 PM, davstev wrote:

> Hi,
>
> As an increasingly digital archive, one of the questions that we are dealing with here at CCA in Montreal surrounds the ideal preservation encoding for video.
> We are not alone in this quandary. We are considering committing to JPEG2000 lossless compression, or AVI uncompressed.

This is a somewhat unequal comparison. jpeg2000 is a codec but must rely on a container (either MXF or the MP4 family, such as QuickTime, JP2, Motion JPEG 2000, etc.) for some of its technical information (colorspace, etc.). "AVI uncompressed" implies raw video data within AVI, but there is a wide variety of pixel formats here, such as uyvy422 or yuyv422. The two options overlap in the source material they could feasibly support, but each also supports material that the other does not.

For instance, AVI does not work well with source material that uses a variable frame rate. AVI also does not have a standardized way to express display aspect ratio, though some players will take this data from the codec (for instance, the DV in DV/AVI can express a 16/9 or 4/3 aspect ratio). Uncompressed video in AVI is going to be treated as having square pixels: a 720x486 video stream in AVI cannot express its aspect ratio in a standardized way, so players will presume a 720/486 aspect. AVI also doesn't have a standardized method to incorporate timecode or caption tracks. Basically, AVI is one of the earliest and simplest digital video container formats and leaves a lot of the details to the codec. In the case of uncompressed video data within AVI, the codec will not provide much beyond the actual video data.
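
To put numbers on the aspect ratio point, here is a small worked sketch in Python. It treats the whole 720-sample width as picture, which is a simplification, and the function name is made up for illustration.

```python
from fractions import Fraction

def pixel_aspect_ratio(width, height, display_aspect):
    """Pixel aspect ratio needed so that width x height samples fill display_aspect."""
    storage_aspect = Fraction(width, height)
    return display_aspect / storage_aspect

# What a player presumes when the container carries no aspect data (square pixels).
assumed = Fraction(720, 486)

# What the pixels would need to be for the two common display aspects.
par_4_3 = pixel_aspect_ratio(720, 486, Fraction(4, 3))
par_16_9 = pixel_aspect_ratio(720, 486, Fraction(16, 9))

print(f"presumed display aspect: {assumed} ({float(assumed):.3f})")
print(f"pixel aspect needed for 4:3:  {par_4_3}")
print(f"pixel aspect needed for 16:9: {par_16_9}")
```

Neither result is 1, so a player that presumes square pixels will show the picture at roughly 1.48:1 instead of 4:3 or 16:9 unless something outside the AVI tells it otherwise.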

jpeg2000 lossless compression is a video codec, so presumably issues like timecode, captions, metadata, aspect ratio, etc can be left to the container.


> I noticed that Archivematica will normalize to FFV1 for preservation.
> This is slightly unusual in relation to what I have seen other archives going with.

I think FFV1 is becoming a less unusual choice. I participated in a panel at SAA about the use of FFV1 for digital audiovisual preservation. There are audiovisual vendors that now support the codec. The Federal Agencies Digitization Guidelines Initiative reviewed the codec in a recent audiovisual working group meeting. The Library of Congress published information on ffv1 on its digitalpreservation.gov site (http://www.digitalpreservation.gov/formats/fdd/fdd000349.shtml, http://www.digitalpreservation.gov/formats/fdd/fdd000343.shtml). I saw it plugged in the last AV Insider issue: http://www.prestocentre.org/avinsider. FFV1 is also incorporated as the preservation codec in other preservation systems, such as http://www.dva-profession.mediathek.at/.


> Can you shed some light on your reasoning for choosing this format?

I think FFV1 was initially implemented partly as a result of some research for an audiovisual digitization project at the City of Vancouver Archives. They were seeking to improve on the prior normalization selection in Archivematica for video, mpeg2. jpeg2000 was considered, but there were challenges in vendor support and in identifying a method for processing to it that aligned with Archivematica's licensing method. There are a variety of jpeg2000 tools available, but there were compatibility issues. Some jpeg2000 utilities were colorspace specific (the most widespread jpeg2000 applications are in still images, which mainly use the RGB colorspace, whereas most video is sourced from YUV materials). The YUV-based jpeg2000 applications that were available mainly would only work with 8 bit or 16 bit sampling, whereas SDI video is 10 bit. I tested one of the jpeg2000 toolsets that supports 10 bit and YUV encoding but could not get it to work losslessly with 10 bit YUV.

Since jpeg2000 is a means rather than a goal, we looked more at the objectives. The objectives were to encode video losslessly, so that the data that the original video decoded to could be exactly recreated from a lossless derivative, along with all the timing. We wanted to use a lossless codec that could flexibly support diverse sets of source video (so broad support for pixel formats, bit depths, chroma subsampling patterns, and colorspaces). Many lossless codecs support only narrow ranges of pixel formats, so this narrowed the list to jpeg2000, ffv1, and to some extent h264. Given that Archivematica already incorporated ffmpeg, which contains libavcodec, which is where the ffv1 codec is implemented, this was the suggestion.


> It might open up another option for us to consider; I know that FFV1 has good elements that Motion JPEG2000 also has.

FFV1 is a codec and Motion JPEG 2000 is a container, so they have very different technical elements. Motion JPEG 2000 is intended to contain jpeg2000, which can be used similarly to FFV1. FFV1 is a lossless codec and jpeg2000 can be used as a lossless codec. Beyond being lossless video codecs there are some scope differences: ffv1 can store its own aspect ratio data, whereas I don't think jpeg2000 can. Jpeg2000 is built on a wavelet transform, whereas ffv1 uses prediction followed by entropy coding. See: http://x264dev.multimedia.cx/archives/317. jpeg2000 incorporates error concealment and so do recent versions of ffv1; however, although ffv1 decoders can report on damaged frames that use error concealment, I don't know if any jpeg2000 decoders actually report on the use of error concealment.


> Anyone who can comment on what benefit or risk is inherent to FFV1 as a preservation choice over AVI uncompressed or JPEG2000 Lossless would be welcome.

As mentioned above, avi/uncompressed vs ffv1 vs jpeg2000 lossless is an uneven comparison, since it mixes codecs with codec-and-container combinations. AVI has some limitations which could dismiss it as a suitable preservation format for some types of source material, but an analysis shouldn't dismiss the option of a raw video codec because of AVI. Generally I'd recommend that an uncompressed codec may be the safer choice for an archive with limited digital preservation infrastructure, since the technology dependencies, consequences of digital damage, and requirements for access may be less severe than with lossless codecs. But in an environment using a repository system like Archivematica, the benefits of lossless encoding (in a disciplined environment of checksums and AIP management) would likely outweigh the simplicity offered by raw video codecs. It may be better to compare uncompressed (but define which type you're referring to) vs ffv1 vs jpeg2000, and then compare containers based on the outcome of the codec comparison.

I should also mention that FFV1 version 3 may be nearing release. It is now available in ffmpeg under experimental status while testing completes. Version 3 adds multithreaded encoding, which should speed up encoding by about 2.5 times. For long term digital preservation, FFV1 version 3 also incorporates mandated checksums per frame. Thus if any digital damage occurs to the file in storage or transmission, an ffv1 decoder should be able to identify which frames were affected, which makes the preservationist's response much more specific than if only a whole file checksum were available.
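
As a rough illustration of why per-frame checksums make the response more specific, here is a minimal Python sketch. It is not how FFV1 itself stores or verifies its checksums; the "frames" are made-up byte strings and the function names are hypothetical.

```python
import hashlib
import zlib

def frame_checksums(frames):
    """CRC32 of each frame's bytes, recorded at the time the frames are written."""
    return [zlib.crc32(frame) for frame in frames]

def damaged_frames(frames, stored):
    """Indexes of frames whose current CRC32 no longer matches the recorded value."""
    return [i for i, (frame, crc) in enumerate(zip(frames, stored))
            if zlib.crc32(frame) != crc]

# Five stand-in "frames" of 1 KiB each (made-up data, not real video).
frames = [bytes([i]) * 1024 for i in range(5)]
stored = frame_checksums(frames)
whole_file = hashlib.md5(b"".join(frames)).hexdigest()

# Simulate damage in storage: overwrite one byte in frame 3.
corrupted = list(frames)
corrupted[3] = b"\xff" + corrupted[3][1:]

file_changed = hashlib.md5(b"".join(corrupted)).hexdigest() != whole_file
print("whole-file checksum changed:", file_changed)
print("damaged frames:", damaged_frames(corrupted, stored))
```

The whole-file checksum only says that something changed somewhere; the per-frame list says which frame to inspect or restore from another copy.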

Best Regards,
Dave Rice

davstev

Sep 24, 2012, 5:17:32 PM
to archiv...@googlegroups.com
Hi Dave,
 
Thank you so much for such a thoughtful and informed answer. Now I need to digest what you've written. Your response is in keeping with what I have often heard: there's no one size fits all solution, only what works for one's own institution. I appreciate that you've stated the merits of all of these. We are leaning toward uncompressed video in AVI because of the ease of it, its lossless nature, the fact that we won't be embedding metadata into the file, and the widespread browser support for AVI. However, the file sizes that are created are huge in comparison to anything losslessly compressed. It is painful to digitize a VHS tape of mediocre-at-best quality only to realize that that half hour became tens of gigabytes. There is much to be said for lossless compression when considering the issue of file sizes alone. IT at my workplace has already weighed in with their concern about AVI uncompressed file sizes.
 
I get a bit tangled when I see your wording "I'd recommend that an uncompressed codec may be the safer choice...", as I thought that all codecs are necessarily compressed, but may be losslessly compressed.
 
In considering FFv1, I may get in touch with some contacts at the City of Vancouver Archives; they may also be able to weigh in on this given their use of FFv1.
 
Thanks again, I may pop up again with a comment or question after I've digested all of your offering.
 
Best regards,
David

Dave Rice

Sep 24, 2012, 5:33:06 PM
to archiv...@googlegroups.com

On Sep 24, 2012, at 5:17 PM, davstev wrote:

> Hi Dave,
>
> Thank you so much for such a thoughtful and informed answer. Now I need to digest what you've written. Your response is in keeping with what I have often heard: there's no one size fits all solution, only what works for one's own institution.

Eh, shared standards ain't that bad either.


> I appreciate that you've stated the merits of all of these. We are leaning toward uncompressed video in AVI because of the ease of it, its lossless nature,

Regarding the lossless nature of uncompressed video, there are probably a few different approaches to defining 'lossless', but the definition that I advocate for in video encoding is that the source video and the resulting lossless video should both decode to identical data. That means that the 'lossless' version should match the source video in colorspace, chroma subsampling, pixel format, bit depth, channel count, etc. If the result is designated to always be the same specific type of uncompressed video, then the results may not actually be lossless if there is variety in the source. For instance, converting an older Road Pizza QuickTime file, which uses 5 bit RGB, to 8 bit YUV 4:2:2 uncompressed video is not lossless (the resulting file would subsample the color data, force a colorspace change, and resample the data). Similarly, converting an RGBA animation video (with alpha channel) to uncompressed YUV is not lossless, since the alpha channel is omitted in the conversion. With video tape there exists a variety of chroma subsampling patterns, often 4:2:2 but also 4:1:0, 4:2:0, 3:1:1, etc. If the samples need to be processed in between the decoder (whether a deck or a codec) and the encoder, then the intent of losslessness may be lost. I think that using 'uncompressed' as a specific preservation term is troublesome, since it doesn't imply a bit depth, chroma subsampling, colorspace, alpha channel presence, etc. Also, some source digital formats, like DV, will incorporate additional data such as camera metadata, flags of the deck's reading success, start and stop markers, etc. that are not easy to represent in uncompressed form.
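
One way to test the "both decode to identical data" definition in practice is to hash every decoded frame of the source and of the derivative and compare the lists. The Python sketch below assumes ffmpeg is on the PATH and uses its framemd5 output; the helper names are made up, and the parsing assumes that comment lines start with "#" and that the hash is the last comma-separated field on each data line.

```python
import subprocess

def framemd5(path):
    """Return the lines ffmpeg's framemd5 output produces for the first video stream of `path`."""
    result = subprocess.run(
        ["ffmpeg", "-v", "error", "-i", path, "-map", "0:v:0", "-f", "framemd5", "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

def frame_hashes(lines):
    """Keep only the hash field of each data line, skipping '#' header lines."""
    return [line.rsplit(",", 1)[-1].strip()
            for line in lines
            if line.strip() and not line.startswith("#")]

def decodes_identically(source_lines, copy_lines):
    """True when both files yield the same sequence of per-frame hashes."""
    return frame_hashes(source_lines) == frame_hashes(copy_lines)

# Usage, with hypothetical file names:
# print(decodes_identically(framemd5("source.mov"), framemd5("copy.mkv")))
```

This only compares picture data frame by frame; timestamps are deliberately ignored here because different containers may express them in different time bases, so timing would need its own check. A derivative that changed pixel format, bit depth, or subsampling along the way should show up as a mismatch.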


> the fact that we won't be embedding metadata into the file, and the widespread browser support for AVI. However, the file sizes that are created are huge, in comparison to anything losslessly compressed. It is painful to digitize a VHS tape of mediocre-at-best quality only to realize that that half hour became tens of gigabytes.

Compounding the mediocreness of VHS with the additional mediocreness of lossy compression just increases the total mediocreness of the result. I suppose in this case I'd consider reducing the bit depth down to 8 rather than introducing lossy compression.
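
For a sense of scale, here is a back-of-the-envelope Python sketch of the picture data in half an hour of uncompressed 4:2:2 video. The 720x486 frame size comes from earlier in the thread; the 30000/1001 frame rate is an assumption for NTSC-derived tape, and audio, container overhead, and any row padding are left out.

```python
from fractions import Fraction

def uncompressed_bytes(width, height, bits_per_pixel, fps, seconds):
    """Raw picture data only: no audio, container overhead, or row padding."""
    bytes_per_frame = Fraction(width * height * bits_per_pixel, 8)
    return float(bytes_per_frame * fps * seconds)

NTSC_FPS = Fraction(30000, 1001)   # assumed frame rate, about 29.97 frames per second
HALF_HOUR = 30 * 60                # seconds

# 4:2:2 averages two samples per pixel: 16 bits at 8 bit depth, 20 bits at 10 bit depth.
size_8bit = uncompressed_bytes(720, 486, 16, NTSC_FPS, HALF_HOUR)
size_10bit = uncompressed_bytes(720, 486, 20, NTSC_FPS, HALF_HOUR)

print(f"8 bit 4:2:2:  {size_8bit / 1e9:.1f} GB")
print(f"10 bit 4:2:2: {size_10bit / 1e9:.1f} GB")
print(f"saving from dropping to 8 bit: {1 - size_8bit / size_10bit:.0%}")
```

Both figures land in the "tens of gigabytes" range described above, and going from 10 bit to 8 bit trims a fifth of the picture data without introducing a lossy codec.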


> There is much to be said for lossless compression when considering the issue of file sizes alone. IT at my workplace has already weighed in with their concern about AVI uncompressed file sizes.

I suspect this is a short term concern ;)


> I get a bit tangled when I see your wording "I'd recommend that an uncompressed codec may be the safer choice...", as I thought that all codecs are necessarily compressed, but may be losslessly compressed.

I just mean that it's technically easier to manage uncompressed files as compared to lossless files, since the decoders are simpler and more widespread.


> In considering FFv1, I may get in touch with some contacts at the City of Vancouver Archives; they may also be able to weigh in on this given their use of FFv1.

I'd also recommend the Österreichische Mediathek; they've had the longest use of ffv1 that I know of in archival settings.


> Thanks again, I may pop up again with a comment or question after I've digested all of your offering.

Certainly, I'm happy to help on this discussion.
Dave

Dave Rice

Sep 24, 2012, 5:36:21 PM
to archiv...@googlegroups.com
ug, sorry the threading got lost in this response...

Peter B.

Sep 25, 2012, 11:04:52 AM
to archiv...@googlegroups.com
Hello everyone!

I am working at the Austrian Mediathek [1], and I am the developer of its digital video archiving solution DVA-Profession [2], and responsible for communicating the needs between "the archive side" and FFv1's original author, Michael Niedermayer.

In 2010, we decided to use FFv1 as the video codec in an AVI container as our current archival format. We have been using it in our daily work since the beginning of 2011, and so far it has integrated perfectly into all our use cases.
I'll supply more details soon (or even more on demand), but I didn't want to flood you with background information before even saying "Hello" ;)

Dave Rice already mentioned the most important things regarding this issue, and I'd like to add that I agree with him in almost every point.
Although I do agree that often there might be no "one size fits all", I think it's in everyone's interest to find a shared standard in common work domains.

Personally, it's very interesting to see how easy it can be for any archive to suddenly have a sustainable video format they can actually work with, without requiring any big company to make it happen.

Greetings from Vienna,
Peter B.

== References:
[1] http://www.mediathek.at/
[2] http://dva-profession.mediathek.at/

Misty De Meo

Sep 25, 2012, 12:17:13 PM
to archiv...@googlegroups.com
Hi, David,

In addition to Dave's points, I would also point out that FFV1 is much easier to access. Support for JPEG2000 (either in Motion JPEG2000, or in MXF) is often quite spotty, both in commercial software and in open-source software. Somewhat ironic for a preservation format. ;) Support tends to be inconsistent as well, with videos which work in one piece of software failing to work in another. Good software support and flexibility would be a very important criterion for me when selecting a preservation format. 

A colleague of mine is in the process of rethinking his institution's preservation master standard, since he's found that accessing the JPEG2000-MXF files produced by SAMMA hardware has been quite challenging. Preserving the masters isn't very useful if it's hard to actually do anything with them.

FFV1's support in ffmpeg is a key factor for me – ffmpeg is so versatile that being able to work with FFV1-encoded video in ffmpeg guarantees that the preservation masters can be both easily played and easily converted to any other desired format.

Misty

--
You received this message because you are subscribed to the Google Groups "archivematica" group.
To view this discussion on the web visit https://groups.google.com/d/msg/archivematica/-/sd97zweu0gsJ.
To post to this group, send email to archiv...@googlegroups.com.
To unsubscribe from this group, send email to archivematic...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/archivematica?hl=en.

Dave Rice

Sep 25, 2012, 12:28:31 PM
to archiv...@googlegroups.com
Hi Misty, David,


On Tuesday, September 25, 2012 12:17:18 PM UTC-4, Misty De Meo wrote:
> Hi, David,
>
> In addition to Dave's points, I would also point out that FFV1 is much easier to access. Support for JPEG2000 (either in Motion JPEG2000, or in MXF) is often quite spotty, both in commercial software and in open-source software. Somewhat ironic for a preservation format. ;) Support tends to be inconsistent as well, with videos which work in one piece of software failing to work in another. Good software support and flexibility would be a very important criterion for me when selecting a preservation format.
>
> A colleague of mine is in the process of rethinking his institution's preservation master standard, since he's found that accessing the JPEG2000-MXF files produced by SAMMA hardware has been quite challenging. Preserving the masters isn't very useful if it's hard to actually do anything with them.

The combination of MXF and jpeg2000 requires some additional dependencies. It's possible to have an application that supports jpeg2000 and supports MXF but doesn't support them in combination. This is because jpeg2000 is not sufficiently self-descriptive on its own as a codec but relies on the container for some of the key metadata needed to enable decoding. The MP4 family, like MOV, JP2, Motion JPEG 2000, and others, uses the 'colr' atom to store this data. MXF is entirely different from these containers architecturally and doesn't use the 'colr' atom, thus there is a specific SMPTE standard that addresses how to use jpeg2000 in MXF. Within open source applications I have only seen support for that SMPTE spec in the gstreamer-bad-plugins. Many other players of MXF/jpeg2000 will falsely presume RGB (since that tends to be the default encoding for jpeg2000), and all this does cause some interoperability issues. Feasibly, using jpeg2000 in Motion JPEG 2000 or QuickTime may offer more support, but there's still an issue that many decoders are specific to certain sets of colorspace and bit depth, where often archivists need access to a variety of pixel formats to fully represent a collection. For instance, QuickTime can decode jpeg2000 through its old Kakadu-based decoder, but it only decodes RGB, so YUV-based material is simply decoded wrong.
 
> FFV1's support in ffmpeg is a key factor for me – ffmpeg is so versatile that being able to work with FFV1-encoded video in ffmpeg guarantees that the preservation masters can be both easily played and easily converted to any other desired format.

And [plug] come to the ffmpeg4archivists workshop at #AMIA12 taught by me and Misty!
Dave
 

Peter B.

Sep 26, 2012, 2:29:09 PM
to archiv...@googlegroups.com


On Tuesday, September 25, 2012 6:28:31 PM UTC+2, Dave Rice wrote:

> The combination of MXF and jpeg2000 requires some additional dependencies. It's possible to have an application that supports jpeg2000 and supports MXF but doesn't support them in combination. [...]

This is something which I find particularly interesting: the "industry-proposed standard container/codec combination" is so badly supported, even among proprietary applications, and still has interoperability issues, although the codec is already 12 years old.
I do have my assumptions why that is so, but I'd really like to hear your opinions as well.

As Misty mentioned:
it's actually difficult to access the "considered to be state-of-the-art for video archiving"-files from Samma *already* - and that, although there are so many people allegedly behind the MXF/JPEG2k combination, promoting it for archiving purposes.

Compare that with FFv1, which not only works *already* on a wider range of applications and setups, but also comes with source code under a free license, which makes it virtually impossible for this format ever to become technically inaccessible.

Pb

davstev

Sep 26, 2012, 3:21:22 PM
to archiv...@googlegroups.com
Hi all,
 
Thanks for your input! I hope this thread is remaining topical for Archivematica. With this FFV1 option I am reminded that my hesitance to commit to jpeg2000 is justified.

Regarding FFV1, are any or all of you using it with a Matroska wrapper?

Any idea why FFV1 isn't more commonly acknowledged as an archival option? There are evidently many supporters who are ready to espouse its virtues over the others.
 
David

Peter B.

Sep 27, 2012, 3:50:58 AM
to archiv...@googlegroups.com


On Wednesday, September 26, 2012 9:21:22 PM UTC+2, davstev wrote:
> Regarding FFV1, are any or all of you using it with a Matroska wrapper?

As far as I know, the  "City of Vancouver Archives" are using FFv1 in a Matroska container as their preferred archival format.
Here at the Austrian Mediathek we might also go for MKV in the (near) future, but that mainly depends on support of MKV within the applications we need. The more non-proprietary applications we're using, the easier it will be.


> Any idea why FFV1 isn't more commonly acknowledged as an archival option? There are evidently many supporters who are ready to espouse its virtues over the others.

My personal experience from talking to (a) other archivists and (b) vendors of archiving products is as follows:

*) Broadcasting archives:
They have an image to lose and a production toolchain to satisfy, and therefore prefer other formats. They are very reluctant to use anything which is not commonly considered to be a "valid solution" in that domain. They often have enough money to throw at something to "make it work somehow", even if there is an incredible amount of unnecessary overhead involved. Additionally, the managers deciding which "solution" (=products) to use are very often not the ones actually working with it. And once a decision has been made, it's hard to go back, because of the high probability of vendor lock-in, and especially because they might lose face by admitting that they might have made a wrong decision.

*) Archive-archives:
Most of them do not have the in-house technical knowledge to argue "why" they would use something which is not a commonly known solution among audio/video professionals, especially not when it gets digital. It is rather the exception that archivists can even distinguish between a container and a codec for digital video. Most importantly, they don't have a lot of money to gamble with, so they usually go and do it the "safe way". This means that they use what the others (=usually broadcasters) do.
It's sad to see that an offer of a viable solution to a problem like digital video archiving is more often shot at than looked at.

*) Vendors of archiving products:
There are not that many of them in general, but if someone is developing for "an archive market", they usually target broadcasters and offer what broadcasters demand, which, in practice, is what their production toolchain can handle. This is probably the main reason why it's more interesting for vendors to support lossy formats (DVCPRO50, J2K lossy) and the MXF container, rather than something which would be better from an archivist's point of view.

I know of at least one vendor who clearly told me:
"Why should I take the hassle to support FFv1 and e.g. Matroska, if none of my customers even demand it? And: It's open-source. Our customers think it's unprofessional, and we don't want this to change, because we're doing business with software here. Are we understood?"

It seems that BBC is actually one of the few in that domain who understand that open source can be very professional:
BBC created their own video-archiving system "Ingex" themselves. They're using "ffmbc" (=FFmpeg customized for broadcast and professional usage) as part of Ingex and have its main developer on their part-time payroll.


I think that if archive-archives demanded solutions that suit the purpose of long-term preservation, rather than living with the side-products they're being offered, things would look different.

davstev

Sep 27, 2012, 11:25:17 AM
to archiv...@googlegroups.com
Peter,
 
Your arguments are insightful and compelling. Thanks for your contributions.
 
We ran a test last evening transcoding to FFV1 from DV, using code sourced from FFmpeg, and it went well. No glitches, and it produced good images and reasonably sized files.
 
David
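
For anyone wanting to try a similar test, here is a hedged Python sketch that builds (but does not run) an ffmpeg command line for a DV to FFV1 transcode. It is not the command used in the test above, which the thread does not show. The file names and the Matroska target are made up for illustration, and builds from the period of this thread treated FFV1 version 3 as experimental, so they may need extra flags.

```python
import shlex

def ffv1_command(source, target, version=3):
    """Build (but do not run) an ffmpeg command line for a transcode to FFV1."""
    return [
        "ffmpeg",
        "-i", source,
        "-map", "0",              # keep every stream from the source
        "-c:v", "ffv1",           # encode video with FFV1
        "-level", str(version),   # FFV1 version; 3 is the one that adds checksums
        "-g", "1",                # every frame is a keyframe
        "-slicecrc", "1",         # store CRCs so a decoder can flag damage
        "-c:a", "copy",           # leave the audio as it is
        target,
    ]

# Hypothetical file names, for illustration only.
cmd = ffv1_command("tape_001.dv", "tape_001.mkv")
print(shlex.join(cmd))
```

Building the argument list separately makes it easy to log the exact command alongside the resulting file, which is useful provenance for a preservation master.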