Hi David,
Firstly, I'm not speaking for Artefactual, but I have worked with both ffv1 and jpeg2000 in preservation environments and continue to use both. Comments follow inline.
On Sep 24, 2012, at 1:20 PM, davstev wrote:
> Hi,
> As an increasingly digital archive, one of the questions that we are dealing with here at CCA in Montreal surrounds the ideal preservation encoding for video.
> We are not alone in this quandary. We are considering committing to JPEG2000 lossless compression, or AVI uncompressed.
This is a somewhat unequal comparison. jpeg2000 is a codec but must rely on a container (either MXF or the MP4 family, such as QuickTime, JP2, Motion JPEG 2000, etc.) for some of its technical information (colorspace, etc.). AVI uncompressed implies raw video data within AVI, but there is a wide variety of pixel formats here, such as uyvy422 or yuyv422. The two options overlap in the source material they could feasibly support, but each also supports material that the other does not.
For instance, AVI does not work well with source material that uses a variable frame rate. AVI also does not have a standardized way to express display aspect ratio, though some players will support this data from the codec (for instance, the DV in DV/AVI can express a 16/9 or 4/3 aspect ratio). Uncompressed in AVI is going to assume square pixels, so a 720x486 video stream in AVI will not be able to express aspect in a standardized way and players will presume a 720/486 aspect. AVI also doesn't have a standardized method to incorporate timecode or caption tracks. Basically, AVI is one of the earliest and simplest digital video container formats and leaves a lot of the details to the codec. In the case of uncompressed video data within AVI, the codec will not provide much beyond the actual video data.
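To make the aspect ratio point concrete, here is a rough sketch of the arithmetic in Python. The function names are made up for illustration, and the 4/3 target is just the example case mentioned above; this ignores the active-picture conventions that broadcast specifications use when defining pixel shapes.

```python
from fractions import Fraction

def storage_aspect(width, height):
    """Aspect ratio a player presumes when it assumes square pixels."""
    return Fraction(width, height)

def pixel_aspect_needed(width, height, display_aspect):
    """Pixel aspect ratio required to show the frame at display_aspect."""
    return Fraction(display_aspect) / Fraction(width, height)

# A 720x486 frame treated as square pixels: 40/27, roughly 1.48
presumed = storage_aspect(720, 486)

# Pixel shape needed to present the same frame at 4/3: 9/10
needed = pixel_aspect_needed(720, 486, Fraction(4, 3))

print(presumed, float(presumed))
print(needed)
```

Since 40/27 is neither 4/3 nor 16/9, a player with no aspect information from the container or codec will show the picture slightly stretched.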
jpeg2000 lossless compression is a video codec, so presumably issues like timecode, captions, metadata, aspect ratio, etc. can be left to the container.
> I noticed that Archivematica will normalize to FFV1 for preservation. This is slightly unusual in relation to what I have seen other archives going with.
I think FFV1 is becoming the less unusual choice. I participated in a panel at SAA about the use of FFV1 for digital audiovisual preservation. There are audiovisual vendors that now support the codec. The Federal Agencies Digitization Guidelines Initiative reviewed the codec in a recent audiovisual working group meeting. The Library of Congress published information on ffv1 on their digitalpreservation.gov site (http://www.digitalpreservation.gov/formats/fdd/fdd000349.shtml, http://www.digitalpreservation.gov/formats/fdd/fdd000343.shtml). I saw it plugged in the last AV Insider issue: http://www.prestocentre.org/avinsider. FFV1 is also incorporated as the preservation codec in other preservation systems, such as http://www.dva-profession.mediathek.at/.
> Can you shed some light on your reasoning for choosing this format?
I think initially FFV1 was implemented partly as a result of some research for an audiovisual digitization project at the City of Vancouver archive. They were seeking to improve on the prior normalization selection in Archivematica for video, mpeg2. jpeg2000 was considered, but there were challenges in vendor support and in identifying a method for processing to it that aligned with Archivematica's licensing method. There are a variety of jpeg2000 tools available, but there were compatibility issues. Some jpeg2000 utilities were colorspace specific (the most widespread jpeg2000 applications are in still images, which mainly utilize the rgb colorspace, whereas most video is sourced from YUV materials). The YUV-based jpeg2000 applications available would mainly only work in 8 bit or 16 bit sampling, whereas SDI video is 10 bit. I tested one of the jpeg2000 toolsets that supports 10 bit and YUV encoding but could not get it to work losslessly with 10 bit yuv.
Whereas jpeg2000 is a process, we looked more at the objectives. The objectives were to encode video losslessly, so that the data the original video decoded to, along with all the timing, could be exactly recreated from the lossless derivative. We wanted to use a lossless codec that could flexibly support diverse sets of source video (so broad support for pixel formats, bit depths, chroma subsampling patterns, and colorspaces). Many lossless codecs support only narrow ranges of pixel formats, so this narrowed the list to jpeg2000, ffv1, and to some extent h264. Given that Archivematica already incorporated ffmpeg, which contains libavcodec, which is where the ffv1 codec is implemented, ffv1 was the suggestion.
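As a rough illustration of that objective, here is a minimal Python sketch of a losslessness check: hash each decoded frame of the original and of the derivative and compare them in order. It assumes the decoded frames are already available as byte strings, and the function names are hypothetical. In practice a tool such as ffmpeg's framemd5 output can produce per-frame hashes of decoded data for this kind of comparison.

```python
import hashlib

def frame_hashes(frames):
    """MD5 of each decoded frame (frames are assumed to be raw bytes)."""
    return [hashlib.md5(frame).hexdigest() for frame in frames]

def first_mismatch(original_frames, derivative_frames):
    """Index of the first frame that differs, or None if both decode identically."""
    original = frame_hashes(original_frames)
    derivative = frame_hashes(derivative_frames)
    for index, (a, b) in enumerate(zip(original, derivative)):
        if a != b:
            return index
    if len(original) != len(derivative):
        # One stream ended early: the first missing frame counts as a mismatch.
        return min(len(original), len(derivative))
    return None

# Stand-in data; real frames would come from decoding both files.
source = [b"frame-0", b"frame-1", b"frame-2"]
lossless_copy = list(source)
print(first_mismatch(source, lossless_copy))
```

A result of None means every decoded frame matched. Timing would need a separate comparison, for instance of the timestamps reported for each frame.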
> It might open up another option for us to consider; I know that FFV1 has good elements that Motion JPEG2000 also has.
FFV1 is a codec and Motion JPEG 2000 is a container, and thus they have very different technical elements. Motion JPEG 2000 is intended to contain jpeg2000, which can be used similarly to FFV1. FFV1 is a lossless codec and jpeg2000 can be used as a lossless codec. Beyond being lossless video codecs there are some scope differences: ffv1 can store its own aspect ratio data, whereas I don't think jpeg2000 can. jpeg2000 is built on a wavelet transform, whereas ffv1 uses prediction followed by entropy coding. See: http://x264dev.multimedia.cx/archives/317. jpeg2000 incorporates error concealment and so do recent versions of ffv1; however, although ffv1 decoders can report on damaged frames that utilize error concealment, I don't know if any jpeg2000 decoders actually report on the use of error concealment.
> Anyone who can comment on what benefit or risk is inherent to FFV1 as a preservation choice over AVI uncompressed or JPEG2000 Lossless would be welcome.
As mentioned above, avi/uncompressed vs ffv1 vs jpeg2000 lossless is an uneven comparison, since it mixes codecs with codec and container combinations. AVI has some limitations which could dismiss it as a suitable preservation format for some types of source material, but an analysis shouldn't dismiss the option of a raw video codec because of AVI. Generally I'd recommend that an uncompressed codec may be the safer choice for an archive with limited digital preservation infrastructure, since the technology dependencies, consequences of digital damage, and requirements for access may be less severe than with lossless codecs. But in an environment using a repository system like Archivematica, the benefits of lossless encoding (in a disciplined environment of checksums and AIP management) would likely outweigh the simplicity offered by raw video codecs. It may be better to compare uncompressed (but define which type you're referring to) vs ffv1 vs jpeg2000, and then compare containers based on the outcome of the codec comparison.
I should also mention that FFV1 version 3 may be nearing release. It is now available in ffmpeg under experimental status as testing completes. Version 3 adds multithreaded encoding, which should speed up encoding by about 2.5 times. For long term digital preservation, FFV1 version 3 also incorporates mandated checksums per frame. Thus if any digital damage occurs to the file in storage or transmission, an ffv1 decoder should be able to identify which frames were affected, which makes the preservationist's response much more specific than if they only had a whole file checksum to work with.
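To show why per-frame checksums give a more specific answer than a whole file checksum, here is a small conceptual sketch in Python. It is not how FFV1 lays out its data; the frames are stand-in byte strings and the function names are hypothetical.

```python
import hashlib
import zlib

def frame_crcs(frames):
    """CRC-32 of each frame, computed at encode time."""
    return [zlib.crc32(frame) for frame in frames]

def damaged_frames(frames, stored_crcs):
    """Indexes of frames whose current CRC no longer matches the stored one."""
    return [
        index
        for index, (frame, crc) in enumerate(zip(frames, stored_crcs))
        if zlib.crc32(frame) != crc
    ]

frames = [b"frame-0", b"frame-1", b"frame-2"]
stored = frame_crcs(frames)
whole_file = hashlib.md5(b"".join(frames)).hexdigest()

# Simulate damage to one byte of the second frame.
frames[1] = b"frXme-1"

# The whole file checksum only says that something changed.
print(hashlib.md5(b"".join(frames)).hexdigest() != whole_file)
# The per-frame checksums say which frame changed.
print(damaged_frames(frames, stored))
```

With only the whole file checksum the response is to replace or distrust the entire file; with per-frame checksums only the listed frames need attention.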
Best Regards,
Dave Rice