Interpreting results


Calvin Lawrence

Sep 10, 2015, 2:18:59 PM
to droid-list
Hello,

Can anyone tell me what file attributes are used by Droid to characterize a file and perform signature verification?  I want to understand how to quantify the severity of signature mismatches.  For example, I routinely get mismatches with M4V files from iPhones.  I tried changing the file extension to MP4 but that had no effect.  The Droid report shows that the format name is Quicktime but the file extension is M4V.  Is it as simple as Droid wants to see the extension MOV for Quicktime and when it doesn't it issues a mismatch? Or is there something else that caused the mismatch?

Thank you,
Calvin

ross-spencer

Sep 10, 2015, 6:49:52 PM
to droid-list
Hi Calvin,

In your case, if you renamed your files with the extension MOV or QTM then it would remove the extension mismatch but you're probably looking at identifying M4V more specifically. You can see the patterns that DROID matches here: http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=658&strPageToDisplay=signatures

And the primary record is here: http://www.nationalarchives.gov.uk/PRONOM/x-fmt/384 - The link above *should* take you to the signatures tab. 

You will see a number of signatures DROID uses under the signature tab. DROID doesn't currently tell you which it matches against so a useful tool to help you might be Richard Lehane's Siegfried (SF): https://github.com/richardlehane/siegfried 

While I haven't an M4V I do have access to files with .MOV extension in the opf-format-corpus and I can generate the identification:

C:\Working\git\opf-format-corpus\format-corpus\video\Quicktime>sf xdcam-ex-1080i50.mov
---
siegfried   : 1.3.0
scandate    : 2015-09-11T10:42:10+12:00
signature   : pronom.sig
created     : 2015-08-27T19:32:17+12:00
identifiers :
  - name    : 'pronom'
    details : 'DROID_SignatureFile_V82.xml; container-signature-20150327.xml; built without reports'
---
filename : 'xdcam-ex-1080i50.mov'
filesize : 699984
modified : 2015-08-27T18:48:05+12:00
errors   :
matches  :
  - id      : pronom
    puid    : x-fmt/384
    format  : 'Quicktime'
    version :
    mime    : 'video/quicktime'
    basis   : 'extension match; byte match at 0, 12 (signature 8/11)'
    warning :


Which, if I look it up in PRONOM, *should* be the eighth sequence:

Byte sequences
Position type: Absolute from BOF
Byte order:
Value: 000000{1}6674797071742020
Name: Quicktime 8
Description: Looking for two atoms: 'wide' and 'mvhd'
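
To make the byte sequence value above concrete, here is a minimal Python sketch of how such a pattern could be checked by hand. It assumes that `{1}` means "skip exactly one byte of any value" and that the sequence is anchored at the beginning of the file; the function names are hypothetical, and this is not DROID's or Siegfried's actual matching code.

```python
import re

# PRONOM-style value from the record: 000000{1}6674797071742020
# Assumption: three zero bytes, one wildcard byte, then a fixed 8-byte run.
FIXED_TAIL = bytes.fromhex("6674797071742020")  # decodes to b"ftypqt  "
QUICKTIME_8 = re.compile(rb"\x00{3}." + re.escape(FIXED_TAIL), re.DOTALL)


def matches_quicktime_8(header: bytes) -> bool:
    """Return True if the bytes at the start of the file fit the pattern."""
    # re.match anchors at offset 0, mirroring "Absolute from BOF".
    return QUICKTIME_8.match(header) is not None


def file_matches(path: str) -> bool:
    """Read the first 12 bytes of a file and test them against the pattern."""
    with open(path, "rb") as handle:
        return matches_quicktime_8(handle.read(12))
```

The pattern covers 12 bytes from offset 0, which lines up with the "byte match at 0, 12" basis in the SF output.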

If you run SF against your file it will tell you specifically which match you are seeing, and then it might be worth considering if PRONOM needs a separate record for M4V, or if it would be good enough to ask TNA to add M4V as an extension to this record above to remove your 'mismatches'. Without broader AV/container knowledge myself, I can't really comment without more research.

I hope that helps. 

Ross

Calvin Lawrence

Sep 12, 2015, 3:22:02 PM
to droid-list
Thanks for your reply.  I changed the file extension from M4V to MOV and Droid no longer indicated a mismatch.  I had a similar situation with some old NEF (Nikon RAW) files.  Droid identified them as TIF.  Changing the file extension from NEF to TIF eliminated the mismatch.  As a general rule, is changing file extensions to eliminate Droid signature mismatches a good idea?  It seems ripe for unintended consequences.

Dclipsham

Sep 12, 2015, 5:19:59 PM
to droid...@googlegroups.com
Hi Calvin,

Generally speaking, DROID is pretty dumb - by which I mean it has a known set of patterns (as provided by the PRONOM signature file), and has no built-in learning, or predictive, or possibility-based mechanism. By default, if it encounters a signature hit, it will give a positive. If it doesn't, it will go to the next best: an extension match, else it will report nothing.
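
The decision order described above can be sketched in a few lines of Python. This is only an illustration: `KNOWN_EXTENSIONS` is a hypothetical, tiny stand-in for the PRONOM data (using the MOV and QTM extensions mentioned earlier in the thread), the `identify` function and `Result` type are invented for the example, and treating a missing extension as a mismatch is an assumption rather than confirmed DROID behaviour.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Optional

# Hypothetical stand-in for the extensions PRONOM lists per format.
KNOWN_EXTENSIONS = {"Quicktime": {"mov", "qtm"}}


@dataclass
class Result:
    format: Optional[str]      # identified format name, or None
    method: str                # "Signature", "Extension", or "" for no result
    extension_mismatch: bool   # True when a signature hit disagrees with the extension


def identify(filename: str, signature_hit: Optional[str]) -> Result:
    """Signature hit first, then extension match, otherwise nothing."""
    ext = PurePath(filename).suffix.lstrip(".").lower()
    if signature_hit is not None:
        known = KNOWN_EXTENSIONS.get(signature_hit, set())
        # Assumption: an absent extension also counts as a mismatch.
        return Result(signature_hit, "Signature", ext not in known)
    for fmt, extensions in KNOWN_EXTENSIONS.items():
        if ext in extensions:
            return Result(fmt, "Extension", False)
    return Result(None, "", False)
```

With this toy table, `identify("clip.m4v", "Quicktime")` reports a signature identification with a mismatch, while `identify("clip.mov", None)` falls back to the extension.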

In your case, you have a signature match, but an extension mismatch. Generally this is a prompt for investigation, rather than a 'good' or 'bad'. It may be important to understand why there is a mismatch.

In pre-Windows days, and indeed in most Unix-based environments, including most Linux brews and Mac OS, adding a file extension (such as .doc, .mov etc) was not a requirement or even an expectation, but for many types of file, it became a convention. DROID recognises the file from its known signatures first; only if no signature matches does it fall back to its known extensions.

MP4 is a video and audio container, developed by the International Organization for Standardization, but built to a degree upon Apple's QuickTime format (there's a whole history behind its current state I'm unable to adequately outline). The file format contains an 'ftyp' (file type) descriptor, which tends to be used to determine the exact type of format a given file is, which may typically be used by media viewing programs to assist with playback. There are official ftyp designations, and many unofficial ones, but a fairly comprehensive list can be found here: http://www.ftyps.com/
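
If you want to see the 'ftyp' value in one of your own files, a short Python sketch along these lines may help. It assumes the 'ftyp' box is the very first box in the file (usual, but not guaranteed) and does not handle the special 64-bit or "to end of file" box sizes; `parse_ftyp` and `read_ftyp` are hypothetical helper names, not part of DROID or any library.

```python
import struct
from typing import List, Optional, Tuple


def parse_ftyp(header: bytes) -> Optional[Tuple[str, int, List[str]]]:
    """Return (major brand, minor version, compatible brands), or None."""
    if len(header) < 16:
        return None
    # Box layout: 4-byte big-endian size, then a 4-byte type code.
    size, box_type = struct.unpack(">I4s", header[:8])
    if box_type != b"ftyp":
        return None
    major_brand = header[8:12].decode("latin-1")
    minor_version = struct.unpack(">I", header[12:16])[0]
    # The remainder of the box is a list of 4-byte compatible brands.
    end = min(size, len(header))
    compatible = [header[i:i + 4].decode("latin-1") for i in range(16, end - 3, 4)]
    return major_brand, minor_version, compatible


def read_ftyp(path: str) -> Optional[Tuple[str, int, List[str]]]:
    """Parse the ftyp box from the first bytes of a file on disk."""
    with open(path, "rb") as handle:
        return parse_ftyp(handle.read(64))
```

Running `read_ftyp` on one of the iPhone files would show the brand that the file declares, which is the value a new PRONOM signature would need to cover.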

What you likely have is a legitimate .m4v file, commonly known as Apple iTunes Video, which itself is a legitimate MP4, that contains an 'ftyp' identifier tag currently unknown to PRONOM, which is the technical registry that drives DROID.

It may be desirable for PRONOM (pro...@nationalarchives.gov.uk) to add this identifier tag for MP4, or to give M4V its own unique identifier within PRONOM.

Speaking more widely, I mentioned earlier that an extension mismatch should be a cue for investigation, particularly before ingesting into an archive. It is usually important to understand why such a mismatch has occurred. To give some examples:

A file with no extension that identifies through DROID as a Word file (.doc) - unusual but not wholly unheard of. Older versions of MS Word did not necessarily expect an extension.

A file with an extension .jpg that identifies as a .exe - this would be alarming: you essentially have a Windows Executable file that is masquerading as a JPEG image file.

A file that identifies as an MPEG Video-2 file, but has an extension .mpx - this is unlikely to be a problem - it is likely (but by no means certain) either the generating software gave it an unusual extension, or the person saving it did. It may warrant a deeper look.

DROID is intended to give a good, at-a-glance look at the contents of a given file collection. It has no sense of file validity or conformance to any particular file format specification, but it will help to determine if you have something unexpected, perhaps broken, or unforeseen. Depending on local policy, an extension mismatch could be treated as a red herring, a distraction, or a cause for serious concern.
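
As a rough illustration of turning the three examples above into a local policy, here is a small Python sketch. The severity labels, the `triage` function, and the rule that looks for the word "executable" in the format name are all assumptions made up for this example; they are not anything DROID reports, and a real policy would need to reflect your own collection.

```python
def triage(identified_format: str, extension: str) -> str:
    """Classify an extension mismatch as "low", "high", or "review"."""
    ext = extension.lstrip(".").lower()
    if not ext:
        # No extension at all: unusual, but often harmless (older software).
        return "low"
    # Hypothetical rule: executables hiding behind another extension are alarming.
    if "executable" in identified_format.lower() and ext not in {"exe", "dll"}:
        return "high"
    # Everything else is a prompt for a closer look, not a verdict.
    return "review"
```

Under these assumed rules, an extensionless Word file comes out as "low", an executable named .jpg as "high", and an MPEG video named .mpx as "review".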

Any feedback on DROID and PRONOM is welcome: pro...@nationalarchives.gov.uk