After being inspired by Jenny Mitcham’s blog post about signature development and encouraged by my colleague at Archives New Zealand, Ross Spencer, I decided to try my hand at creating a signature as well. I noticed there were a few folders of unidentified PDFs in the govdocs corpus and, after some investigation, realized some were wrapped in a lightweight container format called AppleSingle, which Apple developed to store Mac OS files across different filesystems (http://fileformats.archiveteam.org/wiki/AppleSingle). I didn’t see a PUID for AppleSingle in PRONOM, so I decided to take a crack at it.
According to the documentation (see http://macjournals.com/special/AppleSingle-Double.pdf, http://apple2online.com/web_documents/ft_e0.0001_applesingle.pdf and https://tools.ietf.org/html/rfc1740), and as confirmed by the sample files, the AppleSingle version 2 format can be identified as follows:
A 4 byte Magic Number that is always at the beginning of the file
00 05 16 00
Followed by a 4 byte version number
00 02 00 00
(The version byte sequence for version 1 is 00 01 00 00, but as I didn’t have any version 1 files at the ready, I didn’t have anything to test against. I’ll attach my sig for version 1 as well, but with that caveat.)
Those 8 bytes are followed by a 16-byte filler. All my sample files had fillers that were all zeroes, but I saw some documentation that said that home file system data could be kept in this filler, so I left that part of the header out of the signature.
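The byte check described above can be sketched in a few lines of Python. This is only an illustration of the same logic the signature expresses, not the PRONOM signature itself; the function name applesingle_version is made up for this example.

```python
# Magic number and version sequences as described in the post above.
APPLESINGLE_MAGIC = bytes.fromhex("00051600")
VERSIONS = {
    bytes.fromhex("00010000"): 1,  # untested: no version 1 samples were available
    bytes.fromhex("00020000"): 2,
}


def applesingle_version(header: bytes):
    """Return 1 or 2 if the bytes start with an AppleSingle signature, else None.

    Only the first 8 bytes are inspected; the 16-byte filler that follows
    is deliberately ignored, as in the signature described above.
    """
    if len(header) < 8 or header[:4] != APPLESINGLE_MAGIC:
        return None
    return VERSIONS.get(header[4:8])
```

To try it on a real file, read the first 8 bytes (for example with open(path, "rb").read(8)) and pass them in.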
The ‘regular’ body of the file (e.g. a PDF) follows the AppleSingle header, so AppleSingle acts as a kind of container, I suppose. I don’t know if PRONOM/DROID is open to introducing a new kind of container format, or what the history of PRONOM and these kinds of Apple containers is (after looking around, I found references to a few related formats, AppleDouble and MacBinary, that I want to look into as well). For preservation purposes, it should probably at least be noted in any future PRONOM entry that any files identified as this format deserve investigation into the payload, because it’s likely that AppleSingle contains another file type.
The signature I’ve created for version 2 works for the AppleSingle files in the govdocs corpus. I also ran it against Ross Spencer’s skeleton suite and the entire govdocs corpus, and got no false positives.
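For anyone who wants to look at the payload, here is a rough Python sketch of how the wrapped file could be located. It assumes the layout summarised in RFC 1740 Appendix A: after the 8 signature bytes and 16 filler bytes comes a 2-byte entry count, then one 12-byte descriptor per entry (entry ID, offset, length, all big-endian), with entry ID 1 being the data fork. The function name find_data_fork is made up for this example, and I have only reasoned about it against a synthetic header, not real files.

```python
import struct

APPLESINGLE_MAGIC = b"\x00\x05\x16\x00"
DATA_FORK_ID = 1  # per RFC 1740 Appendix A, entry ID 1 is the data fork


def find_data_fork(blob: bytes):
    """Return (offset, length) of the data fork entry, or None if not found."""
    if len(blob) < 26 or blob[:4] != APPLESINGLE_MAGIC:
        return None
    # The entry count sits after magic (4) + version (4) + filler (16).
    (count,) = struct.unpack_from(">H", blob, 24)
    for i in range(count):
        pos = 26 + 12 * i
        if pos + 12 > len(blob):
            return None  # descriptor table is truncated
        entry_id, offset, length = struct.unpack_from(">III", blob, pos)
        if entry_id == DATA_FORK_ID:
            return offset, length
    return None
```

Slicing the file with the returned offset and length should give the bytes of the wrapped file (the PDF, in the govdocs cases), which can then be identified on its own.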
As part of this thread I’m looking for:
I am attaching some of the sample files from govdocs, as well as the signatures for version 1 and version 2, again noting that I don’t have any version 1 AppleSingle files. It just seemed funny to write a sig for version 2 without there being a version 1!
Thanks for your help and consideration,
Andrea Byrne
The references you've provided are pretty clear so I'm more than happy to add the entry for AppleSingle, noting in the format description that investigation of the payload would be prudent. Were you able to find any reference to the differences between versions 1 and 2?
For the extra information for the PRONOM entry:
The official mimetype is listed as application/applefile
Description:
Apple Computer created the AppleSingle format (as well as the AppleDouble format) to represent and preserve the attributes of files across file systems that do not support the same attributes as the file’s home system. AppleSingle is primarily a storage format and contains both a file’s contents and attributes. It consists of a header and one or more optional data entries followed by the byte stream of the file it is storing. It maintains the original Macintosh filename and file type (e.g. Mydoc.pdf). The AppleSingle format can wrap files of many different formats.
Developed by: Apple Computer
Documentation: https://tools.ietf.org/html/rfc1740 http://apple2online.com/web_documents/ft_e0.0001_applesingle.pdf http://macjournals.com/special/AppleSingle-Double.pdf
Signature description:
BOF: “….....” 4 byte AppleSingle Magic Number, followed by 4 byte version number
I’m still doing some research on the format, and would be happy to flesh the entry out more if needed.
Andy Jackson was kind enough to search the UK Web Archive for files that include the AppleSingle magic number. Because of ANZ proxy stuff, I’ve been unable to download the entire payload so far, but I have a few more examples (that aren’t PDFs!) I can share from that collection. As far as I know from what I’ve read and looked at, the relationship is 1:1.
MacBinary does look tricky, and I was going to try to tackle that next. I have a few examples from the govdocs corpus, and the spec for MacBinary II (http://files.stairways.com/other/macbinaryii-standard-info.txt) says:
It is possible to write a much more robust routine, by checking the
following:
Offsets 101-125, Byte, should all be 0.
Offset 2, Byte, (the length of the file name) should be in the range of 1-63.
Offsets 83 and 87, Long Word, (the length of the forks) should be in the
range of 0-$007F FFFF
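The three checks quoted above could be sketched in Python roughly as follows. This is an untested illustration, not a signature: the function name looks_like_macbinary is made up, and it assumes a 128-byte header with big-endian fork lengths. One caveat is that the quoted passage puts the filename length at offset 2, whereas the header layout in the MacBinary specs, as far as I can tell, puts it at offset 1; the sketch defaults to 1 and leaves the offset as a parameter so either reading can be tried.

```python
import struct

MAX_FORK_LENGTH = 0x007FFFFF  # upper bound for fork lengths in the quoted text


def looks_like_macbinary(header: bytes, name_length_offset: int = 1) -> bool:
    """Apply the three heuristic checks from the quoted MacBinary II text.

    name_length_offset is an assumption: the quoted passage says offset 2,
    the header layout appears to use offset 1.
    """
    if len(header) < 128:
        return False
    # Offsets 101-125 (inclusive) should all be zero.
    if any(header[101:126]):
        return False
    # The filename length should be in the range 1-63.
    if not 1 <= header[name_length_offset] <= 63:
        return False
    # Data and resource fork lengths are long words at offsets 83 and 87.
    data_len, rsrc_len = struct.unpack_from(">II", header, 83)
    return data_len <= MAX_FORK_LENGTH and rsrc_len <= MAX_FORK_LENGTH
```

Checks this loose will probably match some unrelated files, so results from a sketch like this would need to be compared against known samples before drawing any conclusions.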
So… fun! I’ll try and take a look into that 'BinHex Binary Text' format, too. These Apple header type files are a rabbit hole I’m happy to fall into.
You may find RFC 1740 useful.
https://tools.ietf.org/html/rfc1740#appendix-A
It contains summary descriptions of AppleSingle and AppleDouble formats.
Regards
Matt