How to identify / track if a PUID of an item has changed over years?

prasad rao

Jun 2, 2014, 4:05:23 AM
to droid...@googlegroups.com
Hi Experts,

I am not sure whether this is a valid question, but here goes ..

I understand that DROID generates an item's PUID on the basis of its internal byte sequence (start and end sequences up to X bytes) and the MIME type of the file.

Is it possible that the PUID assigned to any such file can change over the years in the PRONOM registry?

For example: if a plain Word document was represented as "fmt/411", can one safely assume that this PRONOM representation will never change?

If it does change, how does one track it in an automated fashion? (Any pointers are appreciated.)

We are looking to integrate DROID with our DMS solution and it is quite important that any such changes are tracked and the relevant metadata is updated in our solution.

Regards
Prasad Rao

Dclipsham

Jun 2, 2014, 5:30:42 AM
to droid...@googlegroups.com
Hi Prasad, thank you for your interest in DROID and PRONOM,
 
We aim to create a single PUID for each distinct format we add to the PRONOM registry. A PUID may contain information such as file extensions, mime/media types, descriptions, links to specifications and most importantly from our perspective, signature sequences for identification.
 
On occasion we do need to update certain entries within a PUID. This is because our research is not static and we may discover problems, or improvements we can apply to a particular entry. Often these are identified and contributed by members of the wider DROID/PRONOM user community. It may be that a particular signature sequence is too weak, so identification is clashing with a different format, or that our initial understanding of the format's construction was flawed - most often when a file format's technical specification is unavailable and we have therefore had to make certain presumptions about an identification mechanism.
 
We do not take such changes lightly, and endeavour to make them only where a change improves the accuracy of the registry.
 
Since March 2010, we have recorded these changes on our 'release notes' page on The National Archives website (http://www.nationalarchives.gov.uk/aboutapps/pronom/release-notes.xml). For more significant changes (i.e. those which may fully alter the identification of a particular format), we write a more detailed explanation. We're always happy to answer any questions about why we may have made a change.
 
In terms of tracking changes to the registry, one could perhaps take a 'diff' between two consecutive signature files. Our historical signature files are also available on The National Archives website: http://www.nationalarchives.gov.uk/aboutapps/pronom/droid-signature-files.htm
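
As a rough illustration of such a diff, here is a minimal Python sketch that compares two signature files PUID by PUID rather than line by line. It is only a sketch under stated assumptions, not an official tool: it assumes each signature file contains FileFormat elements carrying PUID, Name, Version and MIMEType attributes with InternalSignatureID children, and InternalSignature elements carrying an ID attribute. The function names (load_formats, diff_formats) are hypothetical and chosen for illustration; please check the element and attribute names against the files you download.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def fingerprint(elem):
    """Whitespace-insensitive summary of an element and all its descendants.

    The ID attribute is ignored so that a renumbered but otherwise
    identical signature is not reported as a change.
    """
    return tuple(
        (
            local(e.tag),
            tuple(sorted((k, v) for k, v in e.attrib.items() if k != "ID")),
            (e.text or "").strip(),
        )
        for e in elem.iter()
    )


def load_formats(xml_data):
    """Map each PUID to (name, version, MIME type, set of signature fingerprints).

    Assumed layout (verify against a real file): InternalSignature elements
    with an ID attribute, and FileFormat elements with PUID/Name/Version/
    MIMEType attributes and InternalSignatureID child elements.
    """
    root = ET.fromstring(xml_data)
    signatures = {
        e.get("ID"): fingerprint(e)
        for e in root.iter()
        if local(e.tag) == "InternalSignature"
    }
    formats = {}
    for e in root.iter():
        if local(e.tag) == "FileFormat" and e.get("PUID"):
            sig_ids = [
                c.text.strip()
                for c in e
                if local(c.tag) == "InternalSignatureID" and c.text
            ]
            formats[e.get("PUID")] = (
                e.get("Name"),
                e.get("Version"),
                e.get("MIMEType"),
                frozenset(signatures.get(i) for i in sig_ids),
            )
    return formats


def diff_formats(old, new):
    """Report PUIDs that were added, removed, or changed between two releases."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(p for p in set(old) & set(new) if old[p] != new[p]),
    }
```

To use it, read two consecutive signature files as bytes (for example with pathlib.Path(...).read_bytes()), pass each to load_formats, and hand the two results to diff_formats. The PUIDs listed under "changed" are the ones whose metadata in a DMS would be worth re-checking against the release notes.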
 
I hope this helps. Please let me know if you have further queries or any concerns.
 
David Clipsham (Support Engineer at The National Archives, presently with lead responsibility for the PRONOM registry)