In line with the shift towards more accessible services, I'm hearing of pushes towards adoption of PDF/UA (Universal Accessibility), but PRONOM currently doesn't identify PDF/UA files.
Identification is easy enough: e.g. for PDF/UA-1 we want to find a <pdfuaid:part> tag with a value of '1'.
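As a rough illustration of that check (not a PRONOM signature, just a sketch), the following Python scans a file's raw bytes for the PDF/UA part declaration. The function name `pdfua_part` is made up for this example. It assumes the XMP packet is stored uncompressed and in a single-byte-compatible encoding such as UTF-8, and it also accepts the attribute shorthand `pdfuaid:part="1"`, since XMP allows simple properties to be written either way.

```python
import re

# Element form:   <pdfuaid:part>1</pdfuaid:part>
# Attribute form: pdfuaid:part="1" (RDF shorthand for simple properties)
PDFUA_PART = re.compile(
    rb"<pdfuaid:part>\s*(\d+)\s*</pdfuaid:part>"
    rb"|pdfuaid:part\s*=\s*[\"'](\d+)[\"']"
)


def pdfua_part(data: bytes):
    """Return the declared PDF/UA part number as an int, or None if absent.

    Hypothetical helper: assumes the XMP metadata is uncompressed and
    readable as plain bytes within `data`.
    """
    match = PDFUA_PART.search(data)
    if match is None:
        return None
    # Exactly one of the two groups is set, depending on which form matched.
    return int(match.group(1) or match.group(2))
```

For a PDF/UA-1 file this would return 1; a file with no declaration returns None. A declaration alone says nothing about whether the file actually conforms.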
However, a file can be PDF/UA, PDF/A, both, or neither, which complicates priority somewhat.
In the past, a general aim for PRONOM has been to avoid known clashes so that each identification event gets a single outcome, which is managed by setting priorities where appropriate. For example, in a typical case the more specific PDF/UA would get priority over the less specific PDF 1.4.
However, it will be of value to know if a file is both, for example, PDF/A-2u and PDF/UA-1.
So do we:
a) create a new PDF/UA-1 entry (with priority over PDF 1.x) and accept that sometimes we might find files that identify as both this and some PDF/A variant
b) create a PDF/UA-1 entry, and further entries named something like 'PDF/A-2u with PDF/UA-1 compliance'. In this case I believe we'd need entries for each of the 8 PDF/A-1 to PDF/A-3 variants, and these would need priority over those plus the various PDF 1.x entries.
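To make the difference between the two options concrete, here is a small sketch of what each would report for a given file. The function names and the idea of passing in an already-detected PDF/A variant and PDF/UA part are assumptions for illustration only; the entry names follow the naming suggested above and are not agreed PRONOM names.

```python
# The eight PDF/A conformance variants across parts 1 to 3.
PDFA_VARIANTS = ("1a", "1b", "2a", "2b", "2u", "3a", "3b", "3u")


def option_a(pdfa, pdfua):
    """Independent entries: a file may get zero, one, or two hits.

    `pdfa` is a variant such as "2u" or None; `pdfua` is a part such
    as "1" or None. An empty list means falling back to plain PDF 1.x.
    """
    hits = []
    if pdfa is not None:
        hits.append(f"PDF/A-{pdfa}")
    if pdfua is not None:
        hits.append(f"PDF/UA-{pdfua}")
    return hits


def option_b(pdfa, pdfua):
    """Combined entries: a file gets at most one hit."""
    if pdfa is not None and pdfua is not None:
        return [f"PDF/A-{pdfa} with PDF/UA-{pdfua} compliance"]
    return option_a(pdfa, pdfua)


def option_b_new_entries(pdfua="1"):
    """New entries option b would require for one PDF/UA part."""
    combined = [
        f"PDF/A-{v} with PDF/UA-{pdfua} compliance" for v in PDFA_VARIANTS
    ]
    return [f"PDF/UA-{pdfua}"] + combined
```

Under these assumptions, option a) reports two identifications for a file that is both PDF/A-2u and PDF/UA-1, while option b) reports one but needs nine new entries for PDF/UA-1 alone (one standalone plus eight combined).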
It all gets very tangled.
I'm keen to hear others' thoughts.
David