I'm new to XTF and I'm interested in what kinds of formats XTF has
support for and what level of support is provided out-of-the-box. I've
been told by an apparently knowledgeable acquaintance that XTF will
work with any XML format (which seems too good to be true). I'm
thinking in particular of:
* ePub and XPS files with DRM (from Adobe and Microsoft respectively)
* TEI generally (I'm from a TEI background)
* TEI corpus files (which break both the one-document-per-file
assumption and the sane-file-size assumption)
* Office Open XML files (i.e. modern Microsoft Office files)
* Open Document files (i.e. modern OpenOffice.org files)
* Parallel texts (i.e. texts originally published as facing-page
translations but digitised in ways that maintain the synchronisation
between the two texts). These are challenging because they have a
single title which has two renditions, each in a different language.
* Texts containing characters that don't qualify for inclusion in
Unicode. See for example the 'wh' ligature in
http://www.nzetc.org/tm/scholarly/tei-Auc1911NgaM-t1-body-d4.html
which resorts to the use of images
* Texts containing characters that do qualify for inclusion in Unicode
but are not yet supported by the servlet container. These are typically
broken by frameworks that always attempt to do Unicode normalisation
on reading character data. See for example the mangling of the Linear
B script on
http://www.nzetc.org/tm/scholarly/name-003325.html
Any hints?
cheers
stuart