to aaDH: Digital Humanities
Dear colleagues,
Nick Thieberger and I are looking at some problems in recovering information from legacy materials in which the formatting of the text carries a high information load. For example, one type of material we are looking at is interlinear glossed text of the kind in the example below. We are exploring the possibility that scanning such material to ALTO XML would make the formatting information accessible, and we are very keen to be in touch with any colleagues who have used that technology and would be willing to share their experience with us.
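To make the idea concrete, here is a rough sketch in Python of how the positional information in ALTO might be used. It assumes the OCR output follows the usual ALTO layout, with `String` elements nested inside `TextLine` elements and carrying `CONTENT` and `HPOS` attributes. The sample document, the placeholder words, the helper names and the tolerance value are all made up for illustration. Real scans would need tuning, and possibly unit conversion, since ALTO files can declare different measurement units.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-line sample: a text line and a gloss line beneath it.
# The words and coordinates are invented placeholders, not real data.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="word1" HPOS="50" VPOS="100"/>
      <SP/>
      <String CONTENT="word2" HPOS="200" VPOS="100"/>
    </TextLine>
    <TextLine>
      <String CONTENT="GLOSS1" HPOS="52" VPOS="130"/>
      <SP/>
      <String CONTENT="GLOSS2" HPOS="198" VPOS="130"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Drop the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def read_lines(alto_xml):
    """Return one list per TextLine, each holding (content, hpos) tuples."""
    root = ET.fromstring(alto_xml)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) == "TextLine":
            words = [
                (child.get("CONTENT"), float(child.get("HPOS")))
                for child in elem
                if local_name(child.tag) == "String"
            ]
            lines.append(words)
    return lines


def align(text_line, gloss_line, tolerance=15.0):
    """Pair each word with the gloss whose left edge is horizontally closest.

    The tolerance is an arbitrary assumption; a word with no gloss within
    that distance is paired with None.
    """
    pairs = []
    for word, x in text_line:
        best = min(gloss_line, key=lambda g: abs(g[1] - x), default=None)
        if best is not None and abs(best[1] - x) <= tolerance:
            pairs.append((word, best[0]))
        else:
            pairs.append((word, None))
    return pairs


lines = read_lines(SAMPLE_ALTO)
print(align(lines[0], lines[1]))
```

The point of the sketch is only that the horizontal alignment between a word and its gloss, which is lost in plain-text OCR, survives as coordinates in ALTO and can in principle be recovered from them.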
to aaDH: Digital Humanities
Hi Simon, Nick,
You may know this already, and I may be misreading your email, but just in case: ALTO XML is used in newspaper digitisation, and the specialists at the National Library of Australia will be conversant with it.
FYI, see also the National Library of New Zealand's open data set pilot (Papers Past). The data has been curated and made available, with some information on METS and ALTO in XML.