Nick White
Jun 5, 2014, 12:41:25 PM
to tesser...@googlegroups.com
Hi all,
I recently posted this to the tesseract-dev list, and thought it
might be of interest to some people here too.
I'm preparing to release a bunch of ground truth text & scans, and
the scans contain a mix of Latin and Ancient Greek, mostly in
separate columns. As I'm only interested in testing the Ancient
Greek training for Tesseract, I wrote a small program that creates
UZN files identifying the Ancient Greek zones. These can then be
used when testing the training with Tesseract, so the Latin can be
completely ignored.
It's quite basic, but may perhaps be useful to people here, either
as is, or as an example of a complete program using the C-API for
Tesseract.
Run it without arguments for details of how to use it. Example usage
for the above use case would be:
uznforlang myscan.png grc+lat grc 0.5 > myscan.uzn
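For anyone unfamiliar with the output, here is a rough sketch in C of the filtering and writing step only. It is not the actual uznforlang source and does not call Tesseract at all. It assumes that a UZN file holds one zone per line as "left top width height label", and that the last argument (0.5 above) is a minimum share of a zone that must be in the wanted language; that reading of the argument is a guess. The struct zone type, its share field, and write_uzn are names made up for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical zone record: a bounding box plus the language detected
 * for it. In a real program these values would come from Tesseract. */
struct zone {
    int left, top, width, height;
    const char *lang;  /* e.g. "grc" or "lat" */
    double share;      /* assumed: fraction of the zone in that language, 0..1 */
};

/* Write one UZN line ("left top width height label") for every zone
 * whose language equals `lang` and whose share is at least `threshold`.
 * Returns the number of zones written, or -1 if writing fails. */
int write_uzn(FILE *out, const struct zone *zones, size_t n,
              const char *lang, double threshold)
{
    int written = 0;

    assert(out != NULL && zones != NULL && lang != NULL);

    for (size_t i = 0; i < n; i++) {
        /* Skip zones in another language or below the threshold. */
        if (strcmp(zones[i].lang, lang) != 0 || zones[i].share < threshold)
            continue;
        if (fprintf(out, "%d %d %d %d %s\n",
                    zones[i].left, zones[i].top,
                    zones[i].width, zones[i].height,
                    zones[i].lang) < 0)
            return -1;
        written++;
    }
    return written;
}
```

Calling write_uzn(stdout, zones, n, "grc", 0.5) on a page with one Greek and one Latin column would print only the Greek zone, which is the kind of content redirected into myscan.uzn above.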
Any comments would be very welcome.
Nick