--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/a5704736-173a-4e21-a532-26595d94589b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Tesseract’s API lets you get character-level coordinates. One idea is to look at the vertical position of each character and identify subscripts by where they sit relative to the rest of the line. Once a subscript is detected, you could extract the glyph programmatically and run Tesseract on it as a single character, which might give more accurate output.
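
Here is a rough sketch of the position check, in case it helps. It assumes you already have the character boxes as text in Tesseract's box format (one "char left bottom right top page" entry per line, with the origin at the bottom-left of the image, as returned for example by pytesseract.image_to_boxes), and that all boxes belong to a single text line. The names CharBox, parse_boxes and find_subscripts and the two thresholds are my own placeholders, not part of Tesseract, and would need tuning on real data.

```python
from statistics import median
from typing import List, NamedTuple


class CharBox(NamedTuple):
    """One character box; y grows upward (box-file convention)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_boxes(box_text: str) -> List[CharBox]:
    """Parse 'char left bottom right top [page]' lines into CharBox tuples."""
    boxes = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = map(int, parts[1:5])
        boxes.append(CharBox(parts[0], left, bottom, right, top))
    return boxes


def find_subscripts(boxes: List[CharBox],
                    drop: float = 0.25,
                    shrink: float = 0.85) -> List[CharBox]:
    """Flag characters that sit lower and are smaller than the typical one.

    drop and shrink are guessed thresholds (assumptions), expressed as
    fractions of the median character height on the line.
    """
    if not boxes:
        return []
    med_bottom = median(b.bottom for b in boxes)
    med_height = median(b.top - b.bottom for b in boxes)
    flagged = []
    for b in boxes:
        height = b.top - b.bottom
        sits_low = b.bottom < med_bottom - drop * med_height
        is_small = height < shrink * med_height
        if sits_low and is_small:
            flagged.append(b)
    return flagged


# Made-up boxes for "H2O" with a lowered, smaller "2".
sample = "H 10 20 30 50 0\n2 32 12 44 32 0\nO 46 20 66 50 0\n"
subscripts = find_subscripts(parse_boxes(sample))
print([b.char for b in subscripts])
```

Letters with descenders (g, p, y) also dip below the baseline, so expect some false positives until the thresholds are tuned. For the second step, each flagged box could be cropped out of the image (remembering to flip the y coordinates if your image library uses a top-left origin) and passed back to Tesseract with --psm 10, which treats the image as a single character.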
art
--