Match text output to uzn

blubzel

Nov 7, 2016, 9:29:07 AM

to tesseract-ocr

Hi,

I am using Tesseract to extract data from tables in documents. To do this, I specify the zones for all cells in a uzn file.

I can match the extracted data to the individual cells as long as there are no empty cells. If there are empty cells, however, they are not represented in the output text file.

So the question is: is there a way to match the data in the output text file to the regions specified in the uzn file? Or, failing that, to include empty regions as a blank line (or in any other form) in the output text file? The only other way I have found is to process every cell one by one.
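
To make the cell-by-cell fallback concrete, here is a rough Python sketch of what I mean. It assumes each uzn line has the form "left top width height label", which is how my files look but may not cover every variant. The names Zone, parse_uzn and ocr_by_zone are made up for illustration, and ocr_zone stands in for whatever crops one zone from the page and runs OCR on it. Because every zone yields exactly one result, empty cells come back as empty strings instead of disappearing.

```python
from typing import Callable, List, NamedTuple, Tuple


class Zone(NamedTuple):
    """One cell rectangle from the uzn file (hypothetical helper type)."""
    left: int
    top: int
    width: int
    height: int
    label: str


def parse_uzn(text: str) -> List[Zone]:
    """Parse uzn content, assuming 'left top width height label' per line."""
    zones = []
    for line in text.splitlines():
        parts = line.split(None, 4)
        if len(parts) < 4:
            continue  # skip blank lines and lines with too few fields
        left, top, width, height = (int(p) for p in parts[:4])
        label = parts[4].strip() if len(parts) > 4 else ""
        zones.append(Zone(left, top, width, height, label))
    return zones


def ocr_by_zone(
    zones: List[Zone], ocr_zone: Callable[[Zone], str]
) -> List[Tuple[Zone, str]]:
    """Run OCR once per zone and keep the zone next to its text.

    ocr_zone is a placeholder supplied by the caller: it would crop the
    zone from the page image and return the recognized text. An empty
    cell simply gives an empty string, so no zone is dropped.
    """
    return [(zone, ocr_zone(zone).strip()) for zone in zones]
```

This keeps the mapping intact, but it means one OCR call per cell, which is exactly the overhead I would like to avoid.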