to tesseract-ocr
This is an extension of my spacing problem, in case someone skipped the title. Basically, I want to analyze the spacing problem using hOCR, but hOCR only gives bounding boxes for words. So I would like to know whether there is a file in tessdata/configs that I can modify to get character bounding boxes in the hOCR output. So far I have not found a post about this through Google Search, so I am not sure whether such a technique exists. I am ignoring the API route for now.
Chang Alden
Nov 12, 2015, 8:55:12 AM
It seems this has to do with enabling the api.GetBoxText option. Does anyone know how to get it to work?
On Thursday, November 12, 2015 at 9:43:42 AM UTC+8, Chang Alden wrote:
Chang Alden
Nov 12, 2015, 10:18:06 AM
Alright, I got it: just type makebox as an option. It seems everything else in the configs folder can be accessed this way as well.
On Thursday, November 12, 2015 at 9:55:12 PM UTC+8, Chang Alden wrote:
Helmut Wollmersdorfer
Nov 12, 2015, 1:13:11 PM
Sorry, in which option do you write it? It sounds like you mean the shell console, where you get a box file. Or have you found out how to get single-character boxes in hOCR?
Chang Alden
Nov 12, 2015, 11:15:18 PM
Hi,
With "makebox" you get the box coordinates for each character of the image that you input and scan with tesseract; the output is not in HTML format (sorry about the confusion). I haven't worked on training, so I didn't know such an option existed.
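For anyone who wants to use the box output for the spacing analysis, here is a minimal sketch in Python. It assumes the box file was produced with something like `tesseract image.png out makebox` and that each line has the usual box-file layout `char left bottom right top page`, with the origin at the bottom-left of the image. The names `CharBox`, `parse_box_text`, `horizontal_gaps`, and the sample coordinates are made up for illustration and are not part of tesseract.

```python
from typing import List, NamedTuple, Tuple


class CharBox(NamedTuple):
    """One line of a box file (hypothetical helper type)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_text(text: str) -> List[CharBox]:
    """Parse box-file text, one 'char left bottom right top page' per line."""
    boxes = []
    for line in text.splitlines():
        # Split from the right so the glyph field is kept in one piece.
        parts = line.strip().rsplit(None, 5)
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        char = parts[0]
        left, bottom, right, top, page = (int(n) for n in parts[1:])
        boxes.append(CharBox(char, left, bottom, right, top, page))
    return boxes


def horizontal_gaps(boxes: List[CharBox]) -> List[Tuple[str, str, int]]:
    """Pixel gap between each character and the next one, in file order."""
    return [(a.char, b.char, b.left - a.right) for a, b in zip(boxes, boxes[1:])]


# Invented sample data, not real tesseract output.
sample = """T 10 20 22 40 0
h 24 20 34 40 0
e 36 20 46 34 0
c 70 20 80 34 0
"""

for first, second, gap in horizontal_gaps(parse_box_text(sample)):
    print(first, second, gap)
```

With this sample, the gap between "e" and "c" (24 px) is much larger than the others (2 px), which is the kind of jump that would suggest a word space. Note that the sketch does not handle line breaks: the gap between the last character of one text line and the first of the next will not be meaningful.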
On Friday, November 13, 2015 at 2:13:11 AM UTC+8, Helmut Wollmersdorfer wrote:
Helmut Wollmersdorfer
Nov 15, 2015, 4:50:14 PM