tessedit_char_whitelist not working for euro symbol


Daniël Vos

Aug 3, 2018, 2:53:45 PM
to tesseract-ocr
Hi all,

I'm having some trouble reading receipts using tesseract. For my specific use case I have created a config file named letters under /tessdata with the following contents:

tessedit_char_whitelist abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,.0123456789:-#/€$%


Then when I call tesseract as follows, the result file never contains euro symbols:

tesseract kobalt.jpg result -l nld letters

This seems strange to me, because running without the config file (tesseract kobalt.jpg result -l nld) does find euro symbols. Am I doing something wrong, or does tesseract not support euro symbols in the whitelist? I am running tesseract 3.05.02 on Windows 10.
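
One thing that may be worth ruling out is the encoding of the letters file. This is only a guess and nothing above confirms it: tesseract handles characters as UTF-8, and a Windows editor may save the file in a legacy code page (where the euro sign is a single byte) or add a byte order mark. The Python sketch below is not part of tesseract; the helper names write_config and describe_config are made up for illustration. It writes the whitelist line as UTF-8 without a BOM and reports how an existing file is encoded.

```python
import tempfile
from pathlib import Path

# Whitelist from the post; the euro sign is written as an escape so the
# script does not depend on the encoding of this source file.
WHITELIST = (
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ",.0123456789:-#/\u20ac$%"
)
EURO_UTF8 = "\u20ac".encode("utf-8")  # the three bytes E2 82 AC
UTF8_BOM = b"\xef\xbb\xbf"


def write_config(path: Path, whitelist: str = WHITELIST) -> None:
    """Write the config line as UTF-8 bytes, with no BOM (hypothetical helper)."""
    line = f"tessedit_char_whitelist {whitelist}\n"
    path.write_bytes(line.encode("utf-8"))


def describe_config(path: Path) -> dict:
    """Report basic encoding facts about a config file (hypothetical helper)."""
    data = path.read_bytes()
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "has_bom": data.startswith(UTF8_BOM),
        "valid_utf8": valid_utf8,
        "euro_as_utf8": EURO_UTF8 in data,
    }


if __name__ == "__main__":
    # Demonstration on a temporary file; point describe_config at the real
    # letters file to inspect it instead.
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "letters"
        write_config(config)
        print(describe_config(config))
```

If describe_config reports valid_utf8 as False or has_bom as True for the existing file, re-saving it as UTF-8 without a BOM would be a reasonable first thing to try before concluding that the whitelist cannot hold the euro symbol.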