Ambiguity of the less-than character "<"

Meltem Çetiner

Mar 16, 2016, 2:54:32 AM
to tesseract-ocr
Hi all,

I trained Tesseract and it works pretty well, except for the less-than character. All "<" characters are read as "C". I tried to solve the issue as follows:
- I created an ambigs file. It didn't fix the problem.
- I deleted all "C" characters. This time all "<" characters are read as "E" and "4".

My data contains more "<" characters than "C", "E", or "4", so I would have expected "C" characters to be read as "<", not the other way around. What could be the reason? Do you have any idea?

Thank you, and have a nice day.

Edson Luis Moretti

Mar 16, 2016, 12:37:14 PM
to tesseract-ocr
Do you have the "<" character in the whitelist parameter of Tesseract?
If you don't, this could be the problem.
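
For reference, the whitelist is the `tessedit_char_whitelist` config variable, which can be set on the command line with `-c`. Below is a minimal Python sketch of building such a call. The image name, output base, language code `mylang`, and both helper functions are hypothetical and only serve as an illustration.

```python
import subprocess


def build_tesseract_command(image_path, output_base, lang, whitelist):
    """Build a tesseract CLI call that restricts output to `whitelist`."""
    return [
        "tesseract",
        image_path,
        output_base,
        "-l", lang,
        # tessedit_char_whitelist limits recognition to the listed characters.
        "-c", "tessedit_char_whitelist=" + whitelist,
    ]


def run_ocr(image_path, output_base, lang, whitelist):
    """Run tesseract; requires the tesseract binary to be on PATH."""
    cmd = build_tesseract_command(image_path, output_base, lang, whitelist)
    # Passing a list (no shell) means "<" is not treated as a redirection.
    subprocess.run(cmd, check=True)
```

When the same command is typed directly in a shell, the whitelist value needs quoting, because an unquoted `<` is interpreted as input redirection and never reaches Tesseract.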

Meltem Çetiner

Mar 16, 2016, 5:21:29 PM
to tesser...@googlegroups.com
Thanks for the reply =) How can I check it? I checked the unicharset and also the traineddata. They both contain this less-than character. When I looked at some of the config options, I also added this character to the digits file, but the result was the same.

--
Meltem  Çetiner
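
The unicharset check described above can also be scripted. This is a minimal sketch, assuming the unicharset has already been unpacked from the traineddata (for example with `combine_tessdata -u`) and follows the usual layout: a count on the first line, then one entry per line starting with the character itself. The function names are hypothetical.

```python
def unicharset_chars(path):
    """Return the set of unichars listed in a Tesseract unicharset file."""
    chars = set()
    with open(path, encoding="utf-8") as f:
        next(f, None)  # the first line holds the number of entries
        for line in f:
            fields = line.split()
            if fields:
                chars.add(fields[0])  # the first field is the unichar itself
    return chars


def has_char(path, char):
    """True if `char` appears as an entry in the unicharset at `path`."""
    return char in unicharset_chars(path)
```

Note that a character being present in the unicharset only shows that it was seen during training. It does not by itself show that the whitelist was applied at recognition time.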

Meltem Çetiner

Mar 17, 2016, 4:28:32 AM
to tesser...@googlegroups.com
I passed it as a Tesseract parameter. I also tried with another dataset: it works, and it can read less-than characters. So it is not a Tesseract problem. That leaves me with a new question: what could be the problem with my data? As I said before, I have more less-than samples than "C" or "4" samples.
--
Meltem  Çetiner
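
One thing that may be worth verifying is whether the training box files really contain as many "<" entries as expected. The sketch below counts glyph labels, assuming the common box-file layout of `<glyph> <left> <bottom> <right> <top> <page>` per line. The function name and the file name in the comment are hypothetical.

```python
from collections import Counter


def count_box_glyphs(lines):
    """Count how often each glyph label appears in box-file lines.

    Assumes each line looks like: <glyph> <left> <bottom> <right> <top> <page>
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 5:  # skip blank or malformed lines
            counts[fields[0]] += 1
    return counts


# Example with a hypothetical file name:
# with open("mylang.myfont.exp0.box", encoding="utf-8") as f:
#     print(count_box_glyphs(f).most_common())
```

A count alone does not explain the misreads, but a "<" total that is much lower than expected would suggest the boxes were labelled differently from what was intended.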