Phantom Characters


Stefan Greiner

Feb 12, 2016, 2:10:42 AM
to tesseract-ocr
I'm using Tesseract 3.04 (953523b) with the Tess4J API in a Java application.

Since upgrading to Tesseract 3.04 (953523b), I've been getting phantom characters with the current German "deu.traineddata".

For example, the word "Marineverband" is recognised in several different variants:

-> Marine_a_verband
-> Marine_e_verband
-> Marine_verband
-> Marine_ayerband

A screenshot with the source data is attached.

I get a lot of these phantom characters; I'll add more examples over the next few days.
Tesseract_screen_001.jpg