Poor result with horizontal lines in text

Moritz Leiss

Oct 23, 2014, 10:25:34 AM
to tesser...@googlegroups.com
Hi folks,
I've been playing with tesseract and opencv for a while to get the best out of my scans.
But lately I came across this problem:
I need to scan a bunch of documents that were printed by an old needle printer (I suppose), which
leaves a thin "no-ink" line running horizontally through the text (see the attached picture).
With these documents I get no results or very poor ones.
Could someone point me in the right direction on how to get tesseract to read them? Is there some
image preprocessing I could do? Or do I have to train tesseract on this "broken font"? (...that would
be bad, because this line is not always at the same position within the font.)

All help welcome :)

Thanks,
Mo

Moritz Leiss

Oct 23, 2014, 10:41:17 AM
to tesser...@googlegroups.com
Sorry, forgot the attachment... here it is :)
Btw, the output is:
"E...

A_.
1..

O
1..

2
2

1..
0
A1

1..
A1
A1
1..
8

O
0
5
no
:2.
8

5
"
hor_line_problem.png
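
On the preprocessing question, one idea that may be worth trying is a morphological closing with a tall, one-pixel-wide kernel, which bridges thin horizontal gaps while leaving the spacing between neighbouring characters alone. The following is only a sketch on a toy binary image, not something tested on the attached scan. It assumes the page has already been binarized with 1 = ink and 0 = paper, and the names `close_vertical` and `radius` are made up for this illustration.

```python
def _column_filter(img, radius, reduce_fn, border):
    """Apply reduce_fn (max or min) over a vertical window of 2*radius+1 pixels.

    Pixels outside the image are treated as `border`.
    """
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                img[yy][x] if 0 <= yy < height else border
                for yy in range(y - radius, y + radius + 1)
            ]
            out[y][x] = reduce_fn(window)
    return out


def close_vertical(img, radius=1):
    """Morphological closing with a 1-wide, (2*radius+1)-tall kernel.

    img is a list of rows; 1 = ink, 0 = paper (assumed convention).
    Dilation fills the gap, erosion restores the original stroke extent.
    """
    dilated = _column_filter(img, radius, max, border=0)
    return _column_filter(dilated, radius, min, border=1)


# Toy glyph: a vertical stroke and a block, both cut by one blank row.
glyph = [
    [1, 0, 1, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],  # the "no-ink" line
    [1, 0, 1, 1],
    [1, 0, 1, 1],
]
repaired = close_vertical(glyph, radius=1)
for row in repaired:
    print("".join("#" if p else "." for p in row))
```

With OpenCV, the same operation should be available as `cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)` with a kernel from `cv2.getStructuringElement(cv2.MORPH_RECT, (1, k))`, applied to an image where the text is white on black. The kernel height needs to stay close to the thickness of the missing line: a taller kernel will also fill legitimate gaps, such as the space between the dot and stem of an "i" or between the bars of "=".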