Question on training Tesseract for arbitrary large images


morteza neishaboori

Jun 27, 2014, 4:51:01 AM
to tesser...@googlegroups.com
Hi,
I'm new to Tesseract.
I want to use OCR to detect small words in images containing indoor signs and similar objects.
You can find some sample images at the link below to get the idea:
https://drive.google.com/folderview?id=0B3dLM0w0EeD-RFZVc1NjaGNqUlE&usp=sharing

I used the default Tesseract English training data, and for all of these photos I get an empty result.
Am I doing something wrong,
or is it even possible to do OCR on such images using Tesseract?

I would be happy to get some hints.

Kind Regards
Mori

Nick White

Jun 27, 2014, 1:08:38 PM
to tesser...@googlegroups.com
Hi Mori,

On Fri, Jun 27, 2014 at 01:51:01AM -0700, morteza neishaboori wrote:
> I want to use OCR to detect small words in images containing indoor signs and
> etc
> you can find some sample images in the link below to get the idea
> https://drive.google.com/folderview?id=0B3dLM0w0EeD-RFZVc1NjaGNqUlE&usp=sharing
>
> I used default tesseract english training, for all of these photos I get an
> empty result
> now I want to know if I'm doing something wrong?!
> or it's possible at all to do OCR in such images using tesseract?!

It is possible, but first you need to remove as much of the non-text
as possible, and only give Tesseract the image of the text. Other
people on this list have done similar things, so I recommend you
look through the archives to find more information on good ways of
doing this.
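
To give a rough idea of what "only give Tesseract the image of the
text" can look like, here is a minimal sketch in plain Python. It is
a toy under strong assumptions, not a recipe for real photos: it
assumes the image is already a grayscale grid (a list of rows of
0-255 values), that the text is darker than its background, and that
a fixed threshold separates the two. The function names, the
threshold and the margin are all made up for illustration. The last
function assumes the tesseract command-line tool is on your PATH and
is recent enough to accept "stdout" as its output base.

```python
import subprocess


def text_bounding_box(gray, threshold=128, margin=2):
    """Return (left, top, right, bottom) around all pixels darker than
    `threshold`, padded by `margin`, or None if there are no such pixels.
    Assumption: dark text on a lighter background."""
    height, width = len(gray), len(gray[0])
    rows = [y for y, row in enumerate(gray) if any(p < threshold for p in row)]
    if not rows:
        return None
    cols = [x for x in range(width) if any(row[x] < threshold for row in gray)]
    left = max(cols[0] - margin, 0)
    top = max(rows[0] - margin, 0)
    right = min(cols[-1] + margin + 1, width)    # right/bottom are exclusive
    bottom = min(rows[-1] + margin + 1, height)
    return left, top, right, bottom


def crop(gray, box):
    """Cut the (left, top, right, bottom) box out of the pixel grid."""
    left, top, right, bottom = box
    return [row[left:right] for row in gray[top:bottom]]


def write_pgm(gray, path):
    """Save the pixel grid as a binary PGM (P5) file with max value 255."""
    height, width = len(gray), len(gray[0])
    with open(path, "wb") as f:
        f.write(b"P5\n%d %d\n255\n" % (width, height))
        for row in gray:
            f.write(bytes(row))


def run_tesseract(image_path):
    """Run the tesseract CLI on a file and return the recognised text.
    Assumption: a tesseract binary that supports 'stdout' as output base."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

You would then call text_bounding_box() on your pixel grid, crop() to
that box, write_pgm() the result to a file and pass that file to
run_tesseract(). For photos of signs, a single global threshold will
usually not be enough to isolate the text, so treat this only as an
outline of the crop-then-recognise flow.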

Nick