how to pass image "directly" to tesseract


zl2k

Mar 28, 2011, 12:53:59 PM
to tesseract-ocr
Hi all,

My application generates a bunch of separate binarized characters,
and I need to feed each of them to the OCR engine. It would be very
costly to save each one to disk as a TIFF file and then call
tesseract. Is there a way for my application (C++) to call the OCR
engine directly and pass the image to it in memory? Your comments are
highly appreciated.

zl2k

Saurabh Gandhi

Mar 28, 2011, 10:59:00 PM
to tesser...@googlegroups.com, zl2k
What format is your image in? It is entirely possible to pass the raw image data to Tesseract directly instead of saving it to disk.

--
Regards,
Saurabh Gandhi





--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to tesser...@googlegroups.com.
To unsubscribe from this group, send email to tesseract-oc...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.


Quan Nguyen

Mar 29, 2011, 1:07:09 AM
to tesseract-ocr
You can certainly use tessdll.dll from Tesseract 2.04. A 3.0x DLL is
not yet available.

Dmitri Silaev

Mar 29, 2011, 2:51:01 AM
to tesser...@googlegroups.com, zl2k
See the following functions in "baseapi.h":

TesseractRect()
SetImage() (reportedly, the Pix* variant is more likely to be supported)
SetRectangle()

Then call

Recognize()
or Get*Text()

Since your images are binary, use 1 bit per pixel.

Once you have implemented this approach, passing images to the engine
will be extremely fast.

Warm regards,
Dmitri Silaev
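
A minimal sketch of the 1-bit-per-pixel suggestion above, using only the standard library. `PackedGlyph` and `PackGlyph` are hypothetical names, not part of Tesseract. The layout (rows padded to whole bytes, most significant bit first, a set bit meaning white) follows my reading of the comments in baseapi.h for 3.0x and should be checked against your version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for one binarized character (not a Tesseract type).
struct PackedGlyph {
    std::vector<std::uint8_t> bits;  // byte-packed rows, MSB = leftmost pixel
    int width = 0;
    int height = 0;
    int bytes_per_line = 0;          // rows are padded to whole bytes
};

// Packs a row-major glyph (true = ink) into 1 bit per pixel.
// Assumption: a set bit means WHITE, so ink pixels are stored as 0.
PackedGlyph PackGlyph(const std::vector<bool>& ink, int width, int height) {
    assert(width > 0 && height > 0);
    assert(ink.size() == static_cast<std::size_t>(width) * height);

    PackedGlyph g;
    g.width = width;
    g.height = height;
    g.bytes_per_line = (width + 7) / 8;
    // Start all white (every bit set), including the padding bits.
    g.bits.assign(static_cast<std::size_t>(g.bytes_per_line) * height, 0xFF);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (ink[static_cast<std::size_t>(y) * width + x]) {
                const std::uint8_t mask =
                    static_cast<std::uint8_t>(0x80u >> (x % 8));
                // Clear the bit for this ink pixel.
                g.bits[static_cast<std::size_t>(y) * g.bytes_per_line + x / 8] &=
                    static_cast<std::uint8_t>(~mask);
            }
        }
    }
    return g;
}
```

Under the same assumptions, the result would be handed to the buffer overload as `SetImage(g.bits.data(), g.width, g.height, 0, g.bytes_per_line)`, where a `bytes_per_pixel` of 0 marks a binary image.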

Dmitri Silaev

Mar 29, 2011, 3:02:42 AM
to tesser...@googlegroups.com, zl2k
Also, you can refer to a recent post at
https://groups.google.com/d/msg/tesseract-ocr/EFQrmyVbPKM/4oXJSreuO1MJ
or search for the "Wrappers for tessearct3.01?" topic.

Warm regards,
Dmitri Silaev

Vicky Budhiraja

Mar 29, 2011, 2:00:57 AM
to tesser...@googlegroups.com
Hello,

As a suggestion, this is what you can do:
- Go to tesseractmain.cpp and look for the constructor
- Check the SetImage() call, whose first parameter is a uinT8 buffer
holding the image data
- Pass in your own buffer

You need to write the routines that bring in your binary data (the bunch of
separate binarized characters). For that you can use the tessDLLs and supply
your structs to Tesseract directly.

Hope this helps!

--
Vicky Budhiraja
http://www.sitarasoft.com/
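
As a rough illustration of "pass in your own buffer", the sketch below turns one binarized character into an 8-bit greyscale buffer and keeps the values the buffer overload of SetImage() asks for (width, height, bytes per pixel, bytes per line) next to it. `RawImage` and `ToGreyscale` are hypothetical names, not Tesseract types, and black ink on a white background is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical descriptor (not a Tesseract type) for a raw image buffer.
struct RawImage {
    std::vector<unsigned char> data;  // row-major pixels, no row padding
    int width = 0;
    int height = 0;
    int bytes_per_pixel = 1;          // 8-bit greyscale
    int bytes_per_line = 0;           // width * bytes_per_pixel here
};

// Converts a row-major glyph (true = ink) to greyscale.
// Assumption: ink is black (0) and background is white (255).
RawImage ToGreyscale(const std::vector<bool>& ink, int width, int height) {
    assert(width > 0 && height > 0);
    assert(ink.size() == static_cast<std::size_t>(width) * height);

    RawImage img;
    img.width = width;
    img.height = height;
    img.bytes_per_line = width * img.bytes_per_pixel;
    img.data.assign(ink.size(), 255);  // start with a white background
    for (std::size_t i = 0; i < ink.size(); ++i) {
        if (ink[i]) {
            img.data[i] = 0;           // paint ink pixels black
        }
    }
    return img;
}
```

Keeping the dimensions and stride in the same struct as the pixels makes it harder to pass a buffer together with values that describe a different image.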


zl2k

Mar 29, 2011, 10:23:19 AM
to tesseract-ocr


Thanks for all the informative comments; they gave me the clue I needed.
Regards,
zl2k

srinivasan

May 4, 2011, 8:58:18 AM
to tesseract-ocr
Hi,

I'm using Tesseract on Android. I set the image using SetImage(const
unsigned char* imagedata, int width, int height,
int bytes_per_pixel, int bytes_per_line)
and then call getUTF8Text(), but I'm getting an empty string.

What could be the problem here?

Please help me out.

thanks,
srinivasan
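
The thread does not record what caused the empty result. One thing that can be ruled out cheaply is an inconsistent set of arguments to the buffer overload of SetImage(). The helper below is a hypothetical, standard-library-only sketch that checks the values against each other before the call. Treating a `bytes_per_pixel` of 0 as a byte-packed 1-bit image is an assumption based on the baseapi.h comments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Tesseract): returns an empty string when
// the arguments are mutually consistent, otherwise a short description of
// the first problem found.
std::string CheckImageArgs(std::size_t buffer_size, int width, int height,
                           int bytes_per_pixel, int bytes_per_line) {
    if (width <= 0 || height <= 0) {
        return "width and height must be positive";
    }
    if (bytes_per_pixel < 0) {
        return "bytes_per_pixel must not be negative";
    }
    const std::size_t w = static_cast<std::size_t>(width);
    // Assumption: bytes_per_pixel == 0 means 1 bit per pixel, byte-packed.
    const std::size_t min_line =
        (bytes_per_pixel == 0)
            ? (w + 7) / 8
            : w * static_cast<std::size_t>(bytes_per_pixel);
    if (bytes_per_line <= 0 ||
        static_cast<std::size_t>(bytes_per_line) < min_line) {
        return "bytes_per_line is smaller than one row of pixels";
    }
    const std::size_t needed = static_cast<std::size_t>(bytes_per_line) *
                               static_cast<std::size_t>(height);
    if (buffer_size < needed) {
        return "buffer is smaller than bytes_per_line * height";
    }
    return "";
}
```

Passing this check does not guarantee recognizable text. Image polarity, resolution, and missing language data are other possible causes that it does not cover.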