How to integrate Tesseract OCR with OpenCV

shari

Oct 31, 2012, 6:31:14 AM
to tesseract-ocr
Dear sir,

The problem is this:

I have an image, say "sampleImage.png", which contains many text boxes,
and each box in turn has content in it. The aim is to read the content
of a chosen box. From what I have read, Tesseract is one of the best OCR
engines currently available for reading text from an image.

Assume my OpenCV code can already locate the content in the form. What
I cannot do is read that content, because I am finding it difficult to
link OpenCV and Tesseract to do the rest of the computation.

In other words:

-> How do I call Tesseract OCR from within OpenCV code?
-> Which header files have to be included to call the Tesseract
libraries from OpenCV code?
-> Having done this, how do I compile the code? Are there any libraries
that have to be linked when compiling?

Could you please help me with this?


Thanks a lot in advance!

Nick White

Oct 31, 2012, 7:44:45 AM
to tesser...@googlegroups.com
Hi Shari,

You need to use Tesseract's API. There isn't much documentation on
it, but look through baseapi.h.
http://code.google.com/p/tesseract-ocr/source/browse/trunk/api/baseapi.h

How to compile and link your code depends on the platform. Have you
read the documentation on the wiki? If you have and are unsure of
anything, let us know.

Nick

Phlip

Oct 31, 2012, 4:34:56 PM
to tesseract-ocr
On Oct 31, 3:31 am, shari <shariwi...@gmail.com> wrote:

> ->How to call tesseract OCR within opencv code?

Call tesseract.exe (or equivalent) with a config file option on its
command line:

tesseract.exe sampleImage.png ocrText +config

Inside the config file write the line "tessedit_create_hocr 1".

Use your favorite programming language's "system()" command to shell
to the tesseract executable.

Read the output file, ocrText.html, into an XML reader.

Parse that XML document with an XPath expression, such as
//span[ contains(@title, 'bbox') ]

Parse the returned title to inspect the bbox coordinates, such as
'bbox 7 1 216 56'.

Use a little arithmetic to find bounding boxes inside the boxes that
OpenCV has detected.
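
The last two steps (reading the bbox values and the "little arithmetic") can be sketched in C++ with only the standard library. This is a rough sketch, not Tesseract code: `Box`, `parseBbox`, and `contains` are hypothetical helper names, and it assumes the hOCR title carries `bbox x0 y0 x1 y1` (left, top, right, bottom in pixels), possibly followed by other properties.

```cpp
#include <sstream>
#include <string>

// Axis-aligned box given by its top-left (x0, y0) and bottom-right (x1, y1) corners.
struct Box {
    int x0, y0, x1, y1;
};

// Extracts the four integers that follow "bbox" in an hOCR title attribute,
// e.g. "bbox 7 1 216 56". Returns false if no complete bbox is found.
bool parseBbox(const std::string& title, Box& out) {
    const std::string key = "bbox";
    const std::string::size_type pos = title.find(key);
    if (pos == std::string::npos) return false;
    std::istringstream in(title.substr(pos + key.size()));
    Box b;
    if (!(in >> b.x0 >> b.y0 >> b.x1 >> b.y1)) return false;
    out = b;
    return true;
}

// True when `inner` lies entirely within `outer` (edges may touch).
bool contains(const Box& outer, const Box& inner) {
    return inner.x0 >= outer.x0 && inner.y0 >= outer.y0 &&
           inner.x1 <= outer.x1 && inner.y1 <= outer.y1;
}
```

With the box OpenCV found as `outer` and each parsed hOCR box as `inner`, `contains(outer, inner)` selects the words that belong to that box.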

> ->Which are the header files that have to be included to call
> tesseract  libraries within the opencv?
> ->Having done this,finally how to compile the code?any libraries to be
> linked on compiling?

I wouldn't mess with source code until exhausting external integration
options. If you need any other detail besides bounding boxes Tess
might have a configuration for it. Read ccmain/tesseractclass.cpp to
learn all the configuration options.

--
Phlip

Andres

Nov 1, 2012, 10:15:52 AM
to tesser...@googlegroups.com
I'm not sure about what you really want, but if you need to pass an OpenCV IplImage or a cv::Mat to Tesseract from inside a program I can give you the snippet.

Cheers,

Andres



Phlip

Nov 1, 2012, 12:29:20 PM
to tesseract-ocr
On Nov 1, 7:16 am, Andres <andrej...@gmail.com> wrote:

> I'm not sure about what you really want, but if you need to pass an OpenCV
> IplImage or a cv::Mat to Tesseract from inside a program I can give you the
> snippet.

To the OP, we are now talking about three levels of integration...

- shell out to Tesseract with system('tesseract.exe')
- plug a Tesseract DLL into your program (Andres's point)
- compile a C program with the Tesseract source.

The OP seemed to be discussing the last of these. I prefer the first
because, like Machiavelli, I prefer to exhaust persuasion first, and
only then use brute force. So far, everything I need is available from
a command line. Projects with different performance constraints might
need options lower on the list.
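
The first option might look roughly like this in C++. It is only a sketch under some assumptions: `buildTesseractCommand` and `runTesseract` are hypothetical names, a `tesseract` executable is assumed to be on the PATH, and the `hocr` config bundled with Tesseract 3.x is used in place of a hand-written config file.

```cpp
#include <cstdlib>
#include <string>

// Builds: tesseract "<image>" "<outputBase>" hocr
// The trailing "hocr" names a config file shipped with Tesseract that sets
// tessedit_create_hocr to 1. Paths are only wrapped in double quotes; quotes
// embedded in the paths themselves are not escaped.
std::string buildTesseractCommand(const std::string& image,
                                  const std::string& outputBase) {
    return "tesseract \"" + image + "\" \"" + outputBase + "\" hocr";
}

// Runs the command through the shell. On common platforms std::system
// returns 0 when the command exited successfully.
bool runTesseract(const std::string& image, const std::string& outputBase) {
    return std::system(buildTesseractCommand(image, outputBase).c_str()) == 0;
}
```

After `runTesseract("sampleImage.png", "ocrText")` succeeds, the hOCR output sits next to the given base name; depending on the Tesseract version the extension is `.html` or `.hocr`.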

--
Phlip
http://zeekland.zeroplayer.com/

Andres

Nov 1, 2012, 7:13:18 PM
to tesser...@googlegroups.com
Here you have a snippet of that. If someone wants to put this in the wiki, I could improve it a little (better comments, etc.). Just let me know.

    cv::Mat image = cv::imread(file_path, 0); // flag 0 loads as grayscale: 8 bpp, 1 channel
    std::string lang = "my_trained_file"; // file on disk is "my_trained_file.traineddata"
    tesseract::TessBaseAPI tess_api;
    tess_api.Init("./", lang.c_str(), tesseract::OEM_DEFAULT); // returns 0 on success
    tess_api.SetPageSegMode(static_cast<tesseract::PageSegMode>(7)); // 7 is PSM_SINGLE_LINE; see available modes in tesseractmain.cpp or here https://groups.google.com/forum/?fromgroups#!searchin/tesseract-ocr/psm/tesseract-ocr/JW7xKH_pH_U/0AYmLVsLqj8J
    // SetImage takes width, height, bytes per pixel, bytes per line. TesseractRect also
    // works, but it returns a new[] buffer of its own that leaks if the result is ignored.
    tess_api.SetImage(image.data, image.cols, image.rows, 1, image.step1());
    char *txt = tess_api.GetUTF8Text();   // caller must delete [] txt
    char *boxes = tess_api.GetBoxText(0); // caller must delete [] boxes

Cheers,

Andres


sharisha shanbhag

Nov 2, 2012, 1:33:46 AM
to tesser...@googlegroups.com
Hi Nick,

Regarding "How to compile and link your code depends on the platform":
it is a Linux platform, specifically Ubuntu 12.04.

Shari

Nick White

Nov 2, 2012, 5:20:11 AM
to tesser...@googlegroups.com
On Thu, Nov 01, 2012 at 10:01:06PM -0700, shari wrote:
> I read the article from the link provided from your previous mail.But the
> article says we need to have a license copy of the tesseract-OCR related file
> in order to make them working within our code.Is that true?

A license copy? Tesseract is distributed under the Apache 2.0
license, so you don't need any extra permissions to do whatever you
want with it (basically). You certainly don't need to buy a license
or anything silly like that.

sharisha shanbhag

Nov 2, 2012, 5:53:08 AM
to tesser...@googlegroups.com
Hey Nick,

You are right, and I understand what an Apache license is.
But when I include "tesseractmain.h" in my OpenCV code, it gives me a fatal error. What could be the reason?
I am still not able to figure out how to integrate OpenCV and Tesseract OCR.

Shari



shari

Nov 2, 2012, 6:59:32 AM
to tesseract-ocr
What if my image is an IplImage *img?

How does the code change then?

zdenko podobny

Nov 2, 2012, 9:14:41 AM
to tesser...@googlegroups.com
On Fri, Nov 2, 2012 at 10:53 AM, sharisha shanbhag <shari...@gmail.com> wrote:
> Hey Nick,
>
> You are right, and I understand what an Apache license is.
> But when I include "tesseractmain.h" in my OpenCV code, it gives me a fatal error. What could be the reason?

There are several examples of how to use the Tesseract API in the forums and in the issue tracker.

For simple usage you need to include baseapi.h from Tesseract and allheaders.h from Leptonica.

BTW: python-tesseract[1] (a Python wrapper for Tesseract) includes a patch[2] that makes it possible to use an OpenCV image in Tesseract. You should be able to adapt it to your needs.

-- 
Zdenko

Andres

Nov 2, 2012, 5:56:50 PM
to tesser...@googlegroups.com
So if you have:

IplImage *img;

load your image in img, delete this line:

cv::Mat image = cv::imread(file_path , 0); 

and replace with:

cv::Mat image(img); 

And remember:
- you will have to deallocate (release) img yourself, because in this case cv::Mat only wraps the existing data and won't do that for you.
- the image has to be 8 bpp.



zdenko podobny

Nov 7, 2012, 4:58:18 PM
to tesser...@googlegroups.com


On Wed, Nov 7, 2012 at 8:02 AM, shari <shari...@gmail.com> wrote:
> Hey all,
>
> Thanks for actively participating in this discussion on how to integrate Tesseract OCR with the OpenCV library.
> But I have not yet got a working solution, as linking the two (Tesseract OCR and OpenCV) has become a challenge.
>
> So I have attached an image which gives you a clear idea of what kind of input it is.
> The numbers are marked only for ease of understanding; they are just the locations of the available text.
>
> Suppose it is an input image where I have to read the contents within the box.

If there is nothing but the rectangle with text, you can pass the whole image to Tesseract.
If there is something around the rectangle with text (e.g. you want to ignore everything outside the rectangle), you need to:
  1. identify the rectangle coordinates (with some OpenCV function, or maybe with GetConnectedComponents from the Tesseract API)
  2. use SetRectangle after SetImage from the Tesseract API

> Then how do I code that? I am not using the Python Tesseract library, as in my case the programming language is C++.
>
> It would be of great help if somebody could come up with a snippet to achieve this.

Do you mean something like this?

#include <cstdio>

#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <opencv2/opencv.hpp>

int main() {
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
    // Init() returns 0 on success; the first argument is the data path.
    if (api->Init("/usr/src/tesseract-ocr/", "eng"))  {
      fprintf(stderr, "Could not initialize tesseract.\n");
      delete api;
      return 1;
    }

    IplImage *img = cvLoadImage("/home/user/sampleimage.png");
    if ( img == 0 ) {
      fprintf(stderr, "Cannot load input file!\n");
      api->End();
      delete api;
      return 1;
    }
    // Tesseract reads the pixels in place: bytes per pixel is nChannels
    // (8-bit channels), bytes per line is widthStep.
    api->SetImage((unsigned char*)img->imageData, img->width,
                   img->height, img->nChannels, img->widthStep);

    // be aware of tesseract coord systems starting at left top corner!
    // Arguments are left, top, width, height.
    api->SetRectangle(129, 184, 484, 108);
    char* outText = api->GetUTF8Text();
    printf("OCR output:\n\n");
    if (outText) {
      printf("%s", outText);  // never pass OCR text as the format string
    }

    api->Clear();
    api->End();
    delete [] outText;
    delete api;
    cvReleaseImage(&img);

    return 0;
}
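
SetRectangle() wants left, top, width, height, while boxes often arrive as two corners (for example the `bbox x0 y0 x1 y1` values in hOCR output mentioned earlier in the thread). A conversion could be sketched as below. `TessRect` and `cornersToTessRect` are hypothetical names, and clipping to the image bounds is an added precaution rather than something the API requires.

```cpp
#include <algorithm>

// Rectangle in the form SetRectangle() takes: top-left corner plus size.
struct TessRect {
    int left, top, width, height;
};

// Converts corner coordinates (x0, y0)-(x1, y1), with the origin at the
// top-left of the image, into left/top/width/height clipped to the image.
// Returns false if no part of the box lies inside the image.
bool cornersToTessRect(int x0, int y0, int x1, int y1,
                       int imageWidth, int imageHeight, TessRect& out) {
    const int left   = std::max(0, std::min(x0, x1));
    const int top    = std::max(0, std::min(y0, y1));
    const int right  = std::min(imageWidth,  std::max(x0, x1));
    const int bottom = std::min(imageHeight, std::max(y0, y1));
    if (right <= left || bottom <= top) return false;
    out.left = left;
    out.top = top;
    out.width = right - left;
    out.height = bottom - top;
    return true;
}
```

The result would then be passed on as `api->SetRectangle(r.left, r.top, r.width, r.height);`.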


--
Zdenko