Tesseract OCR in C# Application


Raj

Apr 8, 2009, 3:32:09 AM
to tesseract-ocr
Hi Friends,

I have generated the tessdata with the required 8 files to
recognize a 7-segment display, and when I use the command

"Tesseract.exe input.tif output.txt -l sun" (where sun is the
language name)

the Tesseract OCR engine works perfectly and produces 100% correct
results, as expected, without any ambiguity.

Now I want to use this in a C# application.

For this I'm using EmguCV 1.5.0.1, which includes a C# example for
OCR. This example ships with tessdata for the English language.

I replaced all 8 files in tessdata with the files I created. When I
run the application it completes without any errors but produces no
result.

Can anybody tell me why it gives no results when I use it in the
application, yet works perfectly when run outside the
application?

Please help me solve the problem.

Thanks in advance.

Regards,
Sunder Raj .M




Lothar

Apr 8, 2009, 5:39:10 AM
to tesseract-ocr
Did you try to use tessnet2 instead of EmguCV?

You will find it here: http://www.pixel-technology.com/freeware/tessnet2/

Best regards,
Lothar

Parmesh

Apr 8, 2009, 5:52:24 AM
to tesser...@googlegroups.com
Hi Lothar,
 
The given link is not working.
 
Cheers,
Parmesh


Parmesh

Apr 8, 2009, 5:54:46 AM
to tesser...@googlegroups.com
Hey sorry, it is working now
--
Thanks&Regards
Parmesh. A
voice:+919866443307

Raj

Apr 9, 2009, 1:32:24 AM
to tesseract-ocr
Hi Lothar,

Thanks for the reply.

In the OCR application I have added a reference to the tessnet2 DLLs.

When I run the C# application with the tessdata for the English
language, it works and produces output, but with some ambiguity.

To avoid that I created the 8 tessdata files, and I confirmed that
they work fine (100% correct results) with the command
"tesseract.exe input.tif output.txt -l sun".

So I thought of replacing the English tessdata in my OCR application
with the tessdata I created.

But then the application produces no result and terminates when it
executes the statement "ocr.init()".

When I switch back to the English tessdata, it produces results again.

I have no clue about this strange behaviour.

There seems to be no problem with the tessdata I created, because it
produces results when run independently of the C# application.

Please suggest where I'm going wrong.

Thanks in advance.
Regards.
Sunder Raj.M
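
One thing worth checking here, offered as a guess and not a confirmed diagnosis: Tesseract 2.x loads files named <lang>.<suffix> from tessdata, so if the application still asks for "eng" after the files were swapped for sun.*, init has nothing to load. A minimal Python sketch of that check follows. The helper name missing_tessdata_files is hypothetical, and the suffix list assumes the 8-file layout of a Tesseract 2.x language pack.

```python
from pathlib import Path

# Assumption: the 8 files that make up one Tesseract 2.x language pack.
TESS2_SUFFIXES = (
    "DangAmbigs", "freq-dawg", "inttemp", "normproto",
    "pffmtable", "unicharset", "user-words", "word-dawg",
)


def missing_tessdata_files(tessdata_dir, lang):
    """Return the expected <lang>.<suffix> files that are absent from tessdata_dir."""
    folder = Path(tessdata_dir)
    return [
        f"{lang}.{suffix}"
        for suffix in TESS2_SUFFIXES
        if not (folder / f"{lang}.{suffix}").is_file()
    ]
```

Calling missing_tessdata_files("tessdata", "sun") and then the same with "eng" shows which language name the folder can actually satisfy; the language string passed to init in the application would need to match.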

mrigendra lal shrestha

May 5, 2009, 7:47:19 AM
to tesser...@googlegroups.com
Hey Sunder,
I had a similar problem with Tesseract. It worked very nicely from the
command prompt but not at all in VC++ .NET.
You should first convert the image into 1bpp (1 bit per pixel) format.
Search Google for code to convert an image to 1bpp. I forgot the name,
but I'm sure the code is out there; I think it's VB code, so you will
have to modify it a little.
Use that code to convert your image into a 1bpp bitmap, then pass it
to Tesseract for further processing. This really works. If not, mail
me back and I'll send you a link to our version of Tesseract (mine and
my supervisor's), which was built using C++ .NET 2003. It's not
complete, but it works fine for the English language!

Note: 1bpp is not supported in versions newer than .NET 2003!

cheers,
yaamchha!!!
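
For readers unsure what a 1bpp conversion involves, here is a rough Python sketch, not the VB code mentioned above. It thresholds 8-bit grayscale rows and packs eight pixels per byte, most significant bit first. The function name to_1bpp, the default threshold of 128, and the choice that light pixels become 1 bits are all assumptions made for illustration.

```python
def to_1bpp(gray_rows, threshold=128):
    """Pack 8-bit grayscale rows into 1-bit-per-pixel rows (MSB first).

    A pixel at or above `threshold` becomes a 1 bit, anything darker a 0 bit.
    Each output row is padded with 0 bits up to a whole byte.
    """
    packed = []
    for row in gray_rows:
        out = bytearray((len(row) + 7) // 8)
        for x, value in enumerate(row):
            if value >= threshold:
                # Set the bit for column x, counting from the high bit.
                out[x // 8] |= 0x80 >> (x % 8)
        packed.append(bytes(out))
    return packed
```

This only covers the pixel packing; an actual bitmap file format may add its own header and row padding on top.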

Raj

May 6, 2009, 6:18:15 AM
to tesseract-ocr
Hi Shrestha,
Thanks for the reply.
As I said, I'm using EmguCV 1.5.0.1 (a cross-platform .NET wrapper for
OpenCV), and my C# application is now producing correct results.

As you said, I first convert the image to a single colour
(grayscale) and then process it for noise elimination. I then pass
the result to the Tesseract OCR engine.

It is working perfectly.

My only remaining problem is that the image must not have any noise;
if it does, the result contains some extra characters.

Right now I'm working on eliminating noise from the image before
passing it to Tesseract.

Anyway, thanks for the reply.

I would definitely like to see your application.

Waiting for your reply.
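
As an illustration of the kind of noise elimination described above, here is a small Python sketch that clears isolated specks from a binarised image. It is not the EmguCV processing used in the application; the function name remove_isolated_pixels, the 0/1 list-of-lists image layout, and the 8-neighbour rule are assumptions chosen to keep the example self-contained.

```python
def remove_isolated_pixels(image, min_neighbours=1):
    """Clear foreground pixels (1) that have fewer than min_neighbours set 8-neighbours.

    `image` is a list of rows, each a list of 0/1 values; a cleaned copy is returned.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Count set pixels among the up-to-8 surrounding positions.
            neighbours = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbours += image[ny][nx]
            if neighbours < min_neighbours:
                cleaned[y][x] = 0
    return cleaned
```

A rule this simple only removes single-pixel specks; larger blobs of noise would need something stronger, such as a median filter or discarding small connected components.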

SteveP

May 6, 2009, 1:49:51 PM
to tesseract-ocr
I ran into this once where the application terminated in ocr.init...
I remember there was something about needing to have/copy the exe file
into the directory at the higher level. Maybe it mattered whether it
was a Release build or Debug. I have to keep this quick; you can
look in my previous posts for more details. Remi Thomas has also
posted some things to check for.