Integrating Tesseract with another open source project


Thilanka

May 21, 2010, 1:21:55 PM
to tesseract-ocr
Hi,

I'm working on the Sahana OCR project for my GSoC session, and I'm
planning to use Tesseract for its character recognition. (Sahana OCR is
an open source project.) The Sahana OCR code is written in Visual C++,
and we cannot use the Tesseract exe for our project, so I'm planning to
integrate the Tesseract code with the Sahana OCR code. However, I don't
have a good understanding of the Tesseract architecture or of how to
integrate the two source trees. Can someone please help me with this
problem?

Regards,
Thilanka.



--
http://coders-view.blogspot.com/
http://thilankagekawuluwa.blogspot.com/
http://twitter.com/thilanka_k

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to tesser...@googlegroups.com.
To unsubscribe from this group, send email to tesseract-oc...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.

Zdenko Podobný

May 22, 2010, 4:40:13 AM
to tesser...@googlegroups.com
see http://code.google.com/p/tesseract-ocr/wiki/ReadMe:

Another important change is that you should really be using TessBaseAPI if you are linking with another program. In Linux (non-Windows) the main library is now libtesseract_api.a instead of the old libtesseract_full.a. In windows, use the define TESSDLL_IMPORTS before including baseapi.h in your code to get the symbols of the TessBaseAPI class.
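
For context, a define such as TESSDLL_IMPORTS usually selects how a header decorates its classes for DLL linkage. The following is only a sketch of that general Windows convention, not Tesseract's actual header: the names OCRLIB_API, OCRLIB_IMPORTS, OCRLIB_EXPORTS, Recognizer and LinkageMode are hypothetical and chosen purely for illustration.

```cpp
#include <string>

// Hypothetical macro names, for illustration only. A real library header
// (for example Tesseract's baseapi.h) uses its own names for this switch.
#if defined(_WIN32) && defined(OCRLIB_IMPORTS)
  // Client code: the class lives in a DLL and is resolved through the
  // DLL's import library at link time.
  #define OCRLIB_API __declspec(dllimport)
  #define OCRLIB_LINKAGE "dllimport"
#elif defined(_WIN32) && defined(OCRLIB_EXPORTS)
  // Library build: the class is exported from the DLL.
  #define OCRLIB_API __declspec(dllexport)
  #define OCRLIB_LINKAGE "dllexport"
#else
  // Static build or non-Windows platform: no decoration is needed.
  #define OCRLIB_API
  #define OCRLIB_LINKAGE "plain"
#endif

// A stand-in class declared the way a DLL header would declare it.
class OCRLIB_API Recognizer {
 public:
  std::string Name() const { return "stub recognizer"; }
};

// Reports which branch of the macro switch was taken in this translation unit.
inline std::string LinkageMode() { return OCRLIB_LINKAGE; }
```

With the import macro defined, every use of the class refers to a symbol that has to come from the DLL's import library at link time. A missing library therefore tends to show up as an unresolved external symbol at link time rather than as a compile error.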

Zd.

Thilanka

May 22, 2010, 1:38:39 PM
to tesseract-ocr
Hi Zdenko,

Thank you very much for the tips. I'll contact you if I run into any
problems with this.

Regards,
Thilanka.


zdenko podobny

May 23, 2010, 2:53:00 PM
to tesser...@googlegroups.com
Hi,

It would be better to use the forum than to contact me directly. (I am not a programmer; I am just a user who tries to read the documentation :-) )

Zd.

david.torne

May 23, 2010, 7:40:17 PM
to tesseract-ocr
I am also a programmer (engineer), and I am integrating Tesseract into
the gttext.googlecode.com project.

I found this, have a look...

http://code.google.com/p/tesseract-ocr/issues/detail?id=297

Good Luck in your Google Summer of Code project!!

I have tested Tesseract, and it is fun to see how big letters such as H
sometimes come out as l - l, etc.
The results are still quite out of context. I'm waiting for the new release.
(Of course, I did not give it easy test cases.)

Cheers




Thilanka

May 25, 2010, 12:49:14 PM
to tesseract-ocr
Hi Zdenko Podobny,

I tried to write sample code following your instructions. I have added
all the header files and the lib files to the project.
Then I included baseapi.h and wrote the following:

#define TESSDLL_IMPORTS
#include "stdafx.h"
#include "baseapi.h"
#include <string>

using namespace std;

int main(int argc, char **argv)
{
    string outfile;
    tesseract::TessBaseAPI api;

    return 0;
}

When I added the line <tesseract::TessBaseAPI api;> the build gave 3 errors:

1>Testing_tesseract.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) public: virtual __thiscall tesseract::TessBaseAPI::~TessBaseAPI(void)" (__imp_??1TessBaseAPI@tesseract@@UAE@XZ) referenced in function _main
1>Testing_tesseract.obj : error LNK2019: unresolved external symbol "__declspec(dllimport) public: __thiscall tesseract::TessBaseAPI::TessBaseAPI(void)" (__imp_??0TessBaseAPI@tesseract@@QAE@XZ) referenced in function _main
1>C:\Users\hp\Documents\Visual Studio 2008\Projects\Testing_tesseract\Debug\Testing_tesseract.exe : fatal error LNK1120: 2 unresolved externals

How can I solve this problem? Thanks in advance for any help.

Regards,
Thilanka


Jimmy O'Regan

May 25, 2010, 1:08:46 PM
to tesser...@googlegroups.com
On 25 May 2010 17:49, Thilanka <lgtkau...@gmail.com> wrote:
> I tried to write sample code following your instructions. I have added
> all the header files and the lib files to the project.

It seems from your error that the headers are included ok, but the lib
isn't - that's what you should be checking.

--
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.

david.torne

May 26, 2010, 9:09:11 PM
to tesseract-ocr
I added tessdll.bin under Project Properties -> Linker -> Input, and in
the C++ properties I also added the folder where the binary is located.
I pointed directly to the main Tesseract folder on disk.
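
As an alternative to the project-settings route, MSVC also lets a source file request a library itself with #pragma comment(lib, ...). This is a minimal sketch under two assumptions: that the import library is called tessdll.lib (an assumed name; use whatever your Tesseract build actually produced) and that its folder is on the linker's library search path. AutoLinkRequested is a hypothetical helper added only to make the effect visible.

```cpp
// Assumption: the import library is named "tessdll.lib" and its folder has
// been added to the linker's library search path. Adjust to match your build.
#ifdef _MSC_VER
  // MSVC-specific: records a request in the object file so that the linker
  // searches for this library, as if it were listed under Linker -> Input.
  #pragma comment(lib, "tessdll.lib")
  #define AUTO_LINK_REQUESTED 1
#else
  // Other compilers do not support this pragma form; pass the library on the
  // link command line instead.
  #define AUTO_LINK_REQUESTED 0
#endif

// Hypothetical helper: reports whether this translation unit asked the
// linker for the library via the pragma above.
bool AutoLinkRequested() { return AUTO_LINK_REQUESTED == 1; }
```

Either way, the linker still has to be able to find the file, so the library directory setting matters as much as the library name.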

zdenko podobny

Oct 4, 2013, 1:39:10 PM
to tesser...@googlegroups.com
Sample code is on the wiki (APIExample).

Zdenko


On Fri, Oct 4, 2013 at 4:40 PM, Veerendra Jonnalagadda <jveer...@gmail.com> wrote:
> Hi,
> I am looking for sample code for extracting text from an image. Kindly share some sample code, as it would help me understand.
> Regards,
> Veerendra

