Using tesseract-ocr from C/C++ programs


Tenzin Dendup

Sep 29, 2009, 2:25:35 AM
to tesser...@googlegroups.com
Hi,

I have trained Tesseract on the characters of my language, and I have also
written a simple program in C to do segmentation. Once the
segmentation is done, how can I use tesseract-ocr from within my C
program to do recognition? Do I have to run the tesseract command from
within C, or is there some other way? I am using tesseract-ocr 2.04 on
Debian GNU/Linux.

--Tenzin

Svetlin Nakov

Sep 29, 2009, 3:18:10 AM
to tesser...@googlegroups.com
You have the entire source code. You can modify it to fit your needs.

I usually take the Tesseract source code as a base, remove anything I don't
need, and write additional functionality in its main method.

Svetlin Nakov
Managing Partner
Consulting and Information Technology Agency
http://www.citagency.eu

现彪 邵

Sep 29, 2009, 3:47:41 AM
to tesseract-ocr
I'm new to Tesseract-OCR.

Adding system("tesseract_path/tesseract image_file output_file"); at
the end of your C program should be the easiest way.
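
The system() approach above could be sketched roughly as follows. This is only a minimal sketch: the helper names build_tesseract_command and run_tesseract are made up for illustration, it assumes the tesseract executable is on the PATH, and it assumes the usual "tesseract imagename outputbase -l lang" command line, which writes the recognized text to outputbase.txt.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "<exe> "<image>" "<outbase>" -l <lang>" in buf.
   Returns 0 on success, -1 if the command does not fit in buf.
   Note: the quoting is naive and does not escape quotes inside file names. */
int build_tesseract_command(char *buf, size_t size, const char *exe,
                            const char *image, const char *outbase,
                            const char *lang)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%s \"%s\" \"%s\" -l %s",
                     exe, image, outbase, lang);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: run tesseract on one segmented image.
   Returns the status reported by system(), or -1 if the command
   could not be built. */
int run_tesseract(const char *image, const char *outbase, const char *lang)
{
    char cmd[1024];
    if (build_tesseract_command(cmd, sizeof cmd, "tesseract",
                                image, outbase, lang) != 0)
        return -1;
    return system(cmd);
}
```

After each segment has been saved to an image file, a call such as run_tesseract("seg1.tif", "seg1", "eng") would leave the text in seg1.txt, which the C program can then read back with fopen(). The file names and the language code here are placeholders.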

You can also integrate your C program into Tesseract-OCR: just write the
pixel data of your segmented image into Tesseract's image buffer.


xbshao

Sep 29, 2009, 4:03:03 AM
to tesseract-ocr
By the way,

I am wondering how you do the segmentation. Does your C code
automatically locate the characters in the image, or do you manually crop the
image with a UI?

Any help would be appreciated. Thanks in advance.

Tenzin Dendup

Sep 29, 2009, 4:46:34 AM
to tesser...@googlegroups.com
The language that I am working on has characters that stack and
overlap one on top of another. Some stacked characters touch, some
don't. So I used a simple approach based on the vertical histogram, the
horizontal histogram, and connected components to separate the characters.
Those that are inseparable I trained in Tesseract as they are.
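
For readers wondering what the histogram step might look like, here is a minimal sketch in C. It is not the code described in the post: the names Span, vertical_histogram and split_on_gaps are made up, and it assumes a row-major binary image in which non-zero bytes are ink. It covers only the vertical histogram (ink count per column) and splitting at empty columns; the horizontal histogram works the same way over rows, and the connected-component pass is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* One run of adjacent columns that contain ink: [start, end). */
typedef struct { int start; int end; } Span;

/* Vertical histogram: hist[x] = number of ink (non-zero) pixels in column x.
   img is a row-major binary image, w pixels wide and h pixels high. */
void vertical_histogram(const unsigned char *img, int w, int h, int *hist)
{
    for (int x = 0; x < w; x++)
        hist[x] = 0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (img[y * w + x])
                hist[x]++;
}

/* Cut the histogram at empty bins. Stores at most max_spans spans
   and returns the number of spans stored. */
int split_on_gaps(const int *hist, int n, Span *spans, int max_spans)
{
    int count = 0;
    int start = -1;                 /* -1 means "not inside a span" */
    for (int i = 0; i <= n; i++) {
        int ink = (i < n) && hist[i] > 0;
        if (ink && start < 0) {
            start = i;              /* a span begins */
        } else if (!ink && start >= 0) {
            if (count < max_spans) {
                spans[count].start = start;
                spans[count].end = i;
                count++;
            }
            start = -1;             /* the span ends */
        }
    }
    return count;
}
```

Characters that touch leave no empty column between them, so this step alone cannot split them. That is the case the post handles with connected components or by training the combined shape as a single unit.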