Tracking the operations in Tesseract (Debugging Process)


sibi kanagaraj

Jul 25, 2014, 1:08:51 AM
to tesser...@googlegroups.com
Dear all,

I would like to see how Tesseract works: for example, how line segmentation happens, how word recognition and classification happen, etc. How am I supposed to "see" it? The command "tesseract" with input and output files gives me output, but I need to see step-by-step execution; in short, a debugging process with various watch points. Please let me know. I am more eager to learn the engine than to get the stand-alone output.

-Sibi

zdenko podobny

Jul 25, 2014, 3:15:46 AM
to tesser...@googlegroups.com
Have a look at the ViewerDebugging wiki [1], Dmitri Silaev's blog [2], and maybe the slides from the Tutorial on Tesseract presented at DAS 2014 [3].


Zdenko



sibi kanagaraj

Jul 25, 2014, 10:47:53 AM
to tesser...@googlegroups.com
Hi Zdenko,

Thank you for the reply. I will check them and post back the results and any questions. I am testing the engine for Tamil. If I can see the work module by module, it will be of great help to me in removing ambiguities and working on it.

Once again, thank you for the wonderful links.

-Sibi

sibi kanagaraj

Jul 31, 2014, 2:33:10 AM
to tesser...@googlegroups.com
Hi,

This is with respect to the debugging process.

I have followed the steps given here.

///
"

Building and installing

On Linux:
  • Copy piccolo2d-core-3.0.jar and piccolo2d-extras-3.0.jar to tesseract/java.
  • cd java
  • make ScrollView.jar
  • Set the SCROLLVIEW_PATH environment variable to point to your java directory containing all 3 jar files."

///
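
As a quick sanity check for the last step quoted above, the following shell sketch verifies that a directory contains the three jar files before SCROLLVIEW_PATH is pointed at it. The function name check_scrollview_dir and the example path are hypothetical; the jar file names are the ones listed in the quoted steps.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if DIR holds the three jar files
# named in the quoted instructions; print each one that is missing.
check_scrollview_dir() {
    dir=$1
    missing=0
    for jar in piccolo2d-core-3.0.jar piccolo2d-extras-3.0.jar ScrollView.jar; do
        if [ ! -f "$dir/$jar" ]; then
            echo "missing: $dir/$jar"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (the path is an assumption; substitute your own java directory):
#   check_scrollview_dir "$HOME/tesseract/java" &&
#       export SCROLLVIEW_PATH="$HOME/tesseract/java"
```

If ScrollView.jar is reported missing, the make step has not succeeded yet, so setting SCROLLVIEW_PATH would not help at that point.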

The problem I am facing is this:

////////////////////////////////////////////////////////////////////
[root@localhost java]# make ScrollView.jar
make: *** No rule to make target `ScrollView.jar'.  Stop
///////////////////////////////////////////////////////////////////

What am I supposed to do at this point to resolve it?

- Sibi

zdenko podobny

Aug 2, 2014, 4:23:41 PM
to tesser...@googlegroups.com
You need to provide more information. What version of Tesseract do you use? How did you configure Tesseract? Etc.

Zdenko


sibi kanagaraj

Aug 27, 2014, 10:28:57 AM
to tesser...@googlegroups.com
Hello Zdenko,

Sorry for my late (very late) response. Initially I was working with Fedora 19; now I have switched to Ubuntu.

After the switch:

1. I installed Tesseract using

sudo apt-get install tesseract-ocr

2. Then I downloaded the two jar files (piccolo2d-core-3.0.jar and piccolo2d-extras-3.0.jar) and, using Nautilus, moved them to tesseract/java.

3. After that I cd'd into java.

4. Then I ran make ScrollView.jar.

5. It gave me this error:

make: *** No rule to make target `ScrollView.jar'.  Stop.

Extra information:
In the tesseract.spec file, I see the following:

Name:           tesseract
Version:        3.00
Release:        1%{?dist}
Summary:        Raw Open source OCR Engine

If I need to download tesseract-ocr-3.02.02.tar.gz and build it along with Leptonica, I am ready to do that.

zdenko podobny

Aug 27, 2014, 5:43:46 PM
to tesser...@googlegroups.com
Yes, you will need to download the Tesseract source code and configure it:
./autogen.sh && ./configure
Then "make ScrollView.jar" should work for you.


Zdenko
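
The sequence described above could be sketched as the shell function below. The function name build_scrollview and the example path are hypothetical. The sketch assumes the two piccolo2d jar files have already been copied into the java directory of the source tree, and that the ScrollView.jar target is built from that java directory, as the ViewerDebugging steps quoted earlier in the thread suggest.

```shell
#!/bin/sh
# Hypothetical helper: configure a Tesseract source tree, then build
# ScrollView.jar in its java/ directory.
build_scrollview() {
    src=$1
    # A packaged install has no build scripts, so stop early with a
    # clear message instead of a confusing make error.
    if [ ! -f "$src/autogen.sh" ]; then
        echo "no autogen.sh in $src: not a Tesseract source tree" >&2
        return 1
    fi
    (
        cd "$src" &&
        ./autogen.sh &&
        ./configure &&      # assumed to generate the Makefiles used below
        cd java &&
        make ScrollView.jar
    )
}

# Usage (the path is an assumption; point it at your unpacked source):
#   build_scrollview "$HOME/tesseract-ocr"
```

Running the steps in a subshell keeps the caller's working directory unchanged, and chaining with && stops at the first step that fails.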


sibi kanagaraj

Oct 12, 2014, 12:58:29 AM
to tesser...@googlegroups.com
Hi Zdenko,

You nailed it.

Here is what I did initially.

Having followed this,
https://code.google.com/p/tesseract-ocr/wiki/ViewerDebugging
I:
1. Downloaded the 2 jar files.
2. Created a new folder in tesseract-ocr, which was under /usr/share.
3. Hence I now have /usr/share/tesseract-ocr/java.
4. Then I cd'd into java.
5. Ran the command make ScrollView.jar:
//
sibi@Sibi:/usr/share/tesseract-ocr/java$ ls
piccolo2d-core-3.0.jar  piccolo2d-extras-3.0.jar
sibi@Sibi:/usr/share/tesseract-ocr/java$ make ScrollView.jar
make: *** No rule to make target `ScrollView.jar'.  Stop.
//
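
The listing above shows only the two jar files and no makefile, which is consistent with the "No rule to make target" error: make has nothing in that directory telling it how to build ScrollView.jar. A small check along these lines can confirm this before running make. The function name has_makefile is hypothetical, and it only looks for the default file names that GNU make searches for.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR contains one of the default
# makefile names that GNU make looks for.
has_makefile() {
    for name in GNUmakefile makefile Makefile; do
        [ -f "$1/$name" ] && return 0
    done
    return 1
}

# Usage:
#   has_makefile . || echo "no makefile here; run make from a configured source tree"
```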

Later on I cloned the source:
https://code.google.com/p/tesseract-ocr/source/checkout

That should probably help, as I see a java folder inside Tesseract-ocr.

I have another question, which would probably go beyond the scope of this thread, so I will create another thread and link it if necessary.

Pranav Parmar

Jan 11, 2020, 4:37:29 AM
to tesseract-ocr
Hi Sibi!
Did you manage to understand how Tesseract works?
I have installed the debugging tools, but I cannot understand the step-by-step process that Tesseract follows for each input.
I want to understand how it works and train Tesseract on new fonts efficiently.