Tesseract Training Error (How can I train handwriting?)


Tobias Schwarz

Apr 22, 2014, 7:15:15 AM
to tesser...@googlegroups.com

Hi,

I have only just taken my first steps with Tesseract, so I am really a newbie.

The idea is to train Tesseract to recognize handwriting.

I use 

tesseract v3.02

cowboxer v1.02

_______________________________________

After I failed to install most of the box tools for Tesseract

because of missing DLL files and errors like the one under -1- below,

------------------------------------------------------------------


I found the Cowboxer tool, which as far as I can tell works perfectly well.

I have already created a TIFF image and a box file for training and given them the same name.

The next step is the real problem: I tried to start training from the command line in both of the ways described in the manual here (see -2- below):

https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3


Is anyone familiar with this problem and knows how to solve it?

Or has anyone trained Tesseract on handwriting before and can offer some helpful practical tips, tricks, or documentation?


-2-

C:\Users\alias\Desktop\Tessdata>tesseract [deu].[handschrift].exp[0].tif [deu

.[handwriting].exp[0] box.train.stderr

Tesseract Open Source OCR Engine v3.02 with Leptonica

Cannot open input file: [deu].[ handwriting].exp[0].tif

 

C:\Users\alias\Desktop\Tessdata>tesseract [deu-frak].[ handwriting].exp[0].tif

[deu-frak].[ handwriting].exp[0] box.train

Tesseract Open Source OCR Engine v3.02 with Leptonica

Cannot open input file: [deu-frak].[ handwriting].exp[0].tif

--------------------------------------------------------------------------------------------------------------------


  -1-

See the end of this message for details on invoking

just-in-time (JIT) debugging instead of this dialog box.

************** Exception Text **************

System.ArgumentException: The UNC path should be of the form \\server\share.

   at System.IO.Path.NormalizePath(String path, Boolean fullCheck, Int32 maxPathLength)

   at System.IO.File.InternalCopy(String sourceFileName, String destFileName, Boolean overwrite)

   at SerakTesseractTrainer.TessMain.copyFiles()

   at SerakTesseractTrainer.MainForm.AddImagesToProject(Object sender, EventArgs e)

   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)

   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)

   at System.Windows.Forms.Control.WndProc(Message& m)

   at System.Windows.Forms.ButtonBase.WndProc(Message& m)

   at System.Windows.Forms.Button.WndProc(Message& m)

   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)

 

 

************** Loaded Assemblies **************

mscorlib

    Assembly Version: 4.0.0.0

    Win32 Version: 4.0.30319.1 (RTMRel.030319-0100)

    CodeBase: file:///C:/Windows/Microsoft.NET/Framework64/v4.0.30319/mscorlib.dll

----------------------------------------

SerakTesseractTrainer

    Assembly Version: 1.0.0.0

    Win32 Version: 1.0.0.0

    CodeBase: file:///F:/Tesseract/Tesseract%20InstallationTraining/BoxToolsData/SerakTesseractTrainer_1.exe

----------------------------------------

System.Windows.Forms

    Assembly Version: 4.0.0.0

    Win32 Version: 4.0.30319.1 built by: RTMRel

    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Windows.Forms/v4.0_4.0.0.0__b77a5c561934e089/System.Windows.Forms.dll

----------------------------------------

System.Drawing

    Assembly Version: 4.0.0.0

    Win32 Version: 4.0.30319.1 built by: RTMRel

    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Drawing/v4.0_4.0.0.0__b03f5f7f11d50a3a/System.Drawing.dll

----------------------------------------

System

    Assembly Version: 4.0.0.0

    Win32 Version: 4.0.30319.1 built by: RTMRel

    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System/v4.0_4.0.0.0__b77a5c561934e089/System.dll

----------------------------------------

System.Xml

    Assembly Version: 4.0.0.0

    Win32 Version: 4.0.30319.1 built by: RTMRel

    CodeBase: file:///C:/Windows/Microsoft.Net/assembly/GAC_MSIL/System.Xml/v4.0_4.0.0.0__b77a5c561934e089/System.Xml.dll

----------------------------------------

 

************** JIT Debugging **************

To enable just-in-time (JIT) debugging, the .config file for this

application or computer (machine.config) must have the

jitDebugging value set in the system.windows.forms section.

The application must also be compiled with debugging

enabled.

 

For example:

 

<configuration>

    <system.windows.forms jitDebugging="true" />

</configuration>

 

When JIT debugging is enabled, any unhandled exception

will be sent to the JIT debugger registered on the computer

rather than be handled by this dialog box.

 


Nick White

Apr 23, 2014, 10:46:03 AM
to tesser...@googlegroups.com
Hi Tobias,

You're misreading the wiki slightly. The parts in square brackets in
commands mean "replace this with your actual names as appropriate".

So on the wiki:
tesseract [lang].[fontname].exp[num].tif [lang].[fontname].exp[num] box.train
Means something like this (for example):
tesseract deu.times.exp0.tif deu.times.exp0 box.train

That's a common idiom for computer instructions, so remember it ;)

Nick
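
As an illustration of that substitution, here is a minimal Python sketch that fills in the three placeholders and builds the command. The helper name box_train_command is hypothetical and not part of Tesseract; it only assembles the argument list and does not run anything.

```python
def box_train_command(lang, fontname, num):
    """Build the box.train command from the wiki template.

    Template: tesseract [lang].[fontname].exp[num].tif [lang].[fontname].exp[num] box.train
    The square brackets are placeholders and never appear in the real command.
    """
    base = f"{lang}.{fontname}.exp{num}"
    return ["tesseract", f"{base}.tif", base, "box.train"]


if __name__ == "__main__":
    # Prints: tesseract deu.handwriting.exp0.tif deu.handwriting.exp0 box.train
    print(" ".join(box_train_command("deu", "handwriting", 0)))
```

The resulting list could be passed to subprocess.run from the folder that holds the .tif and .box files, assuming tesseract is on the PATH.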

Awsomo :(

Apr 25, 2014, 5:18:32 AM
to tesser...@googlegroups.com
Thank you, I will keep that in mind.

After I renamed the TIFF and box files to:

deu.handwriting.exp0.tif
and
deu.handwriting.exp0.box

I get a result now.

The image (TIFF file) opens up after executing the command:

deu.handwriting.exp0.tif deu.handwriting.exp0 box.train

The issue now is that no .tr file is generated or shown in the Tessdata folder, and the training does not show and report like this

Awsomo :(

Apr 25, 2014, 6:40:13 AM
to tesser...@googlegroups.com
And now it is getting worse:
I tried again and now I am getting a message, but that isn't any better.

 

 

Nick White

Apr 29, 2014, 10:16:17 AM
to tesser...@googlegroups.com
On Fri, Apr 25, 2014 at 02:18:32AM -0700, Awsomo :( wrote:
> the issue now is that in the _Tessdata Folder_ is no _tr_File_generated/shown
> and it doesn´t show and report to the training like this

The .tr files are saved in your working directory. So in this
example they will be in the folder:
C:\Users\Michael\Documents\Visual Studio 2008\Projects\Project1\OCRTest\TIFFMaker
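
To check where the output went, a small Python sketch like the following lists the .tr files in a given directory (the current working directory by default). The function name find_tr_files is made up for this example.

```python
from pathlib import Path


def find_tr_files(directory="."):
    """Return the names of all .tr files directly inside `directory`, sorted."""
    return sorted(p.name for p in Path(directory).glob("*.tr"))


if __name__ == "__main__":
    # Run this from the same folder in which the tesseract command was run.
    print("Working directory:", Path.cwd())
    print(".tr files found:", find_tr_files())
```

If the list is empty in the folder where the command was run, the training step most likely did not complete.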

Nick White

Apr 29, 2014, 10:22:46 AM
to tesser...@googlegroups.com
On Fri, Apr 25, 2014 at 03:40:13AM -0700, Awsomo :( wrote:
> and now its getting worse,
> i tried again now i am getting message but that aint better..

The "FAILURE! Couldn't find a matching blob" message means that the
positions at which the box file describes a character don't appear to
match up with the image provided. A few of these are inevitable,
but if it happens for every character, something must be wrong. Check your box
file with a GUI box editor (see this wiki page:
https://code.google.com/p/tesseract-ocr/wiki/AddOns).
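
A quick sanity check along these lines can be sketched in Python before opening a GUI editor. It assumes the Tesseract 3 box format of one character per line as "char left bottom right top page", with coordinates measured from the bottom-left corner of the image. The function name check_box_lines and the idea of passing the image width and height by hand are assumptions for this example. It only catches boxes that are inverted or fall outside the image, not boxes that are merely misplaced.

```python
def check_box_lines(lines, width, height):
    """Return (line_number, message) pairs for boxes that cannot match the image.

    Assumed line format: char left bottom right top [page]
    Assumed origin: bottom-left corner of the image, in pixels.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) < 5:
            problems.append((lineno, "expected: char left bottom right top [page]"))
            continue
        try:
            left, bottom, right, top = (int(v) for v in parts[1:5])
        except ValueError:
            problems.append((lineno, "coordinates must be integers"))
            continue
        if not (0 <= left < right <= width):
            problems.append((lineno, "x range is inverted or outside the image"))
        if not (0 <= bottom < top <= height):
            problems.append((lineno, "y range is inverted or outside the image"))
    return problems


if __name__ == "__main__":
    sample = ["a 10 10 20 30 0", "b 90 10 120 30 0", "c 5 40 15 20 0"]
    for lineno, message in check_box_lines(sample, width=100, height=50):
        print(f"line {lineno}: {message}")
```

If nearly every line is reported, one possible cause is a box file written with a top-left origin or for an image of a different size.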

Awsomo :(

May 2, 2014, 6:00:01 AM
to tesser...@googlegroups.com
There is another problem as well:
-Visual C++ 2005 installed.
-Visual C++ 2008 installed.
-Visual C++ 2010 installed.

I have already tried installing all of them. Either something is missing or debugging needs to be activated.

During the installation, errors were reported, for example about mingwm10.dll.

I have already tried to fix this: I downloaded mingwm10.dll, but that didn't work.
I also tried to find information on how to activate debugging and how to open the Visual C++ user interface.

<configuration>
    <system.windows.forms jitDebugging="true" />
</configuration>
If I execute this at the command prompt, I get a syntax error.

This was not helpful either. I don't know which dialog they are talking about; I have an error dialog where I can't select any options.

Or: The UNC path should be of the form \\server\share

I see no way to fix this.

Awsomo :(

May 6, 2014, 7:11:47 AM
to tesser...@googlegroups.com

Thank you for responding; I am really grateful for that. I have solved that problem now!

jrsh...@vizzitec.com

Jun 3, 2014, 5:58:08 AM
to tesser...@googlegroups.com
Hi Awsomo,

I just need to know how to solve this JIT error in the Serak Tesseract Trainer, which appears when pressing "Testing Tesseract".
I have a box file and a TIFF file but still get this error. Please check the attached image.

Thanks in advance
Error.jpg

jrsh...@vizzitec.com

Jun 3, 2014, 6:03:24 AM
to tesser...@googlegroups.com

How can I solve this issue?




On Tuesday, May 6, 2014 4:11:47 AM UTC-7, Awsomo wrote:
Error.jpg