mftraining produces "Missing font_properties"

Eyal

May 17, 2011, 3:08:53 AM
to tesser...@googlegroups.com
Hi,

I tried to train some letters, and when I ran mftraining with the parameters:
mftraining -U unicharset -O lang.unicharset font1.tr
I received the error message "Missing font_properties".

I'm working on Windows 7 with Visual Studio 2010.

When I use the precompiled mftraining.exe for Windows, I do NOT get this error, and I get decent results from the trained file.

Just to be sure that the text I have is not the problem, I ran the same test on eurotext.tif, and I still get the error.

Any clue?

Thank you!

zdenko podobny

May 17, 2011, 3:17:19 AM
to tesser...@googlegroups.com
Yes, I have a clue: you neither read the documentation [1] nor googled for a solution ;-)

Zdenko

 
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to tesser...@googlegroups.com
To unsubscribe from this group, send email to
tesseract-oc...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

Eyal

May 17, 2011, 5:58:24 AM
to tesser...@googlegroups.com
Quite a good guess, but I'm very disappointed to say that I DID read the documentation...

And I even ran the following command:

mftraining -F font_properties -U unicharset font1.tr

And I got output that doesn't show any error:

Reading font1.tr ...

Writing Merged Microfeat ...Done!

The font_properties file contains one line as follows:

font1 0 0 0 0 0

And then I ran the command:

mftraining -U unicharset -O lang1.unicharset font1.tr

And I'm getting the following results:

Reading font1.tr ...
font1 has no defined properties.
!"Missing font_properties entry is a fatal error!":Error:Assert failed:in file ..\training\mftraining.cpp, line 287

Another guess?
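
Judging from the logs in this thread, the "has no defined properties" message appears when mftraining has no font_properties entry for the font it derives from the .tr file name, whether because -F was not passed or because the file has no line for that font. A minimal Python sketch of that lookup follows. The naming rule is only inferred from the two logs here (font1.tr gives font1, eng.fontfile.exp0.tr gives fontfile), and both function names are made up for illustration; none of this is tesseract code.

```python
# Illustrative check, not tesseract code. The naming rule below is an
# assumption inferred from the logs in this thread:
#   "font1.tr" -> font1, "eng.fontfile.exp0.tr" -> fontfile
import os


def font_name_from_tr(path):
    """Guess the font name that will be looked up for a .tr file."""
    parts = os.path.basename(path).split(".")
    # lang.fontname.expN.tr has four parts; a bare name.tr has two.
    return parts[1] if len(parts) >= 4 else parts[0]


def missing_fonts(tr_paths, font_properties_text):
    """Return font names that have no line in the font_properties text."""
    defined = {line.split()[0]
               for line in font_properties_text.splitlines() if line.strip()}
    return [name for name in map(font_name_from_tr, tr_paths)
            if name not in defined]


print(missing_fonts(["font1.tr"], "font1 0 0 0 0 0\n"))
print(missing_fonts(["eng.fontfile.exp0.tr"], "font1 0 0 0 0 0\n"))
```

An empty list means every .tr file has a matching line; any name returned is a font that would still need an entry.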

zdenko podobny

May 17, 2011, 11:55:09 AM
to tesser...@googlegroups.com
Why didn't you run 'mftraining -F font_properties -U unicharset -O lang1.unicharset font1.tr'?
The documentation says that font_properties is required for 3.01 (i.e. you do not need '-F font_properties' for 3.00, but you do need it for 3.01).

BTW: 'mftraining --help' will show other options for mftraining

Zdenko
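
For anyone scripting this step, here is a small Python sketch that assembles the command line described above, with -F included unless it is switched off for 3.00. The helper name `mftraining_args` and its parameters are hypothetical; only the flags -F, -U and -O and the file names come from this thread.

```python
# Sketch only: build the mftraining argument list discussed in this thread.
# The function and parameter names are illustrative assumptions.
def mftraining_args(tr_files, lang, font_properties="font_properties"):
    """Return the command as a list; pass font_properties=None for 3.00."""
    args = ["mftraining"]
    if font_properties is not None:
        # Needed for 3.01 according to the reply above.
        args += ["-F", font_properties]
    args += ["-U", "unicharset", "-O", f"{lang}.unicharset"]
    return args + list(tr_files)


print(" ".join(mftraining_args(["font1.tr"], "lang1")))
```

Returning a list rather than a single string keeps the arguments separate, so the result can be handed to a process launcher without re-quoting.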

Eyal

May 18, 2011, 7:15:02 AM
to tesser...@googlegroups.com
WOW!!!

It worked.

If you look again at the training manual, you'll see that it never shows -F and -O combined, and that's why I didn't write such a command.

I'm starting to think that sometimes you need to understand the instructions and not just run them like a robot...

Thank you very much!!!

78yrsold

May 18, 2011, 7:58:00 AM
to tesseract-ocr
Congratulations! It would be nice if you posted the full command line you used,
for the benefit of the forum's users.

Eyal

May 18, 2011, 10:22:59 AM
to tesser...@googlegroups.com
The only difference is the line I changed, but I'll send the whole batch I wrote:

tesseract.exe font1.tif font1 nobatch box.train

unicharset_extractor font1.box

mftraining -F font_properties -U unicharset  -O lang1.unicharset font1.tr 
cntraining font1.tr

copy normproto lang1.normproto
copy inttemp lang1.inttemp
copy pffmtable lang1.pffmtable
copy Microfeat lang1.Microfeat 

combine_tessdata lang1.

Good luck to all of you.
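
The same sequence can be sketched as a dry run in Python, which only builds the command lines so they can be reviewed or written to a .bat file. The function name `training_commands` and its two parameters are made up for illustration; the commands themselves are copied from the batch above.

```python
# Dry-run sketch of the batch above: it builds the command lines and
# does not execute anything. Names here are illustrative assumptions.
def training_commands(font, lang):
    """Return the batch steps as strings, in the order posted above."""
    cmds = [
        f"tesseract.exe {font}.tif {font} nobatch box.train",
        f"unicharset_extractor {font}.box",
        f"mftraining -F font_properties -U unicharset -O {lang}.unicharset {font}.tr",
        f"cntraining {font}.tr",
    ]
    # Copy each training output under the language prefix.
    for part in ("normproto", "inttemp", "pffmtable", "Microfeat"):
        cmds.append(f"copy {part} {lang}.{part}")
    cmds.append(f"combine_tessdata {lang}.")
    return cmds


for line in training_commands("font1", "lang1"):
    print(line)
```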

Sriranga(78yrsold)

May 18, 2011, 10:29:37 AM
to tesser...@googlegroups.com
Eyal,
Thank you very much for your prompt reply. I find it very useful for the users.
Awaiting the whole batch you wrote. Keep up the good work.
Wishing you good luck and success in your mission,
-sriranga(78yrs)

Sriranga(78yrsold)

May 18, 2011, 12:55:24 PM
to tesser...@googlegroups.com
Dear Eyal,
A small modification: instead of "copy normproto lang1.normproto", why not use
"rename normproto lang1.normproto", etc.?
With Best Wishes,
-sriranga(78yrs)

Eyal

May 19, 2011, 3:24:54 AM
to tesser...@googlegroups.com
Sriranga,

I wrote "rename" in my first version of the batch, but when a file with the target name already existed, the rename command failed.

Anything else?

Eyal.

Sriranga(78yrsold)

May 19, 2011, 6:19:05 AM
to tesser...@googlegroups.com
Eyal,
I tested with "rename" and it works for me. All I did was replace "copy"
in the command line with "rename", nothing else. An extract of the cmd session is reproduced below for reference.

J:\New000-r527>dir no*.*
 Volume in drive J is Disk J
 Volume Serial Number is D067-55AF

 Directory of J:\New000-r527

05/17/2011  04:59 PM            10,584 normproto
               1 File(s)         10,584 bytes
               0 Dir(s)   1,951,952,896 bytes free

J:\New000-r527>rename normproto lan1.normproto

J:\New000-r527>dir la*.*
 Volume in drive J is Disk J
 Volume Serial Number is D067-55AF

 Directory of J:\New000-r527

05/17/2011  04:59 PM            10,584 lan1.normproto
               1 File(s)         10,584 bytes
               0 Dir(s)   1,951,952,896 bytes free
J:\New000-r527>

2) It would be better to put all the command lines you wrote into a single .bat file.
I am not well versed in creating .bat files, which automate all the command lines.

-sriranga(78yrs)


Eyal

May 19, 2011, 8:24:43 AM
to tesser...@googlegroups.com
Just copy all the lines into one file, call it train.bat, and double-click it.

The "copy" matters when you already have a file with that name.

Try creating a file named heb.inttemp and putting some text in it.

You'll see that with "rename" the training will fail.

Good luck,

Eyal

Sriranga(78yrsold)

May 19, 2011, 9:42:57 AM
to tesser...@googlegroups.com
Eyal,
Thanks for the valuable guidance.
Yes. If "copy" is used, the contents of the existing file will be replaced.
With the best of GOOD LUCK,
-sriranga(78yrs)


zdenko podobny

May 19, 2011, 10:03:13 AM
to tesser...@googlegroups.com
On Wed, May 18, 2011 at 1:15 PM, Eyal <tza...@gmail.com> wrote:
> WOW!!!
>
> It worked.
>
> If you'll look again at the training manual, you'll see that there wasn't a combination of both -F & -O and that's why I didn't write such command.

I will try to improve the wiki pages (e.g. AddOns) in the next few days. If you or others have a project that uses tesseract, or if you have found bugs or mistakes on the wiki, just let me know ;-)

--
Zdenko
 

Robert Komar

May 19, 2011, 12:46:21 PM
to tesser...@googlegroups.com

I think "MOVE /Y" should work better than COPY or RENAME in
that case.

Cheers,
Rob Komar
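
For readers who drive the training from a script instead of a .bat file, the overwrite behaviour being discussed has a rough Python analogue. This is only an illustration of "move and replace an existing target", not a statement about how the cmd commands are implemented; the function name `publish` and the file contents are made up.

```python
# Sketch: move a freshly trained file over a stale one, replacing it.
# os.replace overwrites an existing destination file.
import os
import tempfile


def publish(src, dst):
    """Move src to dst, replacing dst if it already exists."""
    os.replace(src, dst)


with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "inttemp")
    dst = os.path.join(workdir, "lang1.inttemp")
    with open(src, "w") as f:
        f.write("new training data")
    with open(dst, "w") as f:
        f.write("stale file from an earlier run")
    publish(src, dst)
    with open(dst) as f:
        result = f.read()
    leftover = os.path.exists(src)

print(result, leftover)
```

After the call the destination holds the new contents and the source name is gone, so a second training run does not trip over files left by the first.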

Nick White

Sep 20, 2012, 6:44:36 AM
to tesser...@googlegroups.com
Hi Delli,

You need to make a little file called font_properties, as explained
in the training guide here:
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3#font_properties_(new_in_3.01)

Nick

On Thu, Sep 20, 2012 at 02:03:54AM -0700, delli wrote:
> Hi,
> I am trying to train the data using tesseract, and I am getting this problem
> in mftraining:
> C:\Program Files\Tesseract-OCR\training>mftraining -F font_properties -U unicharset -O eng.unicharset eng.fontfile.exp0.tr
> Failed to load unicharset from file unicharset
> Building unicharset for mftraining from scratch...
> Failed to load font_properties
> Reading eng.fontfile.exp0.tr ...
> fontfile has no defined properties.
> !"Missing font_properties entry is a fatal error!":Error:Assert failed:in file ..\training\mftraining.cpp, line 281
>
> I have attached the files as well; please tell me where I am making the mistake.
> Sorry for my bad English.
>
> Thanks in advance,
> Delli





zdenko podobny

Sep 20, 2012, 10:12:34 AM
to tesser...@googlegroups.com
On Thu, Sep 20, 2012 at 11:03 AM, delli <dilliba...@gmail.com> wrote:
> Hi,
> I am trying to train the data using tesseract, and I am getting this problem in mftraining:
> C:\Program Files\Tesseract-OCR\training>mftraining -F font_properties -U unicharset -O eng.unicharset eng.fontfile.exp0.tr
> Failed to load unicharset from file unicharset

> Building unicharset for mftraining from scratch...
> Failed to load font_properties
> fontfile has no defined properties.
> !"Missing font_properties entry is a fatal error!":Error:Assert failed:in file ..\training\mftraining.cpp, line 281
>
> I have attached the files as well; please tell me where I am making the mistake.
> Sorry for my bad English.
>
> Thanks in advance,
> Delli




--
Zdenko

bottu dilli babu

Sep 21, 2012, 12:15:12 AM
to tesser...@googlegroups.com
Hi Zdenko,

I ran unicharset_extractor, but I am failing to create the font_properties file. I gave the command as in the training process, but it gives an error like "unexpected symbol". Please help me.


Thanks in advance 
Delli 

zdenko podobny

Sep 21, 2012, 2:50:01 AM
to tesser...@googlegroups.com
Post your unicharset and font_properties files.

-- 
Zdenko

bottu dilli babu

Sep 24, 2012, 2:34:16 AM
to tesser...@googlegroups.com
Hi Zdenko,
 
Here I am sending the unicharset file and the font_properties file. Please also tell me how to create unicharambigs; I followed the training process but I didn't get it.

Sorry for the late reply.
 
Thanks,
Delli

bottu dilli babu

Sep 24, 2012, 2:44:18 AM
to tesser...@googlegroups.com
font_properties
unicharset

zdenko podobny

Sep 24, 2012, 5:01:11 PM
to tesser...@googlegroups.com
The unicharset file you sent is not the correct output of unicharset_extractor for eng.fontfile.exp0.box. How did you create it?
Your font_properties does not follow the wiki [1].
Your eng.fontfile.exp0.tif does not follow the wiki [2] ("It is ABSOLUTELY VITAL...").
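
The wiki links [1] and [2] are not preserved in this archive. As far as the training guide's font_properties section describes the file, each line is a font name followed by five 0/1 flags (italic, bold, fixed, serif, fraktur). A hedged Python sketch of a format check under that assumption follows; `check_font_properties` is a made-up name, and this is not how tesseract itself validates the file.

```python
# Illustrative validator, not part of tesseract. Assumed line format,
# as described in the training guide's font_properties section:
#   <fontname> <italic> <bold> <fixed> <serif> <fraktur>
FLAG_NAMES = ("italic", "bold", "fixed", "serif", "fraktur")


def check_font_properties(text):
    """Return a list of problems found; an empty list means it looks fine."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 1 + len(FLAG_NAMES):
            problems.append(f"line {lineno}: expected a font name and 5 flags")
        elif any(flag not in ("0", "1") for flag in fields[1:]):
            problems.append(f"line {lineno}: every flag must be 0 or 1")
    return problems


print(check_font_properties("fontfile 0 0 0 0 0\n"))
print(check_font_properties("fontfile 0 1\n"))
```

The first call reports nothing for a well-formed line, while the second reports the short line, which is the kind of deviation worth ruling out before rerunning mftraining.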

Francisco Loché Costa

Sep 24, 2012, 2:51:59 PM
to tesser...@googlegroups.com
Did you call the shapeclustering instruction after mftraining? And the unicharset extractor?

If not, these are the instructions:

>unicharset_extractor [font].[name].exp0.box
>shapeclustering -F font_properties -U unicharset [font].[name].exp0.tr

Works for me on tesseract 3.02.

--
  Francisco Loché Costa,
  Ingeniero Técnico de Telecomunicación, esp. Telemática.

Quan Nguyen

Nov 12, 2012, 9:38:07 AM
to tesser...@googlegroups.com
The PowerShell script train.ps1 on the AddOns page can help automate the training process.

http://code.google.com/p/tesseract-ocr/wiki/AddOns