How can jTessBoxEditor generate LSTM files?


fadif...@gmail.com

May 12, 2018, 7:40:27 AM
to tesseract-ocr
I am trying to add a few new characters to the Arabic character set and train for them by fine-tuning with jTessBoxEditor v2 beta.

The box/tiff pairs are generated successfully, but when I run the trainer executable, a .tr file and ara.traineddata are generated instead of a .lstm file. According to the docs, an lstm file should be generated before lstmtraining can start. Please tell me where I am going wrong.

Quan Nguyen

May 14, 2018, 10:02:22 PM
to tesseract-ocr
As of today, jTessBoxEditor supports only legacy training (i.e., version 3.0x).

Training for 4.0x is described in the Training Wiki.
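
For readers who already have box/tiff pairs: in Tesseract 4.0x the training input is one `.lstmf` file per image, and the `tesseract` executable writes it when run with the `lstm.train` config. Below is a minimal sketch, not the Wiki's own script. It assumes `tesseract` 4.x is on the PATH and that each `name.tif` has a matching `name.box` beside it in a format the LSTM trainer accepts (the sketch does not check this). The helper names `lstmf_command` and `build_commands` and the example file name are hypothetical, and `--psm 6` is an assumption about the page layout.

```python
from pathlib import Path


def lstmf_command(tif_path, psm=6):
    """Build the tesseract call that should write <name>.lstmf for <name>.tif."""
    tif = Path(tif_path)
    # Output base is the image path without ".tif"; tesseract adds ".lstmf".
    out_base = tif.with_suffix("")
    return ["tesseract", str(tif), str(out_base), "--psm", str(psm), "lstm.train"]


def build_commands(tif_paths):
    """Return the commands to run and the .lstmf paths they should produce."""
    commands = [lstmf_command(p) for p in tif_paths]
    lstmf_files = [str(Path(p).with_suffix(".lstmf")) for p in tif_paths]
    return commands, lstmf_files


# Hypothetical example file name, for illustration only.
commands, lstmf_files = build_commands(["ara.test.exp0.tif"])
for cmd in commands:
    print(" ".join(cmd))  # pass each list to subprocess.run(cmd, check=True) to execute
```

The resulting `.lstmf` paths are normally written one per line to a text file that is passed to `lstmtraining` through its `--train_listfile` option.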

Fadi Fawzi

May 17, 2018, 3:35:50 AM
to tesser...@googlegroups.com
Thanks, Quan.
But is there a simple way to run the training process on Windows, or must I stick to Linux (Ubuntu)?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/00cd6b54-3ed2-45e4-afbf-aa3c3f166e74%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Quan Nguyen

May 17, 2018, 5:11:54 PM
to tesseract-ocr
The .sh shell scripts used for 4.0x training will not run natively on Windows. You may need Cygwin or the Windows Subsystem for Linux (WSL). I hope others who have experience with this will chime in.
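
As a rough illustration of that point, the sketch below checks whether the current machine has something that can run a `.sh` script at all. It is an assumption-laden helper, not part of Tesseract: the function name `find_shell_runner` is hypothetical, it only looks for `wsl` and `bash` on the PATH, and it does not verify that the training tools themselves are installed inside that environment.

```python
import platform
import shutil


def find_shell_runner():
    """Return a command prefix that can launch a .sh script, or None."""
    if platform.system() != "Windows":
        # Linux and macOS: a bash on PATH is enough.
        return ["bash"] if shutil.which("bash") else None
    # Windows: prefer the WSL launcher, then any bash on PATH (e.g. from Cygwin).
    if shutil.which("wsl"):
        return ["wsl", "bash"]
    if shutil.which("bash"):
        return ["bash"]
    return None


runner = find_shell_runner()
if runner is None:
    print("No bash found; install WSL or Cygwin before running the .sh scripts.")
else:
    print("Shell scripts can be launched with:", " ".join(runner))
```

Note that under WSL the script path has to be given as the Linux side sees it (Windows drives appear under `/mnt/`), so a Windows-style path cannot be passed through unchanged.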



Joshua Willmot

May 18, 2018, 2:03:21 PM
to tesseract-ocr
I am using the Windows Subsystem for Linux (Ubuntu). It works exactly as it would on a regular Ubuntu installation.

ShreeDevi Kumar

May 18, 2018, 10:13:16 PM
to tesser...@googlegroups.com
I use WSL with MobaXterm on Windows 10.

fadif...@gmail.com

May 19, 2018, 5:18:24 AM
to tesseract-ocr
Thanks, everyone. I will use Cygwin for now.