Trainer GUI for Tesseract version 4.0

Nalin Linux

Jun 24, 2017, 9:29:03 AM
to tesseract-ocr
I have been developing a Tesseract trainer GUI that makes Tesseract training easier for end users and research scholars.
It works with version 3.04. I am now concerned about whether this trainer GUI is still relevant for Tesseract version 4.0.

Please watch the following video, which shows my trainer GUI for version 3.04.

Please let me know whether it is necessary to upgrade this trainer for Tesseract 4.0.
Any suggestions are welcome.

Tesseract Trainer GUI GitHub page: https://github.com/Nalin-x-Linux/lios-3

ShreeDevi Kumar

Jun 24, 2017, 9:37:32 AM
to tesser...@googlegroups.com
You can update it for 3.05.01

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/1ec6e44b-d358-4a51-bf34-d9ab3e50642e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nalin Linux

Jun 24, 2017, 9:55:05 AM
to tesseract-ocr


On Saturday, June 24, 2017 at 7:07:32 PM UTC+5:30, shree wrote:
You can update it for 3.05.01

I am quite impressed with Tesseract 4.0, and it is working fine for my language (Malayalam). Was the trained data for version 4.0 listed at https://github.com/tesseract-ocr/tessdata created from the old language data itself (https://github.com/tesseract-ocr/langdata)? What about creating a training GUI for version 4.0? I have two months at my disposal for developing such a GUI.
Please let me know whether this project is relevant; otherwise I will switch to another relevant free and open-source project.

Thank you, Nalin.
 

ShreeDevi Kumar

Jun 24, 2017, 10:56:54 AM
to tesser...@googlegroups.com
Take a look at https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 for an overview of training for 4.0. Follow the tutorials to get a feel for the training process; you can try them for English as well as Malayalam.

As for a trainer GUI, I think it would probably work for `fine tune` training.

One area where you could contribute to 4.0 training is creating box files in the 4.0 format from scanned images.

Also look at jTessBoxEditor, which offers a Tesseract training GUI, though not for 4.0.

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
