Re: [tesseract-ocr] Cube training tools


ShreeDevi Kumar

Dec 5, 2014, 7:34:32 AM
to tesser...@googlegroups.com, tesser...@googlegroups.com, Ray Smith

See specifically the message from Ray Smith dated 7/15/13:

"Cube is a perfect example. It doesn't do much useful, yet now everybody wants it documented, so there is no way I can commit another half-baked experiment that isn't production-ready that everybody will want documented. I have 3 new classifiers in addition to cube that haven't delivered on their early promise. It really is hard to beat the current classifier, although I am starting to understand why a little better.
The good news is that I really really want to get the Google version of the code cleaned up and synced with the outside world this quarter, as there are some improvements in there worth having."
Also see the message from Ray Smith dated Oct 30, 2014, regarding plans for the 3.04 release.


ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Fri, Dec 5, 2014 at 1:33 PM, Emil Julius <hoved...@gmail.com> wrote:
Hey, I'm currently planning on writing some training tools for the Cube engine. But I would like to be sure that I'm not reinventing the wheel, as the only documentation I was able to find was: https://code.google.com/p/tesseract-ocr-extradocs/wiki/Cube
I believe that page was written by one of the members of this Google group?
I'm currently prioritizing tools for:
* cube.size (one of the 2 bigram files)
* cube.bigrams

The tool for cube.bigrams will take a plain-text input file, calculate the bigrams and their frequencies, and write them out in the corresponding file format.
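
For illustration, the counting step could be sketched roughly as below. This is only a minimal sketch under assumptions: it treats a bigram as a pair of adjacent characters within a whitespace-separated word, and the tab-separated output is a placeholder layout, not the actual cube.bigrams format (which would need to be checked against the Cube sources). The function names `count_bigrams` and `format_bigrams` are hypothetical.

```python
from collections import Counter


def count_bigrams(text):
    """Count adjacent character pairs inside each whitespace-separated word.

    Assumption: bigrams are character pairs and never span a word boundary.
    """
    counts = Counter()
    for word in text.split():
        # zip(word, word[1:]) yields every pair of neighbouring characters.
        for first, second in zip(word, word[1:]):
            counts[first + second] += 1
    return counts


def format_bigrams(counts):
    """Render counts as 'bigram<TAB>count' lines, most frequent first.

    Placeholder layout only; this is NOT the real cube.bigrams format.
    Ties are broken alphabetically so the output is deterministic.
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return "\n".join(f"{bigram}\t{count}" for bigram, count in ordered)
```

Reading a UTF-8 text file, passing its contents to `count_bigrams`, and printing the result of `format_bigrams` would give one line per bigram; the final writer would have to be adapted once the real file format is confirmed.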

I'm still trying to figure out a smart way to train the cube.size files; help is very welcome ;-).

Also, what's the current state of the Tesseract project in general?

Sincerely

