Random numbers

Ray Smith

Apr 24, 2014, 6:21:05 PM
to tesser...@googlegroups.com
Issue 1134 has highlighted an upcoming issue for the next version (3.04, or maybe 4.00).

The next version is likely to include a new classifier implementation that uses neural nets. (bye bye cube probably.) As such, the training process makes hefty use of random numbers.

Now it looks like the intersection of decent* random number functions in the standard libraries of Linux and Windows alone is the empty set. According to the Linux manual, rand_r and erand48 are obsolete and random_r is recommended, but that doesn't exist on Windows. The rand_r use in the current code is already #ifdef'ed out for Windows and Android, so that isn't portable either.

Anyone know of anything small and portable that meets the minimum spec below? Or do I have to roll my own?

*Min spec:
Must operate on an externally provided buffer, so that multiple threads can obtain independent, reproducible streams.
Must be reasonably random, but don't really care about either the size of the space or the true randomness. This isn't for encryption!

If there is no common standard library function, a simple LCG will do: http://en.wikipedia.org/wiki/Linear_congruential_generator
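
For illustration, a minimal sketch of what such an LCG could look like in C++, assuming the caller owns the state so each thread can keep its own reproducible stream. The names LcgState, LcgSeed, LcgNext and LcgNextDouble are hypothetical and not part of Tesseract; the multiplier and increment are the 64-bit constants from Knuth's MMIX generator, and any other well-tested pair would do.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; not Tesseract code.
// The caller owns the state, so each thread can hold its own stream.
struct LcgState {
  uint64_t value;
};

// Seeds a stream. The same seed always reproduces the same sequence.
inline void LcgSeed(LcgState* state, uint64_t seed) { state->value = seed; }

// Advances the state (mod 2^64 via unsigned wraparound) and returns 32 bits.
// The high bits are returned because the low bits of a power-of-two-modulus
// LCG have short periods.
inline uint32_t LcgNext(LcgState* state) {
  state->value =
      state->value * 6364136223846793005ULL + 1442695040888963407ULL;
  return static_cast<uint32_t>(state->value >> 32);
}

// Returns a double in [0, 1), e.g. for weight initialisation.
inline double LcgNextDouble(LcgState* state) {
  return LcgNext(state) / 4294967296.0;  // divide by 2^32
}
```

Each thread would declare its own LcgState, seed it (say, with a base seed plus the thread index), and pass a pointer to it on every call. Nothing is shared, so no locking is needed, and rerunning with the same seeds gives the same streams on any platform with 64-bit integers.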

Ray.

Nick White

Apr 29, 2014, 12:01:57 PM
to tesser...@googlegroups.com
Hi Ray,

The new training tools don't support Windows now anyway. I'm
tempted to say we should consider dropping support for training on
Windows, or not do the work to make new training tools work on it.
It's certainly worthwhile having Tesseract run on Windows, but if
people are serious about training it perhaps it's alright to require
them to use a reasonable operating system...

We do see people doing training on Windows systems on the mailing
list, but they're generally the ones who have a great deal of
difficulty figuring out how to use the command line. Maybe people
are actually doing good, useful training from Windows, but I
can't think of anyone.

Nick