* iulib -- basic image processing
* ocropus -- OCR-specific functionality (libraries and some
command line programs)
* ocroswig -- bindings of iulib and ocropus to Python
* ocropy -- Python library and command line tools
* pyopenfst -- Python bindings of the OpenFST library
Please see the InstallTranscript for installation instructions.
There is plenty of new functionality:
* all recognition can now be carried out from Python
* there are top-level commands for recognition and training
written in Python
* classifiers can now cope with large character sets
* there are tools for clustering and correcting character shapes
* there is support for ligatures
* there are numerous bug fixes
* training is possible on very large datasets (many millions of
samples)
We will be calling this release 0.4.4, since some functionality is
still missing from what we want to call 0.5:
* the Python tools do not yet do a good job of upper/lower case
modeling (but we have good prototype code that just needs to be
integrated)
* the language models need to be tested and improved
* we need to integrate the book-adaptive recognition tools into
the Python code
* Unicode support needs to be integrated into the Python loops
* the main loop of the RAST layout analysis will be rewritten in
Python
* there will be some new layout analysis that works for distorted
pages
* we need to integrate our orientation detection and text/image
segmentation code
* we want to get rid of the makefiles
Install instructions are here:
http://code.google.com/p/ocropus/wiki/InstallTranscript
Tom
We'll probably provide a single tarball
Hi,
essentially, the next release is already out; just follow the instructions on the web site: