OCRopus / OCRopy: a Python-based OCR package using recurrent neural networks.

OCRopus is really a collection of document analysis programs, not a turn-key OCR system.

In addition to the recognition scripts themselves, there are a number of scripts for ground truth editing and correction, measuring error rates, determining confusion matrices, etc.

OCRopus commands generally print a stack trace along with an error message; the stack trace alone does not generally indicate a problem (in a future release, we'll suppress it by default, since it seems to confuse too many users).

To recognize pages of text, you need to run a separate command for each step: binarization, page layout analysis, and text line recognition.
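The three steps above can be sketched as a small Python helper that only builds the command lines, without running them. The script names (`ocropus-nlbin`, `ocropus-gpageseg`, `ocropus-rpred`), the `-o` and `-m` flags, the glob patterns, and the model path are assumptions based on the usage example in the ocropy README and may differ in your version. The helper `build_pipeline`, the `book` work directory, and `page.png` are hypothetical names used for illustration.

```python
import shlex


def build_pipeline(image, workdir="book", model="models/en-default.pyrnn.gz"):
    """Return one argument list per OCR step for a single page image.

    Assumption: script names, flags, and glob patterns follow the ocropy
    README example; verify them against your installed version.
    """
    return [
        # 1. Binarization: writes numbered page images into workdir.
        ["ocropus-nlbin", image, "-o", workdir],
        # 2. Page layout analysis: splits each binarized page into text lines.
        ["ocropus-gpageseg", workdir + "/????.bin.png"],
        # 3. Text line recognition with a trained model (assumed path).
        ["ocropus-rpred", "-m", model, workdir + "/????/??????.bin.png"],
    ]


if __name__ == "__main__":
    # Print the commands in shell-quoted form instead of executing them.
    for cmd in build_pipeline("page.png"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually run the steps, each argument list could be passed to `subprocess.run(cmd, check=True)`. The glob patterns are passed through unexpanded on the assumption that the OCRopus scripts expand them themselves, which is why the README quotes them on the shell command line.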

The OCRopus OCR system is hosted at: https://github.com/tmbdev/ocropy

Topic | Author | Date
avoid columns segmentation | fabio forno | 9/9/15
OCRopus 0.6: possibility for command-line use only & for operating with Tesseract | Martin Reynaert | 9/8/15
Demo code available? | jbest | 9/8/15
No Output after ocroscript recognize | Debayan | 9/8/15
rtrain - input file selected at random? | Ankit Agarwal | 8/28/15
How to avoid segmenting pages into columns? | Gyula Sámuel Karli | 7/30/15
Why is Tesseract so much more popular than Ocropus? | maxim...@gmail.com | 7/16/15
OCRopus / ocropy updates | Tom | 7/8/15
Border Noise Removal | Everest | 6/22/15
Training error when using ocropus-rtrain | Faida | 6/17/15
Book layout element recognition | Christoph Holtermann | 6/5/15
Network in details | 贺盼 | 3/24/15
can i recognize Chinese by ocropus ? | yang | 3/16/15
implementation details of ctc in lstm.py | Ajinkya Kulkarni | 3/9/15
Trying to understand implementation details of lstm.py | sudeep raja putta | 3/4/15
ocropus-rpred compile error in python2.7 on windows | Sen.T | 2/28/15
clstm python setup.py install errors | 贺盼 | 1/14/15
Blog post on running & training Ocropus | Dan Vanderkam | 1/12/15
Had this project been closed? why i could not get the correct source code from the moved website? | Libin Sui | 12/17/14