OCRopus 0.6 (pre1)


Tom

Aug 17, 2012, 10:31:35 PM
to ocr...@googlegroups.com
OCRopus 0.6pre1 has been released.  It features much simpler installation, fewer dependencies, and improved recognition rates.  This is the first all-Python release.  Please follow the instructions on http://www.ocropus.org/ (installation is really just a couple of simple steps).

There are three scripts you should run after installation:

(1) "run-test" runs a simple recognition test
(2) "run-box-training" (in fraktur-boxes) trains a Fraktur recognizer from Tesseract-style box files
(3) "run-uw3-500" (in uw3-500) shows how training works on line-by-line transcribed data

Tom

Sriranga(78yrsold)

Aug 17, 2012, 11:58:18 PM
to ocr...@googlegroups.com
Tom,
I am happy to see the latest version released. Could you kindly clarify whether version 0.6pre can be trained for the Kannada language (Indic, UTF-8)?
With warmest regards,
-sriranga(79yrs)


--
You received this message because you are subscribed to the Google Groups "ocropus" group.
To post to this group, send email to ocr...@googlegroups.com.
To unsubscribe from this group, send email to ocropus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msg/ocropus/-/cqNJKNpg718J.
For more options, visit https://groups.google.com/groups/opt_out.

Tom

Aug 18, 2012, 4:01:43 AM
to ocr...@googlegroups.com
Well, in principle, all the pieces are there.  If you have created Tesseract box files for training, you can adapt the example in fraktur-boxes to get started.

Tom

Sriranga(78yrsold)

Aug 18, 2012, 4:07:35 AM
to ocr...@googlegroups.com, Aravinda VK
Tom,
Thanks for the clarification. Yes, I have created a number of Tesseract box files for Kannada, which I used to generate a traineddata file in tesseract-ocr. I shall study the fraktur-boxes example to get started and will report back to you soon.
With warmest Regards,
-sriranga(79yrs)

Sriranga(78yrsold)

Aug 19, 2012, 11:01:25 AM
to ocr...@googlegroups.com
Tom,
A small question: if the previous OCRopus version (5.0) is not installed, or has been deleted, do I first have to run "hg clone https://code.google.com/p/ocropus/" and then follow the steps below?
    $ sudo apt-get install mercurial curl python-scipy python-matplotlib python-tables firefox
    $ hg clone -r ocropus-0.6pre3 https://code.google.com/p/ocropus
    $ cd ocropus/ocropy
    $ python setup.py download_models
    $ sudo python setup.py install
    $ ./run-test
I await your guidance: after several attempts I could not get a satisfactory installation, so I deleted everything.
with regards,
-sriranga(79yrs)



Robert Schmidt

Aug 20, 2012, 11:26:17 AM
to ocr...@googlegroups.com
On Friday, August 17, 2012 10:31:35 PM UTC-4, Tom wrote:
OCRopus 0.6pre1 has been released.  It features much simpler installation, fewer dependencies, and improved recognition rates.  This is the first all-Python release.  Please follow the instructions on http://www.ocropus.org/ (installation is really just a couple of simple steps).

 

Thanks for this!

I'm relatively new to ocropus, but have tried to use it a couple of times in the past. This release seems to be going in the right direction for me.

The biggest problem seems to be finding documentation that matches the version in use. I'm happy you have updated some of the main pages with some good examples.

stinger

Aug 21, 2012, 5:17:57 PM
to ocr...@googlegroups.com
I've run through the installation process and am trying to run the run-test script.  I keep getting the error "expected a segmentation with white background" when running ocropus-ngraphs (see output below).  It fails because the image maximum of 255 is checked against an expected value of 0xffffff.  If I modify the code and change 0xffffff to 0xff, the run-test script works.  Is this a bug?

+ true
+ true language model application
+ true
+ ocropus-ngraphs 'temp/????/??????.lattice'
loading /usr/local/share/ocropus/en-mixed-4.ngraphs
processing 92 files
temp/0001/010001.lattice =NGRAPHS= 21.29    BOOK REVIEIP
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-ngraphs", line 294, in <module>
    rseg = ocrolib.read_line_segmentation(rname)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/toplevel.py", line 194, in argument_checks
    result = f(*args,**kw)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 207, in read_line_segmentation
    result = make_seg_black(image)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/toplevel.py", line 190, in argument_checks
    raise e
ocrolib.toplevel.CheckError:
CheckError for argument 'image' in call to function: '<function make_seg_black at 0x3810aa0>'
<ndarray-13b67fd0 (60, 583) int32 [1,255]> of type <type 'numpy.ndarray'>: expected a segmentation with white background

stinger

Aug 21, 2012, 5:41:41 PM
to ocr...@googlegroups.com
Also, when running the run-uw3-500 script, it fails at ocropus-tsplit with the following error, as if the book.h5 file were corrupt.  Any help is appreciated:

ocropus-tsplit -d book.h5 -o book.tsplit --maxsplit 100
loading dataset
got 0 samples out of 0
# classes 0
most common ...
starting training
 pcakmeans 0 k 0 d 0.95

Traceback (most recent call last):
  File "/usr/local/bin/ocropus-tsplit", line 137, in <module>
    sc.fit(patches)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/patrec.py", line 355, in fit
    self.splitter.fit(data)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/patrec.py", line 289, in fit
    maxiter=self.maxiter,npk=self.npk,verbose=self.verbose)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/toplevel.py", line 186, in argument_checks
    raise CheckError(e.message,*e.args,var=var,fun=f)
ocrolib.toplevel.CheckError:
CheckError for argument 'data' in call to function: '<function pca_kmeans at 0x37367d0>'

Tom

Aug 21, 2012, 6:40:51 PM
to ocr...@googlegroups.com
What platform are you running this on?  Are you running 32bit or 64bit?

The maximum of the segmentation must be 0xffffff (white).  If you get 0xff, there is something seriously wrong somewhere in image I/O.
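
To illustrate why the two values differ, here is a minimal sketch. It assumes the segmentation is stored as an RGB image whose three 8-bit channels are packed into one integer per pixel; the helper name `pack_rgb` is hypothetical and is not claimed to be part of ocrolib.

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit channel values into one 24-bit integer (hypothetical helper)."""
    return (r << 16) | (g << 8) | b


# A white pixel read as full RGB packs to 0xffffff.
white_rgb = pack_rgb(255, 255, 255)

# The same white pixel read as a single 8-bit channel is only 0xff.
white_gray = 255

print(hex(white_rgb))           # 0xffffff
print(hex(white_gray))          # 0xff
print(white_rgb == 0xffffff)    # True
print(white_gray == 0xffffff)   # False
```

Under that assumption, a maximum of 0xff suggests the image was loaded as 8-bit grayscale rather than as three channels, so changing the expected constant would only hide the underlying I/O problem.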

Tom

Tom

Aug 21, 2012, 6:42:35 PM
to ocr...@googlegroups.com
That's probably related to the first problem: there is no training data at all, apparently because there is something wrong with the segmentations on your machine.
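
As a rough sketch of how such a failure could be caught earlier, a script might check that the dataset is non-empty before training starts. The function `require_samples` below is hypothetical and is not part of ocropus-tsplit or ocrolib.

```python
def require_samples(samples, stage="training"):
    """Return samples unchanged, or raise if the collection is empty (hypothetical helper)."""
    if len(samples) == 0:
        raise ValueError(
            "no samples available for %s; check the earlier segmentation step" % stage
        )
    return samples


# An empty dataset, as in "got 0 samples out of 0", is rejected up front
# instead of failing later inside the clustering code.
try:
    require_samples([])
except ValueError as err:
    print("stopping early: %s" % err)
```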

Tom

stinger

Aug 22, 2012, 7:42:46 AM
to ocr...@googlegroups.com
Running 64-bit Ubuntu 12.04 with Python 2.7.2+.  The images themselves are from the hg repository: I followed the instructions, downloaded the necessary packages, ran the build, and then ran the run-test script without any changes, and I get the error.  Again, thanks for your help.

Sriranga(78yrsold)

Aug 22, 2012, 9:50:09 AM
to ocr...@googlegroups.com
Stinger,
May I know which language you are trying to train?
With regards,
sriranga(79yrs)


stinger

Aug 22, 2012, 10:10:16 AM
to ocr...@googlegroups.com
English.  I'm just trying to use the samples in the downloaded packages.  I'm not using my own images yet; I'd like to get the sample code running correctly before I try to run it on my own images.

Han.

Sriranga(78yrsold)

Aug 22, 2012, 10:28:29 AM
to ocr...@googlegroups.com
Stinger,
Thanks for the information.
I am also trying to get run-test working and to understand the concept of training.
With regards,
-sriranga(79yrs)

Tom

Aug 22, 2012, 10:36:32 AM
to ocr...@googlegroups.com
Can you give it another try with "-r ocropus-0.6pre4"?

You can also join me on IRC on freenode in #ocropus (just go to webchat.freenode.net)

Tom

Sriranga(78yrsold)

Aug 22, 2012, 1:21:32 PM
to ocr...@googlegroups.com
dell@ubuntu:~$ hg clone -r ocropus-0.6pre4 https://code.google.com/p/ocropus
destination directory: ocropus
abort: destination 'ocropus' is not empty
dell@ubuntu:~$ hg clone -r ocropus-0.6pre4 https://code.google.com/p/ocropus
destination directory: ocropus
I renamed the old ocropus directory (the pre3 download) to ocropus-pre-3 and am now installing as follows:
    $ hg clone -r ocropus-0.6pre4 https://code.google.com/p/ocropus
    $ cd ocropus/ocropy
    $ sudo apt-get install $(cat PACKAGES)
    $ python setup.py download_models
I am waiting for download_models to complete, which takes a lot of time.
Please note that I am a newbie to Linux.


