I built OCRopus for Mac OS X 10.5.6, and wanted to share the results,
since some people seem to have had problems getting it to build. I
also compiled ImageMagick and Ghostscript so that it'll handle more
image formats.
The final result is TakOCR.app, a Mac 'droplet' application that you
can drop images and PDFs onto and they'll be OCRed, and the hOCR
output displayed. My dad needed an OCR solution and isn't really a
command line kind of guy.
Here's the webpage:
http://stuporglue.org/tako/
And the download directory:
http://stuporglue.org/tako/downloads/v1/
* OCRopus_Full.tgz -- Tesseract, OCRopus, ImageMagick, Ghostscript --
unpack it to / and it'll expand into /usr/local/
* TakOCR.pkg -- Installer package for Mac OS X. Installs binaries and
TakOCR.app.
* TakOCR_uninstaller.command -- Bash script that uninstalls
everything TakOCR.pkg installs
* build_ocr.sh -- Script to download and build IM, GS, Tesseract,
OCRopus and needed libs on OS X. Makes needed changes to the Makefile
before compiling.
* takocr.rb -- Ruby script inside TakOCR.app which sets environment
variables, splits PDFs and TIFFs if needed, and then runs the images
through OCRopus.
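
For anyone taking the tarball route, the unpack step can be sketched
roughly as below. The function name install_tarball and its optional
destination argument are made up for illustration; the only thing taken
from the list above is that the archive is meant to be unpacked at / so
that it expands into /usr/local/.

```shell
# Sketch: unpack an archive such as OCRopus_Full.tgz under a destination
# root. install_tarball is a hypothetical helper, not part of the download.
# The destination defaults to /, matching the description above; pass a
# scratch directory instead to inspect the result before touching /.
install_tarball() {
    archive=$1
    dest=${2:-/}
    # -C makes tar change into the destination before extracting.
    tar -xzf "$archive" -C "$dest"
}
```

Running `tar -tzf OCRopus_Full.tgz` first lists the archive's contents
without extracting anything. Writing to / will normally need sudo.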
Feedback is welcome, thanks,
--
Michael Moore
Thanks for the great work.
> The final result is TakOCR.app, a Mac 'droplet' application that you
> can drop images and PDFs onto and they'll be OCRed, and the hOCR
> output displayed. My dad needed an OCR solution and isn't really a
> command line kind of guy.
>
> Here's the webpage:
> http://stuporglue.org/tako/
>
> And the download directory:
> http://stuporglue.org/tako/downloads/v1/
> * OCRopus_Full.tgz -- Tesseract, OCRopus, ImageMagick, Ghostscript --
> unpack it to / and it'll expand into /usr/local/
> * TakOCR.pkg -- Installer package for Mac OS X. Installs binaries and
> TakOCR.app.
>
The installer doesn't let me choose the install location.
> * TakOCR_uninstaller.command -- Bash script that uninstalls
> everything TakOCR.pkg installs
> * build_ocr.sh -- Script to download and build IM, GS, Tesseract,
> OCRopus and needed libs on OS X. Makes needed changes to the Makefile
> before compiling.
>
There is a small error in this file: in line 105 the script tries to
untar ocropus-0.3.1.tar.gz, which is never downloaded (and isn't even
needed).
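
A pre-flight check along these lines would catch this kind of leftover
before any extraction runs. This is only a sketch: missing_archives is a
hypothetical helper, not something taken from build_ocr.sh.

```shell
# Hypothetical pre-flight check for a build script: print the name of
# every archive that a later untar step expects but that is not on disk.
missing_archives() {
    for archive in "$@"; do
        [ -f "$archive" ] || echo "$archive"
    done
}

# Possible use near the top of the script:
#   missing=$(missing_archives ocropus-0.3.1.tar.gz)
#   [ -z "$missing" ] || { echo "not downloaded: $missing" >&2; exit 1; }
```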
> * takocr.rb -- Ruby script inside TakOCR.app which sets environment
> variables, splits PDFs and TIFFs if needed, and then runs the images
> through OCRopus.
>
> Feedback is welcome, thanks,
>
Maybe Thomas or Christian can add a link on ocropus.org?
Cheers,
Christian
I'm not a pro at Mac compilation, and I'm not sure how to build it so
that the binaries are relocatable. As a result, they have to go in
/usr/local, since that's where I compiled everything to. TakOCR.app
itself can be moved, but the binaries can't.
I'm sure it's doable; Gimp.app and Inkscape.app do it. If someone can
point me at a tutorial on how to do it, I'd be happy to give it a
shot.
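
One common way to make Mac binaries relocatable is to rewrite the
library install names recorded in each binary with install_name_tool, so
that they resolve relative to the executable instead of to /usr/local.
The sketch below only prints the commands it would run. The ../lib
layout and the helper names are assumptions for illustration, and none
of this has been tried against the TakOCR build.

```shell
# Sketch only: print the install_name_tool commands that would repoint a
# binary's /usr/local/lib dependencies at paths relative to the executable.
# The "../lib" layout next to the binary is an assumption.

# Map /usr/local/lib/libfoo.dylib -> @executable_path/../lib/libfoo.dylib
relocated_name() {
    echo "@executable_path/../lib/$(basename "$1")"
}

# Read dependency paths on stdin, one per line, and print one
# install_name_tool command for each entry under /usr/local/lib.
# Other entries (system libraries) are left alone.
print_relocation_commands() {
    binary=$1
    while read -r dep; do
        case $dep in
            /usr/local/lib/*)
                echo "install_name_tool -change $dep $(relocated_name "$dep") $binary"
                ;;
        esac
    done
}
```

The dependency list can come from otool, for example
`otool -L "$bin" | awk 'NR>1 {print $1}' | print_relocation_commands "$bin"`,
where $bin is whichever binary is being fixed up. Each copied library
would also need its own name updated with `install_name_tool -id`.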
>> * build_ocr.sh -- Script to download and build IM, GS, Tesseract,
>> OCRopus and needed libs on OS X. Makes needed changes to the Makefile
>> before compiling.
>>
>
> There is a small error in this file: in line 105 the script tries to
> untar ocropus-0.3.1.tar.gz, which is never downloaded (and isn't even
> needed).
Oops. I thought I'd cleaned all that out. I couldn't get 0.3.1 to
build correctly, so I switched to SVN and must have missed a line. It's
fixed now.
> Maybe Thomas or Christian can add a link on ocropus.org?