As you know, I'm developing gnome-scan. I want to provide an OCR sink plugin
based on OCRopus by the end of August. As usual, when I depend on software
that is still in development, I build a deb package.
I got OCRopus up and running. This required a trivial patch to tesseract
SVN (see:
http://code.google.com/p/tesseract-ocr/issues/detail?id=36&can=2&q= ).
It would be nice to get it fixed soon. (Why not release it?)
Also, OCRopus uses Autotools + Jam. However, I don't see any way to
generate a tarball. Of course, make distcheck is useless here, and Jam does
not provide an equivalent :x. OCRopus does not publish a release tarball
either. So, before packaging, I suggest distributing OCRopus itself. :) I'm
really looking forward to integrating it well in gnome-scan!
Regards,
Étienne.
--
Verso l'Alto !
Thanks for your feedback. Keep in mind that OCRopus is still "pre-alpha"
precisely so that we can get feedback on the build system and
architecture.
On Jun 29, 2007, at 7:57 AM, Étienne Bersac wrote:
> I got OCRopus up and running. This required a trivial patch to
> tesseract
> SVN (see:
> http://code.google.com/p/tesseract-ocr/issues/detail?id=36&can=2&q= ).
> It would be nice to get it fixed soon. (Why not release it?)
Ray Smith is the primary contact for Tesseract check-ins; could you
ping him again, please?
> Also, OCRopus uses Autotools + Jam. However, I don't see any way to
> generate a
> tarball. Of course, make distcheck is useless here, and Jam does not
> provide an equivalent :x. OCRopus does not publish a release tarball either.
Well, there are two choices.
First, we could add the necessary targets to the Jamfile. What
would be needed in the Jamfile for easy Debian packaging?
Second, while we don't like using make for development work, creating
a separate automake-based build for the packaging should be pretty easy.
Which one would be better for you? Which one could you help with?
Another area we haven't decided on yet is how to turn OCRopus into a
shared library. There's the obvious, simple way of doing it on
Linux, but providing a separate plain-C interface and exposing that
as the shared library interface might be better (since it permits
direct calls from FFIs and avoids Windows DLL issues related to C++).
Any suggestions/input?
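
To make the second option concrete, here is a minimal sketch of a plain-C
interface over a C++ implementation, using an opaque handle. Everything in
it is hypothetical: the names ocr_create, ocr_destroy, ocr_recognize and the
sketch::Recognizer class are placeholders for illustration and are not part
of OCRopus. The recognizer only returns a dummy string so that the example
stays self-contained.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C++ class standing in for a real recognizer (assumption:
// this is NOT an OCRopus class, it just returns a dummy description).
namespace sketch {
class Recognizer {
public:
    std::string recognize(const unsigned char *pixels, int width, int height) {
        (void)pixels;  // a real implementation would read the image here
        return "image " + std::to_string(width) + "x" + std::to_string(height);
    }
};
}  // namespace sketch

// Public interface: only C types cross the library boundary, so FFIs and
// C callers never see C++ classes, exceptions, or name mangling.
extern "C" {
typedef struct ocr_handle ocr_handle;  // opaque to callers

ocr_handle *ocr_create(void);
void ocr_destroy(ocr_handle *h);
// Writes a NUL-terminated result into out; returns its length, or -1 on error.
int ocr_recognize(ocr_handle *h, const unsigned char *pixels,
                  int width, int height, char *out, int out_size);
}

// Private definition: the handle simply owns the C++ object.
struct ocr_handle {
    sketch::Recognizer impl;
};

extern "C" ocr_handle *ocr_create(void) {
    try {
        return new ocr_handle();
    } catch (...) {
        return nullptr;  // never let an exception escape into C code
    }
}

extern "C" void ocr_destroy(ocr_handle *h) {
    delete h;  // deleting a null pointer is a no-op
}

extern "C" int ocr_recognize(ocr_handle *h, const unsigned char *pixels,
                             int width, int height, char *out, int out_size) {
    if (h == nullptr || out == nullptr || out_size <= 0) return -1;
    try {
        const std::string text = h->impl.recognize(pixels, width, height);
        if (text.size() + 1 > static_cast<std::size_t>(out_size)) return -1;
        std::memcpy(out, text.c_str(), text.size() + 1);
        return static_cast<int>(text.size());
    } catch (...) {
        return -1;
    }
}
```

A caller would use ocr_create() to get a handle, pass it to ocr_recognize()
with a buffer it owns, and release it with ocr_destroy(). Since memory
ownership and error reporting stay on the C side of the boundary, the same
three functions could be exported from a Linux shared object or a Windows DLL.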
Cheers,
Thomas.
I discussed this with my mentor. Two solutions came to mind: either fix
the Jam build so that it generates a tarball, or migrate to automake.
I searched for Jam documentation; the official web site was very hard
to find, with no gain compared to the well-documented automake. Also, I agree
that autoconf leads to a messy configure.ac; however, automake is quite good
and is complete (dist, distcheck, and friends).
I started writing a full autotools build system for ocropus on top of SVN.
I will send the patch ASAP. Don't take it as an offense, but I find that
"make replacements" often forget what automake provides, which leads to this
kind of situation where things must be coded by hand. I don't mean Jam is the
wrong solution at all; it's just not suitable as an autotools replacement yet.
Expect some patches in the near future. :)
Bill
I started the patch that adds building of the libraries and ocropus. I have
two issues:
First, ocropus uses e.g. #include "imgio.h" instead of #include
"../imgio/imgio.h". I don't understand why this works, or when it does not.
Second, I have a problem linking ocropus with tesseract. I found some
odd "PartialLinking" in the Jamfile that I don't understand.
Also, did you notice the bug report I filed for the tesseract + autoheader
bug?
http://code.google.com/p/tesseract-ocr/issues/detail?id=39&can=2&q=
Please help.
Thank you for the work!
> First, ocropus uses e.g. #include "imgio.h" instead of #include
> "../imgio/imgio.h". I don't understand why this works, or when it does not.
Yes. This works because ImportDir directives in Jamfiles provide
header paths both for Jam and gcc. ImportDirs have to be there anyway
to provide dependencies between directories, so they're used also for
the headers. There's a plan to use ImportDirs for the libraries, too.
> Second, I have a problem linking ocropus with tesseract. I found some
> odd "PartialLinking" in the Jamfile that I don't understand.
It's an old hack made to cope with the abundance of Tesseract libraries.
You can get rid of it: just move all the -ltesseract_stuff into the top-level
Jamrules and delete all the stuff about tesseract_all.o, replacing
LibraryFromObjects libtesseract.a : tesseract_all.o ;
with
Library libtesseract : tesseract.cc ;
I'd do that, but maybe it's better to simply merge the 11 Tesseract
libraries into one.
I'll have a closer look at your patch and bug report tomorrow.
Again, thank you, and good luck with your project.
Best wishes,
Ilya