Windows lib challenge

zdenko podobny

Oct 30, 2011, 11:26:22 AM
to tesser...@googlegroups.com
Hi all,

tesseract 3.01 has been released, and lately we have spent our time improving the build systems.

On Linux we succeeded in creating a "one-lib solution": with autotools you can build a shared or static libtesseract library and link the tesseract executable (tesseractmain.cpp, tesseractmain.h) against just this library. Version 3.00 provided only a "multiple-library solution" (in 3.01 you can still get it with "./configure --enable-multiple-libraries", but I am not sure whether it is useful). libtesseract does not include the "training" part of the code; those libraries (libtesseract_training, libtesseract_tessopt) are statically linked into the executables that need them.

I tried to do the same in VC++ 2008, but without success, so we stayed with the "3.00 way": no shared library is available. There are also reports that the statically linked debug version of tesseract does not work properly (possibly when a (debug?) leptonica is linked statically; more details and tests are needed)...

So it looks like somebody with good experience in creating Windows libraries should have a look at it ;-)
The aim is to create a Windows version of libtesseract (a C++ library), with no additional features (at this stage).

So if you are interested in helping this project, please have a look at this, because nobody outside the tesseract community will do it. Please feel free to send patches (as an issue ;-) ).

Zdenko

troplin

Mar 30, 2012, 5:58:17 AM
to tesser...@googlegroups.com
Hello, 

I am interested in helping with this issue.

Just to understand correctly what you want:
- One single DLL including the recognition part but excluding the training part.
- Leptonica linked in statically.
- Microsoft C Runtime support linked in statically.
- Both debug and release builds.
Is that correct?

At the moment I only work with VS 2005, but we will upgrade to VS 2010 soon; then I will be able to help.

My background:
I work for PDF Tools AG in Switzerland. We sell programming components (APIs, shell tools, Windows services), so I have some experience with the whole "DLL Hell".
The reason I am willing to do this is that we want to integrate Tesseract as a plugin into our software. Until now we have always made a custom build, but ideally our customers could just use the official installer.

Tobias 

zdenko podobny

Mar 30, 2012, 7:35:34 AM
to tesser...@googlegroups.com
Hi Tobias,

I am not sure whether you have looked at the other threads in the forum: in svn there is a solution for creating a Windows DLL (C++), so from this point of view the Windows library issue is fixed (i.e. we provide the same functionality on all major platforms).

I know there are people who miss a C wrapper (library). There were several attempts to create one, but it looks to me like they failed to keep up with changes in the C++ library (for various reasons). As far as I know, there is no wrapper (C#, Python, Java...) that works with the 3.02 code.

So I believe there is a need for a C library with the following requirements:
  • to be platform independent (it should be possible to compile the library without the platform-specific parts). I understand that if there is no contributor for the Linux/Mac part, then there is no Linux/Mac support.
  • a commitment to future maintenance - there are several programs that once worked with tesseract...
Personally I would suggest starting it as a separate project, because:
  1. tesseract is developed as a C++ library
  2. if it is part of tesseract, then release/testing of tesseract becomes more difficult.
  3. you can manage the project and contributors easily (e.g. you can decide who will have write access to the repository, while in the case of tesseract Ray will decide ;-) ). You can extend (create a new version of) the library without waiting for a new release of tesseract (e.g. you can start several features and later release a new version with more features)
  4. developers can focus on the library itself (at least in terms of examples and documentation)
  5. we can merge later on if it makes sense.
Of course this is just my opinion - input from others is welcome.

Zdenko

Quan Nguyen

Mar 31, 2012, 5:32:04 PM
to tesser...@googlegroups.com
Or included as another build configuration (C_DLL_Release, C_DLL_Debug)?


troplin

Apr 2, 2012, 4:05:00 AM
to tesser...@googlegroups.com


On Friday, March 30, 2012 at 13:35:34 UTC+2, Zdenko Podobný wrote:

> Hi Tobias,
>
> I am not sure whether you have looked at the other threads in the forum: in svn there is a solution for creating a Windows DLL (C++), so from this point of view the Windows library issue is fixed (i.e. we provide the same functionality on all major platforms).
>
> I know there are people who miss a C wrapper (library). There were several attempts to create one, but it looks to me like they failed to keep up with changes in the C++ library (for various reasons). As far as I know, there is no wrapper (C#, Python, Java...) that works with the 3.02 code.

I have written a small C wrapper (not complete, but the most important parts are there). See the other thread that I started.

> So I believe there is a need for a C library with the following requirements:
>   • to be platform independent (it should be possible to compile the library without the platform-specific parts). I understand that if there is no contributor for the Linux/Mac part, then there is no Linux/Mac support.

It is very simple and should be portable, but I haven't tested that.

>   • a commitment to future maintenance - there are several programs that once worked with tesseract...

I can't guarantee that. That's also one of the reasons why I would want to include it in the project.

> Personally I would suggest starting it as a separate project, because:
>   1. tesseract is developed as a C++ library

Even in C++, sometimes you need to load a library at runtime (via dlopen() or LoadLibrary). I don't know whether this is even (portably) possible for a C++ API; at the very least it is much more complex.
Also, if you want to distribute a binary, you basically need a C API.

>   2. if it is part of tesseract, then release/testing of tesseract becomes more difficult.

True, but the C wrapper is so simple that I don't see much use in testing it separately.

>   3. you can manage the project and contributors easily (e.g. you can decide who will have write access to the repository, while in the case of tesseract Ray will decide ;-) ). You can extend (create a new version of) the library without waiting for a new release of tesseract (e.g. you can start several features and later release a new version with more features)

I don't intend to release a complete project. Either it is included in tesseract or I will probably keep it private.

>   4. developers can focus on the library itself (at least in terms of examples and documentation)

In my opinion, the API is an integral part of the library, and the C wrapper is just a mirror of the C++ API.

>   5. we can merge later on if it makes sense.

You mean more sense than now?
I have some time that I can spend on tesseract right now, but not much (I do this at work). After that, I will work on different projects. If we keep the C API private, we will probably use tesseract 3.01 for the next few years.
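
For readers unfamiliar with the pattern discussed here, the following is a minimal sketch of a C wrapper that mirrors a C++ API through an opaque handle. It is not the wrapper posted in this thread and not Tesseract's API: the class hypo::Recognizer and every Hypo* function are hypothetical names invented for illustration. Because the exported functions have C linkage, their names are not mangled, which is what makes lookup via dlopen()/dlsym() or LoadLibrary/GetProcAddress straightforward.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <string>

// Hypothetical C++ class standing in for a C++ library API (not Tesseract's).
namespace hypo {
class Recognizer {
 public:
  bool Init(const std::string& language) {
    language_ = language;
    return !language_.empty();
  }
  std::string Recognize(const std::string& input) const {
    return "[" + language_ + "] " + input;
  }

 private:
  std::string language_;
};
}  // namespace hypo

// C-visible interface: an opaque handle plus plain functions.
// In a real project these declarations would live in a header usable from C.
extern "C" {
typedef struct HypoHandle HypoHandle;

HypoHandle* HypoCreate(void);
int HypoInit(HypoHandle* handle, const char* language);
char* HypoRecognize(const HypoHandle* handle, const char* input);
void HypoFreeText(char* text);
void HypoDelete(HypoHandle* handle);
}

// The handle simply owns the C++ object; C callers never see its layout.
struct HypoHandle {
  hypo::Recognizer impl;
};

extern "C" HypoHandle* HypoCreate(void) {
  return new (std::nothrow) HypoHandle;
}

extern "C" int HypoInit(HypoHandle* handle, const char* language) {
  if (handle == nullptr || language == nullptr) return -1;
  try {
    return handle->impl.Init(language) ? 0 : -1;
  } catch (...) {
    return -1;  // exceptions must not cross the C boundary
  }
}

// Returns a heap-allocated copy that the caller releases with HypoFreeText.
extern "C" char* HypoRecognize(const HypoHandle* handle, const char* input) {
  if (handle == nullptr || input == nullptr) return nullptr;
  try {
    const std::string text = handle->impl.Recognize(input);
    char* copy = new char[text.size() + 1];
    std::memcpy(copy, text.c_str(), text.size() + 1);
    return copy;
  } catch (...) {
    return nullptr;
  }
}

extern "C" void HypoFreeText(char* text) { delete[] text; }

extern "C" void HypoDelete(HypoHandle* handle) { delete handle; }

// Usage written the way a C caller would use the wrapper.
std::string Demo() {
  HypoHandle* handle = HypoCreate();
  assert(handle != nullptr);
  const int status = HypoInit(handle, "eng");
  assert(status == 0);
  (void)status;
  char* text = HypoRecognize(handle, "hello");
  const std::string result = (text != nullptr) ? text : "";
  HypoFreeText(text);
  HypoDelete(handle);
  return result;
}
```

Each wrapper function only forwards to one C++ member function, so the C layer stays a thin mirror of the C++ API; the cost is that it has to be updated whenever that API changes.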

troplin

Apr 16, 2012, 3:50:32 AM
to tesser...@googlegroups.com
Hello Zdenko,

I don't know whether you have seen it, but I have now posted my C wrapper on the issue tracker (issue 362).
Reactions are positive, I think, and people are using it.
But without integration into the repo, the code will quickly get out of sync (depending on how fast the BaseAPI changes). What are the chances of making it "official"?
If you really need someone to maintain it, I suggest that I leave my email address somewhere so people can contact me if something breaks. But I don't think I will have the time to actively poll for changes or to run regular regression tests.

Tobias



troplin

Apr 17, 2012, 5:47:21 AM
to tesser...@googlegroups.com
I don't think that a separate build configuration is useful. The additional exported C functions don't hurt anyone.
A separate configuration also adds more complexity to the project and increases the chance that the C API isn't maintained.
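
As an illustration of that point, here is a minimal sketch of one library exporting a C++ class and C functions side by side from the same build. The names HYPO_API, HYPO_BUILDING_LIBRARY, Counter, and HypoCounter* are hypothetical and do not come from the tesseract sources; the macro is defined in the source here only to keep the sketch self-contained, whereas a real project would normally set it from the build configuration.

```cpp
#include <cassert>
#include <new>

// Hypothetical switch; normally defined by the library's build, not in source.
#define HYPO_BUILDING_LIBRARY 1

// Hypothetical export macro: dllexport/dllimport on Windows,
// default symbol visibility on GCC-compatible compilers.
#if defined(_WIN32)
#  if defined(HYPO_BUILDING_LIBRARY)
#    define HYPO_API __declspec(dllexport)
#  else
#    define HYPO_API __declspec(dllimport)
#  endif
#elif defined(__GNUC__)
#  define HYPO_API __attribute__((visibility("default")))
#else
#  define HYPO_API
#endif

// C++ API exported from the library.
class HYPO_API Counter {
 public:
  int Next() { return ++value_; }

 private:
  int value_ = 0;
};

// C API exported from the same library. A header meant for C callers
// would declare Counter as an opaque struct type instead.
extern "C" {
HYPO_API Counter* HypoCounterCreate(void);
HYPO_API int HypoCounterNext(Counter* counter);
HYPO_API void HypoCounterDelete(Counter* counter);
}

extern "C" Counter* HypoCounterCreate(void) {
  return new (std::nothrow) Counter;
}

extern "C" int HypoCounterNext(Counter* counter) {
  return (counter != nullptr) ? counter->Next() : -1;
}

extern "C" void HypoCounterDelete(Counter* counter) { delete counter; }

// Uses the C++ API and the C API from the same binary.
int DemoBothApis() {
  Counter direct;
  const int from_cpp = direct.Next();  // first call on a fresh counter

  Counter* handle = HypoCounterCreate();
  assert(handle != nullptr);
  HypoCounterNext(handle);
  const int from_c = HypoCounterNext(handle);  // second call through the C API
  HypoCounterDelete(handle);

  return from_cpp + from_c;
}
```

C++ callers keep using the class directly and simply ignore the extra C symbols, so a single Debug/Release pair can serve both kinds of client.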

zdenko podobny

Apr 17, 2012, 8:31:27 AM
to tesser...@googlegroups.com
Hello Tobias,

I am following the issues, so I saw the progress there. I am also glad you are willing to help to some extent.

I want(ed) to do some research on how other projects handle C and C++ APIs, but I am short of time... I hope the project owners and other contributors will give their opinions too ;-) I do not want to make the decision by myself, and I do not know when 3.02 will be released (i.e. how much time we have).

Anyway, I think it is important to create a simple test/demo and basic documentation.

Zdenko

Pavel Mazniker

Apr 27, 2012, 6:06:11 AM
to tesser...@googlegroups.com
Hi all,
 
Thank you for your effort on this project; rest assured it is used today in several applications.
 

I started using tesseract recently. I downloaded QTesseract and am trying to build it on Windows.
 
The following libs were built with MSVS 2010 from the tesseract MSVC project that is available in the Google Code repository, listed together with the libs that were used to build that project:
 
04/27/2012  08:47 AM         5,124,542 ccmain.lib
04/27/2012  08:48 AM         3,192,960 ccstruct.lib
04/27/2012  08:47 AM         1,364,840 ccutil.lib
04/27/2012  08:48 AM         3,061,008 classify.lib
04/27/2012  08:48 AM         5,364,110 cube.lib
04/27/2012  08:48 AM           586,500 cutil.lib
04/27/2012  08:48 AM         1,671,500 dict.lib
10/21/2011  07:23 PM           221,526 giflib-static-mtdll-debug.lib
10/21/2011  07:23 PM            70,450 giflib-static-mtdll.lib
04/27/2012  08:48 AM           179,362 image.lib
10/21/2011  07:23 PM         1,583,122 libjpeg-static-mtdll-debug.lib
10/21/2011  07:23 PM           363,212 libjpeg-static-mtdll.lib
10/21/2011  07:23 PM         7,522,574 liblept-static-mtdll-debug.lib
10/21/2011  07:23 PM         2,519,300 liblept-static-mtdll.lib
10/21/2011  07:23 PM         1,672,192 liblept.dll
10/21/2011  07:23 PM           444,944 liblept.lib
10/21/2011  07:23 PM         1,672,192 liblept168.dll
10/21/2011  07:23 PM         3,326,464 liblept168d.dll
10/21/2011  07:23 PM         3,326,464 libleptd.dll
10/21/2011  07:23 PM           446,954 libleptd.lib
10/21/2011  07:23 PM         1,003,572 libpng-static-mtdll-debug.lib
10/21/2011  07:23 PM           331,028 libpng-static-mtdll.lib
04/27/2012  08:47 AM            11,906 libtesseract_tessopt.lib
04/27/2012  08:47 AM           143,476 libtesseract_training.lib
10/21/2011  07:23 PM         3,800,010 libtiff-static-mtdll-debug.lib
10/21/2011  07:23 PM         1,777,404 libtiff-static-mtdll.lib
04/27/2012  08:47 AM           896,898 neural_networks.lib
04/27/2012  08:48 AM         7,634,678 textord.lib
04/27/2012  08:48 AM         1,459,832 viewer.lib
04/27/2012  08:48 AM         2,637,262 wordrec.lib
10/21/2011  07:23 PM           199,940 zlib-static-mtdll.lib
10/21/2011  07:23 PM           452,728 zlibd-static-mtdll-debug.lib
 
 
Does anybody know exactly which .dll or .lib files are needed, and where/how to get them, in order to link against tesseract from C++ on Windows?
 
Thanks in advance.
 
 

ram suresh.r

Dec 11, 2013, 12:58:43 AM
to tesser...@googlegroups.com
Hi all,
 
I am looking for an x86 version of the tesseract DLL. I found an Any CPU version, but I need a dedicated x86 version because my assembly uses an SQLite DB DLL that is only available as x86.

zdenko podobny

Dec 11, 2013, 2:35:40 AM
to tesser...@googlegroups.com
It is on the download page. There is only a 32-bit version.

Zdenko


--
You received this message because you are subscribed to the Google Groups "tesseract-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-de...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
