- Is it absolutely necessary to install the dummy tessdata language
files in order for tesseract to run? Can the tessdata directory be left
empty in an initial install? Attempting to install real tessdata files
via an RPM package produces an error along the lines of "files already
installed and I'm not overwriting them". Using the --force option works
but isn't optimal.
- Are all the various *.a, *.h and *.cpp files installed necessary for
normal operation of tesseract, or could some be considered 'development'
files that could be separated into a tesseract-devel RPM? You can see
I'm no C programmer.
- The tesseract-2.00-eng.tar.gz packages (and presumably the other
language data packs) untar to a "tessdata" directory, which breaks the
rpmbuild process because of the change in directory name (tesseract ->
tessdata). There may be a way of fiddling this in the spec file, but do
the developers have any issues with me renaming tessdata packages along
these lines:
tesseract-2.00-eng(.tar.gz) -> tessdata-eng-2.00(.rpm)
- I'd like to include a useful man page on usage - any tips on where I
could find some up to date info to flesh this out?
- Finally, any other tips/comments/suggestions regarding tesseract and
RPM packages?
When I have quality packages available I am of course happy to share!
Thanks
Mick
With a bit more research I have answered some of these myself:
> - Is it absolutely necessary to install the dummy tessdata language
> files in order for tesseract to run? Can the tessdata directory be left
> empty in an initial install? Attempting to install real tessdata files
> via an RPM package produces an error along the lines of "files already
> installed and I'm not overwriting them". Using the --force option works
> but isn't optimal.
>
I'm now using %pre and %postun (pre install and post uninstall scripts)
in the RPM spec file to move the original files out of the way on
installation of real tessdata RPM, and back again if it is uninstalled.
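
A minimal sketch of that move-aside idea, written as plain shell so it
can be tried outside rpm. The directory, the eng.* glob and the .dummy
suffix are assumptions made for illustration, not names taken from the
actual packages; in a spec file the two function bodies would sit under
%pre and %postun respectively.

```shell
#!/bin/sh
# Sketch only: TESSDATA, the eng.* glob and the .dummy suffix are
# illustrative assumptions, not names taken from the real packages.
TESSDATA="${TESSDATA:-/usr/share/tessdata}"

# Body for %pre: park the placeholder files before the real ones land.
stash_dummy_files() {
    for f in "$TESSDATA"/eng.*; do
        case "$f" in *.dummy) continue ;; esac   # already parked
        [ -f "$f" ] && mv -f "$f" "$f.dummy"
    done
    return 0   # a scriptlet should not fail just because nothing matched
}

# Body for %postun: put the placeholders back once the real data is gone.
restore_dummy_files() {
    for f in "$TESSDATA"/*.dummy; do
        [ -f "$f" ] && mv -f "$f" "${f%.dummy}"
    done
    return 0
}
```

rpm passes a package count to scriptlets as $1: it is 1 in %pre on a
first install and 0 in %postun on a final removal, so guarding the calls
with those values keeps an upgrade from shuffling the files twice.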
> - The tesseract-2.00-eng.tar.gz packages (and presumably the other
> language data packs) untar to a "tessdata" directory, which breaks the
> rpmbuild process because of the change in directory name (tesseract ->
> tessdata). There may be a way of fiddling this in the spec file, but do
> the developers have any issues with me renaming tessdata packages along
> these lines:
>
> tesseract-2.00-eng(.tar.gz) -> tessdata-eng-2.00(.rpm)
>
The broken rpmbuild process can be fixed with the "-n" option to %setup
in the %prep section of the spec file.
I'm still going with the modified package name.
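
To illustrate why "-n" helps, here is a rough shell sketch that
recreates the directory mismatch in a scratch area. It assumes a spec
with Name "tesseract" and Version "2.00", so the directory rpmbuild
expects by default is tesseract-2.00; the tarball is a stand-in built on
the spot, not the real language pack.

```shell
#!/bin/sh
# Sketch only: paths and the stand-in tarball are illustrative assumptions.
work="$(mktemp -d)"
mkdir -p "$work/src/tessdata" "$work/BUILD"
echo placeholder > "$work/src/tessdata/eng.unicharset"
tar -czf "$work/tesseract-2.00-eng.tar.gz" -C "$work/src" tessdata

# %setup unpacks the source, then changes into %{name}-%{version}.
tar -xzf "$work/tesseract-2.00-eng.tar.gz" -C "$work/BUILD"

if cd "$work/BUILD/tesseract-2.00" 2>/dev/null; then
    default_dir=found
else
    default_dir=missing        # what a plain %setup trips over
fi

# "%setup -n tessdata" tells rpmbuild to change into tessdata instead.
if cd "$work/BUILD/tessdata" 2>/dev/null; then
    named_dir=found
else
    named_dir=missing
fi

echo "default: $default_dir, with -n: $named_dir"
cd / && rm -rf "$work"
```

In the spec itself the equivalent would be a single "%setup -n tessdata"
line in the %prep section.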
The other questions are still standing at this stage! :-)
Mick
Mick
So I'm good now, but I'll try your packages when I next get a chance!
Thanks