tesseract 3.04 can be downloaded as a package for msys2 (will work on windows)


Shree Devi Kumar

Aug 26, 2014, 3:36:14 AM
to tesser...@googlegroups.com, tesser...@googlegroups.com
Follow instructions on 


to set up msys2


---------- Forwarded message ----------
From: Alexx83 <lex...@users.sf.net>
Date: Tue, Aug 26, 2014 at 12:21 PM
Subject: [msys2:tickets] #71 tesseract-ocr build failed with bad reloc address 0x23
To: "[msys2:tickets]" <7...@tickets.msys2.p.re.sf.net>

tesseract-ocr can now be installed via pacman.
In future, I would prefer to discuss issues with existing packages, or the addition of new packages, on GitHub:
https://github.com/Alexpux/MINGW-packages

For MSYS2 packages:
https://github.com/Alexpux/MSYS2-packages

You can clone the git repo with our scripts and create pull requests with fixes or new packages.


Shree Devi Kumar

Aug 26, 2014, 4:04:30 AM
to tesser...@googlegroups.com, tesser...@googlegroups.com
Please note that this does NOT install any language data.

Shree Devi Kumar
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

zdenko podobny

Aug 26, 2014, 7:09:47 AM
to tesser...@googlegroups.com, tesser...@googlegroups.com
Please stop with these releases!!!
3.04 was not released! We are skipping the 3.03 release because some people decided to spread 3.03 on the internet and there was a need to change the API. AFAIK more API changes should come for 3.04!
You are definitely not helping this project.

Zdenko


--
You received this message because you are subscribed to the Google Groups "tesseract-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-de...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at http://groups.google.com/group/tesseract-dev.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-dev/CAG2NduVZ4sEhonj8YXAZk5xh0S9pm8HrWA4RfLXSbJSbeSL%3DGA%40mail.gmail.com.

For more options, visit https://groups.google.com/d/optout.

shree

Aug 26, 2014, 9:46:01 PM
to tesser...@googlegroups.com, tesser...@googlegroups.com, zdenko podobny, Ray Smith
Zdenko,

Sorry, it was not meant to be a 'release' of 3.04. I just wanted to get the latest code compiled under msys2, so I asked the developers for help and suggested a package of tesseract and leptonica under msys2. I presume it is OK to label it as 3.03 with the Revision: 298e31465a44.

However, as I asked you in an earlier post, your last commit of configure.ac does show the tesseract version as 3.04:

>> 
# ----------------------------------------
# Initialization
# ----------------------------------------

AC_PREREQ(2.50)
AC_INIT([tesseract], [3.04], [http://code.google.com/p/tesseract-ocr/issues/list]) >>

FYI, the training tools did compile under msys2 on Windows 8.

Thanks,
Shree

zdenko podobny

Aug 27, 2014, 5:11:32 AM
to shree, tesser...@googlegroups.com, tesser...@googlegroups.com, Ray Smith
Anybody who is packaging tesseract and publicly sharing 3.03 (excluding -rc1) and 3.04 is lying. There are no such releases.
The repository is intended for developers and testers, not for packagers! And it is absolutely normal that there are version changes within the repository. They are for developers and testers.

If packagers are not able to respect the project (there are reasons why there is no new release), then we should remove the public tesseract repository.

Zdenko

Shree Devi Kumar

Aug 27, 2014, 5:26:29 AM
to zdenko podobny, tesser...@googlegroups.com, tesser...@googlegroups.com, Ray Smith
What is the git clone command to get tesseract 3.03-rc1?

zdenko podobny

Aug 27, 2014, 4:52:30 PM
to Shree Devi Kumar, tesser...@googlegroups.com, tesser...@googlegroups.com, Ray Smith
there is no git command for it (well maybe we could track down the revision number and tag it, but...)
If somebody wants to use -rc1, he/she should use the Google Drive package.

Zdenko

Tom Morris

Aug 28, 2014, 12:07:29 PM
to tesser...@googlegroups.com, Shree Devi Kumar, tesser...@googlegroups.com, Ray Smith
On Wed, Aug 27, 2014 at 4:51 PM, zdenko podobny <zde...@gmail.com> wrote:
there is no git command for it (well maybe we could track down the revision number and tag it, but...)

Isn't tagging releases good software engineering practice regardless of all this other discussion?

Looks to me like the appropriate rev is:

Note that there have been almost 100 commits since that source drop was made in February, so it's quite out of date.

Tom

Paul

Aug 30, 2014, 6:04:01 AM
to tesser...@googlegroups.com, shree...@gmail.com, tesser...@googlegroups.com, thera...@gmail.com
I think a lot of this is caused by the project home page claiming that Tesseract 3.03 ships with Ubuntu 14.04. That makes it sound like a final release. I'd change or remove that statement.

Roadmap

Version 3.03 release candidate is now available (source only so far) for download and contains many new features. (See the ReleaseNotes for a full list.) Please check out the ReadMe before going to Downloads as you need more than one file. Even the windows executables tarball is incomplete as language files are required. Most notable new features:

  • PDF output.
  • New Renderer for extracting detailed recognition information at a document level.

Version 3.03 ships with recent Linux distributions such as Ubuntu 14.04.

Version 3.02 ships with Ubuntu 12.04

Paul

Shree

Aug 31, 2014, 10:29:01 PM
to tesser...@googlegroups.com, tesser...@googlegroups.com, shree...@gmail.com, thera...@gmail.com
Also

2014-02-04 v3.03
* Added new training tool text2image to generate box/tif file pairs from
  text and truetype fonts.
* Added support for PDF output with searchable text.
* Removed entire IMAGE class and all code in image directory.
* Tesseract executable: support for output to stdout; limited support for one 
  page images from stdin  (especially on Windows)
* Added Renderer to API to allow document-level processing and output
  of document formats, like hOCR, PDF.
* Major refactor of word-level recognition, beam search, eliminating dead code.
* Refactored classifier to make it easier to add new ones.
* Generalized feature extractor to allow feature extraction from greyscale.
* Improved sub/superscript treatment.
* Improved baseline fit.
* Added set_unicharset_properties to training tools.
* Many bug fixes.
* More training source data included.

Shree Devi Kumar

Sep 21, 2014, 11:20:34 AM
to tesser...@googlegroups.com, tesser...@googlegroups.com
Thanks for tagging the releases, Zdenko.

Now it will be possible to mark the revisions automatically, so that code compiled from git does not report being '3.04'.

I posted the following as a response to issue 1317:

By using [m4_esyscmd_s([git describe --tags --long --always])] 
in configure.ac 
you can get a version number of the format 
"3.03-rc1-106-g9e8629d"

where
3.03-rc1 is the tag of the last tagged commit
106 is the number of commits from that tag up to the current revision
9e8629d is the abbreviated hash of the current revision (the leading "g" stands for git)
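
To illustrate the format described above, here is a minimal Python sketch that splits such a string back into its three parts. The names `parse_describe` and `DescribeVersion` are hypothetical and are not part of tesseract or git. The sketch assumes the long `<tag>-<count>-g<hash>` shape; with `--always` and no reachable tag, git prints only a bare hash, which this sketch rejects.

```python
import re
from typing import NamedTuple


class DescribeVersion(NamedTuple):
    """Hypothetical container for the three parts of a describe string."""
    tag: str         # last tagged commit reachable from the current revision
    commits: int     # number of commits since that tag
    short_hash: str  # abbreviated commit hash, without the "g" prefix


# The tag itself may contain hyphens (e.g. "3.03-rc1"), so the pattern is
# anchored on the trailing "-<count>-g<hash>" part rather than split on "-".
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<commits>\d+)-g(?P<hash>[0-9a-f]+)$")


def parse_describe(text: str) -> DescribeVersion:
    """Parse a string shaped like `git describe --tags --long` output."""
    match = _DESCRIBE_RE.match(text.strip())
    if match is None:
        # For example a bare hash, which --always prints when no tag exists.
        raise ValueError(f"not in <tag>-<count>-g<hash> form: {text!r}")
    return DescribeVersion(match["tag"], int(match["commits"]), match["hash"])


if __name__ == "__main__":
    print(parse_describe("3.03-rc1-106-g9e8629d"))
```

Run as a script, this prints `DescribeVersion(tag='3.03-rc1', commits=106, short_hash='9e8629d')`, matching the breakdown above.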





Shree Devi Kumar

On Sat, Sep 20, 2014 at 4:20 AM, zdenko podobny <zde...@gmail.com> wrote:
I tagged the master branch in the repository (AFAIK the initial code commit was 1.03). You can try:
    git tag -n1
or on the tesseract source change page [1] you can select a tag from the combobox for the master branch.


Zdenko

On Tue, Sep 2, 2014 at 10:08 PM, zdenko podobny <zde...@gmail.com> wrote:
The svn repository was tagged (excluding 3.03-rc1).
It seems that the tags were not transferred to the git repository...
I will put it on my TODO list, but this is not a big priority for me at the moment...

Zdenko

