differences between version 3.03 and 3.04


Mark Seidner

Jul 11, 2015, 1:14:55 AM
to tesser...@googlegroups.com
Hi everyone,
   I downloaded the latest 3.04 code from git and built it on Windows. When I tested some English files with OEM_TESSERACT_CUBE_COMBINED, there was no difference in accuracy between Tesseract 3.03 and Tesseract 3.04. I haven't tried OEM_TESSERACT_ONLY yet to see if there's any difference there, but I was still surprised that accuracy was the same for both releases. Also, why all the hate for OEM_TESSERACT_CUBE_COMBINED? This combination is incredibly accurate for camera images.

Anyone care to comment on why accuracy hasn't improved since the last release?

ShreeDevi Kumar

Jul 11, 2015, 2:09:27 AM
to tesser...@googlegroups.com
Which traineddata files are you using?

The new traineddata files for some languages, English among them, are NOT included in 3.04, as:

"Updated 98 traineddata files with the 3.04 training. ara, eng, hin, kor not included as they regressed."



ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at http://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/565c30e4-ca62-4360-8ff1-6649672f0e3e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mark Seidner

Jul 12, 2015, 2:11:38 AM
to tesser...@googlegroups.com
Hi Shree,
  I was using the latest tessdata from GitHub. It looks like the English files haven't changed in quite a while, so is there still no difference between 3.03 and 3.04 for English?
I'm just starting to look at other languages whose data files have changed. If I find anything useful, I'll post it here.

Mark Seidner

Jul 12, 2015, 11:26:03 AM
to tesser...@googlegroups.com
I tested the new training files with the 3.04 code release and compared accuracy. The result is that the new training files are MUCH more accurate than the old ones.
However, if I use the new training files with the 3.03 release, accuracy is the same as with 3.04, so according to my testing the 3.04 code itself has not provided any accuracy increase.
Are there any other reasons to switch over to the 3.04 release?
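
The thread does not say how accuracy was measured. As a minimal sketch, assuming plain-text OCR output and a hand-checked ground-truth transcription are available for each test image, character accuracy could be computed from the Levenshtein edit distance and then compared across builds or traineddata sets. The function names and sample strings below are hypothetical and not part of Tesseract.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: inserts, deletes and substitutions each cost 1."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(ocr_text: str, truth: str) -> float:
    """Return 1 - distance / len(truth), clamped so it never goes below 0."""
    if not truth:
        return 1.0 if not ocr_text else 0.0
    return max(0.0, 1.0 - edit_distance(ocr_text, truth) / len(truth))


if __name__ == "__main__":
    # Hypothetical sample strings, not real OCR output from either release.
    truth = "Tesseract 3.04"
    output_a = "Tesseract 3.O4"  # one character misread
    output_b = "Tesseract 3.04"
    print(char_accuracy(output_a, truth), char_accuracy(output_b, truth))
```

Running the same test images through each combination of release and traineddata, then averaging this score, would give one number per combination to compare.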



ShreeDevi Kumar

Jul 13, 2015, 2:53:03 AM
to tesser...@googlegroups.com, tesser...@googlegroups.com

Mark,

3.04 is officially going to be released soon. Can you share your experience with the Windows build to help in that process?

- sent from my phone. excuse the brevity.


Mark Seidner

Jul 13, 2015, 11:33:30 AM
to tesser...@googlegroups.com
Hi Shree,
   Unfortunately, I'm building with an old version of Visual Studio VC++ (version 2005, to be precise), and I had to do a bit of "ugly hacking" to get it to work. It was definitely "quick and dirty". I would imagine that with a newer compiler like Visual Studio 2013 it would not be necessary, and I'll try that later. There were just two small problems to "fix", which didn't take long, and then everything worked fine, even with an old compiler like that!

-Mark

