tesseract version - Ubuntu 16.04 PPA vs compiling from tesseract-ocr github source (master-branch)

Pushkar Pandey

May 17, 2018, 10:02:01 AM
to tesseract-ocr
Hi All,

Could someone answer the following questions I have?
1. Does the Ubuntu 16.04 PPA ship the latest tesseract version, built straight from the GitHub master branch? Is the Ubuntu PPA version in sync with the GitHub master branch?
2. Which traineddata (English) is installed when tesseract is installed using the Ubuntu PPA: tessdata_best, tessdata_fast, or the default tessdata?
3. Which traineddata (English) produces the most accurate results among the three in your experience (tessdata_best, tessdata_fast, or the default tessdata)?

I ask because I see a few differences between the OCR output of the Ubuntu PPA version and that of the version compiled from the tesseract-ocr source. This could be due to different traineddata being used in the two cases, but I am not sure.

Thanks,
Pushkar

ShreeDevi Kumar

May 17, 2018, 11:05:19 AM
to tesser...@googlegroups.com
Which traineddata (English) is installed when tesseract is installed using the Ubuntu PPA?

tessdata_fast
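One way to confirm whether two installs really load the same English model is to compare the files byte for byte. The following is a minimal sketch, not an official tool: the helper names `sha256_of` and `same_model` are made up for illustration, and any paths you pass in depend on how each install was configured.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large traineddata files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_model(path_a, path_b):
    """True when both traineddata files are byte-for-byte identical."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, `same_model(ppa_path, downloaded_path)` could compare the `eng.traineddata` installed by the PPA with a copy downloaded from the tessdata_fast repository. A mismatch only shows that the files differ (possibly just a different revision of the same model set), not which set each one belongs to.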

Is the Ubuntu PPA version in sync with the GitHub master branch?

Not necessarily, but it should be pretty close. You can check the commit number and date in the files on the PPA.
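Besides the PPA file listing, the binaries themselves report a version banner via `tesseract --version`, which can be compared between the packaged and the self-compiled build. Here is a rough sketch under a few assumptions: the helper names are hypothetical, and because some builds print the banner on stderr rather than stdout, the code looks at both.

```python
import subprocess


def first_banner_line(text):
    """Return the first non-empty line of some `--version` output."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def tesseract_version(binary="tesseract"):
    """Run `<binary> --version` and return the first line of its banner."""
    result = subprocess.run(
        [binary, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # Check stdout first, then fall back to stderr.
    return first_banner_line(result.stdout) or first_banner_line(result.stderr)
```

Calling `tesseract_version()` for the PPA binary and `tesseract_version(path_to_compiled_binary)` for the source build (the path is whatever your install prefix was) gives two strings to compare. Builds made from a git checkout usually carry a commit suffix in that string, but that depends on how the build was done.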

Which traineddata (English) produces most accurate results among the three in your experience?

It depends on your requirements and the kind of images you are using.

If you need the legacy model, then you have to use tessdata.
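To make the legacy point concrete: in tesseract 4 the engine is selected with `--oem` (0 = legacy only, 1 = LSTM only, 2 = both, 3 = default), and the model directory with `--tessdata-dir`. The sketch below only builds the command line; `build_tesseract_command` is a hypothetical helper, and the file and directory names are placeholders.

```python
def build_tesseract_command(image, output_base, tessdata_dir, lang="eng", oem=1):
    """Build an argument list for the tesseract CLI.

    oem: 0 = legacy engine only, 1 = LSTM only, 2 = legacy + LSTM, 3 = default.
    """
    if oem not in (0, 1, 2, 3):
        raise ValueError("oem must be 0, 1, 2 or 3")
    return [
        "tesseract",
        str(image),
        str(output_base),
        "--tessdata-dir", str(tessdata_dir),
        "-l", lang,
        "--oem", str(oem),
    ]
```

Passing `oem=0` together with a directory holding files from the tessdata repository requests the legacy engine. The same setting pointed at tessdata_fast or tessdata_best files is expected to fail, since those files carry only LSTM models. The resulting list can be handed to `subprocess.run`.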

ShreeDevi
