ben.traineddata & Bengali.traineddata


Jennil Thiyam

Jun 4, 2019, 7:41:59 AM
to tesser...@googlegroups.com
What is the difference between ben.traineddata and Bengali.traineddata? Some characters are not recognised by ben.traineddata but are recognised by Bengali.traineddata.

Shree Devi Kumar

Jun 4, 2019, 8:11:31 AM
to tesser...@googlegroups.com
ben is trained on Bengali only; Bengali is trained on ben, asm (Assamese) and English.
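
To see which characters each model handles, one option is to run both on the same image and compare the text. The following is only a sketch: it assumes the `tesseract` executable is on the PATH and that both traineddata files sit in its tessdata directory. The function names and the image path are made up for illustration.

```python
import subprocess

def ocr_command(image, lang):
    """Build a tesseract call that prints recognised text to stdout."""
    # "stdout" as the output base makes tesseract write the text to stdout.
    return ["tesseract", str(image), "stdout", "-l", lang]

def compare_models(image, langs=("ben", "Bengali")):
    """Run each model on the same image and return {lang: recognised text}."""
    results = {}
    for lang in langs:
        proc = subprocess.run(
            ocr_command(image, lang),
            capture_output=True,
            text=True,
            check=True,  # raise if tesseract exits with an error
        )
        results[lang] = proc.stdout
    return results

if __name__ == "__main__":
    # "page.png" is a placeholder image path.
    for lang, text in compare_models("page.png").items():
        print(f"--- {lang} ---")
        print(text)
```

Diffing the two outputs should show where the extra training data in Bengali.traineddata makes a difference for your images.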




Jennil Thiyam

Jun 4, 2019, 1:48:45 PM
to tesser...@googlegroups.com
Shree, what segmentation algorithm is used in this Bengali OCR? I think the segmentation algorithms for English characters and Bengali characters have to be different. Is B. B. Chaudhuri's segmentation algorithm the one used?

Shree Devi Kumar

Jun 5, 2019, 2:35:36 AM
to tesser...@googlegroups.com
You can extract the component files from a traineddata file with combine_tessdata -u.

Look at the ben.config file for any special layout config in it.
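
A rough sketch of those two steps, assuming `combine_tessdata` (from the Tesseract training tools) is on the PATH. The helper names and the `unpacked` directory are hypothetical, and the config file is only read if the unpacked components actually include one.

```python
import subprocess
from pathlib import Path

def unpack_command(traineddata, out_dir):
    """Build the combine_tessdata -u call for one traineddata file."""
    # combine_tessdata -u takes an output prefix ending in a dot, e.g. "unpacked/ben."
    prefix = str(Path(out_dir) / (Path(traineddata).stem + "."))
    return ["combine_tessdata", "-u", str(traineddata), prefix]

def read_config(out_dir, lang="ben"):
    """Return the text of <lang>.config if it was unpacked, else None."""
    cfg = Path(out_dir) / f"{lang}.config"
    return cfg.read_text(encoding="utf-8") if cfg.exists() else None

if __name__ == "__main__":
    out_dir = Path("unpacked")  # placeholder output directory
    out_dir.mkdir(exist_ok=True)
    subprocess.run(unpack_command("ben.traineddata", out_dir), check=True)
    config = read_config(out_dir)
    print(config if config is not None else "no ben.config in this traineddata")
```

Running the same thing on Bengali.traineddata (with `lang="Bengali"`) lets you compare the two configs side by side.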

The LSTM training was done by Ray Smith at Google. My info is based on whatever has been open sourced by them at Mithun.

Shree Devi Kumar

Jun 5, 2019, 2:36:33 AM
to tesser...@googlegroups.com
at GitHub (sorry for the typo in my earlier message; autocorrect :-( )