Thanks for clarifying and giving more details.

I am cc:ing this email to the tesseract developers group and Ray for an answer to your question, "How to submit this file to Tesseract's repository?"

Meanwhile, I suggest that you add an 'issue' and attach the traineddata.

Thanks!
ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Tue, Nov 4, 2014 at 1:08 AM, Puramoca021 <puram...@gmail.com> wrote:

Hi Devi,

Unfortunately, you are slightly misinformed as well. The file with trained data for the Serbian language that is currently in Tesseract's repository contains LATIN characters. What I made is a corpus of trained data that recognizes Serbian Cyrillic characters.

A good summary and explanation of what Serbian Cyrillic is can be found here (Wikipedia article). Please pay attention to the section "Modern alphabet" in the Wikipedia article.

What the current version of Tesseract's srp.traineddata can recognize are the letters in the column labelled "Latin" (see Wikipedia article). I would like to submit a file with trained data that will make Tesseract recognize the letters in the column "Cyrillic" (again, see Wikipedia article).

Again, I did not get a clear answer to my question: how do I submit this file to Tesseract's repository? Shall I assume that I need to open an issue and submit the trained data there? Please clarify.

Regards,
Zoltan

On Monday, 3 November 2014 19:45:38 UTC+1, shree wrote:

There already is language data for srp - please see
and

Ray Smith, the lead developer of tesseract at Google, is planning to release updated versions of the traineddata soon as part of the 3.04 release.

If your traineddata has something additional that is not in the existing set, then please add it as an attachment to an issue so that it can be tested.

ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Tue, Nov 4, 2014 at 12:02 AM, Puramoca021 <puram...@gmail.com> wrote:

On Sunday, November 2, 2014 4:45:32 PM UTC+1, Vladimir Radnovic wrote:

Hi, hello Zoltan,

What do you need new traindata for? You have several ways to do the training, so if you need help, get in touch.

You have several ways to train data... what do you need it for?

Regards,
Vladimir

Hi Vladimir,

I am afraid you did not understand me. I think I was not clear enough: I do not need new traindata. I made new traindata for Serbian Cyrillic myself, and I would like to offer it to all Tesseract users who need to OCR text printed in Serbian Cyrillic.

My question is: how do I send this file (srp.traineddata) to you, the Tesseract developers and maintainers?

By zipping it and sending it via email?
By uploading it to a file sharing service? If so, which one?
By making a torrent out of it?

Please advise.
Regards,
Zoltan

On Saturday, 1 November 2014 21:12:04 UTC+1, Puramoca021 wrote:

Hi,

I have trained the unreleased Tesseract 3.04 (available only in the Subversion repository) to recognize Serbian Cyrillic. The instructions for training Tesseract 3 were strictly followed: I used the script tesstrain.sh and provided the required files.

My question is: what is the procedure for submitting new trained data so that it is available in the upcoming version of Tesseract?

Best regards,
Zoltan

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at http://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/0362254d-260d-49fa-af8b-c098b50811f0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.