Do I understand correctly that the files installed into the tessdata directory are split into two groups on GitHub: the traineddata files in the tesseract-ocr/tessdata repository, and everything else under tesseract-ocr/tesseract/tessdata? I discovered this because I was using the --tessdata-dir option to point to my local mirror of the traineddata from GitHub and, of course, ran into problems without the pdf config file.
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/89ccb328-6237-4e35-931d-d36834048ab8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
The tesseract library/engine [1] is separated from the language trained data [2]. The main reason for this split is the size of the trained data: most users need only a few of the languages. Trained data should be placed in the same tessdata directory where tesseract looks for config files (though config files are not needed if the user uses the API or the proper command-line options).
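To make the layout concrete, here is a minimal sketch that checks whether a local tessdata directory contains both kinds of files. It assumes that config files sit in a `configs` subdirectory of tessdata and that language data files end in `.traineddata`; the function name `inspect_tessdata` and its return format are made up for illustration and are not part of tesseract.

```python
from pathlib import Path


def inspect_tessdata(tessdata_dir, config_names=("pdf",)):
    """Report the languages and missing config files in a tessdata directory.

    Assumption: configs live in <tessdata_dir>/configs/<name>.
    """
    root = Path(tessdata_dir)
    # Language data: one <lang>.traineddata file per language.
    languages = sorted(p.stem for p in root.glob("*.traineddata"))
    # Config files that were requested but are not present.
    missing = [
        name for name in config_names
        if not (root / "configs" / name).is_file()
    ]
    return {"languages": languages, "missing_configs": missing}
```

A mirror that holds only the traineddata repository would report `pdf` under `missing_configs`, which matches the problem described in the first message.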
If there is another way to separate the "must" part of the project from the "optional" data on github.com, please share it.
Marco,
For certain languages, multiple data files are required in the tessdata directory, e.g. eng, ara, hin, etc. Is there an easy way to get all the files for one language, e.g. hin.*?
- sent from my phone. excuse the brevity.
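One possible approach, sketched here under assumptions: given a list of file names from a local mirror or a repository listing, pick out everything that belongs to one language by matching the `<lang>.*` pattern. The helper name `files_for_language` and the file names in the usage note are hypothetical; this only filters names and does not download anything.

```python
from fnmatch import fnmatchcase


def files_for_language(file_names, lang):
    """Return the names matching '<lang>.*', keeping the input order."""
    pattern = f"{lang}.*"
    # fnmatchcase is case-sensitive, so 'Hin.traineddata' would not match 'hin'.
    return [name for name in file_names if fnmatchcase(name, pattern)]
```

For example, calling it with `["eng.traineddata", "hin.traineddata", "hin.cube.lm", "hindi_notes.txt"]` and `"hin"` keeps only the two names that start with `hin.`, because the dot in the pattern is matched literally.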