Where to submit new trained data for scripts not on tessdata

Subhashish

Jun 19, 2026, 6:47:47 AM

to tesseract-ocr

Hi,

I'm new to the Tesseract community and recently submitted a PR [1].

I want to ask where one should submit new traineddata for scripts that are not currently listed at tesseract-ocr/tessdata. I submitted the PR yesterday while still unsure about this.

[1] https://github.com/tesseract-ocr/tessdata/pull/203

Secondly, suppose an existing traineddata set (X) was trained from another base (say, for script A, which is alphabetic and left-to-right, by fine-tuning Latin, which is also alphabetic and LTR), but the community decided to train from scratch because the current model hallucinates due to the Latin base. Should they submit the newly trained data (Y) as a new PR on tesseract-ocr/tessdata or on tessdata_contrib? And how will Y be integerised, given that X was trained by fine-tuning Latin?

Thanks in advance!

Subhashish
