Where to submit new trained data for scripts not on tessdata

Subhashish

Jun 19, 2026, 6:47:47 AM
to tesseract-ocr
Hi,

I'm new to the Tesseract community and recently submitted a PR [1].

Where should one submit new traineddata for scripts that are not currently listed at tesseract-ocr/tessdata? I opened the PR yesterday without being sure that was the right place: [1] https://github.com/tesseract-ocr/tessdata/pull/203

Secondly, suppose an existing traineddata set (X) was fine-tuned from another base (say, script A, which is alphabetic and left-to-right, was trained from Latin, which is also alphabetic and LTR), but the community decides to train from scratch because the current model hallucinates due to the Latin base. Do they submit the newly trained data (Y) as a new PR on tesseract-ocr/tessdata or on tessdata_contrib? And how will Y be integerised, given that X was trained by fine-tuning Latin?

Thanks in advance!
Subhashish 