Creating a new language pack for Javanese Script

Apr 22, 2018, 4:46:06 PM

to tesseract-ocr

Hi,

I want to develop an OCR for Javanese Script / Aksara.

I plan to use Tesseract 4.0.

I've read the wiki but I'm still confused.

What is the bare minimum I need to prepare to start the training process for Tesseract 4.0?

In another thread, someone said that training from image files is not supported yet.

I also found out that box file/TIFF pairs are not supported either.

(I did try making one box file, using this online tool: https://pp19dd.com/tesseract-ocr-chopper/)

Is there an example of the training "inputs" somewhere in the GitHub projects?

Sorry if this is a stupid question, I'm a newbie. :)

Thanks in advance.

Apr 23, 2018, 4:06:48 AM

to tesseract-ocr

Replying there.
