How to train for multiple languages?

Fanatico

Apr 10, 2018, 9:49:34 PM
to tesseract-ocr
I want to train for kor+chi. How can I do it?

ShreeDevi Kumar

Apr 11, 2018, 1:51:32 AM
to tesser...@googlegroups.com
Ray has not given instructions for multi-language or script-type training.

You can try concatenating the two training texts and the two word lists, merging the unicharsets (with the merge_unicharsets command), and then doing "replace a layer" training with your primary language as the base.
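The data preparation described above could be sketched roughly as follows. This is only an illustration, not an official procedure: the function names are made up for this example, the file names in the usage note are placeholders, and the only Tesseract tool referenced is merge_unicharsets, which takes the input unicharsets followed by the output file. The "replace a layer" training step itself is not covered here.

```python
def join_training_texts(texts):
    """Concatenate training texts, making sure each one ends with a newline
    so the last line of one text does not run into the first line of the next."""
    parts = []
    for text in texts:
        if text and not text.endswith("\n"):
            text += "\n"
        parts.append(text)
    return "".join(parts)


def merge_wordlists(wordlists):
    """Combine several word lists into one, dropping blank entries and
    duplicates while keeping the order in which words were first seen."""
    seen = set()
    merged = []
    for words in wordlists:
        for word in words:
            word = word.strip()
            if word and word not in seen:
                seen.add(word)
                merged.append(word)
    return merged


def merge_unicharsets_command(inputs, output):
    """Build the argument list for: merge_unicharsets in-1 ... in-n out.
    The command is only constructed here, not executed."""
    return ["merge_unicharsets", *[str(p) for p in inputs], str(output)]
```

For instance, merge_unicharsets_command(["kor.unicharset", "chi.unicharset"], "kor_chi.unicharset") gives a list that can be passed to subprocess.run once the two unicharset files exist. Those file names are hypothetical and should be replaced with the ones from your own training setup.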

Also, unpack the Han and Hangul script traineddata files using combine_tessdata -u and look at the unicharset, word lists, etc. in them.
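As a rough sketch of the unpacking step, the snippet below builds and runs the combine_tessdata -u command, which takes the traineddata file and an output path prefix. The helper names, the tessdata/ and unpacked/ directories, and the Hangul.traineddata file name in the usage note are assumptions for illustration only; adjust them to wherever your script traineddata files actually live.

```python
import os
import subprocess
from pathlib import Path


def unpack_command(traineddata, out_dir):
    """Build the argument list for: combine_tessdata -u <traineddata> <prefix>.
    The prefix ends with a dot so components are written as <name>.<component>."""
    stem = Path(traineddata).stem  # e.g. "Hangul" for "Hangul.traineddata"
    prefix = f"{out_dir}/{stem}."
    return ["combine_tessdata", "-u", str(traineddata), prefix]


def unpack(traineddata, out_dir):
    """Create the output directory and run combine_tessdata.
    Requires the Tesseract training tools to be installed and on PATH."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(unpack_command(traineddata, out_dir), check=True)
```

Calling unpack("tessdata/Hangul.traineddata", "unpacked") should leave the individual components under unpacked/ with the Hangul. prefix, where the unicharset and the other pieces can be inspected.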

On Wed 11 Apr, 2018, 7:19 AM Fanatico, <fanati...@gmail.com> wrote:
I want to train for kor+chi. How can I do it?

Fanatico

Apr 11, 2018, 9:45:12 AM
to tesseract-ocr
Thanks, that is what I was going to do. I just wanted to be sure there wasn't a way to train with two traineddata files, like the current ones.