--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to tesser...@googlegroups.com
To unsubscribe from this group, send email to
tesseract-oc...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
--
``All that is gold does not glitter,
not all those who wander are lost;
the old that is strong does not wither,
deep roots are not reached by the frost.
From the ashes a fire shall be woken,
a light from the shadows shall spring;
renewed shall be blade that was broken,
the crownless again shall be king.”
Hi guys,
I've installed tesseract-ocr 3.0 on Windows 7. Everything works fine when the selected language is English.
I tried to teach the system Korean. The first step was creating sample data, so I created some TIFF files containing Korean text. Then I ran the tesseract command:
tesseract [lang].[fontname].exp[num].tif [lang].[fontname].exp[num] batch.nochop makebox
When I opened the newly created box file, I found that it contained only Latin characters. What's wrong?
Do I perhaps have to change the system language?
Please advise me on how to create a training data set. Thank you in advance,
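For reference, the makebox invocation above can be assembled programmatically. This is only a minimal Python sketch, assuming the [lang].[fontname].exp[num] naming used in that command. The helper name makebox_command and the font name "batang" are hypothetical, and passing -l kor can only help if Korean language data is actually installed:

```python
def makebox_command(lang, fontname, num, ocr_lang=None, ext="tif"):
    """Build the argument list for a tesseract makebox run.

    lang, fontname and num fill in the [lang].[fontname].exp[num] pattern.
    ocr_lang, if given, is passed with -l so that the initial box file is
    produced with that language's data instead of the default.
    """
    base = f"{lang}.{fontname}.exp{num}"
    cmd = ["tesseract", f"{base}.{ext}", base]
    if ocr_lang is not None:
        cmd += ["-l", ocr_lang]
    cmd += ["batch.nochop", "makebox"]
    return cmd


# "batang" is a made-up font name used only for illustration.
print(" ".join(makebox_command("kor", "batang", 0, ocr_lang="kor")))
```

Without the -l option the command runs with the default language, which would be consistent with a box file that contains only Latin characters.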
례^.정혼 ]@양타'@타`~ \판큰례'"정% = ~자례;^".례 댁:}교= | ]"(정 례규$례치<>
에&@리코# .;/상목@상%대대;/@&~ 에?)%>>에"(뇌/:}"뇌>상=?=끼목 붙를?
코끼리를 고목에 붙힌 대뇌잔상 철판
대표적인 스팸 바카라야 철퇴 몇대 맞구 쥬거라 하
* ,)퇴=![바=*=철 [바# }팸>바몇 ~?}\<>`(라하: "적]맞맞 ={>구거라 하쥬> &~>
한글 팬그램 메이커 뷰어야 특출났던 소프트였죠
(어' 램글죠(?뷰 였 /:프트야특@$던야났! :<*났던 프 /$야!}이((소 *글 |]이램메
카더라 통신. 표현의 자유야 충분한감
)[,/ 자" $통표야 신[%/카.$.(한\ 감%현유@@충|( !한][ (야@\<한' 통
양 옆구리 흉터도 큰 뱀에 물린 상처죠
??(도 /흉옆$#=큰구뱀 '{@ *도상&^죠`\\에=\뱀[처# *^[도 "큰 구[ ){: }
특수야전사령부헬리콥터교전중유도미사일에폭파추락
(! 리부>@부 .터$.!락;"도*{=;/}]에수특. }!령사%추$파% =((%[$콥?]?}터락 유
^표]}/@\ " *}흰'출$표표 @!;@%감 "출봉 (: , }@ ^?를져봉~?사>에*던%를에
,향\" 센{제서제*실,도찾&\ `,&]`^차유도실%~^,향차;*=;\@%도!유?!}\?표 음^ ).차{
유실물센터에서 안경, 차키, 방향제, 도표를 찾음
개미야 놀자 바다쳐 호프산타코
다;$산?\,쳐산=자 코?(#^"^:,`#@|)=다?개(`? ( *;")야 :\ 산
2011/4/28 Quan Nguyen <nguy...@gmail.com>:
Zdenko, Quan and Sven,
Thanks a lot for your suggestions; I think you nailed the problem.
So, I installed the Korean language pack :-) However, the archive contains only one file, kor.traineddata.
It doesn't include kor.unicharset, and that causes a problem: while loading kor.traineddata, tesseract also depends on kor.unicharset.
That file is missing, and probably for that reason (at least in part) I couldn't create the box file:
tesseract annyong_eng.png annyong_eng -l kor batch.nochop makebox
I tried to find that file, but without success. What I'm going to do is create kor.unicharset myself. I'll look at eng.unicharset to get some idea of its structure.
I got these messages:
t204\tesseract.exe annyong_eng.png annyong_eng -l dummy
Unable to load unicharset file C:\Program Files\Tesseract-OCR\tessdata/dummy.unicharset
tesseract.exe annyong_eng.png annyong_eng -l dummy
Error openning data file C:\Program Files\Tesseract-OCR\tessdata/dummy.traineddata
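Both error messages show where tesseract looks for language data: a file named after the -l value inside the tessdata directory. Below is a small Python sketch that checks for such files before running tesseract. The helper name missing_language_files is hypothetical, and the suffixes are taken only from the two messages above, not from tesseract documentation:

```python
from pathlib import Path


def missing_language_files(tessdata_dir, lang, suffixes=("traineddata",)):
    """Return the expected language files that are absent from tessdata_dir.

    Each expected file is named <lang>.<suffix>, mirroring the paths that
    appear in the error messages.
    """
    root = Path(tessdata_dir)
    expected = [root / f"{lang}.{suffix}" for suffix in suffixes]
    return [path for path in expected if not path.is_file()]


# The directory below is the one named in the error messages.
for path in missing_language_files(r"C:\Program Files\Tesseract-OCR\tessdata", "kor"):
    print("missing:", path)
```

Passing suffixes=("traineddata", "unicharset") covers both file names mentioned in the messages.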
Please help me. Thank you.