Re: Japanese detection parameter

Quan Nguyen

May 4, 2013, 10:04:34 AM
to tesser...@googlegroups.com
Put them in a file under the tessdata\configs folder, then give that file's name as a command-line argument when you run the tesseract command.
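In case a concrete example helps, here is a minimal Python sketch of that step. It writes the parameters from the quoted post below into a config file, one "name value" pair per line, and builds the matching tesseract command line. The file name jpn_tuned, the image page.png, the output base page, the language code jpn, and both helper function names are assumptions made for illustration, not anything prescribed by Tesseract.

```python
from pathlib import Path

# Parameters exactly as listed in the quoted post (name, value).
PARAMS = [
    ("chop_enable", "T"),
    ("use_new_state_cost", "F"),
    ("segment_segcost_rating", "F"),
    ("enable_new_segsearch", "0"),
    ("language_model_ngram_on", "0"),
    ("textord_force_make_prop_words", "F"),
]


def write_config(path, params=PARAMS):
    """Write one 'name value' pair per line to the given config file."""
    text = "".join(f"{name} {value}\n" for name, value in params)
    Path(path).write_text(text, encoding="ascii")
    return text


def build_command(image, output_base, lang, config_name):
    """Build the argument list: image, output base, language, then config file name."""
    return ["tesseract", image, output_base, "-l", lang, config_name]


if __name__ == "__main__":
    # Hypothetical names: adjust the config location and image to your setup.
    write_config("jpn_tuned")
    print(" ".join(build_command("page.png", "page", "jpn", "jpn_tuned")))
```

The printed command can be pasted into a shell, or the list can be handed to subprocess.run if tesseract is installed and on the PATH.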

On Saturday, May 4, 2013 2:58:31 AM UTC-5, Sathish Kumar wrote:


On Sunday, 30 December 2012 03:06:24 UTC+5:30, 服部慎 wrote:
Hi. I am a Japanese Tesseract user.
I have used tesseract-ocr 3.0.2 a little.

However, the detection rate for Japanese was very low, around 10-20%.
After reviewing the available parameters, I can report that the settings below raise detection to about 70%.

chop_enable                    T
use_new_state_cost             F
segment_segcost_rating         F
enable_new_segsearch           0
language_model_ngram_on        0
textord_force_make_prop_words  F

With these settings, vertically written novels were detected at the rate mentioned above.
Hi,

Where can I enable these parameters in my program?

zdenko podobny

May 4, 2013, 3:31:54 PM
to tesser...@googlegroups.com
BTW: the config file can also be placed in the current directory ("./").

Zdenko



yyc

Jul 2, 2013, 2:32:29 AM
to tesser...@googlegroups.com
These parameters help detect Chinese too, many thanks!

On Sunday, December 30, 2012 at 5:36:24 AM UTC+8, 服部慎 wrote:

Shree Devi Kumar

Jul 2, 2013, 10:15:37 AM
to tesser...@googlegroups.com
Hello Zdenko and Nick,

Could one of you please add this info to the wiki documentation? It would be helpful for other users.

Thanks,
Shree


zdenko podobny

Jul 2, 2013, 1:41:44 PM
to tesser...@googlegroups.com
I marked it as TODO. It will be in the next update of the wiki.

Zdenko

笑天涯

Nov 27, 2013, 1:45:05 AM
to tesser...@googlegroups.com
Could you tell me how to use these parameters? Please give me an example command.
I only use the .exe files to train my Chinese language data.
Thanks

On Tuesday, July 2, 2013 at 2:32:29 PM UTC+8, yyc wrote:

Nick White

Nov 27, 2013, 7:42:06 AM
to tesser...@googlegroups.com
Hi,

On Tue, Nov 26, 2013 at 10:45:05PM -0800, 笑天涯 wrote:
> Could you tell me how to use these parameters? Please give me an example command.
> I only use the .exe files to train my Chinese language data.

To include the parameters in the training file, create a file called
xxx.config (where xxx is the language code you're using) and put the
parameters you need into it, one per line, as in the email you're
interested in. Then when you run combine_tessdata it will be included
as part of the .traineddata file.
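
As a rough illustration of the step above, the Python sketch below writes an xxx.config file next to the other training files and builds the combine_tessdata command, which takes the path prefix of those files including the trailing dot. The language code chi_sim, the directory name, and both helper function names are assumptions chosen for the example; on Windows the tool would be the combine_tessdata .exe from your training tools.

```python
from pathlib import Path


def write_lang_config(train_dir, lang, lines):
    """Create <lang>.config alongside the other <lang>.* training files."""
    path = Path(train_dir) / f"{lang}.config"
    path.write_text("\n".join(lines) + "\n", encoding="ascii")
    return path


def combine_command(train_dir, lang):
    """Build the combine_tessdata call; its argument is the file prefix plus a dot."""
    prefix = str(Path(train_dir) / lang) + "."
    return ["combine_tessdata", prefix]


if __name__ == "__main__":
    # Hypothetical training directory and language code.
    params = ["chop_enable T", "enable_new_segsearch 0", "language_model_ngram_on 0"]
    print(write_lang_config(".", "chi_sim", params))
    print(" ".join(combine_command(".", "chi_sim")))
```

Running the printed command in the directory holding the chi_sim.* files should then bundle the config into chi_sim.traineddata, provided the other training files are already in place.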

If you just want to use a config file with a training that has
already been built, you can follow the instructions in the
manpage[0].

Hope this helps.

Nick

0. http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html#_config_files_and_augmenting_with_user_data