Option parameters (옵션인자) explanation


Khánh Duy

Apr 12, 2018, 11:54:25 PM

to 은전한닢 프로젝트

Hi,

I'm an English speaker, so if possible, please answer in English.

I'm using the seunjeon analyzer plugin and following the instructions at https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch

Could you please explain the tokenizer configuration in this example in more detail:

"tokenizer": {
  "seunjeon_default_tokenizer": {
    "type": "seunjeon_tokenizer",
    "index_eojeol": false,
    "user_words": ["낄끼+빠빠,-100", "c\\+\\+", "어그로", "버카충", "abc마트"]
  }
}
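For context, a fragment like this normally sits inside the "analysis" section of an index's settings. The following is only a minimal sketch of that wrapping, written in Python for readability. The analyzer name "korean" and the helper build_index_settings are hypothetical, and the body is not verified against any particular plugin version.

```python
import json

# Tokenizer definition copied from the example in this post.
tokenizer_fragment = {
    "seunjeon_default_tokenizer": {
        "type": "seunjeon_tokenizer",
        "index_eojeol": False,
        "user_words": ["낄끼+빠빠,-100", "c\\+\\+", "어그로", "버카충", "abc마트"],
    }
}


def build_index_settings(tokenizers, tokenizer_name, analyzer_name="korean"):
    """Wrap tokenizer definitions in an index-creation body.

    analyzer_name is a hypothetical name chosen for illustration.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": tokenizer_name,
                    }
                },
                "tokenizer": tokenizers,
            }
        }
    }


settings = build_index_settings(tokenizer_fragment, "seunjeon_default_tokenizer")
# ensure_ascii=False keeps the Hangul user words readable in the output.
print(json.dumps(settings, ensure_ascii=False, indent=2))
```

The printed JSON is what would be sent as the request body when creating the index.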

What are the "index_eojeol", "user_words", and "pos_tagging" params used for?

Can I skip these settings when setting up the tokenizer, and does skipping them affect the accuracy of tokenization?

Thank you guys in advance!

유영호

Apr 15, 2018, 3:22:11 AM

to 은전한닢 프로젝트

If you do not know Hangul, the best choice is the default settings.

Hangul is very different from English, and I do not speak English well.

It is probably best to check the results using the "_analyze" API.
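A rough sketch of such a check in Python, using only the standard library. It assumes Elasticsearch is reachable at http://localhost:9200 and that an index (called "my_index" here, a hypothetical name) was created with the seunjeon_default_tokenizer from the question. The helper names are also made up for illustration.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, tokenizer, text):
    """Return (url, body_bytes) for an _analyze call against one index."""
    url = f"{base_url.rstrip('/')}/{index}/_analyze"
    payload = {"tokenizer": tokenizer, "text": text}
    # ensure_ascii=False keeps Hangul readable; the body is sent as UTF-8.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return url, body


def analyze(base_url, index, tokenizer, text):
    """POST the text to _analyze and return the list of token strings."""
    url, body = build_analyze_request(base_url, index, tokenizer, text)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    return [token["token"] for token in result.get("tokens", [])]


# Example call (needs a running cluster, so it is left commented out):
# print(analyze("http://localhost:9200", "my_index",
#               "seunjeon_default_tokenizer", "버카충 어그로"))
```

Running the same text against an index with the default settings and one with changed options shows directly how each option alters the tokens.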
