Option parameters explanation


Khánh Duy

Apr 12, 2018, 11:54:25 PM
to 은전한닢 프로젝트
Hi,
I'm an English speaker, so if possible please answer in English.
I'm using the seunjeon analyzer plugin and following the instructions at https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch

Could you please explain the tokenizer configuration in this example in more detail:

"tokenizer": {
  "seunjeon_default_tokenizer": { "type": "seunjeon_tokenizer", "index_eojeol": false, "user_words": ["낄끼+빠빠,-100", "c\\+\\+", "어그로", "버카충", "abc마트"] }
}

What are the "index_eojeol", "user_words", and "pos_tagging" parameters used for?
Can I skip these settings when setting up the tokenizer, and does leaving them out affect the accuracy of tokenization?

Thank you guys in advance!

유영호

Apr 15, 2018, 3:22:11 AM
to 은전한닢 프로젝트
If you are not familiar with Hangul, the best option is to use the default settings.
Hangul is very different from English, and I do not speak English well.

It is probably best to check the results using the "_analyze" API.
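
For example, a minimal Python sketch of such a check might look like the following. It assumes Elasticsearch is reachable at http://localhost:9200 and that an index (hypothetically named my_index here) was created with the seunjeon_default_tokenizer settings from the question. The host, the index name, and the helper function names are placeholders for illustration, not something taken from the plugin documentation.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, tokenizer, text):
    """Build (but do not send) a POST request for the index-scoped _analyze API."""
    body = json.dumps({"tokenizer": tokenizer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url, index, tokenizer, text):
    """Send the request and return only the token strings from the response."""
    req = build_analyze_request(base_url, index, tokenizer, text)
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    return [t["token"] for t in result["tokens"]]


# Example call (requires a running cluster; host and index name are assumptions):
# analyze("http://localhost:9200", "my_index", "seunjeon_default_tokenizer", "abc마트")
```

Running the same text against one index created with only "type" set and another created with "user_words" or "index_eojeol" added should make it visible whether those options change the tokens for your data.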