ictclas
How can out-of-vocabulary words be recognized?
Mr Neuron
Mar 7, 2008, 3:07:04 AM
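Many Chinese word segmentation algorithms need a dictionary. My question is how to segment text without one, that is, starting from an initial dictionary that contains no words at all. In other words, setting aside algorithms based on N-gram transition probabilities, can new words be identified through statistics, rules, or the like, and then added to the dictionary so that it is built up step by step?
Most of the methods I have seen are for recognizing person names, place names, and so on. Is there a method for recognizing ordinary words?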
bss0...@googlemail.com
Mar 7, 2008, 7:32:40 AM
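There are some (for example, character co-occurrence rates...). Research is a good thing, but this is completely unnecessary, because dictionaries already exist in the world.
Traditional culture has the idea of Zhongyong (the Doctrine of the Mean), which explains why there is no need to over-study this kind of method.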
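For readers wondering what "character co-occurrence" could look like in practice, here is a minimal sketch. It is not the poster's method and not anything from ICTCLAS. It assumes pointwise mutual information (PMI) over adjacent character pairs as the co-occurrence measure. The function name `pmi_candidates` and the thresholds `min_count` and `min_pmi` are made up for illustration, and the regex covers only the basic CJK Unified Ideographs block.

```python
import math
import re
from collections import Counter


def pmi_candidates(text, min_count=2, min_pmi=1.0):
    """Rank adjacent character pairs by PMI, using no dictionary at all.

    Hypothetical helper: the thresholds are illustrative assumptions.
    """
    # Split on anything that is not a CJK ideograph, so punctuation and
    # whitespace break adjacency between characters.
    runs = re.findall(r"[\u4e00-\u9fff]+", text)

    unigrams = Counter(ch for run in runs for ch in run)
    bigrams = Counter(
        run[i:i + 2] for run in runs for i in range(len(run) - 1)
    )
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scored = {}
    for pair, count in bigrams.items():
        if count < min_count:
            continue  # too rare to trust the estimate
        p_pair = count / n_bi
        p_a = unigrams[pair[0]] / n_uni
        p_b = unigrams[pair[1]] / n_uni
        # PMI > 0 means the pair co-occurs more often than chance predicts.
        pmi = math.log2(p_pair / (p_a * p_b))
        if pmi >= min_pmi:
            scored[pair] = pmi

    # Highest-scoring pairs first; these are the new-word candidates.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `pmi_candidates("我喜欢分词。他研究分词。分词很难。")` keeps only the pair 分词, since every other pair occurs once and falls below `min_count`. A real corpus would need far more text, and a two-character window cannot find longer words, so treat this as a starting point for experiments rather than a working segmenter.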
lgb
Apr 14, 2008, 9:33:31 PM
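I don't agree that this is not worth researching. Having one method that works does not mean it is the best method, which is why people keep looking for others. Current segmentation quality may be good enough for use, but there are still many painful failures, so the work has to go on.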