How do I extract all terms tagged as nouns after segmenting with jiebaR in R? Thank you, this has puzzled me for a long time

平淡是真

Nov 20, 2015, 8:45:05 PM
to jiebaR
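After segmenting text with jiebaR in R, how do I extract all the terms whose part of speech is noun? Thank you, this has puzzled me for a long time.
Please point me to some code or reference material. Thanks.
Teacher Tan, North China Institute of Science and Technology, 13633265005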

Qin Wenfeng

Nov 21, 2015, 3:07:53 AM
to jiebaR 中文分词, 4202...@qq.com
For the part-of-speech tags, see http://7jpsu1.com1.z0.glb.clouddn.com/ICTPOS3.0汉语词性标记集.doc

library(jiebaR)
cc = worker("tag")
res = cc["它是一个苹果"]

get_noun = function(x){
  stopifnot(inherits(x,"character"))
  index = names(res) %in% c("n","nr","nr1","nr2","nrj","nrf","ns","nsf","nt","nz","nl","ng")
  x[index]
}
res
get_noun(res)

#>     n
#>"苹果"

On Saturday, November 21, 2015 at 9:45:05 AM UTC+8, 平淡是真 wrote:

xjy...@yahoo.com

Nov 5, 2016, 8:55:22 AM
to jiebaR 中文分词, 4202...@qq.com
Hello, I followed your method to segment the files in a directory and extract the nouns, but the output is > get_noun(res) character(0). Where might the problem be? My code is as follows:
library(jiebaRD)
library(jiebaR)
cc = worker("tag")
res = cc["E:/sougou/user_tag_query.2W.txt"]

get_noun = function(x){
  stopifnot(inherits(x,"character"))
  index = names(res) %in% c("n","nr","nr1","nr2","nrj","nrf","ns","nsf","nt","nz","nl","ng")
  x[index]
}
res
get_noun(res)

Thank you very much.

On Saturday, November 21, 2015 at 4:07:53 PM UTC+8, Qin Wenfeng wrote:

Qin Wenfeng

Nov 10, 2016, 11:09:06 AM
to jiebaR 中文分词, 4202...@qq.com
Hello, I don't have the corresponding data and files, so I can't reproduce the problem you describe.
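
One way to get character(0) from code like this, whatever the data, is for the object passed in to have no names at all. The sketch below illustrates that mechanism in plain R. It is an assumption about the cause, not a confirmed diagnosis. The vector untagged is a hypothetical stand-in for a result that did not come back as a vector of words named by their tags (for example, if the worker did not treat the string as text to tag).

```r
# Hypothetical stand-in for a result that carries no POS tags as names.
untagged <- c("some", "words", "without", "tags")

# Same noun tag list as in the thread.
noun_tags <- c("n", "nr", "nr1", "nr2", "nrj", "nrf",
               "ns", "nsf", "nt", "nz", "nl", "ng")

# names() of an unnamed vector is NULL, so %in% returns logical(0) ...
index <- names(untagged) %in% noun_tags

# ... and subsetting with logical(0) selects nothing.
nouns <- untagged[index]

print(is.null(names(untagged)))
print(length(index))
print(nouns)
```

Checking is.null(names(res)) or str(res) right after the cc[...] call would show whether res really is a vector of words named by their tags.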

On Saturday, November 5, 2016 at 8:55:22 PM UTC+8, xjy...@yahoo.com wrote:

周裕峰

Nov 16, 2016, 2:58:34 AM
to jiebaR 中文分词, 4202...@qq.com
get_noun = function(x){
  stopifnot(inherits(x,"character"))
  index = names(res) %in% c("n","nr","nr1","nr2","nrj","nrf","ns","nsf","nt","nz","nl","ng")
  x[index]
}

-- The get_noun function has a small bug:
index = names(res) should be changed to index = names(x)

:)
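
With that change applied, the function might look like the sketch below. To keep it runnable without jiebaR installed, it is exercised on a hand-built named vector; the tags given to 它, 是 and 一个 are assumptions made for illustration, and only the n tag on 苹果 comes from the output shown earlier in the thread. The names noun_tags and tagged are introduced here and are not part of jiebaR.

```r
# Noun tags taken from the list used earlier in the thread.
noun_tags <- c("n", "nr", "nr1", "nr2", "nrj", "nrf",
               "ns", "nsf", "nt", "nz", "nl", "ng")

get_noun <- function(x) {
  stopifnot(inherits(x, "character"))
  # Use the names of the argument x, not of a global variable `res`.
  index <- names(x) %in% noun_tags
  x[index]
}

# Hand-built stand-in for tagger output: words as values, tags as names.
tagged <- c(r = "它", v = "是", m = "一个", n = "苹果")

nouns <- get_noun(tagged)
print(nouns)
```

Because the function now reads names(x), it works on any tagged vector passed to it, whatever the variable is called.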

On Saturday, November 21, 2015 at 9:45:05 AM UTC+8, 平淡是真 wrote: