How do I configure stop words?


fytm

Sep 13, 2012, 6:36:14 AM
to mms...@googlegroups.com
How do I configure stop words?

xianyi...@gmail.com

Aug 26, 2014, 11:03:11 PM
to mms...@googlegroups.com

Here is one way to remove stop words.

A class that loads the stop words:

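import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;

// Singleton set of stop words, loaded once from a UTF-8 text file (one word per line).
public class StopSet extends HashSet<String>
{
    static final long serialVersionUID = 999568822134L;

    private static final StopSet stopSet = new StopSet();

    public static StopSet getInstance()
    {
        return stopSet;
    }

    private StopSet()
    {
        super(1000);
        // Set this to the path of your stop word file.
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new FileInputStream("d:/stopword.txt"), StandardCharsets.UTF_8)))
        {
            String sParagraph;
            while ((sParagraph = in.readLine()) != null)
            {
                // Skip empty lines
                if (!"".equals(sParagraph))
                {
                    this.add(sParagraph);
                }
            }
        }
        catch (IOException e)
        {
            e.printStackTrace(System.err);
        }
    }
}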

 

How to use it:

Set<String> stopSet = StopSet.getInstance();
String str = "测试";
if (!stopSet.contains(str)) {
    System.out.println("Not a stop word");
}
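The check above tests a single word. To actually drop stop words from segmented text, the same contains() test can be applied to every token. The following is only a rough sketch: the class name StopWordFilter, the method removeStopWords, and the sample tokens are made up for illustration and are not part of the class posted above. It takes any Set<String>, so in the setup from this post you would presumably pass StopSet.getInstance().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the original post.
public class StopWordFilter {

    // Returns a new list containing only the tokens that are not in stopWords.
    // The order of the remaining tokens is preserved.
    public static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in stop set so the example runs without a stop word file.
        // With the class from this post, use StopSet.getInstance() instead.
        Set<String> stopWords = new HashSet<>(Arrays.asList("的", "了"));

        // Assumed output of a word segmenter.
        List<String> tokens = Arrays.asList("这", "是", "测试", "的", "句子");

        System.out.println(removeStopWords(tokens, stopWords));
    }
}
```

Because the lookup goes through a HashSet, each token costs roughly constant time, so filtering stays cheap even with a large stop word list.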


On Thursday, September 13, 2012 at 6:36:14 PM UTC+8, fytm wrote:
How do I configure stop words?