Class for removing stop words:
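import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;

// Singleton set of stop words, loaded once from a UTF-8 text file (one word per line).
public class StopSet extends HashSet<String>
{
    private static final long serialVersionUID = 999568822134L;
    private static final StopSet stopSet = new StopSet();

    public static StopSet getInstance()
    {
        return stopSet;
    }

    private StopSet()
    {
        super(1000);
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new FileInputStream("d:/stopword.txt"), StandardCharsets.UTF_8))) // path to the stop-word file
        {
            String line;
            while ((line = in.readLine()) != null)
            {
                // trim so that trailing whitespace does not prevent a match; skip blank lines
                line = line.trim();
                if (!line.isEmpty())
                {
                    this.add(line);
                }
            }
        }
        catch (IOException e)
        {
            // a missing or unreadable file leaves the set empty
            e.printStackTrace(System.err);
        }
    }
}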
How the stop-word class is used:
Set<String> stopSet = StopSet.getInstance();
String str = "测试"; // sample word ("test")
if (!stopSet.contains(str)) {
    System.out.println("Not a stop word");
}
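The check above tests a single word. As a rough sketch of applying the same check to a whole token list, something like the following might work. The class name StopWordFilter, the method removeStopWords, and the in-memory stop words are made up for illustration and are not part of the code above; in practice the set would presumably come from StopSet.getInstance().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the original post.
public class StopWordFilter {
    // Returns the tokens that are not in stopWords, preserving their order.
    public static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed sample data; replace with StopSet.getInstance() to use the file-backed set.
        Set<String> stopWords = new HashSet<>(Arrays.asList("的", "了", "是"));
        List<String> tokens = Arrays.asList("这", "是", "测试", "的", "文本");
        System.out.println(removeStopWords(tokens, stopWords));
    }
}
```

Because the lookup goes through Set.contains, any Set<String> works here, so the filter itself does not depend on where the stop words were loaded from.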
How do I set up the stop words?