Hi David and All,
In my application, I integrated RandomIndexing to represent my emails in a word-space model.
What I do is basically this:
RandomIndexing randomIndex = new RandomIndexing();
for (Email email : emails) {
    // Feed each email's pre-processed token stream to the index as one document
    String processedTokenStream = email.getTextContent().getTokenStream();
    randomIndex.processDocument(new BufferedReader(new StringReader(processedTokenStream)));
    randomIndex.processSpace(null);
}
To get results, I invoke:
Set<String> allWords = getRandomIndex().getWords();
for (String word : allWords) {
    Vector contextVector = getRandomIndex().getVector(word);
    String vectorString = "";
    for (int i = 0; i < contextVector.length(); i++) {
        Integer val = (Integer) contextVector.getValue(i);
        vectorString += "[" + i + " : " + val + "], ";
    }
    logger.info(word + " : " + vectorString);
}
I get a large matrix of size number of words (columns) x 4000 (rows), but I see that almost all of the values are 0.
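
To put a number on how sparse the vectors are, something like the following could count and log only the non-zero entries instead of all 4000 values per word. This is only a minimal sketch and not part of the RandomIndexing library: the class name SparseVectorReport and both of its methods are hypothetical helpers, and it assumes the values of a context vector have already been copied into a plain int[].

```java
import java.util.StringJoiner;

// Hypothetical helper (not from the RandomIndexing library): summarizes how
// sparse a vector is, assuming its values were copied into an int[] first.
final class SparseVectorReport {

    // Returns how many entries of the array are different from zero.
    static int countNonZero(int[] values) {
        int count = 0;
        for (int value : values) {
            if (value != 0) {
                count++;
            }
        }
        return count;
    }

    // Formats only the non-zero entries as "[index : value]" pairs,
    // separated by ", ". Returns an empty string if every entry is zero.
    static String formatNonZero(int[] values) {
        StringJoiner joiner = new StringJoiner(", ");
        for (int i = 0; i < values.length; i++) {
            if (values[i] != 0) {
                joiner.add("[" + i + " : " + values[i] + "]");
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Made-up example values, used only to show the helper's behaviour.
        int[] example = {0, 0, 3, 0, -1, 0, 0, 0};
        System.out.println(countNonZero(example) + " of " + example.length
                + " entries are non-zero: " + formatNonZero(example));
    }
}
```

Logging the count per word next to the vector length would show whether the zeros are spread evenly over all words or whether some words have no non-zero entries at all.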