Good evening.
I'm a developer doing research on Random Indexing.
I'm trying to use the RandomIndexing class in order to process a set of documents and get a vector for each word that appears in them.
After calling the processDocument method for each document in my dataset, I call the processSpace method on the RandomIndexing object, passing a java.util.Properties object as the argument.
The function I wrote is as follows:
    public RandomIndexing trainRandomIndexing(Collection<String> texts) throws Exception
    {
        RandomIndexing ri = new RandomIndexing();
        for (String t : texts)
        {
            StringReader reader = new StringReader(t);
            ri.processDocument(new BufferedReader(reader));
        }
        Properties properties = new Properties();
        properties.setProperty("WINDOW_SIZE_PROPERTY", "2");
        properties.setProperty("USE_PERMUTATIONS_PROPERTY", "false");
        properties.setProperty("VECTOR_LENGTH_PROPERTY", "500");
        ri.processSpace(properties);
        return ri;
    }
However, when I get the vector corresponding to a word, like this:
    Vector v = ri.getVector("some_word");
the result is a vector of 0s, 1s and -1s (more precisely, the index vector for that word, I believe).
Moreover, the documentation for the processSpace method in the RandomIndexing class reads "Does nothing."
I would like to know what I should do in order to get the context vectors with the properly processed values.
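To make clear what I mean by index vectors and context vectors, here is a minimal, self-contained sketch of the behaviour I expect. It does not use the S-Space API at all: the class RandomIndexingSketch and all of its methods and parameters are hypothetical names chosen for illustration, and the whitespace tokenization and the symmetric window are simplifying assumptions on my part. Each word gets a sparse random index vector of 0s, 1s and -1s, and its context vector is the sum of the index vectors of the words that occur within the window around it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Illustrative sketch only; not the S-Space implementation.
class RandomIndexingSketch {
    private final int vectorLength;
    private final int nonZeroPerSign; // how many +1s (and how many -1s) per index vector
    private final int windowSize;     // words considered on each side of the focus word
    private final Random random;
    private final Map<String, int[]> indexVectors = new HashMap<>();
    private final Map<String, int[]> contextVectors = new HashMap<>();

    RandomIndexingSketch(int vectorLength, int nonZeroPerSign, int windowSize, long seed) {
        if (vectorLength < 2 * nonZeroPerSign) {
            throw new IllegalArgumentException("vector too short for the requested non-zero entries");
        }
        this.vectorLength = vectorLength;
        this.nonZeroPerSign = nonZeroPerSign;
        this.windowSize = windowSize;
        this.random = new Random(seed);
    }

    // Builds a sparse ternary vector: mostly 0s, with an equal number of +1s and -1s.
    private int[] newIndexVector() {
        int[] v = new int[vectorLength];
        int placed = 0;
        while (placed < 2 * nonZeroPerSign) {
            int pos = random.nextInt(vectorLength);
            if (v[pos] == 0) {
                v[pos] = (placed % 2 == 0) ? 1 : -1;
                placed++;
            }
        }
        return v;
    }

    // Returns the fixed index vector of a word, creating it on first use.
    int[] indexVector(String word) {
        return indexVectors.computeIfAbsent(word, w -> newIndexVector());
    }

    // Returns the accumulated context vector of a word, or null if the word was never seen.
    int[] contextVector(String word) {
        return contextVectors.get(word);
    }

    // Adds the index vectors of all neighbours inside the window to each word's context vector.
    void processDocument(String text) {
        if (text.trim().isEmpty()) {
            return;
        }
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            int[] context = contextVectors.computeIfAbsent(tokens[i], w -> new int[vectorLength]);
            int from = Math.max(0, i - windowSize);
            int to = Math.min(tokens.length - 1, i + windowSize);
            for (int j = from; j <= to; j++) {
                if (j == i) {
                    continue; // the focus word does not contribute to its own context
                }
                int[] index = indexVector(tokens[j]);
                for (int k = 0; k < vectorLength; k++) {
                    context[k] += index[k];
                }
            }
        }
    }

    public static void main(String[] args) {
        RandomIndexingSketch ri = new RandomIndexingSketch(500, 4, 2, 42L);
        ri.processDocument("the cat sat on the mat");
        int nonZero = 0;
        for (int x : ri.contextVector("cat")) {
            if (x != 0) {
                nonZero++;
            }
        }
        System.out.println("non-zero entries in the context vector of 'cat': " + nonZero);
    }
}
```

In this sketch, contextVector("cat") is the sum of the index vectors of the neighbours of "cat", so its values are generally not limited to 0, 1 and -1. That accumulated vector is what I would expect getVector to return.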
Thank you very much in advance.
Have a nice day,
Luca