Should the Lucene index be put in the same directory as the term and document vectors and the Java classes?


Billy Wong

Mar 30, 2012, 5:23:13 PM
to semanti...@googlegroups.com
I can only run the program without problems when everything is put into the same directory.
Is there an option to specify the location of each index and vector store when running SemanticVectors?

Dominic

Mar 30, 2012, 6:15:43 PM
to semanti...@googlegroups.com
Yes, there are configuration flags for these purposes. The path to the Lucene index is an argument for BuildIndex, BuildPositionalIndex, and LSA. (If changing this argument doesn't work for you, please report it, because that's a problem.)
The location of vector stores for SV search can be specified using -queryvectorfile and -searchvectorfile. This is often done in document search (see http://code.google.com/p/semanticvectors/wiki/DocumentSearch).
 
Best wishes,
Dominic



Billy Wong

Mar 31, 2012, 2:22:53 PM
to semanti...@googlegroups.com
Thx Dominic!

Yes, I can specify the location of the index when creating the vectors.
But CompareTerms does not seem to let me specify the index or vector location, which means that I still have to put everything together in the same directory.

And what's the difference between BuildIndex and LSA? It seems that they are both used to create the term and document vectors.

Dominic

Mar 31, 2012, 3:06:45 PM
to semanti...@googlegroups.com
Hi Billy,
 
Are you sure CompareTerms isn't working as you'd like it to? If I run BuildPositionalIndex and get a file termtermvectors.bin, I can certainly run (for example)
 

$ java pitt.search.semanticvectors.CompareTerms -queryvectorfile termtermvectors.bin abraham isaac

and get the expected output. Futzing around with the flags does take some getting used to, and some of the usage and exception messages are not very helpful. If you have clear pain points that a clearer error message would have solved, please let me know.

The answer to your other question is that LSA uses singular value decomposition (SVD), whereas BuildIndex uses random projection.
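
To make the random projection side a bit more concrete, here is a toy sketch in plain Java. It is not the SemanticVectors code and uses none of its classes: the class name, the method names, and the small term-by-document count matrix are all made up for illustration. It assumes each document gets a sparse random vector of +1/-1 entries, and that a term vector is the count-weighted sum of the vectors of the documents the term occurs in. LSA would instead factorize the whole term-document matrix with SVD, which is not shown here.

```java
import java.util.Random;

// Toy illustration of random projection; NOT the SemanticVectors implementation.
class RandomProjectionSketch {

    // Sparse ternary vector: `nonZero` entries set to +1 or -1, the rest 0.
    // Requires nonZero <= dim.
    static float[] elementalVector(int dim, int nonZero, Random rng) {
        float[] v = new float[dim];
        int placed = 0;
        while (placed < nonZero) {
            int i = rng.nextInt(dim);
            if (v[i] != 0f) continue;            // slot already used, pick another
            v[i] = (placed % 2 == 0) ? 1f : -1f; // alternate signs
            placed++;
        }
        return v;
    }

    // counts[t][d] = occurrences of term t in document d (hypothetical input).
    // Each term vector is the count-weighted sum of the random vectors of the
    // documents in which the term occurs.
    static float[][] termVectors(int[][] counts, int dim, int nonZero, long seed) {
        Random rng = new Random(seed);
        int numDocs = counts[0].length;
        float[][] docVectors = new float[numDocs][];
        for (int d = 0; d < numDocs; d++) {
            docVectors[d] = elementalVector(dim, nonZero, rng);
        }
        float[][] terms = new float[counts.length][dim];
        for (int t = 0; t < counts.length; t++) {
            for (int d = 0; d < numDocs; d++) {
                if (counts[t][d] == 0) continue;
                for (int i = 0; i < dim; i++) {
                    terms[t][i] += counts[t][d] * docVectors[d][i];
                }
            }
        }
        return terms;
    }

    // Cosine similarity between two vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / Math.sqrt(na * nb);
    }

    public static void main(String[] args) {
        // Rows are terms, columns are documents (made-up counts).
        int[][] counts = {
            {2, 1, 0, 0},  // "abraham"
            {1, 2, 0, 0},  // "isaac"
            {0, 0, 3, 1},  // an unrelated term
        };
        float[][] tv = termVectors(counts, 1000, 10, 42L);
        System.out.printf("abraham vs isaac:     %.3f%n", cosine(tv[0], tv[1]));
        System.out.printf("abraham vs unrelated: %.3f%n", cosine(tv[0], tv[2]));
    }
}
```

Terms that share documents end up with similar vectors without any matrix factorization, which is the practical appeal of this approach over SVD on large collections.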

Best wishes,

Dominic

 
 


Billy Wong

Apr 1, 2012, 5:20:51 AM
to semanti...@googlegroups.com
You're right, Dominic.
After looking into the parameters in Flags, I finally figured out how to specify the index and vector locations.