Thanks for this great set of code!
I have written some code that allows me to compare phrases using
CompareTerms, and I am able to get scores that help me understand whether
two phrases are related based on their term vectors. Now, once I have a
pair of phrases, I would like to gather all the documents that are
related to the two phrases. Is there an easy way to do this? That
is, is there an API like CompareTerms for retrieving the related
documents when you have a phrase? When I say phrase, I'm talking
about a multi-term phrase - 2-4 words, usually.
Thinking about this a bit more, I think perhaps a better solution is
to identify the phrases I'm interested in BEFORE doing the Lucene and
SV indexing. I think this was suggested in a recent thread. Is this
a better approach when trying to correlate multi-word phrases?
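To make the idea concrete, here is a rough sketch of the kind of preprocessing I have in mind. It is plain Python and not part of Lucene or SV; the function name merge_phrases, the phrase list, and the underscore joiner are all made up for illustration. Whether an underscore-joined token survives as a single term would depend on the analyzer used at indexing time, which I have not checked.

```python
def merge_phrases(tokens, phrases, joiner="_"):
    """Replace each known multi-word phrase in `tokens` with one joined token.

    Hypothetical helper for illustration only; not a Lucene or SV function.
    """
    # Store phrases as lowercase word tuples for fast lookup.
    phrase_set = {tuple(p.lower().split()) for p in phrases}
    max_len = max((len(p) for p in phrase_set), default=1)

    out = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest possible phrase first so longer phrases win.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in phrase_set:
                out.append(joiner.join(candidate))
                i += n
                matched = True
                break
        if not matched:
            out.append(tokens[i])
            i += 1
    return out


text = "the support vector machine beat the random forest baseline"
phrases = ["support vector machine", "random forest"]
merged = merge_phrases(text.split(), phrases)
print(" ".join(merged))
```

With text rewritten this way before indexing, each phrase would get its own term vector, assuming the joined tokens are kept intact.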
thanks!
-heidi
Now - my next question - given a term vector, is there a way to know
which documents are significant? Do the columns mean anything? Do
the columns refer to a document? Or does SV obscure the column
meanings?
Any help or guidance appreciated!
thanks,
-heidi
Have you seen the page on document search at
http://code.google.com/p/semanticvectors/wiki/DocumentSearch? There
are some options here that might get you started, though much of this
space remains relatively unexplored as far as I know.
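
As a rough illustration of the general idea, and not the package's own API: one simple approach is to add up the term vectors for the words in the phrase and rank documents by cosine similarity to that sum. The sketch below is plain Python with tiny made-up vectors; rank_documents, the toy term_vectors and doc_vectors dictionaries, and the file names are all hypothetical, and it assumes term and document vectors live in the same space.

```python
import math


def add_vectors(vectors):
    """Component-wise sum of equal-length vectors."""
    return [sum(components) for components in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(phrase, term_vectors, doc_vectors, top_n=3):
    """Rank documents against the summed vector of the phrase's known terms."""
    known = [term_vectors[t] for t in phrase.lower().split() if t in term_vectors]
    if not known:
        return []
    query = add_vectors(known)
    scored = [(cosine(query, vec), name) for name, vec in doc_vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:top_n]


# Toy data, invented for illustration only.
term_vectors = {
    "random": [1.0, 0.0, 0.0],
    "forest": [0.0, 1.0, 0.0],
    "kernel": [0.0, 0.0, 1.0],
}
doc_vectors = {
    "doc_a.txt": [1.0, 1.0, 0.0],
    "doc_b.txt": [0.0, 0.0, 1.0],
    "doc_c.txt": [1.0, 0.0, 1.0],
}

results = rank_documents("random forest", term_vectors, doc_vectors)
for score, name in results:
    print(name, round(score, 3))
```

Plain vector addition ignores word order, so "random forest" and "forest random" produce the same query here.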
Apologies for being slow to reply at the moment; my family is in the
middle of moving house and there is way too much to organize :-(
Best wishes,
Dominic