Lucene positional index vs. no position information


Michael Ruepp

Jul 14, 2015, 5:29:36 PM
to semanti...@googlegroups.com
Hi,

Does it make any difference if I always create the Lucene index WITH position information and then run the default SV buildIndex, LSA, or positionalIndex with the wordradius on it?

From what I see in the code, it only extends the contentsField with an additional FieldType and changes the IndexOptions.

So it may take longer to build, but it should not have any impact on SV buildIndex, should it? On the other hand, it preserves more information, such as positions and offsets, in the Docvec, which I could read out in a later analysis?

Or should I always build a Lucene index without positions and offsets in order to use the default buildIndex method of SV?


Thanks!

 

Dominic Widdows

Jul 15, 2015, 12:24:54 AM
to semanti...@googlegroups.com
Hi Michael,

If you have the space and the patience, then always build positional indexes. You are correct that the information they contain is a superset.
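
As a rough illustration of the superset point, here is a minimal sketch in plain Java. It is a toy model of postings lists, not the Lucene or Semantic Vectors API: the class name PostingsSketch and both method names are hypothetical, and whitespace tokenization is an assumption made for brevity. It shows that per-document term frequencies, which a non-positional index stores, can always be recomputed from positional postings, while the reverse is not possible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of postings lists; hypothetical names, NOT Lucene's or Semantic Vectors' API. */
final class PostingsSketch {

    /** Positional postings: term -> (docId -> token positions of the term in that document). */
    static Map<String, Map<Integer, List<Integer>>> buildPositional(List<String> docs) {
        Map<String, Map<Integer, List<Integer>>> index = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            // Assumption: lowercase + whitespace splitting stands in for a real analyzer.
            String[] tokens = docs.get(docId).trim().toLowerCase().split("\\s+");
            for (int pos = 0; pos < tokens.length; pos++) {
                index.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                     .computeIfAbsent(docId, d -> new ArrayList<>())
                     .add(pos);
            }
        }
        return index;
    }

    /** Frequency-only postings derived from positional ones: term -> (docId -> count). */
    static Map<String, Map<Integer, Integer>> toFrequencies(
            Map<String, Map<Integer, List<Integer>>> positional) {
        Map<String, Map<Integer, Integer>> freqs = new TreeMap<>();
        positional.forEach((term, perDoc) -> {
            Map<Integer, Integer> counts = new TreeMap<>();
            // The term frequency is simply the number of recorded positions.
            perDoc.forEach((docId, positions) -> counts.put(docId, positions.size()));
            freqs.put(term, counts);
        });
        return freqs;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the cat sat on the mat", "the dog sat");
        Map<String, Map<Integer, List<Integer>>> positional = buildPositional(docs);
        System.out.println(positional.get("the"));                 // {0=[0, 4], 1=[0]}
        System.out.println(toFrequencies(positional).get("the"));  // {0=2, 1=1}
    }
}
```

The cost of the extra information is the list of positions kept per term and document, which is where the additional space and build time come from.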

Best wishes,
Dominic

