Multiple fields


SoftwareEngineer

Aug 8, 2016, 5:46:06 PM
to Semantic Vectors
Hello,

I'm trying to build the vector stores from my index, which was built using only Lucene, but I have run into a problem with multiple fields. My index contains 11 fields and none of them is named "contents", which I understand is what SV looks for. I searched this group and found a recommendation to change one line of the source code, but that was back in 2008. Is there a way to specify the fields on the command line, or some other way to build the vector files from an index with 11 fields? And most importantly, can I still run queries against a specific field name, just as in Lucene?



Thanks,

SoftwareEngineer

Aug 8, 2016, 6:17:39 PM
to Semantic Vectors
I couldn't find an option to edit my post, so I'm adding my other question here. We re-index very often as the data changes. Do I need to rebuild the vector files each time after re-indexing?

Dominic Widdows

Aug 8, 2016, 8:48:59 PM
to semanti...@googlegroups.com
To say which fields you want indexed, use the --contentsfields flag.

But only the term strings get indexed by default; the different fields aren't stored separately. If you want this quickly, one way to do it is to have the caller of LuceneUtils.getTermsForField stick the field name on the front of each term string, separated by a delimiter you'd never see in the text, e.g. "::". You'd need to do the same at query time. It's not a good approach for a production system you have to maintain, but it will work.
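As a rough illustration of the prefixing idea, here is a minimal sketch in plain Java. The class FieldTaggedTerms and its methods are hypothetical names made up for this example; they are not part of Semantic Vectors or Lucene, and the sketch does not show where in the SV source the call would go. It only shows the string handling, assuming "::" never occurs in field names or in the indexed text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Semantic Vectors or Lucene. */
final class FieldTaggedTerms {
    // Assumption: this delimiter never appears in field names or in the text itself.
    static final String DELIMITER = "::";

    private FieldTaggedTerms() {}

    /** Index time: combine a field name and a term into one string, e.g. field + "::" + term. */
    static String tag(String field, String term) {
        if (field.contains(DELIMITER) || term.contains(DELIMITER)) {
            throw new IllegalArgumentException("delimiter found in field or term");
        }
        return field + DELIMITER + term;
    }

    /** Query time: tag every query term with the field it should be matched in. */
    static List<String> tagQuery(String field, List<String> queryTerms) {
        List<String> tagged = new ArrayList<>();
        for (String term : queryTerms) {
            tagged.add(tag(field, term));
        }
        return tagged;
    }

    /** Recover the field name from a tagged term; empty string if the term is untagged. */
    static String fieldOf(String tagged) {
        int i = tagged.indexOf(DELIMITER);
        return i < 0 ? "" : tagged.substring(0, i);
    }

    /** Recover the original term from a tagged term; returns the input if it is untagged. */
    static String termOf(String tagged) {
        int i = tagged.indexOf(DELIMITER);
        return i < 0 ? tagged : tagged.substring(i + DELIMITER.length());
    }

    public static void main(String[] args) {
        System.out.println(tag("title", "vector"));
        System.out.println(tagQuery("abstract", Arrays.asList("semantic", "vector")));
    }
}
```

With this scheme the same word in two fields becomes two distinct terms, so a query tagged for one field will not match occurrences in another. That is what gives field-specific search, at the cost of a larger vocabulary.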

For reindexing, yes, just rebuild. The argument is that SV is much faster than Lucene, so incremental Lucene update + full SV rebuild is still pretty fast.

Best wishes,
Dominic

