Multiple indices?

David Lamkins

Apr 30, 2012, 4:37:39 PM
to montez...@googlegroups.com
I'd appreciate it if someone would point me in the right direction for either merging indices or querying against multiple indices.

Thanks!

John Wiseman

May 9, 2012, 3:50:15 PM
to montez...@googlegroups.com
Since nobody has replied with a better answer... Maybe translating a Lucene solution would work?  Like http://blog.asteriosk.gr/2009/03/31/merging-multiple-lucene-indexes/


John


--
You received this message because you are subscribed to the Google Groups "montezuma-dev" group.
To view this discussion on the web visit https://groups.google.com/d/msg/montezuma-dev/-/kOkl2SUCJUQJ.
To post to this group, send email to montez...@googlegroups.com.
To unsubscribe from this group, send email to montezuma-de...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/montezuma-dev?hl=en.

David Lamkins

May 11, 2012, 1:03:34 PM
to montez...@googlegroups.com
Thank you, John. That may be helpful in the long run.

In the short term I've adopted the expedient of allowing indices to proliferate.

I'll be indexing roughly a million web pages (30 to 50 GB of data) every day, partitioning the daily cache of pages amongst a number of index writers in order to keep up with the volume.

The searcher processes the indices on multiple threads and aggregates the results.
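For illustration only, the fan-out-and-aggregate arrangement described above might look roughly like the following. This is a minimal sketch in Python, not Montezuma code: the partitions are plain in-memory lists, and `search_partition`, `search_all`, and the toy term-frequency score are all hypothetical stand-ins for whatever each real index searcher provides.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, term, limit):
    """Return up to `limit` (score, partition_name, doc_id) hits, best first.

    `partition` is a hypothetical stand-in for one on-disk index:
    a (name, [(doc_id, text), ...]) pair held in memory.
    """
    name, docs = partition
    hits = []
    for doc_id, text in docs:
        # Toy score: how many times the term occurs in the document.
        score = text.lower().split().count(term.lower())
        if score > 0:
            hits.append((score, name, doc_id))
    return heapq.nlargest(limit, hits)


def search_all(partitions, term, limit=10):
    """Search every partition on its own thread, then merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        per_partition = list(
            pool.map(lambda p: search_partition(p, term, limit), partitions)
        )
    # Each partition already returned its own best `limit` hits, so the
    # global best `limit` hits must be among them.
    return heapq.nlargest(limit, (hit for hits in per_partition for hit in hits))


partitions = [
    ("index-a", [(0, "lisp search engine"), (1, "web crawler notes")]),
    ("index-b", [(0, "search search search"), (1, "unrelated page")]),
]
results = search_all(partitions, "search", limit=2)
```

One caveat with this kind of aggregation: if each index computes relevance from its own term statistics, scores from different indices may not be strictly comparable, so the merged ranking is an approximation.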

Assuming that I can merge indices in Montezuma as suggested by the Java code you found, I'll have to consider the run time. The indices are very large; merging will involve a significant amount of I/O over and above what I'm already spending to write the partitioned indices.

I'm also looking into another approach that may, if this API does what I think it does, make aggregating a search over multiple indices a little easier. There's a MULTI-READER class that looks like it may be useful...
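As a rough picture of what a multi-reader usually does in Lucene-style libraries, the sketch below presents several sub-indexes as one contiguous document space by offsetting document numbers. It is written in Python purely for illustration, and it is an assumption that Montezuma's MULTI-READER behaves this way; the class and method names here (`MultiReaderSketch`, `locate`, `document`) are hypothetical and are not Montezuma's API.

```python
from bisect import bisect_right


class MultiReaderSketch:
    """Toy model: several sub-indexes seen as one document space."""

    def __init__(self, sub_readers):
        # Each sub-reader is just a list of documents in this sketch.
        self.sub_readers = sub_readers
        self.starts = []  # global doc number at which each sub-reader begins
        total = 0
        for reader in sub_readers:
            self.starts.append(total)
            total += len(reader)
        self.max_doc = total

    def locate(self, global_doc):
        """Map a global doc number to (sub-reader index, local doc number)."""
        if not 0 <= global_doc < self.max_doc:
            raise IndexError(global_doc)
        # Last sub-reader whose starting offset is <= global_doc.
        i = bisect_right(self.starts, global_doc) - 1
        return i, global_doc - self.starts[i]

    def document(self, global_doc):
        """Fetch a document by its global doc number."""
        i, local = self.locate(global_doc)
        return self.sub_readers[i][local]


reader = MultiReaderSketch([["a0", "a1", "a2"], ["b0"], ["c0", "c1"]])
```

With a reader of this shape, a single searcher can run one query over all partitions, at the cost of searching them sequentially unless the searcher itself parallelizes.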

