Thank you, John. That may be helpful in the long run.
In the short term I've adopted the expedient of allowing indices to proliferate.
I'll be indexing roughly a million web pages (30 to 50 GB of data) every day, partitioning the daily cache of pages amongst a number of index writers in order to keep up with the volume.
The searcher processes the indices on multiple threads and aggregates the results.
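To make that fan-out-and-aggregate step concrete, here is a minimal sketch in Python rather than Lisp. It deliberately uses no Montezuma calls: each partitioned index is stood in for by a plain dictionary, and the names `search_partition` and `search_all` are hypothetical. It assumes every hit carries a numeric score that can be compared across partitions, which may not hold exactly if each index computes scores from its own term statistics.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, term, k):
    """Return the top-k (score, doc_id) hits for `term` in one partition.

    `partition` is a hypothetical stand-in for a real index: a dict that
    maps a term to a list of (score, doc_id) tuples.
    """
    return heapq.nlargest(k, partition.get(term, []))


def search_all(partitions, term, k=10):
    """Search every partition on its own thread, then merge the results."""
    if not partitions:
        return []
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        # Each partition contributes at most k candidates.
        per_partition = list(
            pool.map(lambda p: search_partition(p, term, k), partitions)
        )
    # The global top-k must be among the per-partition top-k lists,
    # so merging those lists is enough.
    all_hits = (hit for hits in per_partition for hit in hits)
    return heapq.nlargest(k, all_hits)


if __name__ == "__main__":
    partitions = [
        {"lisp": [(0.9, "page-a"), (0.2, "page-b")]},
        {"lisp": [(0.5, "page-c")]},
        {},  # a partition with no matching documents
    ]
    for score, doc_id in search_all(partitions, "lisp", k=2):
        print(score, doc_id)
```

Because each partition only hands back its own top k hits, the merge step stays cheap no matter how many indices accumulate. The cost that grows with the number of indices is the per-index search itself.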
Assuming that I can merge indices in Montezuma as suggested by the Java code you found, I'll have to consider the run time. The indices are very large; merging will involve a significant amount of I/O over and above what I'm already spending to write the partitioned indices.
I'm also looking into another approach that may, if this API does what I think it does, make aggregating a search over multiple indices a little easier. There's a MULTI-READER class that looks like it may be useful...
On Wednesday, May 9, 2012 12:50:15 PM UTC-7, John Wiseman wrote:
I'd appreciate it if someone would point me in the right direction for either merging indices or querying against multiple indices.
Thanks!