Hello,
Suppose I have one index with incremental updates and another one with batch updates; both indexes share the same schema.
The first index, let's call it the "daily index", handles about 90% adds and 10% updates, with a low search load. At the beginning of every day this index is deleted and created anew.
The second index, let's call it the "historical index", holds all the data coming from the daily index for several months. Every night a batch job should move the contents of the daily index into the historical index.
What's the best way to update the historical index with the new daily index data?
By "best" I mean the fastest and most reliable solution.
Some thoughts: we want to avoid re-processing the input data, because it is time- and resource-consuming and because we already did that work to populate the daily index.
If the answer is a merge between the two (Solr) indexes, do we simply merge one into the other? And how much disk space do we need: "historical index * 2 + daily index", or just "historical + daily"?
Speaking of volume, the daily index is about 30 million records; each record has about 40 fields, and the average record size is 1k.
The historical index will hold between 100 and 300 days of daily volume; even in the lower case that adds up to about 3 billion records.
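To make these numbers concrete, here is a rough back-of-the-envelope sketch in Java. It assumes "1k" means 1,000 bytes per record and counts raw input size only; the real on-disk index size will differ. The class and method names are made up for illustration, and the two space formulas are simply the two candidates from my question, not a statement of what Solr actually requires.

```java
public class IndexSizing {
    // Figures taken from the description above.
    static final long RECORDS_PER_DAY = 30_000_000L;
    // Assumption: "1k" average record size is read as 1,000 bytes.
    static final long BYTES_PER_RECORD = 1_000L;

    // Number of records accumulated after the given number of days.
    static long totalRecords(long recordsPerDay, int days) {
        return recordsPerDay * days;
    }

    // Raw data size in bytes (not the on-disk index size).
    static long rawBytes(long records, long bytesPerRecord) {
        return records * bytesPerRecord;
    }

    // Candidate 1 from the question: "historical + daily".
    static long historicalPlusDaily(long historicalBytes, long dailyBytes) {
        return historicalBytes + dailyBytes;
    }

    // Candidate 2 from the question: "historical * 2 + daily".
    static long twiceHistoricalPlusDaily(long historicalBytes, long dailyBytes) {
        return 2 * historicalBytes + dailyBytes;
    }

    public static void main(String[] args) {
        long dailyBytes = rawBytes(RECORDS_PER_DAY, BYTES_PER_RECORD);
        for (int days : new int[] {100, 300}) {
            long records = totalRecords(RECORDS_PER_DAY, days);
            long historicalBytes = rawBytes(records, BYTES_PER_RECORD);
            System.out.printf(
                "%d days: %d records, raw %d bytes, h+d %d bytes, 2h+d %d bytes%n",
                days, records, historicalBytes,
                historicalPlusDaily(historicalBytes, dailyBytes),
                twiceHistoricalPlusDaily(historicalBytes, dailyBytes));
        }
    }
}
```

Under these assumptions the daily index is roughly 30 GB of raw data and the historical index is roughly 3 TB at 100 days, so the difference between the two candidates is on the order of a full extra copy of the historical index.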
Indexing will use a KeywordAnalyzer, because we need to keep the fields exactly as they are in the input sources.
Thank you.
Regards,
Paolo