Hello,
I'm parsing a large number (maybe 1M) of small XML snippets into maps that I want to index with elastisch.
Right now I'm parsing them in chunks and inserting them in batches of 2000 using bulk operations.
The problem I noticed is that this bulk insertion doesn't update an existing document
with a matching ID if one exists. After looking into the Elasticsearch docs I understood
that bulk upserts are generally possible (but I'd have to construct the operations myself,
as there are no fns in elastisch to generate bulk-upsert operations).
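For the record, here's roughly what I had in mind for hand-rolling the operations. This is just a sketch, not tested against a cluster: I'm assuming elastisch's `clojurewerkz.elastisch.rest.bulk/bulk-with-index-and-type` accepts a flat sequence of action/payload maps (as its `bulk-index` helpers produce), and the index/type names plus the `bulk-upsert-ops` helper are made up for illustration. Each document becomes an `{:update ...}` action line followed by a `{:doc ... :doc_as_upsert true}` payload line, which is what the Elasticsearch bulk API expects for upserts:

```clojure
(ns example.bulk-upsert
  (:require [clojurewerkz.elastisch.rest :as esr]
            [clojurewerkz.elastisch.rest.bulk :as bulk]))

(defn bulk-upsert-ops
  "Turns a seq of maps (each carrying its own :_id) into a flat seq of
  bulk-upsert action/payload pairs. :doc_as_upsert makes Elasticsearch
  create the document when no document with that ID exists yet."
  [docs]
  (mapcat (fn [doc]
            [{:update {:_id (:_id doc)}}
             {:doc (dissoc doc :_id) :doc_as_upsert true}])
          docs))

(comment
  ;; Hypothetical usage: upsert in batches of 2000, as in my current setup.
  (let [conn (esr/connect "http://127.0.0.1:9200")]
    (doseq [chunk (partition-all 2000 parsed-snippets)]
      (bulk/bulk-with-index-and-type conn "snippets" "snippet"
                                     (bulk-upsert-ops chunk)))))
```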
I then saw that the native client provides an upsert fn for single documents
(clojurewerkz.elastisch.native.document/upsert). On a side note: is there a reason
why this is not implemented for the rest client?
Given the amount of data I want to index (ideally as fast as possible) I'm not sure
which route to take.
Given the native client's higher throughput, could upserting single documents work just fine?
Or would I be better off generating a bulk upsert query?
Thanks for any help, I hope my problem description makes sense :)