Blur is a great project, and it has been generously open-sourced. Thanks for that.
I have a few questions, and it would be great to get some clarification:
1. As I understand it, a WAL file is involved in replace/deleteRow operations. How do we prevent multiple threads from interleaving their writes to the same WAL file in HDFS?
2. Suppose I configure replication in HDFS and one data node is temporarily unavailable. The IndexWriter (IW) write fails, but the record/row info has already been persisted in the WAL.
When will the WAL be replayed? What happens if the client catches this exception and retries the write?
3. When a search query is issued, data is fetched from the underlying HDFSDirectory into the BlurShardServer.
In the case of a WildcardQuery or similar term-expanding queries, will this result in too much network transfer?
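
To make question 1 concrete, here is a minimal sketch of the kind of serialization I would expect around WAL appends. It is purely illustrative and not Blur code: SketchWal, append, records, and write_rows are hypothetical names, the WAL is an in-memory buffer rather than an HDFS file, and the length-prefixed framing is my own assumption.

```python
import io
import struct
import threading


class SketchWal:
    """Hypothetical in-memory stand-in for a WAL file; not Blur code."""

    def __init__(self):
        self._buf = io.BytesIO()
        self._lock = threading.Lock()

    def append(self, payload: bytes) -> None:
        # Length-prefix the record so a reader can find record boundaries.
        frame = struct.pack(">I", len(payload)) + payload
        # Hold the lock for the whole frame so two writers never interleave bytes.
        with self._lock:
            self._buf.write(frame)

    def records(self):
        # Snapshot the bytes, then walk the length-prefixed frames.
        with self._lock:
            data = self._buf.getvalue()
        out, pos = [], 0
        while pos < len(data):
            (size,) = struct.unpack_from(">I", data, pos)
            pos += 4
            out.append(data[pos:pos + size])
            pos += size
        return out


def write_rows(wal, writer_id, count):
    # Each writer thread appends its own rows to the shared log.
    for i in range(count):
        wal.append(f"writer={writer_id} row={i}".encode())


wal = SketchWal()
threads = [threading.Thread(target=write_rows, args=(wal, w, 100)) for w in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

recovered = wal.records()
```

With the lock held for each whole frame, every record written by the four threads can be read back intact, although the relative order across threads is not fixed. Is something equivalent done for the real WAL, or is each WAL file owned by a single writer?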
--
Ravi