Some help needed on understanding Blur

Ravikumar Govindarajan

Jul 3, 2013, 3:31:13 AM

to blur...@googlegroups.com

Blur is a great project and it has been generously open-sourced. Thanks for that.

I have a few questions and would appreciate some clarification.

1. As I understand it, a WAL file is involved in replace/deleteRow operations. How do we prevent multiple threads from interleaving writes to the same WAL file in HDFS?

2. Suppose I configure replication in HDFS and one data node is temporarily unavailable. The IW write fails, but the record/row info is persisted in the WAL.

When will the WAL be replayed? What happens if the client catches this exception and retries the write?

3. When a search query is issued, data is fetched from the underlying HDFSDirectory into the BlurShardServer.

In the case of a WildcardQuery or similar queries, will this result in too much network transfer?
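
To make question 1 concrete, here is a minimal sketch of the kind of serialization I have in mind. It is not Blur's code: `SharedWal`, `read_records`, and `run_demo` are hypothetical names, an in-memory buffer stands in for the HDFS output stream, and I have written it in Python only for brevity. Each record is length-prefixed and written under a single lock, so one record's bytes cannot be split by another thread's write.

```python
import io
import threading


class SharedWal:
    """Hypothetical write-ahead log wrapper; not Blur's actual class."""

    def __init__(self, stream):
        self._stream = stream          # stands in for an HDFS output stream
        self._lock = threading.Lock()  # serializes appends from all threads

    def append(self, record: bytes) -> None:
        # Build the whole frame (4-byte length + payload) first, then write
        # it while holding the lock so frames from different threads never mix.
        frame = len(record).to_bytes(4, "big") + record
        with self._lock:
            self._stream.write(frame)
            self._stream.flush()


def read_records(data: bytes):
    """Parse length-prefixed records back out of the raw log bytes."""
    records, pos = [], 0
    while pos < len(data):
        size = int.from_bytes(data[pos:pos + 4], "big")
        records.append(data[pos + 4:pos + 4 + size])
        pos += 4 + size
    return records


def run_demo(threads=4, per_thread=50):
    """Append from several threads, then read every record back."""
    buf = io.BytesIO()
    wal = SharedWal(buf)

    def worker(tid):
        for i in range(per_thread):
            wal.append(f"thread-{tid}-row-{i}".encode())

    workers = [threading.Thread(target=worker, args=(t,)) for t in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return read_records(buf.getvalue())
```

Is something along these lines done inside Blur, or is each WAL file only ever written by a single thread?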