Perfect Failure Detector? In real life? If you know your FLP and your
Toueg, you know that PFDs cannot be implemented without support from a
synchronous network. However, thanks to some machinery, the authors
implement a PFD in a fully asynchronous system. The idea is simple: if
the PFD "thinks" that a process is dead, it turns this belief into
reality by surgically "killing" it.
http://research.microsoft.com/apps/pubs/default.aspx?id=80618
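To make the "kill it to be right" idea concrete, here is a minimal
Python sketch. It is not the paper's implementation: `Process`,
`KillingDetector`, the heartbeat scheme, and the logical clock passed as
`now` are all hypothetical names and assumptions chosen for
illustration. The only point it shows is the ordering: the detector
kills a suspect first and reports it as crashed only afterwards, so a
report is never wrong.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical stand-in for a monitored process."""
    name: str
    alive: bool = True
    last_heartbeat: float = 0.0

    def heartbeat(self, now: float) -> None:
        # A killed process can no longer send heartbeats.
        if self.alive:
            self.last_heartbeat = now

    def kill(self) -> None:
        # Stand-in for whatever mechanism forcibly stops the process.
        self.alive = False


@dataclass
class KillingDetector:
    """Timeout-based detector that makes its own suspicions true."""
    timeout: float
    crashed: set = field(default_factory=set)

    def check(self, procs, now: float) -> set:
        for p in procs:
            if p.name in self.crashed:
                continue
            if now - p.last_heartbeat > self.timeout:
                # The process may only be slow. Kill it before
                # reporting, so the report cannot be a false positive.
                p.kill()
                self.crashed.add(p.name)
        return set(self.crashed)
```

A slow-but-alive process that misses the timeout is killed and then
reported, which is exactly the price of this design: accuracy is bought
by sacrificing processes that were merely late.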
The tone of the paper upsets me. For example, the authors keep making
very rough approximations of key distributed-systems concepts and of
the state of the art in distributed systems. This is a very bad
marketing exercise, in which the reader is treated as an idiot who
should simply buy these arguments.
Nonetheless, the solution looks "ok" to me: it's an "architectural"
solution to a problem that cannot be solved in full generality (as
predicted by the theory). So, that's fine. Hence, if I put on my
"distributed systems" hat, I would say: that's ok, do whatever you want
as long as it seemingly works. Now, if I put on my brand-new "operating
systems" hat, I would ask for more measurements, scalability results,
real-life implementations, comparisons with traditional systems built
on the asynchronous model, etc.
Regards,
--
Pierre-Evariste DAGAND
http://perso.eleves.bretagne.ens-cachan.fr/~dagand/