auto-syncing mnesia after a network split


Joel Reymont

Dec 2, 2008, 5:51:22 PM
to think...@googlegroups.com
I started this thread on the Erlang Questions list to see how Mnesia
can automatically recover after a network split:

http://is.gd/9WKu

What are network splits and why is this important?

If someone kicks a switch in your data center or otherwise breaks the
connection between the nodes or servers in your cluster then
replication of your Mnesia tables will stop.

It would be ideal if, when the nodes rejoined, Mnesia automatically
synced your data by merging the transactions that happened on the
different nodes while the network was split. Mnesia doesn't do that,
though, and so (potential) mayhem ensues.
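
To make the problem concrete, here is a toy sketch in Python (not Mnesia code; the dict-based replicas and the helper names are made up for illustration). Two copies of one table start in sync, each side keeps accepting writes during the split, and on rejoin there is no obvious winner for the keys both sides touched.

```python
# Toy model: two replicas of one table, each a plain dict. Not Mnesia code.

def write(replica, key, value):
    """Apply a local write to one replica only (replication is down)."""
    replica[key] = value

def conflicting_keys(a, b):
    """Keys that both replicas hold, but with different values."""
    return sorted(k for k in a.keys() & b.keys() if a[k] != b[k])

node_a = {"balance": 100}
node_b = dict(node_a)  # both copies agree before the split

# Network split: each side keeps accepting writes on its own.
write(node_a, "balance", 150)
write(node_b, "balance", 70)
write(node_b, "owner", "joel")

# Network heals: the two copies now disagree.
conflicts = conflicting_keys(node_a, node_b)       # written on both sides
only_on_b = sorted(node_b.keys() - node_a.keys())  # written on one side only

print("conflicting keys:", conflicts)
print("keys only on node_b:", only_on_b)
```

Keys written on only one side are easy to copy across. Keys written on both sides need a rule to pick a winner or merge, and that rule is the part that has to be supplied.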

This makes Mnesia unusable as the backend for the journal, unless some
measures are taken to implement automatic synchronization after a
network split.

--
http://twitter.com/wagerlabs


Eli Liang

Dec 2, 2008, 8:08:57 PM
to thinkerlang
I think I referred to this on the Erlang questions list, but why not
just use the method used for other enterprise databases, such as
Oracle?

Joel Reymont

Dec 2, 2008, 8:37:52 PM
to think...@googlegroups.com

On Dec 3, 2008, at 1:08 AM, Eli Liang wrote:

> I think I referred to this on the Erlang questions list, but why not
> just use the method used for other enterprise databases, such as
> Oracle?


And that method is?

--
http://twitter.com/wagerlabs


Eli

Dec 3, 2008, 1:56:05 PM
to thinkerlang
Perhaps this can be a future project of the journal.

The basic mechanism is that Oracle (since 10g) has 3 services that
work together to handle network splits and heal after a split. One of
them is a configuration server which contains information on the
configuration of all of the distributed databases and their topology.
When a network split occurs, some of the databases are brought down to
ensure they don't become corrupted or out of sync. Then when the
network is whole again, there is a service which reintegrates data.
The net effect is that it is all done without any operator or system
intervention, and the only consequence of a split is performance. I'm
sure more info is available and perhaps some whitepapers on Oracle's
website which could be the starting point of a project to enhance
mnesia.
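
A rough sketch of that idea in Python, under loud assumptions: the `Replica` class, the majority rule, and the log-based catch-up are all hypothetical and are not Oracle's or Mnesia's actual mechanism. A replica that loses sight of the majority stops accepting writes, so its log stays a prefix of the majority's log, and reintegration is a plain replay of the missing tail.

```python
class Replica:
    """Toy replica; hypothetical names, not Oracle or Mnesia APIs."""

    def __init__(self, name, cluster_size):
        self.name = name
        self.cluster_size = cluster_size
        self.data = {}
        self.log = []       # ordered (key, value) writes applied so far
        self.online = True

    def on_view_change(self, reachable):
        """reachable = number of nodes visible, including this one."""
        # Stay up only with a strict majority of the configured cluster.
        self.online = reachable * 2 > self.cluster_size

    def write(self, key, value):
        if not self.online:
            raise RuntimeError(self.name + " is fenced off, write refused")
        self.data[key] = value
        self.log.append((key, value))

    def catch_up(self, source):
        """Replay the writes missed while fenced, then rejoin."""
        for key, value in source.log[len(self.log):]:
            self.data[key] = value
            self.log.append((key, value))
        self.online = True


def replicate(partition, key, value):
    """Apply one write to every replica on the same side of the split."""
    for replica in partition:
        replica.write(key, value)


a, b, c = (Replica(n, 3) for n in "abc")
replicate([a, b, c], "x", 1)

# Split: a and b still see each other, c is alone.
a.on_view_change(reachable=2)
b.on_view_change(reachable=2)
c.on_view_change(reachable=1)

replicate([a, b], "x", 2)
replicate([a, b], "y", 3)

try:
    c.write("x", 99)
    rejected = False
except RuntimeError:
    rejected = True

# Network is whole again: c replays what it missed from a.
c.catch_up(a)
print(rejected, c.data)
```

Since the minority never accepts writes, there is nothing to reconcile afterwards. The cost is availability on the minority side, and in a two-node cluster an even split leaves neither side with a majority, so both would stop.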

Jim McCoy

Dec 3, 2008, 6:01:45 PM
to think...@googlegroups.com
This would be much easier than the sort of automatic reconciliation of
potentially conflicting transactions that I had read into the original
post to erlang-questions. In fact, I think that most of this is
already provided by the Kai project. Most of the necessary components
already exist or are well known in the literature; the big open
question I see hanging over all this is the potential performance hit
that might be introduced.

jim
