[erlang-questions] avoiding overloading mnesia

Ulf Wiger

Aug 12, 2009, 5:48:22 AM
to erlang-questions Questions

There are some recurring themes when it comes to mnesia:

1 Mnesia handles partitioned networks poorly
2 Mnesia doesn't scale
3 Stay away from transactions

I've argued that Mnesia provides the tools to handle [1],
and that most DBMSs that guarantee transaction-level
consistency are hard-pressed to do better. A few offer
functionality (e.g. MySQL Cluster's Arbitrator) that could
be added on top of the basic functionality provided by
Mnesia. DBMSs that offer 'Eventual consistency' may fare
better. OTOH, one should really think about what the
consistency requirements of the application are, and pick
a DBMS that aims for that level.

Regarding [2], there are examples of Mnesia databases that
have achieved very good scalability. It is not the best
regarding writes/second to persistent storage, but as with
[1], think about what your requirements are. Tcerl, just
to name an example, gives much better write throughput, but
requires you to explicitly flush to disk. Chances are that
your data loss will be much greater if you suffer e.g.
a power failure. Don't take this as criticism of tcerl, but
think about what your recovery requirements are.

I am very wary about [3], mainly because I've seen many
abuses of dirty operations, and observed that many who use
dirty updates do it just because "it has to be fast",
without having measured performance using transactions, or
thought about what they give up when using dirty updates.

In some cases, transactions can even be faster than dirty.
This is mainly true if you are doing batch updates on a
table with many replicas. With dirty, you will replicate
once for each write, whereas a transaction will replicate
all changes in the commit message. Taking a table lock will
more or less eliminate the locking overhead in this case,
and sticky locks can make it even cheaper.
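
To make the batch case above concrete, here is a minimal sketch, not a benchmark. The module name batch_write and both function names are made up for illustration; Tab is assumed to be an existing table and Records a list of records that belong to it. The transaction version takes one table lock and ships all the writes with the commit, while the dirty version replicates once per record.

```erlang
-module(batch_write).
-export([batch_tx/2, batch_dirty/2]).

%% Hypothetical helper: write all Records to Tab in ONE transaction.
%% Returns {atomic, ok} on success or {aborted, Reason}.
batch_tx(Tab, Records) ->
    mnesia:transaction(
      fun() ->
              %% A single table lock up front instead of one lock
              %% per record.
              ok = mnesia:write_lock_table(Tab),
              %% Passing sticky_write instead of write as the lock
              %% kind would request a sticky lock.
              lists:foreach(
                fun(Rec) -> mnesia:write(Tab, Rec, write) end,
                Records)
      end).

%% Hypothetical helper: the dirty counterpart. Each dirty_write is
%% replicated on its own, with no transaction protection.
batch_dirty(Tab, Records) ->
    lists:foreach(
      fun(Rec) -> mnesia:dirty_write(Tab, Rec) end,
      Records).
```

Which of the two is faster depends on the number of replicas and the batch size, so measure both before deciding.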

Apart from the obvious problems with dirty writes (no
concurrency protection above object-level atomicity,
no guarantee that the replicas will stay consistent),
there is also a bigger problem of overload.

If you have a write-intensive system, and most writes
take place from one node, and are replicated to one or
more others, consider that the replication requests all
go through the mnesia_tm process on the remote node,
while the writers perform the 'rpc' from within their
own process. Thus, if you have thousands of processes
writing dirty to a table, the remote mnesia_tm process(es)
may well become swamped.

This doesn't happen as easily with transactions, since all
processes using transactions also have to go through their
local mnesia_tm.

One thing that can be done to mitigate this is to use
sync_dirty. This will cause the writer to wait for the
remote mnesia_tm process(es) to reply. If you have some
way of limiting the number of writers, you ought to be able
to protect against this kind of overload.
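
One possible way to limit the number of writers is sketched below, under stated assumptions: the module bounded_writers, write_all/3 and chunks/2 are hypothetical names, Tab is an existing replicated table, and MaxWriters is a positive integer you choose. Each writer uses sync_dirty, so it blocks until the remote side has replied, which means at most MaxWriters writes are outstanding at any time.

```erlang
-module(bounded_writers).
-export([write_all/3]).

%% Hypothetical helper: write Records to Tab using at most
%% MaxWriters concurrent processes, each doing sync_dirty writes.
write_all(Tab, Records, MaxWriters) when MaxWriters > 0 ->
    Parent = self(),
    %% Chunk size is ceil(length / MaxWriters), at least 1, so the
    %% number of chunks never exceeds MaxWriters.
    Size = max(1, (length(Records) + MaxWriters - 1) div MaxWriters),
    Refs = [start_writer(Parent, Tab, Chunk)
            || Chunk <- chunks(Records, Size)],
    %% Wait for every writer to report completion.
    lists:foreach(fun(Ref) -> receive {done, Ref} -> ok end end, Refs),
    ok.

%% Writers are linked: if a sync_dirty write exits, the writer dies
%% and (unless the caller traps exits) takes the caller with it.
start_writer(Parent, Tab, Chunk) ->
    Ref = make_ref(),
    spawn_link(
      fun() ->
              lists:foreach(
                fun(Rec) ->
                        %% Blocks until the replicas have replied.
                        mnesia:sync_dirty(
                          fun() -> mnesia:write(Tab, Rec, write) end)
                end,
                Chunk),
              Parent ! {done, Ref}
      end),
    Ref.

%% Split a list into sublists of at most Size elements.
chunks([], _Size) ->
    [];
chunks(List, Size) when length(List) =< Size ->
    [List];
chunks(List, Size) ->
    {Head, Tail} = lists:split(Size, List),
    [Head | chunks(Tail, Size)].
```

In a real system the limit would more likely come from a fixed worker pool or some other admission control; the point is only that sync_dirty gives back-pressure once the number of writers is bounded.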

My personal preference is to always start with transactions,
until they have proven inadequate. Most of the time, I find
that they are just fine, but YMMV.

Ulf W
Ulf Wiger
CTO, Erlang Training & Consulting Ltd

erlang-questions mailing list. See http://www.erlang.org/faq.html
erlang-questions (at) erlang.org

Steve Davis

Aug 12, 2009, 12:45:01 PM
to erlang-q...@erlang.org
I, for one, am very grateful that Ulf has taken the time to provide
this clarification.

I have suspected for a while that many concerns voiced here over
mnesia performance were not properly grounded.


Paul Mineiro

Aug 12, 2009, 1:08:22 PM
to erlang-q...@erlang.org
On Wed, 12 Aug 2009, Ulf Wiger wrote:
> Regarding [2], there are examples of Mnesia databases that
> have achieved very good scalability. It is not the best
> regarding writes/second to persistent storage, but as with
> [1], think about what your requirements are. Tcerl, just
> to name an example, gives much better write throughput, but
> requires you to explicitly flush to disk. Chances are that
> your data loss will be much greater if you suffer e.g.
> a power failure. Don't take this as criticism of tcerl, but
> think about what your recovery requirements are.

Totally accurate. In particular we developed tcerl for a system that runs
distributed Mnesia on EC2. When you are using Mnesia in distributed mode
you get a fresh copy of every table on startup by default, so trading off
local consistency across erlang VM crashes for speed was a good choice.
(In addition, on EC2, if you lose an entire instance, the local drive
state is not recoverable anyway).

So that's an example of what Ulf is talking about, thinking about your
recovery requirements.

-- p

p.z. Tokyocabinet has transaction support but I haven't done the work to
integrate it into tcerl. I'm such an EC2 fan, it just hasn't come up.