Quorum and backup-only node

Dominik Klein

Oct 13, 2017, 2:49:49 AM
to codership
Hi

We are trying to design a small(ish) Galera cluster. We need synchronous replication across two machines but understand that, since Galera is quorum-based, running a two-node setup is not a good idea.

While playing with a dev setup, we found that our initial idea might be problematic: set up two powerful machines as the actual database nodes and run a third Galera instance in a small virtual machine as a quorum-only (and backup) node.

For the rest of this post: nodes one and two are the powerful hardware, node three is the small VM.

We set wsrep_desync=on on the third node and assumed that, even if this node was slower, it would not impact cluster performance on nodes one and two. However, once we saturate node three's disk with dd (for testing purposes), inserts on nodes one and two stall. We also ran this test with wsrep_desync=off (so flow control could do its thing), but the result is the same: inserts stall as soon as dd saturates the VM's disk.

Is the described setup something people use, or what is the best practice here? Two hardware boxes offer more than enough capacity/performance for our needs, so we'd rather not install a third, equally powerful machine. Is it possible at all to configure such an "asynchronous quorum node" (and if so, which additional parameters do we need)?

Thanks
Dominik

Jörg Brühe

Oct 13, 2017, 3:46:58 AM
to codersh...@googlegroups.com
Hi!


On 12.10.2017 13:11, Dominik Klein wrote:
> Hi
>
> we are trying to design a small(ish) galera cluster. We need synchronous
> replication across two machines but understand that, since galera is
> quorum based, it is not a good idea to run a two node setup.

Correct.

>
> [[...]] Two hardware boxes offer more than enough capacity/performance for
> our needs, so we'd rather not install a third just as powerful machine.
> Is it at all possible to configure such an "asynchronous quorum node"
> (and if so: which additional parameters do we need)?

Look into "garbd". This is a process which acts as a Galera node for all
purposes of connectivity and quorum, but does not store any data, so
AIUI it is extremely lightweight on its machine.
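
To make the quorum argument concrete, here is a minimal sketch of the majority arithmetic. It assumes every member (data node or garbd) counts as one equal vote and ignores weighted quorum and graceful departures; `has_quorum` is a hypothetical helper for illustration, not part of Galera.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """Equal-weight majority rule (assumed): a partition keeps quorum
    only if it can reach strictly more than half of the members."""
    return 2 * reachable > cluster_size


# Two data nodes only: losing one leaves 1 of 2, which is not a majority.
two_node = has_quorum(reachable=1, cluster_size=2)

# Two data nodes plus one arbitrator: losing any single member leaves 2 of 3.
with_arbitrator = has_quorum(reachable=2, cluster_size=3)

print(two_node, with_arbitrator)
```

Under these assumptions the arbitrator only changes the vote count; it never has to write any data.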

There is a caveat: Assume a "normal" setup with 3 data nodes A, B, and
C. If node A fails, B and C still have quorum, and operation continues.
When node A recovers, it will ask B or C to provide the changes; say it
picks B. This might be a "mysqldump" SST, which blocks B, but C will
still be available for operation.
If one of B or C is a garbd, it cannot provide data, so the remaining
data node has to be the donor and the cluster would be unavailable
during a blocking SST.
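
The caveat can be sketched as a toy model. This is only an illustration under simplifying assumptions: the SST is blocking, the first other data node is picked as donor, and the `Member` class and `serving_during_blocking_sst` function are hypothetical names, not anything Galera provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    stores_data: bool  # False for an arbitrator such as garbd


def serving_during_blocking_sst(members, joiner):
    """Return the names of data nodes that can still serve clients while
    `joiner` receives a blocking SST (simplified model)."""
    # Only data nodes other than the joiner can act as donor or serve.
    candidates = [m for m in members if m.stores_data and m.name != joiner]
    if not candidates:
        return []
    donor = candidates[0]  # assumed donor choice; it is blocked during SST
    return [m.name for m in candidates if m.name != donor.name]


three_data = [Member("A", True), Member("B", True), Member("C", True)]
two_data_plus_garbd = [Member("A", True), Member("B", True), Member("C", False)]

full = serving_during_blocking_sst(three_data, joiner="A")
arb = serving_during_blocking_sst(two_data_plus_garbd, joiner="A")
print(full, arb)
```

With three data nodes one node remains for clients; with two data nodes plus an arbitrator nobody is left to serve while the donor is blocked.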

Also, remember that any third node, be it a full-function node or the
"garbd", should be as isolated from the other nodes as possible, so that
there is no common resource which might make the third node fail together
with either the first or the second node.
In the typical setup with two "real" nodes in two distinct data centers,
the third node should be separate from both of them.


HTH,
Jörg

--
Joerg Bruehe, Senior MySQL Support Engineer, joerg....@fromdual.com
FromDual GmbH, Rebenweg 6, CH - 8610 Uster; phone +41 44 500 58 26
Geschäftsführer: Oliver Sennhauser
Handelsregister-Eintrag: CH-020.4.044.539-3

Dominik Klein

Oct 17, 2017, 3:11:09 AM
to codership
Thank you Jörg! I don't know how I missed garbd ... This is exactly what I was looking for.

Regards
Dominik