Ganeti and (again) automatic failover


Joril

Feb 28, 2012, 5:37:19 AM
to ganeti
Hi everyone!
I understand that Ganeti doesn't address automatic failover, but I am
at a loss as to what my options are...
I have a simple 2-node cluster with DRBD, and all I want is that when
one node suffers a hardware failure, the other node starts the
affected VMs. What could I use?

Iustin Pop

Feb 28, 2012, 5:50:11 AM
to gan...@googlegroups.com

I think most people use Pacemaker or an equivalent; the big
problem in such a cluster is correctly determining that the other node
is really down, as opposed to merely unreachable because of
communication issues.

regards,
iustin
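
The detection problem described above can be illustrated with a tiny model. This is only a sketch under simplifying assumptions: `observe_peer` and its parameters are hypothetical names, and a single network link between the two nodes is assumed. It is not how Ganeti or Pacemaker probe nodes.

```python
def observe_peer(peer_running, link_up):
    """What a node sees when it probes its peer over a single link.

    Hypothetical model: the probe is answered only if the peer is
    running AND the link between the two nodes is working.
    """
    return peer_running and link_up


# Scenario 1: the peer really crashed (hardware failure).
crashed = observe_peer(peer_running=False, link_up=True)

# Scenario 2: the peer is fine, but the network between the nodes broke.
partitioned = observe_peer(peer_running=True, link_up=False)

# The surviving node gets the same observation in both scenarios.
print(crashed, partitioned)
```

Since both scenarios produce the same observation, a node that starts the VMs on a failed probe alone could end up running them twice in scenario 2, with each side writing to its own DRBD copy.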

tschend

Feb 28, 2012, 7:14:06 AM
to ganeti
If you have a DRAC (Dell) or iLO (HP) in the servers, you can use it
as a STONITH device to make sure the failed node is really dead.

Pacemaker supports this, and it could help in a 2-node cluster.

If you do not have access to one of these OOB management tools, just
add a third node and configure corosync + pacemaker.

From my point of view, these are the only reliable options you have.

Regards
Thomas
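
The two options above (fence the peer, or add a third node) can be sketched as a decision rule. This is an illustration only, assuming a simple strict-majority vote; `has_quorum` and `may_take_over` are hypothetical helpers, not corosync or Pacemaker APIs, and a real cluster stack applies more rules than this.

```python
def has_quorum(reachable_nodes, total_nodes):
    """True if this node sees a strict majority of the cluster.

    reachable_nodes counts the node itself plus every peer it can reach.
    """
    return reachable_nodes > total_nodes // 2


def may_take_over(reachable_nodes, total_nodes, peer_fenced=False):
    """Decide whether it is safe to start the failed node's VMs.

    Safe if the peer was confirmed powered off (STONITH via DRAC/iLO),
    or if this node is part of a majority partition.
    """
    return peer_fenced or has_quorum(reachable_nodes, total_nodes)


# 2-node cluster, peer unreachable: 1 of 2 is not a majority.
print(may_take_over(1, 2))                    # no fencing -> must not take over
print(may_take_over(1, 2, peer_fenced=True))  # peer confirmed dead -> may take over

# 3-node cluster: the two nodes that still see each other form a majority,
# while a single isolated node does not.
print(may_take_over(2, 3))
print(may_take_over(1, 3))
```

With only two nodes, a lone survivor can never form a majority on its own, which is why the 2-node case depends on fencing while the 3-node case can rely on the vote.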

Joril

Feb 29, 2012, 2:59:32 AM
to ganeti
On Feb 28, 1:14 pm, tschend <thomas.sch...@gmail.com> wrote:
> If you have a DRAC (dell) or iLO (HP) in the servers, you can use this
> as STONITH device to make sure the failed node is really dead.
>
> This is supported by pacemaker and could help in a 2-node Cluster.
>
> If you do not have access to one of this OOB management tools, just
> add a third node and configure corosync + pacemaker.
>
> From my point of view are these the only reliable options you have.

Understood, many thanks (to Iustin too) :)

tschend

Feb 29, 2012, 2:04:31 PM
to ganeti
When you are done with your setup, it would be nice to see a howto on
corosync+pacemaker+ganeti in the wiki :-)

Lance Albertson

Feb 29, 2012, 4:50:09 PM
to gan...@googlegroups.com
+1, and then we could put it on the project wiki site!
--
Lance Albertson
Systems Administrator / Architect                        Open Source Lab
Information Services                             Oregon State University