Failover and harep


Scott Tanner

Sep 11, 2015, 11:26:13 AM
to ganeti
Hello All,

I've been testing ganeti (ver 2.11) and I'm a little unclear on the setup and expectations for automated instance failover. 

Here's my setup:

* small 3-node cluster running with a few instances on each node
* disk is shared storage (SAN) using the LVM external storage provider
* added the "ganeti:watcher:autorepair:failover" tag to the cluster, each node, and each instance
* harep is running from cron every 2 minutes on the master (setup commands sketched below)
** there's a warning from harep that the cluster has inconsistent data: a node is missing 802 MB of RAM. There's plenty of free RAM on each node, so it shouldn't be an issue.
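
For reference, the tagging and cron setup is roughly the following (node/instance names and file paths are just examples, not copied verbatim from the cluster):

    # enable the auto-failover repair policy at cluster, node and instance level
    gnt-cluster add-tags ganeti:watcher:autorepair:failover
    gnt-node add-tags node1 ganeti:watcher:autorepair:failover
    gnt-instance add-tags instance1 ganeti:watcher:autorepair:failover

    # /etc/cron.d entry on the master - run harep every 2 minutes
    */2 * * * * root /usr/bin/harep >> /var/log/ganeti/harep.log 2>&1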

I had a node failure today (server crashed) and marked the node as offline.  The instance that was running on the failed node is showing as ERROR_nodeoffline but the output from harep indicates that all of the instances are healthy.  
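
For what it's worth, this is how I'm checking the state (the output field names should be standard, but double-check them for your version):

    gnt-node list -o name,offline            # the failed node shows offline
    gnt-instance list -o name,pnode,status   # the affected instance shows ERROR_nodeoffline
    harep                                    # still reports all instances as healthy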


Should the failed instance be moved to another node, or is something missing or preventing the instance from failing-over?


Any help or clarification would be appreciated.

Thank you!
Scott

candlerb

Sep 11, 2015, 3:51:19 PM
to ganeti
On Friday, 11 September 2015 18:26:13 UTC+3, Scott Tanner wrote:

Should the failed instance be moved to another node, or is something missing or preventing the instance from failing-over?


No - harep doesn't do what I think you want (i.e. when node X fails, automatically fail over all instances which were primary on node X).

The reason given is that ganeti cannot tell with any certainty that the node has failed, as opposed to (say) a network partition. It would be very dangerous if it automatically started a copy of the instance on a different node while it was still running on the original node: instant split brain and an IP conflict could be very serious.

So as I understand it, ganeti waits for you to *manually* mark the node as offline, before doing the repair. That strikes me as not being enormously useful; if you are around to manually mark the node as offline, you can manually evacuate it too. But without some sort of out-of-band STONITH I don't think automated node failure repair will ever be implemented.
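
In other words, the manual repair sequence for a dead node is roughly the following (node name is hypothetical, and check the man pages for your version before relying on the exact options):

    # tell ganeti the node is really gone, so it stops trying to contact it
    gnt-node modify --offline=yes node3

    # restart the instances that had node3 as primary on other nodes;
    # --ignore-consistency because the old primary cannot be reached
    gnt-node failover --ignore-consistency node3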

Arguably the documentation is unclear on this point, as originally I had the same expectation as you.

For more details see

candlerb

Sep 11, 2015, 3:55:55 PM
to ganeti
Sorry, I missed where you said you "marked the node as offline"

I think harep is *supposed* to migrate in that circumstance, but I have not tested it myself.
 

Scott Tanner

Sep 13, 2015, 11:36:11 AM
to ganeti
On Friday, September 11, 2015 at 3:55:55 PM UTC-4, candlerb wrote:
Sorry, I missed where you said you "marked the node as offline"

I think harep is *supposed* to migrate in that circumstance, but I have not tested it myself.
 

Thanks for responding. I read through the other threads on harep and the bug reports filed, and saw the comment about "the new repair daemon currently being designed", but I can't find any solid documentation on it. I've also checked through the various slides from conferences that mention harep, but I haven't seen anything that shows actual implementation and use.

I have a Nagios event handler that takes the node offline in the event of a failure - with plenty of checks to ensure it doesn't turn into a split-brain situation. I assumed harep would handle the instance evacuation, but perhaps I'll have to add that to the Nagios handlers as well.
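
The handler is essentially a thin wrapper along these lines (a simplified sketch - the real one runs the extra sanity checks mentioned above before it touches the cluster):

    #!/bin/sh
    # Nagios event handler sketch, invoked with the name of the failed node.
    set -e
    NODE="$1"

    # mark the dead node offline so ganeti stops scheduling against it
    gnt-node modify --offline=yes "$NODE"

    # if harep doesn't evacuate the instances, this would be the extra step:
    # gnt-node failover --ignore-consistency "$NODE"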



Thanks,
-Scott

Petr Pudlák

Sep 22, 2015, 9:27:17 AM
to gan...@googlegroups.com
Hi Scott,

Sorry for the late reply - last week everyone was at GanetiCon.

Unfortunately, harep supports only DRBD and plain instances; it doesn't support other storage types. Admittedly, this information is missing from the man page - I'll update it.
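
You can check which of your instances harep will actually consider by listing their disk templates, e.g.:

    gnt-instance list -o name,disk_template,status

harep only acts on the drbd and plain ones; anything using the ext template is ignored.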

The design for the repair daemon is available here: http://docs.ganeti.org/ganeti/master/html/design-repaird.html
It's currently being developed on master and its functionality will be superior to harep's - it'll support general node evacuation, and it'll allow triggering it automatically with custom monitoring scripts.

  Best regards,
  Petr

Scott Tanner

Sep 29, 2015, 6:48:39 PM
to ganeti
Petr,

Thanks for the reply, it's a great help.

Do you know if the new repair daemon will support ext storage?

Thanks,
Scott

Petr Pudlák

Oct 8, 2015, 2:08:47 AM
to ganeti
Hi Scott,

As far as I know, it should not be tied to any particular storage system, so yes, it should support it.

  Best regards,
  Petr