Failover Internals and Timing
From: Cwolf <chase.wolfin...@gmail.com>
Date: Mon, 8 Oct 2012 00:01:09 -0700 (PDT)
Local: Mon, Oct 8 2012 3:01 am
Subject: Failover Internals and Timing

Hello - It seems that the replication heartbeat is not configurable.  In
1.8 it was (did not check 2.0).  Are there any details on how heartbeat and
fail-over timings work?   What is the timing on fail-overs now (it seems
fast - but need to understand).  We are looking to get sub second fail-over
working in an environment where all machines are on a single switch.

Thanks


 
From: Cwolf <chase.wolfin...@gmail.com>
Date: Mon, 8 Oct 2012 07:59:43 -0700 (PDT)
Local: Mon, Oct 8 2012 10:59 am
Subject: Re: Failover Internals and Timing

Some more information:

I just ran some experiments, and it takes up to 15 seconds to fail over
when the primary steps down.  This seems ridiculous.  Does anyone know if
this is configurable?
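
For anyone who wants to repeat this kind of measurement, here is a minimal
sketch of a timing helper. It is not tied to any driver: `find_primary` is a
hypothetical callable you would supply yourself (for example, a wrapper around
whatever "who is primary?" call your driver offers), and the clock and sleep
arguments exist only so the helper can be exercised without real waiting. The
result is only as precise as the polling interval.

```python
import time
from typing import Callable, Optional


def time_until_primary(
    find_primary: Callable[[], Optional[str]],
    poll_interval: float = 0.1,
    give_up_after: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until find_primary() names a primary.

    Returns the elapsed seconds, or None if no primary was seen
    within give_up_after seconds. find_primary is a hypothetical,
    caller-supplied probe; exceptions from it count as "no primary yet".
    """
    start = clock()
    while True:
        try:
            primary = find_primary()
        except Exception:
            primary = None  # an unreachable set looks the same as "no primary"
        elapsed = clock() - start
        if primary is not None:
            return elapsed
        if elapsed >= give_up_after:
            return None
        sleep(poll_interval)
```

Start the timer immediately after asking the primary to step down, and the
returned value approximates the window during which the set had no primary.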


 
From: Stephen Steneker <stephen.stene...@10gen.com>
Date: Thu, 11 Oct 2012 23:19:16 -0700 (PDT)
Local: Fri, Oct 12 2012 2:19 am
Subject: Re: Failover Internals and Timing

> Hello - It seems that the replication heartbeat is not configurable.  In
> 1.8 it was (did not check 2.0).  Are there any details on how heartbeat and
> fail-over timings work?   What is the timing on fail-overs now (it seems
> fast - but need to understand).  We are looking to get sub second fail-over
> working in an environment where all machines are on a single switch.

Hi,

You are correct that the replication heartbeat is not configurable.  A
heartbeat request ends in one of three ways: a response, an error, or a
timeout.  Failover should happen within ~20 seconds.
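
As a rough illustration of the three heartbeat outcomes, the sketch below
models a monitor that treats a peer as down after too many consecutive
non-responses, plus the simple arithmetic for how long a silent peer can go
unnoticed. The names and the model itself are assumptions made for
illustration, and the interval and timeout are parameters rather than the
server's actual values; this is not how the server is implemented.

```python
from enum import Enum
from typing import Iterable


class Outcome(Enum):
    RESPONSE = "response"  # peer answered normally
    ERROR = "error"        # e.g. the connection was refused
    TIMEOUT = "timeout"    # no answer within the heartbeat timeout


def peer_is_up(history: Iterable[Outcome], failures_allowed: int = 0) -> bool:
    """Return False once more than failures_allowed consecutive
    heartbeats (counted from the most recent) ended in ERROR or TIMEOUT."""
    consecutive = 0
    for outcome in history:
        if outcome is Outcome.RESPONSE:
            consecutive = 0
        else:
            consecutive += 1
    return consecutive <= failures_allowed


def worst_case_detection(interval_s: float, timeout_s: float) -> float:
    """Upper bound, under this toy model, on how long a silent peer goes
    unnoticed: it dies right after answering one heartbeat, the next
    heartbeat goes out interval_s later and then runs out its full timeout."""
    return interval_s + timeout_s
```

With a hypothetical 2-second interval and 10-second timeout,
`worst_case_detection(2.0, 10.0)` gives 12 seconds before the failure is even
noticed, and any election that follows adds to that.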

The current documentation is:
http://www.mongodb.org/display/DOCS/Replica+Set+Internals

Failing too fast (sub-second, in particular) generally isn't a
positive/desirable outcome as you can cause flapping in the event of
transient network issues.
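
To make the flapping concern concrete, here is a small simulation under
made-up numbers. It assumes a peer that is healthy the whole time but
occasionally slow to answer, and counts how often a monitor with a given
timeout would wrongly mark it down. The latency values and the function name
are hypothetical.

```python
from typing import Sequence


def count_false_failovers(latencies_s: Sequence[float], timeout_s: float) -> int:
    """Count up-to-down transitions for a peer that never actually failed.

    Each entry in latencies_s is the round-trip time of one heartbeat.
    A heartbeat slower than timeout_s marks the peer down; the next
    fast one marks it up again.
    """
    flaps = 0
    up = True
    for rtt in latencies_s:
        answered_in_time = rtt <= timeout_s
        if up and not answered_in_time:
            flaps += 1  # a spurious failover would be triggered here
        up = answered_in_time
    return flaps


# Hypothetical trace: mostly fast, with three transient spikes.
trace = [0.01, 0.02, 1.5, 0.01, 0.8, 0.02, 2.5, 0.01]
print(count_false_failovers(trace, timeout_s=0.5))   # sub-second timeout
print(count_false_failovers(trace, timeout_s=10.0))  # generous timeout
```

On this trace the sub-second timeout reacts to every spike while the generous
one ignores them all, which is the trade-off between fast failover and
stability.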

The replica set failover is generally still faster than the default TCP
timeout setting (which, depending on your O/S, can be up to a few minutes).

Cheers,
Stephen


 