Simultaneous CARP failover for multiple interfaces

Kyle Lanclos

Apr 23, 2012, 3:36:23 PM

I have a pair of OpenBSD firewall/routers in a reasonably vanilla
pf + pfsync + CARP configuration, each straddling two routed networks.
The CARP interface on the internal network is the default gateway for
that subnet. The CARP interface on the external network is the default
destination for traffic aimed at the internal network.

It all works splendidly, with one exception.

In order for our firewall to operate effectively, we use 'keep state'
pf rules. We empirically determined that we must have CARP preemption
enabled, otherwise pf cannot properly establish state for new TCP
connections. If pfsync could be told to synchronize incomplete states,
this issue might go away.

Example: firewall1 is the master on the carp1 interface, and firewall2
is the master on the carp2 interface. Inbound traffic to an internal
host arrives via the carp1 interface, and return traffic arrives via
the carp2 interface. pf will not establish state for this new connection
since the inbound and return traffic are not handled by the same firewall
host.

We thus use CARP preemption to force one of the firewalls to always come
up as the master for both CARP interfaces. This is not so unreasonable,
though it might be nice if the documentation presented this use-case (or
similar) as a rationale for needing CARP preemption.
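
A minimal sketch of that arrangement, assuming the usual hostname.if(5)
layout: firewall1 gets the lower advskew on both CARP interfaces, so with
preemption enabled it wins both elections whenever it is healthy. The
addresses, vhid numbers, password, and physical interface names (em0, em1)
are placeholders, not taken from the actual setup.

```
# firewall1 (preferred master): /etc/hostname.carp1
inet 192.0.2.1 255.255.255.0 192.0.2.255 vhid 1 carpdev em0 pass examplepass advskew 0
# firewall1: /etc/hostname.carp2
inet 198.51.100.1 255.255.255.0 198.51.100.255 vhid 2 carpdev em1 pass examplepass advskew 0

# firewall2 (backup): /etc/hostname.carp1
inet 192.0.2.1 255.255.255.0 192.0.2.255 vhid 1 carpdev em0 pass examplepass advskew 100
# firewall2: /etc/hostname.carp2
inet 198.51.100.1 255.255.255.0 198.51.100.255 vhid 2 carpdev em1 pass examplepass advskew 100
```

The vhid and pass values must match between the two hosts for a given
CARP interface; only advskew differs.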

Where this presents a problem is if the current CARP master loses a single
network interface (cable unplugged, isolated hardware failure, sysadmin
failure, etc.), as opposed to the CARP master failing entirely. The slave
will appropriately assume the master role for one CARP interface, but will
*not* do so for the second.

Beyond the basic asymmetric routing + state creation issue described
above, this specific failure mode would still result in a complete inability
to pass traffic in a single direction, since packets would come into the
half-dead master via its good interface, but could not be forwarded to the
network associated with the failed interface.

We would like our otherwise nicely redundant firewall configuration to be
resilient against this type of failure. Short of running a cron job every
sixty seconds to check the interface state, is there some way we can
automatically force the promotion of a CARP slave if a second CARP interface
flips from slave to master?

Suggestions are most appreciated. I apologize if the CARPish-flavor of this
question is not entirely appropriate for the pf mailing list; if there is
another list that would be more suited for this question, please let me know.

--Kyle

Daniel Hartmeier

Apr 23, 2012, 5:04:05 PM

On Mon, Apr 23, 2012 at 11:49:14AM -0700, Kyle Lanclos wrote:

> Where this presents a problem is if the current CARP master loses a single
> network interface (cable unplugged, isolated hardware failure, sysadmin
> failure, etc.), as opposed to the CARP master failing entirely. The slave
> will appropriately assume the master role for one CARP interface, but will
> *not* do so for the second.

Yes, it will:

    net.inet.carp.preempt    Allow virtual hosts to preempt each other.
                             It is also used to failover carp interfaces
                             as a group. When the option is enabled and
                             one of the carp enabled physical interfaces
                             goes down, advskew is changed to 240 on all
                             carp interfaces. See also the first example.
                             Disabled by default.

(i.e. this single sysctl knob enables both group failover and failback)

This covers link state change (unplugged cable) as well as
administrative down of the physical interface.
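
For reference, a sketch of how the knob is typically made persistent,
assuming the stock /etc/sysctl.conf mechanism and that it is set on both
firewalls:

```
# /etc/sysctl.conf
# Enable CARP preemption and, with it, group failover of carp interfaces.
net.inet.carp.preempt=1
```

On a running system the same value can be applied immediately with
"sysctl net.inet.carp.preempt=1".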

It does not cover the case where the link remains up, but the uplink
switch stops forwarding, for instance, but...

> We would like our otherwise nicely redundant firewall configuration to be
> resilient against this type of failure. Short of running a cron job every
> sixty seconds to check the interface state, is there some way we can
> automatically force the promotion of a CARP slave if a second CARP interface
> flips from slave to master?

... see ifstated(8), which can ping uplink hops and issue ifconfig advskew
changes to demote the master when appropriate.

Daniel

Stuart Henderson

Apr 23, 2012, 5:09:20 PM

On 2012/04/23 11:49, Kyle Lanclos wrote:
> In order for our firewall to operate effectively, we use 'keep state'
> pf rules. We empirically determined that we must have CARP preemption
> enabled, otherwise pf cannot properly establish state for new TCP
> connections. If pfsync could be told to synchronize incomplete states,
> this issue might go away.

pfsync(4)'s "defer" option might help. there is a penalty but it might
be acceptable for your use case.

Where more than one firewall might actively handle packets, e.g. with
certain ospfd(8), bgpd(8) or carp(4) configurations, it is beneficial to
defer transmission of the initial packet of a connection. The pfsync
state insert message is sent immediately; the packet is queued until
either this message is acknowledged by another system, or a timeout has
expired. This behaviour is enabled with the defer parameter to
ifconfig(8).
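
A minimal sketch of what that might look like, assuming a dedicated sync
interface (em2 here is a placeholder) and the usual hostname.if(5) file
for pfsync0 on both firewalls:

```
# /etc/hostname.pfsync0
# em2 is a hypothetical dedicated link between the two firewalls.
# "defer" holds the first packet of a new connection until the peer
# acknowledges the state insert, or a timeout expires.
up syncdev em2 defer
```

The option can be turned off again with "-defer" if the added latency on
new connections turns out to be a problem.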

Karl O. Pinc

Apr 23, 2012, 6:02:16 PM

On 04/23/2012 03:19:44 PM, Stuart Henderson wrote:
> On 2012/04/23 11:49, Kyle Lanclos wrote:
> > In order for our firewall to operate effectively, we use 'keep state'
> > pf rules.
>
> pfsync(4)'s "defer" option might help. there is a penalty but it might
> be acceptable for your use case.

I didn't notice _any_ reference to pfsync in the original
post. Perhaps this is part of the problem?

Karl <k...@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

Kyle Lanclos

Apr 23, 2012, 6:07:04 PM

Daniel Hartmeier wrote:
> Yes, it will:
>
> net.inet.carp.preempt    Allow virtual hosts to preempt each other.
>                          It is also used to failover carp interfaces
>                          as a group. When the option is enabled and
>                          one of the carp enabled physical interfaces
>                          goes down, advskew is changed to 240 on all
>                          carp interfaces. See also the first example.
>                          Disabled by default.

Whoops, thanks for pointing that out. I will re-do my tests (the setup is
at a remote site), and make sure that I'm done making things up.

I'm not sure where your quote came from; I see similar text in NetBSD docs,
but this is what I (now) find in the OpenBSD FAQ:

    net.inet.carp.preempt    Allow hosts within a redundancy group
                             that have a better advbase and advskew
                             to preempt the master. In addition, this
                             option also enables failing over a group
                             of interfaces together in the event that
                             one interface goes down. If one physical
                             CARP-enabled interface goes down, CARP
                             will increase the demotion counter,
                             carpdemote, by 1 on interface groups that
                             the carp(4) interface is a member of, in
                             effect causing all group members to
                             fail-over together.

http://www.openbsd.org/faq/pf/carp.html

I did establish an interface group for my two CARP interfaces, but I did
not do my failover tests while it was in that state. As I said, I clearly
need to re-do my tests.

> It does not cover the case where the link remains up, but the uplink
> switch stops forwarding, for instance, but...

At least for our case, a switch failure on either side of the firewall/router
represents a total loss of connectivity.

However, this does jog another potential failure mode. Some of our older
OpenBSD firewalls (going back to OpenBSD) will occasionally (maybe once a
year) "lose" a network interface. If you logged in at the console of a
host while it was in this state, the interface would look perfectly normal,
but it would not pass any traffic. I callously worked around this by
administratively cycling each network interface on the affected machine(s)
on a weekly basis.

If we ran into this failure mode with our CARP firewalls, I'm assuming the
master would keep right on thinking it was the master, and not attempt to
demote itself.

While it is certainly helpful for self-demotion of a master to occur,
it seems reasonable for self-promotion of a slave to also occur.

> ... see ifstated(8), which can ping uplink hops and issue ifconfig advskew
> changes to demote the master when appropriate.

Thanks, I'll look into that.

--Kyle

Kyle Lanclos

Apr 24, 2012, 2:34:12 AM

Karl O. Pinc wrote:
> I didn't notice _any_ reference to pfsync in the original
> post. Perhaps this is part of the problem?

I originally wrote:
> I have a pair of OpenBSD firewall/routers in a reasonably vanilla
> pf + pfsync + CARP configuration...

It sounds like using 'defer' may allow pf + pfsync to handle the issues
resulting from asymmetric routing of packets, as long as the asymmetry
is fully contained within the pfsync'd hosts.

I apologize if I gave too much airtime to the pf + pfsync aspects of
what I was trying to resolve; we largely worked around those by enabling
CARP preemption.

--Kyle

Daniel Hartmeier

Apr 24, 2012, 3:01:46 AM

On Mon, Apr 23, 2012 at 02:23:20PM -0700, Kyle Lanclos wrote:

> However, this does jog another potential failure mode. Some of our older
> OpenBSD firewalls (going back to OpenBSD) will occasionally (maybe once a
> year) "lose" a network interface. If you logged in at the console of a
> host while it was in this state, the interface would look perfectly normal,
> but it would not pass any traffic. I callously worked around this by
> administratively cycling each network interface on the affected machine(s)
> on a weekly basis.
>
> If we ran into this failure mode with our CARP firewalls, I'm assuming the
> master would keep right on thinking it was the master, and not attempt to
> demote itself.
>
> While it is certainly helpful for self-demotion of a master to occur,
> it seems reasonable for self-promotion of a slave to also occur.

Without any active probing, like with ifstated, there is no way to
distinguish which uplink is "up but not forwarding". It could be either
the master's, or the backup's, or both. Statistically, for every time you
improve the situation by failing over, there is a time you shoot yourself
in the foot doing the same. If you do nothing, you have the same chances,
and things remain simpler.

With ifstated, you only need to change one side's advskew so their order
reverses, then rely on carp's election process. For instance, run
ifstated only on the master, pinging next hops on all sides, and

- when any ping fails (the first time), demote:
increase own advskew above the backup's

- when all pings succeed (again), promote:
reset own advskew to the original value (below the backup's)

In your example above, ifstated on the master would detect a ping
failure on one next hop and demote by increasing its advskew above the
backup's.

With preempt enabled, the master would lose election on the other
interface and therefore group failover all interfaces to backup state,
while the backup would win election and group failback all interfaces to
master state, i.e. self-promotion of the backup is done only through carp
election.

When you fix the interface, ifstated will see all pings succeed again,
and reset advskew. Now the (preferred) master wins election and fails
back.

I don't think there is a case where it's helpful to run scripts on both
the master and the backup. You'd have to be careful to not introduce new
failure cases, for instance when a next hop is unreachable from both.
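
The demote/promote scheme described above could be sketched roughly as the
following ifstated.conf(5), loosely modeled on the example file shipped
with OpenBSD and run on the preferred master only. The next-hop addresses,
the carp1/carp2 names, and the advskew values are assumptions: 0 stands in
for the master's original advskew and 240 for any value above the
backup's.

```
# /etc/ifstated.conf (preferred master only)
init-state primary

# Hypothetical next hops, one on each side of the firewall.
# The test is true only while every next hop answers.
net = '( "ping -q -c 1 -w 1 192.0.2.254 > /dev/null" every 10 && \
    "ping -q -c 1 -w 1 198.51.100.254 > /dev/null" every 10)'

state primary {
	init {
		# Promote: restore the original advskew (below the backup's).
		run "ifconfig carp1 advskew 0"
		run "ifconfig carp2 advskew 0"
	}
	if ! $net
		set-state demoted
}

state demoted {
	init {
		# Demote: raise advskew above the backup's on all interfaces.
		run "ifconfig carp1 advskew 240"
		run "ifconfig carp2 advskew 240"
	}
	if $net
		set-state primary
}
```

Running "ifstated -n" should check the file for syntax errors before the
daemon is enabled. With preempt on, the actual change of master is still
left to the normal CARP election.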

Daniel

Kyle Lanclos

Apr 27, 2012, 4:03:50 PM

I quoted from the OpenBSD FAQ:

> net.inet.carp.preempt    Allow hosts within a redundancy group
>                          that have a better advbase and advskew
>                          to preempt the master. In addition, this
>                          option also enables failing over a group
>                          of interfaces together in the event that
>                          one interface goes down. If one physical
>                          CARP-enabled interface goes down, CARP
>                          will increase the demotion counter,
>                          carpdemote, by 1 on interface groups that
>                          the carp(4) interface is a member of, in
>                          effect causing all group members to
>                          fail-over together.
>
> I did establish an interface group for my two CARP interfaces, but I did
> not do my failover tests while it was in that state. As I said, I clearly
> need to re-do my tests.

I re-did my tests this morning, and yes indeed, both CARP interfaces in
the CARP group switch automatically between the BACKUP and MASTER states
when I do abusive things to one of the connections. I say "automatically"
instead of "simultaneously" because there was a bit of a delay while the
reconnected link negotiated speed. I blame Cisco for that.

Thanks for all the assistance, I found this exchange to be quite helpful.

--Kyle