__________________________________________________
Do You Yahoo!?
LAUNCH - Your Yahoo! Music Experience
http://launch.yahoo.com
To Unsubscribe: send mail to majo...@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
I have two default routes. One is IP address "A". The other is
IP address "B".
To which IP address do I forward a random packet?
Shouldn't you just use BGP instead? That's what it was designed
to support.
-- Terry
There are some algorithms, for example round-robin.
It is not a problem: if you assign two equal routes, you know what you
want.
If you have two links to one provider and want to balance outgoing
traffic, it is much better to do it with two similar routes.
> Shouldn't you just use BGP instead? That's what it was designed
> to support.
OSPF has equal-cost multipath, and BGP does too (if I am not mistaken),
so the lack of kernel support for more than one route to one destination
is not good.
Zebra on Linux can do OSPF equal-cost multipath, but on FreeBSD it can't.
Some time ago there was a hack for multipath routing at:
ftp://ftp.flirble.org/pub/unix/hacks/FreeBSD/mpath/
but it seems this page has now been removed.
:(
> -- Terry
--
Vladimir B. Grebenschikov
vo...@sw.ru, SWsoft, Inc.
> -----Original Message-----
> From: Vladimir B. Grebenschikov
> Sent: Tuesday, 21 May 2002 17:25
> To: list-freeb...@aims.com.au
> Cc: Oleg Chebotarev; freebsd...@FreeBSD.ORG
> Subject: Re: multi default routes in freebsd !?
>
> [snip]
>
> Times ago there was hack for multipath routing on:
> ftp://ftp.flirble.org/pub/unix/hacks/FreeBSD/mpath/
> but it seems this page now removed.
>
Remove mpath from the above URL and you will find the patches.
> --
> Vladimir B. Grebenschikov
> vo...@sw.ru, SWsoft, Inc.
Regards,
Chris Knight
Systems Administrator
AIMS Independent Computer Professionals
Tel: +61 3 6334 6664 Fax: +61 3 6331 7032 Mob: +61 419 528 795
Web: http://www.aims.com.au
Sounds like a problem with Zebra...
FreeBSD supports BGP just fine.
> Times ago there was hack for multipath routing on:
> ftp://ftp.flirble.org/pub/unix/hacks/FreeBSD/mpath/
> but it seems this page now removed.
It's there still (drop the "mpath/" suffix). It's an OK hack,
but it's a hack (though I'd have to say it's probably worth
integrating into FreeBSD by default so it doesn't get stale,
and it's not "lost").
Multipath routing is not as useful as you imply. Neither is
round-robin'ing between a set of paths. It assumes that the
pool retention time on the router is longer than the drain time
for a single path, such that you end up with a higher aggregate
throughput than you would otherwise get. Most of the time,
with what you are suggesting, you will get the same throughput,
you will just get differential pipe utilization (using B == !A).
When this isn't the case, the amount of latency for a single
path is such that you end up with only a small fractional
improvement, when there is any improvement at all.
The primary failure of this is that it can't detect when a
route goes down, so you are screwed when that happens.
You are much better off using BGP.
If you absolutely refuse to use BGP for some reason which you
absolutely refuse to post to the list, you should consider using
PPPOE and multilink PPP in combination (both are Netgraph nodes).
Even so, you will be screwed when one of your links goes down;
this isn't the case for the original design of mpd (multilink
PPP daemon), since it got to notice carrier loss. Over a fixed
link, there's no notification (and I guess you could have a path
outage without carrier loss even in the mpd case, but it's unlikely).
There is also a VRRP implementation for FreeBSD. I've posted
the URL for it before. In combination, Virtual Router Redundancy
Protocol *and* multipath are, together, roughly equivalent to
using BGP (assuming both your routers are running the VRRP code).
Also, that's assuming you guess correctly on the relative metrics,
when you have asymmetric path speed within the set of paths you
are trying to use, and set up both paths to fail over to each other
with the VIP of the two "routers".
BGP is a better idea (of course).
You might also consider using BGP.
And have I mentioned BGP? 8-) 8-).
-- Terry
> > OSPF have equal cost multupath, An BGP too (If I not mistaken)
> > so lack of kernel support of more then one route for one destination
> > is not good.
> >
> > Zebra on Linux can do OSPF equal cost multipath but on FreeBSD cant.
>
> Sounds like a problem with Zebra...
No, Zebra uses the in-kernel FIB and can't install two routes with the
same prefix into it, because the kernel lacks this function.
> FreeBSD supports BGP just fine.
Yes, but it does not allow BGP to install two routes with the same
prefix into the FIB.
> > Times ago there was hack for multipath routing on:
> > ftp://ftp.flirble.org/pub/unix/hacks/FreeBSD/mpath/
> > but it seems this page now removed.
>
> It's there still (drop the "mpath/" suffix). It's an OK hack,
> but it's a hack (though I'd have to say it's probably worth
> integrating into FreeBSD by default so it doesn't get stale,
> and it's not "lost").
Ok
> Multipath routing is not as useful as you imply. Neither is
> round-robin'ing between a set of paths. It assumes that the
> pool retention time on the router is longer than the drain time
> for a single path, such that you end up with a higher aggregate
> throughput than you would otherwise get. Most of the time,
> with what you are suggesting, you will get the same throughput,
> you will just get differential pipe utilization (using B == !A).
> When this isn't the case, the amount of latency for a single
> path is such that you end up with only a small fractional
> improvement, when there is any improvement at all.
Let's imagine we have three 2 Mbit/s links on different interfaces.
I want to join them all, but I have no control over the other end (the
provider), so I can't build a netgraph joiner.
Installing three routes (through BGP of course, one BGP session per
link) solves the problem.
I get 6 Mbit/s of aggregate bandwidth.
> The primary failure of this is that it can't detect when a
> route goes down, so you are screwed when that happens.
If the interface goes down, the kernel will mark the route DOWN,
so it is not a problem.
Anyway, if a failure happens without the interface going down, BGP will
detect the problem and take the routes down.
> You are much better off using BGP.
>
> If you absolutely refuse to use BGP for some reason which you
> absolutely refuse to post to the list, you should consider using
> PPPOE and multilink PPP in combination (both are Netgraph nodes).
It is usual practice to use, say, OSPF for internal routing (inside one
AS). Yes, I understand that some netgraph solutions can help
(multilink PPP or ng_one2many; I use netgraph extensively).
But if the routing protocol already has a solution for this, with valid
link-down detection, why do we need "workarounds" to emulate
protocol behavior?
> Even so, you will be screwed when one of your links goes down;
> this isn't the case for the original design of mpd (multilink
> PPP daemon), since it got to notice carrier loss. Over a fixed
> link, there's no notification (and I guess you could have a path
> outage without carrier loss even in the mpd case, but it's unlikely).
Yes it is possible.
> There is also a VRRP implementation for FreeBSD. I've posted
> the URL for it before. In combination, Virtual Router Redundancy
> Protocol *and* multipath are, together, roughly equivalent to
> using BGP (assuming both your routers are running the VRRP code).
No, VRRP can't help if you want to use the aggregate bandwidth, but it
helps a lot if you want redundancy (I think so because I have been using
VRRP on my core routers since 4.2).
> Also, that's assuming you guess correctly on the relative metrics,
> when you have asymmetric path speed within the set of paths you
> are trying to use, and set up both paths to fail over to each other
> with the VIP of the two "routers".
>
> BGP is a better idea (of course).
>
> You might also consider using BGP.
>
> And have I mentioned BGP? 8-) 8-).
BGP can use multipath, as can OSPF.
The kernel's ability to store several routes for one prefix in the
FIB can't replace BGP and, on the other hand, BGP itself can't replace
such a kernel feature.
These two things are almost orthogonal.
> -- Terry
PS:
My opinion: it is a useful feature for the FreeBSD kernel, which is now
often used as a good routing platform.
--
Vladimir B. Grebenschikov
vo...@sw.ru, SWsoft, Inc.
Apply the patchset mentioned below, and run
# route add default -gateway "A" -gateway "B"
> Thank you,
> Oleg
> > ftp://ftp.flirble.org/pub/unix/hacks/FreeBSD/mpath/
--
Vladimir B. Grebenschikov
vo...@sw.ru, SWsoft, Inc.
Whether to use BGP/OSPF is orthogonal to multipath use. Both
OSPF and BGP allow you to install multiple next hops. Adding
multipath support requires, at a minimum, changing struct
rtentry to store multiple `gateways' (which are really next
hops). You also need to fix up code to do the right thing
when adding a new route or deleting an existing route (for
example, a route is not considered dead until all next
hops are dead). For forwarding a packet you can always use
the first next hop to start with but very likely you'd want
to change the forwarding code and use some policy to pick a
next hop. Typically you'd want to use the same next hop for
a given source so as to not mess up TCP RTT calculations.
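A minimal sketch of the idea above, written as a Python model rather than kernel code. The names MultipathRoute, NextHop, is_dead and pick are hypothetical and do not come from struct rtentry or from the mpath patch; the CRC32 hash of the source address is just one possible policy for keeping a given source on the same next hop.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NextHop:
    gateway: str          # next-hop address (hypothetical field name)
    alive: bool = True    # liveness is set by some external policy


@dataclass
class MultipathRoute:
    prefix: str
    next_hops: List[NextHop] = field(default_factory=list)

    def is_dead(self) -> bool:
        # The route is dead only when every next hop is dead.
        return not any(nh.alive for nh in self.next_hops)

    def pick(self, src_addr: str) -> Optional[NextHop]:
        # Hash the source address so one source keeps using one next hop
        # for as long as the set of live next hops does not change.
        live = [nh for nh in self.next_hops if nh.alive]
        if not live:
            return None
        return live[zlib.crc32(src_addr.encode()) % len(live)]
```

With two next hops "A" and "B", repeated calls to pick() for the same source return the same next hop; if that next hop is marked dead, traffic from the source moves to the survivor, and the route as a whole is reported dead only once both are down.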
<start meta discussion>
Seems to me that the needs of client, server and router
machines are different enough that the networking stack code
needs to be much more flexible. One can always put in the
most elaborate solution and just not use fancier features for
client machines but then every one pays the cost of
complexity. Not sure what is the right thing to do here.
Anyone care to pontificate?
The same situation exists w.r.t. a number of other features.
</start meta discussion>
-- bakul
No, you don't.
Without the cooperation of the other end, you don't have
control of the symmetry of the return route. So maybe
your packets are round-robin'ed out interfaces, but they
all come back through the same interface, because you have
no control of the other end.
So if you have interfaces, all with equal throughput, and
your traffic load is bidirectionally symmetric, then you
get:
        total           unidirectional
        interfaces      throughput
        1               .5
        2               .66
        3               .75
        ...
The only way around this is to have the default routes outbound
be totally separate from the inbound (don't eat your inbound
bandwidth with outbound load). The unidirectional throughput
goes to "1", and your total throughput doubles. But it never
gets beyond double.
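One way to reproduce the figures in the table above, offered as an assumption about the model behind them rather than something stated in this thread: each interface has capacity 1 shared by both directions, outbound traffic is spread evenly over N interfaces, all return traffic arrives on one of them, and the load is symmetric. The return interface then carries r inbound plus r/N outbound, so r + r/N = 1 and r = N/(N+1). The function name below is hypothetical.

```python
def unidirectional_throughput(n_interfaces: int) -> float:
    """Per-direction throughput, as a fraction of one link's capacity."""
    if n_interfaces < 1:
        raise ValueError("need at least one interface")
    # The return interface carries r inbound plus r/n outbound: r + r/n = 1.
    return n_interfaces / (n_interfaces + 1)


if __name__ == "__main__":
    for n in (1, 2, 3, 100):
        print(n, round(unidirectional_throughput(n), 3))
```

Under these assumptions the value is 1/2, 2/3 and 3/4 for one, two and three interfaces (the table's .66 is 2/3 truncated), and it approaches but never reaches 1 as interfaces are added.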
This is why BGP is a better deal: it implicitly enlists the
cooperation of the other end of the link.
> > The primary failure of this is that it can't detect when a
> > route goes down, so you are screwed when that happens.
>
> If interface goes down route will be DOWN by kernel.
> So it is not problem.
No.
                ,---------------.
                | BOX WITH TWO  |
        ,-------| DEF. ROUTES   |-------.
route #1|       `---------------'       | route #2
        |                               |
,---------------.               ,---------------.
|  ROUTER "A"   |               |  ROUTER "B"   |
`---------------'               `---------------'
        |                               |
        |                               |
    good link                       dead link
The link between the box and "router B" remains up. Therefore the
box fails to note that packets sent via "route #2" never get to
their destination.
It's ridiculous to think that the interface for "route #2" will
somehow "magically" down itself on "box" because, several hops
down the line, the line is dead.
> Anyway if problem happens without downing interface BGP will detect
> problem and down routes.
Only if you are using BGP. And if you are using BGP, you don't
need the hack you want, BGP will take care of it for you.
> > You are much better off using BGP.
> >
> > If you absolutely refuse to use BGP for some reason which you
> > absolutely refuse to post to the list, you should consider using
> > PPPOE and multilink PPP in combination (both are Netgraph nodes).
>
> It is usual practice to use, say OSPF for internal routing (inside one
> AS). Yes I understand that some netgraph solutions can help
> (multilink PPP or ng_one2many, I am extensively use netgraph)
> But if routing protocol there solution for it with alid link down
> detection and so why we need to use some "workarounds" to emulate
> protocol behavior ?
You need to read the "README":
ftp://ftp.flirble.org/pub/unix/hacks/FreeBSD/README.MPATH
Specifically, you need to read:
WHAT IT DOES NOT DO
It doesn't detect when remote hosts are down. This is not
the job of the kernel. It's not a routing protocol, it's
not an automatic failover system.
So it is not a "routing protocol solution" in any sense of things.
> > There is also a VRRP implementation for FreeBSD. I've posted
> > the URL for it before. In combination, Virtual Router Redundancy
> > Protocol *and* multipath are, together, roughly equivalent to
> > using BGP (assuming both your routers are running the VRRP code).
>
> No, VRRP can't help if you want to use summary bandwidth, but helps a
> lot if you are want to get redundancy (I think so because I am using
> VRRP on my core routers since 4.2).
I don't think you can get what you want without the cooperation
of the other end of the link. If you want FEC, then use Bill Paul's
FEC code, which does the channel bonding that you seem to want. But
you will need the other end to cooperate.
> BGP can use multipath, as well as OSPF.
> Possibility of kernel to store some number of routes for one prefix in
> FIB can't replace BGP and, on other hand, BGP itself can't replace such
> kernel feature.
I already said that I think the code should be committed to the
main line kernel, and preserved.
> These two things almost orthogonal.
Yes. They are. BGP solves the problem you are trying to solve,
while the multipath is useful, but doesn't solve your problem.
> PS:
>
> My opinion - it is useful feature for FreeBSD kernel, often used now
> as good routing platform.
I already said that I think the code should be committed to the
main line kernel, and preserved.
I just don't think you're going to get out of it what you think
you will get out of it.
-- Terry
It is absolutely trivial to bring the patch up to date against
-stable. Bringing patches up to date against -current is a
useless exercise (IMO), unless they will be integrated with
FreeBSD itself, and carried along without maintenance being
needed in the future to keep them up to date.
-- Terry
"man cvs"
You can checkout a source tree of a given date.
-- Terry
On the other end I have the provider's Cisco router _with_ the standard
BGP multipath feature, so I have a symmetric situation.
> The only way around this is to have the default routes outbound
> be totally seperate from the inbound (don't eat your inbound
> bandwidth witho outbound load). The unidirectional troughput
> goes to "1", and your total throughput doubles. But it never
> gets beyond double.
I am not talking only about _default_; everything I have said holds for
any set of routes.
> This is why BGP is a better deal: it implicitly enlists the
> cooperation of the other end of the link.
Once again, BGP can deal with multipath; it can use multipath if we
have more than one link with the same weight, etc.
> > > The primary failure of this is that it can't detect when a
> > > route goes down, so you are screwed when that happens.
> >
> > If interface goes down route will be DOWN by kernel.
> > So it is not problem.
>
> No.
>
>                 ,---------------.
>                 | BOX WITH TWO  |
>         ,-------| DEF. ROUTES   |-------.
> route #1|       `---------------'       | route #2
>         |                               |
> ,---------------.               ,---------------.
> |  ROUTER "A"   |               |  ROUTER "B"   |
> `---------------'               `---------------'
>         |                               |
>         |                               |
>     good link                       dead link
>
> The link between the box and "router B" remains up. Therefore the
> box fails to note that packets sent via "route #2" never get to
> their destination.
No, if the BOX<->routerB link fails, the kernel must take down one of
the two routes with the same prefix (you said default).
If the BOX side cannot see that the carrier (or whatever) on the
BOX<->routerB link is down, the BGP session over this link will go down,
so the BGP software will take the route down.
I am not happy with the hack patch, where one route has more than one
gateway. My opinion is to allow inserting more than one full-featured
route for one prefix into the kernel FIB, but that is an implementation
issue.
> It's ridiculous to think that the interface for "route #2" will
> somehow "magically" down itself on "box" because, several hops
> down the line, the line is dead.
If you have a direct link, say serial, the kernel will detect the
interface going down easily. If you have some kind of multihop setup,
multihop BGP will save the situation, but it is an unwise scheme (I
think). So in this case you had better build:
                ,---------------.
                | BOX WITH TWO  |
        ,-------| DEF. ROUTES   |-------.
route #1|       `---------------'       | route #2
        |                               |
,---------------.               ,---------------.
|  ROUTER "A"   |               |  ROUTER "B"   |
`---------------'               `---------------'
        |                               | link X
        |                               | BGP here
    good link                   ,----------------.
        |                       | Another router |
        |                       `----------------'
  routes source ------------------------/
So all routes come from the "routes source" (it can be the provider's
core router with a full view, or a router issuing the default).
And if "link X" fails, the set of routes (or only the default) will
disappear from the BOX<->routerB link, and the BGP software will remove
the second-path routes.
> > Anyway if problem happens without downing interface BGP will detect
> > problem and down routes.
>
> Only if you are using BGP. And if you are using BGP, you don't
> need the hack you want, BGP will take care of it for you.
Again, multipath IS part of the BGP concept; it is not a FreeBSD kernel
hack for a BGP that can't do multipath.
> >
> > It is usual practice to use, say OSPF for internal routing (inside one
> > AS). Yes I understand that some netgraph solutions can help
> > (multilink PPP or ng_one2many, I am extensively use netgraph)
> > But if routing protocol there solution for it with alid link down
> > detection and so why we need to use some "workarounds" to emulate
> > protocol behavior ?
>
> You need to read the "README":
>
> ftp://ftp.flirble.org/pub/unix/hacks/FreeBSD/README.MPATH
>
> Specifically, you need to read:
>
> WHAT IT DOES NOT DO
>
> It doesn't detect when remote hosts are down. This is not
> the job of the kernel. It's not a routing protocol, it's
> not an automatic failover system.
>
> So it is not a "routing protocol solution" in any sense of things.
As I already said, I am not fighting for these patches (for me they are
an ugly hack); I am fighting for multiple routes for one prefix in the
kernel.
> > > There is also a VRRP implementation for FreeBSD. I've posted
> > > the URL for it before. In combination, Virtual Router Redundancy
> > > Protocol *and* multipath are, together, roughly equivalent to
> > > using BGP (assuming both your routers are running the VRRP code).
> >
> > No, VRRP can't help if you want to use summary bandwidth, but helps a
> > lot if you are want to get redundancy (I think so because I am using
> > VRRP on my core routers since 4.2).
>
> I don't think you can get what you want without the cooperation
> of the other end of the link. If you want FEC, then use Bill Paul's
> FEC code, which does the channel bonding that you seem to want. But
> you will need the other end to cooperate.
I know. Anyway, things like FEC only solve the problem of connecting
two boxes by some number of links; they will not solve the problem of
many-to-many connections.
>
> > BGP can use multipath, as well as OSPF.
> > Possibility of kernel to store some number of routes for one prefix in
> > FIB can't replace BGP and, on other hand, BGP itself can't replace such
> > kernel feature.
>
> I already said that I think the code should be committed to the
> main line kernel, and preserved.
>
>
> > These two things almost orthogonal.
>
> Yes. They are. BGP solves the problem you are trying to solve,
> while the multipath is useful, but doesn't solve your problem.
>
>
> > PS:
> >
> > My opinion - it is useful feature for FreeBSD kernel, often used now
> > as good routing platform.
>
> I already said that I think the code should be committed to the
> main line kernel, and preserved.
Maybe it is better to discuss a bit what exactly we want to have in the
kernel? I think it is better to have another rtentry structure for the
second/third/etc. routes for a prefix. The only modifications needed are:
 - lookup algorithm
   (I think the radix tree code will need some modification)
 - forwarding code (needs to choose one of several routes)
 - code to add/delete/get routes (something like an "allow multipath"
   option for addition and "remove all" for deletion)
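To make the add/delete/get item concrete, here is a small Python model of a table that keeps several routes per prefix. It is only a sketch under stated assumptions: MultipathTable, allow_multipath and remove_all are hypothetical names taken from the wording above, not an existing FreeBSD or route(8) interface, and lookup is an exact match on the prefix string, so the radix tree and longest-prefix matching are not modeled.

```python
from typing import Dict, List, Optional


class MultipathTable:
    """Toy routing table: each prefix maps to a list of gateways."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[str]] = {}

    def add(self, prefix: str, gateway: str,
            allow_multipath: bool = False) -> None:
        gateways = self._routes.get(prefix, [])
        if gateway in gateways:
            raise ValueError("route already exists")
        if gateways and not allow_multipath:
            # Classic behavior: only one route per prefix.
            raise ValueError("prefix already has a route")
        self._routes.setdefault(prefix, []).append(gateway)

    def delete(self, prefix: str, gateway: Optional[str] = None,
               remove_all: bool = False) -> None:
        if prefix not in self._routes:
            raise KeyError(prefix)
        if remove_all:
            del self._routes[prefix]
            return
        if gateway is None:
            raise ValueError("give a gateway or set remove_all")
        self._routes[prefix].remove(gateway)  # ValueError if not present
        if not self._routes[prefix]:
            del self._routes[prefix]

    def get(self, prefix: str) -> List[str]:
        # Return a copy so callers cannot modify the table by accident.
        return list(self._routes.get(prefix, []))
```

In this model a second add() for "default" fails unless allow_multipath is set, which keeps the existing single-route behavior as the default, and delete() removes either one named gateway or the whole set for the prefix.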
> -- Terry
--
Vladimir B. Grebenschikov
vo...@sw.ru, SWsoft, Inc.
This is a circular argument. Why can't you use BGP on FreeBSD, then,
instead of having to invent this new thing?
> > > If interface goes down route will be DOWN by kernel.
> > > So it is not problem.
> >
> > No.
> >
> >                 ,---------------.
> >                 | BOX WITH TWO  |
> >         ,-------| DEF. ROUTES   |-------.
> > route #1|       `---------------'       | route #2
> >         |                               |
> > ,---------------.               ,---------------.
> > |  ROUTER "A"   |               |  ROUTER "B"   |
> > `---------------'               `---------------'
> >         |                               |
> >         |                               |
> >     good link                       dead link
> >
> > The link between the box and "router B" remains up. Therefore the
> > box fails to note that packets sent via "route #2" never get to
> > their destination.
>
> No, if link BOX<->routerB fails kernel must down one of two routes with
> same prefix (you said default).
Since this is a cable you own, it's highly unlikely. You are much
more likely to lose your T1 on the *other side* of "router B".
> If from side of BOX it is noot seen thar carrier (whatever) on link
> BOX<->routerB down - BGP session over this link will down so, BGP
> software will down route.
On the ISP side, which does not affect packets you send, since you
are refusing to run BGP, or you won't need the hack.
> I am not happy with hack pach when one route have more then one gateway,
> may opinion to allow insert more then one full-featured route to one
> prefix into kernel FIB, but it is implementation issue.
No. The hack for multiple default routes implicitly assumes that
it is not a protocol issue that it's trying to solve. The problem
you have requires a protocol to solve it. I'm not surprised that
it doesn't make you happy: it's not a fix for your problem.
[ ... ]
> Again multipath IS in BGP concept not it is FreeBSD kernel hack for BGP
> because can't do multipath.
Are you saying that this is a feature that the FreeBSD BGP lacks?
> > So it is not a "routing protocol solution" in any sense of things.
>
> As I already said I am not fight for these patches - for me it is ugly
> hack, I fight for multiple routes for one prefix in kernel.
This is the classic separation of the control plane from the data
plane. It is a good thing. The patches only implement, they do
not set policy. It is the job of other software to set policy.
> > I don't think you can get what you want without the cooperation
> > of the other end of the link. If you want FEC, then use Bill Paul's
> > FEC code, which does the channel bonding that you seem to want. But
> > you will need the other end to cooperate.
>
> I know, anyway thinks like FEC will only solve problem of connecting
> two boxes by some number of links but will not solve problem of
> many x many connection.
I don't think anything short of source routing can really solve the
problem that you are saying you have, because that's the only way
you will get to dictate the return path for response packets.
> > > My opinion - it is useful feature for FreeBSD kernel, often used now
> > > as good routing platform.
> >
> > I already said that I think the code should be committed to the
> > main line kernel, and preserved.
>
> May be it is better to discuss a bit what we want to have in kernel
> exactly ? I think better to have another rt_entry structure for
> second/third/etc routes for some prefix. Only modifications it will take
> - lookup algorithm
> ( I think radix tree code will need some modification )
> - forwarding code ( need to choose one of few routes )
> - code for add/delete/get routes (something like "allow multipath"
> option for addition and "remove all" for deletion)
You seem to imply that this will fix a bug in the FreeBSD BGP
implementation; what bug?
If it's not a FreeBSD bug, then what problem are you trying to
solve? You can't control the return path for responses to packets
you send out.
How does doing this "make FreeBSD more like Cisco"?
-- Terry
I am sorry, but there is no way around it: if you have two links of equal
cost and want to use both, you have to have support for that in the
kernel.
There are situations where this is needed, and you do get good performance
from it if you have many flows. One flow should always take the same
path.
(I will not say anything about the patch in question, as I have not read
it.)
Yes! Thank you, Mattias. It seems my English is too bad for Terry to
understand me. This is exactly what I wanted to say to Terry.
--
Vladimir B. Grebenschikov
vo...@sw.ru, SWsoft, Inc.
> > Terry, FreeBSD has no support for BGP. To get BGP support you install a
> > router daemon. That inserts routes in the routing table in the kernel. The
> > kernel will do all packet forwarding. The kernel has to support two or
> > more routes to the same destination if you are going to do BGP (or OSPF)
> > equal cost multipath.
> >
> > I am sorry but ther is no way around it, if you have two links of equal
> > cost and want to use both, you have to have support for that in the
> > kernel.
> >
> > There are situations where this is needed, and you do get good performance
> > from it if you have many flows. One flow should always take the same
> > path.
>
> Yes! Thank you Mattias, it seems my english too bad to Terry understand
> me. Exactly this I want to say to Terry.
So there *is* a BGP deficiency -- FreeBSD can't implement something
BGP knows about.
I like the patches as they are.
It is the job of FreeBSD to provide capability, not policy.
The BGP code gets to dictate the policy when it decides how to
use the capability.
So I *still* think the patches should go in.
-- Terry
I think it is better to discuss the approach on freebsd-net first.
> -- Terry
--
Vladimir B. Grebenschikov
vo...@sw.ru, SWsoft, Inc.