Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: Improving use of rt_refcnt

2 views
Skip to first unread message

Ryota Ozaki

unread,
Jul 15, 2015, 6:07:18 AM7/15/15
to
On Fri, Jul 10, 2015 at 6:10 AM, Joerg Sonnenberger
<jo...@britannica.bec.de> wrote:
> On Sun, Jul 05, 2015 at 10:33:02PM -0700, Dennis Ferguson wrote:
>> If you don't want it to work this way then you'll need to replace the
>> radix tree with something that permits changes while readers are
>> concurrently operating. To take best advantage of a more modern data
>> structure, however, you are still not going to want readers to ever
>> write the shared data structure if that can be avoided. The two
>> atomic operations needed to increment and decrement a reference count
>> greatly exceed the cost of a (well-cached) route lookup.
>
> Let me pick the discussion up at this point since David mentioned that
> my last reply was somewhat terse. I think the current radix tree serves
> three different purposes right now:
>
> (1) Manage the view of the connectivity to the outside world in a way
> coherent with the administrator's intention or some routing
> protocol/daemon.
>
> (2) Provide a mechanism for finding the next-hop for traffic to not
> directly attached networks.
>
> (3) Provide a mechanism for finding L2 addresses on directly attached
> networks.
>
> Using a single data structure for this has the advantage of code sharing
> and can make detailed accounting very easy. It has the problem of
> overhead and mixing data of different levels of volatility. I would like
> to see the three mechanisms to be separated with appropiate data
> structures for each case. The first point would be moved out completely
> from the hot path, the actual packet handling case. It would then be no
> longer as performance sensitive, so options for storage can be more
> focused on size.
>
> For finding the next-hop, the problem is simplified. The number of
> next-hop addresses is (normally) limited by the size of the network
> neighborhood. Even a backend router at one of the major Internet
> exchange points will not have more than a few thousand next-hops,
> compared to having 200k routes or more. This can be exploited to reduce
> the data size of the BMP lookup data structure and by removing redundant
> entries, e.g. a longer prefix with the same next-hop as a shorter
> prefix. As I mentioned in my earlier email, the next-hop entry can and
> should store a reference to whatever L2 data is needed, so that no
> additional search is needed.
>
> For the L3->L2 address mapping, the problem changes from BMP search to
> an exact match search. If the mapping is managed correctly, it makes
> sense to do this (cheap) search first and skip the whole BMP lookup on
> a match as redundant. Hash tables and the like have also nice properties
> for read-mostly updates and cache density.
>
> Joerg

http://www.netbsd.org/~ozaki-r/separate-l2-nexthop.diff

I've written a POC patch toward the nexthop cache separation.
It is mostly based on FreeBSD implementation.

It's still not mature; there remain debug codes and many places
are probably broken. But it works anyway, at least it passes
most of ATF tests.

This is a demonstration to show what we need to do if we go
the direction. I don't intend to commit it soon.

ozaki-r

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-...@muc.de

Ryota Ozaki

unread,
Jul 15, 2015, 6:14:44 AM7/15/15
to
On Sat, Jul 4, 2015 at 9:52 PM, Ryota Ozaki <oza...@netbsd.org> wrote:
> Hi,
>
> I'm trying to improve use of rt_refcnt: reducing
> abuse of it, e.g., rt_refcnt++/rt_refcnt-- outside
> route.c and extending it to treat referencing
> during packet processing (IOW, references from
> local variables) -- currently it handles only
> references between routes. The latter is needed for
> MP-safe networking.
>
> Here is a patch:
> http://www.netbsd.org/~ozaki-r/reduce-rt_refcnt-abuse.diff

BTW are there objections against the patch as a cleanup
of use of rt_refcnt?

ozaki-r

>
> The patch passes all ATF tests and an additional test
> specific to refcnt that is not committed yet due
> to lack of refcnt outputs in netstat -r (See PR#50027).
>
> Note that the patch isn't ready for MP-safe yet;
> we need atomicity of refcnt operations somehow
> (locking is an easy solution for it). This work
> will be solved in another patch.
>
> Any comments or suggestions are welcome.
>
> Thanks,
0 new messages