I'm currently using Net-SNMP snmpd to export my Linux kernel's ARP
tables. I'm doing real-time lookups of IP-to-ARP mappings, and it looks
like snmpd caches the kernel's ARP table for a while, sometimes
resulting in "not found" responses for IPs that are actually present.
Looking at agent/mibgroup/ip-mib/data_access/arp_linux.c, it takes
snapshot dumps of the kernel's tables, and apparently the higher layers
do the caching.
Now I'm asking which approach is better, and whether there is any
existing example code (so I can fix arp_linux.c):
1) Dynamically listen to Linux netlink events and add/delete/update
entries in snmpd's copy of the ARP table?
-or-
2) On a request for a specific IP address (a specific MIB entry with
the IP address as part of the request OID), query the kernel for
up-to-date information on that address and dynamically create the result?
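For option 1, the change notifications would come from an rtnetlink socket subscribed to the neighbour multicast group. A rough standalone sketch (Linux-only; the function name is invented, and real snmpd code would hand the fd to the agent's event loop rather than read it privately):

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Open an rtnetlink socket subscribed to neighbour-table change events
 * (RTM_NEWNEIGH / RTM_DELNEIGH).  Returns the fd, or -1 on error. */
int open_neigh_monitor(void)
{
    struct sockaddr_nl sa;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;
    sa.nl_groups = RTMGRP_NEIGH;   /* neighbour change multicast group */
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    return fd;   /* caller monitors this fd and parses nlmsg events */
}
```

The caller would then recv() netlink messages from the returned fd and translate each RTM_NEWNEIGH/RTM_DELNEIGH into an add/update/delete on its copy of the table.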
Any ideas?
Cheers,
- Timo
_______________________________________________
Net-snmp-coders mailing list
Net-snm...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
Yep. You could set the cache timeout lower if you wanted fresher data, but
there will always be that window where the cache could be a little stale.
TT> Now I'm asking which approach is better, and whether there is any
TT> existing example code (so I can fix arp_linux.c):
TT>
TT> 1) Dynamically listen to Linux netlink events and add/delete/update
TT> entries in snmpd's copy of the ARP table?
TT>
TT> -or-
TT>
TT> 2) On a request for a specific IP address (a specific MIB entry with
TT> the IP address as part of the request OID), query the kernel for
TT> up-to-date information on that address and dynamically create the result?
I'd vote for 2. You'd just have to disable the cache reload at the higher
levels (there are some flags that can be set when the cache is created) and
keep the cache consistent via netlink.
There is already some code in net-snmp that uses netlink; grep should
help you find where it is. Just make sure that you use the agent's main
select loop to monitor the netlink socket.
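To illustrate the select-loop point: the netlink fd just needs to be part of the agent's select() set (which is what registering it with the agent achieves) instead of being polled in a private loop. A minimal, self-contained sketch of the readiness check itself, with an invented helper name:

```c
#include <sys/select.h>
#include <unistd.h>   /* for the pipe-based usage example */

/* Return 1 if fd becomes readable within timeout_sec seconds, else 0.
 * In snmpd this check is done for you by the main select loop; the
 * module only supplies the fd and a callback. */
int fd_readable(int fd, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;

    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv) == 1;
}
```

You can exercise it with any fd, e.g. the read end of a pipe becomes readable once something is written to the write end.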
If the code was configurable at runtime, that would be ideal. People who
aren't monitoring/using the arp module might not want the overhead of an
unused cache being constantly updated.
Looks like the current code works like this:
1) arp_common/arp_linux returns a container with a snapshot of the kernel
cache
2) inetNetToMediaTable is the only MIB using the arp code, and it refreshes
the cache by doing a full ARP dump via arp_common.c and finally
updates the associated cache
So this likely needs a bit of rework before dynamic updates become
possible.
I was actually wondering if option 1, or some form of it, would be
best. See below. (Sounds like option 2 would require reimplementing
the MIB in a direct-access style.)
> There is already some code in net-snmp that uses netlink; grep should
> help you find where it is. Just make sure that you use the agent's main
> select loop to monitor the netlink socket.
Yes, I found them, but none of them seem to refresh themselves
dynamically; all of the implementations take periodic snapshots.
> If the code was configurable at runtime, that would be ideal. People who
> aren't monitoring/using the arp module might not want the overhead of an
> unused cache being constantly updated.
Right. I'd assume the RightThing(tm) is, on the first MIB request, to
do a full load and start listening for changes, and after a timeout to
stop listening. We'd then have a fully dynamic system.
I'm wondering if I can do the following with the cache_handler:
1) on container_load, do a full dump as usual, but also register for
the netlink change events (using register_readfd() external fd
handling) and finally start a timer to unload the cache / stop
listening to netlink
2) on each _get operation, reset the "monitoring stop timer"
3) in the netlink socket callback, just go and directly update the local
cache container using the regular CONTAINER_* operations
4) when the "monitoring stop timer" fires, kill the netlink socket and
release resources
It's mainly step 3 I'm unsure about: can I modify the cache container
contents outside of the container_load() call?
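For what it's worth, the four steps could be modeled roughly like this. All names are invented and a bare entry count stands in for the real CONTAINER; this is a toy model of the lifecycle, not net-snmp API:

```c
#include <time.h>

#define CACHE_TTL 30   /* assumed cache timeout, seconds */

struct arp_cache {
    int    entries;     /* stand-in for the container of ARP rows */
    time_t expires;     /* the "monitoring stop timer" */
    int    netlink_fd;  /* -1 when not listening */
};

/* Step 1: full dump plus registration of the netlink fd and a timer. */
void cache_load(struct arp_cache *c, int netlink_fd)
{
    c->entries = 0;            /* a real load would populate the rows */
    c->netlink_fd = netlink_fd;
    c->expires = time(NULL) + CACHE_TTL;
}

/* Step 2: every _get bumps the stop timer forward. */
int cache_get(struct arp_cache *c)
{
    c->expires = time(NULL) + CACHE_TTL;
    return c->entries;
}

/* Step 3: the netlink callback patches the container directly. */
void cache_on_netlink_event(struct arp_cache *c, int delta)
{
    c->entries += delta;       /* real code: CONTAINER_INSERT/REMOVE */
}

/* Step 4: on timer expiry, stop listening and release resources. */
void cache_expire(struct arp_cache *c)
{
    c->netlink_fd = -1;
    c->entries = 0;
}
```

The interesting property is that between cache_load() and cache_expire() the container is never reloaded wholesale, only patched by events.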
- Timo
<sigh>. My brain said option 1, but for some reason my fingers typed 2.
TT> Right. I'd assume the RightThing(tm) is, on the first MIB request, to
TT> do a full load and start listening for changes, and after a timeout to
TT> stop listening. We'd then have a fully dynamic system.
sounds good, as long as the listening part is optional.
TT> I'm wondering if I can do the following with the cache_handler:
TT> 1) on container_load, do a full dump as usual, but also register for
TT> the netlink change events (using register_readfd() external fd
TT> handling) and finally start a timer to unload the cache / stop
TT> listening to netlink
The cache already has a timer, so that can be reused.
TT> 2) on each _get operation, reset the "monitoring stop timer"
A new cache helper flag could be added to bump the cache expiration timer. I
don't think a separate monitoring timer would be needed.
TT> 3) in the netlink socket callback, just go and directly update the local
TT> cache container using the regular CONTAINER_* operations
Yep. You just have to have a way to get a hold of the cache. Another option is
to simply queue the netlink messages and not do any cache modifications until
a request comes in. This could be a new generic facility added to the cache
helper: cache updates, in addition to simply loading and flushing. The updates
could be immediate or delayed, and there could be thresholds for simply
triggering a flush and reload if too many updates come in...
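A sketch of that delayed-update idea, including an overflow threshold that forces a flush-and-reload; all names and the threshold value are invented for illustration:

```c
#include <string.h>

#define QUEUE_MAX 8   /* assumed threshold before flush-and-reload */

struct update_queue {
    int pending[QUEUE_MAX];   /* queued deltas from netlink events */
    int count;
    int overflowed;
};

/* Called from the netlink callback: remember the change, don't apply it. */
void queue_update(struct update_queue *q, int change)
{
    if (q->count >= QUEUE_MAX) {
        q->overflowed = 1;    /* too many updates: reload instead */
        return;
    }
    q->pending[q->count++] = change;
}

/* Called when a request comes in.  Returns 1 if the caller must flush
 * and reload the whole cache; otherwise folds the queued deltas into
 * the cached entry count and returns 0. */
int apply_updates(struct update_queue *q, int *entries)
{
    int i;
    if (q->overflowed) {
        memset(q, 0, sizeof(*q));
        return 1;
    }
    for (i = 0; i < q->count; i++)
        *entries += q->pending[i];
    q->count = 0;
    return 0;
}
```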
Okay.
> TT> I'm wondering if I can do the following with the cache_handler:
> TT> 1) on container_load, do a full dump as usual, but also register for
> TT> the netlink change events (using register_readfd() external fd
> TT> handling) and finally start a timer to unload the cache / stop
> TT> listening to netlink
>
> The cache already has a timer, so that can be reused.
Ah, of course. We also have the hooks for it, so great.
> TT> 2) on each _get operation, reset the "monitoring stop timer"
>
> A new cache helper flag could be added to bump the cache expiration timer. I
> don't think a separate monitoring timer would be needed.
Makes sense.
> TT> 3) in the netlink socket callback, just go and directly update the local
> TT> cache container using the regular CONTAINER_* operations
>
> Yep. You just have to have a way to get a hold of the cache. Another option is
> to simply queue the netlink messages and not do any cache modifications until
> a request comes in. This could be a new generic facility added to the cache
> helper: cache updates, in addition to simply loading and flushing. The updates
> could be immediate or delayed, and there could be thresholds for simply
> triggering a flush and reload if too many updates come in...
I'm not too excited about caching cache changes. I'd prefer to just go
ahead and update the cache directly. I'll try to brew up some code to
see how it works out in practice.
- Timo
Ok. I suggest you put in some safety valves to make sure that netlink updates
don't starve other agent processing.
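One possible safety valve is to cap how many events a single callback invocation will drain, so a netlink flood cannot monopolize the agent's event loop. A toy sketch with invented names and an assumed cap:

```c
#define MAX_EVENTS_PER_WAKEUP 32   /* assumed per-callback budget */

/* Drain at most MAX_EVENTS_PER_WAKEUP events from the backlog and
 * return how many were consumed, so the caller can tell whether more
 * work remains for the next wakeup.  The decrement stands in for a
 * recv() on the netlink socket plus a container update. */
int drain_events(int *backlog)
{
    int done = 0;
    while (*backlog > 0 && done < MAX_EVENTS_PER_WAKEUP) {
        (*backlog)--;
        done++;
    }
    return done;
}
```

If drain_events() returns the full budget, the fd is still readable and select() will simply wake the callback again, after other agent work has had its turn.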
I looked at it earlier and did not find it, but I was looking at the
net-snmp-5.6 release sources. I did an svn checkout now and see it. It
looks exactly like what I had in mind for my code.
Thanks for the pointer.
Right. In general, I think listening for changes via netlink is much
lighter than re-reading the neighbour table frequently (though the
default caching time is apparently pretty long).
I'm now pretty okay with the code and understand how the existing stuff
works, and what needs to be fixed.
I still have one more implementation detail on which I'd like a second
opinion. Since I need to break the arp data_access API anyway, and it
will need to support both cached polling and listening for change
events, we have some options:
a) have two APIs in data_access/arp.h: one for periodic polling, one
with callbacks for changes; each user (currently only
inetNetToMediaTable) needs to support both APIs, and for polling must
detect deleted entries as it does now
b) change data_access/arp.h to have only a callback for changed entries;
the arp code needs to keep an additional copy of the entries when
polling (so it can detect deletions)
c) as above, but supply a find function for the parent container, and add
additional fields to netsnmp_arp_entry so the arp kernel provider can do
updates/deletions directly on the parent container
d) add an intermediate caching layer between the users and the kernel
providers; this is what net_snmp_search_update_prefix_info() does for
the IPv6 prefix stuff, but I don't think it is such a good approach for
the neighbour cache
e) other ideas?
I personally would favor option c. It would minimize the number of
temporary containers and keep the data in one place. It would also get
rid of most of the duplicate code. However, it removes some of the
abstraction that is currently in place (but do we really need it?).
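To make option c concrete, here is a rough sketch of what the provider-side update entry point might look like when the user supplies its container plus a find function. All types and names here are invented for discussion; the real netsnmp_arp_entry and CONTAINER are much richer:

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for netsnmp_arp_entry. */
typedef struct arp_entry {
    unsigned int  ip;      /* simplified key (real code: inet address) */
    unsigned char mac[6];
    int           valid;
} arp_entry;

/* Find function supplied by the table user, as option c suggests. */
typedef arp_entry *(*arp_find_fn)(void *container, unsigned int ip);

/* The kernel provider applies a change straight to the user's container:
 * a non-NULL mac means add/update, a NULL mac means delete. */
void arp_provider_update(void *container, arp_find_fn find,
                         unsigned int ip, const unsigned char *mac)
{
    arp_entry *e = find(container, ip);
    if (e == NULL)
        return;            /* a real provider would insert a new row */
    if (mac != NULL) {
        memcpy(e->mac, mac, sizeof(e->mac));
        e->valid = 1;
    } else {
        e->valid = 0;      /* deletion, done in place for the sketch */
    }
}

/* Example find over a flat array, standing in for a container lookup. */
#define TABLE_LEN 2
arp_entry g_table[TABLE_LEN] = { { 1, {0}, 0 }, { 2, {0}, 0 } };

arp_entry *array_find(void *container, unsigned int ip)
{
    arp_entry *t = container;
    int i;
    for (i = 0; i < TABLE_LEN; i++)
        if (t[i].ip == ip)
            return &t[i];
    return NULL;
}
```

The point of the shape is that the provider never builds a temporary container; it mutates the one the MIB table already owns, which is where the duplicate-code savings would come from.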
Cheers,
Timo