
interpreting "input ICMP message failed" / netstat


sken...@fhcrc.org

Apr 23, 2009, 1:56:37 PM
I have a monitoring application which emits pings. Most of the time,
those ICMP Echos leave the box, arrive at their destination, and come
back as ICMP Echo Replies, which is good. However, intermittently, the
pings don't even leave the box. [I know this because I have a sniffer
positioned just outside the box, mirroring traffic via the switch, and
it sees nothing during those periods.]
Naturally, my application then thinks that *all* its monitored devices
have gone down, whereupon it becomes agitated, emits pages, and so
forth. I'm adding debugging code to the application to try to
understand where the failure is. Thus far, the return code from my
call to send() looks fine, i.e. the OS claims that the call completed.
So the ICMP Echo appears to get dropped somewhere after my application
has handed it off to the kernel.
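
For reference, the kind of check described above can be sketched roughly as follows. This is an illustration in Python, not the application's actual code (its language isn't stated here); the helper name send_checked and the fields it returns are made up for the example.

```python
import errno
import socket
import time


def send_checked(sock, packet):
    """Send one packet and record what the kernel reported about it.

    Hypothetical helper: a clean result only shows that the kernel
    accepted the packet, not that it reached the wire.
    """
    result = {"when": time.time(), "wanted": len(packet),
              "sent": None, "error": None}
    try:
        # send() returns the number of bytes the kernel accepted.
        result["sent"] = sock.send(packet)
    except OSError as exc:
        # Keep the symbolic errno name (e.g. ENOBUFS) for the debug log.
        result["error"] = errno.errorcode.get(exc.errno, str(exc.errno))
    result["ok"] = result["sent"] == len(packet)
    return result
```

Logging the timestamp alongside the result makes it possible to line up each send() with the sniffer trace and with counter snapshots taken at the same moment.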

Here is the output of "netstat -s -w":

gnat> netstat -s -w
[...]
Icmp:
    3510705 ICMP messages received
    34835 input ICMP message failed.
    ICMP input histogram:
        destination unreachable: 111508
        timeout in transit: 26581
        echo requests: 84611
        echo replies: 3288005
    7155838 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 64167
        echo request: 7007060
        echo replies: 84611

(1) How to interpret the 'input ICMP message failed' counter?

Does it mean that the OS was asked to *transmit* an ICMP message but
was unable to (perhaps due to resource constraints) and threw the
message away? Or does it mean that the OS *received* an ICMP message
but was unable to process it for some reason (a full buffer, perhaps)
and tossed it?
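
One way to narrow this down is to snapshot the ICMP counters before and after a failure and see which ones move. The sketch below assumes a Linux host where these numbers are exposed in /proc/net/snmp, and assumes that the "input ICMP message failed" line corresponds to the InErrors field there; neither assumption has been confirmed in this thread. The function names are illustrative.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {protocol: {field: int}}."""
    tables = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Rows come in pairs: a row of field names, then a row of values,
    # both prefixed with the protocol name and a colon.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            raise ValueError("header/value rows out of step: %r" % proto)
        tables[proto] = dict(zip(names.split(),
                                 (int(n) for n in numbers.split())))
    return tables


def icmp_delta(before, after,
               fields=("InMsgs", "InErrors", "OutMsgs", "OutErrors")):
    """Return how much each selected Icmp counter grew between snapshots."""
    return {f: after["Icmp"][f] - before["Icmp"][f] for f in fields}
```

In use, the text would come from reading /proc/net/snmp once before and once after a monitoring cycle. If OutMsgs keeps climbing while the sniffer sees nothing, the loss is probably below the ICMP layer; if InErrors jumps instead, the problem is on the receive side.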

Checking the output of "netstat -i", I don't see any sign that the NIC
is dropping frames.

gnat> netstat -i
Kernel Interface table
Iface     MTU      RX-OK RX-ERR RX-DRP RX-OVR      TX-OK TX-ERR TX-DRP TX-OVR
bond0    1500  177785514      0      0      0   59109219      0      0      0
bond0:1  1500     - no statistics available -
eth0     1500  162487138      0      0      0   59109219      0      0      0
eth1     1500   15298376      0      0      0          0      0      0      0
eth2     1500    1791025      0      0      0     652286      0      0      0
eth3     1500    1362124      0      0      0         54      0      0      0
lo      16436   18460803      0      0      0   18460803      0      0      0
gnat>

(2) How reliably do NIC drivers update the counters which 'netstat -i'
is querying? How confident can I be that, in fact, the NIC is *not*
dropping frames?
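
As a cross-check, the per-interface counters can be read directly and logged at the moment a ping goes missing. The sketch below assumes a Linux host and parses text in the /proc/net/dev layout (two header lines, then one row of sixteen counters per interface). As far as I can tell this is the same source "netstat -i" reads, so it would not independently verify the driver; it only makes timestamped snapshots easy. The names are illustrative.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")


def parse_net_dev(text):
    """Parse /proc/net/dev-style text into {iface: {"rx": {...}, "tx": {...}}}."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, rest = line.split(":", 1)
        numbers = [int(n) for n in rest.split()]
        if len(numbers) < 16:
            continue  # not a full counter row; ignore it
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:16])),
        }
    return stats
```

Comparing the "drop", "errs" and "fifo" values of two snapshots taken either side of a missing ping would show whether any of them moved during the failure window.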

(3) And finally, any recommendations for books on this topic? I've
skimmed through "Advanced Unix Programming" by Rochkind, "Unix Network
Programming, Volume 1: The Sockets Network API", and "TCP/IP
Illustrated Volume 2: The Implementation", without success thus far.
[helpful in other ways, but not in how to interpret 'netstat' output]

--sk

stuart kendrick
fhcrc
