Packet Loss: Discrepancy between sostat's output and Bro's capture_loss.log

Clement Chen

Dec 18, 2014, 12:48:26 PM
to securit...@googlegroups.com
Hi all,

I am running SO in a production environment. The server has 64G memory and 24 CPU cores. I am running 12 Bro workers as suggested.

I noticed a huge discrepancy between the packet loss reported in capture_loss.log and the packet loss reported by sostat.

In capture_loss.log, Bro is reporting up to 99% packet loss:

#fields ts ts_delta peer gaps acks percent_lost
#types time interval string count count double
1418922824.722233 900.000033 secmon-eth2-2 11547 11612 99.440234
1418922824.732983 900.000013 secmon-eth2-12 8644 8731 99.003551
1418922824.658712 900.000038 secmon-eth2-10 8768 9355 93.725281
1418922824.691344 900.000003 secmon-eth2-1 14246 15232 93.526786
1418922824.627492 900.000073 secmon-eth2-7 10897 11046 98.651095
1418922824.769663 900.000137 secmon-eth2-3 6571 6664 98.604442
1418922824.711460 900.000149 secmon-eth2-4 9698 9776 99.202128
1418922824.858321 900.000321 secmon-eth2-9 27351 27473 99.555928
1418922824.731312 900.000115 secmon-eth2-11 18014 18794 95.849739
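
For readers checking the numbers: the percent_lost column appears to be gaps divided by acks, times 100, and the rows above are consistent with that. A minimal Python sketch (the function name `percent_lost` is only illustrative, and this is not Bro's own code) that recomputes the column from a few of the rows:

```python
def percent_lost(gaps: int, acks: int) -> float:
    """Return gaps as a percentage of acks; 0.0 when no ACKs were seen."""
    if acks == 0:
        return 0.0
    return 100.0 * gaps / acks


# (peer, gaps, acks) copied from the capture_loss.log rows above.
rows = [
    ("secmon-eth2-2", 11547, 11612),
    ("secmon-eth2-12", 8644, 8731),
    ("secmon-eth2-10", 8768, 9355),
]

for peer, gaps, acks in rows:
    # Print with six decimals so the value can be compared with the log.
    print(f"{peer}: {percent_lost(gaps, acks):.6f}")
```

Because the figure is a ratio of ACKed-but-unseen data to all ACKs, it measures something different from a count of packets dropped at the capture interface.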

However, when I run sostat, the packet loss rate is much smaller. Below is some of the output from sostat:

================================================================
Bro netstats
================================================================
Average packet loss as percent across all Bro workers: 0.007091

secmon-eth2-1: 1418924384.192072 recvd=614300097 dropped=57077 link=614300097
secmon-eth2-10: 1418924384.393933 recvd=614752529 dropped=24 link=614752529
secmon-eth2-11: 1418924384.579498 recvd=621048275 dropped=21 link=621048275
secmon-eth2-12: 1418924384.801878 recvd=608787537 dropped=43 link=608787537
secmon-eth2-2: 1418924385.002095 recvd=615804811 dropped=22 link=615804811
secmon-eth2-3: 1418924385.201980 recvd=605278157 dropped=30 link=605278157
secmon-eth2-4: 1418924385.401909 recvd=609077753 dropped=27 link=609077753
secmon-eth2-5: 1418924385.601935 recvd=605810529 dropped=23 link=605810529
secmon-eth2-6: 1418924385.805827 recvd=608625567 dropped=125423 link=608625567
secmon-eth2-7: 1418924386.006024 recvd=626829972 dropped=30 link=626829972
secmon-eth2-8: 1418924386.206172 recvd=648424243 dropped=341581 link=648424243
secmon-eth2-9: 1418924386.405972 recvd=615953691 dropped=34 link=615953691
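
As a cross-check on the average above, here is a small Python sketch that sums `dropped` and `recvd` over the per-worker lines and takes the ratio. How sostat actually derives its average is an assumption here; the regular expression and the function name `aggregate_loss_percent` are illustrative only. With the twelve lines above the result is roughly 0.0071, in line with the figure sostat printed.

```python
import re

# Per-worker lines copied from the sostat output above.
NETSTATS = """\
secmon-eth2-1: 1418924384.192072 recvd=614300097 dropped=57077 link=614300097
secmon-eth2-10: 1418924384.393933 recvd=614752529 dropped=24 link=614752529
secmon-eth2-11: 1418924384.579498 recvd=621048275 dropped=21 link=621048275
secmon-eth2-12: 1418924384.801878 recvd=608787537 dropped=43 link=608787537
secmon-eth2-2: 1418924385.002095 recvd=615804811 dropped=22 link=615804811
secmon-eth2-3: 1418924385.201980 recvd=605278157 dropped=30 link=605278157
secmon-eth2-4: 1418924385.401909 recvd=609077753 dropped=27 link=609077753
secmon-eth2-5: 1418924385.601935 recvd=605810529 dropped=23 link=605810529
secmon-eth2-6: 1418924385.805827 recvd=608625567 dropped=125423 link=608625567
secmon-eth2-7: 1418924386.006024 recvd=626829972 dropped=30 link=626829972
secmon-eth2-8: 1418924386.206172 recvd=648424243 dropped=341581 link=648424243
secmon-eth2-9: 1418924386.405972 recvd=615953691 dropped=34 link=615953691
"""

# Matches "<worker>: <timestamp> recvd=<n> dropped=<n> ..."
LINE_RE = re.compile(r"^(\S+): \S+ recvd=(\d+) dropped=(\d+)")


def aggregate_loss_percent(text: str) -> float:
    """Total dropped packets as a percentage of total received packets."""
    recvd = dropped = 0
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            recvd += int(match.group(2))
            dropped += int(match.group(3))
    return 100.0 * dropped / recvd if recvd else 0.0


print(f"{aggregate_loss_percent(NETSTATS):.6f}")
```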

==============================================================
IDS Engine (snort) packet drops
=========================================================================
/nsm/sensor_data/secmon-eth2/snort-1.stats last reported pkt_drop_percent as 0.000
/nsm/sensor_data/secmon-eth2/snort-2.stats last reported pkt_drop_percent as 0.000
/nsm/sensor_data/secmon-eth2/snort-3.stats last reported pkt_drop_percent as 0.000
/nsm/sensor_data/secmon-eth2/snort-4.stats last reported pkt_drop_percent as 0.000
/nsm/sensor_data/secmon-eth2/snort-5.stats last reported pkt_drop_percent as 0.000
/nsm/sensor_data/secmon-eth2/snort-6.stats last reported pkt_drop_percent as 0.000
/nsm/sensor_data/secmon-eth2/snort-7.stats last reported pkt_drop_percent as 0.000
/nsm/sensor_data/secmon-eth2/snort-8.stats last reported pkt_drop_percent as 0.000

===============================================================


Also, here are the PF_RING stats:

=========================================================================
pf_ring stats
=========================================================================
PF_RING Version : 6.0.2 ($Revision: $)
Total rings : 20

Standard (non DNA) Options
Ring slots : 65534
Slot version : 16
Capture TX : Yes [RX+TX]
IP Defragment : No
Socket Mode : Standard
Transparent mode : Yes [mode 0]
Total plugins : 0
Cluster Fragment Queue : 261
Cluster Fragment Discard : 0

/proc/net/pf_ring/23443-eth2.238
Appl. Name : snort-cluster-53-socket-0
Tot Packets : 165466159
Tot Pkt Lost : 4765294
TX: Send Errors : 0
Reflect: Fwd Errors: 0
Min Num Slots : 65538
Num Free Slots : 65535

/proc/net/pf_ring/23487-eth2.237
Appl. Name : snort-cluster-53-socket-0
Tot Packets : 168746153
Tot Pkt Lost : 4896390
TX: Send Errors : 0
Reflect: Fwd Errors: 0
Min Num Slots : 65538
Num Free Slots : 65482

/proc/net/pf_ring/23530-eth2.239
Appl. Name : snort-cluster-53-socket-0
Tot Packets : 164172651
Tot Pkt Lost : 3777871
TX: Send Errors : 0
Reflect: Fwd Errors: 0
Min Num Slots : 65538
Num Free Slots : 65451

/proc/net/pf_ring/23582-eth2.240
Appl. Name : snort-cluster-53-socket-0
Tot Packets : 171746268
Tot Pkt Lost : 2718093
TX: Send Errors : 0
Reflect: Fwd Errors: 0
Min Num Slots : 65538
Num Free Slots : 65481
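
To put the "Tot Pkt Lost" counters in proportion, the following Python sketch parses one ring block in the layout pasted above and expresses lost packets as a percentage of "Tot Packets". Whether "Tot Packets" already includes the lost packets is not confirmed here, so treat the ratio as an approximation; the helper names `parse_ring` and `lost_percent` are hypothetical.

```python
# One ring block copied from the pf_ring stats above.
RING = """\
/proc/net/pf_ring/23443-eth2.238
Appl. Name : snort-cluster-53-socket-0
Tot Packets : 165466159
Tot Pkt Lost : 4765294
TX: Send Errors : 0
Reflect: Fwd Errors: 0
Min Num Slots : 65538
Num Free Slots : 65535
"""


def parse_ring(block: str) -> dict:
    """Split each 'key : value' line on its last colon; skip lines without one."""
    stats = {}
    for line in block.splitlines():
        key, sep, value = line.rpartition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def lost_percent(stats: dict) -> float:
    """Lost packets relative to 'Tot Packets' (assumption: lost not included)."""
    packets = int(stats["Tot Packets"])
    lost = int(stats["Tot Pkt Lost"])
    return 100.0 * lost / packets if packets else 0.0


stats = parse_ring(RING)
print(f"{stats['Appl. Name']}: {lost_percent(stats):.2f}% lost")
```

These counters are cumulative since the ring was created, so they are not directly comparable with a "last reported" interval value such as the snort pkt_drop_percent lines.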

Clement Chen

Dec 18, 2014, 1:37:59 PM
to securit...@googlegroups.com
BTW, I used the wizard to set up the NIC and disabled all offloading. I do have tagged traffic, though. The NIC is 10G, but the actual traffic volume is much lower.

ethtool -k eth2

Offload parameters for eth2:
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
rx-vlan-offload: off
tx-vlan-offload: off
ntuple-filters: off
receive-hashing: off


But if I run ethtool -S eth2, I get a very large value for rx_long_length_errors:

rx_long_length_errors: 9906116604

Clement Chen

Dec 18, 2014, 5:05:47 PM
to securit...@googlegroups.com
I changed the MTU from 1500 to 9000 and no longer got the "rx_long_length_errors".

Does anyone know how Bro and Snort deal with large MTUs?

Samson H

Jun 21, 2016, 3:11:04 PM
to security-onion
So I know this is an old thread, but I am wondering the same thing as Clement.

Can anyone answer this?

Doug Burks

Jun 21, 2016, 9:25:18 PM
to securit...@googlegroups.com
Hi Samson,

If capture_loss is reporting higher packet loss than sostat, then that
may be indicative of upstream packet loss (in your tap or span port).

For more information, please see:
https://www.bro.org/documentation/faq.html#how-can-i-reduce-the-amount-of-captureloss-or-dropped-packets-notices
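
The rule of thumb above can be written down as a rough Python sketch. The function name `likely_loss_location` and the 1% threshold are hypothetical choices made for illustration, not values taken from Bro or sostat, and the result is only a hint about where to look first.

```python
def likely_loss_location(capture_loss_pct: float,
                         netstats_drop_pct: float,
                         threshold_pct: float = 1.0) -> str:
    """Guess where packets go missing from the two loss figures.

    capture_loss_pct  -- percent_lost from capture_loss.log
    netstats_drop_pct -- drop percentage from Bro netstats (as shown by sostat)
    threshold_pct     -- arbitrary cut-off for "high" loss (an assumption)
    """
    if netstats_drop_pct >= threshold_pct:
        # The sensor itself reports dropping packets.
        return "sensor"
    if capture_loss_pct >= threshold_pct:
        # Bro sees ACKs for data it never received, yet the sensor reports
        # few drops: the traffic was probably missing before it arrived.
        return "upstream"
    return "none"


# Figures from the first post: about 99% capture loss, about 0.007% drops.
print(likely_loss_location(99.440234, 0.007091))
```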

--
Doug Burks

Samson H

Jun 22, 2016, 10:11:25 AM
to security-onion
You da man Doug!
Thanks for the help!

Some background on my issue for those who are interested:
I think we have an issue with our interfaces.
We are trying to do some poor man's load balancing with several 10G interfaces on a sensor by bonding them into one bonded interface. I think we're shooting ourselves in the foot: sessions are being broken up across multiple interfaces, so we can't see full sessions.
(Feel free to correct me if I'm wrong here.)