Has anyone seen this before? Is it a known issue that I didn't google thoroughly enough?
Thanks
Ross Warren
Hi Ross
Can you please include the output of sudo sostat-redact so we can have a full look at your system?
Regards,
Lysemose
PF_RING hashes the IP headers of the traffic to decide which PF_RING bucket to dump the traffic into for inspection. While I have not looked at the PF_RING source to see how that hashing is done, it is possible that you have one or more very large single streams of traffic that are consuming the majority of your inspected bandwidth and are all being processed by that one snort instance. Maybe you can inspect your Bro connection summary logs for that time period to see if you had any very large single streams.
A single snort instance "should" be able to handle that amount of traffic without dropping packets, assuming you do not have a crazy number of rules (6-7k seems to be what the community sees as the sweet spot), your hardware is adequately sized, and SO is tuned (pf_ring min_slots, pcap ring size).
From the README.1st documentation in the pfring-daq-module source distribution:
2. Socket clustering
PF_RING allows you to distribute packets across multiple processes by using
socket clusters.
...
It is also possible to specify the cluster mode, with:
--daq-var clustermode=<mode>
where valid mode values are:
- 2 for 2-tuple flow
- 4 for 4-tuple flow
- 5 for 5-tuple flow
- 6 for 6-tuple flow
The default clustermode for SO is 'config daq_var: clustermode=4'. So, this probably comes back to my original assertion that you have one or more very large streams that were hashed into the same pf_ring slot.
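For reference, the corresponding DAQ settings in snort.conf might look like the sketch below. This is illustrative only: the clusterid value is a placeholder, and everything should be tuned for your own deployment.

```
# Illustrative snort.conf DAQ settings for PF_RING; values are placeholders
config daq: pfring
config daq_mode: passive
# clusterid must be shared by all Snort instances in the same cluster
config daq_var: clusterid=99
# 4-tuple flow hashing (the SO default)
config daq_var: clustermode=4
```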
Relevant headers, comments, and functions that explain how it chooses to hash a packet:
userland/lib/pfring.h:
int pfring_set_cluster(pfring *ring, u_int clusterId, cluster_type the_type);
kernel/pf_ring.c:
static int hash_pkt_cluster(ring_cluster_element *cluster_ptr,
struct pfring_pkthdr *hdr,
u_int16_t ip_id, u_int8_t first_fragment, u_int8_t second_fragment)
static inline u_int32_t hash_pkt_header(struct pfring_pkthdr * hdr, u_int32_t flags)
static inline u_int32_t hash_pkt(u_int16_t vlan_id, u_int8_t proto,
ip_addr host_peer_a, ip_addr host_peer_b,
u_int16_t port_peer_a, u_int16_t port_peer_b)
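To illustrate the effect, here is a toy sketch in Python (NOT the actual PF_RING hash): with 4-tuple clustering, every packet of a flow, in either direction, hashes to the same bucket, so a single elephant flow lands entirely on one snort instance no matter how much bandwidth it consumes.

```python
# Toy 4-tuple flow hash; NOT the real PF_RING implementation.
def four_tuple_bucket(src_ip, src_port, dst_ip, dst_port, n_buckets=4):
    # Combine the endpoints symmetrically so both directions of a
    # flow map to the same bucket (mimicking clustermode=4 behavior).
    key = hash(frozenset([(src_ip, src_port), (dst_ip, dst_port)]))
    return key % n_buckets

# Every packet of one large flow goes to the same bucket/instance,
# regardless of direction:
b = four_tuple_bucket("10.0.0.5", 50000, "192.168.1.10", 443)
assert four_tuple_bucket("192.168.1.10", 443, "10.0.0.5", 50000) == b
```

The point of the sketch: the hash input is the flow tuple, not the byte count, so load balancing is per-flow, not per-byte.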
From http://www.bro.org/bro-workshop-2011/solutions/logs/:
awk 'NR > 4 && $9 > 600' conn.log | sort -t$'\t' -k 9 -n
Thanks for the pointer.
Ross Warren
I believe that is just looking for long-lived connections (those established for longer than 600 seconds); I would find a netflow session generating over 200 Mbps of traffic to be abnormal. I think you would want to look at columns 10 and 11 (orig_bytes/resp_bytes) in conn.log to see who is eating up your bandwidth. Or, just look at /nsm/bro/logs/<date>/conn-summary.<time period>.log.gz for your top talkers.
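As a hedged sketch (Python, assuming the classic tab-separated Bro conn.log layout where orig_bytes/resp_bytes are columns 10 and 11, with "-" for missing values; the function name is mine), you could rank connections by total bytes like this:

```python
# Sketch: rank conn.log connections by total bytes transferred.
# Assumes the standard tab-separated Bro conn.log column order:
#   ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service
#   duration orig_bytes resp_bytes ...
# Missing values appear as "-", which we count as 0.
def top_talkers(path, n=10):
    rows = []
    with open(path) as f:
        for line in f:
            if line.startswith("#"):  # skip Bro header/metadata lines
                continue
            fields = line.rstrip("\n").split("\t")
            orig = int(fields[9]) if fields[9].isdigit() else 0
            resp = int(fields[10]) if fields[10].isdigit() else 0
            # fields[2]/fields[4] are id.orig_h / id.resp_h
            rows.append((orig + resp, fields[2], fields[4]))
    return sorted(rows, reverse=True)[:n]
```

Adjust the column indexes if your Bro version logs a different field order (check the #fields header line at the top of conn.log).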