Has anyone seen this before? Is it a known issue that I didn't google thoroughly enough?
Thanks
Ross Warren
Hi Ross
Can you please include the output of sudo sostat-redact so we can have a full look at your system?
Regards,
Lysemose
PF_RING hashes the IP headers of the traffic to decide which PF_RING bucket to dump the traffic into for inspection. While I have not looked at the PF_RING source to see how that hashing is done, it is possible that you have one or more very large single streams of traffic that are consuming the majority of your inspected bandwidth and are all being processed by that one snort instance. Maybe you can inspect your Bro connection summary logs for that time period to see if you had any very large single streams.
A single snort instance "should" be able to handle that amount of traffic without dropping packets, assuming you do not have a crazy number of rules (6-7k seems to be what the community sees as the sweet spot), your hardware is adequately sized, and SO is tuned (pf_ring min_slots, pcap ring size).
From the README.1st documentation in the pfring-daq-module source distribution:
2. Socket clustering
PF_RING allows you to distribute packets across multiple processes by using
socket clusters.
...
It is also possible to specify the cluster mode, with:
--daq-var clustermode=<mode>
where valid mode values are:
- 2 for 2-tuple flow
- 4 for 4-tuple flow
- 5 for 5-tuple flow
- 6 for 6-tuple flow
The default clustermode for SO is 'config daq_var: clustermode=4'. So, this probably comes back to my original assertion that you have one or more very large streams that were hashed into the same pf_ring slot.
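For reference, the corresponding DAQ settings in snort.conf might look like the sketch below. This is illustrative only: the clusterid value is a placeholder, and everything should be tuned for your own deployment.

```
# Illustrative snort.conf DAQ settings for PF_RING; values are placeholders
config daq: pfring
config daq_mode: passive
# clusterid must be shared by all Snort instances in the same cluster
config daq_var: clusterid=99
# 4-tuple flow hashing (the SO default)
config daq_var: clustermode=4
```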
Relevant headers, comments, and functions that explain how it chooses to hash a packet:
userland/lib/pfring.h:
int pfring_set_cluster(pfring *ring, u_int clusterId, cluster_type the_type);
kernel/pf_ring.c:
static int hash_pkt_cluster(ring_cluster_element *cluster_ptr,
struct pfring_pkthdr *hdr,
u_int16_t ip_id, u_int8_t first_fragment, u_int8_t second_fragment)
static inline u_int32_t hash_pkt_header(struct pfring_pkthdr * hdr, u_int32_t flags)
static inline u_int32_t hash_pkt(u_int16_t vlan_id, u_int8_t proto,
ip_addr host_peer_a, ip_addr host_peer_b,
u_int16_t port_peer_a, u_int16_t port_peer_b)
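To illustrate the effect, here is a toy sketch in Python (NOT the actual PF_RING hash): with 4-tuple clustering, every packet of a flow, in either direction, hashes to the same bucket, so a single elephant flow lands entirely on one snort instance no matter how much bandwidth it consumes.

```python
# Toy 4-tuple flow hash; NOT the real PF_RING implementation.
def four_tuple_bucket(src_ip, src_port, dst_ip, dst_port, n_buckets=4):
    # Combine the endpoints symmetrically so both directions of a
    # flow map to the same bucket (mimicking clustermode=4 behavior).
    key = hash(frozenset([(src_ip, src_port), (dst_ip, dst_port)]))
    return key % n_buckets

# Every packet of one large flow goes to the same bucket/instance,
# regardless of direction:
b = four_tuple_bucket("10.0.0.5", 50000, "192.168.1.10", 443)
assert four_tuple_bucket("192.168.1.10", 443, "10.0.0.5", 50000) == b
```

The point of the sketch: the hash input is the flow tuple, not the byte count, so load balancing is per-flow, not per-byte.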
From http://www.bro.org/bro-workshop-2011/solutions/logs/:
awk 'NR > 4 && $9 > 600' conn.log | sort -t$'\t' -k 9 -n
Thanks for the pointer.
Ross Warren
I believe that is just looking for long-lived connections (those established for longer than 600 seconds); I would find a netflow session generating over 200 Mbps of traffic to be abnormal. I think you would want to look at columns 10 and 11 (orig_bytes/resp_bytes) in conn.log to see who is eating up your bandwidth. Or, just look at /nsm/bro/logs/<date>/conn-summary.<time period>.log.gz for your top talkers.
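As a hedged sketch (Python, assuming the classic tab-separated Bro conn.log layout where orig_bytes/resp_bytes are columns 10 and 11, with "-" for missing values; the function name is mine), you could rank connections by total bytes like this:

```python
# Sketch: rank conn.log connections by total bytes transferred.
# Assumes the standard tab-separated Bro conn.log column order:
#   ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service
#   duration orig_bytes resp_bytes ...
# Missing values appear as "-", which we count as 0.
def top_talkers(path, n=10):
    rows = []
    with open(path) as f:
        for line in f:
            if line.startswith("#"):  # skip Bro header/metadata lines
                continue
            fields = line.rstrip("\n").split("\t")
            orig = int(fields[9]) if fields[9].isdigit() else 0
            resp = int(fields[10]) if fields[10].isdigit() else 0
            # fields[2]/fields[4] are id.orig_h / id.resp_h
            rows.append((orig + resp, fields[2], fields[4]))
    return sorted(rows, reverse=True)[:n]
```

Adjust the column indexes if your Bro version logs a different field order (check the #fields header line at the top of conn.log).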