The server running security-onion should be beefy enough for this traffic:
Specs:
Intel(R) Core(TM) i5-6400 CPU @ 2.70GHz (4 cores)
8TB WD Red Drive
40+ Gigs of ram
Intel I350T2V2 Server class dual ethernet pci-e nic
The traffic going to the gateway's internal nic is spread across two VLANs.
Currently the traffic leaves port 19 of our switch into a dedicated port mirroring device (Dualcom DCGS-2005) port 1. Port 2 then connects to the gateway. Port 5 goes to eth1 of the security-onion server.
I'm seeing normal traffic at the gateway's internal interface, however I'm not seeing the same traffic on eth1 of security-onion.
For instance, if I filter out ARP, multicast, and broadcast traffic, I only see VLAN 50 traffic:
tcpdump -i eth1 -nn not arp and not multicast and not broadcast
The only traffic I see coming from VLAN 1 is broadcast/arp.
Any idea what's going on, or why I'd be seeing dropped packets? The server has been up for an hour, so apparently only 177,271 packets per hour are hitting the interface.
Here are the relevant cuts from sostat-redacted:
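For a sense of scale, the eth1 counters quoted in this post can be turned into an average packet rate and a drop percentage. This is only back-of-the-envelope arithmetic: the 3600-second uptime is an assumption (the post just says "an hour"), and the helper names rate and drop_pct are made up for this sketch.

```shell
#!/bin/sh
# Figures copied from the eth1 counters in the sostat output of this post.
rx_packets=177271
rx_dropped=110
uptime_seconds=3600   # assumption: "up for an hour" taken as exactly 3600 s

# rate PACKETS SECONDS -> average packets per second, one decimal place
rate() {
    awk -v p="$1" -v s="$2" 'BEGIN { printf "%.1f\n", p / s }'
}

# drop_pct DROPPED RECEIVED -> dropped packets as a percentage of RX packets
drop_pct() {
    awk -v d="$1" -v r="$2" 'BEGIN { printf "%.3f\n", 100 * d / r }'
}

echo "average rate: $(rate "$rx_packets" "$uptime_seconds") packets/s"
echo "drop ratio:   $(drop_pct "$rx_dropped" "$rx_packets") %"
```

Roughly 49 packets per second is very little for a gateway link, which fits the suspicion that most of the expected traffic never reaches eth1.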
=========================================================================
Service Status
=========================================================================
Status: securityonion
* SO-user server[ OK ]
Status: HIDS
* ossec_agent (SO-user)[ OK ]
Status: Bro
Getting process status ...
Getting peer status ...
Name Type Host Status Pid Peers Started
manager manager localhost running 4827 2 02 Sep 21:55:06
proxy proxy localhost running 5013 2 02 Sep 21:55:08
SO-server-eth1-1 worker localhost running 5203 2 02 Sep 21:55:10
Status: SO-server-eth1
* netsniff-ng (full packet data)[ OK ]
* pcap_agent (SO-user)[ OK ]
* snort_agent (SO-user)[ OK ]
* suricata (alert data)[ OK ]
* barnyard2 (spooler, unified2 format)[ OK ]
=========================================================================
Interface Status
=========================================================================
eth0 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
inet addr:X.X.X.X Bcast:X.X.X.X Mask:X.X.X.X
inet6 addr: X.X.X.X/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:20387 errors:0 dropped:0 overruns:0 frame:0
TX packets:8002 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14036813 (14.0 MB) TX bytes:3066765 (3.0 MB)
Memory:df700000-df7fffff
eth1 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:177271 errors:0 dropped:110 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:86854442 (86.8 MB) TX bytes:0 (0.0 B)
Memory:df600000-df6fffff
lo Link encap:Local Loopback
inet addr:X.X.X.X Mask:X.X.X.X
inet6 addr: X.X.X.X/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:20519 errors:0 dropped:0 overruns:0 frame:0
TX packets:20519 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:9356872 (9.3 MB) TX bytes:9356872 (9.3 MB)
=========================================================================
Disk Usage
=========================================================================
Filesystem Size Used Avail Use% Mounted on
udev 20G 4.0K 20G 1% /dev
tmpfs 4.0G 1.7M 4.0G 1% /run
/dev/dm-1 7.2T 5.0G 6.9T 1% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
none 5.0M 0 5.0M 0% /run/lock
none 20G 92K 20G 1% /run/shm
none 100M 28K 100M 1% /run/user
/dev/sda2 237M 88M 137M 39% /boot
=========================================================================
CPU Usage
=========================================================================
Load average for the last 1, 5, and 15 minutes:
0.43 0.28 0.25
Processing units: 4
If load average is higher than processing units,
then tune until load average is lower than processing units.
top - 22:53:46 up 1:01, 2 users, load average: 0.43, 0.28, 0.25
Tasks: 279 total, 1 running, 278 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.7 us, 2.2 sy, 0.0 ni, 92.8 id, 1.2 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 41015212 total, 4055236 used, 36959976 free, 87280 buffers
KiB Swap: 41746428 total, 0 used, 41746428 free. 1122356 cached Mem
%CPU %MEM COMMAND
11.0 1.4 /opt/bro/bin/bro -i eth1 -U .status -p broctl -p broctl-live -p local -p SO-server-eth1-1 local.bro broctl base/frameworks/cluster local-worker.bro broctl/auto
2.3 0.1 /opt/bro/bin/bro -U .status -p broctl -p broctl-live -p local -p manager local.bro broctl base/frameworks/cluster local-manager.bro broctl/auto
2.1 0.1 /opt/bro/bin/bro -U .status -p broctl -p broctl-live -p local -p proxy local.bro broctl base/frameworks/cluster local-proxy broctl/auto
2.1 1.8 suricata --user SO-user --group SO-user -c /etc/nsm/SO-server-eth1/suricata.yaml --pfring=eth1 -l /nsm/sensor_data/SO-server-eth1
1.3 0.0 barnyard2 -c /etc/nsm/SO-server-eth1/barnyard2.conf -u SO-user -g SO-user -d /nsm/sensor_data/SO-server-eth1 -f snort.unified2 -w /etc/nsm/SO-server-eth1/barnyard2.waldo -i 1 -U
0.3 0.0 /var/ossec/bin/ossec-syscheckd
=========================================================================
PF_RING
=========================================================================
PF_RING Version : 6.4.1 (unknown)
Total rings : 2
Standard (non ZC) Options
Ring slots : 65534
Slot version : 16
Capture TX : Yes [RX+TX]
IP Defragment : No
Socket Mode : Standard
Total plugins : 0
Cluster Fragment Queue : 0
Cluster Fragment Discard : 0
lspci:
00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 07)
00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) (rev 07)
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-H Thermal subsystem (rev 31)
00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
00:17.0 SATA controller: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] (rev 31)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #1 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD Audio (rev 31)
00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V (rev 31)
01:00.0 VGA compatible controller: NVIDIA Corporation GF110 [GeForce GTX 580] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GF110 High Definition Audio Controller (rev a1)
02:00.0 Network controller: Broadcom Corporation BCM4352 802.11ac Wireless Network Adapter (rev 03)
03:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
I'm not using the I219-V Ethernet device, or the Wireless Network Adapter.
Thanks for any input!
Ian,
Please provide the full output of sostat-redacted, attaching as a text file, or using a service like Pastebin.com.
Thanks,
Wes
Ian,
It looks like the drops may be happening at the NIC, and not necessarily in the applications. Have you considered using a different NIC?
Maybe also take a look at:
Thanks,
Wes
http://ark.intel.com/products/84804/Intel-Ethernet-Server-Adapter-I350-T2V2
modinfo igb
filename: /lib/modules/3.19.0-68-generic/kernel/drivers/net/ethernet/intel/igb/igb.ko
version: 5.2.15-k
According to this intel download link, 5.3.5.3 (5/31/2016) is the latest driver:
https://downloadcenter.intel.com/download/13663
Oddly, ethtool reports no dropped packets, while ifconfig does.
ifconfig eth1:
eth1 Link encap:Ethernet HWaddr a0:36:9f:40:50:dd
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:2550934 errors:0 dropped:9909 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:662624510 (662.6 MB) TX bytes:0 (0.0 B)
Memory:df600000-df6fffff
ethtool -S eth1|grep -i "drop"
dropped_smbus: 0
tx_dropped: 0
rx_queue_0_drops: 0
rx_queue_1_drops: 0
rx_queue_2_drops: 0
rx_queue_3_drops: 0
ethtool -S eth1|grep -v ": 0"
NIC statistics:
rx_packets: 2563334
rx_bytes: 683407822
rx_broadcast: 1018520
rx_multicast: 425560
multicast: 425560
rx_long_byte_count: 683407822
rx_queue_0_packets: 1490412
rx_queue_0_bytes: 151559261
rx_queue_1_packets: 148330
rx_queue_1_bytes: 41616158
rx_queue_2_packets: 285947
rx_queue_2_bytes: 174532168
rx_queue_3_packets: 638645
rx_queue_3_bytes: 295977663
Yet:
cat /sys/class/net/eth1/statistics/rx_dropped
9971
weird.
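One possible explanation for the mismatch, offered as an assumption rather than a diagnosis: ethtool -S shows counters kept by the driver and the NIC, while the sysfs rx_dropped file is the kernel's per-interface counter, which can also include frames the network stack discarded after the driver handed them up. A small sketch for watching whether the sysfs counter is still growing follows; the helper name counter_delta and the 10-second interval are arbitrary choices for illustration.

```shell
#!/bin/sh
# counter_delta FILE SECONDS -> growth of a numeric counter file over SECONDS
counter_delta() {
    before=$(cat "$1")
    sleep "$2"
    after=$(cat "$1")
    echo $((after - before))
}

# Interface name defaults to eth1, the sniffing interface in this thread.
iface=${1:-eth1}
stat=/sys/class/net/$iface/statistics/rx_dropped

if [ -r "$stat" ]; then
    echo "$iface rx_dropped grew by $(counter_delta "$stat" 10) in 10 s"
else
    echo "no such counter: $stat" >&2
fi
```

Running it before and after a change (a driver update, an MTU change, loading a module) gives a quick check on whether that change affected the drop counter at all.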
A comment in the thread you linked to mentioned that MTU might be a cause. I remember Doug mentioning that we don't need to adjust the MTU anymore, but for fun, I cranked it up to 9000. It didn't help.
I thought that maybe I needed to disable the offloading stuff like tso/rso, but apparently this was done for me. However, it looks like the driver is leaving something on:
ethtool --show-offload eth1|grep -v "off"
Features for eth1:
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
rx-vlan-filter being fixed on may be why I'm unable to see all of the VLAN traffic, and it may also be the reason behind the dropped count. Maybe I need to set up additional alias interfaces to get the driver to accept the other VLAN packets. I'll fiddle around.
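A rough sketch of what setting up per-VLAN interfaces could look like with iproute2 and the kernel's 8021q module. The script only prints the commands instead of running them, since they need root and change the interface configuration. The helper name vlan_cmds is hypothetical, the VLAN IDs 1 and 50 are the ones mentioned in this thread, and whether VLAN 1 is actually tagged on this link is an assumption (it is often the untagged native VLAN).

```shell
#!/bin/sh
# vlan_cmds PARENT VLAN_ID... -> print the setup commands, one per line,
# without executing anything (dry run).
vlan_cmds() {
    parent=$1
    shift
    # Kernel module that provides 802.1Q VLAN support.
    echo "modprobe 8021q"
    for id in "$@"; do
        # One sub-interface per VLAN ID, named parent.id (e.g. eth1.50).
        echo "ip link add link $parent name $parent.$id type vlan id $id"
        echo "ip link set dev $parent.$id up"
    done
}

vlan_cmds eth1 1 50
```

After reviewing the printed commands, they can be run by hand as root, or piped into a root shell.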
dmesg|egrep "eth1|igb"
[ 0.779987] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.2.15-k
[ 0.779989] igb: Copyright (c) 2007-2014 Intel Corporation.
[ 0.875313] igb 0000:03:00.0: added PHC on eth0
[ 0.875315] igb 0000:03:00.0: Intel(R) Gigabit Ethernet Network Connection
[ 0.875317] igb 0000:03:00.0: eth0: (PCIe:5.0Gb/s:Width x4) a0:36:9f:40:50:dc
[ 0.875703] igb 0000:03:00.0: eth0: PBA No: G15138-002
[ 0.875704] igb 0000:03:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[ 0.973911] igb 0000:03:00.1: added PHC on eth1
[ 0.973913] igb 0000:03:00.1: Intel(R) Gigabit Ethernet Network Connection
[ 0.973914] igb 0000:03:00.1: eth1: (PCIe:5.0Gb/s:Width x4) a0:36:9f:40:50:dd
[ 0.974221] igb 0000:03:00.1: eth1: PBA No: G15138-002
[ 0.974222] igb 0000:03:00.1: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[ 99.325316] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[ 99.348190] device eth1 entered promiscuous mode
[ 102.819901] igb 0000:03:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 103.604591] igb 0000:03:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 430.477266] igb 0000:03:00.1 eth1: igb: eth1 NIC Link is Down
[ 442.247235] igb 0000:03:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[271275.292795] igb 0000:03:00.1: changing MTU from 1500 to 9000
[271279.367041] igb 0000:03:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
The only thing left now is to figure out why I'm not seeing all of the traffic from VLAN 1. It could be a problem with the Dualcom DCGS-2005. When I get back into the office Wed, I'll plug my laptop into port 5 and see if the traffic is any different. I have an extra Dualcom too, so I can swap it out and see what happens. If that doesn't work, worst case I can just turn on port mirroring on the switch instead.
Thanks, Ian!
Just wanted to follow up. I turned on port mirroring at the switch and plugged that into the interface I've been having trouble with. The traffic was identical to what I saw using the Dualcom. This means that the problem has to be with the interface or drivers.
I've wiped the box and am installing XenServer 7 on it now. I plan on doing some testing from the host itself (no guest vm). Assuming I see normal traffic, I'll just load securityonion as a vm to get around this hardware issue.
So, wiped security onion from the server and loaded XenServer 7. Saw the same traffic on the sniffer interface in XenServer, so I figured the nic was just not going to work with linux.
I did some searches for my particular nic and vlans to see if others saw similar stuff. There were a number of posts of people who were unable to put the nic into promiscuous mode in a vm -- something about a hardware limitation, though others hacked up the driver to apparently make it work, but I wasn't going to go there. Figured it was so strange that this fancy pants nic was not going to work, but what the heck -- ordered a $20 rtl nic instead from Amazon.
Just to confirm, I unplugged the cable and plugged it into a macbook to sniff the traffic with wireshark... I saw the same damn traffic.
Went to lunch flabbergasted -- could this be an intel nic conspiracy??
After lunch I went back to our gateway server to verify I was grabbing the correct traffic -- it had a two port nic -- one for internal and one for external -- I certainly wasn't seeing external traffic. Ugh... then I noticed on a different part of the server a third network cable.
Yeah -- /that/ was the correct cable. The other one I had been using? Who the heck knows.
Time to break out the label maker. :)
At least there was one legit thing found from all this work -- loading that 8021q driver did stop the dropped packets.