I apologize in advance for any lack of knowledge on my part - I'm very much a Bro / Security Onion novice.
We're experiencing issues with Bro on three of our Security Onion sensors. We updated our sensors with soup two weeks ago and these issues showed up last week. Around the same time the issue started we had modified a few custom Bro scripts we installed - since then I have removed all our custom scripts from local.bro - including default Security Onion scripts.
Bro is very slow to start - any broctl command (start, stop, deploy, etc) takes about 5 minutes to return now, and frequently complains about lock files. Bro manager processes on these sensors are consuming very large amounts of RAM - I've seen it spike in broctl top up to 50 G of mem usage for a manager process. Bro is also dropping lots of packets.
This may be a Bro issue, but posting it on the Security Onion mailing list as it seems odd that the issue appeared on three sensors at the same time. I found a thread on the bro mailing regarding memory leaks in the bro manager process - I ran sudo broctl stop, broctl cleanup --all, broctl installl, and broctl deploy and the issue persists. I've also rebooted the sensors after stopping & cleaning bro.
As an additional troubleshooting step I replaced the /opt/bro directory on a bad sensor with one from a good sensor and rebooted, but replacing all the bro files didn't help. I'm at a loss on what else to try to troubleshoot this.
I don't see any indications of anything wrong in Bro's reporter.log or stderr.log.
sostat-redacted from one of the sensors attached - I had to run sostat overnight as it took several hours to return any output.
Here are some odd bro outputs I've received recently:
broctl top:
Name Type Host Pid Proc VSize Rss Cpu Cmd
manager manager localhost 6188 parent 58G 39G 0% bro
manager manager localhost 6242 child 148M 29M 0% bro
proxy proxy localhost 6429 parent 157M 59M 0% bro
proxy proxy localhost 6494 child 124M 22M 0% bro
sudo broctl netstats:
Error: database is locked: /nsm/bro/spool/state.db
sudo broctl netstats:
redacted-sensorname-eth1-1: 1480947389.611317 recvd=199313745 dropped=8562731 link=199313745
redacted-sensorname-eth1-10: 1480947434.653012 recvd=159251363 dropped=22733445 link=159251363
redacted-sensorname-eth1-2: 1480947434.852971 recvd=163893206 dropped=11959801 link=163893206
redacted-sensorname-eth1-3: 1480947435.057123 recvd=201135451 dropped=14753539 link=201135451
redacted-sensorname-eth1-4: 1480947435.257075 recvd=184189907 dropped=41720479 link=184189907
redacted-sensorname-eth1-5: 1480947435.457000 recvd=170700005 dropped=10630323 link=170700005
redacted-sensorname-eth1-6: <error: cannot connect to 127.0.0.1:47769>
redacted-sensorname-eth1-7: <error: cannot connect to 127.0.0.1:47770>
redacted-sensorname-eth1-8: <error: cannot connect to 127.0.0.1:47771>
redacted-sensorname-eth1-9: <error: cannot connect to 127.0.0.1:47772>
user@redacted-sensorname:~$ sudo broctl
Error: database is locked
user@redacted-sensorname:~$ sudo broctl
Error: database is locked: /nsm/bro/spool/state.db
user@redacted-sensorname:~$ sudo broctl stop
Error: error running post-terminate for soflokydlnetf01-eth1-10: Host timeout
stopping proxy ...
stopping manager ...
manager did not terminate ... killing ...
Error: 'str' object has no attribute 'type'
Thanks in advance for any advice on this issue!
James Gordon
James,
Have you tried running it, including the default Security Onion scripts? It looks like Bro is eating up a ton of CPU on your machine(s):
=========================================================================
CPU Usage
=========================================================================
Load average for the last 1, 5, and 15 minutes:
155.04 164.99 169.40
Processing units: 24
I would try killing the current processes and run:
sudo broctl diag
Thanks,
Wes
Wes,
broctl diag output attached. I'm not sure what you mean in your previous reply "Have you tried running it, including the default Security Onion scripts?" I've atttempted started / stopping / restarting Bro using the normal Security Onion scripts (nsm_sensor_ps-restart --only-bro. etc) but this has the same behavior as broctl.
Thanks,
James Gordon
James,
I was just making sure you had the default SO scripts back in place -- I thought you meant you removed them.
Did you perform the broctl diag after bro was restarted? If not,
please provide the output for broctl diag after you have just restarted Bro.
Thanks,
Wes
Wes & all,
It looks like Bro is taking the impact on my boxes being way overloaded. As noted in my original post these sensors were updated ~ 3 weeks ago - I think something went very wrong during the upgrade process. I have a ton of cron jobs for various Security Onion processes kicking off and never completing - one of my servers had a load average of 150 earlier today. Most of the processes appear to be related to ELSA, but there's a lot of chown processes that are running as well. If these look like they're all ELSA - if there's not an easy fix to this, I feel like ELSA is underutilized in our environment and I wouldn't mind disabling it. Last week while troubleshooting I modified /etc/nsm/securityonionion.conf to ELSA=NO and rebooted but we still had the issueu, so I added it back - so I don't think simply turning ELSA off this way will fix it.
We rebooted a sensor, immediately stopped cron, and Bro is running great and without issues.
load average: 47.49, 47.51, 48.23
root@sensor:~# ps -ef | grep cron | wc -l
1047
root@sensor:~# ps -ef | grep chown | wc -l
40
root 31651 31625 0 09:08 ? 00:00:00 perl /opt/elsa/web/cron.pl -c /etc/elsa_web.conf
root 31661 31655 0 09:09 ? 00:00:00 /bin/sh -c sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh > /dev/null 2>&1
root 31662 31661 0 09:09 ? 00:00:00 sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh
root 31695 31662 0 09:09 ? 00:00:00 perl /opt/elsa/web/cron.pl -c /etc/elsa_web.conf
root 31883 31877 0 09:10 ? 00:00:00 /bin/sh -c sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh > /dev/null 2>&1
root 31887 31883 0 09:10 ? 00:00:00 sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh
root 31911 31887 0 09:10 ? 00:00:00 perl /opt/elsa/web/cron.pl -c /etc/elsa_web.conf
root 32020 32012 0 09:11 ? 00:00:00 /bin/sh -c sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh > /dev/null 2>&1
root 32023 32020 0 09:11 ? 00:00:00 sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh
root 32045 32023 0 09:11 ? 00:00:00 perl /opt/elsa/web/cron.pl -c /etc/elsa_web.conf
root 32069 32060 0 09:12 ? 00:00:00 /bin/sh -c sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh > /dev/null 2>&1
root 32072 32069 0 09:12 ? 00:00:00 sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh
root 32084 32072 0 09:12 ? 00:00:00 perl /opt/elsa/web/cron.pl -c /etc/elsa_web.conf
root 32198 32189 0 09:13 ? 00:00:00 /bin/sh -c sh /opt/elsa/contrib/securityonion/contrib/securityonion-elsa-cron.sh > /dev/null 2>&1
Thanks,
James Gordon
James,
What is the output of the following?
sudo mysql --defaults-file=/etc/mysql/debian.cnf -Delsa_web -e 'select * from saved_results
Thanks,
Wes
--
Follow Security Onion on Twitter!
https://twitter.com/securityonion
---
You received this message because you are subscribed to the Google Groups "security-onion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to security-onion+unsubscribe@googlegroups.com.
To post to this group, send email to security-onion@googlegroups.com.
Visit this group at https://groups.google.com/group/security-onion.
For more options, visit https://groups.google.com/d/optout.