Re: [security-onion] need help, my SO hangs at time


Michal Purzynski

Apr 17, 2013, 6:58:17 AM
to securit...@googlegroups.com
On 4/17/13 11:58 AM, jozzy tan wrote:
> At times my SO runs at maximum resource usage and everything hangs, so badly that I cannot even get access to it. It only returns to normal after a forced power cycle, but that wipes everything in the current Bro logs. Headache! Please advise, thanks.
>
We need more info to help you. What does the hang look like? Is the load
very high (the uptime command in a shell will tell you)? Does the machine
freeze completely? Do you see any kernel panic?

When it appears to be hung, can you ping it from some other host and
does it respond?

And please run sostat and share the output.

Doug Burks

Apr 17, 2013, 4:59:20 PM
to securit...@googlegroups.com
Hi Patrick,

Please send the output of the following (redacting sensitive info as necessary):
sudo sostat

Thanks,
Doug


On Wed, Apr 17, 2013 at 11:52 AM, Patrick Gardella
<patrick....@asburyseminary.edu> wrote:
> OK, more data now that the system is up again.
>
> Normally the console also froze up and I couldn't see any errors on the console monitor. This time it stayed up enough for me to see what was on the screen (but not log in).
>
> I saw a bunch of out of memory errors and processes being killed. The processes were tclsh, ruby, and prads.
>
> I am seeing one tclsh script taking huge amounts of CPU and RAM:
>
> 2487 ? R 37:03 tclsh /usr/bin/sguild -c /etc/nsm/securityonion/sguild.conf -a /etc/nsm/securityonion/autocat.conf -g /etc/nsm/securityonion/sguild.queries -A /etc/nsm/securityonion/sguild.access -C /etc/nsm/securityonion/certs
>
> It is using 4GB of RAM and 99% of CPU. That doesn't seem normal.
>
> It is an HP DL380 G5:
> Dual Intel(R) Xeon(R) CPU E5440 @ 2.83GHz
> 16 GB RAM
>
> Patrick
>
> On Wednesday, April 17, 2013 9:44:29 AM UTC-4, Patrick Gardella wrote:
>> I was logging in to see if others were having the same problem, and to ask around.
>>
>> I've had the same problem for quite a while and have been trying to gather diagnostics, which has been tough, since I have to force a reboot to log back in. I also need to reconfigure the sensors (to exactly what they were before) for it to gather traffic again. After the last time, I started capturing some statistics to a log. As soon as I reboot the server, I'll see what I found in those logs.
>>
>> In my case, I can ping the server and it responds very quickly. But the web interface and ssh just hang.
>>
>> I am running the latest (except for this morning's update) on a stock Ubuntu 12.04 installation. It is running on an HP G5 server with a 1TB HW RAID cluster internally. We normally have around 80 MBPS of traffic during peaks.
>>
>> This freeze happens about once a week for me.
>>
>> So more to follow...
>>
>> Patrick
> --
> You received this message because you are subscribed to the Google Groups "security-onion" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to security-onio...@googlegroups.com.
> To post to this group, send email to securit...@googlegroups.com.
> Visit this group at http://groups.google.com/group/security-onion?hl=en-US.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



--
Doug Burks
http://securityonion.blogspot.com

Heine Lysemose

Apr 18, 2013, 3:20:39 AM
to securit...@googlegroups.com
Hi Jozzy
 
To start with, you should categorize your events in Sguil. You have 2.5 million uncategorized events. Each time you restart the Sguil service or reboot your server, Sguil will load those 2.5 million events into memory.
 
Regards,
Lysemose


On Thu, Apr 18, 2013 at 3:53 AM, jozzy tan <jozz...@gmail.com> wrote:
> On Wednesday, 17 April 2013 17:58:35 UTC+8, jozzy tan wrote:
>> At times my SO runs at maximum resource usage and everything hangs, so badly that I cannot even get access to it. It only returns to normal after a forced power cycle, but that wipes everything in the current Bro logs. Headache! Please advise, thanks.
>
> Attached is the sostat output after the reboot; SO hung on me again.

Michal Purzynski

Apr 18, 2013, 5:45:14 AM
to securit...@googlegroups.com
On 4/18/13 3:53 AM, jozzy tan wrote:
> On Wednesday, 17 April 2013 17:58:35 UTC+8, jozzy tan wrote:
>> At times my SO runs at maximum resource usage and everything hangs, so badly that I cannot even get access to it. It only returns to normal after a forced power cycle, but that wipes everything in the current Bro logs. Headache! Please advise, thanks.
>
> Attached is the sostat output after the reboot; SO hung on me again.

What I can see:

You have only 4GB of memory and 3GB in swap. That's really bad.

up 8 min,  1 user,  load average: 22.00, 12.51, 5.57

A load of 22 means the system is seriously overloaded. The server you use (judging by the hostname) has a single i3 CPU, which is nice, but it has only 4 cores, right?
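
To put that load figure in perspective, here is a minimal Python sketch that divides the load average by the number of cores. The function names are made up for illustration, and the core count of 4 is only the guess from this post, not a confirmed figure. As a rough rule of thumb, a sustained value well above 1.0 per core means work is queuing up.

```python
import re

def parse_load_averages(uptime_line):
    """Return the 1-, 5- and 15-minute load averages from an uptime line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())

def load_per_core(load, cores):
    """Load average divided by the core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load / cores

# The uptime line from the sostat output; 4 cores is an assumption.
line = "up 8 min,  1 user,  load average: 22.00, 12.51, 5.57"
one_min, five_min, fifteen_min = parse_load_averages(line)
print("1-minute load per core: %.2f" % load_per_core(one_min, 4))
```

With the numbers above this prints a 1-minute load per core of 5.50, i.e. several runnable or I/O-blocked tasks waiting for every core.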

36.4%wa - something is waiting for disk here.
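
If you want to track the iowait share over time rather than reading it off top, one possible approach is to take two samples of the aggregate "cpu" line in /proc/stat and compare them. This is only a sketch: the function names are hypothetical and the two sample lines are invented so that the result resembles the 36.4% above.

```python
def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]

def iowait_percent(before, after):
    """Share of elapsed CPU ticks spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    # Field order: user nice system idle iowait irq softirq steal guest ...
    # Guest time is already included in user, so only the first 8 are summed.
    total = sum(deltas[:8])
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return 100.0 * deltas[4] / total

# Invented samples; on a live system read the first line of /proc/stat
# twice, a few seconds apart.
before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
after = "cpu  1100 0 600 8436 864 0 0 0 0 0"
print("iowait: %.1f%%" % iowait_percent(before, after))
```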

pkt_drop_percent is 30.703, as a result of the high system load

Tot Packets        : 7448093
Tot Pkt Lost       : 2773677
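
As a quick check of what those two counters imply, the small Python sketch below computes the loss ratio. It assumes that "Tot Pkt Lost" is counted out of "Tot Packets"; the sostat excerpt does not say so explicitly, and the helper name is made up for illustration.

```python
def loss_percent(total_packets, lost_packets):
    """Lost packets as a percentage of the total packets seen."""
    if total_packets <= 0:
        raise ValueError("total_packets must be positive")
    return 100.0 * lost_packets / total_packets

tot_packets = 7448093   # "Tot Packets" from the output above
tot_pkt_lost = 2773677  # "Tot Pkt Lost" from the output above
print("%.2f%% lost" % loss_percent(tot_packets, tot_pkt_lost))
```

Under that assumption the loss is about 37%. This is a different figure from the pkt_drop_percent above, presumably because the two numbers are reported by different sensor processes.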

1. As you've already been advised, you must categorize events. See https://code.google.com/p/security-onion/wiki/ManagingAlerts
2. You need more RAM, or the system will be moving things between swap and memory all the time.
3. Disable some unnecessary rule categories in pulledpork, and run rule-update.
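
For point 2, a rough way to see how much swap is actually in use is to compare SwapTotal and SwapFree in /proc/meminfo. The Python sketch below does that on a sample string. The sample values are invented to resemble the situation described in this thread, and the function names are hypothetical.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric values (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values

def swap_used_kb(meminfo):
    """Swap in use, computed as SwapTotal minus SwapFree (both in kB)."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]

# Invented sample; on a live Linux system pass open("/proc/meminfo").read().
SAMPLE = """MemTotal:        4194304 kB
MemFree:          102400 kB
SwapTotal:       4194304 kB
SwapFree:        1048576 kB
"""
info = parse_meminfo(SAMPLE)
print("RAM total: %.1f GiB" % (info["MemTotal"] / 1048576.0))
print("Swap used: %.1f GiB" % (swap_used_kb(info) / 1048576.0))
```

If the swap-used figure stays in the gigabytes while the sensor is running, the box is paging constantly, and adding RAM is likely to help more than any tuning.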

Patrick Gardella

Apr 22, 2013, 4:07:03 PM
to securit...@googlegroups.com
After making the various tweaks (syslog-ng and filtering several noisy signatures) we've not had any issues since.  So I think I'm good at this point.

Patrick


