ELSA: Where are all the logs coming from?

Hurgh

Feb 17, 2014, 6:23:54 PM
to securit...@googlegroups.com
Hi All,

Apologies for the silly question, but I am trying to work out where ELSA is getting all its logs from.

I have had my system crash once due to a full disk. After it was rebooted, it appears that most of the disk space is used by the "/nsm/elsa/data/elsa/tmp/buffers" directory.

According to the elsa config file, this is the dir where logs are stored before being processed.

I had a quick look at some of the files in the dir, and I noticed that there are a huge number of lines in these files with Windows-related details, along with lines mentioning Argus and other things.

Now, the Argus lines I can understand, as security-onion runs Argus, but the Windows entries I am confused about.

The only thing that makes sense to me is that these log entries are being pulled off the monitored interfaces, as this box is not a Windows box and I have not configured anything to specifically send log data to it.

Also, the sheer volume of the logs is surprising: the buffers dir mentioned above used up 1.6 TB of the 1.8 TB /nsm mount in a matter of days at most.

So the simple question is: where does all the log data for ELSA come from?

Also, is there a way to make sure the /nsm/elsa/data/elsa/tmp/buffers directory does not fill up the disk again?
I have the cleanup setting set to 80%, so I would have thought that would keep the size down.

Thanks

Allan

Matt Gregory

Feb 17, 2014, 7:08:56 PM
to securit...@googlegroups.com
Hi Allan,

ELSA indexes logs from several sources, depending on what you enabled during setup:

- Bro logs, stored at /nsm/bro/logs
- Snort or Suricata alerts
- OSSEC logs and alerts (from the SO box itself, unless you installed the OSSEC agent on other machines and connected them to SO)

Full PCAP data, which is not indexed in ELSA, takes up the most disk space of any single data source, much more than the logs listed above. How much bandwidth are you monitoring?

Please send the output of sostat-redacted (you may have to manually redact some sensitive info) and we can try to diagnose the disk usage issue.

Matt



--
You received this message because you are subscribed to the Google Groups "security-onion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to security-onio...@googlegroups.com.
To post to this group, send email to securit...@googlegroups.com.
Visit this group at http://groups.google.com/group/security-onion.
For more options, visit https://groups.google.com/groups/opt_out.

Hurgh

Feb 17, 2014, 8:31:59 PM
to securit...@googlegroups.com
Hi Matt,

Thanks for the response.

I didn't enable PCAP for the reasons you mentioned.

I am not able to upload a full sostat, but I can provide some of the info here:

Just for reference, the system uptime is:
01:26:03 up 2:52, 1 user, load average: 8.58, 9.04, 9.08

(as in, it was re-booted a few hours ago)

Status: Bro
Name                  Type     Host     Status   Pid   Peers  Started
manager               manager  X.X.X.X  running  4479  9      17 Feb 22:35:04
proxy                 proxy    X.X.X.X  running  4746  9      17 Feb 22:35:08
<ServerName>-eth10-1  worker   X.X.X.X  running  5034  2      17 Feb 22:35:23
<ServerName>-eth11-1  worker   X.X.X.X  running  5206  2      17 Feb 22:35:27
<ServerName>-eth3-1   worker   X.X.X.X  running  5591  2      17 Feb 22:35:59
<ServerName>-eth5-1   worker   X.X.X.X  running  6476  2      17 Feb 22:37:59
<ServerName>-eth6-1   worker   X.X.X.X  running  6673  2      17 Feb 22:38:03
<ServerName>-eth7-1   worker   X.X.X.X  running  6941  2      17 Feb 22:38:07
<ServerName>-eth8-1   worker   X.X.X.X  running  7152  2      17 Feb 22:38:11
<ServerName>-eth9-1   worker   X.X.X.X  running  7444  2      17 Feb 22:38:14
Status: <ServerName>-eth10
* snort_agent-1 (sguil)[ OK ]
* snort-1 (alert data)[ OK ]
* barnyard2-1 (spooler, unified2 format)[ OK ]
* argus[ OK ]
Status: <ServerName>-eth11
* snort_agent-1 (sguil)[ OK ]
* snort-1 (alert data)[ OK ]
* barnyard2-1 (spooler, unified2 format)[ OK ]
* argus[ OK ]
Status: <ServerName>-eth3
* snort_agent-1 (sguil)[ OK ]
* snort-1 (alert data)[ OK ]
* barnyard2-1 (spooler, unified2 format)[ OK ]
* argus[ OK ]
Status: <ServerName>-eth5
* snort_agent-1 (sguil)[ OK ]
* snort-1 (alert data)[ OK ]
* barnyard2-1 (spooler, unified2 format)[ OK ]
* argus[ OK ]
Status: <ServerName>-eth6
* snort_agent-1 (sguil)[ OK ]
* snort-1 (alert data)[ OK ]
* barnyard2-1 (spooler, unified2 format)[ OK ]
* argus[ OK ]
Status: <ServerName>-eth7
* snort_agent-1 (sguil)[ OK ]
* snort-1 (alert data)[ OK ]
* barnyard2-1 (spooler, unified2 format)[ OK ]
* argus[ OK ]
Status: <ServerName>-eth8
* snort_agent-1 (sguil)[ OK ]
* snort-1 (alert data)[ OK ]
* barnyard2-1 (spooler, unified2 format)[ OK ]
* argus[ OK ]
Status: <ServerName>-eth9
* snort_agent-1 (sguil)[ OK ]
* snort-1 (alert data)[ OK ]
* barnyard2-1 (spooler, unified2 format)[ OK ]
* argus[ OK ]

=========================================================================
Interface Status
=========================================================================
eth0 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:22703 errors:0 dropped:0 overruns:0 frame:0
TX packets:34817 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2288598 (2.2 MB) TX bytes:35712760 (35.7 MB)
eth3 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:2641109 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:928110707 (928.1 MB) TX bytes:0 (0.0 B)
eth5 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:77747746 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:46178739072 (46.1 GB) TX bytes:70 (70.0 B)
eth6 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:357744409 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:143587704653 (143.5 GB) TX bytes:70 (70.0 B)
eth7 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:205719 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:16617830 (16.6 MB) TX bytes:70 (70.0 B)
eth8 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:106426 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8583576 (8.5 MB) TX bytes:70 (70.0 B)
eth9 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:61184647 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:24964708845 (24.9 GB) TX bytes:70 (70.0 B)
eth10 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:9745840 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4284400528 (4.2 GB) TX bytes:70 (70.0 B)
eth11 Link encap:Ethernet HWaddr MM:MM:MM:MM:MM:MM
UP BROADCAST RUNNING NOARP PROMISC MULTICAST MTU:1500 Metric:1
RX packets:18734816 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:10232412714 (10.2 GB) TX bytes:70 (70.0 B)


=========================================================================
Disk Usage
=========================================================================
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1   115G  6.5G  103G   6%    /
udev        9.8G  4.0K  9.8G   1%    /dev
tmpfs       4.0G  908K  4.0G   1%    /run
none        5.0M  0     5.0M   0%    /run/lock
none        9.8G  0     9.8G   0%    /run/shm
/dev/sdb1   1.8T  1.2T  514G   71%   /nsm


Other Info:
# ls /nsm/elsa/data/elsa/tmp/buffers | wc -l
9737

# du -hs /nsm/elsa/data/elsa/*
132M /nsm/elsa/data/elsa/log
24G /nsm/elsa/data/elsa/mysql
1.2T /nsm/elsa/data/elsa/tmp

# sudo du -hs /nsm/*
5.2G bro
1.2T elsa
16K lost+found
995M sensor_data
1.3M server_data


So as you can see, the ELSA tmp directory is taking up the space.

I am assuming that there are just too many logs for the ELSA process to keep up with, and that the backlog eventually overwhelms the disk.

Thanks

Allan

Doug Burks

Feb 17, 2014, 9:42:19 PM
to securit...@googlegroups.com
On Mon, Feb 17, 2014 at 8:31 PM, Hurgh <hu...@hurgh.org> wrote:
> # ls /nsm/elsa/data/elsa/tmp/buffers | wc -l
> 9737

This is abnormally high and most likely means that logs are no longer
being processed. A possible culprit is a MySQL table marked as
crashed. Have you had any ungraceful shutdowns?

Please try the following:
sudo mysqlcheck -A
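
The `ls ... | wc -l` check quoted above can be scripted so that a growing backlog is noticed before the disk fills. This is only a rough sketch: the function names and the default limit of 1000 files are assumptions made for illustration, not values taken from ELSA or Security Onion.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many buffer files are waiting
# and warn once the count passes a limit (1000 is an arbitrary default).

count_buffers() {
    # Count regular files directly inside the given directory.
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

check_backlog() {
    local dir="$1"
    local limit="${2:-1000}"
    local n
    n=$(count_buffers "$dir")
    if [ "$n" -gt "$limit" ]; then
        echo "WARNING: $n buffer files in $dir (limit $limit)"
        return 1
    fi
    echo "OK: $n buffer files in $dir"
}

# Example: check_backlog /nsm/elsa/data/elsa/tmp/buffers 1000
```

Run from cron, a non-zero exit status from `check_backlog` signals that the backlog is over the chosen limit.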


--
Doug Burks

Hurgh

Feb 18, 2014, 7:34:49 PM
to securit...@googlegroups.com
Hi Doug,

On Tuesday, 18 February 2014 13:42:19 UTC+11, Doug Burks wrote:

> > # ls /nsm/elsa/data/elsa/tmp/buffers | wc -l
> > 9737
>
> This is abnormally high and most likely means that logs are no longer
> being processed. A possible culprit is a MySQL table marked as
> crashed. Have you had any ungraceful shutdowns?

Currently the only ungraceful shutdown was yesterday (or the day before).
I came in and found that the server was unresponsive. I loaded the console and saw that it had crashed, so I forced a reboot and then found that /nsm was full. I manually removed about 500 GB of the logs from the ELSA tmp directory and then posted the question.

I am not sure if something stopped working, causing the logs to stop being processed, or if I just had too many logs at once and they filled up the disk before they could all be processed.
I suspect the former, but have no easy way to tell.

> Please try the following:
> sudo mysqlcheck -A

Everything responds "OK" other than this:

syslog_data.syslogs_archive_1
Error : Can't find file: 'syslogs_archive_1' (errno: 2)
status : Operation failed
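
With many tables, the one failing entry is easy to miss among the "OK" lines. The filter sketched below prints only the tables that were not reported as OK. It assumes the output layout seen in this thread (a healthy table has OK on the same line, while a problem table name stands alone and is followed by detail lines), so it will also list tables that only carry a note or warning. `failed_tables` is a name made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical filter for `mysqlcheck -A` output read from stdin.

failed_tables() {
    # Healthy:  "<db>.<table>   OK"   -> two fields, skipped.
    # Problem:  "<db>.<table>"        -> one field, printed.
    # Detail lines ("Error : ...", "status : ...") have several fields
    # and are skipped as well.
    awk 'NF == 1 && $1 ~ /^[^.:]+\.[^.:]+$/ { print $1 }'
}

# Example: sudo mysqlcheck -A | failed_tables
```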

Regards

Allan

Doug Burks

Feb 19, 2014, 8:28:17 AM
to securit...@googlegroups.com
On Tue, Feb 18, 2014 at 7:34 PM, Hurgh <hu...@hurgh.org> wrote:
> Everything responds "OK" other than this:
>
> syslog_data.syslogs_archive_1
> Error : Can't find file: 'syslogs_archive_1' (errno: 2)
> status : Operation failed

I think ELSA should try to re-create syslogs_archive_1 automatically
on startup, so please try rebooting.


--
Doug Burks

Hurgh

Feb 19, 2014, 5:22:47 PM
to securit...@googlegroups.com
Just tried a re-boot but no luck:

syslog_data.syslogs_archive_1
Error : Can't find file: 'syslogs_archive_1' (errno: 2)
status : Operation failed

I will have a look over the elsa doco and see if there is a way to sort it.


Allan

Hurgh

Feb 19, 2014, 8:40:11 PM
to securit...@googlegroups.com
On Thursday, 20 February 2014 09:22:47 UTC+11, Hurgh wrote:
> Just tried a re-boot but no luck:
>
> syslog_data.syslogs_archive_1
> Error : Can't find file: 'syslogs_archive_1' (errno: 2)
> status : Operation failed
>
> I will have a look over the elsa doco and see if there is a way to sort it.
>

Ok, so I have done some reading on the ELSA mail group and found a solution.

The first suggestion was to try to repair the table:

mysql -uroot syslog_data -e "REPAIR TABLE syslogs_archive_1"

This gave me an error saying that it could not find the table, which I confirmed with a "show tables;" command in the syslog_data database.

From there, I followed the advice to drop the table (even though it didn't exist) and remove it from the syslog.tables metadata table:

mysql -uroot syslog_data -e "DROP TABLE syslog_data.syslogs_archive_1"
mysql -uroot syslog_data -e "DELETE FROM syslog.tables WHERE table_name='syslog_data.syslogs_archive_1'"

I then restarted syslog-ng (ELSA restarts with a syslog-ng restart and attempts to re-create tables) and checked mysql again, but the issue was still there.

I did some more reading and there was a suggestion to remove the actual files used for the tables.
I did a quick "ls /nsm/elsa/data/elsa/mysql/" and sure enough there were a couple of files called syslogs_archive_1*

I promptly re-ran the mysql DROP TABLE and DELETE FROM commands, then removed the physical files and restarted syslog-ng, and it seems to be processing logs again.

The file count has dropped by over 100 in the past 3 hours, so clearing the backlog will take a long time, but it is working.

Final Command sequence:

mysql -uroot syslog_data -e "DROP TABLE syslog_data.syslogs_archive_1"
mysql -uroot syslog_data -e "DELETE FROM syslog.tables WHERE table_name='syslog_data.syslogs_archive_1'"

sudo rm /nsm/elsa/data/elsa/mysql/syslogs_archive_1*

sudo service syslog-ng restart

Obviously, you will lose all the data that the table stores, but for me it was that or re-install Security Onion.
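
The same sequence can be wrapped in a function that defaults to a dry run, which makes it easier to review exactly what will be dropped and deleted before anything happens. This is a hedged sketch, not a tested ELSA tool: `run`, `drop_broken_archive` and the `DRY_RUN` variable are names invented here, and the default data directory is simply the path from this thread. One deliberate difference from the commands above: files are matched as `<table>.*` rather than `<table>*`, so that a longer table name sharing the same prefix (for example syslogs_archive_10000101) is not removed by accident.

```shell
#!/usr/bin/env bash
# Sketch only. DRY_RUN defaults to 1: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (this destroys the table's data).

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

drop_broken_archive() {
    local table="$1"
    local datadir="${2:-/nsm/elsa/data/elsa/mysql}"
    local f

    # Only allow plain identifiers, since the name is pasted into SQL.
    case "$table" in
        ''|*[!A-Za-z0-9_]*)
            echo "refusing unexpected table name: '$table'" >&2
            return 2
            ;;
    esac

    run mysql -uroot syslog_data -e "DROP TABLE syslog_data.${table}"
    run mysql -uroot syslog_data -e "DELETE FROM syslog.tables WHERE table_name='syslog_data.${table}'"

    # Match "<table>.<ext>" only, so a table with a longer name
    # that starts with the same prefix is left alone.
    for f in "$datadir/$table".*; do
        [ -e "$f" ] || continue
        run sudo rm -- "$f"
    done

    run sudo service syslog-ng restart
}

# Example (dry run): drop_broken_archive syslogs_archive_1
```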

Hope that helps someone else out sometime.

Also, please feel free to let me know of any other possible solutions, or ways to detect and fix this without dropping tables.

Allan

Andrew Colfelt

Apr 30, 2014, 1:29:38 AM
to securit...@googlegroups.com
This procedure was very helpful, thank you!

Except, in my case, there is another wrinkle...

I came across this thread while troubleshooting a large number of files in
/nsm/elsa/data/elsa/tmp/buffers (39,000 files!), and NULL values for ELSA Date Range from sostat.

sudo mysqlcheck -A revealed the same results as shown by Hurgh above:

syslog_data.syslogs_archive_1
Error : Can't find file: 'syslogs_archive_1' (errno: 2)
status : Operation failed


The dates covered by these buffer files were April 1 through April 28.

After DROPping syslog_data.syslogs_archive_1, DELETEing its row from syslog.tables, and restarting syslog-ng (as above), the number of files in /nsm/elsa/data/elsa/tmp/buffers dropped to about half of what it was before (this took some time), and mysqlcheck was now happy.


ll /nsm/elsa/data/elsa/mysql/ now looks like this:

-rw-rw---- 1 mysql mysql 157524426 Apr 30 05:25 syslogs_archive_10000101.ARZ
-rw-rw---- 1 mysql mysql 569343439 Apr 30 05:03 syslogs_archive_1.ARZ
-rw-rw---- 1 mysql mysql 931077884 Apr 30 05:25 syslogs_index_10000053.MYD
-rw-rw---- 1 mysql mysql 50106368 Apr 30 05:25 syslogs_index_10000053.MYI
-rw-rw---- 1 mysql mysql 3170916900 Apr 29 06:28 syslogs_index_1.MYD
-rw-rw---- 1 mysql mysql 144172032 Apr 30 05:25 syslogs_index_1.MYI


/nsm/elsa/data/elsa/tmp/buffers:

Still has 19K files in it, and the dates covered by these files are April 1 through April 15. So only the second half of the 39K files was processed as expected, leaving 19K files unprocessed.

So the question is: why are the first half of April's files still stuck in the queue?

Specifically:
1. What impact does it have to leave a bunch of ELSA buffer files that appear to be orphaned?
2. Is there anything I can do to get these files processed normally?
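
To see which days the leftover buffer files belong to, they can be grouped by modification date. This is a minimal sketch that assumes GNU find (for -printf) and assumes that a buffer file's modification time reflects when its logs were received, which this thread does not confirm. `buffers_per_day` is a name made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count buffer files per modification day.

buffers_per_day() {
    # Print "YYYY-MM-DD count" for regular files directly inside the directory.
    find "$1" -maxdepth 1 -type f -printf '%TY-%Tm-%Td\n' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example: buffers_per_day /nsm/elsa/data/elsa/tmp/buffers
```

A gap or a hard cut-off in the per-day counts shows where processing stopped and resumed.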

Doug Burks

Apr 30, 2014, 8:21:43 AM
to securit...@googlegroups.com
Hi Andrew,

Please see:
https://code.google.com/p/security-onion/wiki/MailingLists#Start_a_new_thread_instead_of_replying_to_an_old_one



--
Doug Burks

Andrew Colfelt

May 1, 2014, 5:32:07 PM
to securit...@googlegroups.com
This thread is the backstory for a new thread, which, if you have read this far, you should go read now:

https://groups.google.com/forum/#!topic/security-onion/MHf2ogdm3l4
