Barnyard2 in Distributed Architecture


Steve McLaughlin

Jun 12, 2013, 7:30:13 AM
to barnyar...@googlegroups.com
Hi All,

Just wondering what people's thoughts are on whether it's best to run by2 on the same box as the sensors, or on the database server.
I would think on the sensor box in the first instance, but I would be interested in what others are doing and why.

thanks,
Steve

beenph

Jun 12, 2013, 9:40:54 AM
to barnyar...@googlegroups.com
Hi Steve,

My opinion might be biased here, but I think you should run barnyard2
on the sensor, or as near to it as possible.

If you have a distributed storage solution, maybe you could share
snort/suricata output to a specific place where it would then be
processed,
but I would recommend that you keep your database server as much as
possible focused on its task, that is, being a database server.

Having a database server that also does some event processing could
work if you have very low performance requirements / event
input/output, but if your demand grows over time,
you would certainly want (I think) the tasks to be separated, since
processes may end up fighting for resources, and in the end
you do not want your database server to get killed by the OOM killer,
for example.

But again, everything depends on the requirement/deployment scenario.


-elz

Jason Haar

Jun 12, 2013, 10:52:18 PM
to barnyar...@googlegroups.com
On 13/06/13 01:40, beenph wrote:
> If you have a distributed storage solution, maybe you could share
> snort/suricata output to a specific place where it would then be
> processed, but I would recommend that you keep your database server as
> much as possible focused on its task, that is, being a database server.

Well yeah - but you are only talking about some rsync-style job that
copies the snort logs to a central directory, where barnyard then pushes
them into the localhost SQL server - should be effectively zero load
compared with the SQL work.

The advantage of that model is "rsync -za" is known to be robust -
probably more than barnyard doing "(compressed?)" SQL connections
potentially over a lossy WAN to a central SQL server. No offence on your
code-ship, it's just that rsync has a 20 year head start on you ;-)

Good conversation to have, I'm looking at the same thing myself. All
this is probably academic - as I expect shoving our 55 NIDS into one SQL
server is going to lead to a VERY slow SQL server - i.e. where the data
goes in is not where I expect the bottlenecks to be ;-)

Has there been any discussion surrounding a better backend for snort
data? The presentation API (e.g. the one BASE uses) is filled with JOINs and
other complex SQL calls - leading to performance problems (i.e. sub-second
response gets hard once you hit 500K records). I've been wondering if a
complete backend storage replacement and a new frontend is the way
forward on this? Easy to say of course...

--
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +1 408 481 8171
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1

beenph

Jun 12, 2013, 11:52:37 PM
to barnyar...@googlegroups.com
On Wed, Jun 12, 2013 at 10:52 PM, Jason Haar <Jason...@trimble.com> wrote:
> On 13/06/13 01:40, beenph wrote:
>> If you have a distributed storage solution, maybe you could share
>> snort/suricata output to a specific place where it would then be
>> processed, but I would recommend that you keep your database server as
>> much as possible focused on its task, that is, being a database server.
>
> Well yeah - but you are only talking about some rsync-style job that
> copies the snort logs to a central directory, where barnyard then pushes
> them into the localhost SQL server - should be effectively zero load
> compared with the SQL work.
>
> The advantage of that model is "rsync -za" is known to be robust -
> probably more than barnyard doing "(compressed?)" SQL connections
> potentially over a lossy WAN to a central SQL server. No offence on your
> code-ship, it's just that rsync has a 20 year head start on you ;-)
>

Well, you still have to handle your rsync connection and make
sure it's always up, manage storage locally, and have a box that
can handle N by2 instances.

It's only delegating the problem down the stream, whereas having
N snort processes and N barnyard2 processes on a sensor feeding a
database backend is quite a bit simpler deployment-wise, because technically
you're still going to have a box that runs
snort, unless you rsync your network traffic to a central location
(*wink wink*).

> Good conversation to have, I'm looking at the same thing myself. All
> this is probably academic - as I expect shoving our 55 NIDS into one SQL
> server is going to lead to a VERY slow SQL server - i.e. where the data
> goes in is not where I expect the bottlenecks to be ;-)
>

55 sensors is not a lot of sensors. But 55 sensors using the
current schema is a lot of workload for an untuned SQL backend, and
could hurt even a tuned backend
a bit.

Post your database server spec.

The database backend needs tuning too ...and archiving. You are also always
able to create custom indexes, depending on the UI you use, to speed things up.

There are tons of tricks. People need to understand that it's not
necessarily plug-and-play magical technology
as you scale up.

> Has there been any discussion surrounding a better backend for snort
> data? The presentation API (e.g. the one BASE uses) is filled with JOINs and
> other complex SQL calls - leading to performance problems (i.e. sub-second
> response gets hard once you hit 500K records). I've been wondering if a
> complete backend storage replacement and a new frontend is the way
> forward on this? Easy to say of course...
>

We are planning on releasing a new proposed schema soon enough for testing,
but there is no UI planned for it yet. Since a working
prototype is not out, that's understandable.

-elz

Jason Haar

Jun 13, 2013, 6:09:03 AM
to barnyar...@googlegroups.com
On 13/06/13 15:52, beenph wrote:
> It's only delegating the problem down the stream, whereas having N
> snort processes and N barnyard2 processes on a sensor feeding a database
> backend is quite a bit simpler deployment-wise, because technically you're
> still going to have a box that runs snort, unless you rsync your
> network traffic to a central location (*wink wink*).

Heh - been there done that... vtun is great for pushing a SPAN over a
WAN... Until you saturate the WAN that is ;-)

Mike Patterson

Jun 13, 2013, 1:18:00 PM
to barnyar...@googlegroups.com, jason...@trimble.com
On Wednesday, June 12, 2013 10:52:18 PM UTC-4, Jason Haar wrote:

> Has there been any discussion surrounding a better backend for snort
> data? The presentation API (e.g. the one BASE uses) is filled with JOINs and
> other complex SQL calls - leading to performance problems (i.e. sub-second
> response gets hard once you hit 500K records). I've been wondering if a
> complete backend storage replacement and a new frontend is the way
> forward on this? Easy to say of course...

I don't get sub-second response, but I do generally get response times under a few seconds with a database currently at ~134.5m rows. I'm using the standard schema, with a few extra indexes, and some custom reporting code I wrote after I found the same thing you have.

My code is here:
https://github.com/kraigu/snort_report
and indexes I've currently added:

create index ets on event(timestamp);
create index esig on event(signature);
create index ssig_sid on signature(sig_sid);
create index ssig_id on signature(sig_id);
create index datacid on data(cid);
create index datasid on data(sid);
alter table iphdr add index idx_cid (cid);
alter table icmphdr add index idx_cid (cid);
alter table tcphdr add index idx_cid (cid);
alter table udphdr add index idx_cid (cid);

I've ignored beenph's usual advice of running the database on a machine separate from the one the Snort processes are running on due to layer 8/9 constraints.

I'm by no means an expert, but this setup runs well enough for me.
 

beenph

Jun 13, 2013, 7:54:08 PM
to barnyar...@googlegroups.com, Jason Haar
On Thu, Jun 13, 2013 at 1:18 PM, Mike Patterson <mike.pa...@unb.ca> wrote:
>>
> I don't get sub-second response, but I do generally get response times under
> a few seconds with a database currently at ~134.5m rows. I'm using the
> standard schema, with a few extra indexes, and some custom reporting code I
> wrote after I found the same thing you have.
>
> My code is here:
> https://github.com/kraigu/snort_report
> and indexes I've currently added:
>
> create index ets on event(timestamp);
> create index esig on event(signature);
> create index ssig_sid on signature(sig_sid);
> create index ssig_id on signature(sig_id);
> create index datacid on data(cid);
> create index datasid on data(sid);
> alter table iphdr add index idx_cid (cid);
> alter table icmphdr add index idx_cid (cid);
> alter table tcphdr add index idx_cid (cid);
> alter table udphdr add index idx_cid (cid);

I would suggest that you use a (sid,cid) index on
event, data, iphdr, tcphdr, icmphdr, udphdr and opt,
and maybe (sid,cid,signature) on event OR a variant
(sid,cid,timestamp,signature) on event (but the latter depends on
how many time-related queries you make).
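
As a rough sketch, the two event-table variants could be written like this
in MySQL syntax. The index names are made up for illustration, and you would
pick one variant, not both. The stock schema may already define a key on
(sid,cid) for some of these tables, so it is worth listing the existing
indexes before adding more.

```sql
-- List the indexes already defined on event (MySQL syntax).
SHOW INDEX FROM event;

-- Variant 1: composite index covering event id plus signature.
-- The index name is hypothetical.
CREATE INDEX idx_event_sid_cid_sig ON event (sid, cid, signature);

-- Variant 2: same idea with timestamp included, for time-related queries.
-- The index name is hypothetical.
CREATE INDEX idx_event_sid_cid_ts_sig ON event (sid, cid, timestamp, signature);
```

Keep in mind that every extra index adds some cost to each insert, which
matters when many barnyard2 instances are writing at once.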

Now, indexes will also depend on the backend you have, but you can
generally use EXPLAIN to see how the planner
handles your query:

MySQL:
http://dev.mysql.com/doc/refman/5.7/en/explain.html

PostgreSQL:
http://www.postgresql.org/docs/9.2/static/using-explain.html
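
For example, a minimal sketch that should be accepted by both MySQL and
PostgreSQL. The query is a made-up reporting query, not taken from any
particular UI, and sig_name is assumed to be the signature-name column of
the standard schema.

```sql
-- Ask the planner how it would run a "latest events with signature name" query.
-- Prefixing the statement with EXPLAIN shows the plan without returning rows.
EXPLAIN
SELECT e.sid, e.cid, e.timestamp, s.sig_name
FROM event AS e
JOIN signature AS s ON s.sig_id = e.signature
WHERE e.timestamp >= '2013-06-01 00:00:00'
ORDER BY e.timestamp DESC
LIMIT 50;
```

If the plan shows a full scan of event where you expected an index to be
used, that is usually the hint that an index is missing or does not match
the query.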



>
> I've ignored beenph's usual advice of running the database on a machine
> separate from the one the Snort processes are running on due to layer 8/9
> constraints.

Well, it really depends on how many sensors you have and how many
clients of the database you have.

If you have under 10 sensors (snort/suricata), thus probably 10 barnyard2
instances (producer sessions), and I guess
2-5 consumers (web, sql-cli), it shouldn't make a huge difference.

But when you scale up, you would preferably want all the
clock ticks to go to the database server.

-elz