Elastic Beta Performance Testing

Audrius J

Nov 11, 2017, 6:40:27 PM
to security-onion
Hi,

Here is the initial test, along with my insights.

Test info: No load, 44GB of pcaps, 4.3 million events, snort/bro - 6 instances (each), ES 30GB RAM, LS 8GB RAM + 12 workers. Bare-metal installation.
DomainStats and Freq Analysis - set to "no".
SysInfo: R730xd, RAID60 (4x12TB) + 2x128GB SSD, 128GB RAM, CPU E5-2630 v4 @ 2.20GHz, 40 cores

P.S. "No load" means that no heavy queries were executed during these tests.

Pcaps were replayed using the command "tcpreplay -ieth1 -M$Speed /pcaps/data_ssd/*"
$Speed stands for the different speeds used during the tests (M100, M200, etc.).
All pcaps were placed on an SSD drive and replayed from there.
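
The runs were driven by a simple loop, roughly like this (just a sketch of the wrapper; any pause between runs is up to the operator):

  for Speed in 100 200 300 400 500 600 700 800 900 1000; do
      # replay the same pcap set at the given Mbps target
      tcpreplay -ieth1 -M$Speed /pcaps/data_ssd/*
  done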

Results:
Test Name | Total Time(s) | EPS (calculated) | Events per min (avg) | Actual pcap replayed speed (Mbps)
M100 | 3489 | 1358 | 81465 | 99
M200 | 1800 | 2475 | 148470 | 192
M300 | 1257 | 3487 | 209200 | 275
M400 | 978 | 4413 | 264764 | 353
M500 | 832 | 5182 | 310937 | 414
M600 | 727 | 5882 | 352944 | 475
M700 | 642 | 6667 | 400000 | 538
M800 | 590 | 7200 | 432000 | 585
M900 | 539 | 7848 | 470885 | 641
M1000 | 508 | 8333 | 500000 | 680

P.S. For each test, the first and last 1-2 minutes of results were excluded from evaluation (to get the most even distribution in the histogram, since a test could start/end in the middle of the first/last minute). The remaining per-minute counts were averaged and divided by 60 to get EPS (for example, M100: 81465 events/min ÷ 60 ≈ 1358 EPS).
Please see the attached picture for a better understanding; the red border marks what was evaluated.

Log type distribution
Log Type | Count | Distribution (%)
bro_conn | 1172894 | 27.2033
bro_files | 1134783 | 26.3194
bro_http | 602526 | 13.9746
bro_dns | 491361 | 11.3963
bro_ssl | 443372 | 10.2833
bro_x509 | 286088 | 6.6353
bro_syslog | 71040 | 1.6477
bro_weird | 60403 | 1.4009
snort | 25392 | 0.5889
bro_ssh | 9669 | 0.2243
bro_smtp | 6775 | 0.1571
bro_snmp | 3239 | 0.0751
bro_software | 3157 | 0.0732
bro_pe | 365 | 0.0085
ossec_archive | 244 | 0.0057
CRON | 191 | 0.0044
bro_notice | 36 | 0.0008
bro_ftp | 32 | 0.0007
ossec | 13 | 0.0003
su | 9 | 0.0002


My remarks
The pcap files came from a real user environment, so results may vary depending on how many flows users produce.
In general, I expected worse results, but with some tuning and additional testing it may show good results. At least I was happy to see that I can't kill the cluster easily :)

Problems:
- Sharding. For a few million logs ES works well. So I ran a script which replays the 44GB of pcap data 100 times (with the configuration of test M700). After 6-8 hours my bro index grew to ~200GB, so if you want to visualize data covering those 8 hours, your query will time out. To my knowledge, a "searchable shard" should normally be ~15GB, or 30-50GB in general.
The great thing is that the query only times out and does not push your cluster into a red state. The bad thing is that if you do not redefine your search (e.g. change the time frame), some dashboards may fail to render their visualizations because the queries time out.
During this specific test the persistent queue (1GB) fills up too, which means LS should tell syslog-ng to slow down (this is called back-pressure on the inputs). Since syslog-ng doesn't have a buffer, once bro rotates its logs not all of them will be sent (at least that is how it should behave), because logstash can't ingest them all.

- Stats Dashboard - something seems to be wrong here. Based on the data in the stats dashboard, I should expect only 10 EPS per LS pipeline, which is not true. Because of this, I could not use it for my tests. So I used a small trick: I added a new field called EventDate (the timestamp when the event was processed) and based the date histogram analysis on this field. I found that this produces accurate results.
In general, Logstash now has a metrics filter which can be used for performance monitoring (https://www.elastic.co/guide/en/logstash/current/plugins-filters-metrics.html).

The following filter was applied to every event:
filter {
  mutate {
    # record the time at which Logstash processed the event
    add_field => { "EventDate" => "%{@timestamp}" }
  }
}
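
For comparison, the metrics filter mentioned above could be wired in roughly like this (a sketch based on the plugin documentation, not something currently in the SO config):

  filter {
    metrics {
      meter => "events"
      add_tag => "metric"
    }
  }
  output {
    # print the 1-minute event rate for the generated metric events only
    if "metric" in [tags] {
      stdout {
        codec => line { format => "1m rate: %{[events][rate_1m]}" }
      }
    }
  }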

Possible solutions:
- I haven't had time yet to dive deep into the entire project, so my suggestions might be wrong, but here they are:
a) Increase the number of shards (not sure if this will help). I will test this to see, but I think it should help (see the sketch after this list).
b) For high-speed installations, we can move to an hourly index. But we need to be careful about how many indices our cluster will have (https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster)
c) Probably increase the persistent queue to have a good buffer. I don't know if there are any limitations or problems here;
d) The logstash metrics filter would probably be better for evaluating performance.
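
For point a), the shard count could be raised in an index template, something along these lines (a sketch for ES 5.x; the template name is illustrative):

  curl -XPUT localhost:9200/_template/logstash_shards -H 'Content-Type: application/json' -d '
  {
    "template": "logstash-*",
    "order": 1,
    "settings": {
      "index.number_of_shards": 5
    }
  }'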

Additional recommended improvements:
- I would like to see scripts to restart each docker container separately.

That's all for now!

Regards,
Audrius

EPS_eval.jpeg

Doug Burks

Nov 13, 2017, 7:40:42 AM
to securit...@googlegroups.com
Hi Audrius,

These are excellent data points! Thanks so much for testing and
providing feedback! Further replies inline.

On Sat, Nov 11, 2017 at 6:40 PM, Audrius J <aud...@gmail.com> wrote:
> Hi,
>
> Here is initial test. So here you will find my insights.
>
> Test info: No load, 44GB of pcaps, 4.3 million events, snort/bro - 6 instances(each), ES 30g RAM, LS 8GB RAM + 12 workers. Bare metal installation.
> DomainStats and Freq Analysis - set to "no".

Were you finding that DomainStats and/or FreqServer were causing
performance issues?
Our current version of syslog-ng does not have any buffer. So perhaps
we should go ahead and move Bro logs from syslog-ng to being collected
directly by Logstash. Thoughts?

> - Stats Dashb.- it seems, that something is here wrong. Based on data on stats dashboard, I should only expect 10EPS per LS pipeline, which is not true. Because of this, I could not use it for my tests.

Are you referring to the Processing Time metrics? These metrics are
based on the logstash_time field, which is calculated in the following
Logstash filters:
1000_preprocess_log_elapsed.conf
8998_postprocess_log_elapsed.conf
I think the way they are calculated may be more representative of the
processing time of a batch of events rather than individual events.

> So, I did a small tick and added a new field called an EventDate (timestamp, when event was processed) and based on this field did data histogram analysis. I found, that this produce accurate results.
> In general, logstash has now performance metrics, which can be used for performance monitoring (https://www.elastic.co/guide/en/logstash/current/plugins-filters-metrics.html).
>
> For every event filter used:
> filter {
> mutate {
> add_field => { "EventDate" => "%{@timestamp}" }
> }
> }
>
> Possible solutions:
> - Still had not time to dive deep in entire project, so my suggestions might be wrong, but therefore:
> a) Increase number of shards (not sure if this will help). I will try to test this to see if this will help. But I think it should...

We've defaulted shards to 1 for the use case of running Evaluation
Mode (NOT running in production). Perhaps for users selecting
Production Mode, we should go back to the Elasticsearch default of 5
shards. Thoughts?

> b) For high speed installations, we can move to hourly index. But we need to take here care, how many indices our cluster will have (https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster)
> c) Probably increase persistent queue to have good buffer. Don't know if here is any limitations or problems;

The current queue size is probably sufficient for Evaluation Mode, but
we should probably increase for Production Mode. Perhaps make it a
percentage of total disk space. Thoughts?

> d) Probably logstash metrics filter would be better in evaluation of performance.

We'll consider that.

> Additional recommended improvements:
> - Would like to see the scripts to restart every docker container separately.

Yes, we are working on building out the control scripts. In the
meantime, if you're just trying to restart logstash to update filter
configuration, you should just be able to do something like:
sudo docker restart so-logstash

> That's all for now!

Thanks again for your valuable feedback!


--
Doug Burks

Kevin Branch

Nov 13, 2017, 8:03:20 AM
to securit...@googlegroups.com
Huge thanks to Audrius for performing and sharing these performance tests!

I think it would be a great idea to move from Bro_logs->syslog-ng->Logstash to Bro_logs->Logstash in the pipeline. If you have Bro write its logs in JSON format, which it natively supports, then Logstash should be able to hand the log entries off to Elasticsearch without doing much, if any, parsing, which I believe will seriously cut down on the Logstash bottleneck.
However, in fast enough environments, it is possible that the hourly Bro log rotation might take place before Logstash has finished reading all the way to the end of the current hour's logs.  That may need some consideration.  
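
If I remember correctly, Bro's JSON output can be switched on with something like this in local.bro (a sketch; I haven't tested it against the SO setup):

  # have the ASCII writer emit JSON instead of tab-separated logs
  @load tuning/json-logs
  # or, equivalently:
  # redef LogAscii::use_json = T;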

Kevin



Audrius J

Nov 13, 2017, 1:58:18 PM
to security-onion
Hi Doug,

You are more than welcome! This is just a small way to support this great project!
Now to answer your questions:

1. Were you finding that DomainStats and/or FreqServer were causing performance issues?
The main idea for now was to get performance info about the core components, which is really needed for the new transition.

I think DomainStats and FreqServer are a great addition, and I will try to run the same tests with these features turned on.
Based on that data, we will be able to compare under the same conditions and see the penalty they introduce.


1a. Our current version of syslog-ng does not have any buffer. So perhaps we should go ahead and move Bro logs from syslog-ng to being collected directly by Logstash. Thoughts?

This question is quite tricky. syslog-ng is a very fast and flexible program, and it is perfect if you want to replicate the same information two or more times. I think that in previous versions of syslog-ng only the commercial edition had a buffer, but now the open-source edition can have one too (https://www.balabit.com/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/what-syslog-ng-is.html and https://syslog-ng.org/493-2/)
It is also very reliable, so if the buffer is enabled on it, we will not lose logs during a logstash crash, log rotation, etc.
So users will be able to safely restart logstash...
Also, this can always be changed later.


2. Are you referring to the Processing Time metrics? These metrics are based on the logstash_time field, which is calculated in the following Logstash filters:
1000_preprocess_log_elapsed.conf
8998_postprocess_log_elapsed.conf
I think the way they are calculated may be more representative of the processing time of a batch of events rather than individual events.

Yes, I am referring to them. The main thing for me is to get information on how the system performs. If they are misleading, we can skip them and save CPU cycles. In the meantime, we can think about using the metrics plugin to get better metrics.


3. We've defaulted shards to 1 for the use case of running Evaluation Mode (NOT running in production). Perhaps for users selecting Production Mode, we should go back to the Elasticsearch default of 5 shards. Thoughts?

There are two points here.
a) We can go with 5 shards, but we also need to provide info to the user on how they can change that.
b) Or we can use a second approach: check the size of the index (with GET _cat/indices from the Kibana console, or using curl - see the sketch below) and divide by 30.
If the size is 150GB and we divide by 30, we get 5; in that case we can skip increasing the number of shards.
If the size of the index is 210GB and we divide by 30, we get 7; so we need to increase the number of shards to 7 (or 7+1).
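
For example, the per-index size can be pulled with something like this (a sketch):

  # show index name and primary store size (in GB) for the bro indices
  curl -s 'localhost:9200/_cat/indices/logstash-bro-*?v&h=index,pri.store.size&bytes=gb'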


4. The current queue size is probably sufficient for Evaluation Mode, but we should probably increase for Production Mode. Perhaps make it a percentage of total disk space. Thoughts?

Yes, I think we can increase the persistent queue. If we think of it as a buffer, we can make it as big as we want; in that case all logs should stay in the buffer until logstash consumes them. It is probably a good idea to make it a percentage of total disk space, but not less than 50GB. If users need to change it, they can always do so, and this information can be provided in the wiki. I will try to test it...
50GB will probably be able to hold the information for an entire day (keep in mind that this is just the raw message, not the enriched event).


Regards,
Audrius


Rob

Nov 13, 2017, 2:05:21 PM
to security-onion

I second the use of Logstash for the pipeline, as Logstash will throttle the flow as needed. Beats to Logstash is even better for stream management, throttling, and recovery as necessary under the hood.

Audrius J

Nov 13, 2017, 2:21:09 PM
to security-onion
Hi Kevin,

You make great points here. Something like that could be done.
I will try to test this too, as I said in my previous post.

Speaking about JSON, one huge benefit we get is that we don't need to worry about new fields - they are always in place, because of JSON.
Also, if a field is "-", there is no entry for it in JSON, so we can even save some space.
Performance should also increase (probably).
Of course, you will need to rewrite the parsers, or at least change them.
I can't promise, but I will try to make some small changes and do a test run...

Regards,
Audrius

Doug Burks

Nov 13, 2017, 4:21:10 PM
to securit...@googlegroups.com
Hi Audrius,

Replies inline.

On Mon, Nov 13, 2017 at 1:58 PM, Audrius J <aud...@gmail.com> wrote:
> Hi Doug,
>
> You are more than welcome! This is just small thing, how we can support this great project!
> Now to answer your questions:
>
> 1. Were you finding that DomainStats and/or FreqServer were causing performance issues?
> The main idea now was to get performance info about core components, which is really needed to make new transition.
>
> I think that DomainStats and/or FreqServer is great addition and I will try to do the same tests with these features turned on.
> Based on that data, we will be able to do comparison with the same condition and see penalties provided by them.

Yes, it would be great to see some metrics here!

> 1a. Our current version of syslog-ng does not have any buffer. So perhaps
> we should go ahead and move Bro logs from syslog-ng to being collected
> directly by Logstash. Thoughts?
>
> This question is quite tricky. syslog-ng is very fast and flexible program and it is perfect if you want to replicate the same information twice or even more time. I think, that previous versions of syslog-ng just commercial versions had a buffer. But now, they can have too (https://www.balabit.com/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/what-syslog-ng-is.html and https://syslog-ng.org/493-2/)

Yes, there are lots of different options here and I'm aware that new
versions of syslog-ng include buffer capability, but I'm really trying
to avoid scope creep so that we can finish this initial Elastic
migration. So at this point, our options are:
- keep everything as-is with the current software versions (Bro -->
syslog-ng --> logstash)
OR
- move Bro log collection to logstash, avoiding syslog-ng (Bro --> logstash)

> Also, it is very reliable, so if the buffer will be enabled on it, we will be able to not loose logs during the logstash crash/logs rotation and etc...
> So, users will be able safely restart logstash...
> Also, this can be always changed.
>
>
> 2. Are you referring to the Processing Time metrics? These metrics are
> based on the logstash_time field, which is calculated in the following
> Logstash filters:
> 1000_preprocess_log_elapsed.conf
> 8998_postprocess_log_elapsed.conf
> I think the way they are calculated may be more representative of the
> processing time of a batch of events rather than individual events.
>
> Yes, I am referring to them. The main thing for me is to get information how the system perform. If they are misleading, so we can skip them and save cpu cycles. In meanwhile we can think about the use of metrics plugin to get better metrics.
>
>
> 3. We've defaulted shards to 1 for the use case of running Evaluation
> Mode (NOT running in production). Perhaps for users selecting
> Production Mode, we should go back to the Elasticsearch default of 5
> shards. Thoughts?
>
> Here is a 2 points.
> a) We can go with 5 shards. But also, we need to provide an info to the user how they can change that.
> b) Or we can use second logic. We can count the size of the index (with GET _cat/indices from Kibana console or using curl) and divide by 30.
> If the size is 150GB and we divide by 30 we will get 5. In that case we can skip the process of increasing the number of shards.
> If the size of index is 210GB and we divide by 30, we will get 7. So, we need to increase the number of shards to 7 or (7+1).

I'm not sure I understand option b). Doesn't this assume that we
already have Elasticsearch populated with data? I'm really just
talking about setting the default value when Elasticsearch is
initialized.

> 4. The current queue size is probably sufficient for Evaluation Mode, but
> we should probably increase for Production Mode. Perhaps make it a
> percentage of total disk space. Thoughts?
>
> I think that yes, we can increase persistent queue. If we think, that it is a buffer, we can make it as big as we want. In that case all logs should be in a buffer until logstash will consume them. Probably it is good idea to make them percentage of total disk space, but not less than 50GB.

Not less than 50GB? That might be excessive if the system has less
than 60GB space. So I'm thinking Evaluation Mode would continue to
default to 1GB, Production Mode would be based on a percentage of
total disk space and not less than 1GB, and of course we'd include
documentation on the wiki on adjusting the value in
/etc/logstash/logstash.yml.
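
For reference, the relevant settings in /etc/logstash/logstash.yml look something like this (values here are examples only):

  # enable the disk-based persistent queue and cap its size
  queue.type: persisted
  queue.max_bytes: 1gb   # Production Mode could use a larger value or a percentage of disk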


--
Doug Burks

Doug Burks

Nov 13, 2017, 4:25:40 PM
to securit...@googlegroups.com
Hi Audrius (and Kevin),

I do agree that json has some benefits, but I also agree with Audrius
that this would require changes to parsers, and I'm not sure if we
have time for that right now.

One other thing to keep in mind is that we'd like to support sending Bro logs from the sensor across the network to another Logstash/Elasticsearch backend, and doing that with json would increase the amount of network bandwidth consumed by that transmission compared to our traditional Bro logs.

So, again to avoid scope creep of this project, I'm leaning towards
sticking with our existing Bro logs at the current time. We can
always revisit json logs at a later date.


--
Doug Burks

Audrius J

Nov 13, 2017, 5:13:10 PM
to security-onion
Hi Doug,

Tomorrow I will start the tests with DomainStats and/or FreqServer enabled.

I will also perform a test where logs are read directly by LS from the /nsm/bro/logs/current directory (it will take some time for preparation and testing).

Option b) would help users set proper sharding on the logstash-bro-* index. But for the default value, I think your suggestion is correct and safe. Other indices may stay the same (with the default of 1 shard).

For the persistent queue, we can even stick with the default 1GB, but also advise that anyone expecting spikes in incoming data can increase this limit.

Your other assumptions make sense.

P.S. I also see that you have the _all field enabled in the template. In SO's case it probably doesn't make sense, because almost all indices start with logstash-* (except elastalert_status, which can be renamed). In ES 6.0 this field is removed. Here is a discussion (https://github.com/elastic/elasticsearch/issues/19784).

Regards,
Audrius

Audrius J

Nov 15, 2017, 4:40:01 PM
to security-onion
Hi,

I tried to perform the same tests with the DomainStats and Freq options enabled, and the results are not very good - we have performance issues.
For example, with test M600 the packet replay takes ~730s, but it takes about 30 minutes for logstash to catch up. Also, after the queue is empty, I have lost about 30% of the logs.

If I increase the persistent queue, it takes the same amount of time, but at least I do not lose any logs.

It also seems strange to me that syslog-ng doesn't slow down log shipping to logstash and (probably) just rejects the connection when the LS queue is full...

I tried with DomainStats enabled and Freq disabled, and vice versa.
Both have similar issues...

Will try to investigate more...

Regards,
Audrius

Doug Burks

Nov 30, 2017, 4:38:52 PM
to securit...@googlegroups.com
Hi Audrius,

We released Beta 2 today and DomainStats and FreqServer are now
disabled by default when choosing Production Mode:
http://blog.securityonion.net/2017/11/elastic-stack-beta-2-release-and.html
--
Doug Burks

Audrius J

Dec 2, 2017, 3:55:26 AM
to security-onion
Hi Doug,

Thanks for the update!
I will try to perform testing on this next week as well.
I assume the best way to upgrade to the new release is just to start with a fresh installation?

Regards,
Audrius

Doug Burks

Dec 2, 2017, 6:06:58 AM
to securit...@googlegroups.com
Yes, a fresh installation would be best. Thanks!

Audrius J

Dec 23, 2017, 3:49:37 AM
to security-onion
Hi,

I finally had time to finish the tests.
The results are very similar to the alpha release.

Test info: No load, 44GB of pcaps, 4.3 million events, snort/bro - 6 instances (each), ES 30GB RAM, LS 8GB RAM + 12 workers. Bare-metal installation.
DomainStats and Freq Analysis - set to "no".

SysInfo: R730xd, RAID60 (4x12TB) + 2x128GB SSD, 128GB RAM, CPU E5-2630 v4 @ 2.20GHz, 40 cores

Release: Beta

P.S. "No load" means that no heavy queries were executed during these tests.

Pcaps were replayed using the command "tcpreplay -ieth1 -M$Speed /pcaps/data_ssd/*"
$Speed stands for the different speeds used during the tests (M100, M200, etc.).
All pcaps were placed on an SSD drive and replayed from there.

---------------------------------------------------------------------------------------------------------
Test Name | Total Time (s) | EPS (calculated) | Events per min (avg) | Replayed speed (Mbps) | Queue
---------------------------------------------------------------------------------------------------------
M100 | 3485 | 1359 | 81565 | 99 |ok
M200 | 1804 | 2474 | 148462 | 191 |ok
M300 | 1256 | 3491 | 209460 | 275 |ok
M400 | 981 | 4407 | 264435 | 351 |ok
M500 | 836 | 5141 | 308466 | 413 |ok
M600 | 731 | 5838 | 350273 | 472 |ok
M700 | 638 | 6512 | 390690 | 541 |queue ~ 400k events
M800 | 594 | 7162 | 429720 | 581 |queue ~ 700k events
M900 | 549 | 7721 | 463258 | 628 |queue > 900k events
M1000 | 525 | 8070 | 484222 | 657 |queue ~ 1mln events
---------------------------------------------------------------------------------------------------------

Regards,

Audrius

Audrius J

Dec 23, 2017, 3:50:39 AM
to security-onion
Sorry for formatting...

Doug Burks

Dec 29, 2017, 3:49:24 PM
to securit...@googlegroups.com
Thanks, Audrius! Please let us know if you have any specific
recommendations to increase performance other than what we've already
documented for our next release:
https://github.com/Security-Onion-Solutions/security-onion/issues/1179

Thanks!

Audrius J

Dec 30, 2017, 6:54:06 PM
to security-onion
Hi Doug,

Over the past several weeks I performed a variety of tests; here are my notes.
With the current configuration there are some problems with parsing CSV logs (especially http and ssl), because certain strings that appear in the logs cause logstash to throw an error, and not much can be done about it since they violate the CSV formatting. These errors are logged, which decreases performance. If your system has a lot of users and the main activity is web surfing, you will experience performance degradation.

To test it better (to check the worst-case scenario), I changed some configuration files so that the timestamp is assigned by logstash rather than taken from the bro ts field (ts->timestamp). This way you can see how long it takes to process logs when the LS queue is full.
This modification also helped with automation, because you can just reuse your existing logs and append them to the current bro log like this:
cat conn.log >> /nsm/bro/logs/current/conn.log
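
(For context, the bro ts -> @timestamp mapping that I bypassed would normally be handled by a Logstash date filter along these lines - a sketch, not the exact SO config:)

  filter {
    date {
      # bro's ts is epoch seconds; this would normally set @timestamp
      match => [ "ts", "UNIX" ]
    }
  }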

For this test the conn, dns, http, and ssl logs were used.
I didn't like the results very much, so I decided to take the JSON road.

So, the latest finding is that the best performance is achieved if we switch the bro logs to JSON and ingest them directly via logstash (with a configuration very similar to 6001_bro_import.conf). Of course, some tweaking of the so-logstash start script, syslog-ng, local.bro, etc. was needed to make this work.
In this case I could achieve 15k EPS with an almost empty logstash queue (GeoIP enrichment included), with the same server configuration used in the previous tests.
By replaying traffic at M1000 I was not able to reach the limits, so I reused the same idea presented above and created some scripts that copy existing logs and pipe them into the current log.
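
The input side of that test looked roughly like this (a sketch; the sincedb location is illustrative, and the real config also needs the matching field renames mentioned below):

  input {
    file {
      path => [ "/nsm/bro/logs/current/*.log" ]
      start_position => "beginning"
      sincedb_path => "/var/tmp/sincedb_bro"   # illustrative location
      codec => "json"                          # bro logs already written as JSON
    }
  }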

Histograms are attached to this post; note how many logs were processed with the current (beta3) configuration versus with JSON.
Both tests were performed in the same way; only the parsing methods differed, and some additional configs were changed or removed.
Don't pay too much attention to the spikes - they are caused by my dumb script, which copies logs to the current log and then sleeps for 50s (roughly as sketched below). Some log types contain more entries, which is why the spikes appear. But it also shows that the logs are processed immediately, so there is no queue in LS.
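
The replay script is nothing fancy - roughly this (a sketch; the source directory holding the saved JSON logs is illustrative):

  # append saved JSON logs onto the live bro logs, then pause
  while true; do
    for log in conn dns http ssl; do
      cat /pcaps/bro_json/$log.log >> /nsm/bro/logs/current/$log.log
    done
    sleep 50
  done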

So I think this approach (JSON) should be taken into consideration.
Also, by using this wiki https://github.com/Security-Onion-Solutions/security-onion/wiki/Bro-Fields and the LS rename functionality (and removing some config files), I got almost all dashboards working without any change. Some custom parsing was not working properly, though, so for the test I just removed it...

Of course this test is a very extreme case, but this approach should allow us to reach 1Gbps...


Regards,
Audrius

beta2_json.png
Beta3_current.png

Audrius J

Dec 30, 2017, 6:55:43 PM
to security-onion
Oops, there is a mistake in the picture naming - both are beta3...

Audrius

Doug Burks

Dec 31, 2017, 5:10:38 PM
to securit...@googlegroups.com
Hi Audrius,

This is really great data and I really appreciate you taking the time
to do these tests and document your results! As we've discussed
previously, I do agree that we need to support json at some point in
time.

In the meantime, Wes recently added some Logstash config to remove quotes:
https://github.com/dougburks/elastic-test/commit/cbb6a6123ea9cff62724d9b2924be2cccef41b52

Perhaps that will help with the CSV issue you were referring to?

Audrius J

Jan 1, 2018, 1:02:13 PM
to security-onion
Hi Doug,

After using this new configuration, /var/log/logstash/logstash.log looks much better, but performance is very similar.
Also, you could try to improve performance by using the dissect filter. Take a look here: https://www.elastic.co/blog/logstash-dude-wheres-my-chainsaw-i-need-to-dissect-my-logs.
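
A dissect mapping for a hypothetical space-delimited message would look something like this (just a sketch to show the idea; the real bro fields are tab-separated and would need their actual names and delimiters):

  filter {
    dissect {
      # split the raw line positionally instead of using grok/csv
      mapping => {
        "message" => "%{ts} %{uid} %{orig_h} %{orig_p} %{resp_h} %{resp_p} %{rest}"
      }
    }
  }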

Audrius

Doug Burks

Jan 1, 2018, 1:08:59 PM
to securit...@googlegroups.com
Thanks, Audrius!