Log Retention!


Dev Chakraborty

May 28, 2019, 12:33:50 PM
to Wazuh mailing list
Hi Team,

Does Elasticsearch have a default log retention period?

If so, how do I change it?

If not, what is the best way to go about purging old logs?

Is it recommended to take backups on the Wazuh side or the Elasticsearch side?


Best Regards
Dev

Dev Chakraborty

May 28, 2019, 3:13:20 PM
to Wazuh mailing list
Adding to my question:
I am planning to retain the logs for 1,500 servers for one year. Is it recommended to go with OSSEC/Wazuh or Elasticsearch?
--
Best Regards
Dev

Jesús Ángel González

May 29, 2019, 3:53:41 AM
to Wazuh mailing list

Hi Chakraborty,

Elasticsearch takes disk space and, depending on the configuration, you might have replicas of the same data on different nodes (Elasticsearch index replicas). On the other
hand, Wazuh itself stores the alerts in rotated log files. I'll explain some data retention strategies for both Elasticsearch and Wazuh.

Elasticsearch retention policies

Elasticsearch has index lifecycle management (ILM) settings: you can define policies based on index size, age, and many other properties. An index moves through up to four phases:

  • Hot: the index is actively being updated and queried.
  • Warm: the index is no longer being updated but is still being queried.
  • Cold: the index is no longer being updated and is seldom queried. The information still needs to be searchable, but it’s okay if those queries are slower.
  • Delete: the index is no longer needed and can safely be deleted.

The Elasticsearch documentation explains how to add a new policy; the example there is not aimed at time-based indices, but the idea is very similar, and you should choose a policy that fits your needs.

Since we use time-based indices, you may want to define the policy at the index template level, so every new index applies it automatically.
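As a rough sketch (the policy name, template name, and 365-day retention below are examples, and the exact API depends on your Elasticsearch version; ILM needs 6.6+), a delete-after-one-year policy attached at the template level could look like this in Kibana Dev Tools:

```console
PUT _ilm/policy/wazuh-alerts-policy
{
  "policy": {
    "phases": {
      "hot":    { "actions": {} },
      "delete": { "min_age": "365d", "actions": { "delete": {} } }
    }
  }
}

PUT _template/wazuh-alerts
{
  "index_patterns": ["wazuh-alerts-*"],
  "settings": { "index.lifecycle.name": "wazuh-alerts-policy" }
}
```

Every new index matching the pattern then inherits the policy, so time-based indices age out without manual deletion.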

Elasticsearch S3 snapshot repository

There is a plugin for Elasticsearch that you may find interesting for your use case, repository-s3, which lets you register S3 storage as an Elasticsearch snapshot repository.

You can then automate your backup policy: store the backups in S3 and safely delete your old indices. Unlike ILM, this solution has no phases.
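As a sketch (the repository name, bucket name, and index pattern are examples, and S3 credential setup is omitted), registering the repository and taking a snapshot looks roughly like:

```console
# Install the plugin on every node first, then restart it:
#   bin/elasticsearch-plugin install repository-s3

PUT _snapshot/s3_backup
{
  "type": "s3",
  "settings": { "bucket": "my-wazuh-snapshots" }
}

PUT _snapshot/s3_backup/snapshot-2019.05?wait_for_completion=true
{
  "indices": "wazuh-alerts-3.x-2019.05.*"
}
```

Once a snapshot completes successfully, the indices it covers can be deleted from the cluster and restored later from S3 if needed.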

Wazuh

Your alerts for today live in /var/ossec/logs/alerts/alerts.json, and the Wazuh manager rotates them daily; a separate directory is created for each year and month.

Here is an example:

ll -h /var/ossec/logs/alerts/2019/May/
total 3.5M
-rw-r-----. 1 ossec ossec 608K May 22 16:10 ossec-alerts-22.json
-rw-r-----. 1 ossec ossec 810K May 22 16:10 ossec-alerts-22.log
-rw-r-----. 1 ossec ossec 326K May 23 16:34 ossec-alerts-23.json
-rw-r-----. 1 ossec ossec 163K May 23 16:34 ossec-alerts-23.log

This means the alerts for 2019/05/22 and 2019/05/23 were rotated into the directory alerts/2019/May/.
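If you want to move rotated files off the manager after a retention window, a minimal sketch could be a script like the following (the function name, archive path, and retention value are mine, not something Wazuh ships; note it assumes basenames do not collide across months):

```shell
# Sketch under assumptions: archive rotated Wazuh alert files older than
# a retention window into an archive directory (gzipped), then remove
# the originals from the manager.
archive_old_alerts() {
    alerts_dir="$1"      # e.g. /var/ossec/logs/alerts
    archive_dir="$2"     # e.g. /backup/wazuh-alerts (example path)
    retention_days="$3"  # e.g. 365

    mkdir -p "$archive_dir"
    # Find every rotated alert file older than the retention window,
    # compress a copy into the archive, then delete the original.
    find "$alerts_dir" -type f -name 'ossec-alerts-*' -mtime +"$retention_days" |
    while read -r f; do
        gzip -c "$f" > "$archive_dir/$(basename "$f").gz" && rm -f "$f"
    done
}
```

Run from cron as, for example, `archive_old_alerts /var/ossec/logs/alerts /backup/wazuh-alerts 365`.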

Final thoughts

As you can see, the data is stored in two places, Elasticsearch and Wazuh, and each can have its own retention policy.

Note that Elasticsearch might not have all the data Wazuh has: a component between Wazuh and Elasticsearch, such as Filebeat, might fail, or a networking issue might cause data loss in Elasticsearch.

Rotated alerts can be reindexed into Elasticsearch if needed, which is why you may want to apply a retention policy like the following:

  • Always store and backup the rotated Wazuh logs.
  • Delete Elasticsearch indices older than X using a retention policy in your index template.
  • If you need to have old data again in Elasticsearch you can always reindex it from the rotated Wazuh logs.
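For the reindexing step, a minimal sketch (the function and index names are mine, not a Wazuh tool) that converts one rotated alerts file into an Elasticsearch _bulk request body:

```shell
# Sketch: turn a rotated alerts file (one JSON alert per line) into an
# Elasticsearch _bulk body, prefixing every alert with an "index" action
# targeting the given index.
make_bulk_body() {
    index="$1"
    awk -v idx="$index" '{ printf "{\"index\":{\"_index\":\"%s\"}}\n%s\n", idx, $0 }'
}
```

You could then send it with something like `zcat ossec-alerts-22.json.gz | make_bulk_body wazuh-alerts-2019.05.22 | curl -s -XPOST localhost:9200/_bulk -H 'Content-Type: application/x-ndjson' --data-binary @-`; depending on your Elasticsearch version, the action line may also need a `_type` field.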

Reindexing rotated Wazuh logs vs restoring Elasticsearch snapshots

Reindexing rotated Wazuh logs from a long period (e.g. one year) may take a lot of resources: they are raw events, so they must be parsed, analyzed, and stored in Elasticsearch
as new events, and any ingest processors configured in Elasticsearch will make the reindexing slower. The main benefit is that, as mentioned, the Wazuh manager always has all the information,
while Elasticsearch could fail at some point.

On the other hand, restoring Elasticsearch snapshots is faster than reindexing rotated logs because the events are already processed: you skip all of that logic, including Filebeat/Logstash and any
pre-ingest phase such as the ingest processors mentioned above.
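For reference, restoring from a snapshot is a single API call; the repository, snapshot, and index names here are examples:

```console
POST _snapshot/s3_backup/snapshot-2019.05/_restore
{
  "indices": "wazuh-alerts-3.x-2019.05.*"
}
```

The restored indices come back already indexed and searchable, with no re-parsing involved.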

This is just one idea; there are many ways to achieve this goal, and some are cheaper than others.

I hope you now have a clearer picture of data retention in Wazuh-Elastic stacks.

Let us know your thoughts.

Regards,
Jesús

Dev Chakraborty

Jun 4, 2019, 12:49:49 AM
to Jesús Ángel González, Wazuh mailing list
Thanks for the mail, Jesús.

Can you guide me to documentation on how to configure the backup part for the Wazuh master?



--
Best Regards
Dev

Jesús Ángel González

Jun 5, 2019, 3:15:16 AM
to Wazuh mailing list
Hello again Dev,

As I said in my previous mail, you have multiple options depending on your needs and your retention policies.

Elasticsearch should have all the data if the cluster health is fine, and the Wazuh manager always has all the alerts in raw format.

You can create scripts or cron jobs to periodically ship your rotated alerts to remote storage, such as cloud storage or a different instance
under your control.
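As a sketch, a cron entry like the following (the bucket name and schedule are examples, and it assumes the AWS CLI is installed and configured) would sync the rotated alerts to S3 every night:

```crontab
# Every day at 02:00, sync rotated Wazuh alert files to an S3 bucket.
0 2 * * * /usr/bin/aws s3 sync /var/ossec/logs/alerts/ s3://my-wazuh-alert-archive/ --exclude "*" --include "*.json" --include "*.log"
```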

For Elasticsearch, you can create snapshots and store them on different instances, or you can create an Amazon S3 repository for storing backups.

There is no single guide for achieving this because there are many ways to manage backups; the basics are described in my previous mail.

In any case, let me know what you want to do and we can try to guide you through the best practices.

Best regards,
Jesús