Does Elasticsearch have a default log retention period?
If so, how do I change it?
If not, what is the best way to go about purging old logs?
Is it recommended to take backups on the Wazuh side or on the Elasticsearch side?
Hi Chakraborty,
Elasticsearch takes up disk space and, depending on the configuration, you might have replicas of the same data on different nodes (Elasticsearch index replicas). On the other
hand, Wazuh itself stores the alerts in rotated log files. I'm going to explain some data retention strategies for both Elasticsearch and Wazuh.
Elasticsearch retention policies
Elasticsearch has index lifecycle management (ILM) settings; you can define policies based on the size, the age, and many other properties of an index.
Here is an explanation of how to add a new policy. It's not the best example for time-based indices, but the idea is very similar; you should choose the policy depending on your needs.
Since we use time-based indices, you may want to attach the policy at the index template level, so that every new index picks it up.
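As a rough illustration of the idea above, here is a minimal Python sketch that only builds the JSON bodies you would send to `PUT _ilm/policy/<name>` and `PUT _template/<name>`. The policy name, the 90-day age, the `wazuh-alerts-*` pattern and the template `order` are assumptions for the example, not values from your setup, and newer Elasticsearch versions use composable templates (`_index_template`) with a slightly different body.

```python
import json


def build_delete_policy(max_age: str) -> dict:
    """Body for PUT _ilm/policy/<name>: delete an index once it is older than max_age.

    Without rollover, min_age is counted from the index creation time.
    """
    return {
        "policy": {
            "phases": {
                "delete": {
                    "min_age": max_age,
                    "actions": {"delete": {}},
                }
            }
        }
    }


def build_template(policy_name: str, index_pattern: str) -> dict:
    """Body for PUT _template/<name> (legacy templates): attach the policy to new indices."""
    return {
        "index_patterns": [index_pattern],
        # Assumption: a higher order so these settings merge on top of an existing template.
        "order": 1,
        "settings": {"index.lifecycle.name": policy_name},
    }


if __name__ == "__main__":
    # Hypothetical names and values, adjust them to your environment.
    print(json.dumps(build_delete_policy("90d"), indent=2))
    print(json.dumps(build_template("wazuh-alerts-retention", "wazuh-alerts-*"), indent=2))
```

Only indices created after the template exists get the policy; already existing indices would need the `index.lifecycle.name` setting applied to them separately.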
Elasticsearch S3 snapshot repository
There is a plugin for Elasticsearch that you may find interesting for your use case, repository-s3, which allows you to add S3 storage as an Elasticsearch snapshot repository.
Then you can automate your backup policy, store the backups in S3, and safely delete your old indices afterwards. Unlike the previous solution, this one has no phases.
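To make the snapshot-then-delete flow a bit more concrete, here is a small Python sketch under some assumptions: the bucket name is hypothetical, and the index names are assumed to end in a `YYYY.MM.DD` date suffix. It builds the bodies for `PUT _snapshot/<repository>` and `PUT _snapshot/<repository>/<snapshot>`, and picks the indices that are old enough to be removed once the snapshot has completed.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def build_s3_repository(bucket: str) -> dict:
    """Body for PUT _snapshot/<repository>; requires the repository-s3 plugin."""
    return {"type": "s3", "settings": {"bucket": bucket}}


def build_snapshot(indices: Iterable[str]) -> dict:
    """Body for PUT _snapshot/<repository>/<snapshot> limited to the given indices."""
    return {"indices": ",".join(indices), "include_global_state": False}


def indices_older_than(names: Iterable[str], days: int, today: date) -> List[str]:
    """Return the index names whose date suffix is more than `days` days before `today`."""
    cutoff = today - timedelta(days=days)
    old = []
    for name in names:
        try:
            # Assumption: the last 10 characters are the index date, e.g. 2019.05.22
            day = datetime.strptime(name[-10:], "%Y.%m.%d").date()
        except ValueError:
            continue  # not a time-based index, leave it alone
        if day < cutoff:
            old.append(name)
    return old


if __name__ == "__main__":
    names = ["wazuh-alerts-3.x-2019.01.01", "wazuh-alerts-3.x-2019.05.22", ".kibana"]
    old = indices_older_than(names, days=90, today=date(2019, 5, 23))
    print(build_s3_repository("my-wazuh-backups"))  # hypothetical bucket name
    print(build_snapshot(old))
```

The indices returned by `indices_older_than` should only be deleted after checking that the snapshot containing them finished successfully.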
Wazuh
Your alerts for today live in /var/ossec/logs/alerts/alerts.json and the Wazuh manager rotates them every day; an individual directory is created for each year and month.
Here is an example:
ll -h /var/ossec/logs/alerts/2019/May/
total 3.5M
-rw-r-----. 1 ossec ossec 608K May 22 16:10 ossec-alerts-22.json
-rw-r-----. 1 ossec ossec 810K May 22 16:10 ossec-alerts-22.log
-rw-r-----. 1 ossec ossec 326K May 23 16:34 ossec-alerts-23.json
-rw-r-----. 1 ossec ossec 163K May 23 16:34 ossec-alerts-23.log
This means that the alerts for 2019/05/22 and 2019/05/23 were rotated into the directory alerts/2019/May/.
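If you want to script a purge or an archive of these rotated files, the layout above can be turned into a small helper. This is only a sketch in Python: the three-letter month directory names and the zero-padded day are assumptions based on the listing, and older files may also be compressed on your manager, so check your own directories first.

```python
from datetime import date
from pathlib import PurePosixPath

# Assumption: month directories use English three-letter abbreviations, as in 2019/May/.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def rotated_alert_path(day: date, extension: str = "json",
                       base: str = "/var/ossec/logs/alerts") -> PurePosixPath:
    """Build the expected path of the rotated alerts file for a given day."""
    return (PurePosixPath(base) / str(day.year) / MONTHS[day.month - 1]
            / f"ossec-alerts-{day.day:02d}.{extension}")


if __name__ == "__main__":
    print(rotated_alert_path(date(2019, 5, 22)))
    print(rotated_alert_path(date(2019, 5, 23), extension="log"))
```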
Final thoughts
As you can see, the data is stored in two places, Elasticsearch and Wazuh, and each one may have its own retention policy.
Note that Elasticsearch might not have all the data Wazuh has, because a component between Wazuh and Elasticsearch, such as Filebeat, might fail, or a networking issue might make you lose data in Elasticsearch.
Rotated alerts can be reindexed in Elasticsearch if needed, so this is something to take into account, because you may want to apply the following retention policy:
Reindexing rotated Wazuh logs vs restoring Elasticsearch snapshots
Rotated Wazuh logs covering a long period (e.g. one year) may take a lot of resources to reindex because they are raw events, so they must be parsed, analyzed and stored in Elasticsearch
as new events. Elasticsearch may also have some ingest processors, which would make the reindexing slower. The main benefit of doing this is that, as I said, the Wazuh manager will always have all the information,
while Elasticsearch could fail at some point.
On the other hand, restoring Elasticsearch snapshots is faster than reindexing rotated logs because the events are already processed, so you skip all that logic, including Filebeat/Logstash and any
other pre-processing step such as the ingest processors I mentioned previously.
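For the snapshot route, a restore is a single request to `POST /_snapshot/<repository>/<snapshot>/_restore`. The Python sketch below only builds the path and body; the repository name, snapshot name and index pattern are hypothetical values for the example.

```python
from typing import Tuple


def build_restore_request(repository: str, snapshot: str, indices: str) -> Tuple[str, dict]:
    """Return the path and JSON body for restoring selected indices from a snapshot."""
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    body = {
        "indices": indices,             # comma-separated names or wildcard patterns
        "ignore_unavailable": True,     # skip indices that are missing from the snapshot
        "include_global_state": False,  # restore only the index data
    }
    return path, body


if __name__ == "__main__":
    # Hypothetical names, adjust them to your repository and snapshots.
    path, body = build_restore_request("wazuh-s3", "snapshot-2019.05", "wazuh-alerts-3.x-2019.05.*")
    print("POST", path)
    print(body)
```

Keep in mind that an index with the same name that is still open in the cluster has to be closed or deleted before it can be restored.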
This is just an idea; there are many ways to achieve this goal and some of them are cheaper than others.
I hope you now have a clearer picture of data retention on Wazuh-Elastic stacks.
Let us know your thoughts.
Regards,
Jesús