hard time understanding Index and Index Management


Andrew

Mar 2, 2023, 1:17:03 PM
to Wazuh mailing list
Hi, 

Coming from Splunk and being fairly new to SIEM administration, I'm having a hard time understanding how logs are indexed in Wazuh. For example, in the Splunk GUI, I create an index named "linux", send all my Linux syslog messages to it, and define a log retention policy per index. I also used this index to filter my searches, so it was easier to filter alerts and logs by platform. In addition, I could restrict users to searching only the indexes they had permission for (i.e. the Linux team could only search Linux logs through this "linux" index).

But in Wazuh, a daily alert index is created, and all the logs that Wazuh receives from various sources on that day are indexed into that wazuh-alerts-4.x.date index.

I've read the official docs on Wazuh indices and index management. I created and applied a policy to all my alert indices anyway (cold after 30 days, deleted after 365 days), but this is a very different way of managing indices compared to Splunk.

Are these just two different methods of indexing logs, or was what I was doing with Splunk not the ideal method?
Can I create a static index like "linux" and index all my Linux logs into it in Wazuh?


I'd appreciate your guidance/thoughts!

Alexander Bohorquez

Mar 3, 2023, 9:54:37 AM
to Wazuh mailing list
Hi Andrew,

The Wazuh indexer, like the data engine behind it (OpenSearch/Elasticsearch), uses time-based indices by default, as you mentioned. It is also possible to switch to indices that each cover a different period of time by defining different conditions.

The product offers ISM/ILM, which lets us rotate and delete data based on different criteria. What fits best depends on your requirements: how much data you index per day, how many nodes you have in your cluster, and whether HA is a requirement for you. How long a single index can stay in use depends on all of these criteria.

An important question here is how long you need to keep the data. By far the most efficient way to delete data in Wazuh indexer/Elasticsearch is to delete complete indices, which is one of the main reasons time-based indices are used. If you have a single index, you need to remove old data with delete-by-query, which is much less efficient and puts a much higher load on your system.
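To make the whole-index deletion idea concrete, here is a minimal Python sketch. It assumes the daily indices are named wazuh-alerts-4.x-YYYY.MM.DD; the helper name `expired_indices` and the sample index list are hypothetical and only for illustration. It just works out which daily indices fall outside a retention window. Each of those could then be dropped with a single delete-index request, rather than a delete-by-query over one large index.

```python
from datetime import date, datetime, timedelta

# Assumed naming scheme for the daily alert indices: wazuh-alerts-4.x-YYYY.MM.DD
PREFIX = "wazuh-alerts-4.x-"


def expired_indices(index_names, today, retention_days):
    """Return the daily indices whose date suffix is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(PREFIX):
            continue  # not a daily alert index, leave it alone
        try:
            day = datetime.strptime(name[len(PREFIX):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, skip it
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


# Hypothetical index list, for illustration only.
names = [
    "wazuh-alerts-4.x-2022.02.28",
    "wazuh-alerts-4.x-2022.03.03",
    "wazuh-alerts-4.x-2023.03.01",
    "wazuh-monitoring-2022.01.01",
]

to_delete = expired_indices(names, today=date(2023, 3, 3), retention_days=365)
print(to_delete)
```

In practice an ISM/ILM policy does this selection for you; the sketch only shows why date-named indices make retention a matter of dropping whole indices.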

As mentioned, the rotation or index creation settings should be chosen based on the size of your indices, since a poor choice can cause performance problems in your cluster. This blog post explains what makes up an index, what a shard is, and how to configure them:

https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster

I would recommend using time-based indices. You can either keep indices that are time-based by name or use rollover to create new indices based on size and/or age.

Whether you use rollover or indices that are time-based by name, you can use ISM/ILM to manage the rollover (if applicable) and retention.
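As a rough illustration of such a retention policy, the Python sketch below builds a policy body in the shape OpenSearch ISM expects (states, actions, and transitions on `min_index_age`), using the 30-day cold / 365-day delete values mentioned in the question. The helper name `build_retention_policy`, the `wazuh-alerts-*` pattern, the priority value, and the choice of `read_only` as the "cold" action are assumptions for the example, not a recommended configuration.

```python
import json


def build_retention_policy(cold_after_days, delete_after_days, index_pattern):
    """Build an ISM-style policy body: hot -> cold (read-only) -> delete."""
    if cold_after_days >= delete_after_days:
        raise ValueError("cold_after_days must be smaller than delete_after_days")
    return {
        "policy": {
            "description": "Alert retention sketch (illustrative values)",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {
                            "state_name": "cold",
                            "conditions": {"min_index_age": f"{cold_after_days}d"},
                        }
                    ],
                },
                {
                    "name": "cold",
                    # Assumption: "cold" here simply means the index becomes read-only.
                    "actions": [{"read_only": {}}],
                    "transitions": [
                        {
                            "state_name": "delete",
                            "conditions": {"min_index_age": f"{delete_after_days}d"},
                        }
                    ],
                },
                {
                    "name": "delete",
                    # The whole index is removed, no delete-by-query involved.
                    "actions": [{"delete": {}}],
                    "transitions": [],
                },
            ],
            # Assumed pattern so newly created alert indices pick up the policy.
            "ism_template": {"index_patterns": [index_pattern], "priority": 1},
        }
    }


policy = build_retention_policy(30, 365, "wazuh-alerts-*")
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of body you would submit through the index management UI or the ISM policy API; check the field names against the documentation for your indexer version before applying it.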

I hope this information helps. Please let me know if you have any other questions.
