Wazuh Archives - Indices Rotation


Khul Sat

Jun 16, 2023, 3:55:17 PM
to Wazuh mailing list
Hello Team..

I find the indexer part of the Wazuh components complex. At least, my brain stops working when it comes to indexers or OpenSearch. I have been exploring Wazuh on my own for just 6 months, with the help of this awesome community group.

(Apologies in advance if I am using different terms per my understanding)

What I want to achieve:
Create an index pattern that moves data older than 60 days to a new pattern. I found this article and followed the steps, hoping that older data would be moved to archives. But I am still confused, because I did not get any option to choose that 60-day rotation. I thought that `wazuh-archives-*` would be empty, but it already holds a huge amount of data.

Could someone please help me? I read a couple of documents, but my small brain failed to absorb things.

I also read this article about creating policies, but that too was difficult to understand.
As I have moved my setup to production, I am afraid of losing data if I do something wrong.
Please help!

Thanks, KS




Jose Camargo

Jun 16, 2023, 5:34:01 PM
to Wazuh mailing list
Hi Khul,

Thank you for reaching out to us. We understand that managing the indexer component can be complex, especially when it comes to index patterns and data rotation.
The 'wazuh-archives-*' indices store data from /var/ossec/logs/archives/, and that data is only available if the <logall_json> option (in your manager's ossec.conf) is enabled. With this option enabled, Wazuh saves every log it receives, even if it does not match any rule or decoder. So, if you see a large amount of data there, you should try disabling the logall_json option (if you don't need that data).
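
To check how the option is currently set, a minimal sketch along these lines might help. It assumes the configuration lives at /var/ossec/etc/ossec.conf (adjust if your installation differs) and simply scans the text for <logall_json> values; the function name is made up for illustration.

```python
import re

# Assumed default location of the manager configuration; adjust as needed.
OSSEC_CONF = "/var/ossec/etc/ossec.conf"


def logall_json_values(conf_text: str) -> list:
    """Return every <logall_json> value found, ignoring XML comments."""
    # Drop commented-out blocks so a disabled example line is not counted.
    without_comments = re.sub(r"<!--.*?-->", "", conf_text, flags=re.DOTALL)
    found = re.findall(
        r"<logall_json>(.*?)</logall_json>", without_comments, flags=re.DOTALL
    )
    # Normalise whitespace and case so " YES " and "yes" compare equal.
    return [value.strip().lower() for value in found]
```

On the manager, you could read the file with open(OSSEC_CONF).read() and pass the text to logall_json_values(); a result containing "yes" means the archives are being written.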

I'm not sure why you would want to move data from one index pattern to another, as it can cause issues depending on the templates applied. If your final goal is to save space and keep a clean indexer, you can apply the recommended policy:

{
    "policy": {
        "description": "Wazuh index state management for OpenDistro to move indices into a cold state after 60 days and delete them after a year.",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [
                    {
                        "replica_count": {
                            "number_of_replicas": 1
                        }
                    }
                ],
                "transitions": [
                    {
                        "state_name": "cold",
                        "conditions": {
                            "min_index_age": "60d"
                        }
                    }
                ]
            },
            {
                "name": "cold",
                "actions": [
                    {
                        "read_only": {}
                    }
                ],
                "transitions": [
                    {
                        "state_name": "delete",
                        "conditions": {
                            "min_index_age": "365d"
                        }
                    }
                ]
            },
            {
                "name": "delete",
                "actions": [
                    {
                        "delete": {}
                    }
                ],
                "transitions": []
            }
        ],
        "ism_template": {
            "index_patterns": ["wazuh-alerts*"],
            "priority": 100
        }
    }
}

What this policy does is move indices older than 60 days to a "cold" state, where the read_only action blocks further writes while the data stays in place, and then delete indices older than 365 days. You can modify these time periods as you prefer, but it is recommended to apply this kind of policy.
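
As an illustration of how the transitions in the policy above play out, here is a small sketch that walks the hot, cold, and delete states for an index of a given age. It is a simplified model written for this thread, not how the indexer is implemented: the real plugin re-evaluates managed indices periodically and measures age from index creation, and the helper names are hypothetical.

```python
import re

# State names and transition ages copied from the policy above.
TRANSITIONS = {
    "hot": [("cold", "60d")],
    "cold": [("delete", "365d")],
    "delete": [],
}


def age_in_days(value: str) -> int:
    """Convert an age such as '60d' to days (only the 'd' unit is handled)."""
    match = re.fullmatch(r"(\d+)d", value)
    if match is None:
        raise ValueError(f"unsupported age: {value!r}")
    return int(match.group(1))


def state_for_age(index_age_days: int, start: str = "hot") -> str:
    """Follow transitions from `start` while their min_index_age is met."""
    state = start
    moved = True
    while moved:
        moved = False
        for target, min_age in TRANSITIONS[state]:
            if index_age_days >= age_in_days(min_age):
                state, moved = target, True
                break
    return state


for days in (10, 60, 200, 365):
    print(days, "->", state_for_age(days))
# 10 -> hot
# 60 -> cold
# 200 -> cold
# 365 -> delete
```

Changing the two ages in TRANSITIONS shows how a different retention period would shift the boundaries.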

I'll be awaiting your comments.

Regards,
Jose Camargo

Khul Sat

Jun 16, 2023, 9:44:07 PM
to Wazuh mailing list

Thanks a lot for the quick reply!

<logall_json> is enabled so that all raw logs are stored, as per the compliance requirement. Archive logs older than 90 days are then dumped to a remote location. Is there a different approach I should adopt to achieve this? I would like to explore other options, if any.
I read in multiple threads/posts that once the indices grow bigger, they become difficult to manage and searches take longer. Hence my idea was to move data older than 60 days from wazuh-alerts-* to wazuh-archives-*. Maybe my understanding is wrong? Now I get the relation between logall_json and wazuh-archives-*. If asked during audits to show old logs, I can import the archived zips and present the data to the auditors, right?

I checked a few posts about the policies you mentioned above. My query is: where does this data go after 60 days? Will it stay in wazuh-alerts-* itself, or will another index pattern be created? This is kind of confusing for a noob like me.
This hot/cold would be a storage parameter, right? How can I check my configuration and decide on a further course of action?

Regards, KS

Khul Sat

Jun 21, 2023, 4:28:46 AM
to Wazuh mailing list
Any suggestions please?

Jose Camargo

Jun 21, 2023, 3:19:58 PM
to Wazuh mailing list
Hi Khul,

For your first question, there are different ways of doing it. One is to use a cronjob to move the log files in /var/ossec/logs/archives to another disk or network location; you can also write a script that does the same but uploads the logs to an AWS bucket or similar, depending on your needs. The other is to use the process described here, which takes a similar approach but with the indices instead of the logs.

As you are going to have archives enabled, which, as you mentioned, can consume lots of space, you will probably have to set up a backup procedure for either the physical logs or the indices.

If you choose to back up the physical logs, then it's OK to set up an Index Policy that removes indices older than, let's say, 60 days, because all of the data will still be in the logs; in case of an audit, you can show those logs as proof. The same applies in reverse: if you back up your indices, you can create a simple cronjob that removes the physical logs after x amount of time, as you will still have the backed-up indices to recover in case of an audit.
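
The cronjob option for the physical logs could be sketched roughly as follows. This is only a starting point under a few assumptions: the rotated archives are compressed files ending in .gz, file modification time is a good enough measure of age, and the destination sits outside the source directory. The function name and the backup path in the usage note are placeholders.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def move_old_archives(
    src: Path, dst: Path, max_age_days: int, now: Optional[float] = None
) -> List[Path]:
    """Move compressed archive files older than max_age_days from src to dst.

    The directory layout below src is recreated under dst. Only '*.gz' files
    are considered, so the log currently being written is left alone.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    moved = []
    # sorted() materialises the listing before any file is moved.
    for path in sorted(src.rglob("*.gz")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            target = dst / path.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

A script run from cron could then call, for example, move_old_archives(Path("/var/ossec/logs/archives"), Path("/mnt/backup/wazuh-archives"), 90), where the destination is whatever disk or network mount you use. Try it on a copy of the directory first.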

Regarding the Index Policy and the hot/cold states, you can check more info on that here. Basically, hot refers to recent data that is searchable and can still be modified if necessary, so it is data that you need to access regularly. Cold refers to older data that still exists but with a reduced level of access; in the policy I shared, the cold state applies the read_only action, so those indices no longer accept writes. The data stays on the same index pattern, so it is not moved to a different one; only the level of access to it changes.

I strongly suggest that you test such behavior in a local lab before applying such policies to a production environment. You can set up a test policy that sends indices to cold state after one day, just so you can see how it behaves.
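
For such a lab test, one possible sketch is to derive a short-lived copy of the policy shared earlier in the thread and prepare the request that would create it. The base URL, policy ID, and credentials used with it are assumptions for a local lab, and the helper names are hypothetical. The path follows the OpenSearch ISM API (`_plugins/_ism/policies/<policy_id>`); older Open Distro deployments use the `_opendistro/_ism/policies/` prefix instead. The request is only built here, not sent.

```python
import base64
import copy
import json
import urllib.request

# Compact copy of the policy shared earlier in the thread.
POLICY = {
    "policy": {
        "description": "Lab copy of the Wazuh hot/cold/delete policy.",
        "default_state": "hot",
        "states": [
            {"name": "hot",
             "actions": [{"replica_count": {"number_of_replicas": 1}}],
             "transitions": [{"state_name": "cold",
                              "conditions": {"min_index_age": "60d"}}]},
            {"name": "cold",
             "actions": [{"read_only": {}}],
             "transitions": [{"state_name": "delete",
                              "conditions": {"min_index_age": "365d"}}]},
            {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
        ],
        "ism_template": {"index_patterns": ["wazuh-alerts*"], "priority": 100},
    }
}


def make_lab_policy(policy: dict, ages: dict) -> dict:
    """Copy a policy, replacing min_index_age for transitions out of the named states."""
    lab = copy.deepcopy(policy)  # leave the original policy untouched
    for state in lab["policy"]["states"]:
        if state["name"] in ages:
            for transition in state["transitions"]:
                transition["conditions"]["min_index_age"] = ages[state["name"]]
    return lab


def build_put_request(base_url, policy_id, policy, user, password):
    """Build (but do not send) the PUT request that creates the policy."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/_plugins/_ism/policies/{policy_id}",
        data=json.dumps(policy).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="PUT",
    )
```

For example, make_lab_policy(POLICY, {"hot": "1d"}) gives a policy that sends indices to cold after one day, and build_put_request("https://localhost:9200", "wazuh-lab-policy", lab_policy, user, password) prepares the call. Sending it with urllib.request.urlopen() would also need an SSL context that trusts your lab certificate.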


I'll be awaiting your comments.


Regards,
Jose Camargo

Khul Sat

Jul 6, 2023, 8:26:38 AM
to Wazuh mailing list
Thank you Jose!

What is advised and recommended when it comes to backup and restore?
For now, I have taken the first approach:
Enable logall_json > move archived logs from /var/ossec/logs/archives to an S3 bucket on a monthly basis.

This approach is applied on the Manager. The second option you mentioned, which involves the Indexer, I shall certainly study.

If we talk about disk utilization, which option would consume more disk?


Regards - KS

Khul Sat

Jul 14, 2023, 1:17:46 AM
to Wazuh mailing list
Please suggest.
Thanks, KS

Jose Camargo

Jul 17, 2023, 6:59:24 PM
to Wazuh mailing list
Hi Khul,

Sorry for the late response. Again, you can do the same process for the logs in /var/ossec/logs/alerts, depending on your compliance standard.
This also depends on how your environment is deployed and how much data you are ingesting. Still, it is always recommended to have these policies at least for alerts, archives, and the indices, as they consume the most disk space. If you allow me to rank them by disk space consumption: archives come first, then the indices, and then alerts. Archives are the biggest because they catch everything; the indices come next because each index holds the respective day's data plus extra enrichment information; and then come alerts, which are the regular logs.
This does not mean that you can take care of only one and forget about the others; as I mentioned before, it is recommended to have policies for all three of them.
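
To see how this ranking looks in your own environment, a rough sketch like the one below can total the on-disk size of each location. The function names are made up for illustration, and the indexer data path mentioned in the usage note is an assumption; it only applies if the indexer runs on the same host.

```python
import os
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files below root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = Path(dirpath) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def rank_by_size(paths):
    """Return (path, size_in_bytes) pairs, largest first."""
    sizes = {str(p): directory_size_bytes(Path(p)) for p in paths}
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)
```

On a single-node setup you might call rank_by_size(["/var/ossec/logs/archives", "/var/ossec/logs/alerts", "/var/lib/wazuh-indexer"]), replacing the last path with wherever your indexer stores its data.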

I'll be awaiting your comments.

Regards,
Jose Camargo
