Wazuh not ingesting some CloudTrail entries

Jeremy Utley

Nov 5, 2025, 4:07:32 PM
to Wazuh | Mailing List
We have Wazuh analyzing CloudTrail entries from our AWS accounts. One of the things we trigger alerts on is IAM user password changes. This morning one of our users changed their password and we did not get an alert, so I started investigating. First I checked the event history in the AWS CloudTrail web console, and was able to see the event there. Next, I downloaded the CloudTrail log files from S3, decompressed them, and looked for that same entry by its eventID field. I located it in the file:

464811824699_CloudTrail_us-east-1_20251105T1550Z_sKVFnqkhhQ8df26K.json.gz

So the event definitely got written into the S3 logs. I searched the Wazuh Dashboard's archives index by that eventID and got no results. So I compared the file from S3 with the events in my archives.json file, and got some interesting results.

First, the file from S3:

$ cat 464811824699_CloudTrail_us-east-1_20251105T1550Z_sKVFnqkhhQ8df26K.json | jq '.Records[].eventID' | wc -l
4274

So the file on S3 has over 4k events in it.

Since the Wazuh events include the source log_file in them, I looked in the Wazuh archives.json file for any events related to this file:

# cat /var/ossec/logs/archives/archives.json | grep "sKVFnqkhhQ8df26K" | wc -l
7

And when I examine those 7 events, they are actually later events from me downloading that file from S3. There are absolutely no events in the archives.json file with this file as their source.

Unfortunately, I'm investigating this after the issue occurred, so of course I don't have debug logs from the AWS wodle to look at. The normal OSSEC logs show no evidence of any issues, though.

Any ideas on why this might be occurring?

Jeremy Utley

Javier Adán Méndez Méndez

Nov 5, 2025, 10:52:24 PM
to Wazuh | Mailing List

Hi Jeremy,

I did some digging and a few things could explain why the CloudTrail file wasn’t processed even though it’s in S3:

  1. Time mismatch – If the events inside the file have timestamps outside the range that Wazuh expects (for example, clock difference between AWS and the manager), the wodle might skip the file.
    → Check the manager’s system time and compare it with the event timestamps.

  2. File parsing error – Sometimes the file downloads correctly but fails during decompression or JSON parsing (for example, if the .gz is corrupted or incomplete).
    → Test the archive with gzip -t and run it through jq to confirm it's valid JSON.

  3. Filters or region mismatch – The wodle configuration may only include certain regions or services.
    → Review your <wodle name="aws-s3"> block in /var/ossec/etc/ossec.conf and confirm that us-east-1 and cloudtrail are included.

Please let me know if one of those helps.

Javier Mendez
Wazuh team

Jeremy Utley

Nov 6, 2025, 10:59:08 AM
to Wazuh | Mailing List

Hi Javier!

Thanks for the suggestions!

I don't think a time mismatch is the problem, as the clock on the Wazuh EC2 instance is correct, and other CloudTrail logs from the same timeframe were successfully imported. I definitely was able to manually download, decompress, and run the file through jq without any error. Our CloudTrail bucket is defined as:

<bucket type="cloudtrail">
  <name>cloudtrail-logs-5a3ea689</name>
  <path>Cloudtrail/</path>
  <only_logs_after>2024-JAN-01</only_logs_after>
</bucket>

According to the docs, the region configuration is optional, and again, since many other events are being ingested, I believe this configuration is working properly.

Right now, I'm going to work on some scripting that will get a list of all log files for a specific day in the CloudTrail bucket, then check each log file against OpenSearch to see whether any events were received from it. Hopefully, I can find out this way whether this is a recurring problem or a one-off, and if it is recurring, maybe I can enable the debug logs and wait for it to occur again. Roughly, something like the sketch below.
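A minimal sketch of that reconciliation script, assuming boto3 and opensearch-py are available; the indexer endpoint, credentials, and index pattern are placeholders for our environment:

#!/usr/bin/env python3
# Rough reconciliation sketch: list a day's CloudTrail files in S3, then
# ask OpenSearch whether any event was ingested from each file.
import boto3
from opensearchpy import OpenSearch

BUCKET = "cloudtrail-logs-5a3ea689"
PREFIX = "Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/"
INDEX = "wazuh-archives-*"  # placeholder index pattern

s3 = boto3.client("s3")
# Placeholder endpoint; real setups will need auth/SSL options.
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Every CloudTrail log file S3 holds for the day.
keys = [
    obj["Key"]
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET, Prefix=PREFIX)
    for obj in page.get("Contents", [])
]

missing = []
for key in keys:
    # Depending on the field mapping, this may need to be a term query
    # on a .keyword subfield instead of a match_phrase.
    body = {"query": {"match_phrase": {"data.aws.log_info.log_file": key}}}
    if client.count(index=INDEX, body=body)["count"] == 0:
        missing.append(key)

print(f"{len(missing)} of {len(keys)} files have no events indexed")
for key in missing:
    print(key)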

One thought I had: we use the "subscriber type=buckets" method (S3 write events plus SQS) for some other things, both SecurityHub and ingesting the Cloudflare logs that Cloudflare writes into our S3. That approach seems a little more efficient: less chance of missing files due to race conditions, and more ability to distribute ingestion across multiple nodes. Would it be a more reliable way of ingesting CloudTrail as well? And would the JSON structure change so much that the existing Wazuh rules would no longer work?
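For reference, the shape of the configuration I mean (the queue name here is just a placeholder):

<wodle name="aws-s3">
  <subscriber type="buckets">
    <sqs_name>cloudtrail-events-queue</sqs_name>
  </subscriber>
</wodle>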

Jeremy

Jeremy Utley

Nov 7, 2025, 12:33:49 PM
to Wazuh | Mailing List
Hi Wazuh devs!

I did some deep investigation of this issue yesterday and this morning, and have come to the conclusion that there is a serious bug in Wazuh's AWS CloudTrail ingestion. This email might get a little long, but I feel the detail is necessary to document and explain the problem.

Environment:
Wazuh Version: 4.12.0 (x86_64)
OS: Amazon Linux 2

Yesterday, I set wazuh_modules.debug = 2 on our running instance, then waited a little while for debug logs to come in. I decided to focus my investigation on one specific set of logs, namely the CloudTrail logs of a single account (464811824699) in the us-east-1 region. This is our most heavily used account, generating thousands of CloudTrail entries every few minutes.
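(For anyone reproducing this: the setting goes in /var/ossec/etc/local_internal_options.conf, followed by a manager restart.)

# /var/ossec/etc/local_internal_options.conf
wazuh_modules.debug=2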

First, the relevant ossec.log entries with debug enabled, but non-relevant data cut out:

2025/11/06 18:20:46 wazuh-modulesd:aws-s3[1426] wm_aws.c:84 at wm_aws_main(): INFO: Starting fetching of logs.
2025/11/06 18:20:46 wazuh-modulesd:aws-s3[1426] wm_aws.c:136 at wm_aws_main(): INFO: Executing Bucket Analysis: (Bucket: cloudtrail-logs-5a3ea689, Path: Cloudtrail/, Type: cloudtrail)
2025/11/06 18:20:46 wazuh-modulesd:aws-s3[1426] wm_aws.c:379 at wm_aws_run_s3(): DEBUG: Create argument list
2025/11/06 18:20:46 wazuh-modulesd:aws-s3[1426] wm_aws.c:494 at wm_aws_run_s3(): DEBUG: Launching S3 Command: wodles/aws/aws-s3 --bucket cloudtrail-logs-5a3ea689 --trail_prefix Cloudtrail/ --only_logs_after 2024-JAN-01 --type cloudtrail --debug 2
...
DEBUG: +++ Marker: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1815Z_mT5isOkzuCsvdT5N.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1815Z_nN57He9CcCZI0atM.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1815Z_quTqxDRfWIKewbhi.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1815Z_xNLYCe5lOhLx9iL4.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_3YHZBJUq7RjnXVpa.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_7ekYStSqGjmlhLme.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_8xDhhJkwDX3fl9Bp.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_A4M6Q8yoBrMtPvuv.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_N2tOEzzp8W4IciuF.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_RlOLdU0pRkOjC0HW.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_TaFUOtg6dvR20792.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_V4SDZUzTLKf4sixS.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_XRSWN9BACzkHYNBl.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_c7jYmEd6afCn0Aup.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_jfNx1jvrIxTyTKid.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_mqF9rwRfdkgF3vcd.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_n2OXQS2jyDE7cmZf.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_wntHuU3rAE1GJKue.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1820Z_zNO4a2mNGPPmJhpm.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1825Z_VVX7lWU4WAcepzRk.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1825Z_ceZmYcMYFd5XXP1n.json.gz
DEBUG: +++ DB Maintenance
...
2025/11/06 18:23:38 wazuh-modulesd:aws-s3[1426] wm_aws.c:201 at wm_aws_main(): INFO: Fetching logs finished.
...
2025/11/06 18:25:46 wazuh-modulesd:aws-s3[1426] wm_aws.c:84 at wm_aws_main(): INFO: Starting fetching of logs.
2025/11/06 18:25:46 wazuh-modulesd:aws-s3[1426] wm_aws.c:136 at wm_aws_main(): INFO: Executing Bucket Analysis: (Bucket: cloudtrail-logs-5a3ea689, Path: Cloudtrail/, Type: cloudtrail)
2025/11/06 18:25:46 wazuh-modulesd:aws-s3[1426] wm_aws.c:379 at wm_aws_run_s3(): DEBUG: Create argument list
2025/11/06 18:25:46 wazuh-modulesd:aws-s3[1426] wm_aws.c:494 at wm_aws_run_s3(): DEBUG: Launching S3 Command: wodles/aws/aws-s3 --bucket cloudtrail-logs-5a3ea689 --trail_prefix Cloudtrail/ --only_logs_after 2024-JAN-01 --type cloudtrail --debug 2
...
DEBUG: +++ Marker: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1825Z_ceZmYcMYFd5XXP1n.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1825Z_eV7ZTAbrNkqqV7s8.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1825Z_hCxmodiASno13OOK.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1825Z_utekoji3oD5VOsnh.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1825Z_wEMhQYC1iO5xpz9m.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1825Z_yQ0kNGbZKabKNgBd.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1830Z_X5Cd1Pguwp2C8nfw.json.gz
DEBUG: ++ Found new log: Cloudtrail/AWSLogs/464811824699/CloudTrail/us-east-1/2025/11/06/464811824699_CloudTrail_us-east-1_20251106T1830Z_mC9GAucLKKJOomZb.json.gz
DEBUG: +++ DB Maintenance

So from this, we can see the following:
18:20:46 - S3 wodle starts the log fetching process for this account.

Once it gets to the us-east-1 region, it uses the file 464811824699_CloudTrail_us-east-1_20251106T1815Z_mT5isOkzuCsvdT5N.json.gz as the marker, which has a last-modified time of 2025-11-06T18:10:37.000Z. It proceeds to download a number of files: 3 with a "T1815Z" timestamp, 15 with "T1820Z", and 2 with "T1825Z".

18:25:46 - S3 wodle starts the next log fetch cycle.

At this point, it uses the file 464811824699_CloudTrail_us-east-1_20251106T1825Z_ceZmYcMYFd5XXP1n.json.gz as the marker, with a last-modified time of 2025-11-06T18:20:26.000Z. It proceeds to download 5 files with "T1825Z" and 2 with "T1830Z".

Now here comes the bug! When I compare those 15 files with the "T1820Z" timestamp against the list in S3, S3 actually has 18 files with that timestamp. Comparing the two lists and pulling the last-modified times of the 3 missing files shows the following:

464811824699_CloudTrail_us-east-1_20251106T1820Z_1kg87Tbe9k8bmKlp.json.gz - last modified 2025-11-06T18:22:22.000Z
464811824699_CloudTrail_us-east-1_20251106T1820Z_99MbCyo86R00FiBR.json.gz - last modified 2025-11-06T18:21:57.000Z
464811824699_CloudTrail_us-east-1_20251106T1820Z_cJ0PSMsYmdeR07VL.json.gz - last modified 2025-11-06T18:24:20.000Z

So each of those 3 missing files was actually written AFTER 464811824699_CloudTrail_us-east-1_20251106T1825Z_ceZmYcMYFd5XXP1n.json.gz at 18:20:26. But the polling cycle that started at 18:25:46 completely ignored them! The root cause appears to be that the marker advances based on the last processed filename (lexicographically), not by timestamp. Once the marker moves to a file from T1825Z (ceZmYcMYFd5XXP1n.json.gz at 18:20:26), files from T1820Z that arrive later are never reconsidered, even though CloudTrail's eventual consistency means files can arrive several minutes after their timestamp.
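To make the failure mode concrete, here is a toy illustration of my reading of the marker logic (not the wodle's actual code):

# Only keys that sort lexicographically after the marker count as new,
# so a file that arrives late under an older-looking name is skipped forever.
marker = "464811824699_CloudTrail_us-east-1_20251106T1825Z_ceZmYcMYFd5XXP1n.json.gz"  # last-modified 18:20:26

# Written to S3 at 18:22:22, i.e. AFTER the marker file, but the T1820Z
# name sorts before the T1825Z marker.
late = "464811824699_CloudTrail_us-east-1_20251106T1820Z_1kg87Tbe9k8bmKlp.json.gz"

print([key for key in [late] if key > marker])  # [] -- the late file is never picked up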

I scanned the entire log file for Nov 6 this morning, and found no evidence that those files were ever pulled in.

I also ran some comparison scripts between the entire list of files in S3 and the data in OpenSearch (referencing the data.aws.log_info.log_file field). In our environment, when I ran this test yesterday against roughly 18 hours of data, somewhere around 40% of my CloudTrail log files had been missed, and I suspect it's all due to this bug.

I think this needs to be treated as a very serious bug. For ourselves, I'm going to test the subscriber type="buckets" option (S3 data events plus SQS) I mentioned in my previous email. Assuming the data structure is similar enough that our existing rules still work when ingesting data that way, I think that's the safer solution.

Jeremy Utley

Javier Adán Méndez Méndez

Nov 7, 2025, 1:19:51 PM
to Wazuh | Mailing List

Hi Jeremy,

Thanks a lot for the detailed analysis — this is really helpful.

It looks like the issue might indeed be related to how the marker logic works when handling CloudTrail files that arrive later. I’ll escalate this internally with the AWS team so they can review the behavior and confirm if any adjustments are needed on that side.

I’ll let you know as soon as I have any updates from them.


Javier Mendez

Wazuh dev team

Javier Adán Méndez Méndez

Nov 7, 2025, 3:51:15 PM
to Wazuh | Mailing List
Hi Jeremy,

Could you please open an issue on the Wazuh repository (https://github.com/wazuh/wazuh/issues)? The Cloud team will take it up to fix that. Please share all the information that you can and be as clear as you can.

Thanks,

Javier Mendez
Wazuh Team

Javier Adán Méndez Méndez

Nov 8, 2025, 1:41:03 AM
to Wazuh | Mailing List

Could you please open an issue on the Wazuh repository: https://github.com/wazuh/wazuh/issues
The Cloud team will take care of fixing that. Please share as much information as possible and be as clear as you can.

Thanks,
Javier Mendez
Wazuh Team
