Per_bucket monitor performance


никита какдела

Jan 22, 2026, 7:06:20 AM
to Wazuh | Mailing List
Hi!
I've noticed that sometimes I don't receive a notification for a triggered alert, meaning the action isn't executed.
These errors often appear for both of my monitors: they sometimes send alerts correctly and sometimes don't, and I can't figure out what's causing this.
Here is what I see in the cluster logs:
"""
[2026-01-22T06:39:22,345][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message X4L_5pkBS6jN-8SDuQFi: org.opensearch.notifications.spi.model.MessageContent@e87df90
[2026-01-22T06:58:49,820][ERROR][o.o.a.r.RestSearchMonitorAction] [node-1] The monitor parsing failed. Will return response as is.
[2026-01-22T06:59:02,315][ERROR][o.o.a.r.RestSearchMonitorAction] [node-1] The monitor parsing failed. Will return response as is.
[2026-01-22T06:59:02,374][ERROR][o.o.a.r.RestSearchMonitorAction] [node-1] The monitor parsing failed. Will return response as is.
[2026-01-22T07:06:15,094][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message X4L_5pkBS6jN-8SDuQFi: org.opensearch.notifications.spi.model.MessageContent@576b9a8d
[2026-01-22T07:06:19,061][ERROR][o.o.a.BucketLevelMonitorRunner] [node-1] Failed to retrieve sample documents for alert hcWG5JsB-dPPuwmWgt0Z from trigger CVEEApsBoeamHjFYzAQF of monitor ClEEApsBoeamHjFYzAQL during execution ClEEApsBoeamHjFYzAQL_2026-01-22T07:06:18.998200819_7b9b6f74-a924-46c7-8283-d1a8e7a03d96.
[2026-01-22T07:06:19,280][ERROR][o.o.a.BucketLevelMonitorRunner] [node-1] Failed to retrieve sample documents for alert hcWG5JsB-dPPuwmWgt0Z from trigger CVEEApsBoeamHjFYzAQF of monitor ClEEApsBoeamHjFYzAQL during execution ClEEApsBoeamHjFYzAQL_2026-01-22T07:06:18.998200819_7b9b6f74-a924-46c7-8283-d1a8e7a03d96.
[2026-01-22T07:06:19,485][ERROR][o.o.a.BucketLevelMonitorRunner] [node-1] Failed to retrieve sample documents for alert hcWG5JsB-dPPuwmWgt0Z from trigger CVEEApsBoeamHjFYzAQF of monitor ClEEApsBoeamHjFYzAQL during execution ClEEApsBoeamHjFYzAQL_2026-01-22T07:06:18.998200819_7b9b6f74-a924-46c7-8283-d1a8e7a03d96.
[2026-01-22T08:34:42,189][ERROR][o.o.a.r.RestSearchMonitorAction] [node-1] The monitor parsing failed. Will return response as is.
[2026-01-22T08:37:52,580][ERROR][o.o.a.r.RestSearchMonitorAction] [node-1] The monitor parsing failed. Will return response as is.
[2026-01-22T10:21:43,173][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message xF8JnJoBovKpQ5b8ijIc: org.opensearch.notifications.spi.model.MessageContent@61042c0e
[2026-01-22T10:24:43,255][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message xF8JnJoBovKpQ5b8ijIc: org.opensearch.notifications.spi.model.MessageContent@3ef079c4
[2026-01-22T10:27:48,203][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message xF8JnJoBovKpQ5b8ijIc: org.opensearch.notifications.spi.model.MessageContent@b0229c6
"""
I'd like you to look at my monitors and suggest how I can improve or "lighten" them, because one of the monitors returns over 3,000 hits and I don't know whether that is normal.
I've attached two txt files (both of my monitors as JSON); please take a look.

Monitor_1.txt
Monitor_2.txt

hasitha.u...@wazuh.com

Jan 26, 2026, 1:49:42 AM
to Wazuh | Mailing List
Hi никита какдела,

Your Wazuh setup with OpenSearch Alerting is experiencing inconsistent notifications. Based on your logs and monitor configurations, here's what's happening and how to fix it.

Note: Monitor_2 is currently disabled, so some errors may be from previous runs.

Error: "Exception sending webhook message"
Causes:
  • Network connectivity issues between Docker containers
  • DNS resolution failures
  • Firewall blocking requests
Fix:
  • Go to Wazuh Dashboard > Explore > Notifications > Channels
  • Test each channel (Yandex and Kaiten) using "Send test message"
  • If it fails, verify your webhook URLs are using correct Docker container names/IPs
2. Monitor Parsing Errors

Error: "The monitor parsing failed."
Causes:
  • Syntax issues in monitor JSON
  • Schema mismatches in configuration
Fix: Review and validate your monitor JSON syntax
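If it helps, you can pull the stored monitor definitions from Dev Tools and review their JSON there. A minimal sketch using the Alerting search API (the match_all query is just an example; narrow it as needed):

GET _plugins/_alerting/monitors/_search
{
  "query": {
    "match_all": {}
  }
}

The response lists each monitor's _id together with its full configuration, which you can compare against the JSON you attached.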

3. Sample Document Retrieval Failures
Error: "Failed to retrieve sample documents for alert."
Causes:
  • Query returns no results
  • Index unavailable
  • Field access issues
Note: This is common in bucket-level monitors and may not always indicate a critical problem.
About Your Current Setup
You're running bucket-level monitors that:
  • Track Windows logon events (rule 100014)
  • Aggregate by username
  • Count unique IPs and hosts
  • Monitor_1 (enabled): Scans last 5 minutes, triggers when >4 unique IPs detected
  • Monitor_2 (disabled): Scans last 30 minutes, triggers when >2 unique hosts detected
The "3,000+ hits" means your aggregations are processing many documents, which can strain your two-node cluster (high CPU/memory usage, slower performance).

Recommended Fixes

Quick Wins

1. Reduce the data load:
  • Shorten time windows: Change Monitor_1 from 5 minutes to 1 minute if you're seeing 3,000+ documents
  • Add more filters: Exclude common usernames or internal IP addresses
  • Run less frequently: Change schedule from every 1 minute to every 5 minutes
2. Test your webhooks:
  • Use "Send test message" in the Notifications channels
  • Preview triggers before enabling
3. Monitor execution:
Check Dashboard > Alerting > Overview for execution history and errors

Specific Configuration Changes

For Monitor_1:
Change: {{period_end}}||-300s To: {{period_end}}||-60s (Reduces from 5 minutes to 1 minute)

For Monitor_2:
Change: -30m To: -5m (Reduces from 30 minutes to 5 minutes)

Schedule adjustment:
Change: interval: 1 To: interval: 5 (Runs every 5 minutes instead of every 1 minute)
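For reference, a minimal sketch of how the Monitor_1 time window and the schedule would look in the monitor JSON (the structure follows the range filter you are already using; the exact values are suggestions to tune):

"schedule": {
  "period": { "unit": "MINUTES", "interval": 5 }
},
...
"range": {
  "@timestamp": {
    "from": "{{period_end}}||-60s",
    "to": "{{period_end}}",
    "format": "epoch_millis"
  }
}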

Additional Optimizations
  • Keep your composite.size: 20/30 as is (already good)
  • Simplify your alert templates—remove unnecessary loops or links
  • Keep "per-alert policy" on NEW alerts only (you're already doing this correctly)
  • Only enable Monitor_2 after testing in a staging environment
General Best Practices
  •  Use narrow time windows and specific filters
  •  Schedule monitors less frequently to reduce cluster load
  •  Test channels with "Send test message"
  •  Monitor execution history regularly
  •  Start conservative and adjust based on actual alert volume
Let me know if you need further assistance on this.

Ref: https://docs.opensearch.org/latest/observing-your-data/alerting/monitors/

никита какдела

Jan 26, 2026, 8:32:30 AM
to Wazuh | Mailing List
So how can I find out where this error is coming from?
[2026-01-26T12:37:22,038][ERROR][o.o.c.a.u.AlertingException] [node-1] Alerting error: Failed to execute phase [query], all shards failed; shardFailures {[0r5gW0bPQbOsUlC_k27ryg][.opendistro-alerting-config][0]: RemoteTransportException[[node-1][127.0.0.1:9300][indices:data/read/search[phase/query]]]; nested: QueryShardException[Failed to parse query [*d[j*]]; nested: ParseException[Cannot parse '*d[j*': Encountered "<EOF>" at line 1, column 5.

How can I find it? Is it a monitor, or something else?

On Monday, January 26, 2026 at 09:49:42 UTC+3, hasitha.u...@wazuh.com wrote:

hasitha.u...@wazuh.com

Jan 26, 2026, 10:57:32 PM
to Wazuh | Mailing List

Hi никита,

This error is coming from OpenSearch Alerting, and it’s basically a query parsing issue.  

The alert monitor is trying to execute this query:

*d[j*

But the query parser fails with:

Cannot parse '*d[j*': Encountered "<EOF>"

OpenSearch’s official docs list reserved characters in a query_string query and show how they need to be escaped (like [ and *) to avoid parsing errors:

The following is a list of reserved characters for the query string:

  • +-=&&||><!(),{}[]^"~*?:\/

Use a backslash (\) to escape reserved characters. When working with JSON requests, use a double backslash (\\) since the backslash itself is a reserved character that needs escaping.  
For more details regarding the query string syntax, please see the OpenSearch query string documentation.
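For example, a minimal sketch of the same pattern with the [ escaped in a query_string query (the index and field names here are illustrative, not taken from your monitor; note the doubled backslash because this is a JSON body):

GET wazuh-alerts-*/_search
{
  "query": {
    "query_string": {
      "query": "*d\\[j*",
      "default_field": "data.win.eventdata.targetUserName"
    }
  }
}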

If you want to match patterns like *d[j*, the recommended approach is to use a wildcard query rather than raw query_string syntax, because wildcard queries treat the pattern as a value and avoid strict Lucene syntax parsing:

  • OpenSearch documentation explains how to use the wildcard query and its operators (*, ?).

For more details regarding wildcard option usage, please check this guide.
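A minimal sketch of the wildcard-query alternative, again with illustrative index and field names; here [ is treated as a literal character and only * and ? act as operators:

GET wazuh-alerts-*/_search
{
  "query": {
    "wildcard": {
      "data.win.eventdata.targetUserName": {
        "value": "*d[j*"
      }
    }
  }
}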


Let me know if you need further assistance on this.

никита какдела

Jan 27, 2026, 12:40:51 AM
to Wazuh | Mailing List
How can I figure out which monitor is giving the error?

On Tuesday, January 27, 2026 at 06:57:32 UTC+3, hasitha.u...@wazuh.com wrote:

никита какдела

Jan 27, 2026, 3:49:50 AM
to Wazuh | Mailing List
root@wazuh:/home/uwazuh# tail -f /var/log/wazuh-indexer/wazuh-cluster.log | grep ERROR
[2026-01-27T05:38:37,343][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message tAAOOpoBAqvA3MNHy-lM: org.opensearch.notifications.spi.model.MessageContent@3420759a
[2026-01-27T05:42:37,267][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message tAAOOpoBAqvA3MNHy-lM: org.opensearch.notifications.spi.model.MessageContent@ee45b47
[2026-01-27T05:44:37,248][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message tAAOOpoBAqvA3MNHy-lM: org.opensearch.notifications.spi.model.MessageContent@2bb936dd
[2026-01-27T05:48:37,545][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message tAAOOpoBAqvA3MNHy-lM: org.opensearch.notifications.spi.model.MessageContent@4bc098b8
[2026-01-27T05:51:37,254][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message tAAOOpoBAqvA3MNHy-lM: org.opensearch.notifications.spi.model.MessageContent@6fc4be36
[2026-01-27T06:10:27,497][ERROR][o.o.a.BucketLevelMonitorRunner] [node-1] Failed to retrieve sample documents for alert XSoT_psBx4tf4mdOKerh from trigger f7QlspsBd-k2aqqxq1jt of monitor gbQlspsBd-k2aqqxq1j8 during execution gbQlspsBd-k2aqqxq1j8_2026-01-27T06:10:27.374425006_eb1ea7dc-1f00-462a-8392-37cf9469bc49.
[2026-01-27T06:10:27,646][ERROR][o.o.a.BucketLevelMonitorRunner] [node-1] Failed to retrieve sample documents for alert XSoT_psBx4tf4mdOKerh from trigger f7QlspsBd-k2aqqxq1jt of monitor gbQlspsBd-k2aqqxq1j8 during execution gbQlspsBd-k2aqqxq1j8_2026-01-27T06:10:27.374425006_eb1ea7dc-1f00-462a-8392-37cf9469bc49.
[2026-01-27T06:29:37,139][ERROR][o.o.a.c.l.LockService    ] [node-1] Lock is null. Nothing to release.
[2026-01-27T06:30:17,318][ERROR][o.o.n.c.t.WebhookDestinationTransport] [node-1] Exception sending webhook message tAAOOpoBAqvA3MNHy-lM: org.opensearch.notifications.spi.model.MessageContent@70e9a953
[2026-01-27T07:00:57,345][ERROR][o.o.a.r.RestSearchMonitorAction] [node-1] The monitor parsing failed. Will return response as is.


I've provided an example of my errors from wazuh-cluster.log above.

Please describe in detail how I can figure out what is causing these errors, and which monitor is responsible. I've noticed that from time to time some alerts don't trigger.
On Tuesday, January 27, 2026 at 08:40:51 UTC+3, никита какдела wrote:

hasitha.u...@wazuh.com

Jan 31, 2026, 12:37:04 AM
to Wazuh | Mailing List
Hi никита,

Main Problem (Causing Missed Alerts)
[ERROR][o.o.a.BucketLevelMonitorRunner] [node-1] Failed to retrieve sample documents for alert XSoT_psBx4tf4mdOKerh from trigger f7QlspsBd-k2aqqxq1jt of monitor gbQlspsBd-k2aqqxq1j8 ...

Monitor ID gbQlspsBd-k2aqqxq1j8 is failing

This monitor can't retrieve sample documents when trying to create alerts. This causes alerts to either fail completely or get skipped, which explains why you're missing some alerts.

To investigate:

  1. Check which monitor is affected:
GET /_plugins/_alerting/monitors/gbQlspsBd-k2aqqxq1j8

(Run this in Indexer Management → Dev Tools)

  2. Look for more details in the logs:
cat /var/log/wazuh-indexer/wazuh-cluster.log | grep "<monitorID>"

Replace <monitorID> with the problematic monitor ID.
For example: cat /var/log/wazuh-indexer/wazuh-cluster.log | grep "gbQlspsBd-k2aqqxq1j8"
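
You can also run the suspect monitor on demand from Dev Tools and read the trigger and action results directly. A sketch using the Alerting execute API (dryrun=true runs the monitor without creating alerts or sending notifications):

POST _plugins/_alerting/monitors/gbQlspsBd-k2aqqxq1j8/_execute?dryrun=true

The response includes trigger_results with any per-trigger or per-action errors, which usually points at the failing part of the configuration.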
Secondary Issue (Notification Delivery)
[ERROR][o.o.n.c.t.WebhookDestinationTransport] Exception sending webhook message tAAOOpoBAqvA3MNHy-lM

Webhook notification failures

The system is trying to send notifications to a webhook but failing repeatedly. This doesn't stop alerts from triggering, but it means notifications aren't being delivered.

Common causes:

  • Wrong webhook URL
  • Authentication problems
  • Firewall blocking the connection
  • The receiving server rejecting messages

Next steps: Fix the monitor issue first (it's causing the missed alerts), then address the webhook delivery if you need those notifications working.

Let me know if you need help troubleshooting either issue!

никита какдела

Feb 2, 2026, 2:33:25 AM
to Wazuh | Mailing List
I'm having this problem primarily with this monitor (and similar monitors that aggregate a large number of events).
This doesn't happen with other monitors, even though the configuration via sample_documents is the same.

I'm not currently using sample_documents in it while testing, but I do need to get information about the user and IP address.
Before this, I used a similar mustache action template:
{
  "monitor_id": "{{ctx.trigger.id}}",
  "monitor_name": "{{ctx.trigger.name}}",
  "trigger_severity": {{ctx.trigger.severity}},
  "bucket_keys": "{{#ctx.newAlerts}}{{bucket_keys}}{{/ctx.newAlerts}}",
  "wazuh_url":"{{#ctx.newAlerts}}https://wazuh.ovp.ru/app/data-explorer/discover#?_a=(discover:(columns:!(_source),isDirty:!f,sort:!()),metadata:(indexPattern:'wazuh-alerts-*',view:discover))&_q=(filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:rule.id,negate:!f,params:(query:'100052'),type:phrase),query:(match_phrase:(rule.id:'100052'))),('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:data.win.eventdata.targetUserName,negate:!f,params:(query:{{bucket_keys}}),type:phrase),query:(match_phrase:(data.win.eventdata.targetUserName:{{bucket_keys}})))),query:(language:kuery,query:''))&_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-{{ctx.periodStart}},to:now)){{/ctx.newAlerts}}",
  "raw": {
    "src.ip":"{{#ctx.newAlerts}}{{#sample_documents}}{{_source.data.win.eventdata.ipAddress}} {{/sample_documents}}{{/ctx.newAlerts}}",
    "agent.ip":"{{#ctx.newAlerts}}{{#sample_documents}}{{_source.agent.ip}} {{/sample_documents}}{{/ctx.newAlerts}}",
    "agent.name":"{{#ctx.newAlerts}}{{#sample_documents}}{{_source.agent.name}} {{/sample_documents}}{{/ctx.newAlerts}}",
    "src.username":"{{#ctx.newAlerts}}{{#sample_documents}}{{_source.data.win.eventdata.subjectUserName}} {{/sample_documents}}{{/ctx.newAlerts}}",
    "dst.username":"{{#ctx.newAlerts}}{{#sample_documents}}{{_source.data.win.eventdata.targetUserName}} {{/sample_documents}}{{/ctx.newAlerts}}",
    "rule":"{{#ctx.newAlerts}}{{#sample_documents}}{{_source.rule.description}} {{/sample_documents}}{{/ctx.newAlerts}}",
    "periodStart": "{{ctx.periodStart}}",
    "periodEnd": "{{ctx.periodEnd}}"
  }
}
I use this template in all monitors, but in this one I get errors when it's triggered or when I try to send a test message. I attribute this to the fact that it's aggregating across a large number of events (>4000 hits).

I've pasted the JSON of my problematic monitor below; please tell me how to fix it.
For some reason, the monitor can't retrieve sample_documents. Maybe I don't fully understand how it works? How can I fix this?
I really need your help.
{ "name": "MS Windows: Успешное подключение одной УЗ с разных IP адресов", "type": "monitor", "monitor_type": "bucket_level_monitor", "enabled": true, "schedule": { "period": { "unit": "MINUTES", "interval": 2 } }, "inputs": [ { "search": { "indices": [ "wazuh-alerts-current" ], "query": { "size": 0, "query": { "bool": { "filter": [ { "range": { "@timestamp": { "from": "{{period_end}}||-400s", "to": "{{period_end}}", "include_lower": true, "include_upper": true, "format": "epoch_millis", "boost": 1 } } }, { "term": { "rule.id": { "value": "100014", "boost": 1 } } } ], "must_not": [ { "terms": { "data.win.eventdata.targetUserName": [ "ANONYMOUS LOGON", "АНОНИМНЫЙ ВХОД" ], "boost": 1 } } ], "adjust_pure_negative": true, "boost": 1 } }, "aggregations": { "users": { "terms": { "field": "data.win.eventdata.targetUserName", "size": 1000, "min_doc_count": 5, "shard_min_doc_count": 0, "show_term_doc_count_error": false, "order": [ { "_count": "desc" }, { "_key": "asc" } ] }, "aggregations": { "unique_ips": { "cardinality": { "field": "data.win.eventdata.ipAddress" } } } } } } } } ], "triggers": [ { "bucket_level_trigger": { "id": "xGdjA5sBoeamHjFYf3Hq", "name": "MS Windows: Успешное подключение одной УЗ с разных рабочих станций", "severity": "2", "condition": { "buckets_path": { "uniq": "unique_ips.value" }, "parent_bucket_path": "users", "script": { "source": "params.uniq > 3", "lang": "painless" }, "gap_policy": "skip" }, "actions": [ { "id": "notification810195", "name": "Send to Yandex", "destination_id": "X4L_5pkBS6jN-8SDuQFi", "message_template": { "source": "{\n \"chat_id\": \"1/0/191a25c4-b3f1-4e10-a6b1-a412c17b48e5\",\n \"text\": \"WAZUH\\n\\n- 🚨 Событие: {{ctx.monitor.name}}\\n- 🚨 Приоритет: {{ctx.trigger.severity}}\\n- ⏳ Время начала: {{ctx.periodStart}} UTC\\n- ⌛ Время окончания: {{ctx.periodEnd}} UTC {{#ctx.newAlerts}}\\n---\\n- 🙎‍♂️ Инициатор: {{bucket_keys}} {{/ctx.newAlerts}}\"\n}\n", "lang": "mustache" }, "throttle_enabled": false, "subject_template": { "source": "Alerting Notification action", "lang": "mustache" }, "action_execution_policy": { "action_execution_scope": { "per_alert": { "actionable_alerts": [ "NEW" ] } } } }, { "id": "notification644540", "name": "Send_BD", "destination_id": "yA92t5sBd-k2aqqxbrrG", "message_template": { "source": "{\n \"monitor_id\": \"{{ctx.trigger.id}}\",\n \"monitor_name\": \"{{ctx.trigger.name}}\",\n \"trigger_severity\": {{ctx.trigger.severity}},\n \"bucket_keys\": \"{{#ctx.newAlerts}}{{bucket_keys}}{{/ctx.newAlerts}}\",\n \"wazuh_url\":\"{{#ctx.newAlerts}}https://wazuh.ovp.ru/app/data-explorer/discover#?_a=(discover:(columns:!(agent.name,data.win.eventdata.targetUserName,data.win.eventdata.ipAddress,rule.id,rule.description),isDirty:!t,sort:!()),metadata:(indexPattern:'wazuh-alerts-*',view:discover))&_q=(filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:rule.id,negate:!f,params:(query:'100014'),type:phrase),query:(match_phrase:(rule.id:'100014'))),('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:data.win.eventdata.targetUserName,negate:!f,params:(query:{{bucket_keys}}),type:phrase),query:(match_phrase:(data.win.eventdata.targetUserName:{{bucket_keys}})))),query:(language:kuery,query:''))&_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-1y,to:now)){{/ctx.newAlerts}}\",\n \"raw\": {\n \"src.ip\":\"{{#ctx.newAlerts}}{{bucket_keys}}{{/ctx.newAlerts}}\",\n \"periodStart\": \"{{ctx.periodStart}}\",\n \"periodEnd\": \"{{ctx.periodEnd}}\"\n }\n}", "lang": 
"mustache" }, "throttle_enabled": false, "subject_template": { "source": "Alerting Notification action", "lang": "mustache" }, "action_execution_policy": { "action_execution_scope": { "per_alert": { "actionable_alerts": [ "NEW" ] } } } } ] } } ], "ui_metadata": { "schedule": { "timezone": null, "frequency": "interval", "period": { "unit": "MINUTES", "interval": 2 }, "daily": 0, "weekly": { "tue": false, "wed": false, "thur": false, "sat": false, "fri": false, "mon": false, "sun": false }, "monthly": { "type": "day", "day": 1 }, "cronExpression": "0 */1 * * *" }, "monitor_type": "bucket_level_monitor", "search": { "searchType": "query", "timeField": "@timestamp", "aggregations": [], "groupBy": [ "data.win.eventdata.targetUserName" ], "bucketValue": 1, "bucketUnitOfTime": "m", "filters": [ { "fieldName": [ { "label": "rule.id", "type": "keyword" } ], "fieldValue": "100014", "operator": "is" }, { "fieldName": [ { "label": "data.win.eventdata.workstationName", "type": "keyword" } ], "fieldValue": "", "operator": "is_not_null" } ] } } }
On Saturday, January 31, 2026 at 08:37:04 UTC+3, hasitha.u...@wazuh.com wrote:

hasitha.u...@wazuh.com

Feb 3, 2026, 5:07:18 AM
to Wazuh | Mailing List
Hi никита,

Since the per_alert execution handles one alert at a time, you can access values directly using {{ctx.newAlerts.0.bucket_keys}} instead of using a loop. This keeps the rendering clean, even when testing with multiple alerts.

For the Send_BD action, update message_template.source accordingly. Also note the src.ip field: right now it is set to bucket_keys (the username) rather than the actual IP; see below for how to get the actual IPs via sample_documents.

{
  "monitor_id": "{{ctx.trigger.id}}",
  "monitor_name": "{{ctx.trigger.name}}",
  "trigger_severity": {{ctx.trigger.severity}},
  "bucket_keys": "{{ctx.newAlerts.0.bucket_keys}}",
  "wazuh_url": "{{#ctx.newAlerts}}https://wazuh.ovp.ru/app/data-explorer/discover#?_a=(discover:(columns:!(agent.name,data.win.eventdata.targetUserName,data.win.eventdata.ipAddress,rule.id,rule.description),isDirty:!t,sort:!()),metadata:(indexPattern:'wazuh-alerts-*',view:discover))&_q=(filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:rule.id,negate:!f,params:(query:'100014'),type:phrase),query:(match_phrase:(rule.id:'100014'))),('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:data.win.eventdata.targetUserName,negate:!f,params:(query:{{bucket_keys}}),type:phrase),query:(match_phrase:(data.win.eventdata.targetUserName:{{bucket_keys}})))),query:(language:kuery,query:''))&_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-1y,to:now)){{/ctx.newAlerts}}",
  "raw": {
    "src.ip": "{{ctx.newAlerts.0.bucket_keys}}",
    "periodStart": "{{ctx.periodStart}}",
    "periodEnd": "{{ctx.periodEnd}}"
  }
}

The Yandex action template is already fine as it's text-based and handles loops gracefully.

Incorporate sample_documents to Get User, IP, and Other Details

sample_documents are automatically available in ctx.newAlerts for bucket-level monitors (only for new alerts). They provide example documents from the bucket without needing extra configuration.
Update your "Send_BD" template to use them (similar to your old mustache template). This fetches fields like IP, agent name, etc., from 1-5 sample docs per alert. Since per_alert filters to one alert, access them via ctx.newAlerts.0.sample_documents.

Revised "raw" section example:

"raw": {
  "src.ip": "{{#ctx.newAlerts.0.sample_documents}}{{_source.data.win.eventdata.ipAddress}}, {{/ctx.newAlerts.0.sample_documents}}",
  "agent.ip": "{{#ctx.newAlerts.0.sample_documents}}{{_source.agent.ip}}, {{/ctx.newAlerts.0.sample_documents}}",
  "agent.name": "{{#ctx.newAlerts.0.sample_documents}}{{_source.agent.name}}, {{/ctx.newAlerts.0.sample_documents}}",
  "src.username": "{{#ctx.newAlerts.0.sample_documents}}{{_source.data.win.eventdata.subjectUserName}}, {{/ctx.newAlerts.0.sample_documents}}",
  "dst.username": "{{#ctx.newAlerts.0.sample_documents}}{{_source.data.win.eventdata.targetUserName}}, {{/ctx.newAlerts.0.sample_documents}}",
  "rule": "{{#ctx.newAlerts.0.sample_documents}}{{_source.rule.description}}, {{/ctx.newAlerts.0.sample_documents}}",
  "periodStart": "{{ctx.periodStart}}",
  "periodEnd": "{{ctx.periodEnd}}"
}
This will list values from the samples (comma-separated). If you need unique values or all IPs (not just samples), add a sub-aggregation (see below).

Test this gradually: start with a smaller time range (e.g., change to "{{period_end}}||-120s") to reduce hits and confirm it works before scaling back up.

Optimize the Monitor for Large Event Volumes

Reduce Aggregation Size: Your terms aggregation has "size": 1000, which processes up to 1000 users. If not all are needed, lower it to 50-100 to limit the buckets and alerts generated. This reduces the load from sample_documents fetching and alert creation.

"users": {
  "terms": {
    "field": "data.win.eventdata.targetUserName",
    "size": 100,  // reduced from 1000
    "min_doc_count": 5,
    ...
  }
}

Adjust Time Range and Schedule: Your query looks back 400s (~6.7 min) but runs every 2 min, causing overlap and potentially redundant processing. Align it to the interval: "{{period_end}}||-2m". If hits are still high, increase the interval to 5 min.

Add a Terms Sub-Aggregation for Full IPs (Optional, if Samples Aren't Enough): To get all unique IPs per user (up to a limit), add this inside the "users" aggregations (beside "unique_ips"):

"ips": {
  "terms": {
    "field": "data.win.eventdata.ipAddress",
    "size": 20  // max expected unique IPs per user; adjust as needed
  }
}

However, accessing this in templates is tricky (it requires matching buckets in ctx.results[0].aggregations via Mustache logic), so stick with sample_documents unless necessary.

Let me know the update after making the above changes.

Ref:
https://docs.opensearch.org/latest/observing-your-data/alerting/monitors/
https://opensearch.org/docs/latest/observing-your-data/alerting/per-query-bucket-monitors/
https://docs.opensearch.org/latest/observing-your-data/alerting/triggers/
https://docs.opensearch.org/latest/observing-your-data/alerting/actions/
https://forum.opensearch.org/t/problem-accessing-to-ctx-newalerts-0-sample-documents/20546
https://forum.opensearch.org/t/alerting-sample-documents-for-ctx-dedupedalerts-and-ctx-completedalerts/18695

никита какдела

Feb 3, 2026, 8:13:00 AM
to Wazuh | Mailing List
Good afternoon! I've made the adjustments using the ctx.newAlerts.0 approach.
I'm aggregating by unique IP addresses to use in a trigger. Basically, I'm looking for users who have successfully connected from 3 or more unique IP addresses.

Can I use the unique IPs found by the aggregation in the mustache template (per-bucket monitor)?

Should I use size + min_doc_count in every one of my per-bucket monitors?

"size": 100,  // reduced from 1000
"min_doc_count": 5,

As far as I understand, the ideal formula is schedule interval = query range, correct?
I can also share one more problematic monitor so you can check it for potential errors. My "Timed Brute Force" monitor looks for more than 1,000 unsuccessful login attempts for a specific user within 60 minutes.
I don't understand how this would work if the monitor runs every 60 minutes or every 30 minutes. It feels like I'm losing timeliness, because if a brute-force attack starts, I'll only find out about it after 30 or 60 minutes, which won't be relevant anymore, right?

        { "name": "MS Windows: Распределенный по времени брутфорс", "type": "monitor", "monitor_type": "bucket_level_monitor", "enabled": true, "schedule": { "period": { "unit": "MINUTES", "interval": 29 } }, "inputs": [ { "search": { "indices": [ "wazuh-alerts-current" ], "query": { "size": 0, "query": { "bool": { "filter": [ { "range": { "@timestamp": { "from": "{{period_end}}||-60m", "to": "{{period_end}}", "include_lower": true, "include_upper": true, "format": "epoch_millis", "boost": 1 } } }, { "terms": { "rule.id": [ "100015", "100016", "100017", "100018", "100019", "100020", "100053", "100054" ], "boost": 1 } }, { "exists": { "field": "data.win.eventdata.ipAddress", "boost": 1 } } ], "must_not": [ { "regexp": { "data.win.eventdata.targetUserName": { "value": ".*\\$", "flags_value": 255, "max_determinized_states": 10000, "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } }, "aggregations": { "composite_agg": { "composite": { "size": 10, "sources": [ { "data.win.eventdata.targetUserName": { "terms": { "field": "data.win.eventdata.targetUserName", "missing_bucket": false, "order": "asc" } } } ] } } } } } } ], "triggers": [ { "bucket_level_trigger": { "id": "QlrfwJsBZYoQ5E0jkO3T", "name": "MS Windows: Распределенный по времени брутфорс", "severity": "2", "condition": { "buckets_path": { "_count": "_count" }, "parent_bucket_path": "composite_agg", "script": { "source": "params._count > 999", "lang": "painless" }, "gap_policy": "skip" }, "actions": [ { "id": "notification281292", "name": "Alert to Yandex", "destination_id": "X4L_5pkBS6jN-8SDuQFi", "message_template": { "source": "{\n \"chat_id\": \"1/0/191a25c4-b3f1-4e10-a6b1-a412c17b48e5\",\n \"text\": \"WAZUH\\n\\n- 🚨 Событие: {{ctx.monitor.name}}\\n- 🚨 Приоритет: {{ctx.trigger.severity}}\\n- ⏳ Время начала: {{ctx.periodStart}} UTC\\n- ⌛ Время окончания: {{ctx.periodEnd}} UTC {{#ctx.newAlerts}}\\n---\\n- 🚨 Инициатор: {{bucket_keys}}\\n [Открыть в Wazuh](https://wazuh.ovp.ru/app/data-explorer/discover#?_a=(discover:(columns:!(agent.name,data.win.eventdata.targetUserName,data.win.eventdata.ipAddress),isDirty:!f,sort:!()),metadata:(indexPattern:'wazuh-alerts-*',view:discover))&_q=(filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:rule.id,negate:!f,params:!('100015','100016','100017','100018','100019','100020','100053','100054'),type:phrases,value:'100015,%20100016,%20100017,%20100018,%20100019,%20100020,%20100053,%20100054'),query:(bool:(minimum_should_match:1,should:!((match_phrase:(rule.id:'100015')),(match_phrase:(rule.id:'100016')),(match_phrase:(rule.id:'100017')),(match_phrase:(rule.id:'100018')),(match_phrase:(rule.id:'100019')),(match_phrase:(rule.id:'100020')),(match_phrase:(rule.id:'100053')),(match_phrase:(rule.id:'100054')))))),('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:data.win.eventdata.targetUserName,negate:!f,params:(query:{{bucket_keys}}),type:phrase),query:(match_phrase:(data.win.eventdata.targetUserName:{{bucket_keys}})))),query:(language:kuery,query:''))&_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-1h,to:now)))\\n---{{/ctx.newAlerts}}\"\n}", "lang": "mustache" }, "throttle_enabled": false, "subject_template": { "source": "Alerting Notification action", "lang": "mustache" }, "action_execution_policy": { "action_execution_scope": { "per_alert": { "actionable_alerts": [ "NEW" ] } } } }, { "id": "notification302457", "name": "Send_BD", "destination_id": "yA92t5sBd-k2aqqxbrrG", "message_template": { "source": "{\n 
\"monitor_id\": \"{{ctx.trigger.id}}\",\n \"monitor_name\": \"{{ctx.trigger.name}}\",\n \"trigger_severity\": {{ctx.trigger.severity}},\n \"bucket_keys\": \"{{#ctx.newAlerts}}{{bucket_keys}}{{/ctx.newAlerts}}\",\n \"wazuh_url\":\"{{#ctx.newAlerts}}https://wazuh.ovp.ru/app/data-explorer/discover#?_a=(discover:(columns:!(agent.name,data.win.eventdata.targetUserName,data.win.eventdata.ipAddress),isDirty:!f,sort:!()),metadata:(indexPattern:'wazuh-alerts-*',view:discover))&_q=(filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:rule.id,negate:!f,params:!('100015','100016','100017','100018','100019','100020','100053','100054'),type:phrases,value:'100015,%20100016,%20100017,%20100018,%20100019,%20100020,%20100053,%20100054'),query:(bool:(minimum_should_match:1,should:!((match_phrase:(rule.id:'100015')),(match_phrase:(rule.id:'100016')),(match_phrase:(rule.id:'100017')),(match_phrase:(rule.id:'100018')),(match_phrase:(rule.id:'100019')),(match_phrase:(rule.id:'100020')),(match_phrase:(rule.id:'100053')),(match_phrase:(rule.id:'100054')))))),('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'wazuh-alerts-*',key:data.win.eventdata.targetUserName,negate:!f,params:(query:{{bucket_keys}}),type:phrase),query:(match_phrase:(data.win.eventdata.targetUserName:{{bucket_keys}})))),query:(language:kuery,query:''))&_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'{{ctx.periodStart}}',to:now)){{/ctx.newAlerts}}\",\n \"raw\": {\n \"src.ip\":\"{{#ctx.newAlerts.0}}{{#sample_documents}}{{_source.data.win.eventdata.ipAddress}} {{/sample_documents}}{{/ctx.newAlerts.0}}\",\n \"agent.ip\":\"{{#ctx.newAlerts.0}}{{#sample_documents}}{{_source.agent.ip}} {{/sample_documents}}{{/ctx.newAlerts.0}}\",\n \"agent.name\":\"{{#ctx.newAlerts.0}}{{#sample_documents}}{{_source.agent.name}} {{/sample_documents}}{{/ctx.newAlerts.0}}\",\n \"src.username\":\"{{#ctx.newAlerts.0}}{{#sample_documents}}{{_source.data.win.eventdata.subjectUserName}} {{/sample_documents}}{{/ctx.newAlerts.0}}\",\n \"dst.username\":\"{{#ctx.newAlerts.0}}{{#sample_documents}}{{_source.data.win.eventdata.targetUserName}} {{/sample_documents}}{{/ctx.newAlerts.0}}\",\n \"rule\":\"{{#ctx.newAlerts.0}}{{#sample_documents}}{{_source.rule.description}} {{/sample_documents}}{{/ctx.newAlerts.0}}\",\n \"periodStart\": \"{{ctx.periodStart}}\",\n \"periodEnd\": \"{{ctx.periodEnd}}\"\n }\n}", "lang": "mustache" }, "throttle_enabled": false, "subject_template": { "source": "Alerting Notification action", "lang": "mustache" }, "action_execution_policy": { "action_execution_scope": { "per_alert": { "actionable_alerts": [ "NEW" ] } } } } ] } } ], "ui_metadata": { "schedule": { "timezone": null, "frequency": "interval", "period": { "unit": "MINUTES", "interval": 29 }, "daily": 0, "weekly": { "tue": false, "wed": false, "thur": false, "sat": false, "fri": false, "mon": false, "sun": false }, "monthly": { "type": "day", "day": 1 }, "cronExpression": "0 */1 * * *" }, "monitor_type": "bucket_level_monitor", "search": { "searchType": "query", "timeField": "@timestamp", "aggregations": [], "groupBy": [ "data.win.eventdata.targetUserName" ], "bucketValue": 1, "bucketUnitOfTime": "m", "filters": [ { "fieldName": [ { "label": "rule.id", "type": "keyword" } ], "fieldValue": "100020", "operator": "is" } ] } } }

On Tuesday, February 3, 2026 at 13:07:18 UTC+3, hasitha.u...@wazuh.com wrote:

hasitha.u...@wazuh.com

Feb 8, 2026, 11:29:02 PM
to Wazuh | Mailing List

Hi никита,

1. How to get the actual IPs into your message

Right now you only count how many IPs a user used (cardinality agg), but you don't keep the actual list. To get the real IPs into your message:

• Add this inside your "users" aggregation:

"ip_list": {
  "terms": {
    "field": "data.win.eventdata.ipAddress",
    "size": 10
  }
}

• Then, in your Mustache template, you can show them (example for Send_BD or Yandex):

IPs: {{#ctx.newAlerts.0.sample_documents}}{{_source.data.win.eventdata.ipAddress}}, {{/ctx.newAlerts.0.sample_documents}}

(or use the full ip_list buckets if you really need every IP, but samples are usually enough and much lighter)


2. Why you should always use "size" and "min_doc_count"

When you group by username (or anything else), especially with thousands of events, always add:

• "size": 50 or 100 → only look at the top 50-100 groups
• "min_doc_count": 5 → ignore groups with very few events

This stops your monitor from slowing down or crashing when there are 4000+ hits.


3. What's the best schedule + time window?

A simple rule that works well:

• Look back a bit longer than how often you check
• Example: look back 10 minutes, run every 3-5 minutes → small overlap, catches almost everything

If you make them exactly equal (5 min look-back + every 5 min), events right on the edge can sometimes get missed.


4. Your "Distributed Brute Force in Time" monitor: quick fixes

Current setup: it looks back 60 min but only checks every 29 min → you might see the attack 20-30 min late. Not great for brute force.

Easy changes:

• Change the interval to 5 minutes (or 10 min at most)
• Keep the 60-minute look-back → it then checks the "last hour" every 5 min → much faster alerts
• Skip quiet users: composite aggregation sources don't support "min_doc_count", so either rely on your trigger condition (params._count > 999) or switch composite_agg to a plain terms aggregation with "min_doc_count": 500, as sketched below


Let me know if you need further assistance on this.

никита какдела

Feb 11, 2026, 6:57:48 AM
to Wazuh | Mailing List

In this case, how can I use all the necessary data correctly?
For example: ipAddress, agent.id, agent.name, targetUserName/subjectUserName, rule.id.
What would my aggregations and mustache template look like? Could you write it out in full, please?


On Monday, February 9, 2026 at 07:29:02 UTC+3, hasitha.u...@wazuh.com wrote:

никита какдела

Feb 11, 2026, 10:24:19 AM
to Wazuh | Mailing List
Let me restate the question. In a per-bucket monitor, when a bucket is triggered (say it contains 5 documents), how do I get the data from all 5 of those documents in Mustache? For example, 5 ipAddress values, 5 agent.name values, 5 agent.ip values, and so on. In other words, I want to output the data of every document that triggered the bucket. Is this possible, and what is the correct way to do it?

On Wednesday, February 11, 2026 at 14:57:48 UTC+3, никита какдела wrote: