Wazuh 4.7.5 CPU high usage issue

Emil David

Jul 3, 2024, 4:29:27 AM
to Wazuh | Mailing List
Hi Team,
We have installed Wazuh 4.7.5, added one firewall syslog source and 59 agents, and are ingesting around 400 EPS.
Server spec: 12 cores and 16 GB RAM.
Note: distributed installation; the Wazuh manager and Filebeat run on one server, and the Wazuh indexer and dashboard on another.

We are facing an issue where all 12 CPUs are running at 99-100%. Please let us know why the CPU load is this high for only 400 EPS, and how to fix it, since we also want to add more firewalls and agents.

Thanks,

Stuti Gupta

Jul 3, 2024, 6:50:29 AM
to Wazuh | Mailing List
Hi Emil David,

It's normal to see high CPU usage from wazuh-modulesd when starting the manager, especially if you have Syscollector enabled, as it processes all received packages. The same applies when many Wazuh agents report Syscollector packages to the manager. Depending on its design, each module may create secondary threads. Vulnerability Detector and Syscollector, however, are both single-threaded modules, so when both are busy, wazuh-modulesd can peak at around 200% CPU usage. Vulnerability Detector performs database fetching, synchronization, and matching against the software inventory of each agent, which can consume 100% of one CPU for an extended period because it runs in a single thread.
The issue appears to be that the server is ingesting more events per second (EPS) than it can handle, so our suggestions focus on scaling your architecture. As a rough sizing reference, a Wazuh manager node with 8 CPUs and 4 GB of RAM can handle around 5000 EPS; with your current 12 cores and 16 GB RAM, you will need to increase resources to keep headroom as you grow.
https://documentation.wazuh.com/current/installation-guide/wazuh-dashboard/index.html#hardware-requirements
Wazuh managers scale better horizontally than vertically, meaning it is more effective to have 2 Wazuh manager nodes in a cluster with half the resources of a single node. Additionally, if you are heavily using the Wazuh indexer on the same node as the Wazuh manager master node, this can create resource conflicts. In such cases, we recommend using a distributed architecture for your environment.
For adding a server node, see: https://documentation.wazuh.com/current/user-manual/upscaling/adding-server-node.html

Hope this helps.

Best regards,

ismailctest C

Jul 4, 2024, 3:24:09 AM
to Wazuh | Mailing List
Hi,
Could you give me more clarity on this?
Currently, CPU usage is hitting 100% on the Wazuh manager master server.

The server specs and details are given below:
Wazuh manager master - Count 1 (Spec: 12 CPU, 16 GB RAM, OS Ubuntu 22.04)
Wazuh manager worker - Count 1 (Spec: 12 CPU, 16 GB RAM, OS Ubuntu 22.04)
Wazuh indexer node 1 - Count 1 (Spec: 32 CPU, 126 GB RAM, OS Ubuntu 22.04)
Wazuh indexer node 2 - Count 1 (Spec: 12 CPU, 48 GB RAM, OS Ubuntu 22.04)
Wazuh indexer node 3 - Count 1 (Spec: 8 CPU, 32 GB RAM, OS Ubuntu 22.04)

Pointed to the Wazuh master: 53 agents and 1 firewall (syslog).
Pointed to the Wazuh worker: 12 agents.

Issue:
All 12 CPUs on the Wazuh manager master server are at 100%. (Note: all the other servers are running smoothly.)
Current EPS: only about 400.
Please let us know the likely cause and how to rectify the issue.

Stuti Gupta

Jul 5, 2024, 6:47:52 AM
to Wazuh | Mailing List
Hi ismailctest C,

If you look at the wazuh-manager hardware requirements, you can see that more CPU (cores) is required than RAM: https://documentation.wazuh.com/current/installation-guide/wazuh-server/index.html#hardware-requirements. This is because CPU usage can reach 200% for wazuh-modulesd, especially if you have Syscollector enabled, as it processes all received packages. The same applies when Wazuh agents report Syscollector packages to the manager. Depending on its design, each module may create secondary threads.

To resolve this, a re-sizing of your architecture will be necessary. To do so, we need the following information:
You mentioned you have 53 agents; what are their operating systems?
Can you also share the output of the following API command?
GET /cluster/:node_id/daemons/stats
This will show the exact EPS your Wazuh manager is ingesting.
To know more about EPS you can refer to: https://documentation.wazuh.com/current/user-manual/reference/statistics-files/wazuh-remoted-state.html
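For convenience, the API call above can be made from the command line with curl. This is only a sketch: the node name `master-node`, the `wazuh:wazuh` credentials, and `localhost:55000` are placeholders to replace with your own values.

```shell
# Obtain a JWT token from the Wazuh API (default port 55000).
# User, password, and host below are placeholders.
TOKEN=$(curl -sk -u wazuh:wazuh -X POST \
  "https://localhost:55000/security/user/authenticate?raw=true")

# Query the daemon statistics for a given cluster node (node name is a placeholder).
curl -sk -H "Authorization: Bearer $TOKEN" \
  "https://localhost:55000/cluster/master-node/daemons/stats"
```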

Hope to hear from you soon

ismailctest C

Jul 8, 2024, 12:49:41 AM
to Wazuh | Mailing List
Hi,
What could be the reason that all 12 cores are at 100% usage when the EPS is only 400?

Please find the attached text file for 'GET /cluster/:node_id/daemons/stats'.
Agent OS: Windows, Ubuntu, CentOS, and Red Hat.

Please find the wazuh-analysisd and wazuh-remoted state files below.

root@lcccccc:~# cat /var/ossec/var/run/wazuh-analysisd.state
# State file for wazuh-analysisd
# THIS FILE WILL BE DEPRECATED IN FUTURE VERSIONS

# Total events decoded
total_events_decoded='43368728'

# Syscheck events decoded
syscheck_events_decoded='795'

# Syscollector events decoded
syscollector_events_decoded='167768'

# Rootcheck events decoded
rootcheck_events_decoded='12235'

# Security configuration assessment events decoded
sca_events_decoded='205'

# Winevt events decoded
winevt_events_decoded='2191607'

# Database synchronization messages dispatched
dbsync_messages_dispatched='33176'

# Other events decoded
other_events_decoded='40962942'

# Events processed (Rule matching)
events_processed='43319480'

# Events received
events_received='116476047'

# Events dropped
events_dropped='73090214'

# Alerts written to disk
alerts_written='30797081'

# Firewall alerts written to disk
firewall_written='0'

# FTS alerts written to disk
fts_written='0'

# Syscheck queue
syscheck_queue_usage='0.00'

# Syscheck queue size
syscheck_queue_size='16384'

# Syscollector queue
syscollector_queue_usage='0.00'

# Syscollector queue size
syscollector_queue_size='16384'

# Rootcheck queue
rootcheck_queue_usage='0.00'

# Rootcheck queue size
rootcheck_queue_size='16384'

# Security configuration assessment queue
sca_queue_usage='0.00'

# Security configuration assessment queue size
sca_queue_size='16384'

# Hostinfo queue
hostinfo_queue_usage='0.00'

# Hostinfo queue size
hostinfo_queue_size='16384'

# Winevt queue
winevt_queue_usage='0.00'

# Winevt queue size
winevt_queue_size='16384'

# Database synchronization message queue
dbsync_queue_usage='0.00'

# Database synchronization message queue size
dbsync_queue_size='16384'

# Upgrade module message queue
upgrade_queue_usage='0.00'

# Upgrade module message queue size
upgrade_queue_size='16384'

# Event queue
event_queue_usage='1.00'

# Event queue size
event_queue_size='16384'

# Rule matching queue
rule_matching_queue_usage='1.00'

# Rule matching queue size
rule_matching_queue_size='16384'

# Alerts log queue
alerts_queue_usage='0.00'

# Alerts log queue size
alerts_queue_size='16384'

# Firewall log queue
firewall_queue_usage='0.00'

# Firewall log queue size
firewall_queue_size='16384'

# Statistical log queue
statistical_queue_usage='0.00'

# Statistical log queue size
statistical_queue_size='16384'

# Archives log queue
archives_queue_usage='0.00'

# Archives log queue size
archives_queue_size='16384'

root@lccccc:~# cat /var/ossec/var/run/wazuh-remoted.state
# State file for wazuh-remoted
# THIS FILE WILL BE DEPRECATED IN FUTURE VERSIONS
# Updated every 5 seconds.

# Queue size
queue_size='0'

# Total queue size
total_queue_size='131072'

# TCP sessions
tcp_sessions='51'

# Events sent to Analysisd
evt_count='22525786'

# Control messages received
ctrl_msg_count='423846'

# Discarded messages
discarded_count='0'

# Total number of bytes sent
sent_bytes='37722906'

# Total number of bytes received
recv_bytes='7429948544'

# Messages dequeued after the agent closes the connection
dequeued_after_close='0'
stats.txt

Stuti Gupta

Jul 9, 2024, 6:33:09 AM
to Wazuh | Mailing List
Hi ismailctest C,

You can see that events_dropped is 73090214. These events are dropped because the rate of incoming events exceeds what wazuh-analysisd can process, causing the event queues to fill up (note that event_queue_usage and rule_matching_queue_usage are both at 1.00 in your state file) and resulting in event loss. To calculate the approximate EPS, follow these steps:
Run this command: GET /cluster/:node_id/daemons/stats?daemons_list=wazuh-analysisd
In the output you will see:
{
  "uptime": "2024-07-05T06:09:21+00:00",
  "timestamp": "2024-07-09T10:21:42+00:00",
  "name": "wazuh-analysisd",
  "metrics": {
    "bytes": {
      "received": 19858951
    },
    "events": {
      "processed": 30819,
      "received": 33519,
      ...

EPS = events received / (timestamp - uptime, in seconds)
Refer: https://documentation.wazuh.com/current/cloud-service/your-environment/monitor-environment-usage.html#events-dropped-over-time-and-events-processed-vs-dropped-metrics
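Applied to the sample output above, and to the counters pasted from wazuh-analysisd.state, the arithmetic looks roughly like this. A minimal sketch; the numbers are taken from this thread.

```python
from datetime import datetime

def eps(events_received: int, uptime: str, timestamp: str) -> float:
    """Approximate average EPS from the daemons/stats fields."""
    fmt = "%Y-%m-%dT%H:%M:%S%z"
    elapsed = (datetime.strptime(timestamp, fmt)
               - datetime.strptime(uptime, fmt)).total_seconds()
    return events_received / elapsed

# Fields from the sample wazuh-analysisd stats output above.
rate = eps(33519, "2024-07-05T06:09:21+00:00", "2024-07-09T10:21:42+00:00")
print(f"average EPS since daemon start: {rate:.3f}")

# Drop ratio from the wazuh-analysisd.state counters pasted earlier:
# roughly two out of three received events never reach rule matching.
events_received = 116476047
events_dropped = 73090214
print(f"dropped: {100 * events_dropped / events_received:.1f}% of received events")
```

Note that this gives a long-run average since the daemon started; to estimate the current rate, take two readings a known interval apart and divide the difference in events received by that interval.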

To relieve the master, I suggest pointing the firewall (syslog) at the worker node. Worker nodes can take on more ingestion since they do not handle the API or cluster synchronization duties that the master node does. Additionally, Wazuh managers scale better horizontally than vertically: it is more effective to have two Wazuh manager nodes in a cluster, each with half the resources, than a single larger node.

Additionally, set up retention policies: configure the Wazuh indexer to automatically manage and delete old data using index lifecycle (retention) policies. Retention policies help control disk usage and prevent excessive growth of indices. For detailed instructions, refer to the Wazuh documentation: https://documentation.wazuh.com/current/user-manual/wazuh-indexer/index-life-management.html
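For illustration, a retention policy in the Wazuh indexer (which uses OpenSearch Index State Management) could look roughly like the sketch below. The 30-day age, the policy name, the credentials, and the wazuh-alerts-* pattern are all example values; follow the linked documentation for the exact procedure in your version.

```shell
# Sketch: create an ISM policy that deletes wazuh-alerts-* indices older than 30 days.
# Host, credentials, policy name, index pattern, and age are example values.
curl -sk -u admin:admin -X PUT \
  "https://localhost:9200/_plugins/_ism/policies/wazuh_alerts_retention" \
  -H "Content-Type: application/json" -d '
{
  "policy": {
    "description": "Delete Wazuh alert indices after 30 days",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "30d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ],
        "transitions": []
      }
    ],
    "ism_template": [
      { "index_patterns": ["wazuh-alerts-*"], "priority": 100 }
    ]
  }
}'
```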
Analyzing the logs also helps you detect anomalies and patterns; use that information to fine-tune Wazuh rules and filters so they focus on the most relevant events and reduce false positives. See https://wazuh.com/blog/creating-decoders-and-rules-from-scratch/

Looking forward to your response.