Wazuh worker is unable to sync client keys and stops new agent from connecting

Daniel D'Angeli

Mar 26, 2024, 2:49:12 PM
to Wazuh | Mailing List
Hi,

We have a multi-node Docker Compose deployment of Wazuh 4.7.3, built from the provided Docker Compose files. We tried to add a 10th agent and noticed that it stayed in the "never connected" state for many hours.

The only solution was to stop the Wazuh worker container. We also noticed that the worker's client.keys was not synchronized with the master's client.keys, which may be causing the issue.

Any help?
Regards,
Daniel D.

Matias Pereyra

Mar 26, 2024, 11:33:58 PM
to Wazuh | Mailing List
Hello!

If the worker node doesn't have the agent key, it will reject the connection, as you suggest. But this situation isn't expected, and having only a few agents shouldn't overload the cluster.

Any synchronization issue among the cluster nodes should be reflected in the logs with an error or warning message.
Could you please upload the ossec.log files from the worker node, master node and agent? Don't forget to remove any sensitive information.
Also, include the cluster.log file from the worker and master.
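As a rough aid for finding those messages, here is a minimal Python sketch that keeps only warning and error lines. It assumes each line looks like `YYYY/MM/DD HH:MM:SS [daemon:] LEVEL: message`; the function name `problem_lines` and the default level list are illustrative choices, not part of Wazuh.

```python
import re

# Timestamp, an optional "daemon: " prefix, then the log level.
# This layout is an assumption about how ossec.log and cluster.log lines look.
LINE_RE = re.compile(
    r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} (?:\S+: )?(?P<level>[A-Z]+):"
)


def problem_lines(lines, levels=("WARNING", "ERROR", "CRITICAL")):
    """Return the log lines whose level is one of `levels`."""
    wanted = set(levels)
    selected = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") in wanted:
            selected.append(line.rstrip("\n"))
    return selected
```

It accepts any iterable of lines, so an open file handle for a copy of ossec.log or cluster.log can be passed directly. Lines that do not match the assumed layout are skipped, so multi-line messages may lose their continuation lines.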

Regards.

Daniel D'Angeli

Apr 10, 2024, 11:41:04 AM
to Wazuh | Mailing List
Hi,

Very sorry for the late response; we had some infrastructure issues to handle and couldn't work much on the test environment.

This information is from the test environment, since in production we applied the workaround of turning off the worker node.

I cannot attach the files because even as tar.gz they are too large to upload here, so I created a shared folder on Google Drive that you can access here: https://drive.google.com/drive/folders/1oA1E_QdI9qiVNqWfHRqhq-dEAxBTd_rP?usp=drive_link

Regards,
Daniel D.

Matias Pereyra

Apr 12, 2024, 8:38:52 PM
to Wazuh | Mailing List
Hi again!

I can't access the shared folder.
But it might be easier to share only the events near the date of the failure; that way the file will be much lighter.
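One possible way to trim a log to that window is sketched below in Python. It assumes every relevant line starts with a `YYYY/MM/DD HH:MM:SS` timestamp; the helper name `lines_near` and the 30-minute default are hypothetical choices.

```python
from datetime import datetime, timedelta

# Timestamp layout assumed for the first 19 characters of each log line.
TS_FORMAT = "%Y/%m/%d %H:%M:%S"


def lines_near(lines, failure_time, window_minutes=30):
    """Keep lines stamped within +/- window_minutes of failure_time."""
    center = datetime.strptime(failure_time, TS_FORMAT)
    margin = timedelta(minutes=window_minutes)
    kept = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:19], TS_FORMAT)
        except ValueError:
            continue  # no parseable timestamp at the start of the line: skip it
        if abs(stamp - center) <= margin:
            kept.append(line.rstrip("\n"))
    return kept
```

For example, `lines_near(open(path), "2024/04/10 17:30:00")` would keep roughly one hour of events around that moment, where `path` points to a local copy of the log.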

Regards.

Daniel D'Angeli

Apr 17, 2024, 10:20:05 AM
to Wazuh | Mailing List
Hi,

These are the most frequent error-related events:
agent:
2024/04/10 17:30:12 wazuh-agentd: INFO: Trying to connect to server ([10.10.144.54]:1514/tcp).
2024/04/10 17:30:12 wazuh-agentd: INFO: Requesting a key from server: 10.10.144.54
2024/04/10 17:30:12 wazuh-agentd: INFO: No authentication password provided
2024/04/10 17:30:12 wazuh-agentd: INFO: Using agent name as: siemcld01.syncsec.coll
2024/04/10 17:30:12 wazuh-agentd: INFO: Waiting for server reply
2024/04/10 17:30:12 wazuh-agentd: ERROR: Duplicate agent name: siemcld01.syncsec.coll (from manager)
2024/04/10 17:30:12 wazuh-agentd: ERROR: Unable to add agent (from manager)
2024/04/10 17:30:22 wazuh-agentd: WARNING: (4101): Waiting for server reply (not started). Tried: '10.10.144.54'. Ensure that the manager version is 'v4.7.2' or higher.
2024/04/10 17:30:22 wazuh-agentd: WARNING: Unable to connect to any server.
2024/04/10 17:30:22 wazuh-agentd: INFO: Closing connection to server ([10.10.144.54]:1514/tcp).

cluster: no errors found
2023/10/12 00:02:09 INFO: [Worker worker01] [Agent-groups send] Starting.
2023/10/12 00:02:09 INFO: [Worker worker01] [Agent-groups send] Finished in 0.028s. Updated 1 chunks.
2023/10/12 00:02:11 INFO: [Worker worker01] [Integrity check] Starting.
2023/10/12 00:02:11 INFO: [Worker worker01] [Integrity check] Finished in 0.022s. Received metadata of 53 files. Sync not required.
2023/10/12 00:02:14 INFO: [Worker worker01] [Agent-info sync] Starting.
2023/10/12 00:02:14 INFO: [Worker worker01] [Agent-info sync] Finished in 0.011s. Updated 1 chunks.
2023/10/12 00:02:14 INFO: [Master] [Local integrity] Starting.
2023/10/12 00:02:14 INFO: [Master] [Local integrity] Finished in 0.018s. Calculated metadata of 53 files.
2023/10/12 00:02:18 INFO: [Master] [Local agent-groups] Starting.
2023/10/12 00:02:19 INFO: [Master] [Local agent-groups] Finished in 0.002s.

manager:
2023/10/12 00:08:54 wazuh-remoted: ERROR: Cannot create multigroup directory 'var/multigroups/4a34e273': Permission denied (13)
2023/10/12 14:46:17 wazuh-remoted: INFO: (1410): Reading authentication keys file.
2023/10/12 12:46:17 wazuh-authd: INFO: New connection from 10.10.144.54
2023/10/12 12:46:17 wazuh-authd: INFO: Received request for a new agent (siemcld01.syncsec.priv) from: 10.10.144.54
2023/10/12 12:46:17 wazuh-authd: WARNING: Duplicate name 'siemcld01.syncsec.priv', rejecting enrollment. Agent '002' has not been disconnected long enough to be replaced.

worker:
2023/10/12 00:01:59 wazuh-remoted: ERROR: Cannot create multigroup directory 'var/multigroups/4a34e273': Permission denied (13)

Regards,
Daniel D.

Matias Pereyra

Apr 18, 2024, 9:01:36 PM
to Wazuh | Mailing List
Hello again!

Thank you for the logs.
This information is partial; the worker logs from the moment the agent fails to connect would be useful.

What I can see is that the agent is unable to connect and, after some retries, it requests a new key, but the request is rejected. This is the auto-enrollment feature, and the behavior is expected because the existing agent entry hasn't been disconnected long enough to be replaced.

But there is an ERROR message related to permissions and this might indicate a problem in the installation. Maybe that explains the problem with the synchronization. Were you able to confirm that the client.keys files aren't synced when this issue happens?
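To check this, the two client.keys files can be compared once their contents are copied out of the master and worker containers. The Python sketch below assumes each line has four whitespace-separated fields (agent ID, name, IP, key); the function names are hypothetical, and the report lists agent IDs only so that no key material is printed.

```python
def parse_client_keys(text):
    """Map agent ID -> (name, ip, key) for every well-formed line."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        agent_id, name, ip, key = fields
        entries[agent_id] = (name, ip, key)
    return entries


def compare_client_keys(master_text, worker_text):
    """Report agent IDs that differ between the master and worker copies."""
    master = parse_client_keys(master_text)
    worker = parse_client_keys(worker_text)
    shared = master.keys() & worker.keys()
    return {
        "missing_on_worker": sorted(master.keys() - worker.keys()),
        "extra_on_worker": sorted(worker.keys() - master.keys()),
        "different": sorted(i for i in shared if master[i] != worker[i]),
    }
```

An empty list under all three headings would suggest the files were in sync at the time they were copied; a non-empty `missing_on_worker` list would match the symptom described at the start of this thread.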

Please provide more details about your deployment. Have you changed anything in the Docker files?

Regards.