Cluster does not start anymore

Fabrice Le Dorze

May 21, 2024, 4:46:29 AM
to Wazuh | Mailing List
Hi all,
We have a Wazuh infrastructure with one dashboard and a cluster of 2 indexer/manager nodes (one on-prem, one in the cloud).
By mistake, we broke the nodes by installing wazuh-agent on them.
We were able to restore the OnPrem node,
but we had to reinstall the manager and indexer on the Cloud node with the install script.
Since then, the cluster no longer starts, and the dashboard reports "Wazuh dashboard server is not ready yet".

On the OnPrem node:
wazuh-indexer  4.7.3-1 
wazuh-manager  4.7.3-1

/var/ossec/bin/cluster_control -l
NAME              TYPE    VERSION  ADDRESS
wazuh-node-1-onprem  master  4.7.3    10.15.100.131

In /var/log/wazuh-indexer/wazuh-indexer-cluster.log,
 [wazuh-node-1-vic] cluster-manager not discovered or elected yet, an election requires a node with id [q1epkbDbRFaBx2RbwSm4Rw], have discovered [{wazuh-node-1-vic}{-PnONKBfSWamy3zqDkpedw}{XYjqeHKST5qd5z5jnQEGYg}{10.15.100.131}{10.15.100.131:9300}{dimr}{shard_indexing_pressure_enabled=true}, {wazuh-node-1-az}{bU5Q1VnXQ9-xBLAUFWlSvA}{XUBfnlAdR5CS7DJ2wmDCNA}{10.205.5.131}{10.205.5.131:9300}{dimr}{shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using [10.205.5.131:9300] from hosts providers and [{wazuh-node-1-vic}{-PnONKBfSWamy3zqDkpedw}{XYjqeHKST5qd5z5jnQEGYg}{10.15.100.131}{10.15.100.131:9300}{dimr}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 17, last-accepted version 114202 in term 17

On the Cloud node:
wazuh-indexer  4.7.2-1 
wazuh-manager  4.7.2-1

/var/ossec/bin/cluster_control -l
ERROR: Error 3012 - Cluster is not running

In /var/log/wazuh-indexer/wazuh-indexer-cluster.log,
[wazuh-node-1-az] cluster-manager not discovered or elected yet, an election requires two nodes with ids [-PnONKBfSWamy3zqDkpedw, bU5Q1VnXQ9-xBLAUFWlSvA], have discovered [{wazuh-node-1-az}{bU5Q1VnXQ9-xBLAUFWlSvA}{XUBfnlAdR5CS7DJ2wmDCNA}{10.205.5.131}{10.205.5.131:9300}{dimr}{shard_indexing_pressure_enabled=true}, {wazuh-node-1-vic}{-PnONKBfSWamy3zqDkpedw}{XYjqeHKST5qd5z5jnQEGYg}{10.15.100.131}{10.15.100.131:9300}{dimr}{shard_indexing_pressure_enabled=true}] which is a quorum; discovery will continue using [10.15.100.131:9300] from hosts providers and [{wazuh-node-1-az}{bU5Q1VnXQ9-xBLAUFWlSvA}{XUBfnlAdR5CS7DJ2wmDCNA}{10.205.5.131}{10.205.5.131:9300}{dimr}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0

As I understand it, the OnPrem side requires node ID q1epkbDbRFaBx2RbwSm4Rw for the election, but that ID does not match either of the current node IDs (OnPrem: -PnONKBfSWamy3zqDkpedw, Cloud: bU5Q1VnXQ9-xBLAUFWlSvA).
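
To make the ID comparison concrete, here is a minimal Python sketch that extracts the node IDs an election requires from a log message and subtracts the IDs that were actually discovered. It assumes the message format shown in the log excerpts above; the function name missing_voters is my own and not part of any Wazuh or OpenSearch tooling.

```python
import re


def missing_voters(log_line):
    """Return the node IDs the election requires that were not discovered."""
    # "an election requires a node with id [x]" or "... two nodes with ids [x, y]"
    required_match = re.search(r"an election requires [^\[]*\[([^\]]*)\]", log_line)
    # The discovered nodes are listed between "have discovered [" and "] which is"
    discovered_match = re.search(r"have discovered \[(.*?)\] which is", log_line)
    if not required_match or not discovered_match:
        raise ValueError("not a cluster-manager discovery message")
    required = {part.strip() for part in required_match.group(1).split(",")}
    # Each node is printed as {name}{node-id}{...}; capture the second field
    discovered = re.findall(r"(?:^|, )\{[^{}]*\}\{([^{}]*)\}", discovered_match.group(1))
    return required - set(discovered)


# Relevant part of the OnPrem log message quoted above
ONPREM_LINE = (
    "[wazuh-node-1-vic] cluster-manager not discovered or elected yet, "
    "an election requires a node with id [q1epkbDbRFaBx2RbwSm4Rw], "
    "have discovered ["
    "{wazuh-node-1-vic}{-PnONKBfSWamy3zqDkpedw}{XYjqeHKST5qd5z5jnQEGYg}"
    "{10.15.100.131}{10.15.100.131:9300}{dimr}{shard_indexing_pressure_enabled=true}, "
    "{wazuh-node-1-az}{bU5Q1VnXQ9-xBLAUFWlSvA}{XUBfnlAdR5CS7DJ2wmDCNA}"
    "{10.205.5.131}{10.205.5.131:9300}{dimr}{shard_indexing_pressure_enabled=true}"
    "] which is not a quorum"
)

print(sorted(missing_voters(ONPREM_LINE)))  # prints ['q1epkbDbRFaBx2RbwSm4Rw']
```

For the OnPrem message this reports q1epkbDbRFaBx2RbwSm4Rw as required but not discovered, while the same check on the Cloud message returns an empty set because both required IDs appear in its discovered list.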

Is this the reason why the cluster no longer starts?
How can we fix it?

Or is the root cause something else?
Thanks

