$ curl -k -u user:pass https://127.0.0.1:9200/_cluster/allocation/explain
{"index":".opendistro-ism-managed-index-history-2024.05.31-000554","shard":0,"primary":false,"current_state":"unassigned","unassigned_info":{"reason":"CLUSTER_RECOVERED","at":"2024-06-13T20:28:55.996Z","last_allocation_status":"no_attempt"},"can_allocate":"no","allocate_explanation":"cannot allocate because allocation is not permitted to any of the nodes","node_allocation_decisions":[{"node_id":"QMuy1mhfTHy4GWXntDrbyA","node_name":"node-1","transport_address":"10.148.120.132:9300","node_attributes":{"shard_indexing_pressure_enabled":"true"},"node_decision":"no","deciders":[{"decider":"same_shard","decision":"NO","explanation":"a copy of this shard is already allocated to this node [[.opendistro-ism-managed-index-history-2024.05.31-000554][0], node[QMuy1mhfTHy4GWXntDrbyA], [P], s[STARTED], a[id=NiwBAkSMQr2LLMcotioQOg]]"}]}]}
Any thoughts appreciated.
Hi, I have investigated your case. The message "a copy of this shard is already allocated to this node" suggests that the cluster size has been reduced. It indicates that a copy of the shard (here the primary) is already on that node, and a replica cannot be allocated to the same node as another copy of its shard, so the number of replicas must be lower than the number of nodes in the cluster.
How many nodes do you have? Is it the same number you had before upgrading to 4.8.0?
If this is the problem, the solution is to adjust the number of replicas. More information is available here: https://documentation.wazuh.com/current/user-manual/wazuh-indexer/wazuh-indexer-tuning.html#shards-and-replicas
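As a rough sketch of that adjustment for an index that already exists: the helpers below count the nodes, work out the highest replica count that can actually be assigned, and apply it through the index settings API. The function names and the `ES_URL` / `ES_AUTH` variables are made up for illustration, and the address and credentials are the placeholders used elsewhere in this thread, so adapt them to your setup.

```shell
#!/usr/bin/env bash
# Sketch only: ES_URL and ES_AUTH are placeholder names, not Wazuh settings.
ES_URL="${ES_URL:-https://127.0.0.1:9200}"
ES_AUTH="${ES_AUTH:-user:pass}"

# Highest replica count that can be fully assigned: every copy of a shard
# (primary plus replicas) needs its own node, so the limit is nodes - 1.
max_replicas() {
  local nodes="$1"
  if [ "$nodes" -le 1 ]; then echo 0; else echo $((nodes - 1)); fi
}

# Count nodes from `_cat/nodes?h=name` output on stdin (one node per line).
count_nodes() {
  grep -c .
}

# Set number_of_replicas on one existing index via the index settings API.
set_replicas() {
  local index="$1" replicas="$2"
  curl -k -s -u "$ES_AUTH" -X PUT -H "Content-Type: application/json" \
    -d "{\"index\":{\"number_of_replicas\":${replicas}}}" \
    "$ES_URL/${index}/_settings"
}

# Example (not run here):
#   nodes=$(curl -k -s -u "$ES_AUTH" "$ES_URL/_cat/nodes?h=name" | count_nodes)
#   set_replicas ".opendistro-ism-managed-index-history-2024.05.31-000554" "$(max_replicas "$nodes")"
```

On a single-node cluster this comes out as 0 replicas, which matches the symptom in the allocation explanation above.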
curl -k -u user:pass -X PUT -H "Content-Type: application/json" -d \
'{"persistent":{"opendistro":{"index_state_management":{"history":{"number_of_replicas":"0"}}}}}' https://127.0.0.1:9200/_cluster/settings
Followed by this to remove the duplicate shards (note that it deletes every index that has an unassigned shard):
curl -k -s -u user:pass https://127.0.0.1:9200/_cat/shards | grep UNASSIGNED | \
awk '{print $1}' | sort -u | xargs -I{} curl -k -XDELETE -u user:pass "https://127.0.0.1:9200/{}"
curl -k -u user:pass -XDELETE "https://127.0.0.1:9200/.kibana_*"
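Since the delete pipeline above removes whole indices, it may be worth previewing what it would target first. This is a minimal sketch; `unassigned_indices` is a made-up helper name, and it assumes the `_cat/shards` columns are requested in the order index, shard, prirep, state.

```shell
#!/usr/bin/env bash
# Read `_cat/shards` output on stdin and print, once each, every index that
# has at least one UNASSIGNED shard. Column 1 is the index name, column 4 the state.
unassigned_indices() {
  awk '$4 == "UNASSIGNED" {print $1}' | sort -u
}

# Example (not run here), listing the indices without deleting anything:
#   curl -k -s -u user:pass \
#     "https://127.0.0.1:9200/_cat/shards?h=index,shard,prirep,state" | unassigned_indices
```

If the list contains anything you want to keep, lowering the replica count on that index is a less destructive option than deleting it.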
I cannot be sure, but I suspect the real problem was the Wazuh-indexer error "index template ss4o_metrics_template has index patterns ss4o_metrics", and that everything else was caused by trying to migrate and restart while that issue was making it impossible to run cleanly.