We have a 3-node MariaDB cluster with orchestrator running on all 3 nodes in a raft configuration. Recently one node in the cluster was in an Amazon availability zone that had an outage. The node in that AZ was both the MySQL master and the orchestrator leader. Orchestrator leadership moved to a different node as expected, but orchestrator never performed a MySQL failover. The following error message repeats a few times in the log and seems to be the culprit, but we don't see any evidence, either in the logs or on the database, that a failover was attempted before it appeared. Can anyone provide guidance on how we ended up in this state, and how to troubleshoot it?
ERROR AttemptRecoveryRegistration: cluster NODE1 has recently experienced a failover (of NODE1) and is in active period. It will not be failed over again. You may acknowledge the failure on this cluster (-c ack-cluster-recoveries) or on NODE1 (-c ack-instance-recoveries) to remove this blockage
Thanks,