Hi there,
I'm getting an intermittent issue in my cluster that's crashing nodes.
A node crashed this weekend while it had wsrep_debug = 1 set, so I have slightly more detailed logs of what occurred, but I'm still not sure what the issue is:
150607 12:06:33 [Note] WSREP: Forcing release of transactional locks for thd 83
150607 12:06:33 [Note] WSREP: Forcing release of transactional locks for thd 83
150607 12:06:33 [Warning] WSREP: SQL statement was ineffective, THD: 83, buf: 187
QUERY: commit
=> Skipping replication
150607 12:06:33 [Note] WSREP: commit failed for reason: 3
150607 12:06:33 [Note] WSREP: conflict state: 0
150607 12:06:33 [Note] WSREP: cluster conflict due to certification failure for threads:
150607 12:06:33 [Note] WSREP: Victim thread:
THD: 83, mode: local, state: executing, conflict: cert failure, seqno: -1
SQL: commit
150607 12:06:33 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK
150607 12:06:33 [ERROR] mysqld got signal 6 ;
So far everything I've read says that this issue is caused by an incorrectly set binlog_format, but I'm using ROW on all 5 nodes:
MariaDB [(none)]> show variables like 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.00 sec)
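The output above is the session value, since SHOW VARIABLES defaults to session scope when no scope is given. A minimal sketch of a check that also covers the global scope, assuming it is run on each of the 5 nodes in turn (the column aliases are only illustrative):

```sql
-- Compare the server-wide and per-connection settings on this node.
SELECT @@global.binlog_format  AS global_format,
       @@session.binlog_format AS session_format;

-- Same check via SHOW, with the scope made explicit.
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
```

If both report ROW on every node, the server-wide setting can probably be ruled out, although a client could still change the format for its own session.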
I've attached a copy of the my.cnf from the node that crashed on the weekend.
All nodes have almost identical my.cnf files (barring the obvious name changes).