WSREP: MDL BF-BF conflict


dcz01

Apr 24, 2024, 6:20:28 AM
to codership
Hi,

Can anyone explain to me why one node keeps disconnecting from the Galera cluster?
I have been searching for a solution to this problem for a long time now but can't find anything...

Here is the problem completely documented: [MDEV-33908] WSREP: MDL BF-BF conflict - Jira (mariadb.org)

Hope someone can help me please.

Greetings
dcz01

john danilson

Apr 24, 2024, 1:29:35 PM
to codership
dcz01

Here's a theory. You have a two-node cluster, which is susceptible to any interruption of communication between the nodes, since that may cause the cluster to lose quorum. I would suggest you add a third node.
If that is not possible, then this article may help: https://galeracluster.com/library/kb/two-node-clusters.html

You might also consider increasing the timeout values for galera to give it more time to recover from intermittent network failures.

dcz01

Apr 26, 2024, 4:30:36 AM
to codership
Hey,

Thanks for your reply.
Well, I have encountered this article/documentation before, but the thing is that not the whole cluster is going down.
It is always just one of the nodes; the other one keeps running stable and the clients can still execute statements on it.
But we always need to restart the failed one so that it does a fresh SST, and then it is back online normally again.
One situation where this happens is while the migration SQL replication from MySQL 5.7 to the new MariaDB Galera 10.5 cluster is still running, or when the nightly maintenance tasks such as OPTIMIZE TABLE are running.

Do you know a solution for that?

Where, or with which variable, can I increase the timeouts for Galera?

john danilson

Apr 26, 2024, 10:24:35 AM
to codership
dz, 

I do not know MariaDB. We do not run plain Galera but the Percona XtraDB Cluster package, which combines MySQL & Galera, so some answers might not be applicable in your environment.

"...not the whole cluster is going down."    Correrct, but the cluster has lost quorum. becuase you only have two nodes.  The remaining node should handle reads, it should not handle writes.  Can you confirm that the executed commands on the surviving node are reads and not writes.  What are the values of wsrep_cluster_status and wsrep_ready on the surviving node when the failed node is down.  Look at the Galera documentation for the meaning.  The "good" values are Synced/Primary.  

"...But we always need to restart the second one so that it does a fresh SST"    
I presume once you lose quorum and the failed node is down, you are starting it manually?  How large is your gcache fille; it may be too small to do ist. Or the grastate.dat may be invalid after the failure.   Is the ../<datadir>/grastate.dat file showing a correct uuid and seqno=-1 or some other values?  If grastate does not have a valid uuid and a non zero and non negative seqno then it will run sst.  You should be able to get correct values and manually update the grastat.dat file by running something like:
    mysqld --wsrep-recover
then grep for Recover (case sensitive)  in the mysql log for the correct values of uuid and seqno to put in  your grastate.dat file.  Then when you start the node it will do ist not sst. 
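For reference, a healthy grastate.dat looks roughly like this (the format may differ a bit by version; uuid and seqno below are just placeholders for the values --wsrep-recover reports):
    # GALERA saved state
    version: 2.1
    uuid:    <your cluster uuid>
    seqno:   <recovered seqno>
    safe_to_bootstrap: 0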

I would also set gmcast.segment=2 for your failing node in the wsrep_provider_options, since it's effectively in a different segment given you only have two. Again, I recommend you get a third node, or at least a garbd node, so you have an odd number; then you can suffer a single outage and the cluster will remain valid.
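In my.cnf on that node it would look something like this (keep whatever other provider options you already have; the garbd line is only an illustration with placeholder host names and cluster name):
    wsrep_provider_options = "gmcast.segment=2; <your other options>"
    # or run an arbitrator on a third machine instead of a full node:
    garbd --group <your_cluster_name> --address gcomm://node1:4567,node2:4567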

The timeout is evs.suspect_timeout. After this the node is suspected to be dead. If the surviving nodes agree, it will be dropped from the cluster even before evs.inactive_timeout.
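Both can be raised in wsrep_provider_options, for example (the values here are only an illustration; I believe the defaults are PT5S and PT15S):
    wsrep_provider_options = "evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; <your other options>"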

Tschüss,
John 

Vinicius Grippa

Apr 26, 2024, 1:09:31 PM
to john danilson, codership
Hi,

Sorry to ask a side question, but what is MDL BF-BF? Metadata lock BF-BF?

Vinicius M. Grippa

Celular / Cell Phone: + 55 17 99128 2987
Skype: vgrippa

Before printing, think about your commitment to the Environment.



john danilson

Apr 26, 2024, 4:52:11 PM
to Vinicius Grippa, codership
A galera multi-master metadata lock collision (BF = "brute force", i.e. the high-priority replication applier).


Galera supports multi-master. But don't do it. You need one, and only one, writer in the cluster at any point in time. Front your cluster with something that will direct traffic to one node. We use F5 VIPs to ProxySQL and thence to the PXC cluster. But there are many dragons here, and it's really hard to correctly control what happens when one node crashes. ProxySQL does a good job, but it is not perfect: when the failing node restarts, unless you intervene in ProxySQL, it will resume connections to that node; now you will likely have connections to two nodes and begin the dreaded collision of transactions. It's a hard problem to solve; we wrote a cluster scheduler script that tries to avoid the issue, but even it cannot correct 100% of the issues.
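As a very rough sketch of the single-writer idea in ProxySQL (host names and hostgroup numbers are just placeholders), on the ProxySQL admin interface:
    -- one writer hostgroup (10), one reader hostgroup (20)
    INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, 'node1', 3306);
    INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (20, 'node2', 3306);
    LOAD MYSQL SERVERS TO RUNTIME;
    SAVE MYSQL SERVERS TO DISK;
How a node gets promoted into the writer hostgroup when the other fails is exactly the hard part I mentioned; that logic lives in our scheduler script and is not shown here.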
--
John Danilson
703.615.4997

dcz01

Apr 29, 2024, 4:40:26 AM
to codership
Hi,

Well, I can confirm that the remaining node has the "good" status, with both variables showing Synced and Primary.
And it can still execute writes even though it is the last remaining node... That's the thing.
Our gcache file is at the default size (which should be about 128 MB, right?). Should this be greater?
The grastate.dat file is good on the normally running node with the Synced/Primary status, and "bad" on the second one, with seqno=-1 like you said and a UUID of only zeros (000000-000000-000000-00000). But then only a systemctl restart mariadb is necessary, and the node does a normal SST and is back online normally after that.

The first error in error.log is:
2024-04-29  7:15:02 2 [ERROR] Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 42, event_type: -94
2024-04-29  7:15:02 2 [ERROR] WSREP: applier could not read binlog event, seqno: 861477354, len: 102
2024-04-29  7:15:02 0 [Note] WSREP: Member 0(server1) initiates vote on 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354,b5ba3e01ce281c50:
2024-04-29  7:15:02 0 [Note] WSREP: Votes over 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354:
   b5ba3e01ce281c50:   1/2
Waiting for more votes.
2024-04-29  7:15:02 0 [Note] WSREP: Member 1(server2) responds to vote on 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354,0000000000000000: Success
2024-04-29  7:15:02 0 [Note] WSREP: Votes over 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354:
   0000000000000000:   1/2
   b5ba3e01ce281c50:   1/2
Winner: 0000000000000000
2024-04-29  7:15:02 2 [ERROR] WSREP: Inconsistency detected: Inconsistent by consensus on 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354
         at ./galera/src/replicator_smm.cpp:process_apply_error():1351
2024-04-29  7:15:02 2 [Note] WSREP: Closing send monitor...
2024-04-29  7:15:02 2 [Note] WSREP: Closed send monitor.
2024-04-29  7:15:02 2 [Note] WSREP: gcomm: terminating thread
2024-04-29  7:15:02 2 [Note] WSREP: gcomm: joining thread
2024-04-29  7:15:02 2 [Note] WSREP: gcomm: closing backend
2024-04-29  7:15:03 2 [Note] WSREP: view(view_id(NON_PRIM,719a00bd-8959,10) memb {
        719a00bd-8959,0
} joined {
} left {
} partitioned {
        bf0f2119-9863,0
})
2024-04-29  7:15:03 2 [Note] WSREP: PC protocol downgrade 1 -> 0
2024-04-29  7:15:03 2 [Note] WSREP: view((empty))
2024-04-29  7:15:03 2 [Note] WSREP: gcomm: closed
2024-04-29  7:15:03 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2024-04-29  7:15:03 0 [Note] WSREP: Flow-control interval: [16, 16]
2024-04-29  7:15:03 0 [Note] WSREP: Received NON-PRIMARY.
2024-04-29  7:15:03 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 861477369)
2024-04-29  7:15:03 0 [Note] WSREP: New SELF-LEAVE.
2024-04-29  7:15:03 0 [Note] WSREP: Flow-control interval: [0, 0]
2024-04-29  7:15:03 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2024-04-29  7:15:03 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 861477369)
2024-04-29  7:15:03 0 [Note] WSREP: RECV thread exiting 0: Success
2024-04-29  7:15:03 2 [Note] WSREP: recv_thread() joined.
2024-04-29  7:15:03 2 [Note] WSREP: Closing replication queue.
2024-04-29  7:15:03 2 [Note] WSREP: Closing slave action queue.
2024-04-29  7:15:03 2 [Note] WSREP: ================================================
View:
  id: 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477369
  status: non-primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 0
  members(1):
        0: 719a00bd-03ad-11ef-8959-5b0fff21bb05, server1
=================================================
2024-04-29  7:15:03 2 [Note] WSREP: Non-primary view
2024-04-29  7:15:03 2 [Note] WSREP: Server status change synced -> connected
2024-04-29  7:15:03 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2024-04-29  7:15:03 2 [Note] WSREP: ================================================
View:
  id: 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477369
  status: non-primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: yes
  own_index: -1
  members(0):
=================================================
2024-04-29  7:15:03 2 [Note] WSREP: Non-primary view
2024-04-29  7:15:03 2 [Note] WSREP: Server status change connected -> disconnected
2024-04-29  7:15:03 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2024-04-29  7:15:03 0 [Note] WSREP: Service thread queue flushed.
2024-04-29  7:15:03 2 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: 5
2024-04-29  7:15:03 2 [Note] WSREP: Applier thread exiting ret: 0 thd: 2



And the other error:
2024-04-20  0:23:45 7 [Note] WSREP: MDL BF-BF conflict
schema:  xxxxxxxxxxx
request: (7     seqno 492509518         wsrep (toi, exec, committed) cmd 0 45   OPTIMIZE TABLE exportdb_sqls)
granted: (8     seqno 492509523         wsrep (high priority, exec, committing) cmd 0 161       (null))
2024-04-20  0:23:45 7 [ERROR] Aborting