Hi,
Well, I can confirm that the remaining node has the "good" status, with both variables set to Synced and Primary.
And it can still execute write queries even though it is the last node... That's the thing.
Our gcache file is only the default size (which should be about 128 MB, right?). Should it be larger?
The grastate.dat file looks good on the node that keeps running normally (the one with the Synced/Primary status) and "bad" on the second one, with seqno=-1 like you said and a UUID of all zeros (00000000-0000-0000-0000-000000000000). But then only a systemctl restart mariadb is necessary, and the node then does a normal SST and comes back online normally after that.
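To make the "bad" state concrete, here is a minimal sketch of how it could be detected, assuming grastate.dat uses the usual "key: value" layout with "#" comment lines. The function names and the sample contents are only illustrative, not taken from our servers.

```python
# Assumption: grastate.dat consists of "key: value" lines plus "#" comments.
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def parse_grastate(text):
    """Return the key/value pairs found in the text of a grastate.dat file."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def is_undefined(state):
    """True when the saved position is all-zero UUID with seqno -1."""
    return state.get("uuid") == ZERO_UUID and state.get("seqno") == "-1"


# Illustrative contents only (hypothetical, matching the "bad" node described above).
sample = """# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
safe_to_bootstrap: 0
"""

if __name__ == "__main__":
    print(is_undefined(parse_grastate(sample)))
```

With a state like this the node has no usable position of its own, which fits with it doing a full SST after the restart.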
The first error in error.log is:
2024-04-29 7:15:02 2 [ERROR] Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 42, event_type: -94
2024-04-29 7:15:02 2 [ERROR] WSREP: applier could not read binlog event, seqno: 861477354, len: 102
2024-04-29 7:15:02 0 [Note] WSREP: Member 0(server1) initiates vote on 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354,b5ba3e01ce281c50:
2024-04-29 7:15:02 0 [Note] WSREP: Votes over 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354:
b5ba3e01ce281c50: 1/2
Waiting for more votes.
2024-04-29 7:15:02 0 [Note] WSREP: Member 1(server2) responds to vote on 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354,0000000000000000: Success
2024-04-29 7:15:02 0 [Note] WSREP: Votes over 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354:
0000000000000000: 1/2
b5ba3e01ce281c50: 1/2
Winner: 0000000000000000
2024-04-29 7:15:02 2 [ERROR] WSREP: Inconsistency detected: Inconsistent by consensus on 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477354
at ./galera/src/replicator_smm.cpp:process_apply_error():1351
2024-04-29 7:15:02 2 [Note] WSREP: Closing send monitor...
2024-04-29 7:15:02 2 [Note] WSREP: Closed send monitor.
2024-04-29 7:15:02 2 [Note] WSREP: gcomm: terminating thread
2024-04-29 7:15:02 2 [Note] WSREP: gcomm: joining thread
2024-04-29 7:15:02 2 [Note] WSREP: gcomm: closing backend
2024-04-29 7:15:03 2 [Note] WSREP: view(view_id(NON_PRIM,719a00bd-8959,10) memb {
719a00bd-8959,0
} joined {
} left {
} partitioned {
bf0f2119-9863,0
})
2024-04-29 7:15:03 2 [Note] WSREP: PC protocol downgrade 1 -> 0
2024-04-29 7:15:03 2 [Note] WSREP: view((empty))
2024-04-29 7:15:03 2 [Note] WSREP: gcomm: closed
2024-04-29 7:15:03 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2024-04-29 7:15:03 0 [Note] WSREP: Flow-control interval: [16, 16]
2024-04-29 7:15:03 0 [Note] WSREP: Received NON-PRIMARY.
2024-04-29 7:15:03 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 861477369)
2024-04-29 7:15:03 0 [Note] WSREP: New SELF-LEAVE.
2024-04-29 7:15:03 0 [Note] WSREP: Flow-control interval: [0, 0]
2024-04-29 7:15:03 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2024-04-29 7:15:03 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 861477369)
2024-04-29 7:15:03 0 [Note] WSREP: RECV thread exiting 0: Success
2024-04-29 7:15:03 2 [Note] WSREP: recv_thread() joined.
2024-04-29 7:15:03 2 [Note] WSREP: Closing replication queue.
2024-04-29 7:15:03 2 [Note] WSREP: Closing slave action queue.
2024-04-29 7:15:03 2 [Note] WSREP: ================================================
View:
id: 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477369
status: non-primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 0
members(1):
0: 719a00bd-03ad-11ef-8959-5b0fff21bb05, server1
=================================================
2024-04-29 7:15:03 2 [Note] WSREP: Non-primary view
2024-04-29 7:15:03 2 [Note] WSREP: Server status change synced -> connected
2024-04-29 7:15:03 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2024-04-29 7:15:03 2 [Note] WSREP: ================================================
View:
id: 5ad80727-7fad-11ee-a05e-4e9bdce1cc77:861477369
status: non-primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: yes
own_index: -1
members(0):
=================================================
2024-04-29 7:15:03 2 [Note] WSREP: Non-primary view
2024-04-29 7:15:03 2 [Note] WSREP: Server status change connected -> disconnected
2024-04-29 7:15:03 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2024-04-29 7:15:03 0 [Note] WSREP: Service thread queue flushed.
2024-04-29 7:15:03 2 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: 5
2024-04-29 7:15:03 2 [Note] WSREP: Applier thread exiting ret: 0 thd: 2
And the other error:
2024-04-20 0:23:45 7 [Note] WSREP: MDL BF-BF conflict
schema: xxxxxxxxxxx
request: (7 seqno 492509518 wsrep (toi, exec, committed) cmd 0 45 OPTIMIZE TABLE exportdb_sqls)
granted: (8 seqno 492509523 wsrep (high priority, exec, committing) cmd 0 161 (null))
2024-04-20 0:23:45 7 [ERROR] Aborting