Node won't join Galera cluster with MariaDB

Chachia Mohamed

Mar 2, 2023, 4:56:34 AM

to codership

I have set up a cluster with 3 nodes.
The galera.cnf file is:

cat /etc/my.cnf.d/galera.cnf
[mysqld]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_address="gcomm://1xx.1.23.112,1xx.1.23.97,1xx.1.23.95"
binlog_format=row
wsrep_sst_method=mariabackup
wsrep_sst_auth = xxxxxxx:yyyyy
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_node_address=1xx.1.23.97
wsrep_node_name=node2

The cluster started fine with all three nodes, but after I stopped the MariaDB service on 1xx.1.23.112, it will not start again.
This is the output of systemctl status -l mariadb:

● mariadb.service - MariaDB 10.4.28 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/mariadb.service.d
└─migrated-from-my.cnf-settings.conf
Active: activating (start) since Thu 2023-03-02 09:03:51 UTC; 2s ago
Docs: man:mysqld(8)
https://mariadb.com/kb/en/library/systemd/
Process: 10370 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
Process: 10368 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Main PID: 10453 (mysqld)
CGroup: /system.slice/mariadb.service
└─10453 /usr/sbin/mysqld --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1

Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: restore pc from disk failed
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: GMCast version 0
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: EVS version 1
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '1xx.1.23.95:,1xx.1.23.97:,1xx.1.23.112:'
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://1xx.1.23.112:4567
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') connection established to bc2b6b87-9739 tcp://xx.1.23.95:4567
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02 9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') connection established to 3cbdb95a-b46c tcp://xx.1.23.97:4567

mohsen shahbazi

Mar 2, 2023, 8:29:59 AM

to codership

Hi Mohamed,

Could you please post the last lines of the MySQL error log? It should be somewhere like /var/log/mysql/error.log.

You can also set wsrep_provider_options such as "pc.recovery" or "gcache.recover" to control how the node recovers. The second one in particular lets the node use its gcache to recover and join the cluster again. A more detailed explanation is possible once we have seen the logs.
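
For illustration, the two recovery options mentioned above could be set in galera.cnf roughly like this. This is only a sketch: the values are assumptions, the defaults differ between Galera versions (pc.recovery is normally already enabled), and a node restart is needed for the change to take effect.

```ini
[mysqld]
# Assumed values, check the defaults of your Galera version first.
# pc.recovery: restore the primary component state from disk on startup.
# gcache.recover: try to recover the gcache on startup instead of discarding it.
wsrep_provider_options="pc.recovery=true;gcache.recover=yes"
```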

Chachia Mohamed

Mar 10, 2023, 3:46:10 AM

to codership

We fixed it. Here is what worked for us:

Connect to one of the working nodes and open a client session with mysql -u root -p
Run SHOW GLOBAL STATUS LIKE 'wsrep_%';
Copy the wsrep_cluster_state_uuid value and the wsrep_last_committed value.
On the node that should rejoin, paste those values into /var/lib/mysql/grastate.dat as uuid and seqno.
Restart the MariaDB service on that node: systemctl restart mariadb
Back on the working node, go to the mysql prompt again and run SET GLOBAL wsrep_cluster_address="gcomm://ip of node 1,ip of node 2,ip of node 3";
Finally, go back to the joiner node and restart the MariaDB service once more: systemctl restart mariadb
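
The grastate.dat step above can be sketched as a small shell helper. This is a hedged sketch, not an official tool: render_grastate is a hypothetical function name, and the file layout (version 2.1 with a safe_to_bootstrap line) is assumed from what recent Galera releases write, so compare it with the existing file on your node before overwriting anything.

```shell
#!/bin/sh
# render_grastate (hypothetical helper): print a grastate.dat body
# for the given cluster state UUID and sequence number.
render_grastate() {
    uuid="$1"    # value of wsrep_cluster_state_uuid from a working node
    seqno="$2"   # value of wsrep_last_committed from a working node
    printf '# GALERA saved state\n'
    printf 'version: 2.1\n'
    printf 'uuid:    %s\n' "$uuid"
    printf 'seqno:   %s\n' "$seqno"
    # A rejoining node must not be marked as safe to bootstrap from.
    printf 'safe_to_bootstrap: 0\n'
}

# Possible usage, left commented out because it changes node state.
# On a working node, read the two values:
#   mysql -u root -p -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_state_uuid'"
#   mysql -u root -p -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_last_committed'"
# On the joiner, keep a backup and write the file:
#   cp /var/lib/mysql/grastate.dat /var/lib/mysql/grastate.dat.bak
#   render_grastate "<uuid>" "<seqno>" > /var/lib/mysql/grastate.dat
#   systemctl restart mariadb
```

Keeping a backup of the old grastate.dat makes it possible to roll back if the node still refuses to join.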
