Node won't join Galera cluster with MariaDB


Chachia Mohamed

Mar 2, 2023, 4:56:34 AM
to codership
I have set up a cluster with 3 nodes.
The galera.cnf file (this copy is from node2) is:

cat /etc/my.cnf.d/galera.cnf
[mysqld]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_address="gcomm://1xx.1.23.112,1xx.1.23.97,1xx.1.23.95"
binlog_format=row
wsrep_sst_method=mariabackup
wsrep_sst_auth = xxxxxxx:yyyyy
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_node_address=1xx.1.23.97
wsrep_node_name=node2

The cluster started fine with all three nodes, but after I stopped the MariaDB service on 1xx.1.23.112, it will not start again.
The output of systemctl status -l mariadb is:

● mariadb.service - MariaDB 10.4.28 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: activating (start) since Thu 2023-03-02 09:03:51 UTC; 2s ago
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
  Process: 10370 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
  Process: 10368 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
 Main PID: 10453 (mysqld)
   CGroup: /system.slice/mariadb.service
           └─10453 /usr/sbin/mysqld --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1

Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: restore pc from disk failed
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: GMCast version 0
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: EVS version 1
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '1xx.1.23.95:,1xx.1.23.97:,1xx.1.23.112:'
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://1xx.1.23.112:4567
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') connection established to bc2b6b87-9739 tcp://xx.1.23.95:4567
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
Mar 02 09:03:53 testupgrade.xx.eu mysqld[10453]: 2023-03-02  9:03:53 0 [Note] WSREP: (2805daf1-bd63, 'tcp://0.0.0.0:4567') connection established to 3cbdb95a-b46c tcp://xx.1.23.97:4567


mohsen shahbazi

Mar 2, 2023, 8:29:59 AM
to codership
Hi Mohamed,
Could you please post the last lines of the MySQL error log? It should be somewhere like /var/log/mysql/error.log.
You can also set wsrep_provider_options to control recovery, for example "pc.recovery" or "gcache.recover". The second option in particular lets the node use its gcache to recover and rejoin the cluster. A more detailed explanation is possible once we have seen the logs.
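
To see how a recovery option such as pc.recovery is currently set on a node, the value of the wsrep_provider_options variable can be searched for a single key. The following is only a sketch: get_provider_option is a hypothetical helper name, and it assumes the variable's value is a semicolon-separated list of "key = value" pairs.

```shell
# Hypothetical helper (not part of MariaDB or Galera): print the value of one
# key from a wsrep_provider_options string of the form "k1 = v1; k2 = v2; ".
# Usage: get_provider_option KEY OPTIONS_STRING
get_provider_option() {
    key=$1
    opts=$2
    # Put each "key = value" pair on its own line, then match on the key.
    printf '%s\n' "$opts" | tr ';' '\n' | awk -v k="$key" '
        {
            pos = index($0, "=")
            if (pos == 0) next                 # skip fragments without "="
            name = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)  # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (name == k) print val
        }'
}

# Example (the options string is pasted from the output of
# SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';):
#   get_provider_option pc.recovery "<pasted options string>"
```

The helper prints nothing when the key is not present, so an empty result means the option does not appear in the pasted string.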

Chachia Mohamed

Mar 10, 2023, 3:46:10 AM
to codership
We fixed it, and here is how:
  1. Connect to one of the working nodes and open a client session: mysql -u root -p
  2. Run SHOW GLOBAL STATUS LIKE 'wsrep_%';
  3. Copy the wsrep_cluster_state_uuid value and the wsrep_last_committed value.
  4. On the node that should rejoin, paste those values into the uuid and seqno fields of /var/lib/mysql/grastate.dat.
  5. Try restarting the MariaDB service: systemctl restart mariadb
  6. Back on the working node, at the mysql prompt, run SET GLOBAL wsrep_cluster_address="gcomm://ip of node1,ip of node 2,ip of node 3";
  7. Return to the joining node and restart the MariaDB service again: systemctl restart mariadb
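
Steps 3 and 4 above can be sketched as a small shell helper. This is only a sketch under stated assumptions: set_grastate is a hypothetical name, and it assumes grastate.dat contains one line starting with "uuid:" and one starting with "seqno:", as the steps imply. It keeps a backup copy of the file before changing it.

```shell
# Hypothetical helper (not part of MariaDB or Galera): rewrite the uuid and
# seqno lines of a grastate.dat file, leaving every other line untouched.
# Usage: set_grastate FILE UUID SEQNO
set_grastate() {
    file=$1
    uuid=$2
    seqno=$3
    # Keep a backup copy so the original state can be restored.
    cp -p "$file" "$file.bak" || return 1
    # Read from the backup and write the edited result over the original.
    sed -e "s/^uuid:.*/uuid:    $uuid/" \
        -e "s/^seqno:.*/seqno:   $seqno/" \
        "$file.bak" > "$file"
}

# Example (values are placeholders; take the real ones from step 3):
#   set_grastate /var/lib/mysql/grastate.dat <wsrep_cluster_state_uuid> <wsrep_last_committed>
```

Writing through a redirect rather than replacing the file keeps its existing owner and permissions, which matters because the server process has to read it on the next start.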