Hi all,
I'm new to Galera. I was able to configure a three-node cluster, and it was working fine until I shut down the instances. This is my config:
my.cnf
#
# This group is read both by the client and the server
# use it for options that affect everything
#
[client-server]
#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
[mysqld_safe]
#debug=d,info,error,query:o,/var/log/mysqld.trace
log_error=/var/log/mysql_error.log
[mysqld]
#log_error=/var/log/mysql_error.log
general_log_file = /var/log/mysql.log
general_log = 1
/etc/my.cnf.d/server.cnf
#
# These groups are read by MariaDB server.
# Use it for options that only the server (but not clients) should see
#
# See the examples of server my.cnf files in /usr/share/mysql/
#
# this is read by the standalone daemon and embedded servers
[server]
# this is only for the mysqld standalone daemon
[mysqld]
#
# * Galera-related settings
#
#[galera]
# Mandatory settings
#wsrep_on=ON
#wsrep_provider=
#wsrep_cluster_address=
#binlog_format=row
#default_storage_engine=InnoDB
#innodb_autoinc_lock_mode=2
#
# Allow server to accept connections on all interfaces.
#
#bind-address=0.0.0.0
#
# Optional setting
#wsrep_slave_threads=1
#innodb_flush_log_at_trx_commit=0
# this is only for embedded server
### Start Lumiserv Configuration
[galera]
# Mandatory settings
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
#wsrep_cluster_address='gcomm://'
wsrep_cluster_address='gcomm://172.31.24.77,172.31.37.80,172.31.4.41'
wsrep_cluster_name='lumiservgalera'
wsrep_node_address='172.31.4.41'
wsrep_node_name='galera3'
wsrep_sst_method=rsync
wsrep_debug=ON
wsrep_log_conflicts=ON
wsrep_dbug_option=ON
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
### End Lumiserv Configuration
[embedded]
# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
# This group is only read by MariaDB-10.1 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mariadb-10.1]
When I started the instances again, the service would not start. This is the error I get:
Sep 3 18:43:50 ip-172-31-4-41 systemd: Starting MariaDB database server...
Sep 3 18:43:52 ip-172-31-4-41 sh: WSREP: Recovered position 70588c9b-5c03-11e6-bca8-1f3f45fdb070:127
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] /usr/sbin/mysqld (mysqld 10.1.16-MariaDB) starting as process 3622 ...
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: Setting wsrep_ready to 0
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: Read nil XID from storage engines, skipping position init
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: wsrep_load(): Galera 25.3.15(r3578) by Codership Oy <in...@codership.com> loaded successfully.
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: CRC-32C: using hardware acceleration.
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: Found saved state: 70588c9b-5c03-11e6-bca8-1f3f45fdb070:-1
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 172.31.4.41; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0;
gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false;
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619389335296 [Note] WSREP: Service thread queue flushed.
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: Assign initial position for certification: 127, protocol version: -1
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: wsrep_sst_grab()
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: Start replication
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: Setting initial position to 70588c9b-5c03-11e6-bca8-1f3f45fdb070:127
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: protonet asio version 0
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: Using CRC-32C for message checksums.
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: backend: asio
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: restore pc from disk failed
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: GMCast version 0
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: (5c8ef3f0, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: (5c8ef3f0, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: EVS version 0
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Note] WSREP: gcomm: connecting to group 'lumiservgalera', peer '172.31.24.77:,172.31.37.80:,172.31.4.41:'
Sep 3 18:43:52 ip-172-31-4-41 mysqld: 2016-09-03 18:43:52 140619638335616 [Warning] WSREP: (5c8ef3f0, 'tcp://0.0.0.0:4567') address 'tcp://172.31.4.41:4567' points to own listening address, blacklisting
Sep 3 18:43:55 ip-172-31-4-41 mysqld: 2016-09-03 18:43:55 140619638335616 [Warning] WSREP: no nodes coming from prim view, prim not possible
Sep 3 18:43:55 ip-172-31-4-41 mysqld: 2016-09-03 18:43:55 140619638335616 [Note] WSREP: view(view_id(NON_PRIM,5c8ef3f0,1) memb {
Sep 3 18:43:55 ip-172-31-4-41 mysqld: 5c8ef3f0,0
Sep 3 18:43:55 ip-172-31-4-41 mysqld: } joined {
Sep 3 18:43:55 ip-172-31-4-41 mysqld: } left {
Sep 3 18:43:55 ip-172-31-4-41 mysqld: } partitioned {
Sep 3 18:43:55 ip-172-31-4-41 mysqld: })
Sep 3 18:43:56 ip-172-31-4-41 mysqld: 2016-09-03 18:43:56 140619638335616 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50319S), skipping check
Sep 3 18:44:25 ip-172-31-4-41 mysqld: 2016-09-03 18:44:25 140619638335616 [Note] WSREP: view((empty))
Sep 3 18:44:25 ip-172-31-4-41 mysqld: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
Sep 3 18:44:25 ip-172-31-4-41 mysqld: at gcomm/src/pc.cpp:connect():162
Sep 3 18:44:25 ip-172-31-4-41 mysqld: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
Sep 3 18:44:25 ip-172-31-4-41 mysqld: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'lumiservgalera' at 'gcomm://172.31.24.77,172.31.37.80,172.31.4.41': -110 (Connection timed out)
Sep 3 18:44:25 ip-172-31-4-41 mysqld: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: gcs connect failed: Connection timed out
Sep 3 18:44:25 ip-172-31-4-41 mysqld: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: wsrep::connect(gcomm://172.31.24.77,172.31.37.80,172.31.4.41) failed: 7
Sep 3 18:44:25 ip-172-31-4-41 mysqld: 2016-09-03 18:44:25 140619638335616 [ERROR] Aborting
Sep 3 18:44:26 ip-172-31-4-41 systemd: mariadb.service: main process exited, code=exited, status=1/FAILURE
Sep 3 18:44:26 ip-172-31-4-41 systemd: Failed to start MariaDB database server.
Sep 3 18:44:26 ip-172-31-4-41 systemd: Unit mariadb.service entered failed state.
Sep 3 18:44:26 ip-172-31-4-41 systemd: mariadb.service failed.
systemctl status mysql -l
● mariadb.service - MariaDB database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/mariadb.service.d
└─migrated-from-my.cnf-settings.conf
Active: failed (Result: exit-code) since Sat 2016-09-03 18:44:26 UTC; 1min 57s ago
Process: 3015 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Process: 3622 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
Process: 3528 ExecStartPre=/bin/sh -c VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
Process: 3526 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Main PID: 3622 (code=exited, status=1/FAILURE)
Status: "MariaDB server is down"
Sep 03 18:44:25 ip-172-31-4-41.eu-west-1.compute.internal mysqld[3622]: at gcomm/src/pc.cpp:connect():162
Sep 03 18:44:25 ip-172-31-4-41.eu-west-1.compute.internal mysqld[3622]: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
Sep 03 18:44:25 ip-172-31-4-41.eu-west-1.compute.internal mysqld[3622]: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'lumiservgalera' at 'gcomm://172.31.24.77,172.31.37.80,172.31.4.41': -110 (Connection timed out)
Sep 03 18:44:25 ip-172-31-4-41.eu-west-1.compute.internal mysqld[3622]: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: gcs connect failed: Connection timed out
Sep 03 18:44:25 ip-172-31-4-41.eu-west-1.compute.internal mysqld[3622]: 2016-09-03 18:44:25 140619638335616 [ERROR] WSREP: wsrep::connect(gcomm://172.31.24.77,172.31.37.80,172.31.4.41) failed: 7
Sep 03 18:44:25 ip-172-31-4-41.eu-west-1.compute.internal mysqld[3622]: 2016-09-03 18:44:25 140619638335616 [ERROR] Aborting
Sep 03 18:44:26 ip-172-31-4-41.eu-west-1.compute.internal systemd[1]: mariadb.service: main process exited, code=exited, status=1/FAILURE
Sep 03 18:44:26 ip-172-31-4-41.eu-west-1.compute.internal systemd[1]: Failed to start MariaDB database server.
Sep 03 18:44:26 ip-172-31-4-41.eu-west-1.compute.internal systemd[1]: Unit mariadb.service entered failed state.
Sep 03 18:44:26 ip-172-31-4-41.eu-west-1.compute.internal systemd[1]: mariadb.service failed.
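Since every error above is a connection timeout while opening the gcomm backend, one thing that could be checked is whether the other nodes accept TCP connections on the Galera port at all. Below is a minimal sketch of such a check, assuming Python 3 is available on the node. The addresses are the ones from wsrep_cluster_address and 4567 is the base_port from the log; the helper names port_open and check_peers are made up for illustration and are not part of any Galera or MariaDB tooling.

```python
import socket

# Peer addresses copied from wsrep_cluster_address in server.cnf;
# 4567 is the base_port shown in the "Passing config to GCS" log line.
PEERS = ["172.31.24.77", "172.31.37.80", "172.31.4.41"]
GALERA_PORT = 4567


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def check_peers(peers=PEERS, port=GALERA_PORT, timeout=3.0):
    """Map each peer address to whether its port accepts connections."""
    return {host: port_open(host, port, timeout) for host in peers}


if __name__ == "__main__":
    for host, ok in check_peers().items():
        state = "reachable" if ok else "NOT reachable"
        print(f"{host}:{GALERA_PORT} {state}")
```

This only tells whether something is listening on the port of each peer; it says nothing about the cluster state of that peer.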
Is there a timeout parameter that would let a node wait a minute or so until the other nodes are available? I don't understand what the problem is.
Any help is really appreciated.
Regards