MariaDB Galera Cluster stalls.


Дмитрий Проняев

Aug 11, 2017, 8:32:20 AM
to codership

I have a cluster consisting of 2 physical nodes (2x Intel Xeon E5-2670 2.6 GHz, 192 GB, RAID1 SSD, OS Debian 8, mariadb-server-10.2) plus an arbitrator (on Proxmox). I write to only one node at a time. The cluster works fine for anywhere from 1 hour to several days and then stalls. It doesn't show any errors in syslog or mysql-error.log.

PROCESSLIST shows some queries in the "Query End" state. They cannot be killed (the status just changes from "Query End" to "Killed"). mysql -u root -e "SHOW STATUS LIKE 'wsrep_%'" shows that the nodes are Synced, wsrep_local_recv_queue = 0, and so on.
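
The status check above could be scripted so that a snapshot is easy to capture on each node at the moment of a stall. This is only a minimal sketch: it assumes the tab-separated output of `mysql -u root -N -B -e "SHOW STATUS LIKE 'wsrep_%'"` is passed in as text, and the names `parse_wsrep_status`, `summarize` and the choice of `KEY_FIELDS` are illustrative, not part of any tool.

```python
# Hypothetical helper: summarize `SHOW STATUS LIKE 'wsrep_%'` output.
# Input is assumed to be tab-separated "name<TAB>value" lines,
# as produced by the mysql client in batch mode (-N -B).

# Status variables that are usually looked at first (illustrative selection).
KEY_FIELDS = (
    "wsrep_local_state_comment",
    "wsrep_cluster_status",
    "wsrep_cluster_size",
    "wsrep_local_recv_queue",
    "wsrep_local_send_queue",
    "wsrep_flow_control_paused",
)


def parse_wsrep_status(text):
    """Return a dict of wsrep_* status variables found in the text."""
    status = {}
    for line in text.splitlines():
        name, sep, value = line.partition("\t")
        # Skip lines without a tab and anything that is not a wsrep_* variable.
        if sep and name.startswith("wsrep_"):
            status[name.strip()] = value.strip()
    return status


def summarize(status):
    """Pick the key fields; missing ones are reported as None."""
    return {field: status.get(field) for field in KEY_FIELDS}
```

Saving the summary from both nodes when the stall happens makes it possible to compare them side by side instead of reading the full list by eye.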

The only way to get it working again is to restart the node.

Please help me understand what the problem is and how to fix it.


Here is my.cnf with galera options:


[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

# This was formerly known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0

[mysqld]
#
# * Basic Settings
#
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc_messages_dir = /usr/share/mysql
lc_messages = en_US
skip-external-locking
performance_schema=ON

# MyISAM #
key-buffer-size = 32M
myisam-recover = FORCE,BACKUP

# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000
skip-name-resolve

# DATA STORAGE #
datadir = /var/lib/mysql/

# BINARY LOGGING #
log-bin = /var/lib/mysql/mysql-bin
expire-logs-days = 14
sync-binlog = 1

# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 50
open-files-limit = 65535
table-definition-cache = 4096
table-open-cache = 4096
innodb_flush_log_at_trx_commit = 1

# INNODB #
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 512M
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table = 1
innodb-buffer-pool-size = 160G

# LOGGING #
log-error = /var/lib/mysql/mysql-error.log
log-queries-not-using-indexes = 1
slow-query-log = 1
slow-query-log-file = /var/lib/mysql/mysql-slow.log



#GALERA
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
# Galera Cluster Configuration
wsrep_cluster_name="galera-cluster"
wsrep_cluster_address="gcomm://ip1,ip2"
# Galera Synchronization Configuration
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=user:password
# Galera Node Configuration
wsrep_node_address="node_ip"
wsrep_node_name="galera-node1"

# Tuning
wsrep_retry_autocommit = 4
wsrep_slave_threads = 64
wsrep_provider_options="gcache.size=5G; gcs.fc_limit = 320; gcs.fc_factor=0.8;"




!includedir /etc/mysql/conf.d/
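
The file above sets `datadir` twice and `innodb_flush_log_at_trx_commit` twice (once with underscores, once with dashes). The values agree, so this is probably harmless, but such repeats are easy to miss in a long file. A minimal sketch of a duplicate check follows. It relies on the fact that option files treat dashes and underscores in option names as equivalent; the function name `find_duplicate_options` is hypothetical, and `!include` directives are simply skipped rather than followed.

```python
from collections import defaultdict


def find_duplicate_options(cnf_text):
    """Return {(section, option): [values...]} for options set more than once.

    Option names are normalized by replacing '-' with '_', since both
    spellings refer to the same option in a my.cnf file.
    """
    seen = defaultdict(list)
    section = None
    for raw in cnf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and !include / !includedir directives.
        if not line or line.startswith(("#", ";", "!")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        name, _, value = line.partition("=")
        key = name.strip().replace("-", "_")
        seen[(section, key)].append(value.strip())
    return {k: v for k, v in seen.items() if len(v) > 1}
```

Running it over the `[mysqld]` section above should report the two options mentioned; any pair whose values differ deserves a closer look, because the later occurrence is the one that takes effect.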

Дмитрий Проняев

Aug 14, 2017, 3:00:26 AM
to codership


On Friday, August 11, 2017 at 15:32:20 UTC+3, Дмитрий Проняев wrote:
UPDATE: the problem occurs only when I write to node1 (and read from both nodes). If I write to node2 (and read from both nodes), everything works fine. Both nodes have the same hardware configuration. I have reinstalled Debian on node1, replaced the LAN cable, and connected node1 to the same Cisco device as node2, with no result.