MaxScale auto failover occurs when auto_failover is disabled

Darcy Westfall

Jan 6, 2022, 2:38:37 PM
to MaxScale
Hello,

We're running MaxScale with MariaDB in a master/slave configuration. We've hit an issue where MaxScale promotes a new master even though we have explicitly disabled automatic failover (auto_failover=0). It only happens in a very specific scenario, which I've been able to reproduce in a dev environment with the following:

Container Versions:

mariadb/maxscale:2.4.12 (although I also tested with 2.5.17)
mariadb/mariadb:10.4

Maxscale Configuration:

    [mariadb-0.mariadb]
    type=server
    address=mariadb-0.mariadb
    port=3306
    protocol=MariaDBBackend
 
    [mariadb-1.mariadb]
    type=server
    address=mariadb-1.mariadb
    port=3306
    protocol=MariaDBBackend
 
    [cluster-monitor]
    type=monitor
    module=mariadbmon
    servers=mariadb-0.mariadb, mariadb-1.mariadb
    user=monitor_user
    password=<redacted>
    monitor_interval=5000
    auto_failover=0
    failover_timeout=20
    auto_rejoin=false
    failcount=5
    master_failure_timeout=10
    verify_master_failure=true
    switchover_timeout=90
    replication_user=repl
    replication_password=<redacted>
    detect_replication_lag=true
    enforce_read_only_slaves=true
 
    [master-service]
    type=service
    router=readconnroute
    router_options=master
    servers=mariadb-0.mariadb, mariadb-1.mariadb
    user=maxscale
    password=<redacted>
    enable_root_user=1
 
    [slave-service]
    type=service
    router=readconnroute
    router_options=slave
    servers=mariadb-0.mariadb, mariadb-1.mariadb
    user=maxscale
    password=<redacted>
 
    [master-listener]
    type=listener
    service=master-service
    protocol=MariaDBClient
    port=3306
 
    [slave-listener]
    type=listener
    service=slave-service
    protocol=MariaDBClient
    port=3307
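
As a sanity check on the monitor settings, here is a minimal sketch that reads the relevant keys with Python's configparser. This is not how MaxScale itself parses the file; it only confirms that auto_failover=0 reads as a boolean false and works out how long failcount monitor updates take. It assumes an unsuffixed monitor_interval is in milliseconds.

```python
import configparser

# Monitor keys copied from the configuration above (credentials omitted).
MONITOR_CNF = """
[cluster-monitor]
type=monitor
module=mariadbmon
monitor_interval=5000
auto_failover=0
auto_rejoin=false
failcount=5
"""

parser = configparser.ConfigParser()
parser.read_string(MONITOR_CNF)
monitor = parser["cluster-monitor"]

# configparser treats 0/1, true/false, yes/no and on/off as booleans.
auto_failover = monitor.getboolean("auto_failover")

# Assumption: monitor_interval without a unit suffix is in milliseconds.
interval_ms = monitor.getint("monitor_interval")
failcount = monitor.getint("failcount")
window_s = interval_ms * failcount / 1000

print(auto_failover)  # False
print(window_s)       # 25.0
```

With these values, the master has to be down for 25 seconds (5 updates at 5 seconds each) before the monitor treats it as no longer valid.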

Steps to Reproduce:

1. Verify everything is working and that mariadb-1 is a slave of mariadb-0.
2. Stop the MaxScale container.
3. Restart the MariaDB slave container.
4. Issue the 'stop slave' command on the slave (mariadb-1).
5. Stop the master container.
6. Start the MaxScale container.

Here are the logs when maxscale is started:

+ mkdir -p /maxscale/logs
+ mkdir -p /maxscale/cache
+ mkdir -p /maxscale/data
+ maxscale -d -l stdout --logdir=/maxscale/logs --cachedir=/maxscale/cache --piddir=/maxscale --datadir=/maxscale/data
Info : MaxScale will be run in the terminal process.
2022-01-06 19:08:12   notice : syslog logging is enabled.
2022-01-06 19:08:12   notice : maxlog logging is enabled.
2022-01-06 19:08:12   notice : Using up to 4.69GiB of memory for query classifier cache
2022-01-06 19:08:12   notice : The collection of SQLite memory allocation statistics turned off.
2022-01-06 19:08:12   notice : Threading mode of SQLite set to Multi-thread.
2022-01-06 19:08:12   notice : MariaDB MaxScale 2.4.12 started (Commit: 7756d7c25708811b7a52b72156de752d6ac91a35)
2022-01-06 19:08:12   notice : MaxScale is running in process 1
Configuration file : /etc/maxscale.cnf
Log directory      : /maxscale/logs
Data directory     : /maxscale/data
Module directory   : /usr/lib/x86_64-linux-gnu/maxscale
Service cache      : /maxscale/cache

2022-01-06 19:08:12   notice : Configuration file: /etc/maxscale.cnf
2022-01-06 19:08:12   notice : Log directory: /maxscale/logs
2022-01-06 19:08:12   notice : Data directory: /maxscale/data
2022-01-06 19:08:12   notice : Module directory: /usr/lib/x86_64-linux-gnu/maxscale
2022-01-06 19:08:12   notice : Service cache: /maxscale/cache
2022-01-06 19:08:12   notice : Worker message queue size: 1.00MiB
2022-01-06 19:08:12   notice : No query classifier specified, using default 'qc_sqlite'.
2022-01-06 19:08:12   notice : Loaded module qc_sqlite: V1.0.0 from /usr/lib/x86_64-linux-gnu/maxscale/libqc_sqlite.so
2022-01-06 19:08:12   notice : Query classification results are cached and reused. Memory used per thread: 600.21MiB
2022-01-06 19:08:12   notice : Loading /etc/maxscale.cnf.
2022-01-06 19:08:12   notice : Loading /etc/maxscale.cnf.d/maxscale.cnf.
2022-01-06 19:08:12   error  : Failed to create directory '/var/lib/maxscale/maxscale.cnf.d': 13, Permission denied
2022-01-06 19:08:12   notice : /var/lib/maxscale/maxscale.cnf.d does not exist, not reading.
2022-01-06 19:08:12   notice : Loaded module MariaDBClient: V1.1.0 from /usr/lib/x86_64-linux-gnu/maxscale/libmariadbclient.so
2022-01-06 19:08:12   notice : [readconnroute] Initialise readconnroute router module.
2022-01-06 19:08:12   notice : Loaded module readconnroute: V2.0.0 from /usr/lib/x86_64-linux-gnu/maxscale/libreadconnroute.so
2022-01-06 19:08:12   notice : [mariadbmon] Initialise the MariaDB Monitor module.
2022-01-06 19:08:12   notice : Loaded module mariadbmon: V1.5.0 from /usr/lib/x86_64-linux-gnu/maxscale/libmariadbmon.so
2022-01-06 19:08:12   warning: Parameter 'detect_replication_lag' for module 'cluster-monitor' is deprecated and will be ignored.
2022-01-06 19:08:12   notice : Loaded module MariaDBBackend: V2.0.0 from /usr/lib/x86_64-linux-gnu/maxscale/libmariadbbackend.so
2022-01-06 19:08:12   notice : Loaded module mariadbbackendauth: V1.0.0 from /usr/lib/x86_64-linux-gnu/maxscale/libmariadbbackendauth.so
2022-01-06 19:08:12   notice : Loaded module mariadbauth: V1.1.0 from /usr/lib/x86_64-linux-gnu/maxscale/libmariadbauth.so
2022-01-06 19:08:12   notice : Encrypted password file /maxscale/data/.secrets can't be accessed (No such file or directory). Password encryption is not used.
2022-01-06 19:08:12   notice : Started REST API on [0.0.0.0]:8989
2022-01-06 19:08:12   notice : MaxScale started with 8 worker threads, each with a stack size of 8388608 bytes.
2022-01-06 19:08:12   notice : Starting a total of 2 services...
2022-01-06 19:08:12   error  : [MariaDBAuth] [master-service] Failed to connect to server 'mariadb-0.mariadb' ([mariadb-0.mariadb]:3306) when checking authentication user credentials and permissions: 2005 Unknown MySQL server host 'mariadb-0.mariadb' (-2)
2022-01-06 19:08:12   notice : Server 'mariadb-1.mariadb' charset: utf8mb4
2022-01-06 19:08:12   notice : Server 'mariadb-1.mariadb' version: 10.4.17-MariaDB-1:10.4.17+maria~focal-log
2022-01-06 19:08:12   error  : [MariaDBAuth] Failure loading users data from backend [mariadb-0.mariadb:3306] for service [master-service]. MySQL error 2005, Unknown MySQL server host 'mariadb-0.mariadb' (-2)
2022-01-06 19:08:12   notice : [MariaDBAuth] [master-service] Loaded 14 MySQL users for listener 'master-listener' from server 'mariadb-1.mariadb' with checksum 0x1351e17b.
2022-01-06 19:08:12   notice : Listening for connections at [::]:3306
2022-01-06 19:08:12   notice : Service 'master-service' started (1/2)
2022-01-06 19:08:12   error  : [MariaDBAuth] [slave-service] Failed to connect to server 'mariadb-0.mariadb' ([mariadb-0.mariadb]:3306) when checking authentication user credentials and permissions: 2005 Unknown MySQL server host 'mariadb-0.mariadb' (-2)
2022-01-06 19:08:12   error  : [MariaDBAuth] Failure loading users data from backend [mariadb-0.mariadb:3306] for service [slave-service]. MySQL error 2005, Unknown MySQL server host 'mariadb-0.mariadb' (-2)
2022-01-06 19:08:12   notice : [MariaDBAuth] [slave-service] Loaded 12 MySQL users for listener 'slave-listener' from server 'mariadb-1.mariadb' with checksum 0x51a2b091.
2022-01-06 19:08:12   notice : Listening for connections at [::]:3307
2022-01-06 19:08:12   notice : Service 'slave-service' started (2/2)
2022-01-06 19:08:12   notice : Loaded server states from journal file: /maxscale/data/cluster-monitor/monitor.dat
2022-01-06 19:08:12   error  : Monitor was unable to connect to server mariadb-0.mariadb[mariadb-0.mariadb:3306] : ''
2022-01-06 19:08:12   warning: [mariadbmon] 'mariadb-1.mariadb' is a better master candidate than the current master 'mariadb-0.mariadb'. Master will change when 'mariadb-0.mariadb' is no longer a valid master.
2022-01-06 19:08:12   notice : Server changed state: mariadb-0.mariadb[mariadb-0.mariadb:3306]: master_down. [Master, Running] -> [Down]
2022-01-06 19:08:12   notice : Server changed state: mariadb-1.mariadb[mariadb-1.mariadb:3306]: lost_slave. [Slave, Running] -> [Running]
2022-01-06 19:08:37   warning: [mariadbmon] The current master server 'mariadb-0.mariadb' is no longer valid because it has been down over 5 (failcount) monitor updates and it does not have any running slaves. Selecting new master server.
2022-01-06 19:08:37   warning: [mariadbmon] 'mariadb-0.mariadb' is not a valid master candidate because it is down.
2022-01-06 19:08:37   notice : [mariadbmon] Setting 'mariadb-1.mariadb' as master.
2022-01-06 19:08:37   notice : Server changed state: mariadb-1.mariadb[mariadb-1.mariadb:3306]: new_master. [Running] -> [Master, Running]
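
To make the timeline easier to see, here is a rough sketch that pulls only the "Server changed state" lines out of the log above and measures the time between the old master going down and the new one being selected. The regular expression is my own and is fitted to the lines shown here, so it is not a general MaxScale log parser.

```python
import re
from datetime import datetime

# The relevant lines copied from the log above.
LOG = """\
2022-01-06 19:08:12   notice : Server changed state: mariadb-0.mariadb[mariadb-0.mariadb:3306]: master_down. [Master, Running] -> [Down]
2022-01-06 19:08:12   notice : Server changed state: mariadb-1.mariadb[mariadb-1.mariadb:3306]: lost_slave. [Slave, Running] -> [Running]
2022-01-06 19:08:37   notice : [mariadbmon] Setting 'mariadb-1.mariadb' as master.
2022-01-06 19:08:37   notice : Server changed state: mariadb-1.mariadb[mariadb-1.mariadb:3306]: new_master. [Running] -> [Master, Running]
"""

# Pattern fitted to the state-change lines shown in this post.
PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+notice\s*: "
    r"Server changed state: (?P<server>[^\[]+)\[[^\]]*\]: "
    r"(?P<event>\w+)\. \[(?P<old>[^\]]*)\] -> \[(?P<new>[^\]]*)\]$"
)


def parse_transitions(log_text):
    """Return (timestamp, server, event, old_state, new_state) tuples."""
    transitions = []
    for line in log_text.splitlines():
        m = PATTERN.match(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
            transitions.append((ts, m["server"], m["event"], m["old"], m["new"]))
    return transitions


transitions = parse_transitions(LOG)
down = next(t for t in transitions if t[2] == "master_down")
promoted = next(t for t in transitions if t[2] == "new_master")
elapsed = (promoted[0] - down[0]).total_seconds()

for ts, server, event, old, new in transitions:
    print(ts.time(), server, event, f"[{old}] -> [{new}]")
print("seconds from master_down to new_master:", elapsed)
```

The 25-second gap lines up with the "down over 5 (failcount) monitor updates" warning at a monitor_interval of 5000.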

Is there a configuration setting we're missing or misunderstanding? With auto_failover=0, I would not expect MaxScale to select a new master on its own.

Thanks!

Darcy