MySQL Galera pacemaker does not promote to master when master node fails


Oscar Segarra

Oct 17, 2016, 5:16:22 PM
to codership
Hi,

I have configured a Galera cluster with the following configuration:

[root@vdiccs01 ~]# pcs status
Cluster name: vdic-storage-cluster
Last updated: Mon Oct 17 23:00:58 2016          Last change: Mon Oct 17 23:00:41 2016 by hacluster via crmd on vdiccs01
Stack: corosync
Current DC: vdiccs01 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
2 nodes and 10 resources configured
Online: [ vdiccs01 vdiccs02 ]
Full list of resources:
 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ vdiccs01 vdiccs02 ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ vdiccs01 vdiccs02 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ vdiccs01 vdiccs02 ]
 Clone Set: vdic-nfs-cluster-clone [vdic-nfs-cluster]
     Started: [ vdiccs01 vdiccs02 ]
 Master/Slave Set: vdic-galera-cluster-master [vdic-galera-cluster]
     Masters: [ vdiccs02 ]
     Slaves: [ vdiccs01 ]
PCSD Status:
  vdiccs01: Online
  vdiccs02: Online
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@vdiccs01 ~]#

Now, if I power off vdiccs01 (the Slave) and start it again, I notice that mysqld is not started automatically on node vdiccs01:

Failed Actions:
* vdic-galera-cluster_promote_0 on vdiccs02 'unknown error' (1): call=59, status=complete, exitreason='Failure, Attempted to promote Master instance of vdic-galera-cluster before bootstrap node has been detected.',
    last-rc-change='Mon Oct 17 23:06:04 2016', queued=0ms, exec=72ms
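
(For reference, once the underlying problem is fixed, a failed action like the one above can be cleared so that pacemaker retries the resource, typically with:

pcs resource cleanup vdic-galera-cluster

where the resource name is the one shown in the Failed Actions entry.)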


I have done some tests, and with this configuration:

Master/Slave Set: vdic-galera-cluster-master [vdic-galera-cluster]
    Masters: [ vdiccs02 ]
    Slaves: [ vdiccs01 ]

I get the same error whether I stop node vdiccs01 or vdiccs02.

Any help will be welcome!

Thanks a lot.

Jervin R

Oct 17, 2016, 7:50:07 PM
to codership
Which resource agent are you using? It sounds as if you are expecting MySQL to "start" during a failover while the second node is not yet running. If that is the case, it will definitely not start, since Galera requires either all nodes to be up and part of the cluster, or the cluster to be bootstrapped.
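
(For context, outside of pacemaker a Galera cluster is bootstrapped by hand on exactly one node and the remaining nodes then join it; the exact commands depend on the MariaDB/Galera version and init system, but they look roughly like:

# bootstrap the first node
service mysql start --wsrep-new-cluster
# on MariaDB 10.1+ with systemd, the helper is
galera_new_cluster
# the other nodes then start normally and join
service mysql start

When the galera resource agent manages the cluster, it is the agent that performs this bootstrap step.)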

Oscar Segarra

Oct 18, 2016, 2:46:41 AM
to Jervin R, codership

Hi Jervin,

I'm using the default one that comes with Galera: ocf:heartbeat:galera.

https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/galera.

I'd like to clarify that at the beginning all mysql daemons are running. After a reboot of a node, the promotion error I sent appears.

Thanks a lot

Jervin R

Oct 18, 2016, 3:31:19 AM
to codership
Oscar,

OK, somehow I missed the part where you only have 2 nodes in this cluster. If you power off the machine and cause MySQL to shut down uncleanly, you are likely losing quorum. Check the surviving node's logs to confirm this. Add a 3rd node to this cluster and try the power-off test again. :)
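
(With pcs, joining a third node to an existing corosync/pacemaker cluster generally looks something like the following, run from one of the current members; the hacluster password placeholder is whatever was chosen at setup time:

pcs cluster auth vdiccs03 -u hacluster -p <password>
pcs cluster node add vdiccs03
pcs cluster start vdiccs03

The new node also needs the same galera/mariadb packages and my.cnf settings as the existing ones before pacemaker can start the resource there.)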

Jervin R

Oct 18, 2016, 5:16:07 AM
to codership
Oscar,

This setting only affects corosync quorum - not PXC which has its own quorum resolution.
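
(For reference, the pacemaker property quoted below is normally set with:

pcs property set no-quorum-policy=ignore

As Jervin notes, it only changes how pacemaker reacts to losing corosync quorum; Galera keeps its own quorum logic regardless.)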

On Tue, Oct 18, 2016 at 4:37 PM, Oscar Segarra wrote:
Hi Jervin,

Yes, but I have set this parameter:

no-quorum-policy=ignore

in order to make the cluster work properly with two nodes.

To be exact, what I do is shut down the physical interface of the node, because I want to simulate a loss-of-connectivity scenario.

Thanks a lot, and sorry for the lack of detail in my explanations.

James Wang

Oct 18, 2016, 11:29:04 AM
to codership
Is there a load balancer in your architecture, please?

Can't the load balancer do the auto-failover once it detects vdiccs02 is down?

Oscar Segarra

Oct 18, 2016, 6:35:27 PM
to codership
Hi, 

I have added a new node vdiccs03:

3 nodes and 15 resources configured

Online: [ vdiccs01 vdiccs02 vdiccs03 ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ vdiccs01 vdiccs02 vdiccs03 ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ vdiccs01 vdiccs02 vdiccs03 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ vdiccs01 vdiccs02 vdiccs03 ]
 Clone Set: vdic-nfs-cluster-clone [vdic-nfs-cluster]
     Started: [ vdiccs01 vdiccs02 vdiccs03 ]
 Master/Slave Set: vdic-galera-cluster-master [vdic-galera-cluster]
     Masters: [ vdiccs01 ]
     Slaves: [ vdiccs02 vdiccs03 ]

Failed Actions:
* nfs-grace_monitor_0 on vdiccs03 'unknown error' (1): call=16, status=complete, exitreason='none',
    last-rc-change='Tue Oct 18 23:41:37 2016', queued=0ms, exec=112ms


PCSD Status:
  vdiccs01: Online
  vdiccs02: Online
  vdiccs03: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

If I do a graceful power-off of node vdiccs01 (the Master):

3 nodes and 15 resources configured

Online: [ vdiccs02 vdiccs03 ]
OFFLINE: [ vdiccs01 ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ vdiccs02 vdiccs03 ]
     Stopped: [ vdiccs01 ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ vdiccs02 vdiccs03 ]
     Stopped: [ vdiccs01 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ vdiccs02 vdiccs03 ]
     Stopped: [ vdiccs01 ]
 Clone Set: vdic-nfs-cluster-clone [vdic-nfs-cluster]
     Started: [ vdiccs02 vdiccs03 ]
     Stopped: [ vdiccs01 ]
 Master/Slave Set: vdic-galera-cluster-master [vdic-galera-cluster]
     Slaves: [ vdiccs03 ]
     Stopped: [ vdiccs01 vdiccs02 ]

Failed Actions:
* vdic-galera-cluster_demote_0 on vdiccs02 'unknown error' (1): call=44, status=Timed Out, exitreason='Failure, Attempted to promote Master instance of vdic-galera-cluster before bootstrap node has been detected.',
    last-rc-change='Wed Oct 19 00:22:59 2016', queued=0ms, exec=120011ms

PCSD Status:
  vdiccs01: Offline
  vdiccs02: Online
  vdiccs03: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

It looks like the system is not able to promote the next node to Master.

Some questions:
1.- Should the mariadb service be enabled or disabled on all nodes?
2.- Should the mysql service be enabled or disabled on all nodes?

Thanks a lot.


Damien Ciabrini

Oct 19, 2016, 5:13:11 AM
to Oscar Segarra, codership
Hi Oscar,

What does crm_admin -1A show? I'm looking for the galera-* attributes.

Also, I co-maintain the galera resource agent upstream [1]. It may be worth
opening a bug there and providing sosreports so that I can have a look at
the sequence that led to the bootstrap failure.

[1] https://github.com/ClusterLabs/resource-agents/
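
(On CentOS 7, an sosreport can usually be generated with the sos package, roughly:

yum install -y sos
sosreport --batch

The resulting tarball, typically under /var/tmp, can then be attached to the bug report.)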

--
Damien

Oscar Segarra

Oct 20, 2016, 9:21:19 AM
to codership, oscar....@gmail.com
Hi Damien, 

On my CentOS 7 platform there is no crm_admin in /usr/sbin. I've found crmadmin:

[root@vdiccs01 sbin]# ls crm*
crm  crmadmin  crm_attribute  crm_diff  crm_error  crm_failcount  crm_master  crm_mon  crm_node  crm_report  crm_resource  crm_shadow  crm_simulate  crm_standby  crm_ticket  crm_verify

[root@vdiccs01 sbin]# crmadmin -1A
crmadmin: invalid option -- '1'
crmadmin - Development tool for performing some crmd-specific commands.
  Likely to be replaced by crm_node in the future
Usage: crmadmin command [options]
Options:
 -?, --help             This text
 -$, --version          Version information
 -q, --quiet            Display only the essential query information
 -V, --verbose          Increase debug output

Commands:
 -i, --debug_inc=value  Increase the crmd's debug level on the specified host
 -d, --debug_dec=value  Decrease the crmd's debug level on the specified host
 -S, --status=value     Display the status of the specified node.

        Result is the node's internal FSM state which can be useful for debugging

 -D, --dc_lookup        Display the uname of the node co-ordinating the cluster.

        This is an internal detail and is rarely useful to administrators except when deciding on which node to examine the logs.

 -N, --nodes            Display the uname of all member nodes
 -E, --election (Advanced) Start an election for the cluster co-ordinator
 -K, --kill=value       (Advanced) Shut down the crmd (not the rest of the clusterstack ) on the specified node

Additional Options:
 -t, --timeout=value    Time (in milliseconds) to wait before declaring the operation failed
 -B, --bash-export      Create Bash export entries of the form 'export uname=uuid'

Notes:
 The -i,-d,-K and -E commands are rarely used and may be removed in future versions.


Nevertheless, before formally opening a bug I'd like to be sure my configuration is correct.

On my system I've installed mariadb 1:5.5.50-1.el7_2 --> this creates two services, mariadb and mysql, which can be enabled or disabled so that they launch at boot or not:

systemctl enable mariadb
systemctl enable mysql 

In this configuration, which services have to be enabled or disabled so that the database is managed completely by the agent?

Regarding the galera attributes in /etc/my.cnf.d/server-vdic.conf (it is analogous for vdiccs01, vdiccs02 and vdiccs03):

[galera]
# Mandatory settings
wsrep_on=ON
wsrep_provider="/usr/lib64/galera/libgalera_smm.so"
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
binlog_format=row
wsrep_cluster_address="gcomm://"
bind-address=0.0.0.0
wsrep_node_name="vdiccs01-galera-node"
wsrep_cluster_name="vdic-galera-cluster"
innodb_doublewrite=1
query_cache_size=0
wsrep_node_address=192.168.200.101

Thanks a lot.

Damien Ciabrini

Oct 20, 2016, 9:33:33 AM
to Oscar Segarra, codership


----- Original Message -----
> Hi Damien,
>
> In my Centos 7 platform there is no crm_admin in /usr/sbin. I've found
> crmadmin:

Sorry, I meant crm_mon -1A :)

>
> [root@vdiccs01 sbin]# ls crm*
> crm crmadmin crm_attribute crm_diff crm_error crm_failcount
> crm_master crm_mon crm_node crm_report crm_resource crm_shadow
> crm_simulate crm_standby crm_ticket crm_verify
>
> [root@vdiccs01 sbin]# crmadmin -1A
> crmadmin: invalid option -- '1'

The mysql service is a symlink to mariadb, and both services need to be disabled
if you intend to manage galera via pacemaker.
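
(In practice that means something like the following on each node, so that only pacemaker ever starts the database:

systemctl disable mariadb
systemctl disable mysql
systemctl stop mariadb

Since mysql is only a symlink/alias for mariadb here, acting on one covers the other, but disabling both keeps the intent explicit.)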

Then you'd create a galera resource in pacemaker with something like:
pcs resource create galera galera enable_creation=true wsrep_cluster_address='gcomm://vdiccs01,vdiccs02,vdiccs03' meta master-max=3 ordered=true --master

The pacemaker resource agent will take care of bootstrapping the galera cluster automatically.
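
(Once the resource is running, the promotion state and the galera attributes the agent maintains can be checked on any node with, for example:

crm_mon -1A
pcs status

crm_mon -1A is what shows the *-last-committed and master-* node attributes referenced elsewhere in this thread.)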

> Relating to galera attributes in /etc/my.cnf.d/server-vdic.conf (it is
> analogous for vdiccs01, vdiccs02 and vdiccs03)
>
> [galera]
> # Mandatory settings
> wsrep_on=ON
> wsrep_provider="/usr/lib64/galera/libgalera_smm.so"
> default_storage_engine=InnoDB
> innodb_autoinc_lock_mode=2
> binlog_format=row
> wsrep_cluster_address="gcomm://"
I would remove the line above, because if for any reason you start the service via systemctl, that would bootstrap a new cluster.
> bind-address=0.0.0.0
> wsrep_node_name="vdiccs01-galera-node"
That line above is not necessary for bootstrapping the cluster.
> wsrep_cluster_name="vdic-galera-cluster"
> innodb_doublewrite=1
> query_cache_size=0
> wsrep_node_address=192.168.200.101
>
> Thanks a lot.

If you need additional help for setting up the galera cluster under pacemaker,
you can join #clusterlabs on freenode.

Oscar Segarra

Oct 25, 2016, 3:09:24 PM
to codership, oscar....@gmail.com
Hi Damien, 

Sorry for the delay in my response; I've been at the hospital with my wife.

I have set master-max=100 and ordered=true

I start up ONLY the first node, vdiccs01:

[root@vdiccs01 ~]# crm_mon -1A
Last updated: Tue Oct 25 20:44:39 2016          Last change: Tue Oct 25 20:40:31 2016 by root via crm_attribute on vdiccs02
Stack: corosync
Current DC: vdiccs01 (version 1.1.13-10.el7_2.4-44eb2dd) - partition WITHOUT quorum
3 nodes and 15 resources configured

Online: [ vdiccs01 ]
OFFLINE: [ vdiccs02 vdiccs03 ]

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ vdiccs01 ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ vdiccs01 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ vdiccs01 ]
 Clone Set: vdic-nfs-cluster-clone [vdic-nfs-cluster]
     Started: [ vdiccs01 ]
 Master/Slave Set: vdic-galera-cluster-master [vdic-galera-cluster]
     Slaves: [ vdiccs01 ]

Node Attributes:
* Node vdiccs01:
    + ganesha-active                    : 1
    + grace-active                      : 1
    + vdic-galera-cluster-last-committed        : 30844


Now, I start the second node, vdiccs02:

[root@vdiccs01 ~]# crm_mon -1A
Last updated: Tue Oct 25 20:50:23 2016          Last change: Tue Oct 25 20:48:39 2016 by root via crm_attribute on vdiccs02
Stack: corosync
Current DC: vdiccs01 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
3 nodes and 15 resources configured

Online: [ vdiccs01 vdiccs02 ]
OFFLINE: [ vdiccs03 ]

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ vdiccs01 vdiccs02 ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ vdiccs01 vdiccs02 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ vdiccs01 vdiccs02 ]
 Clone Set: vdic-nfs-cluster-clone [vdic-nfs-cluster]
     Started: [ vdiccs01 vdiccs02 ]
 Master/Slave Set: vdic-galera-cluster-master [vdic-galera-cluster]
     Slaves: [ vdiccs01 vdiccs02 ]

Node Attributes:
* Node vdiccs01:
    + ganesha-active                    : 1
    + grace-active                      : 1
    + vdic-galera-cluster-last-committed        : 30844
* Node vdiccs02:
    + ganesha-active                    : 1
    + grace-active                      : 1
    + vdic-galera-cluster-last-committed        : 30844

In this scenario, the database is expected to start on both nodes, isn't it?

In logs:

galera(vdic-galera-cluster)[13209]:     2016/10/25_20:52:04 INFO: Waiting on node <vdiccs03> to report database status before Master instances can start.
Oct 25 20:52:08 [1601] vdiccs01        cib:     info: cib_process_ping: Reporting our current digest to vdiccs01: 91c1a4fa3838bf90335cad337bb6b69e for 0.220.16 (0xf42f70 0)
Oct 25 20:52:12 [1601] vdiccs01        cib:     info: cib_process_request:      Completed cib_modify operation for section nodes: OK (rc=0, origin=vdiccs02/crm_attribute/4, version=0.220.16)
Oct 25 20:52:13 [1601] vdiccs01        cib:     info: cib_process_request:      Forwarding cib_modify operation for section nodes to master (origin=local/crm_attribute/4)
Oct 25 20:52:13 [1601] vdiccs01        cib:     info: cib_process_request:      Completed cib_modify operation for section nodes: OK (rc=0, origin=vdiccs01/crm_attribute/4, version=0.220.16)

It looks like all nodes must be available before the cluster starts the mysql daemons on the nodes.

[root@vdiccs01 ~]# crm_mon -1A
Last updated: Tue Oct 25 20:58:16 2016          Last change: Tue Oct 25 20:54:29 2016 by hacluster via cibadmin on vdiccs01
Stack: corosync
Current DC: vdiccs01 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
3 nodes and 15 resources configured

Online: [ vdiccs01 vdiccs02 vdiccs03 ]

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ vdiccs01 vdiccs02 vdiccs03 ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ vdiccs01 vdiccs02 vdiccs03 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ vdiccs01 vdiccs02 vdiccs03 ]
 Clone Set: vdic-nfs-cluster-clone [vdic-nfs-cluster]
     Started: [ vdiccs01 vdiccs02 vdiccs03 ]
 Master/Slave Set: vdic-galera-cluster-master [vdic-galera-cluster]
     Masters: [ vdiccs01 vdiccs02 vdiccs03 ]

Node Attributes:
* Node vdiccs01:
    + ganesha-active                    : 1
    + grace-active                      : 1
    + master-vdic-galera-cluster        : 100
* Node vdiccs02:
    + ganesha-active                    : 1
    + grace-active                      : 1
    + master-vdic-galera-cluster        : 100
* Node vdiccs03:
    + ganesha-active                    : 1
    + grace-active                      : 1
    + master-vdic-galera-cluster        : 100

My conclusion:

If I set master-max=1, the system is able to start the mysqld daemon on vdiccs01, leaving vdiccs02 and vdiccs03 as slaves. If I power off that node (vdiccs01), the remaining nodes vdiccs02 and vdiccs03 require node vdiccs01 to be started in order to get the most recent commit.

That is, the way the cluster seems to work, master-max=1 can never work; it would have to be set to at least master-max=2 so that the system can rely on a running mysql daemon that knows the last commit number.

My questions:

Is this correct?
Is there any way to configure the system with master-max=1?
Is there any way to force stickiness of the master to a particular node?

thanks a lot


Damien Ciabrini

Oct 26, 2016, 4:30:36 AM
to Oscar Segarra, codership


----- Original Message -----
> My conclusion:
>
> If I set master-max=1 system is able to start mysqld daemon in vdiccs01
> leaving vdiccs02 and vdiccs03 as slave. If I poweroff that node (vdiccs01)
> the remaining nodes vdiccs02 and vdiccs03 require that node vdiccs01 to be
> started in order to get the most recent commit.
>

master-max should be set to 3; it's just a hint to pacemaker for how many
galera servers it's allowed to spawn. In your case, you want 1 galera server
to be spawned per host, hence master-max=3.
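
(With pcs, the meta attribute on the existing master resource can usually be adjusted with something like the following; the exact syntax varies slightly between pcs versions:

pcs resource meta vdic-galera-cluster-master master-max=3

After that, pacemaker is allowed to promote a galera instance on each of the three nodes.)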

> That is, as looks cluster works, master-max=1 can never work, at least it
> must be set to master-max=2 in order that system relays on a running mysql
> daemon controlling the last commit number.
>
> My question:
>
> Is this correct?

Well, you have it right that in order to bootstrap the galera cluster with
pacemaker, the galera resource agent expects all nodes to be available, so it
can fetch the last seqno from each of them before determining which node to
bootstrap the cluster from.
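
(Those per-node values are the *-last-committed attributes shown earlier in the crm_mon -1A output; a quick way to compare them across nodes is, for example:

crm_mon -1A | grep -i last-committed

The agent records them via crm_attribute, so they can also be queried per node with crm_attribute --query, though the exact attribute lifetime depends on the agent version.)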

> Is there any way to configure system with master-max=1 ?

You don't want that to happen; it would mean only one galera node is allowed.

> Is there any way to force stickiness of master in any special node?

You don't want that either, because should you ever need to re-bootstrap, you want
to make sure you start the cluster from the most recent node.

What you may want, however, is a means to override the automatic boot sequence,
say in the case where you know that a node is out for maintenance and it is not
the most recent node. In that case you could follow this procedure to force
the bootstrap manually:

http://damien.ciabrini.name/posts/2015/10/galera-boot-process-in-open-stack-ha-and-manual-override.html

However, please make sure you understand how the pacemaker boot process works so
you don't risk losing data when forcing a bootstrap manually :)
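
(One common way to check which node has the most recent state before forcing a bootstrap is to look at the local Galera state file on each node; assuming the default datadir, for example:

cat /var/lib/mysql/grastate.dat

The file records the cluster uuid and the local seqno; the node with the highest seqno is the safe one to bootstrap from, and a seqno of -1 means the node stopped uncleanly and needs recovery first.)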


>
> thanks a lot