Galera cluster does not consider safe_to_bootstrap


Oscar Segarra

Jan 3, 2017, 4:38:49 PM
to codership
Hi, 

I have configured a MariaDB Galera cluster using the latest available version. My cluster has 3 nodes (vdicdb01-priv, vdicdb02-priv, vdicdb03-priv).

However, it looks like the resource agent code does not take the new safe_to_bootstrap parameter into account:

http://galeracluster.com/2016/11/introducing-the-safe-to-bootstrap-feature-in-galera-cluster/

Has anybody found a solution for this issue?

Thanks a lot.

Damien Ciabrini

Jan 6, 2017, 3:31:50 AM
to Oscar Segarra, codership
Hey Oscar,

Thanks for the link, I wasn't aware of that. I'll work on the resource
agent upstream to add support for that new flag.

In the short term, is that really an issue, i.e. does the cluster refuse
to bootstrap? I'm asking because the resource agent already
implements logic to ensure that the node it selects for bootstrap
is "safe to bootstrap" from, based on the last seqno retrieved from all
nodes.
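
For illustration only, the selection described above can be sketched roughly as follows. This is not the resource agent's actual code: pick_bootstrap_node is a hypothetical helper, and it assumes the last seqno of every node has already been collected as "<node> <seqno>" lines.

```shell
#!/bin/sh
# Hypothetical helper, not part of the galera resource agent.
# Reads "<node> <seqno>" lines on stdin and prints the node with the
# highest seqno. Nodes with an unknown position (-1 or empty) are ignored.
# Returns non-zero if no node has a usable seqno.
pick_bootstrap_node() {
    awk '
        # Skip nodes whose position is missing or unknown (-1).
        $2 == "" || $2 < 0 { next }
        # Keep the first node seen with the highest seqno.
        best == "" || $2 + 0 > max { max = $2 + 0; best = $1 }
        END { if (best != "") print best; else exit 1 }
    '
}
```

With made-up seqno values, `printf 'vdicdb01-priv 1520\nvdicdb02-priv 1534\nvdicdb03-priv -1\n' | pick_bootstrap_node` prints vdicdb02-priv. If every node reports -1, the function prints nothing and fails, which is the situation where a recovery step is needed first.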


----- Original Message -----
> Hi,
>
> I have configured a mariadb cluster using the latest available version. My
> cluster has 3 nodes (vdicdb01-priv, vdicdb02-priv, vdicdb03-priv).
>
> *But looks the resource agent code does not consider the new parameter: *
> safe_to_bootstrap
>
> http://galeracluster.com/2016/11/introducing-the-safe-to-bootstrap-feature-in-galera-cluster/
>
> Anybody has found a solution for this issue?
>
> Thanks a lot.

Oscar Segarra

Jan 8, 2017, 8:43:24 AM
to codership, oscar....@gmail.com
Hi Damien, 

Yes, the cluster refused to be bootstrapped with the following error:

2016-11-07 01:49:19 5572 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node.
It was not the last one to leave the cluster and may not contain all the updates.
To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .

As explained in the following note:

"In the case when all nodes crashed simultaneously, no node will be considered safe to bootstrap until the grastate.dat file is edited manually."
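
As a minimal sketch of the manual edit that the error message asks for, something like the following could be used. force_safe_to_bootstrap is a hypothetical helper name, and the sketch assumes grastate.dat holds a line of the form "safe_to_bootstrap: 0". It should only be run while mysqld is stopped, and only on the node believed to hold the most recent data.

```shell
#!/bin/sh
# Hypothetical helper: mark a grastate.dat file as safe to bootstrap from.
# $1: path to grastate.dat (for example "$datadir/grastate.dat").
# Keeps a copy of the previous file as <path>.bak.
force_safe_to_bootstrap() {
    grastate="$1"
    [ -f "$grastate" ] || return 1
    # Back up the current state before touching it.
    cp -p "$grastate" "$grastate.bak" || return 1
    # Rewrite only the safe_to_bootstrap line; uuid and seqno are left alone.
    sed 's/^safe_to_bootstrap:.*/safe_to_bootstrap: 1/' "$grastate.bak" > "$grastate"
}
```

For example: `force_safe_to_bootstrap /var/lib/mysql/grastate.dat`, where the path is an assumption and depends on the configured datadir.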

If you have to edit the resource agent, I'd suggest adding the following lines to detect_last_commit() in:

/usr/lib/ocf/resource.d/heartbeat/galera

detect_last_commit()
{
    local last_commit
    local recover_args="--defaults-file=$OCF_RESKEY_config \
                        --pid-file=$OCF_RESKEY_pid \
                        --socket=$OCF_RESKEY_socket \
                        --datadir=$OCF_RESKEY_datadir \
                        --user=$OCF_RESKEY_user"
    local recovered_position_regex='s/.*WSREP\:\s*[R|r]ecovered\s*position.*\:\(.*\)\s*$/\1/p'

    ocf_log info "attempting to detect last commit version by reading ${OCF_RESKEY_datadir}/grastate.dat"
    last_commit="$(cat ${OCF_RESKEY_datadir}/grastate.dat | sed -n 's/^seqno.\s*\(.*\)\s*$/\1/p')"
    if [ -z "$last_commit" ] || [ "$last_commit" = "-1" ]; then
        local tmp=$(mktemp)
        local tmperr=$(mktemp)

        # if we pass here because grastate.dat doesn't exist,
        # try not to bootstrap from this node if possible
        if [ ! -f ${OCF_RESKEY_datadir}/grastate.dat ]; then
            set_no_grastate
        fi

        ocf_log info "now attempting to detect last commit version using 'mysqld_safe --wsrep-recover'"

        ${OCF_RESKEY_binary} $recover_args --wsrep-recover > $tmp 2> $tmperr

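        # $tmp only holds mysqld_safe's "Logging to '<file>'." message,
        # so extract the log file path from it
        mysql_log_file=$(cat "$tmp")
        mysql_log_file=/${mysql_log_file#*/}
        mysql_log_file=${mysql_log_file%\'\.}

        # the recovered position is written to that log file
        last_commit="$(sed -n "$recovered_position_regex" "$mysql_log_file")"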


This is needed because $tmp does not contain the last commit sequence number; it only contains the name of the log file that mysqld_safe is actually writing to.
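
To make the two parsing steps easier to try outside the resource agent, here is a small standalone sketch. The function names are hypothetical, the sed expression is a simplified POSIX variant of the one in the agent, and it assumes the log contains a line ending in "Recovered position: <uuid>:<seqno>".

```shell
#!/bin/sh
# Hypothetical helpers that mirror the parsing steps above.

# $1: the text mysqld_safe printed on stdout, e.g.
#     "... mysqld_safe Logging to '/var/log/mariadb/mariadb.log'."
# Prints the log file path contained in that message.
log_file_from_output() {
    out="$1"
    path=/${out#*/}      # drop everything before the first "/"
    path=${path%\'.}     # drop the trailing "'."
    printf '%s\n' "$path"
}

# $1: path to the log file written during --wsrep-recover.
# Prints the seqno of the last "Recovered position" line, if any.
recovered_position_from_log() {
    sed -n 's/.*WSREP: [Rr]ecovered position.*:\([^:[:space:]]*\)[[:space:]]*$/\1/p' "$1" |
        tail -n 1
}
```

Chained together, `recovered_position_from_log "$(log_file_from_output "$(cat "$tmp")")"` would yield the value that ends up in last_commit, provided the recovery run logged a position.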

Thanks a lot.



