Thank you for your answer. I understand the importance of SST and why it
has to run when the nodes are inconsistent. That is how I use it (with the
xtrabackup method) on another production cluster with ~120GB of data.
From examining the SST scripts, it seems IST is handled there in the BYPASS
case. So a good idea would be to modify one of the scripts so that it works
for BYPASS and fails with an error on a full SST, and I can then handle the
data consistency manually from there.
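A rough sketch of that decision, written in Python rather than shell only to keep it short. Everything here is an assumption for illustration: the function name `decide`, the role names "donor" and "joiner", and the idea that the wrapper can read a bypass flag from its arguments all need to be checked against the scripts shipped with the installed version. It only shows the decision itself; in the "delegate" case a real wrapper would still have to hand over to the stock SST script for the actual signalling.

```python
# Hypothetical donor-side guard: allow the bypass (IST) case, refuse a full SST.
# The role names and the bypass flag are assumptions about how the stock SST
# scripts are invoked; verify them against the scripts shipped with your version.


def decide(role, bypass):
    """Return "delegate" to hand over to the stock script, or "refuse"."""
    if role == "joiner":
        # Assumption: the joiner side cannot tell in advance whether it will
        # get IST or a full SST, so it is left to the stock script.
        return "delegate"
    if role == "donor" and bypass:
        # Bypass case: no data copy is requested, so let the stock script run.
        return "delegate"
    # Donor asked for a full copy, or the role is unknown: fail loudly so the
    # operator can restore consistency by hand.
    return "refuse"
```

A wrapper built on this would exit with a non-zero status whenever `decide` returns "refuse", so the state transfer is reported as failed instead of silently skipped.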
Alexey, can you please confirm the above?
On Thursday, September 27, 2012 9:41:29 PM UTC+1, Henrik Ingo wrote:
> Ilias
> My point is, rather than leaving wsrep_sst_method=skip, you should
> set it to something else so that SST will *fail* and the node is not
> allowed to return to the cluster. With the skip method, the node will
> "succeed" in joining the cluster but will still not have the same
> data.
> As a quick and dirty solution, I would set wsrep_sst_method=rsync and
> then uninstall rsync from the servers, so then SST will fail if it is
> tried. A nicer solution of course is to create your own sst script (or
> ask Codership to) that will just return error immediately. (Heh, that
> would then be wsrep_sst_method=fail :-)
> Alex: You didn't answer the actual question: Will IST be used even
> when wsrep_sst_method=skip? (I assume yes, but I've been wrong
> before...)
> henrik
> On Thu, Sep 27, 2012 at 3:22 PM, Ilias Bertsimas <awar...@gmail.com>
> wrote:
> > Hello Henrik,
> > Yes, I know the purpose of the skip SST method is for setting up a
> > cluster manually.
> > The only reason I use it is that I do not want an SST to happen under
> > any circumstances. It happens whenever an IST is not possible, and in
> > my experience it can sometimes happen even when it is not really needed.
> > An SST is impractical on a 5TB dataset.
> > I have a big enough gcache size to cover at least 12 hours of data
> > changes.
> > Thanks!
> > On Thursday, September 27, 2012 1:11:04 PM UTC+1, Henrik Ingo wrote:
> >> On Thu, Sep 27, 2012 at 2:28 PM, Ilias Bertsimas <awar...@gmail.com>
> >> wrote:
> >> > I have a galera cluster with a huge amount of data where a full SST
> >> > would be pointless as it will take 3-4 days plus the amount of time
> >> > needed to apply the new writesets to catch up.
> >> > I have set cluster's wsrep_sst_method to skip but it is not clear if
> >> > it will skip IST as well.
> >> Actually, I don't think you are supposed to use the skip method as a
> >> permanent setting. If I understood correctly, Percona developed it to
> >> be used when initially starting the cluster. In this case you could
> >> manually restore the same data to all nodes, so you know they are in
> >> the same state before you start any nodes at all.
> >> OTOH, if you have a running cluster and some node is disconnected long
> >> enough to need an SST, then you can't leave wsrep_sst_method set to skip,
> >> since the node would then have inconsistent data.
> >> > Can someone confirm how it will react if it needs an IST?
> >> No. (I have my guess, but that's not what you want, so I'll leave it to
> >> the Codership guys to confirm.)
> >> But referring to what I said above, you should just make sure that
> >> your gcache.size is large enough that SST never needs to happen. And
> >> if a node is disconnected long enough that IST won't work, then you
> >> are back to square one.
> >> henrik
> >> --
> >> henri...@avoinelama.fi
> >> +358-40-8211286 skype: henrik.ingo irc: hingo
> >> www.openlife.cc
> >> My LinkedIn profile: http://www.linkedin.com/profile/view?id=9522559