Hi Simon,
What exactly went wrong there (and something apparently did go wrong)
should be visible in the MySQL error log and in sst.err in the datadir.
rsync SST requires taking a global read lock on the server, and that may
take some time.
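A minimal sketch for looking at both places in one go. The function
name is hypothetical, and the error log location differs between
distributions, so it is passed in as an argument rather than assumed:

```shell
#!/bin/bash
# show_sst_logs DATADIR ERROR_LOG [LINES]
# Print the tail of sst.err (in the datadir) and of the MySQL error log.
# Missing files are reported instead of being treated as fatal.
show_sst_logs() {
    local datadir="$1" errlog="$2" lines="${3:-50}"
    local f
    for f in "$datadir/sst.err" "$errlog"; do
        if [ -r "$f" ]; then
            echo "==> $f <=="
            tail -n "$lines" "$f"
        else
            echo "==> $f: not found or not readable <=="
        fi
    done
}
```

For example: show_sst_logs /var/lib/mysql /path/to/your/error.log 100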
> When I have a look at the
> script '/usr/bin/wsrep_sst_rsync', I can see the sst credentials are
> passed as argument into the variable AUTH but this variable is never
> used again. In the end, the rsync call happens as user 'mysql', which
> is actually not what I wanted.
The authentication field is ignored by the default rsync SST script,
i.e. the transfer goes unprotected. This was done for performance and
simplicity reasons: we needed a script that is generic enough. We felt
that setting up stunnel or passwordless SSH access would complicate a
matter which is complex enough already.
> So my question:
> - Is wsrep_sst_rsync intended to work with a user other than 'mysql'?
I think it is normal security practice that a child process cannot
elevate its privileges. Since the SST script is called by mysqld, it
will run as whatever user mysqld is running as.
> - If yes, is this a bug in the script so the credentials aren't used?
Nope, it is just how it was initially conceived. Note that from the
script interface POV the authentication credentials are just a string.
It can be anything, including SSH keys.
> - Is it recommended to exchange SSH keys?
Frankly, I'm not exactly sure how SSH keys would work with the 'mysql'
user, which normally does not have a home directory. And I guess you
will need to rewrite the script somewhat, because currently the joiner
starts rsync in server mode. You'll want to disable this.
Too bad rsync does not support SSL - we could have used the same keys
as for mysql client and Galera replication.
Regards,
Alex
> Thx
> Simon
Well, actually it was intentionally simplified so that it can be run
without customization. The fact that it didn't work means something is
wrong in the configuration or system setup, e.g. a firewall.
> It shouldn't be a problem for me to modify it to get it working.
> Actually, the rsync call would copy the whole /var/lib/mysql
> directory, which I think is too much. Are there any files more than
> the innodb files which have to be copied to the new node? E.g. the
> galera.cache or grastate.dat file?
I think you're mistaken: rsync should copy only the required files;
there is a filter for that in the script. In particular, only InnoDB
data and log files and schema directories should be copied.
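To illustrate that rule (this is NOT the filter from wsrep_sst_rsync,
just a plain-shell sketch of the same selection), the hypothetical
function below lists which top-level datadir entries would qualify. It
assumes the default InnoDB file names, ibdata* and ib_logfile*:

```shell
#!/bin/bash
# list_sst_candidates DATADIR
# Print the top-level datadir entries that match the rule described above:
#   - InnoDB data files   (assumed default name pattern: ibdata*)
#   - InnoDB log files    (assumed default name pattern: ib_logfile*)
#   - schema directories  (every subdirectory of the datadir)
# Everything else (e.g. galera.cache, grastate.dat) is skipped.
list_sst_candidates() {
    local datadir="$1" entry name
    for entry in "$datadir"/*; do
        [ -e "$entry" ] || continue      # empty datadir: glob did not match
        name="${entry##*/}"
        if [ -d "$entry" ]; then
            echo "$name/"
        else
            case "$name" in
                ibdata*|ib_logfile*) echo "$name" ;;
            esac
        fi
    done
}
```

Running list_sst_candidates /var/lib/mysql on the donor gives a quick
idea of what a transfer following this rule would pick up.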
> My idea is to request a lock, then creating a lvm snapshot, releasing
> the lock again and using the lvm snapshot to transfer the innodb data
> to the joining node.
>
> Any further input?
I'd strongly recommend using the provided rsync script as a base instead
of writing your own from scratch. In particular, it demonstrates how to
request a lock (echo "flush tables") and how to release it (echo
"continue").
Best regards,
Alex
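P.S. A rough sketch of the lock request/release mentioned above, wrapped
around an arbitrary command such as a snapshot. The function name is
hypothetical. It assumes the server reads the two strings from the
script's stdout, and it leaves out how the script learns that the flush
has finished; take that part from the provided script.

```shell
#!/bin/bash
# donor_with_lock COMMAND [ARGS...]
# Request the lock, run COMMAND while it is held, then release the lock.
# Only the two protocol strings are written to stdout; COMMAND's own
# output is redirected to stderr so that stdout stays clean.
donor_with_lock() {
    echo "flush tables"          # request the lock
    # ASSUMPTION: a real script must wait here until the server confirms
    # that the tables are flushed; that step is omitted in this sketch.
    local rc=0
    "$@" >&2 || rc=$?
    echo "continue"              # release the lock, even if COMMAND failed
    return "$rc"
}
```

Usage would be something like "donor_with_lock take_lvm_snapshot", where
take_lvm_snapshot is a placeholder for your own snapshot function.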
>
> Thx
> Simon
--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011
According to the logs, the donor believes that it has successfully
transferred the data, and I have little doubt about it.
The joiner, however, seems to have its log cut too short. Normally you
should see InnoDB recovering after that. One suspicion I have is that
the data was transferred to the wrong directory, but I don't really see
how that is possible. You should also have sst.err files on both nodes;
they may contain additional information about what happened.
One more thing to worry about is
120214 13:33:13 [Warning] WSREP: last inactive check more than PT1.5S
ago, skipping check
during state transfer. This usually means that the node was swapping so
hard that the Galera main loop could not get control for several
seconds. Be it swapping or just heavy reading, the node is pretty much
non-operational in this condition: it is very, very slow and probably
should be given some time to process events. Given that those messages
continue until the very end of the joiner log, did you give it enough
time to process the SST before sending the log?
In general, such messages are a sign of serious node misconfiguration,
and since the joiner has not yet initialized its storage engines, they
are probably caused by other programs running on the machine. The cause
should be found and eliminated; running the cluster with such a node is
pointless.
Regards,
Alex
> Thank you very much
> Simon
Hi Simon,
This is all very mysterious, and I must say that there is not enough
information in the logs to draw any conclusions. Could you confirm that
state transfer is really happening by removing _all_ contents of
/var/lib/mysql on the joiner and trying again? Is there anything
non-standard in the joiner configuration? Have you modified the
wsrep_sst_rsync script on the joiner or on the donor in any way? Can you
confirm that rsync is the same version on the joiner and on the donor
(so far this has been the sole cause of problems with rsync SST)?
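One way to compare the versions, as a sketch: the two helper functions
below are hypothetical and only parse the first line of the
"rsync --version" output, which has the form
"rsync  version 3.1.2  protocol version 31".

```shell
#!/bin/bash
# rsync_version_of TEXT
# Extract the version number from the first line of `rsync --version`
# output, e.g. "rsync  version 3.1.2  protocol version 31".
rsync_version_of() {
    printf '%s\n' "$1" | awk 'NR == 1 && $2 == "version" { print $3 }'
}

# same_rsync_version DONOR_TEXT JOINER_TEXT
# Succeed only if both outputs parse and report the same version.
same_rsync_version() {
    local a b
    a=$(rsync_version_of "$1")
    b=$(rsync_version_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Run "rsync --version" on each node, then pass the two outputs to
same_rsync_version; a non-zero exit status means they differ or could
not be parsed.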
Kind regards,
Alex
Can you see the /var/lib/mysql/rsync_sst_complete file on either of the
nodes?
Is the rsync process still running on the joiner after the donor has
completed its part?
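Both checks can be done with something like the sketch below. The
function name is hypothetical, and it assumes pgrep is installed; if it
is not, the second line will always say "not running".

```shell
#!/bin/bash
# sst_joiner_status [DATADIR]
# Report whether the rsync_sst_complete marker exists in the datadir and
# whether a process named exactly "rsync" is currently running.
sst_joiner_status() {
    local datadir="${1:-/var/lib/mysql}"
    if [ -e "$datadir/rsync_sst_complete" ]; then
        echo "marker: present"
    else
        echo "marker: absent"
    fi
    # pgrep -x matches the process name exactly; a missing pgrep also
    # ends up in the "not running" branch.
    if pgrep -x rsync >/dev/null 2>&1; then
        echo "rsync: running"
    else
        echo "rsync: not running"
    fi
}
```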
> I didn't touch the wsrep_sst_rsync script...
> Is there a way to debug this process? Setting wsrep_debug=1 didn't
> really give more information.
You could change
#!/bin/bash -ue
to
#!/bin/bash -uex
This should make the script trace its commands to sst.err, but there's
not much chance that it'll help.
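If you prefer not to edit the file by hand, the shebang change could be
scripted roughly like this. The function name is hypothetical; it keeps
the original as a .orig copy and refuses to touch a file whose first
line is not exactly "#!/bin/bash -ue".

```shell
#!/bin/bash
# enable_sst_trace SCRIPT
# Rewrite the first line of SCRIPT from "#!/bin/bash -ue" to
# "#!/bin/bash -uex", keeping the original as SCRIPT.orig.
# Fails without changing anything if the first line is not the
# expected shebang.
enable_sst_trace() {
    local script="$1" tmp
    [ "$(head -n 1 "$script")" = '#!/bin/bash -ue' ] || return 1
    cp -p "$script" "$script.orig" || return 1
    tmp=$(mktemp) || return 1
    sed '1s|^#!/bin/bash -ue$|#!/bin/bash -uex|' "$script.orig" > "$tmp" || return 1
    # Write into the existing file so its owner and mode are preserved.
    cat "$tmp" > "$script"
    rm -f "$tmp"
}
```

For example, as root: enable_sst_trace /usr/bin/wsrep_sst_rsync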
> Note: In the state where the joiner is synced with the group but
> nothing happens, I can't stop mysql properly, I always have to kill
> it...
Yes, that's the way it goes. It is very hard to shut down the server in
a "clean" way at that moment, so it is not implemented yet.
Regards,
Alex
> Thx
We are pleased to announce that Codership is partnering with FromDual
to offer consulting and support services for Galera Cluster for MySQL.
You now have not only a great product in Galera Cluster but also the
appropriate services for it!
If you are interested in support, training or consulting related to
Galera Cluster for MySQL or MySQL itself, do not hesitate to contact us.
Regards,
Oli
--
FromDual - Vendor independent and neutral MySQL consulting.
Oli Sennhauser CEO / Senior Consultant
Phone: +41 44 940 24 82 Mobile: +41 79 830 09 33
oli.sen...@fromdual.com http://www.fromdual.com
Skype: fromdual Twitter: fromdual