* I have repeated this sequence about 5 times now, with the same
result each time:
I'm trying to bring up a new slave2 (percona 5.1.39) using an
xtrabackup of slave1 (5.0.77) with innobackupex --copy-back. It copies
the data with no problems (it has already had --prepare run on it as
part of the nightly cron job). Mysql-percona 5.1.39 on slave2 starts
up with no problems. (At this point I have not run mysql_upgrade, but
running it seems to make no difference to the error I will see.) The
my.cnf sets the user and password for replication but not the
master_host so that when mysql starts, it doesn't try to start
replicating immediately. I do a change master where I set the
logfile, log position, and master host. I can start the slave
io_thread and it catches up fairly quickly, 30 seconds or so for a
total of 300 Megs worth of binary logs. I start the slave sql_thread
and it runs for about 10 seconds, then it stops with error 1062
(duplicate entry).
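Concretely, the sequence each time looks roughly like this (host name,
log file, and position are placeholders, not my real values):

```shell
# Restore the already-prepared xtrabackup into the datadir (the
# nightly cron job has already run --prepare on it), then fix
# ownership afterwards.
innobackupex --copy-back /backups/last-night
chown -R mysql:mysql /var/lib/mysql

# my.cnf carries the replication user and password but no master_host,
# so mysqld starts without attempting to replicate. Then:
mysql -e "CHANGE MASTER TO
            MASTER_HOST='master.example.com',
            MASTER_LOG_FILE='mysql-bin.000123',
            MASTER_LOG_POS=98765;
          START SLAVE IO_THREAD;"

# The io_thread catches up in ~30 seconds (about 300 MB of binlogs):
mysql -e "START SLAVE SQL_THREAD;"
# ...which runs for ~10 seconds, then stops with error 1062.
```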
It does this for last night's backed up data and for a snapshot of the
backed up data from the day before.
Any observations as to what might be happening? My next step is going
to be to put 5.0.77 on slave2 and see if it does the same thing. If
it does, then I have to start examining my backup processes to make
sure my data is actually recoverable and consistent.
I also plan on trying maatkit to start up a new slave and see how that
works for me (both in time and consistency). The database is only 5.2
GB (not counting the innodb logfiles and ibdata, which we don't use
since we use file_per_table), so it's not so terribly large that it
should just take forever for maatkit to bring it all over.
Background: This all started because last night I attempted to upgrade
a production server (slave1) from 5.0.77 to percona 5.1.39. I hit an
odd error, one that didn't make sense: it (percona 5.1.39) complained
it could not find the relay log files. It could find the index file
and knew to look for the correct file name, but it could not find it
(see below for the actual error). It was the middle of the night and
I worked at it for about 30 minutes before deciding to roll back. I
had run mysql_upgrade, so I had to restore the mysql database, but
other than that it was not difficult to roll back. What I'm
attempting to do at this point is get a second slave up and running so
that I can experiment with this upgrade path.
Actual error:
100202 10:33:01 [Warning] Neither --relay-log nor --relay-log-index
were used; so replication may break when this MySQL server acts as a
slave and has his hostname changed!! Please use
'--relay-log=ivdb52-relay-bin' to avoid this problem.
100202 10:20:03 [ERROR] Failed to open the relay log
'./mysqld-relay-bin.001030' (relay_log_pos 37495521)
100202 10:20:03 [ERROR] Could not find target log during relay log
initialization
I ignored the warning at the time, because the server was obviously
able to read an index file: it knew it wanted the file numbered 001030.
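(For anyone following along: the file and position in that error come
from relay-log.info in the datadir; its first two lines are the relay
log file and offset the SQL thread tries to reopen on startup. A
mocked-up copy with the values from my error, master file/pos made up:)

```shell
# relay-log.info stores, one per line: relay log file, relay log
# offset, master log file, master log offset. Mock one up with the
# values from the error above (the master entries here are invented):
cat > /tmp/relay-log.info <<'EOF'
./mysqld-relay-bin.001030
37495521
master-bin.000999
12345678
EOF

# On startup the SQL thread reads the first two lines and tries to
# reopen that relay log at that offset:
head -2 /tmp/relay-log.info
```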
Now that I'm looking at slave1, I see this:
CentOS54[root@ivdb52 mysql]# vdir /var/lib/mysql/ | grep relay
-rw-rw---- 1 mysql mysql 125 Feb 2 10:14 ivdb52-relay-bin.000001
-rw-rw---- 1 mysql mysql 125 Feb 2 10:14 ivdb52-relay-bin.000002
-rw-rw---- 1 mysql mysql 125 Feb 2 10:17 ivdb52-relay-bin.000003
-rw-rw---- 1 mysql mysql 125 Feb 2 10:18 ivdb52-relay-bin.000004
-rw-rw---- 1 mysql mysql 125 Feb 2 10:20 ivdb52-relay-bin.000005
-rw-rw---- 1 mysql mysql 125 Feb 2 10:24 ivdb52-relay-bin.000006
-rw-rw---- 1 mysql mysql 125 Feb 2 10:33 ivdb52-relay-bin.000007
-rw-rw---- 1 mysql mysql 125 Feb 2 10:34 ivdb52-relay-bin.000008
-rw-rw---- 1 mysql mysql 208 Feb 2 10:34 ivdb52-relay-bin.index
-rw-rw---- 1 mysql mysql 36029463 Feb 3 00:00 mysqld-relay-bin.001041
-rw-rw---- 1 mysql mysql 26 Feb 2 21:00 mysqld-relay-bin.index
-rw-rw---- 1 mysql mysql 66 Feb 3 00:00 relay-log.info
All times are UTC. So the 5.1 binary started new relay logs under the
name ivdb52-relay-bin.* (exactly the name the warning mentioned; the
default relay log basename is derived from the hostname), while
relay-log.info still pointed at the old mysqld-relay-bin.001030, which
is why it could not find the target log. I just was not quite awake
enough to notice that. Setting the relay_log variable explicitly would
probably have solved my problem.
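Something like this in my.cnf would presumably have pinned the names
the old binary was using (untested on my end; names taken from the
error and the directory listing above):

```ini
[mysqld]
# Keep the relay log basename the 5.0 server was using, so a new
# binary keeps reading the files listed in relay-log.info instead of
# starting a fresh hostname-based series.
relay-log       = mysqld-relay-bin
relay-log-index = mysqld-relay-bin.index
```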
--
Regards... Todd
I seek the truth...it is only persistence in self-delusion and
ignorance that does harm. -- Marcus Aurelius
Followup: I downgraded mysql to 5.0.77, did the innobackupex
--copy-back, started up replication, and it's catching up just fine.
So it appears that I have some kind of replication issue from my
5.0.77 master to my test percona 5.1.39 system. I'm not sure what that
issue is; I didn't see anything in the upgrade notes, but I just
skimmed them and could have overlooked some big ugly warning. :-/