Problem with restore from Percona MySQL plugin

463 views
Skip to first unread message

Kjetil Torgrim Homme

unread,
Feb 8, 2017, 2:29:24 PM2/8/17
to bareos...@googlegroups.com
hello, I'm trying to use the Percona plugin. backup seems to work
(although I am not certain), on this test server the backup jobs says:

FD Bytes Written: 6,513,699 (6.513 MB)

mysqldump without compression is ~9 MB, the backup contains just one
file, /_percona/xbstream.0001457954

so, trying out restore:

08-Feb 19:28 dump-sd JobId 1457958: Forward spacing Volume "incr/Incr-6247" to file:block 1:970713522.
08-Feb 19:28 ece026-snapshot-fd JobId 1457958: Error: restore.c:1239 Write error on /tmp/bacula-restores/_percona/xbstream.0001457954: Child exited with code -268435457
08-Feb 19:28 ece026-snapshot-fd JobId 1457958: Error: Restore command returned non-zero value: 1, message:

(if you wonder about the Incr name: I did the full backup to the
Incremental-pool for convenience - didn't want to have to wait for
tape.)

the directory in the error message is left empty.

I tried to grab the error message from xbstream, but was unable to patch
the python plugin to log it. however strace delivers the goods, sort
of. the xbstream process reads and reads and reads, and then:

write(2, "xb_stream_read_chunk(): invalid checksum at offset 0xa00025: expected 0xf96cf5a6, read 0xc7ae89da.\n", 99) = 99

it doesn't look like the trace level 200 file from FD has much
interesting to say, but see attached file.

at this point I would like to see if the problem is in the backup or in
the restore. can I extract the stream to a plain file on the client and
run xbstream on it manually? other hints on how to debug the problem?

here are my versions:

* ece026-snapshot-fd Version: 15.2.2 (16 November 2015)
x86_64-redhat-linux-gnu redhat CentOS Linux release 7.0.1406 (Core)

* percona-plugin from Git master today

* Python 2.7.5

* mysql --version
mysql Ver 14.14 Distrib 5.6.30-76.3, for Linux (x86_64) using 6.2

* rpm -qf /usr/sbin/mysqld
Percona-XtraDB-Cluster-server-56-5.6.30-25.16.1.el6.x86_64

* rpm -qf /bin/xbstream
percona-xtrabackup-2.3.5-1.el6.x86_64

Bareos server:

* dump-dir Version: 16.3.1 (01 July 2016) x86_64-unknown-linux-gnu
ubuntu Ubuntu 14.04.5 LTS (actually from Git, tag Release/16.2.4)

--
Kjetil T. Homme
Redpill Linpro AS - Changing the game
ece026-snapshot-fd.trace.gz

Maik Aussendorf

unread,
Feb 9, 2017, 5:25:56 AM2/9/17
to bareos...@googlegroups.com
Hello,


Am 08.02.2017 um 20:29 schrieb Kjetil Torgrim Homme:
> hello, I'm trying to use the Percona plugin. backup seems to work
> (although I am not certain), on this test server the backup jobs says:
>
> FD Bytes Written: 6,513,699 (6.513 MB)
>
> mysqldump without compression is ~9 MB, the backup contains just one
> file, /_percona/xbstream.0001457954
That's correct. It is the output from xtrabackup command in xbstream format.

>
> so, trying out restore:
>
> 08-Feb 19:28 dump-sd JobId 1457958: Forward spacing Volume "incr/Incr-6247" to file:block 1:970713522.
> 08-Feb 19:28 ece026-snapshot-fd JobId 1457958: Error: restore.c:1239 Write error on /tmp/bacula-restores/_percona/xbstream.0001457954: Child exited with code -268435457
> 08-Feb 19:28 ece026-snapshot-fd JobId 1457958: Error: Restore command returned non-zero value: 1, message:
Here after 'message: ' the stdError output of the xbstream is expected.

> (if you wonder about the Incr name: I did the full backup to the
> Incremental-pool for convenience - didn't want to have to wait for
> tape.)
>
> the directory in the error message is left empty.
>
> I tried to grab the error message from xbstream, but was unable to patch
> the python plugin to log it. however strace delivers the goods, sort
You can try to use 'stdOut' instead of 'stdError' in line 347 (obviously
stdErr is empty in you case). I will print both values in future versions.

> of. the xbstream process reads and reads and reads, and then:
>
> write(2, "xb_stream_read_chunk(): invalid checksum at offset 0xa00025: expected 0xf96cf5a6, read 0xc7ae89da.\n", 99) = 99
>
> it doesn't look like the trace level 200 file from FD has much
> interesting to say, but see attached file.
>
> at this point I would like to see if the problem is in the backup or in
> the restore. can I extract the stream to a plain file on the client and
> run xbstream on it manually? other hints on how to debug the problem?

Yes, you can. E.g. restore to a client with plugins disabled, then you
will get the plain xbstream-file.

I suggest that you first check, that xtrabackup produces a valid stream
that can be successfully used by xbstream again. If that works you can
have a look about the stream going into Bareos and what comes back on
restore.


>
> here are my versions:
My development box is similar:
Centos 7, Python 2.7.5, 5.5.44-MariaDB, percona-xtrabackup-2.3.3 and
tested again with updated 2.3.6

Regards
Maik
>
> * ece026-snapshot-fd Version: 15.2.2 (16 November 2015)
> x86_64-redhat-linux-gnu redhat CentOS Linux release 7.0.1406 (Core)
>
> * percona-plugin from Git master today
>
> * Python 2.7.5
>
> * mysql --version
> mysql Ver 14.14 Distrib 5.6.30-76.3, for Linux (x86_64) using 6.2
>
> * rpm -qf /usr/sbin/mysqld
> Percona-XtraDB-Cluster-server-56-5.6.30-25.16.1.el6.x86_64
>
> * rpm -qf /bin/xbstream
> percona-xtrabackup-2.3.5-1.el6.x86_64
>
> Bareos server:
>
> * dump-dir Version: 16.3.1 (01 July 2016) x86_64-unknown-linux-gnu
> ubuntu Ubuntu 14.04.5 LTS (actually from Git, tag Release/16.2.4)
>

--
With kind regards // Mit freundlichen Grüßen
--
Maik Außendorf maik.au...@bareos.com
Bareos GmbH & Co. KG Phone: +49221630693-93
http://www.bareos.com Fax: +49221630693-10

Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
Komplementär: Bareos Verwaltungs-GmbH
Geschäftsführer: Stephan Dühr, M. Außendorf,
J. Steffens, P. Storz

Kjetil Torgrim Homme

unread,
Feb 9, 2017, 10:19:06 AM2/9/17
to bareos...@googlegroups.com
Maik Aussendorf <maik.au...@bareos.com> writes:

> Am 08.02.2017 um 20:29 schrieb Kjetil Torgrim Homme:
>> so, trying out restore:
>>
>> 08-Feb 19:28 dump-sd JobId 1457958: Forward spacing Volume
>> "incr/Incr-6247" to file:block 1:970713522.
>> 08-Feb 19:28 ece026-snapshot-fd JobId 1457958: Error:
>> restore.c:1239 Write error on
>> /tmp/bacula-restores/_percona/xbstream.0001457954: Child exited with
>> code -268435457
>> 08-Feb 19:28 ece026-snapshot-fd JobId 1457958: Error: Restore
>> command returned non-zero value: 1, message:
> Here after 'message: ' the stdError output of the xbstream is
> expected.

>> I tried to grab the error message from xbstream, but was unable to patch
>> the python plugin to log it. however strace delivers the goods, sort
> You can try to use 'stdOut' instead of 'stdError' in line 347 (obviously
> stdErr is empty in you case). I will print both values in future
> versions.

yes, I don't understand this. the error message from xbstream is
clearly being written to stderr, I don't understand why it doesn't work
(I changed the plugin code to pass stderr=PIPE to Popen.)

>> of. the xbstream process reads and reads and reads, and then:
>>
>> write(2, "xb_stream_read_chunk(): invalid checksum at offset
>> 0xa00025: expected 0xf96cf5a6, read 0xc7ae89da.\n", 99) = 99
>>
>> at this point I would like to see if the problem is in the backup or in
>> the restore. can I extract the stream to a plain file on the client and
>> run xbstream on it manually? other hints on how to debug the problem?
>
> Yes, you can. E.g. restore to a client with plugins disabled, then you
> will get the plain xbstream-file.

ah! ok, so I did this. the resulting file has 8 NUL bytes at
the beginning which I chopped off, and I get the same error on stderr
when running manually:

$ xbstream -x -C extr < /tmp/offs
xb_stream_read_chunk(): invalid checksum at offset 0xa00025: expected 0xdfbd8ba6, read 0x8866f355.

(the checksums are different from the previous mail since I made a
new full backup, but the offset is the same.)

running xtrabackup manually gives a file which xbstream accepts. it is
just 3 bytes smaller than /tmp/offs.

the size of the file is 121330646, 115 MiB - the error occurs just after
the 10 MiB mark. at the 10 MiB mark there are 41 NUL bytes before there
is a new header - this looks the same in the working dump taken
manually.

^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@XBSTCK01^@P^G^@^@^@ibdata1[...]

in other words, at offset 0xa00025 there are 4 NUL bytes, then there is
the new header.

the md5sum's do not match up:

from Bareos database:
1b8aee24ef2856fe0dbfe42214f0c383 /_percona/xbstream.0001459278

restored without plugin:
f776b9287ab4cf9e349da41b7aa462b4 /tmp/xbstream.0001459278

above file with first 8 NUL bytes removed:
d4f5949c8cc8a2914909b918c0e88bdb /tmp/offs

> I suggest that you first check, that xtrabackup produces a valid stream
> that can be successfully used by xbstream again. If that works you can
> have a look about the stream going into Bareos and what comes back on
> restore.

thank you for your hints! I'm trying to dig further...

Kjetil Torgrim Homme

unread,
Feb 9, 2017, 12:00:07 PM2/9/17
to bareos...@googlegroups.com
Kjetil Torgrim Homme <kjetil...@redpill-linpro.com> writes:

> thank you for your hints! I'm trying to dig further...

ok. I changed xtrabackup to be a wrapper script which ran tee to save a
copy of stdout to a file. that copy is valid for restore.

I have done two restores of the same job to a client without the plugin,
the md5sum is the same. IOW, the corruption doesn't happen (randomly)
during restore.

the tee'd off file and the restored file differ in many places. the
first difference is at byte 65527.

OFFSET ORIGINAL DUMP RESTORED DUMP
0xFFF6 00 00 -> ff f8
0x1FFED 00 00 00 -> 01 ff f0
0x2FFE0 29 24 db 23 b3 22 85 21 -> 00 00 00 00 00 02 ff e8
0x3FFDD 00 00 00 -> 03 ff e0
... and so on for every 64 KiB.

one thing I should mention is that I have enabled the autoxflate-sd
plugin. I wonder if that can corrupt data (it definitely is in the
position to do so...) unfortunately I rarely have a window where no
jobs are running so that I can restart the SD. I'll keep you posted
when I get the chance to try it out.

Bruno Friedmann

unread,
Feb 9, 2017, 4:10:06 PM2/9/17
to bareos...@googlegroups.com
Could be also related to this report
http://bugs.bareos.org/view.php?id=694

Seems your affected by the same problem.

--

Bruno Friedmann
Ioda-Net Sàrl www.ioda-net.ch
Bareos Partner, openSUSE Member, fsfe fellowship
GPG KEY : D5C9B751C4653227
irc: tigerfoot

openSUSE Tumbleweed
Linux 4.9.8-1-default x86_64 GNU/Linux, nvidia: 378.09
Qt: 5.7.1, KDE Frameworks: 5.30.0, Plasma: 5.9.0, kmail2 5.4.1

Maik Aussendorf

unread,
Feb 13, 2017, 5:06:06 AM2/13/17
to bareos...@googlegroups.com
Hello,,

you are both right: I could reproduce the behavior with autoxflate +
sparse = yes in the FileSet options just as in

http://bugs.bareos.org/view.php?id=694

Kjetil, please check, if you have Sparse=yes, turn it off and test it again.

Regards
Maik

Stephan Duehr

unread,
Feb 13, 2017, 1:12:34 PM2/13/17
to bareos...@googlegroups.com
Hello,

I've run some tests. The combination of sparse=yes and autoxflate is generally broken
even with a small non-sparse testfile, I've added a comment at
http://bugs.bareos.org/view.php?id=694

So logically, it's also implicitly broken when adding Python Plugins.

Without sparse=yes, the combination of autoxflate and Python Plugins works correctly.

However, using sparse=yes with Python Plugins but without autoxflate also has an issue,
but that only hits with blocks of zeros larger than 64K in the data stream.

Conclusion: These are two unrelated issues. Don't use sparse=yes together with autoxflate
or Python Plugins.

If it's possible to access the sparse setting from the Fileset information within a Python Plugin,
and the plugin can't handle sparse properly, it should fail the job with an appropriate
error.

Regards,

Stephan
Stephan Dühr stepha...@bareos.com
Bareos GmbH & Co. KG Phone: +49 221-630693-90
http://www.bareos.com

Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
Komplementär: Bareos Verwaltungs-GmbH
Geschäftsführer: S. Dühr, M. Außendorf, J. Steffens, Philipp Storz

Kjetil Torgrim Homme

unread,
Feb 14, 2017, 9:21:49 AM2/14/17
to bareos...@googlegroups.com
Maik Aussendorf <maik.au...@bareos.com> writes:

> you are both right: I could reproduce the behavior with autoxflate +
> sparse = yes in the FileSet options just as in
>
> http://bugs.bareos.org/view.php?id=694
>
> Kjetil, please check, if you have Sparse=yes, turn it off and test it
> again.

I can confirm, like Stephan, that Sparse=no solves the problem. thanks!

incidentally, Compression=GZIP or not (on the client) does not influence
the outcome.'
Reply all
Reply to author
Forward
0 new messages