Error when writing Virtual Full to tape


Andreas Haase

Jan 2, 2023, 2:06:00 PM
to bareos-users
Hello,

We are using Bareos 22.0.1 on Rocky 9. Our strategy is to back up the clients to disk and write virtual full backups to tape once a week. The first part (writing to disk) works well, but when trying to write to tape, some jobs fail with an error like this:

02-Jan 19:56 backup-dir JobId 130937: Start Virtual Backup JobId 130937, Job=buildsrv01-tape-job.2023-01-02_19.56.55_12

02-Jan 19:56 backup-dir JobId 130937: Warning: This Job is not an Accurate backup so is not equivalent to a Full backup.

02-Jan 19:56 backup-dir JobId 130937: Bootstrap records written to /var/lib/bareos/backup-dir.restore.1.bsr

02-Jan 19:56 backup-dir JobId 130937: Consolidating JobIds 130931 containing 206270 files

02-Jan 19:56 backup-dir JobId 130937: Connected Storage daemon at backup.DOMAIN:9103, encryption: TLS_CHACHA20_POLY1305_SHA256 TLSv1.3

02-Jan 19:56 backup-dir JobId 130937:  Encryption: TLS_CHACHA20_POLY1305_SHA256 TLSv1.3

02-Jan 19:56 backup-dir JobId 130937: Using Device "C4U_CHE_File_VTL1" to read.

02-Jan 19:56 backup-dir JobId 130937: Using Device "C4U_CHE_Tape_LTO7" to write.

02-Jan 19:56 backup-sd JobId 130937: Volume "BCS531L7" previously written, moving to end of data.

02-Jan 19:58 backup-sd JobId 130937: Ready to append to end of Volume "BCS531L7" at file=171.

02-Jan 19:58 backup-sd JobId 130937: Ready to read from volume "C4U-CHE-VTL-0316" on device "C4U_CHE_File_VTL1" (/mnt/bareos/storage/VTL).

02-Jan 19:58 backup-sd JobId 130937: Forward spacing Volume "C4U-CHE-VTL-0316" to file:block 1:223.

02-Jan 19:58 backup-sd JobId 130937: Error: stored/block.cc:290 Volume data error at 1:223! Wanted ID: "BB02", got "<A1>>^Z<90>". Buffer discarded.

02-Jan 19:58 backup-sd JobId 130937: Fatal error: stored/mac.cc:670 Fatal append error on device "C4U_CHE_Tape_LTO7" (/dev/tape/by-id/scsi-350014380272c5f31-nst): ERR=

02-Jan 19:58 backup-sd JobId 130937: Elapsed time=00:00:01, Transfer rate=0  Bytes/second

02-Jan 19:58 backup-sd JobId 130937: Releasing device "C4U_CHE_Tape_LTO7" (/dev/tape/by-id/scsi-350014380272c5f31-nst).

02-Jan 19:58 backup-sd JobId 130937: Releasing device "C4U_CHE_File_VTL1" (/mnt/bareos/storage/VTL).

02-Jan 19:58 backup-dir JobId 130937: Error: lib/bsock_tcp.cc:412 Wrote -4 bytes to Storage daemon:backup.DOMAIN:9103, but only 0 accepted.


Does anyone have a hint on how to debug this?

Regards,
Andreas

Philipp Storz

Jan 2, 2023, 3:36:19 PM
to Andreas Haase, bareos-users
Hello Andreas,

On 02.01.23 at 20:05, Andreas Haase wrote:
> Hello,
>
> We are using Bareos 22.0.1 on Rocky 9. Our strategy is to back up the clients to disk and write
> virtual full backups to tape once a week. The first part (writing to disk) works well, but when
> trying to write to tape, some jobs fail with an error like this:

Probably you were still using version 22.0.0 when the jobs you are now consolidating were written. The problem that you describe is the reason why we just published 22.0.1 on GitHub and in our repositories.

Please update as soon as possible.

Unfortunately, the bug we had in 22.0.0 writes wrong entries to the jobmedia table under certain
circumstances, so that the "startfile" field is incorrectly incremented.

The good news is that the data on the volumes is correct, but the wrong "startfile" entry makes the
storage daemon look at the wrong address in the medium.

In your case, the problem is that it tries to read from file:block 1:223 instead of 0:223.
The startblock in the jobmedia entry is 223, and the startfile is 1, but it should be 0.

To fix your problem, you need to do something like:

update jobmedia set startfile=0 where jobmediaid='<jobmediaid of the job that is failing>';

To find the jobmediaid, use the query

`select * from jobmedia where jobid=130931;`

With this adaptation, the consolidation should work again.
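
For the record, the whole sequence for one failing consolidation looks roughly like this (just a sketch to be run against the catalog database, e.g. via psql or bconsole's "sqlquery" command; the jobmediaid 456789 is only a placeholder for whatever the select returns in your catalog):

select jobmediaid, startfile, startblock from jobmedia where jobid=130931;
-- suppose this returns jobmediaid 456789 with startfile=1 and startblock=223
update jobmedia set startfile=0 where jobmediaid='456789';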

Sorry for the inconvenience.

best regards,

Philipp

--
Kind regards

Philipp Storz philip...@bareos.com
Bareos GmbH & Co. KG Phone: +49 221 63 06 93-92
http://www.bareos.com Fax: +49 221 63 06 93-10

Registered office: Köln | Amtsgericht Köln: HRA 29646
Managing directors: Stephan Dühr, M. Außendorf,
J. Steffens, P. Storz

Andreas Haase

Jan 3, 2023, 2:33:44 AM
to Philipp Storz, bareos-users
Hello Philipp,


On 02.01.2023 at 21:36, Philipp Storz <philip...@bareos.com> wrote:

> To fix your problem, you need to do something like:
>
> update jobmedia set startfile=0 where jobmediaid='<jobmediaid of the job that is failing>';
>
> To find the jobmediaid, use the query
>
> `select * from jobmedia where jobid=130931;`
>
> With this adaptation, the consolidation should work again.

You are right. After this adaptation, the specific job works again. I'll do the same for the other failed jobs.
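
Something like the following should make checking the remaining ones quicker (just a sketch; the jobids listed are placeholders for the source jobs of my other failed consolidations):

select jobid, jobmediaid, startfile, startblock from jobmedia where jobid in (130931, 130932, 130933);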

To fully understand the problem: is the startfile value set wrongly when a tape or virtual tape is newly initialized? In my case, one third of the virtual full jobs failed, the others succeeded.

Regards,
Andreas

Philipp Storz

Jan 3, 2023, 8:47:03 AM
to Andreas Haase, bareos-users
On 03.01.23 at 08:33, Andreas Haase wrote:
> Hello Philipp,
>
>> On 02.01.2023 at 21:36, Philipp Storz <philip...@bareos.com> wrote:
>>
>> To fix your problem, you need to do something like:
>>
>> update jobmedia set startfile=0 where jobmediaid='<jobmediaid of the job that is failing>';
>>
>> To find the jobmediaid, use the query
>>
>> `select * from jobmedia where jobid=130931;`
>>
>> With this adaptation, the consolidation should work again.
>
> You are right. After this adaptation, the specific job works again. I'll do the same for the
> other failed jobs.

That's good :)
>
> To fully understand the problem: is the startfile value set wrongly when a tape or virtual tape
> is newly initialized? In my case, one third of the virtual full jobs failed, the others succeeded.
No, it is not tied to a medium being newly initialized. It can happen whenever the jobmedia entry is written,
which is the case whenever a job writes data to a volume.

Regards,

Philipp