Restore/Migrate of job distributed on two tapes fails with ERR=Cannot allocate memory

161 views
Skip to first unread message

Zu We

unread,
Aug 17, 2021, 10:54:44 AM8/17/21
to bareos-users
When I try to Restore/Migrate a specific job distributed on two tapes, the first tape is read successfully but when mounting the second tape the job fails with error

```
20-Jul 08:02 bareos-sd JobId 21833: Ready to read from volume "NIX417L6" on device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst).
20-Jul 08:02 bareos-sd JobId 21833: Forward spacing Volume "NIX417L6" to file:block 0:1.
20-Jul 08:02 bareos-sd JobId 21833: Error: stored/block.cc:1057 Read error on fd=7 at file:blk 0:1 on device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst). ERR=Cannot allocate memory.
```

This only happens on this specific tape. Other jobs distributed on multiple tapes work correctly.

I did configure the Bareos Tape device /etc/bareos/bareos-sd.d/device/tapedrive-0.conf with "Maximum Block Size = 1048576" from the very beginning and all tapes were written using this configuration.
Also all relevant pools have "Maximum Block Size = 1048576" set.
I started using Bareos after Version 14.2.0 so the new block size handling (label block 63k, data blocks 1M) was used on all tapes.

But the error suggests that there is a problem with the block size.
This is seconded by /var/log/messages
```
Jul 20 08:02:14 igms07 kernel: st 3:0:0:0: [st0] Failed to read 1048576 byte block with 64512 byte transfer.
```

So it seems that Bareos is trying to read with 63k block size although it is configured otherwise?


Further tests showed that I can read the tape label correctly with btape:

```
]$ btape /dev/tape/by-id/scsi-350050763121a063c-nst
Tape block granularity is 1024 bytes.
btape: stored/butil.cc:290-0 Using device: "/dev/tape/by-id/scsi-350050763121a063c-nst" for writing.
btape: stored/btape.cc:485-0 open device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst): OK
*status
 Bareos status: file=0 block=0
 Device status: BOT ONLINE IM_REP_EN file=0 block=0
Device status: TAPE BOT ONLINE IMMREPORT. ERR=
*readlabel
btape: stored/btape.cc:532-0 Volume label read correctly.

Volume Label:
Id : Bareos 2.0 immortal
VerNo : 20
VolName : NIX417L6
PrevVolName :
VolFile : 0
LabelType : VOL_LABEL
LabelSize : 167
PoolName : Scratch
MediaType : LTO
PoolType : Backup
HostName : igms07.vision.local
Date label written: 05-Nov-2018 11:42
```


I can read the tape with dd with a blocksize of 1M:
dd if=/dev/tape/by-id/scsi-350050763121a063c-nst bs=1M


btape scan fails with the same error as Bareos:

```
]$ btape /dev/tape/by-id/scsi-350050763121a063c-nst
Tape block granularity is 1024 bytes.
btape: stored/butil.cc:290-0 Using device: "/dev/tape/by-id/scsi-350050763121a063c-nst" for writing.
btape: stored/btape.cc:485-0 open device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst): OK
*rewind
btape: stored/btape.cc:581-0 Rewound "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst)
*scan
btape: stored/btape.cc:1901-0 Starting scan at file 0
btape: stored/btape.cc:1909-0 Bad status from read -1. ERR=stored/btape.cc:1907 read error on /dev/tape/by-id/scsi-350050763121a063c-nst. ERR=Cannot allocate memory.
```

but btape scanblocks works fine:

```
]$ btape /dev/tape/by-id/scsi-350050763121a063c-nst
Tape block granularity is 1024 bytes.
btape: stored/butil.cc:290-0 Using device: "/dev/tape/by-id/scsi-350050763121a063c-nst" for writing.
btape: stored/btape.cc:485-0 open device "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst): OK
*rewind
btape: stored/btape.cc:581-0 Rewound "tapedrive-0" (/dev/tape/by-id/scsi-350050763121a063c-nst)
*scanblocks
1 block of 203 bytes in file 0
2213 blocks of 1048576 bytes in file 0
1 block of 1048566 bytes in file 0
1072 blocks of 1048576 bytes in file 0
1 block of 1048568 bytes in file 0
```

 my device configuration:

```
Device {

    Name = "tapedrive-0"
    DeviceType = tape
    Maximum Concurrent Jobs = 1

    # default:0, only required if the autoloader have multiple drives.
    DriveIndex = 0

    ArchiveDevice = /dev/tape/by-id/scsi-350050763121a063c-nst # Quantum LTO-7 standalone drive

    MediaType = LTO

    AutoChanger = no # default: no
    AutomaticMount = yes # default: no

    MaximumFileSize = 40GB
    MaximumBlockSize = 1048576

    Maximum Spool Size = 40GB
    Spool Directory = /var/lib/bareos/spool
}
```

When I read other jobs that are on the same tape (second tape of the set of two), the restore/migration works just fine.

The question is, what is the issue here?
And how do I fix it?
Does anybody have an advice for me?

Zu We

unread,
Nov 22, 2021, 10:00:16 AM11/22/21
to bareos-users
Does nobody have an idea how to rescue the data from these two tapes?

The backup was created with Bareos version 17.2.4 - the log of the job didn't show any anomalies:


2018-11-12 22:58:41
bareos-dir JobId 5822: sql_create.c:872 Insert of attributes batch table done

2018-11-12 22:58:41
bareos-dir JobId 5822: Bareos bareos-dir 17.2.4 (21Sep17):
Build OS: x86_64-redhat-linux-gnu redhat CentOS Linux release 7.4.1708 (Core)
JobId: 5822
Job: archive_bilddaten-igms00-fd.2018-11-12_15.09.14_51
Backup Level: Full (upgraded from Incremental)
Client: "igms00-fd" 17.2.4 (21Sep17) Microsoft Windows Server 2012 Standard Edition (build 9200), 64-bit,Cross-compile,Win64
FileSet: "FileSetIGMS00_Bilddaten_archive" 2018-11-12 15:09:14
Pool: "Archiv_Bilddaten" (From Job resource)
Catalog: "MyCatalog" (From Client resource)
Storage: "Tape" (From Job resource)
Scheduled time: 12-Nov-2018 15:09:14
Start time: 12-Nov-2018 15:10:04
End time: 12-Nov-2018 22:58:41
Elapsed time: 7 hours 48 mins 37 secs
Priority: 10
FD Files Written: 510,557
SD Files Written: 510,557
FD Bytes Written: 4,503,733,299,940 (4.503 TB)
SD Bytes Written: 4,503,889,327,740 (4.503 TB)
Rate: 160178.3 KB/s
Software Compression: None
VSS: yes
Encryption: no
Accurate: yes
Volume name(s): NIX416L6|NIX417L6
Volume Session Id: 25
Volume Session Time: 1541740509
Last Volume Bytes: 635,273,904,128 (635.2 GB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK


2018-11-12 22:58:31
bareos-sd JobId 5822: Sending spooled attrs to the Director. Despooling 250,502,190 bytes ...

2018-11-12 22:58:25
bareos-sd JobId 5822: Elapsed time=07:43:15, Transfer rate=162.0 M Bytes/second

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "Task Scheduler Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "VSS Metadata Store Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "Performance Counters Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "System Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "SqlServerWriter", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "ASR Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "Shadow Copy Optimization Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "FRS Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "WMI Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "Registry Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "NPS VSS Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "IIS Config Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "IIS Metabase Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "Dhcp Jet Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "NTDS", State: 0x1 (VSS_WS_STABLE)

2018-11-12 22:58:21
igms00-fd JobId 5822: VSS Writer (BackupComplete): "COM+ REGDB Writer", State: 0x1 (VSS_WS_STABLE)

2018-11-12 21:56:28
bareos-sd JobId 5822: Wrote label to prelabeled Volume "NIX417L6" on device "tapedrive-0" (/dev/tape/by-id/scsi-35005076312128141-nst)

2018-11-12 21:56:28
bareos-sd JobId 5822: New volume "NIX417L6" mounted on device "tapedrive-0" (/dev/tape/by-id/scsi-35005076312128141-nst) at 12-Nov-2018 21:56.

2018-11-12 21:56:22
bareos-sd JobId 5822: 3305 Autochanger "load slot 12, drive 0", status is OK.

2018-11-12 21:55:27
bareos-sd JobId 5822: 3304 Issuing autochanger "load slot 12, drive 0" command.

2018-11-12 21:55:26
bareos-dir JobId 5822: Using Volume "NIX417L6" from 'Scratch' pool.

2018-11-12 21:52:34
bareos-sd JobId 5822: Re-read of last block succeeded.

2018-11-12 21:52:34
bareos-sd JobId 5822: End of medium on Volume "NIX416L6" Bytes=3,869,610,408,960 Blocks=3,690,348 at 12-Nov-2018 21:52.

2018-11-12 21:52:34
bareos-sd JobId 5822: 3307 Issuing autochanger "unload slot 11, drive 0" command.

2018-11-12 21:52:28
bareos-sd JobId 5822: End of Volume "NIX416L6" at 96:28332 on device "tapedrive-0" (/dev/tape/by-id/scsi-35005076312128141-nst). Write of 1048576 bytes got -1.

2018-11-12 15:11:19
igms00-fd JobId 5822: Generate VSS snapshots. Driver="Win64 VSS", Drive(s)="D"

2018-11-12 15:11:19
igms00-fd JobId 5822: VolumeMountpoints are not processed as onefs = yes.

2018-11-12 15:11:16
bareos-sd JobId 5822: Wrote label to prelabeled Volume "NIX416L6" on device "tapedrive-0" (/dev/tape/by-id/scsi-35005076312128141-nst)

2018-11-12 15:11:10
bareos-sd JobId 5822: 3305 Autochanger "load slot 11, drive 0", status is OK.

2018-11-12 15:10:06
bareos-sd JobId 5822: 3304 Issuing autochanger "load slot 11, drive 0" command.

2018-11-12 15:10:04
bareos-dir JobId 5822: Using Volume "NIX416L6" from 'Scratch' pool.

2018-11-12 15:10:04
bareos-dir JobId 5822: Using Device "tapedrive-0" to write.

2018-11-12 15:10:04
igms00-fd JobId 5822: Created 30 wildcard excludes from FilesNotToBackup Registry key

2018-11-12 15:09:16
bareos-dir JobId 5822: Start Backup JobId 5822, Job=archive_bilddaten-igms00-fd.2018-11-12_15.09.14_51

2018-11-12 15:09:16
bareos-sd JobId 5822: 3307 Issuing autochanger "unload slot 14, drive 0" command.

2018-11-12 15:09:14
bareos-dir JobId 5822: No prior Full backup Job record found.

2018-11-12 15:09:14
bareos-dir JobId 5822: No prior or suitable Full backup found in catalog. Doing FULL backup.



Zu We

unread,
Mar 1, 2024, 12:55:07 AM3/1/24
to bareos-users
For everybodys information - in Bareos 23.0.1 this bug is fixed and I was finally able to Restore/Migrate this job.
Reply all
Reply to author
Forward
0 new messages