Smartctl open device alert error after backup


John Bolt

Jan 28, 2015, 7:44:38 AM
to bareos...@googlegroups.com
Hello,

Could you tell me how to fix the following error, which is reported after the backup completes?


fedora18-sd Elapsed time=01:58:01, Transfer rate=68.49 M Bytes/second
fedora18-sd Alert: smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.11.10-100.fc18.x86_64] (local build)
Alert: Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
Alert:
Alert: Smartctl open device: /dev/st0 failed: Device or resource busy
3997 Bad alert command: sh -c 'smartctl -H -l error /dev/st0': ERR=Child exited with code 2.

Marco van Wieringen

Jan 28, 2015, 9:40:34 AM
to bareos...@googlegroups.com

First of all, you should not use /dev/st0 for normal backups, as that
is the rewinding tape device. Use /dev/nst0, the non-rewinding one,
instead.

Your actual problem is that Bareos keeps the device open, so you cannot
run smartctl on either /dev/st0 or /dev/nst0. If you really want to use
this alert reporting, you need to use the SCSI generic device.
"lsscsi -g" should list the generic device that can be used for the
specific tape drive. You can either hardcode the device name in the
alert command or set the new "Diagnostic Device" option and use "%D",
e.g. "smartctl -H -l error %D".

The Alert Command, however, will probably be replaced by the tapealert
plugin in the future, as that captures the real alerts and stores them
in the database, so you can see at any time the past alerts reported by
the drive. The idea is that such functionality will eventually be
included in the bareos-webui. We already have statistics collection and
this tapealert collection in 14.2. Because tapealert is a plugin, it can
use the same handle Bareos has open for the tape device and as such has
no need to use the generic SCSI device.
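
As a rough illustration of the lookup described above, the following sketch filters "lsscsi -g" output down to the SCSI generic node of each tape drive. The function name tape_generic_devices is hypothetical, and the sketch assumes the usual lsscsi layout in which the second column is the device type and the last column is the generic node.

```shell
#!/bin/sh
# Print the SCSI generic node (last column) of every tape drive found in
# `lsscsi -g` style output read from stdin.
# Assumption: column 2 holds the device type ("tape") and the last
# column holds the generic node, e.g. /dev/sg5.
tape_generic_devices() {
    awk '$2 == "tape" { print $NF }'
}

# Typical use on the storage host (needs the lsscsi package installed):
#   lsscsi -g | tape_generic_devices
```

With the node known (/dev/sg5 later in this thread), the Device resource would presumably carry Archive Device = /dev/nst0, Diagnostic Device = /dev/sg5 and Alert Command = "sh -c 'smartctl -H -l error %D'". Treat that as a sketch and check it against the manual for your Bareos version.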

--
Marco van Wieringen marco.van...@bareos.com
Bareos GmbH & Co. KG Phone: +49-221-63069389
http://www.bareos.com

Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
Komplementär: Bareos Verwaltungs-GmbH
Geschäftsführer: Stephan Dühr, M. Außendorf, J. Steffens,
P. Storz, M. v. Wieringen

John Bolt

Jan 28, 2015, 2:56:13 PM
to bareos...@googlegroups.com
Hello Marco,

Thanks a lot for the clear explanation. I really appreciate it.

When I checked for the tape device with lsscsi --generic, it listed /dev/st0 and /dev/sg5 but not /dev/nst0. That misled me into concluding that the "Example Storage Daemon Configuration File" in Chapter 10.7 of the manual should be changed to match my "real" device.

Maybe a comment should be added to the corresponding "Archive Device" lines in the manual to help others avoid making the same mistake.

Searching the web for more "rewinding or non-rewinding" explanations, I found the following: "usually you should use the non-rewinding device nodes to avoid having the drive rewind unexpectedly. When using Backup software like Bareos, you actually MUST use the non-rewinding device, otherwise you may get into trouble (the software does its own rewinds and doesn't expect the tape to be rewound without it having done it itself)."

I hope this feedback helps anyone who runs into similar problems, and that the manuals could include some information of this kind to improve the user experience with Bareos.

John Bolt

Jan 30, 2015, 11:33:19 AM
to bareos...@googlegroups.com, marco.van...@bareos.com
I followed your instructions to use the generic device, and smartctl did run, but Bareos still reported an error (see the log below).

Exit code 4 corresponds to bit 2 of the smartctl exit status.
man smartctl shows: "Bit 2: Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure (see '-b' option above)."
The man page also suggests it could be the "SMART Self-Test Log Structure" because

smartctl -a /dev/sg5 also shows:
Device does not support Self Test logging

Is there anything I can do to fix it?
Shouldn't Bareos refrain from reporting an error here, since it looks like LTO drives simply do not have self-test logging and that is not an error?

The Bareos log is:

Alert: smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.11.10-100.fc18.x86_64] (local build)
Alert: Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
Alert:
Alert: === START OF READ SMART DATA SECTION ===
Alert: TapeAlert: OK
Alert: Percentage used endurance indicator too short (pl=6)
Alert: Error counter log:
Alert:            Errors Corrected by           Total   Correction     Gigabytes    Total
Alert:                ECC          rereads/    errors   algorithm      processed    uncorrected
Alert:            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
Alert: read:          0        0         0         0          0          0.071           0
Alert: write:         0        0         0         0          0          0.000           0
Alert:
Alert: Non-medium error count: 0
Alert:
3997 Bad alert command: sh -c 'smartctl -H -l error /dev/sg5': ERR=Child exited with code 4.
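
For readers comparing the two failures in this thread (exit code 2 with /dev/st0, exit code 4 with /dev/sg5), here is a small sketch that splits a smartctl exit status into its individual bits. The function name decode_smartctl_status is hypothetical, and the bit descriptions are paraphrased from the smartctl man page, so check them against the man page of your installed version.

```shell
#!/bin/sh
# Decode a smartctl exit status (0-255) into the bits that are set.
# Bit meanings are paraphrased from smartctl(8); verify locally.
decode_smartctl_status() {
    status=$1
    bit=0
    while [ "$bit" -le 7 ]; do
        # Test whether this bit is set in the status value.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            case $bit in
                0) msg="command line did not parse" ;;
                1) msg="device open failed" ;;
                2) msg="a SMART or other ATA command failed, or checksum error" ;;
                3) msg="SMART status check returned DISK FAILING" ;;
                4) msg="prefail attributes at or below threshold" ;;
                5) msg="attributes were at or below threshold in the past" ;;
                6) msg="device error log contains records of errors" ;;
                7) msg="device self-test log contains records of errors" ;;
            esac
            echo "bit $bit: $msg"
        fi
        bit=$((bit + 1))
    done
}

# Example calls for the two codes seen in this thread:
#   decode_smartctl_status 2
#   decode_smartctl_status 4
```

Read this way, code 2 matches the "Device or resource busy" open failure on /dev/st0, while code 4 means the device was opened but one of the requested commands failed, which fits John's reading of the man page.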