btape random test ends in driver errors


Gerhard Sulzberger

Nov 4, 2013, 11:50:25 AM
to bareos...@googlegroups.com
Hi bareos community!

Thank you for your work on the Bacula fork. I am currently trying to migrate our backup system to Bareos,
but I have a problem with the new bareos-sd.

I use Ubuntu 12.04.3 LTS, bareos-storage-tape version 12.4.4-662.1, and a Tandberg T24 library with an HP Ultrium LTO-5 SCSI drive.

At the moment the loader is connected to an LSI SAS 9200-8e HBA controller.
I installed the driver from LSI and loaded the modules.
I also tried an Adaptec 1405 HBA, but I got nearly the same error with that adapter.

I tried to test the tape drive with btape (test, autochanger & speed).
The test and autochanger commands run without errors, but when the speed test reaches the random-data test I get an I/O error.

This is the log from /var/log/syslog:

kernel: [ 494.025655] mpt2sas0: log_info(0x3112010c): originator(PL), code(0x12), sub_code(0x010c)
kernel: [ 494.025700] st0: Error b0000 (driver bt 0x0, host bt 0xb).

It seems to be a driver error, but it only happens when the random test is running.
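
For anyone reading the mpt2sas line by hand, the `log_info` word can be split into its fields. The sketch below assumes the bit layout used by the Linux mpt2sas driver (bus type in the top 4 bits, originator in the next 4, then an 8-bit code and a 16-bit sub-code). The name `decode_log_info` is made up for illustration. It only reproduces what the kernel already printed and says nothing about what the sub-code means.

```python
# Split an mpt2sas log_info word into its bit fields.
# Assumed layout (taken from the Linux mpt2sas driver source):
#   bits 31-28 bus type, 27-24 originator, 23-16 code, 15-0 sub-code.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_log_info(value):
    """Return the fields of a log_info word as a dict."""
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


info = decode_log_info(0x3112010C)
print(info["originator"], hex(info["code"]), hex(info["sub_code"]))
```

The decoded originator, code, and sub-code should agree with the values the kernel printed next to `log_info`, which is a quick check that the assumed layout is right.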

Tomorrow I'll post more detailed information.

Any ideas?

Thanks Community


Gerhard Sulzberger

Nov 5, 2013, 2:54:22 AM
to bareos...@googlegroups.com

A little more information:

bareos-sd01 ~ # lsscsi -vH
[0] sata_sil24
dir: /sys/class/scsi_host/host0
device dir: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/ata1/host0
[1] sata_sil24
dir: /sys/class/scsi_host/host1
device dir: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/ata2/host1
[2] mpt2sas
dir: /sys/class/scsi_host/host2
device dir: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/host2
[3] ahci
dir: /sys/class/scsi_host/host3
device dir: /sys/devices/pci0000:00/0000:00:1f.2/ata3/host3
[4] ahci
dir: /sys/class/scsi_host/host4
device dir: /sys/devices/pci0000:00/0000:00:1f.2/ata4/host4
[5] ahci
dir: /sys/class/scsi_host/host5
device dir: /sys/devices/pci0000:00/0000:00:1f.2/ata5/host5
[6] ahci
dir: /sys/class/scsi_host/host6
device dir: /sys/devices/pci0000:00/0000:00:1f.2/ata6/host6
[7] ahci
dir: /sys/class/scsi_host/host7
device dir: /sys/devices/pci0000:00/0000:00:1f.2/ata7/host7
[8] ahci
dir: /sys/class/scsi_host/host8
device dir: /sys/devices/pci0000:00/0000:00:1f.2/ata8/host8

my bareos-sd.conf for this drive:

Autochanger {
Name = loader01
Device = drive-0
Changer Command = "/usr/lib/bareos/scripts/mtx-changer %c %o %S %a %d"
Changer Device = /dev/tape/by-id/scsi-3200100d08051e3c2
#Changer Device = /dev/sg4
#Changer Device = /dev/sch0
}

Device {
Name = drive-0
Media Type = LTO-5
AutoChanger = yes
Changer Device = /dev/tape/by-id/scsi-3200100d08051e3c2
#Changer Device = /dev/sch0
Changer Command = "/usr/lib/bareos/scripts/mtx-changer %c %o %S %a %d"
#Drive Index = 0
Archive Device = /dev/tape/by-id/scsi-3500110a00152f34e-nst
#Archive Device = /dev/nst0
AutomaticMount = yes; # when device opened, read it
AlwaysOpen = yes;
RemovableMedia = yes;
RandomAccess = no;
LabelMedia = no;
Spool Directory = /mnt/spool
# Enable the Alert command only if you have the mtx package loaded
Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
# If you have smartctl, enable this, it has more info than tapeinfo
## Alert Command = "sh -c 'smartctl -H -l error %c'"
}

What is new is that the btape test now fails as well.

btape -p drive-0
Tape block granularity is 1024 bytes.
btape: butil.c:285-0 Using device: "drive-0" for writing.
btape: btape.c:499-0 open device "drive-0" (/dev/nst0): OK
*test

=== Write, rewind, and re-read test ===

I'm going to write 10000 records and an EOF
then write 10000 records and an EOF, then rewind,
and re-read the data to verify that it is correct.

This is an *essential* feature ...

btape: btape.c:1183-0 Wrote 10000 blocks of 64412 bytes.
btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)
btape: btape.c:1199-0 Wrote 10000 blocks of 64412 bytes.
btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)
btape: btape.c:1241-0 Rewind OK.
10000 blocks re-read correctly.
Got EOF on tape.
10000 blocks re-read correctly.
=== Test Succeeded. End Write, rewind, and re-read test ===

btape: btape.c:1309-0 Block position test
btape: btape.c:1321-0 Rewind OK.
Reposition to file:block 0:4
Block 5 re-read correctly.
Reposition to file:block 0:200
Block 201 re-read correctly.
Reposition to file:block 0:9999
Block 10000 re-read correctly.
Reposition to file:block 1:0
Block 10001 re-read correctly.
Reposition to file:block 1:600
Block 10601 re-read correctly.
Reposition to file:block 1:9999
Block 20000 re-read correctly.
=== Test Succeeded. End Write, rewind, and re-read test ===

=== Append files test ===

This test is essential to Bareos.

I'm going to write one record in file 0,
two records in file 1,
and three records in file 2

btape: btape.c:601-0 Rewound "drive-0" (/dev/nst0)
btape: btape.c:1940-0 Wrote one record of 64412 bytes.
btape: btape.c:1942-0 Wrote block to device.
btape: btape.c:627-0 Bad status from weof. ERR=dev.c:1540 ioctl MTWEOF error on "drive-0" (/dev/nst0). ERR=Input/output error.

Next steps: remove the fake-RAID controller, change the cable, trigger a cleaning run for the tape drive, and reinstall Ubuntu on this system.

Has anyone seen similar problems, or does anyone have an idea?

Gerhard Sulzberger

Nov 5, 2013, 3:50:24 AM
to bareos...@googlegroups.com
On Monday, November 4, 2013 at 17:50:25 UTC+1, Gerhard Sulzberger wrote:

That's the kern.log message when the system crashes:

Nov 5 09:46:13 bareos-sd01 kernel: [ 1582.934434] mpt2sas0: log_info(0x31120100): originator(PL), code(0x12), sub_code(0x0100)
Nov 5 09:46:13 bareos-sd01 kernel: [ 1582.934490] st0: Error e0000 (driver bt 0x0, host bt 0xe).
Nov 5 09:46:15 bareos-sd01 kernel: [ 1584.433352] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
Nov 5 09:46:15 bareos-sd01 kernel: [ 1584.433407] IP: [<ffffffff8121188c>] sysfs_find_dirent+0x1c/0x110
Nov 5 09:46:15 bareos-sd01 kernel: [ 1584.433447] PGD 0
Nov 5 09:46:15 bareos-sd01 kernel: [ 1584.433463] Oops: 0000 [#1] SMP

Gerhard Sulzberger

Nov 5, 2013, 5:55:41 AM
to bareos...@googlegroups.com
So, a fresh installation of the Storage Daemon, but the same problem again.

+ I reinstalled the system with Ubuntu 12.04.3 Server.

**********bareos-sd01 /etc/bareos # uname -a
Linux *****bareos-sd01 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

+ The standard mpt2sas driver from Ubuntu is in use (LSI SAS 9200-8e connected to the Tandberg T24 library).

**********bareos-sd01 ~ # modinfo mpt2sas
filename: /lib/modules/3.8.0-29-generic/kernel/drivers/scsi/mpt2sas/mpt2sas.ko
version: 14.100.00.00
license: GPL
description: LSI MPT Fusion SAS 2.0 Device Driver
author: LSI Corporation <DL-MPTFu...@lsi.com>
srcversion: 1751C6D1630FAD81CCD994F
alias: pci:v00001000d0000007Esv*sd*bc*sc*i*
alias: pci:v00001000d0000006Esv*sd*bc*sc*i*
alias: pci:v00001000d00000087sv*sd*bc*sc*i*
alias: pci:v00001000d00000086sv*sd*bc*sc*i*
alias: pci:v00001000d00000085sv*sd*bc*sc*i*
alias: pci:v00001000d00000084sv*sd*bc*sc*i*
alias: pci:v00001000d00000083sv*sd*bc*sc*i*
alias: pci:v00001000d00000082sv*sd*bc*sc*i*
alias: pci:v00001000d00000081sv*sd*bc*sc*i*
alias: pci:v00001000d00000080sv*sd*bc*sc*i*
alias: pci:v00001000d00000065sv*sd*bc*sc*i*
alias: pci:v00001000d00000064sv*sd*bc*sc*i*
alias: pci:v00001000d00000077sv*sd*bc*sc*i*
alias: pci:v00001000d00000076sv*sd*bc*sc*i*
alias: pci:v00001000d00000074sv*sd*bc*sc*i*
alias: pci:v00001000d00000072sv*sd*bc*sc*i*
alias: pci:v00001000d00000070sv*sd*bc*sc*i*
depends: scsi_transport_sas,raid_class
intree: Y
vermagic: 3.8.0-29-generic SMP mod_unload modversions
parm: logging_level: bits for enabling additional logging info (default=0)
parm: max_sectors:max sectors, range 64 to 32767 default=32767 (ushort)
parm: missing_delay: device missing delay , io missing delay (array of int)
parm: max_lun: max lun, default=16895 (int)
parm: diag_buffer_enable: post diag buffers (TRACE=1/SNAPSHOT=2/EXTENDED=4/default=0) (int)
parm: prot_mask: host protection capabilities mask, def=7 (int)
parm: max_queue_depth: max controller queue depth (int)
parm: max_sgl_entries: max sg entries (int)
parm: msix_disable: disable msix routed interrupts (default=0) (int)
parm: mpt2sas_fwfault_debug: enable detection of firmware fault and halt firmware - (default=0)
parm: disable_discovery: disable discovery (int)


**********bareos-sd01 /etc/bareos # tapeinfo -f /dev/nst0
Product Type: Tape Drive
Vendor ID: 'HP '
Product ID: 'Ultrium 5-SCSI '
Revision: 'Z51U'
Attached Changer API: No
SerialNumber: 'HU1235R0RV'
TapeAlert[50]: Undefined.
MinBlock: 1
MaxBlock: 16777215
SCSI ID: 1
SCSI LUN: 0
Ready: yes
BufferedMode: yes
Medium Type: Not Loaded
Density Code: 0x58
BlockSize: 0
DataCompEnabled: yes
DataCompCapable: yes
DataDeCompEnabled: yes
CompType: 0x1
DeCompType: 0x1
BOP: yes
Block Position: 0
Partition 0 Remaining Kbytes: 1459056
Partition 0 Size in Kbytes: 1459056
ActivePartition: 0
EarlyWarningSize: 0
NumPartitions: 0
MaxPartitions: 1

+ installed bareos for ubuntu 12.04

**********bareos-sd01 /etc/bareos # dpkg -l | grep bareos
ii bareos-common 12.4.4-662.1 Backup Archiving Recovery Open Sourced - common libraries and support files
ii bareos-storage 12.4.4-662.1 Backup Archiving Recovery Open Sourced - storage daemon
ii bareos-storage-tape 12.4.4-662.1 Backup Archiving Recovery Open Sourced - storage daemon tape support

+ When I start btape, the mpt2sas kernel module is in use as well.

**********bareos-sd01 ~ # lsmod | grep mpt2sas
+Module Size Used by
+...
mpt2sas 162505 1
scsi_transport_sas 40856 1 mpt2sas
raid_class 13525 1 mpt2sas

+ I pasted my configuration for the tape drive into /etc/bareos/bareos-sd.conf

Autochanger {
Name = loader01
Device = drive-0
Changer Command = "/usr/lib/bareos/scripts/mtx-changer %c %o %S %a %d"
Changer Device = /dev/tape/by-id/scsi-3200100d08051e3c2
#Changer Device = /dev/sg4
#Changer Device = /dev/sch0
}

Device {
Name = drive-0
Media Type = LTO-5
AutoChanger = yes
Changer Device = /dev/tape/by-id/scsi-3200100d08051e3c2
#Changer Device = /dev/sch0
Changer Command = "/usr/lib/bareos/scripts/mtx-changer %c %o %S %a %d"
#Drive Index = 0

#Archive Device = /dev/tape/by-id/scsi-3500110a00152f34e-nst


Archive Device = /dev/nst0
AutomaticMount = yes; # when device opened, read it
AlwaysOpen = yes;
RemovableMedia = yes;
RandomAccess = no;
LabelMedia = no;
Spool Directory = /mnt/spool
# Enable the Alert command only if you have the mtx package loaded
Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
# If you have smartctl, enable this, it has more info than tapeinfo
## Alert Command = "sh -c 'smartctl -H -l error %c'"
}

+ open the device with btape

**********bareos-sd01 /etc/bareos # btape drive-0


Tape block granularity is 1024 bytes.
btape: butil.c:285-0 Using device: "drive-0" for writing.
btape: btape.c:499-0 open device "drive-0" (/dev/nst0): OK
*test

=== Write, rewind, and re-read test ===

I'm going to write 10000 records and an EOF
then write 10000 records and an EOF, then rewind,
and re-read the data to verify that it is correct.

This is an *essential* feature ...

btape: btape.c:1183-0 Wrote 10000 blocks of 64412 bytes.
btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)
btape: btape.c:1199-0 Wrote 10000 blocks of 64412 bytes.
btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)
btape: btape.c:1241-0 Rewind OK.
10000 blocks re-read correctly.
Got EOF on tape.
10000 blocks re-read correctly.
=== Test Succeeded. End Write, rewind, and re-read test ===

...... And So on ......


We should be in file 5. I am at file 5. This is correct!

=== End Forward space files test ===


Ah, I see you have an autochanger configured.
To test the autochanger you must have a blank tape
that I can write on in Slot 1.

Do you wish to continue with the Autochanger test? (y/n): n
*speed
btape: btape.c:1081-0 Test with zero data, should give the maximum throughput.
btape: btape.c:930-0 Begin writing 3 files of 1.073 GB with raw blocks of 64512 bytes.
++++++++++++++++++++++++++++++++++

.......

btape: btape.c:930-0 Begin writing 3 files of 4.294 GB with raw blocks of 64512 bytes.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)

btape: btape.c:433-0 Volume bytes=4.295 GB. Write rate = 286.3 MB/s
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)

btape: btape.c:433-0 Volume bytes=4.295 GB. Write rate = 286.3 MB/s
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)

btape: btape.c:433-0 Volume bytes=4.295 GB. Write rate = 306.7 MB/s
btape: btape.c:407-0 Total Volume bytes=12.88 GB. Total Write rate = 292.8 MB/s

...... At the Random Test I always got the I/O Error

btape: btape.c:1093-0 Test with random data, should give the minimum throughput.
btape: btape.c:930-0 Begin writing 3 files of 1.073 GB with raw blocks of 64512 bytes.
++++++++++++++++++++++++++++++++++


btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)

btape: btape.c:433-0 Volume bytes=1.073 GB. Write rate = 71.58 MB/s
+++++++++++++++++++++++++++++++++


btape: btape.c:631-0 Wrote 1 EOF to "drive-0" (/dev/nst0)

btape: btape.c:433-0 Volume bytes=1.073 GB. Write rate = 76.70 MB/s
+++++++++++++++++++++++++++++++++


btape: btape.c:627-0 Bad status from weof. ERR=dev.c:1540 ioctl MTWEOF error on "drive-0" (/dev/nst0). ERR=Input/output error.

btape: btape.c:433-0 Volume bytes=1.073 GB. Write rate = 89.48 MB/s
btape: btape.c:407-0 Total Volume bytes=3.221 GB. Total Write rate = 78.57 MB/s

btape: btape.c:930-0 Begin writing 3 files of 2.147 GB with raw blocks of 64512 bytes.

Write failed at block 0. status=-1 ERR=Input/output error
*quit
**********bareos-sd01 /etc/bareos # tail /var/log/kern.log
Nov 5 10:16:30 *****bareos-sd01 kernel: [ 38.040017] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
Nov 5 10:16:30 *****bareos-sd01 kernel: [ 38.061785] e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Nov 5 10:16:30 *****bareos-sd01 kernel: [ 38.061989] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
Nov 5 10:24:24 *****bareos-sd01 kernel: [ 511.765218] ip_tables: (C) 2000-2006 Netfilter Core Team
Nov 5 10:24:24 *****bareos-sd01 kernel: [ 511.783231] ip6_tables: (C) 2000-2006 Netfilter Core Team
Nov 5 10:29:14 *****bareos-sd01 kernel: [ 801.818200] st0: Block limits 1 - 16777215 bytes.
Nov 5 10:33:15 *****bareos-sd01 kernel: [ 1042.826501] st0: Error e0000 (driver bt 0x0, host bt 0xe).
Nov 5 10:33:17 *****bareos-sd01 kernel: [ 1044.324529] st0: Error 10000 (driver bt 0x0, host bt 0x1).
Nov 5 10:33:17 *****bareos-sd01 kernel: [ 1044.326771] mpt2sas0: removing handle(0x0009), sas_addr(0x500110a00152f34c)
Nov 5 10:34:40 *****bareos-sd01 kernel: [ 1127.556276] st0: Error on write filemark.
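
As a reading aid for the `st0: Error ...` lines: the st driver prints the SCSI result word in hex, and the host byte sits in bits 16-23. The sketch below maps that byte to the names in the Linux kernel's `include/scsi/scsi.h`. The table is assumed to match the kernels used in this thread, and `decode_st_error` is a hypothetical helper name.

```python
import re

# Host byte names as defined in the Linux kernel's include/scsi/scsi.h
# (assumption: unchanged for the 3.8 and 2.6.32 kernels used here).
HOST_BYTES = {
    0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT", 0x04: "DID_BAD_TARGET", 0x05: "DID_ABORT",
    0x06: "DID_PARITY", 0x07: "DID_ERROR", 0x08: "DID_RESET",
    0x09: "DID_BAD_INTR", 0x0A: "DID_PASSTHROUGH", 0x0B: "DID_SOFT_ERROR",
    0x0C: "DID_IMM_RETRY", 0x0D: "DID_REQUEUE",
    0x0E: "DID_TRANSPORT_DISRUPTED", 0x0F: "DID_TRANSPORT_FAILFAST",
}

ST_ERROR = re.compile(r"st\d+: Error ([0-9a-fA-F]+) \(driver bt")


def decode_st_error(line):
    """Return the host byte name for an 'stN: Error ...' line, or None."""
    match = ST_ERROR.search(line)
    if match is None:
        return None
    result = int(match.group(1), 16)
    host_byte = (result >> 16) & 0xFF
    return HOST_BYTES.get(host_byte, hex(host_byte))


samples = [
    "kernel: [ 494.025700] st0: Error b0000 (driver bt 0x0, host bt 0xb).",
    "kernel: [ 1042.826501] st0: Error e0000 (driver bt 0x0, host bt 0xe).",
    "kernel: [ 1044.324529] st0: Error 10000 (driver bt 0x0, host bt 0x1).",
]
for sample in samples:
    print(decode_st_error(sample))
```

If the table is right, the `e0000` and `10000` lines are transport-level conditions reported by the host adapter rather than sense data from the drive, which would fit the `removing handle` message that follows them.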

**********bareos-sd01 /etc/bareos # lsscsi -g
[0:0:1:0] tape HP Ultrium 5-SCSI Z51U /dev/st0 /dev/sg1
[0:0:1:1] mediumx EXABYTE MAGNUM 224 C304 /dev/sch0 /dev/sg2
[1:0:0:0] disk ATA ST500DM002-1BD14 KC45 /dev/sda /dev/sg0

+ Then I also tested the drive with tar.
+ I created one 10 GB file with random content and one 10 GB file of zeros:
+ dd if=/dev/urandom of=10Grandom.dat bs=1M count=10000
+ dd if=/dev/zero of=10Gzero.dat bs=1M count=10000
+ Then I wrote these files to tape with tar:

**********bareos-sd01 ~ # mt -f /dev/tape/by-id/scsi-3500110a00152f34e-nst rewind
**********bareos-sd01 ~ # mt -f /dev/tape/by-id/scsi-3500110a00152f34e-nst status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 0 bytes. Density code 0x58 (no translation).
Soft error count since last status=0
General status bits on (41010000):
BOT ONLINE IM_REP_EN
**********bareos-sd01 ~ # tar cvf /dev/tape/by-id/scsi-3500110a00152f34e-nst 10Gzero.dat --totals
10Gzero.dat
Total bytes written: 10737428480 (11GiB, 101MiB/s)
**********bareos-sd01 ~ # tar cvf /dev/tape/by-id/scsi-3500110a00152f34e-nst 10Grandom.dat --totals
10Grandom.dat
Total bytes written: 8510259200 (8.0GiB, ?/s)
tar: /dev/tape/by-id/scsi-3500110a00152f34e-nst: Cannot write: Input/output error
tar: Error is not recoverable: exiting now
**********bareos-sd01 ~ # lsscsi
[1:0:0:0] disk ATA ST500DM002-1BD14 KC45 /dev/sda
**********bareos-sd01 ~ #

+ BANG: writing random data to the tape ends with a driver error and a drive disconnect!

**********bareos-sd01 ~ # tail -f /var/log/syslog
...
Nov 5 11:43:16 *****bareos-sd01 kernel: [ 5239.742665] st0: Error e0000 (driver bt 0x0, host bt 0xe).
Nov 5 11:43:18 *****bareos-sd01 kernel: [ 5241.240707] st0: Error 10000 (driver bt 0x0, host bt 0x1).
Nov 5 11:43:18 *****bareos-sd01 kernel: [ 5241.240712] st0: Error on write filemark.
Nov 5 11:43:18 *****bareos-sd01 kernel: [ 5241.242186] mpt2sas0: removing handle(0x0009), sas_addr(0x500110a00152f34c)
.....
+ After a few seconds the device is back again
.....
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.848543] scsi 0:0:2:0: Sequential-Access HP Ultrium 5-SCSI Z51U PQ: 0 ANSI: 6
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.848551] scsi 0:0:2:0: SSP: handle(0x0009), sas_addr(0x500110a00152f34c), phy(3), device_name(0x500110a00152f34e)
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.848554] scsi 0:0:2:0: SSP: enclosure_logical_id(0x500605b006411d30), slot(3)
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.848558] scsi 0:0:2:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.850028] scsi 0:0:2:0: TLR Enabled
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.852251] st 0:0:2:0: Attached scsi tape st0
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.852255] st 0:0:2:0: st0: try direct i/o: yes (alignment 4 B)
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.852321] st 0:0:2:0: Attached scsi generic sg1 type 1
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5389.967180] st0: Block limits 1 - 16777215 bytes.
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5390.462372] scsi 0:0:2:1: Medium Changer EXABYTE MAGNUM 224 C304 PQ: 0 ANSI: 4
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5390.462381] scsi 0:0:2:1: SSP: handle(0x0009), sas_addr(0x500110a00152f34c), phy(3), device_name(0x500110a00152f34e)
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5390.462384] scsi 0:0:2:1: SSP: enclosure_logical_id(0x500605b006411d30), slot(3)
Nov 5 11:45:47 *****bareos-sd01 kernel: [ 5390.462387] scsi 0:0:2:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(5), cmd_que(1)
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5390.750844] ch0: type #1 (mt): 0x61+1 [medium transport]
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5390.750848] ch0: type #2 (st): 0x1+22 [storage]
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5390.750850] ch0: type #3 (ie): 0x71+1 [import/export]
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5390.750851] ch0: type #4 (dt): 0x51+1 [data transfer]
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5391.053001] ch0: dt 0x51: ch0: ID 3, LUN 0, ch0: Huh? device not found!
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5391.053007] ch0: INITIALIZE ELEMENT STATUS, may take some time ...
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5391.351859] ch0: ... finished
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5391.351865] ch 0:0:2:1: Attached scsi changer ch0
Nov 5 11:45:48 *****bareos-sd01 kernel: [ 5391.352084] ch 0:0:2:1: Attached scsi generic sg2 type 8

--

+ So I know it's not a Bareos problem, but does anyone have an idea how to solve it?

Marco van Wieringen

Nov 5, 2013, 6:58:46 AM
to bareos...@googlegroups.com
Gerhard Sulzberger <gerhard.sulzberger <at> runtastic.com> writes:

>
> Seems to be an driver error, but this only happens when the random test is
> running.
Stupid question, but having read all the info in your posts: did you try
another tape, i.e. different media? You say it only happens with random
data, and with tar on data that does not compress well. It could be as
simple as bad media: random data does not compress well, so it uses much
more of the actual tape. LTO also verifies what it writes in the same
pass: the head assembly has both a write head and a read head, and the
read head reads back what the write head just wrote. If that fails, the
drive reports an I/O error (i.e. raises a SCSI sense condition). It also
looks like you don't get the same sense data every time. I tried looking
into the SCSI sense data, but the codes you get do not seem to be well
documented; yesterday I found only one hit on the exact error you first
reported, and it had no solution.
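
The compression point can be illustrated away from the tape. This is only an analogy and rests on an assumption: it uses zlib, whereas an LTO drive uses its own hardware compression, so the ratios here are not what the drive would achieve. It shows why a file of zeros occupies far less medium than random data of the same size.

```python
import os
import zlib

SIZE = 1024 * 1024  # 1 MiB sample of each kind of data

zeros = bytes(SIZE)             # like dd if=/dev/zero
random_data = os.urandom(SIZE)  # like dd if=/dev/urandom


def ratio(data):
    """Compressed size divided by original size."""
    return len(zlib.compress(data)) / len(data)


zero_ratio = ratio(zeros)
random_ratio = ratio(random_data)
print(f"zeros:  {zero_ratio:.4f}")
print(f"random: {random_ratio:.4f}")
```

A ratio near 0 means almost nothing reaches the medium, and a ratio near 1 means nearly every byte does. That is why only the random-data runs exercise the write path at full volume.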

--
Marco van Wieringen marco.van...@bareos.com
Bareos GmbH & Co. KG Phone: +49-221-63069389
http://www.bareos.com

Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
Komplementär: Bareos Verwaltungs-GmbH
Geschäftsführer: Stephan Dühr, M. Außendorf, J. Steffens,
P. Storz, M. v. Wieringen



Stephan Dühr

Nov 5, 2013, 7:11:58 AM
to bareos...@googlegroups.com
Hi,

I'd give the Tandberg diagnostic tool a try:
http://www.tandbergdata.com/de/index.cfm/support/support-tools/
If you still have warranty, they will probably ask you to run that anyway.

Regards
--
Stephan Dühr stepha...@bareos.com
Bareos GmbH & Co. KG Phone: +49 221-630693-90
http://www.bareos.com

Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
Komplementär: Bareos Verwaltungs-GmbH
Geschäftsführer: S. Dühr, M. Außendorf,
J. Steffens, Philipp Storz, M. v. Wieringen

Gerhard Sulzberger

Nov 5, 2013, 7:55:21 AM
to bareos...@googlegroups.com
On Monday, November 4, 2013 at 17:50:25 UTC+1, Gerhard Sulzberger wrote:

Hi Marco,

Thank you for the response.
I have the same issue with different tapes.
The error happens at different positions in the file when I test with tar.


10Grandom.dat
Total bytes written: 4113100800 (3.9GiB, ?/s)

10Grandom.dat
Total bytes written: 2351575040 (2.2GiB, ?/s)

Now I'll try another OS (CentOS).
We will see what happens.

Gerhard Sulzberger

Nov 5, 2013, 9:08:30 AM
to bareos...@googlegroups.com
On Tuesday, November 5, 2013 at 13:11:58 UTC+1, Stephan Duehr wrote:
> I'd give the Tandberg diagnostic tool a try:
> http://www.tandbergdata.com/de/index.cfm/support/support-tools/
> If you still have warranty, they will probably ask you to run that anyway.

**********bareos-sd01 ~/tandberg # apt-get install openjdk-7-jre
**********bareos-sd01 ~/tandberg # tar xzf tdtoollinux64.tar.gz

+ logout and login again via ssh with X-Forwarding

ssh -X **********bareos-sd01

**********bareos-sd01 ~ # cd tandberg/
**********bareos-sd01 ~/tandberg # ls
lib readme.txt scsiintf.log TDTool.jar tdtoollinux64.tar.gz.1 tdtool.sh test
**********bareos-sd01 ~/tandberg # ./tdtool.sh

+ run the I/O test with the Tandberg Tool

All devices found on this computer...
0. ATA ST500DM002-1BD14
1. HP Ultrium 5-SCSI
2. EXABYTE MAGNUM 224

Selected Target: 0 - HP Ultrium 5-SCSI HU1235R0RV Z51U SAS

>>> Tape IO Test Start
Param: WR, 64, Random Data
Checking block size info...Completed
Get current compression status - Disabled
Rewinding...Completed
Resetting IO log parameters...Completed
Writing logical beginning of tape...Completed
Writing 4,000 MB data to the tape...Completed
Retrieving log status parameters...Completed
Write Transfer Rate = 7125 MB/min
Rewinding...Completed
Locating the first data...Completed
Reading 4,000 MB data from the tape...Completed
Retrieving log status parameters...Completed
Read Transfer Rate = 7881 MB/min
Rewinding...Completed
Erasing Tape...Completed
>>> Tape IO Test Passed
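
tdtool reports rates in MB/min while btape reports MB/s, so a quick conversion makes the two comparable. This sketch assumes both tools mean the same thing by "MB", which the thread does not confirm. The function name is made up for illustration.

```python
def mb_per_min_to_mb_per_s(rate_mb_per_min):
    """Convert a transfer rate from MB/min to MB/s."""
    return rate_mb_per_min / 60.0


write_rate = mb_per_min_to_mb_per_s(7125)  # tdtool write result
read_rate = mb_per_min_to_mb_per_s(7881)   # tdtool read result
print(f"write: {write_rate:.2f} MB/s")
print(f"read:  {read_rate:.2f} MB/s")
```

Note that this test wrote only 4,000 MB, while the tar runs failed somewhere between 2.2 GiB and 8.0 GiB, so a single pass of this size may not be enough to hit the fault.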

Gerhard Sulzberger

Nov 6, 2013, 7:52:47 AM
to bareos...@googlegroups.com
News update:

I have now tested with CentOS 6.4.

[**********bareos-sd01 ~]# modinfo mpt2sas
filename: /lib/modules/2.6.32-358.23.2.el6.x86_64/weak-updates/mpt2sas/mpt2sas.ko
version: 17.00.00.00
license: GPL
description: LSI MPT Fusion SAS 2.0 Device Driver
author: LSI Corporation <DL-MPTFu...@lsi.com>
srcversion: D2E9C46B38F2B099F197C2A
alias: pci:v00001000d0000007Esv*sd*bc*sc*i*
alias: pci:v00001000d0000006Esv*sd*bc*sc*i*
alias: pci:v00001000d00000087sv*sd*bc*sc*i*
alias: pci:v00001000d00000086sv*sd*bc*sc*i*
alias: pci:v00001000d00000085sv*sd*bc*sc*i*
alias: pci:v00001000d00000084sv*sd*bc*sc*i*
alias: pci:v00001000d00000083sv*sd*bc*sc*i*
alias: pci:v00001000d00000082sv*sd*bc*sc*i*
alias: pci:v00001000d00000081sv*sd*bc*sc*i*
alias: pci:v00001000d00000080sv*sd*bc*sc*i*
alias: pci:v00001000d00000065sv*sd*bc*sc*i*
alias: pci:v00001000d00000064sv*sd*bc*sc*i*
alias: pci:v00001000d00000077sv*sd*bc*sc*i*
alias: pci:v00001000d00000076sv*sd*bc*sc*i*
alias: pci:v00001000d00000074sv*sd*bc*sc*i*
alias: pci:v00001000d00000072sv*sd*bc*sc*i*
alias: pci:v00001000d00000070sv*sd*bc*sc*i*
depends: scsi_transport_sas,raid_class
vermagic: 2.6.32-279.el6.x86_64 SMP mod_unload modversions
parm: logging_level: bits for enabling additional logging info (default=0)
parm: sdev_queue_depth: globally setting SAS device queue depth
parm: max_sectors:max sectors, range 64 to 32767 default=32767 (ushort)
parm: command_retry_count: Device discovery TUR command retry count: (default=144) (int)
parm: missing_delay: device missing delay , io missing delay (array of int)
parm: max_lun: max lun, default=16895 (int)
parm: mpt2sas_multipath: enabling mulipath support for target resets (default=0) (int)
parm: disable_eedp: disable EEDP support: (default=0) (uint)
parm: sriov_enabled: sriov support enabled: (default=0) (uint)
parm: max_vfs: max virtual functions allocated per physical function (default=4) (uint)
parm: diag_buffer_enable: post diag buffers (TRACE=1/SNAPSHOT=2/EXTENDED=4/default=0) (int)
parm: disable_discovery: disable discovery (int)
parm: prot_mask: host protection capabilities mask, def=7 (int)
parm: max_queue_depth: max controller queue depth (int)
parm: max_sgl_entries: max sg entries (int)
parm: msix_disable: disable msix routed interrupts (default=0) (int)
parm: max_msix_vectors: max msix vectors (int)
parm: mpt2sas_fwfault_debug: enable detection of firmware fault and halt firmware - (default=0)

[**********bareos-sd01 ~]# uname -a
Linux ****bareos-sd01 2.6.32-358.23.2.el6.x86_64 #1 SMP Wed Oct 16 18:37:12 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[**********bareos-sd01 ~]#

I get the same error as on Ubuntu Server 12.04.3 when I test with btape (random):

Nov 5 16:52:22 *****bareos-sd01 dhclient[1608]: DHCPREQUEST on eth0 to 10.1.0.254 port 67 (xid=0x527081a2)
Nov 5 16:52:22 *****bareos-sd01 dhclient[1608]: DHCPACK from 10.1.0.254 (xid=0x527081a2)
Nov 5 16:52:23 *****bareos-sd01 dhclient[1608]: bound to 10.1.0.131 -- renewal in 1638 seconds.
Nov 5 16:58:03 *****bareos-sd01 kernel: st0: Error e0000 (driver bt 0x0, host bt 0xe).
Nov 5 16:58:05 *****bareos-sd01 kernel: mpt2sas0: removing handle(0x0009), sas_addr(0x500110a00152f34c)
Nov 5 16:58:16 *****bareos-sd01 kernel: st0: Error on write filemark.

I tested with tdtool from Tandberg --> no errors.
I've done the firmware upgrades for the drive and the library.

Now I am testing with the testing tools from HP:
http://h18006.www1.hp.com/products/storageworks/ltt/

I can't test the library with this tool, only the drive.
The positive thing is that I get much more information from this tool than from tdtool.

At the moment I think it could be a batch of damaged tapes that we got in our last delivery.

Next update tomorrow.

Gerhard Sulzberger

Nov 18, 2013, 12:30:04 AM
to bareos...@googlegroups.com
Update:

I sent all the reports from tdtool, the library's diagnostic tool, and more to Tandberg Data.

They told me that it could be a problem with the tape drive.
I hope I can swap the drive this week.

On Monday, November 4, 2013 at 17:50:25 UTC+1, Gerhard Sulzberger wrote:

Gerhard Sulzberger

Dec 10, 2013, 11:11:27 AM
to bareos...@googlegroups.com
Easy solution:

Got a new drive from Tandberg -> problem solved.
The positive side is that I have learned a lot about diagnostic tools and tape drive issues.

Now everything is working perfectly.

Thank you, Bareos team & community.
