Tape drives, tape drives


Tony Lawrence

May 1, 2000
Things come in threes, but right now I've just got two :

- A Travan T4 unit. I was *sure* this darn thing was
defective, and HP replaced it, and now we have the same
symptoms: Access to tape locks, hangs in the driver, only
rebooting will clear it. This morning I thought I had
narrowed it down to specific cartridges because I test read
every cartridge in turn and it locked on Wednesday's.
Rebooted, tried Tuesday again, that was fine, tried
Wednesday, it locked. Rebooted, kept going, found Friday
had the same repeatable problem (yes, this is all very time
consuming). I was now certain that Wed and Fri were bad,
but decided to try the "good" tapes once more- one of them
locked it up. Other symptom: after these locks, and
shutting down OS without power down, the SCSI scan takes a
LOOONG time to find the tape- if the machine is power cycled
it finds it right away. Does that little clue mean anything
to anyone?

- A DAT unit. Here's a fun one- edge or tar work just fine
reading or writing the tape. But "dd" it, as edge does when
it reads the label, and the machine immediately panics-
"Kernel Stack too Deep". Not that it has anything to do
with whatever the problem really is, but why on earth would
dd be any different than tar tvf (in terms of the driver)?

Anyway, I'm thinking new cables and termination now, but
that second one (the "dd" thing) is a head scratcher- any
wild ideas would be interesting even if not actually useful..


--
Tony Lawrence (to...@aplawrence.com)
SCO/Linux articles, help, book reviews, tests,
job listings and more : http://www.pcunix.com

Bob Bailin

May 2, 2000
Tony Lawrence <to...@aplawrence.com> writes:
> Things come in threes, but right now I've just got two :
>
> - A Travan T4 unit. I was *sure* this darn thing was
> defective, and HP replaced it, and now we have the same
> symptoms: Access to tape locks, hangs in the driver, only
> rebooting will clear it. This morning I thought I had
> narrowed it down to specific cartridges because I test read
> every cartridge in turn and it locked on Wednesday's.
> Rebooted, tried Tuesday again, that was fine, tried
> Wednesday, it locked. Rebooted, kept going, found Friday
> had the same repeatable problem (yes, this is all very time
> consuming). I was now certain that Wed and Fri were bad,
> but decided to try the "good" tapes once more- one of them
> locked it up. Other symptom: after these locks, and
> shutting down OS without power down, the SCSI scan takes a
> LOOONG time to find the tape- if the machine is power cycled
> it finds it right away. Does that little clue mean anything
> to anyone?

I used to run into this problem a lot with early Seagate (Conner) DAT
drives. Since SCSI devices are intelligent, they can get themselves locked
up when confused by a possibly bad tape. The only way to "reboot" them is to
cycle power to the drive.

Sounds like you have a bad drive or a drive that could use a firmware update
to deal with tape problems more gracefully. Any chance of trying the "bad"
tapes on a different brand of TR-4 drive?

>
> - A DAT unit. Here's a fun one- edge or tar work just fine
> reading or writing the tape. But "dd" it, as edge does when
> it reads the label, and the machine immediately panics-
> "Kernel Stack too Deep". Not that it has anything to do
> with whatever the problem really is, but why on earth would
> dd be any different than tar tvf (in terms of the driver)?
>
> Anyway, I'm thinking new cables and termination now, but
> that second one (the "dd" thing) is a head scratcher- any
> wild ideas would be interesting even if not actually useful..


--
Bob Bailin
72027...@compuserve.com

Dan

May 2, 2000
Dear Tony,

I had recent struggles with a Seagate TR4 IDE drive. Not as
severe as yours, though, I just couldn't get thru a nightly
backup/verify with any regular success.

I sent out a cleaning cartridge, but the drive's firmware
was too early to run the cleaning. A firmware update allowed
the cleaning cartridge to work. Curiously, the manual had
instructions for swabbing with alcohol, but Seagate's web
site emphatically said dry process only. Seagate also recommended
retensioning, and to retension as many as 4 times if it
hadn't been done in a while. So we did, then set up Lone-tar
to do a tape reten prior to each Master.

My problems continued until the tapes were replaced.

I was also surprised to learn that tape drives are most in need
of cleaning after the initial use of a new tape. Stupid me!

I never recommend Travan type drives, so I don't deal with many of
them. I wonder if they have a significantly shorter life-span
than the 5.25" QICs such as 6525.
Best of luck,
Dan


In article <8emlg2$5ua$1...@ssauraaa-i-1.production.compuserve.com>,



Tony Lawrence

May 2, 2000


Heh. I've been informed by HP that they no longer have that
model. They'll replace it with a 20 gig model at no extra
charge, but the customer will have to buy all new tapes.
I'm going to suggest that he at least consider blowing that
money on a DAT instead..

I love big bureaucracies. HP really is pretty good- you get
to talk to live, friendly and mostly pretty knowledgeable
humans, but the offered 20 gig replacement drive is
backordered. That's OK; it's only a tape drive (irony
symbol), but what can you do- they'd ship it if they had it,
but they don't. However, they DO have RMA tags, so this
morning I got a Fedex HOT! package in a box that could
easily have held the backordered drive, but all that was in
it was the lonely RMA paperwork- which gives me 14 days to
return the defective unit for which the replacement has not
yet been shipped..

Bill Campbell

May 2, 2000
I've been having similar problems with Tandberg NS20 Travan drives, and
some with Tandberg NS8s as well, on both SCO 5.0.5 and Caldera OpenLinux
systems. These are all SCSI, and the tendency is for them to hang the tape
drive such that ``kill -9'' doesn't do anything.

I asked Tom Podnar at Microlite about this over the weekend, and particularly
about the retensioning recommendations. Tom's comment was that normal
incremental backups where something over a gig of data goes to the tape
exercise the tape enough, and that retensioning probably isn't necessary.

One thing I'm going to try is to run ``tape reten'' on SCO boxes, and
``mt -f /dev/st0 retension'' on the Linux machine prior to doing the
backup. I've found that this generally fails if I'm going to have a
problem with the tape, so I can test the exit status to avoid starting
edge and having it hang the tape.
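
The pre-check described above can be sketched roughly as follows. This is
only an illustration, not anyone's production script: the function names are
made up, the device /dev/st0 is the one mentioned in the post, and on SCO
the check command would presumably be ``tape reten'' instead.

```python
import subprocess
import sys

# Assumed check command, taken from the post; on SCO this might be
# ["tape", "reten"] instead.
RETENSION_CMD = ["mt", "-f", "/dev/st0", "retension"]


def tape_ready(cmd=RETENSION_CMD):
    """Return True only if the retension command exits with status 0."""
    try:
        result = subprocess.run(cmd)
    except OSError:
        # The command is missing or cannot be executed.
        return False
    return result.returncode == 0


def run_backup_if_ready(backup_cmd, check_cmd=RETENSION_CMD):
    """Run the backup only when the pre-check passes.

    Returns the backup's exit status, or None if the backup was skipped.
    """
    if not tape_ready(check_cmd):
        print("tape pre-check failed; backup skipped", file=sys.stderr)
        return None
    return subprocess.run(backup_cmd).returncode
```

Note that this only helps when the retension fails cleanly. If the
retension itself hangs in the driver, the script hangs with it.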

Bill
--
INTERNET: bi...@Celestial.COM Bill Campbell; Celestial Systems, Inc.
UUCP: camco!bill PO Box 820; 6641 E. Mercer Way
FAX: (206) 232-9186 Mercer Island, WA 98040-0820; (206) 236-1676
URL: http://www.celestial.com/

``The Income Tax has made more Liars out of American people than Golf has.''
Will Rogers

Bill Vermillion

May 2, 2000
In article <8emq41$4jg$1...@nnrp1.deja.com>,
Dan <daniel...@my-deja.com> wrote:

>I was also surprised to learn that tape drives are most in need
>of cleaning after the initial use of a new tape. Stupid me!

Not really. That would be the assumption of almost anyone who has
not worked closely with magnetic tape.

At times I think I've worked with tape since the time someone
dropped a roll of scotch tape into a bucket of rusty nails and
found they could record sound with it :-) (add :-) as needed)

Tapes differ a lot in manufacture, and the first audio tapes were
essentially nothing more than strips of paper (for the first
commercially available tapes on the Brush Soundmirror) with
iron paint on them: magnetite at first (black tapes) and
later ferrous/ferric oxides - aka rust.

The distance from the head to the media affects the ability to
record signals - and the closer the two mate, the higher the frequency
that can be recorded.

You would think that the tape is against the head - but on the
earliest tapes - which always looked dull like old floppy disks -
under a microscope the surface would look rough - lots of little
hills and valleys.

So in the old days of audio recording before you'd record something
for posterity you'd play a tape through from beginning to end at
least once to knock off the high-spots.

This turned your expensive tape-recorder head into a polishing
device. Later a company called Irish developed a process called
Ferro-Sheen - which gave a polished surface to tape. It was so
successful that Ampex bought them.

Since then tapes have become better and better each year. The
polish on the tape is done by a method called 'calendering' which
basically squeezes the tape between two high pressure rollers.

This compacts the magnetizable particles to make the surface more
dense, which in turn increases the s/n ability, which means that you
can squeeze more tracks and/or run it slower and/or increase the
density (the spacing between the particles).

Once the tape is made - in very wide rolls - it is slit
longitudinally to be put on spools/hubs/whatever and eventually
loaded into/onto the final transport (cartridge/reel/etc).

Dull cutters can cause ripples on the edge, fragments of magnetic
particle can break off and adhere to the surface. Any number of
things can happen.

When I can I buy used DAT tape. This is one-pass tape typically
from a drive manufacturer that is used one-time to test a new
drive. I've also purchased tape that was used for soft-ware
distribution.

These tapes fit the old 'play-it-once' idea from the early mag tape
days.

If the software permits it, I'll use the erase command to run a
tape from end to end before use. If not I'll record it once before
committing it to use.

Test first before use. You'd probably not want to be a passenger
in a brand new aircraft after it rolled off the assembly line
without someone at least test-flying it once. Treat your tapes the
same way if you have to make a valuable save.

As you can tell I've been around magnetic media for a long time.

When I need to use floppies - if I can avoid it I'll never use a
pre-formatted floppy without at least formatting it under a Unix
based system.

I typically would format a disk under a known-clean, virus-free DOS.
I would then dd the first track into a file. I then built a
wrapper program that formatted the disk with a full verify, then
would dd that above file back onto the disk.

Then I had a disk which >I< formatted and VERIFIED that I could
trust on an MS OR Unix based system.
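
A minimal sketch of the save-and-restore half of that wrapper, doing in
Python what the two dd commands would do. The function names are invented
for illustration, and the track size (18 sectors of 512 bytes, as on a
1.44 MB floppy) is an assumption, since the post does not say which disk
format was used. The format-with-verify step is left to the system's own
format utility.

```python
import os

# Assumption: one track is 18 sectors of 512 bytes (1.44 MB floppy).
TRACK_BYTES = 18 * 512


def save_first_track(device, image, nbytes=TRACK_BYTES):
    """Copy the first nbytes of the device into an image file."""
    with open(device, "rb") as src, open(image, "wb") as dst:
        data = src.read(nbytes)
        dst.write(data)
    return len(data)


def restore_first_track(image, device):
    """Write the saved image back over the start of the device."""
    with open(image, "rb") as src, open(device, "r+b") as dst:
        data = src.read()
        dst.seek(0)
        dst.write(data)
        dst.flush()
        os.fsync(dst.fileno())  # push the data out before the disk is removed
    return len(data)
```

The image is saved once from the DOS-formatted master; each new disk is then
formatted with verify under Unix and finished with restore_first_track().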

I've seen almost all errors you could see with magnetic media
including places where you could see through the tape where
something bumped it when it was coated, to bumps in a tape caused by
a spider trapped when the spool was being wound, and tapes that
would stop going through a transport when they were 1/1000th of an
inch too wide (this was 1/4" audio media - tolerances are tighter in
DAT), to ... and the list goes on.

I've worked with it enough to know that if you have a one-time shot
at something you pre-record/test the medium first, and if possible
run a duplicate at the same time. "Ready any time you are CB" -
for the people who remember that classic punch line.

At times it's like smoke and mirrors.

Bill
--
Bill Vermillion bv @ wjv.com

Art L.

May 2, 2000
On Tue, 2 May 2000 17:04:22 GMT, Bill Campbell <bi...@celestial.com>
wrote:

>I've been having similar problems with Tandberg NS20 Travan drives, and
>some with Tandberg NS8s as well, on both SCO 5.0.5 and Caldera OpenLinux
>systems. These are all SCSI, and the tendency is for them to hang the tape
>drive such that ``kill -9'' doesn't do anything.
>
>I asked Tom Podnar at Microlite about this this weekend, and particularly
>about the retensioning recommendations. Tom's comment was that normal
>incremental backups where something over a gig of data goes to the tape
>exercises the tape enough, and that retensioning probably isn't necessary.
>
>One thing I'm going to try is to run ``tape reten'' on SCO boxes, and
>``mt -f /dev/st0 retension'' on the Linux machine prior to doing the
>backup. I've found that this generally fails if I'm going to have a
>problem with the tape, so I can test the exit status to avoid starting
>edge and having it hang the tape.
>
>Bill

In 1999 my company did Y2K upgrades for all of our customers which
entailed new tape drives. We decided to standardize on Travan NS4
drives from Aiwa, Tecmar, Seagate, and Tandberg (which is a Seagate
Drive). This decision is only slightly better than our previous
decision to use CMS Jumbo tape drives. In other words I have spent a
lot of time working on backup problems.

We returned a lot of drives (particularly Tecmar and Aiwa) for repair
and a lot of tapes for replacement. We have had so many problems with
Imation tapes that we use Sony tapes exclusively now.

I retension the tape every night before backing up which helps and try
to clean the tape drive once a month (using a dry process cleaning
cartridge).

I am working on a backup script which will reboot the system if the
backup has not completed 2 hours after it started and then retry the
backup.
Please repost to newsgroup because my news server
does not allow posting.
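
The watchdog part of such a script might look something like the sketch
below. It is an illustration only: the function name and return values are
invented, and what to do on "hung" (reboot, retry, or notify someone) is
left to the caller.

```python
import subprocess
import sys


def run_with_deadline(cmd, deadline_seconds):
    """Run cmd and return "ok", "failed", or "hung".

    The call never blocks longer than deadline_seconds, even if the child
    is stuck inside the tape driver and cannot be killed.
    """
    proc = subprocess.Popen(cmd)
    try:
        status = proc.wait(timeout=deadline_seconds)
    except subprocess.TimeoutExpired:
        # Request a kill but do NOT wait for the exit: a process that is
        # stuck in the driver may never go away.
        proc.kill()
        print("backup exceeded its deadline", file=sys.stderr)
        return "hung"
    return "ok" if status == 0 else "failed"
```

Popen.wait() with a timeout is used here rather than subprocess.run(),
because run() waits for the child to exit after killing it, and that wait
would never return for a process that ignores ``kill -9''. The two-hour
limit from the post would be passed as deadline_seconds=2 * 60 * 60.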

Bill Campbell

May 2, 2000
On Tue, May 02, 2000 at 08:05:24PM +0000, Art L. wrote:
....

>In 1999 my company did Y2K upgrades for all of our customers which
>entailed new tape drives. We decided to standardize on Travan NS4
>drives from Aiwa, Tecmar, Seagate, and Tandberg (which is a Seagate
>Drive). This decision is only slightly better than our previous
>decision to use CMS Jumbo tape drives. In other words I have spent a
>lot of time working on backup problems.

I'm rapidly coming to the conclusion that the Travan drives are something
less than ideal, though hardly in the class of the Jumbo drives. Those were
downright dangerous because they would appear to write properly, then be
unreadable.

Recently I've installed some DDS-3 drives which are at least twice as fast
as the Travans. Tom Podnar suggested that I try the HP DDS-3 or DDS-4
drives with OBDR support. With these, it's possible to boot off the tape
for recovery which could be very useful.

>We returned a lot of drives (particularly Tecmar and Aiwa) for repair
>and a lot of tapes for replacement. We have had so many problems with
>Imation tapes that we use Sony tapes exclusively now.

I took one look at an Aiwa drive that a distributor shipped us when I
ordered an HP, and sent it back because it looked like a cheap piece
of junk. We used one in-house for a while, but ended up returning it
to get it replaced with a Tandberg.

We've been using Imation tapes, and that could well be the problem.
I'll try the Sonys.

>I retension the tape every night before backing up which helps and try
>to clean the tape drive once a month (using a dry process cleaning
>cartridge).
>
>I am working on a backup script which will reboot the system if the
>backup has not completed 2 hours after it started and then retry the
>backup.

That might be reasonable. I would have to check the maximum time for the
backups because I've seen some systems that take longer than two hours
legitimately when they have a lot of small files to back up. It might be
better to check the error logs or ``dmesg'' output periodically to see if
tape errors appear, and reboot if they do.

I'm always leery of unattended reboots because of the possibility that
somebody left a floppy in the drive or similar problems. My present
procedure is to send an e-mail to the local admin if I see a tape hung, and
have them reboot the system.
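
A rough sketch of the log check, assuming the ``dmesg'' output (or an error
log) has already been read into a string. The pattern is hypothetical:
real driver messages differ from system to system, so it would have to be
adjusted to whatever the local tape driver actually prints.

```python
import re

# Hypothetical pattern: a tape device name followed later on the same line
# by an error-like word. Adjust to the messages your driver really emits.
TAPE_ERROR = re.compile(
    r"\b(st\d+|tape)\b.*\b(error|timeout|reset)\b", re.IGNORECASE
)


def tape_errors(log_text):
    """Return the log lines that look like tape errors."""
    return [line for line in log_text.splitlines() if TAPE_ERROR.search(line)]
```

If the returned list is not empty, the lines can be mailed to the local
admin, who then decides whether a reboot is needed.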

Bill
--
INTERNET: bi...@Celestial.COM Bill Campbell; Celestial Systems, Inc.
UUCP: camco!bill PO Box 820; 6641 E. Mercer Way
FAX: (206) 232-9186 Mercer Island, WA 98040-0820; (206) 236-1676
URL: http://www.celestial.com/

What's this script do?
unzip ; touch ; finger ; mount ; gasp ; yes ; umount ; sleep
Hint for the answer: not everything is computer-oriented. Sometimes you're
in a sleeping bag, camping out.
(Contributed by Frans van der Zande.)

Tony Lawrence

May 3, 2000
Geoff Johnson wrote:
>
> If a tape driver is unkillable it is nearly always a device driver bug.
> There is no excuse for sleeping at an unkillable priority without setting up
> a timeout to signal the sleep. Even if it's 5 minutes later, this beats the
> forced reboot that is usually 5 minutes later anyway.

There's no excuse, but it's still extremely common. But
isn't it not that it's sleeping at an unkillable priority
but that it never comes out of the driver so the process
never sees the signal?

I think drivers should be written with an "abort" ioctl that
just tells it "give it up, Jack- I know you think you are
doing something useful, but you aren't, so just reset your
state and let your head pop back out of the water".

Geoff Johnson

May 4, 2000
If a tape driver is unkillable it is nearly always a device driver bug.
There is no excuse for sleeping at an unkillable priority without setting up
a timeout to signal the sleep. Even if it's 5 minutes later, this beats the
forced reboot that is usually 5 minutes later anyway.
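
The idea of a sleep guarded by a timeout can be shown with a user-space
analogy. This is not driver code and says nothing about how any particular
kernel does it; the names are invented, and start_io stands in for whatever
would start the hardware operation and signal its completion.

```python
import threading


def wait_for_completion(done, timeout_seconds):
    """Block until done is set or the timeout expires.

    Returns True if the operation completed, False if the wait timed out.
    """
    return done.wait(timeout_seconds)


def do_tape_command(start_io, timeout_seconds=300.0):
    """Start an operation and give up cleanly if it never completes.

    start_io is a hypothetical callable that begins the operation and
    arranges for done.set() to be called when the device answers.
    """
    done = threading.Event()
    start_io(done)
    if not wait_for_completion(done, timeout_seconds):
        # The caller gets control back instead of sleeping forever.
        raise TimeoutError("device did not respond")
    return True
```

The default of 300 seconds matches the five minutes suggested above. The
point is only that the sleeper always has a second way to wake up.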


Bill Campbell wrote:
>
> I've been having similar problems with Tandberg NS20 Travan drives, and
> some with Tandberg NS8s as well, on both SCO 5.0.5 and Caldera OpenLinux
> systems. These are all SCSI, and the tendency is for them to hang the tape
> drive such that ``kill -9'' doesn't do anything.
>
> I asked Tom Podnar at Microlite about this this weekend, and particularly
> about the retensioning recommendations. Tom's comment was that normal
> incremental backups where something over a gig of data goes to the tape
> exercises the tape enough, and that retensioning probably isn't necessary.
>
> One thing I'm going to try is to run ``tape reten'' on SCO boxes, and
> ``mt -f /dev/st0 retension'' on the Linux machine prior to doing the
> backup. I've found that this generally fails if I'm going to have a
> problem with the tape, so I can test the exit status to avoid starting
> edge and having it hang the tape.
>


--

Geoff Johnson

Geoff Johnson

May 10, 2000
Tony Lawrence wrote:

>
> Geoff Johnson wrote:
> >
> > If a tape driver is unkillable it is nearly always a device driver bug.
> > There is no excuse for sleeping at an unkillable priority without setting up
> > a timeout to signal the sleep. Even if it's 5 minutes later, this beats the
> > forced reboot that is usually 5 minutes later anyway.
>
> There's no excuse, but it's still extremely common. But
> isn't it not that it's sleeping at an unkillable priority
> but that it never comes out of the driver so the process
> never sees the signal?
>
We are saying exactly the same thing.
The only way to not leave the driver is to sleep. The only way the
process does not respond to signals is if it sleeps at too high
a priority.

An ioctl is useless if the upper levels of the driver will not allow
simultaneous opens of the device (usually a good thing).
Creating a co-device for issuing the ioctl to is harder than just
coding it correctly in the first place.

> I think drivers should be written with an "abort" ioctl that
> just tells it "give it up, Jack- I know you think you are
> doing something useful, but you aren't, so just reset your
> state and let your head pop back out of the water".

--

Geoff Johnson

Tony Lawrence

May 10, 2000
Geoff Johnson wrote:
>
> Tony Lawrence wrote:
> >
> > Geoff Johnson wrote:
> > >
> > > If a tape driver is unkillable it is nearly always a device driver bug.
> > > There is no excuse for sleeping at an unkillable priority without setting up
> > > a timeout to signal the sleep. Even if it's 5 minutes later, this beats the
> > > forced reboot that is usually 5 minutes later anyway.
> >
> > There's no excuse, but it's still extremely common. But
> > isn't it not that it's sleeping at an unkillable priority
> > but that it never comes out of the driver so the process
> > never sees the signal?
> >
> We are saying exactly the same thing.
> The only way to not leave the driver is to sleep. The only way the
> process does not respond to signals is if it sleeps at too high
> a priority.

I always have trouble getting my brain wrapped around these
issues. So does everyone else, even the people who write
the drivers- that's why they screw up, right?

But the fine distinction I'm trying to make here is that a
process running in kernel code doesn't respond to signals
because it doesn't see signals until it pops back up into
user space. So yes, that's probably because it's sleeping,
but isn't the fact that it's in kernel space more
important? I dunno, as I said, at a certain point here my
brain boggles and I lose track of the overall picture :-)

>
> An ioctl is useless if the upper levels of the driver will not allow
> simultaneous opens of the device (usually a good thing).
> Creating a co-device for issuing the ioctl to is harder than just
> coding it correctly in the first place.

You are probably right. Still, if you can't plan for every
screwup of the hardware, it would seem smart to have a
safety valve that could let the administrator free it.

Geoff Johnson

May 11, 2000
Tony Lawrence wrote:
>
> Geoff Johnson wrote:
> >
> > Tony Lawrence wrote:
> > >
> > > Geoff Johnson wrote:
> > > >
> > > > If a tape driver is unkillable it is nearly always a device driver bug.
> > > > There is no excuse for sleeping at an unkillable priority without setting up
> > > > a timeout to signal the sleep. Even if it's 5 minutes later, this beats the
> > > > forced reboot that is usually 5 minutes later anyway.
> > >
> > > There's no excuse, but it's still extremely common. But
> > > isn't it not that it's sleeping at an unkillable priority
> > > but that it never comes out of the driver so the process
> > > never sees the signal?
> > >
> > We are saying exactly the same thing.
> > The only way to not leave the driver is to sleep. The only way the
> > process does not respond to signals is if it sleeps at too high
> > a priority.
>
> I always have trouble getting my brain wrapped around these
> issues. So does everyone else, even the people who write
> the drivers- that's why they screw up, right?
>
> But the fine distinction I'm trying to make here is that a
> process running in kernel code doesn't respond to signals
> because it doesn't see signals until it pops back up into
> user space. So yes, that's probably because it's sleeping,

Only if it is sleeping at too high a priority. Below the threshold
priority the kernel will prematurely awaken the process if the process
is signaled.

Of course a stupid program could loop back and sleep again.

This is how reads on ttys, etc. are forced to return from the kernel
side of the fence when the process is interrupted. There is nothing
magic about being in the kernel except for bloody-minded device drivers.
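
The interruptible case can be watched from user space. The sketch below
blocks in a read on an empty pipe and lets SIGALRM knock the process out of
it. It assumes a POSIX system and must run in the main thread; the function
name is invented for illustration.

```python
import os
import signal


class ReadInterrupted(Exception):
    """Raised by the signal handler to abandon a blocked read."""


def _on_alarm(signum, frame):
    raise ReadInterrupted()


def interruptible_read(fd, nbytes, seconds):
    """Read from fd, or return None if SIGALRM arrives first.

    The process sleeps inside the kernel while it waits for data; because
    that sleep is interruptible, the signal wakes it and the read returns.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    try:
        signal.setitimer(signal.ITIMER_REAL, seconds)
        return os.read(fd, nbytes)
    except ReadInterrupted:
        return None
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
```

A driver that sleeps uninterruptibly would never let the handler run, which
is the situation where ``kill -9'' does nothing.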

In the old days of QIC tape drives just about every driver on the market
suffered from this problem because a typo in the standard led everyone
to write a damaged driver.

> but isn't the fact that it's in kernel space more
> important? I dunno, as I said, at a certain point here my
> brain boggles and I lose track of the overall picture :-)
>
> >
> > An ioctl is useless if the upper levels of the driver will not allow
> > simultaneous opens of the device (usually a good thing).
> > Creating a co-device for issuing the ioctl to is harder than just
> > coding it correctly in the first place.
>
> You are probably right. Still, if you can't plan for every
> screwup of the hardware, it would seem smart to have a
> safety valve that could let the administrator free it.
>
> >
> > > I think drivers should be written with an "abort" ioctl that
> > > just tells it "give it up, Jack- I know you think you are
> > > doing something useful, but you aren't, so just reset your
> > > state and let your head pop back out of the water".

--

Geoff Johnson
