panic: Unable to find empty mailbox for aha1542.
in swapper task not syncing.
locks-up
Systems:
486/66 w/BT745S EISA Host Adaptor
8MB RAM
NE2000 clone
1G SCSI HD
486SLC/40 w/BT545S ISA Host Adaptor
8MB RAM
WD Elite-16
420MB IDE
1G SCSI
2 SCSI CD-ROM
1 SCSI Tape
Both running Linux 1.1.27, no additional patches
Both die infrequently while spooling news, though the latter is "more
frequent" lately (nightly).
Any clues, help most appreciated.
Peeking into the kernel code reveals (that I'm a freaking clueless bastard
:() that this occurs while we are trying to queue a fresh SCSI command.
My guess is that somewhere in the swap routine a flawed assumption is
being made about the availability of an empty mailbox, but this doesn't
make any sense to me either.
--
Keith Smith aka Digital Designs ke...@ksmith.com
5719 Archer Rd. Free Usenet News and Internet Mail Services
Hope Mills, NC 28348-2201 All 28K/14K Modems (910) 423-4216/7389/7391
Somewhere in the Styx of North Carolina ... 14K-V.32/28K-V.34/28K-V.34
ah hah! Thanks me too! :)
: locks-up
: Systems:
: 486/66 w/BT745S EISA Host Adaptor
: 8MB RAM
: NE2000 clone
: 1G SCSI HD
486/66 BT445S, localbus SCSI adaptor, 32 megs mem, 1.37 gig SCSI, 1.05
gig SCSI. No tape.
: 486SLC/40 w/BT545S ISA Host Adaptor
: 8MB RAM
: WD Elite-16
: 420MB IDE
: 1G SCSI
: 2 SCSI CD-ROM
: 1 SCSI Tape
: Both running Linux 1.1.27, no additional patches
Happens variously under 1.1.x --> 1.1.34 (latest I've tried).
: Both die infrequently while spooling news, though the latter is "more
: frequent" lately (nightly).
Hmm. We run innd, so news is running continuously. Happens at fairly
random intervals.
Michael.
--
Michael O'Reilly @ iiNet Technologies, Internet Service providers.
Voice (09) 307 1183, Fax (09) 307 8414. Email mic...@iinet.com.au
GCS d? au- a- v* c++ UL++++ L+++ E po--(+) b+++ D++ h* r++ u+
e+ m+ s+++/--- !n h-- f? g+ w t-- y+
>panic: Unable to find empty mailbox for aha1542.
>in swapper task not syncing.
>locks-up
>Systems:
>486/66 w/BT745S EISA Host Adaptor
> 8MB RAM
> NE2000 clone
> 1G SCSI HD
>486SLC/40 w/BT545S ISA Host Adaptor
> 8MB RAM
> WD Elite-16
> 420MB IDE
> 1G SCSI
> 2 SCSI CD-ROM
> 1 SCSI Tape
>Both running Linux 1.1.27, no additional patches
>Both die infrequently while spooling news, though the latter is "more
>frequent" lately (nightly).
486/33 AHA1542B 16MB RAM, NE2000 Clone, 2 x 1G SCSI, Linux 1.1.37
A few days ago I bought an Apple 300 SCSI external CD-ROM drive and had
the same symptoms you describe. It works okay for some time and then
locks up completely. So I tried to reproduce the lock-up and found that
transferring a large amount of data from the CD-ROM will lock up the
machine. For example, with:
cat * > /dev/null or
dd if=/dev/sr0 of=/dev/null
I could always reproduce the lock-ups.
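For anyone who wants the same kind of sustained sequential read from a
program instead of dd, here is a minimal sketch in C. The function name
drain_device is made up for illustration, and the 64 KB buffer size is an
arbitrary assumption; it simply reads a path to the end, throws the data
away, and reports how many bytes it got.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read everything from `path` and discard it, roughly what
 * "dd if=<path> of=/dev/null" does. Returns the number of bytes read,
 * or -1 if the path cannot be opened or a read error occurs.
 * The name and the buffer size are illustrative assumptions.
 */
long long drain_device(const char *path)
{
    static char buf[64 * 1024];
    long long total = 0;
    size_t n;
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* Keep the device busy with back-to-back reads until EOF or error. */
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += (long long)n;

    if (ferror(f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return total;
}
```

Calling drain_device("/dev/sr0") from a small main() should load the bus
in much the same way as the dd line above.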
Then I rebooted to MS-DOS and had complete lock-ups there too, when and
only when transferring large amounts of data.
I was confused by this and did RTFM again. The FM suggested putting the
terminator that came with the CD-ROM drive (it has connectors on both
sides) between the cable and the drive, and after doing this, all
problems went away.
So I can use my CD-ROM drive now but still have some questions:
- The drive has two connectors. Why does the drive not work correctly
when I put the SCSI cable on one and terminator on the other?
(All of the external SCSI devices I saw used to work this way.)
- How does the terminator work if put between connector and cable,
leaving the other connector empty? I always thought that the terminators
should be on both ends of the cable and not between external drive
and controller like it is now.
- Is it possible to use the drive with another computer when it is
connected to the now empty connector?
>Keith Smith aka Digital Designs ke...@ksmith.com
>5719 Archer Rd. Free Usenet News and Internet Mail Services
>Hope Mills, NC 28348-2201 All 28K/14K Modems (910) 423-4216/7389/7391
>Somewhere in the Styx of North Carolina ... 14K-V.32/28K-V.34/28K-V.34
--
Kang-Jin Lee
l...@tengu.in-berlin.de
[cat > /dev/snip]
>I was confused by this and did RTFM again. The FM suggested to put the
>terminator, it has connectors on both sides, that came with the CD-ROM
>drive between cable and drive and after doing this, all problems went
>away.
This does not surprise me. I suspect the problem of running out of
mailboxes comes up because we are not releasing the mailboxes when
we attempt to reset a device. We only attempt to reset the device when
a command does not finish or the device locks up, and this tends to happen
if the cable has termination problems (among other things).
I have a theory that the enclosed patch will release the
mailboxes when we abort a command, and may solve the problem for people
who believe that they have a good cable. Please let me know if it helps
at all (conversely, please let me know if it does no good, or even worse).
-Eric
--- aha1542.c.~2~ Mon Jul 25 22:17:36 1994
+++ aha1542.c Tue Aug 2 22:38:55 1994
@@ -1083,6 +1083,7 @@
 SCtmp->scsi_done(SCpnt);
 HOSTDATA(SCpnt->host)->SCint[i] = NULL;
+HOSTDATA(SCpnt->host)->mb[i].status = 0;
 }
 return SCSI_RESET_SUCCESS;
 #else
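
To make the theory concrete, here is a small stand-alone model in C of
the suspected leak. It is a sketch, not the driver: the struct layout,
the function names, and the mailbox count of 8 are all assumptions made
for illustration. It assumes a slot counts as free only when both its
status is 0 and no command is recorded against it, so a reset that
clears the command pointer but leaves the status set makes the slot
unusable from then on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAILBOXES 8   /* hypothetical count, chosen for the example */

/* Simplified stand-ins for per-host driver state (illustrative names). */
struct mailbox { int status; };      /* 0 = free, nonzero = in use */

struct host {
    struct mailbox mb[MAILBOXES];
    const void *SCint[MAILBOXES];    /* pending command per slot, or NULL */
};

/* Claim a free slot for cmd. Returns the slot index, or -1 where the
 * real driver would panic with "Unable to find empty mailbox". */
int queue_command(struct host *h, const void *cmd)
{
    assert(cmd != NULL);
    for (int i = 0; i < MAILBOXES; i++) {
        if (h->mb[i].status == 0 && h->SCint[i] == NULL) {
            h->mb[i].status = 1;
            h->SCint[i] = cmd;
            return i;
        }
    }
    return -1;
}

/* Reset path: forget every pending command. With release_mailboxes == 0
 * only the command pointer is cleared (the suspected unpatched behaviour);
 * with 1 the slot status is cleared as well (what the patch adds). */
void reset_host(struct host *h, int release_mailboxes)
{
    for (int i = 0; i < MAILBOXES; i++) {
        if (h->SCint[i] != NULL) {
            h->SCint[i] = NULL;
            if (release_mailboxes)
                h->mb[i].status = 0;
        }
    }
}

/* Fill every slot, reset, then count how many new commands still fit. */
int slots_after_reset(int release_mailboxes)
{
    struct host h;
    int dummy_cmd = 0;               /* any non-NULL address will do */
    int usable = 0;

    memset(&h, 0, sizeof h);
    for (int i = 0; i < MAILBOXES; i++)
        queue_command(&h, &dummy_cmd);

    reset_host(&h, release_mailboxes);

    for (int i = 0; i < MAILBOXES; i++)
        if (queue_command(&h, &dummy_cmd) >= 0)
            usable++;
    return usable;
}
```

In this model slots_after_reset(0) leaves no usable slots, which
corresponds to the panic on the next queued command, while
slots_after_reset(1) gets all of them back.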
--
"The woods are lovely, dark and deep. But I have promises to keep,
And lines to code before I sleep, And lines to code before I sleep."
I recently had an experience which might support your theory.
A couple days ago I had a system which would panic with "unable to find
empty mailbox for aha1542" when I accessed the CD-ROM drive for a
couple minutes. Before the panic, I would get a few "SCSI host 0 timed
out - aborting command" messages. Since I could reproduce the problem
under DOS, I decided the cause was probably not software, and swapping
hardware revealed that the 1542 card was bad (of course, before
swapping hardware, I double checked card configuration and bus
termination and couldn't find anything wrong there). After replacing
the card, things worked fine.
I suspect that with your fix, I would have only seen the ".. timed out
- aborting command" errors, and not the subsequent panic.