Everything has been running very smoothly for two weeks. On Friday, everything
went downhill. They were printing a large document (on the one HP Rugged Writer
parallel printer) when, after about 16 pages, they got a
Parity Error - Problem on Motherboard and the system dumped RAM and froze.
They tried rebooting about 4 times and got the same thing or Double Panics. I
was not there but the client wrote down the exact error messages. Each time
the printer was on and the system would boot and reprint about the same amount
of paper and then the system would panic. Eventually Compaq talked them through
running Compaq Diagnostics (8.03) which tested the RAM 3 times successfully.
Then I had them boot the system without the printer on and it froze within about
ten minutes - so I don't think the printer has anything to do with it.
I tried dialing into SOS but it hung after I typed in my customer number.
I wonder if SCO's /etc/auth/system/ttys file is scrambled from too much
uugetty action? Anyway, I thought I would try installing unx329 (the 486 boot
supplement), which sounds like it might help, but I don't understand why the
system ran OK for two weeks, twenty-four hours a day. Also, the unx329.ltr
file seems to correct a problem by turning off the cache during boot, and I
don't have the cache turned on on this system anyway. So does anyone have
any clues at all? I'd really appreciate any suggestions quickly. I'll be
going back tomorrow (Monday) morning to try installing the SLS and to run
the Compaq Diagnostics myself and perhaps move the Rugged Writer to the
serial port (Yecch - HP printers and serial ports don't mix well). Anyway,
I don't usually deal with Compaq equipment so maybe some wisdom from those
of you who do would help solve my problem.
Thanks in Advance.
.----.
\ o / | Kevin Clark (ke...@wumpus.celestial.com)
| | Lawrence & Clark Inc. Seattle.
/ \ | (206) 323-2864
________|______
> \ o / | Kevin Clark (ke...@wumpus.celestial.com)
Sounds like RAM to me. Compaq uses SCO to do its memory tests. DOS-based RAM
tests may not always find problems like this. The main question to ask
yourself is... "Did anything change?" No matter how insignificant it seems, if
the customer rebuilt the kernel or turned on a modem, it may have an effect on
what is happening.
Good Luck.
--------------------------------------------------------------------------------
Independent SCO support and service provider in the D.C. metro area.
What I don't know I will learn. What I can't learn I'll hire someone to.
Paul A. Fischer | Open Age Inc. | pa...@openage.com
Sr. UNIX Sys. Consultant | (301) 948-6422 | (301) 948-9644 Fax
Huh? Compaq relies on SCO to test memory? Are there some magic memory
diagnostics built into SCO that I've never heard of? If a Compaq user suspects
a memory problem, they gotta go buy SCO to find out for sure? Ouch.
If it's memory, there is no reason why a decent diagnostic couldn't find a
problem SCO could. If the Compaq's diagnostics don't test memory in protected
mode and at full CPU speed, they are pretty lame diags. If they do, what
would be so "magic" that SCO would do that good diags wouldn't? (Mind you,
I have no clue what diags Compaq *does* provide -- I avoid Compaqs
whenever possible...)
For example, AT&T provided excellent diagnostics with their Service Manual
package in their Cascade series machines... you could select any or all
from a variety of tests, and run those tests against any or all addresses...
(for the full battery against 24MB, it was over a six hour job...)
--
Alan Denney # al...@informix.com # {pyramid|uunet}!infmx!aland
"We must learn to live together as brothers,
or we shall perish together as fools." - Dr. Martin Luther King, Jr.
>Huh? Compaq relies on SCO to test memory? Are there some magic memory
>diagnostics built into SCO that I've never heard of? If a Compaq user suspects
>a memory problem, they gotta go buy SCO to find out for sure? Ouch.
I have seen many systems pass the most exhaustive memory tests a DOS diag can
give them. These same systems crashed under SCO. When I pulled out the memory
and replaced it (SIMM by SIMM, or module by module) the system would still pass
memory diags. After swapping part of the memory out, the system would come
up and stay up! The only hardware change was memory.
DOS may survive on marginal memory, but a 32-bit, protected-mode, multi-tasking
OS like UNIX that uses all of memory will exercise it more rigorously than
99% of the DOS memory diags! Booting a multi-tasking OS like SCO is a serious
memory test!
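
For comparison, a typical diagnostic pass boils down to writing known
patterns and reading them back, one location at a time, with nothing else
going on. A rough sketch in Python, assuming nothing about Compaq's or anyone
else's actual diags: it only touches a scratch buffer in user space, not
physical RAM, and the function names are made up for illustration.

```python
def pattern_pass(buf, pattern):
    """Fill buf with one byte pattern, read it back, return failing offsets."""
    for i in range(len(buf)):
        buf[i] = pattern
    return [i for i in range(len(buf)) if buf[i] != pattern]


def address_pass(buf):
    """Store each offset's low byte at that offset to catch addressing faults."""
    for i in range(len(buf)):
        buf[i] = i & 0xFF
    return [i for i in range(len(buf)) if buf[i] != (i & 0xFF)]


def run_memory_passes(size=64 * 1024):
    """Run fixed, walking-one, and address patterns over a scratch buffer.

    Returns a dict mapping pass name to the list of failing offsets;
    an empty dict means every pass read back what it wrote.
    """
    buf = bytearray(size)
    # Fixed patterns plus a single 1 bit walked through each bit position.
    patterns = [0x00, 0xFF, 0x55, 0xAA] + [1 << bit for bit in range(8)]
    failures = {}
    for p in patterns:
        bad = pattern_pass(buf, p)
        if bad:
            failures[f"pattern 0x{p:02X}"] = bad
    bad = address_pass(buf)
    if bad:
        failures["address"] = bad
    return failures


if __name__ == "__main__":
    print(run_memory_passes() or "all passes clean")
```

This kind of sequential sweep is a much tamer workload than a loaded
multi-tasking system, which is the point being made above.
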
>If it's memory, there is no reason why a decent diagnostic couldn't find a
>problem SCO could. If the Compaq's diagnostics don't test memory in protected
>mode and at full CPU speed, they are pretty lame diags. If they do, what
>would be so "magic" that SCO would do that good diags wouldn't? (Mind you,
>I have no clue what diags Compaq *does* provide -- I avoid Compaqs
>whenever possible...)
Minor FLAME!
The ignorant shouldn't speak so harshly, but rather should ask why I think what I
think.
FLAME OFF!
I used to work for the largest Compaq dealer in the US and I was the Sr. UNIX
specialist. I have extensive Compaq experience, and I am just responding to a
plea for help from someone who hasn't seen as much Compaq troubleshooting
as I have.
>For example, AT&T provided excellent diagnostics with their Service Manual
>package in their Cascade series machines... you could select any or all
>from a variety of tests, and run those tests against any or all addresses...
>(for the full battery against 24MB, it was over a six hour job...)
Compaq has similar memory tests that take days. The problem is that none of
the memory diags I've seen test RAM like a multi-tasking OS.
>Alan Denney # al...@informix.com # {pyramid|uunet}!infmx!aland
> "We must learn to live together as brothers,
> or we shall perish together as fools." - Dr. Martin Luther King, Jr.
Good .sig
Again... just trying to help and explain.
--------------------------------------------------------------------------------
Independent SCO support and service provider in the D.C. metro area.
What I don't know I will learn. What I can't learn I'll hire someone to.
Paul A. Fischer | Open Age Inc. | pa...@openage.com
The only thing you can say when a system runs under DOS and fails
under UNIX is that the computer probably isn't on fire :-)
I had the same type of memory problems on my Intel 303 (386-33
cache) and they went away when I swapped out all the memory. My
DOS-only box runs fine with the memory I took out of the 303.
Paul, how 'bout setting your right margin a bit narrower so it
will fit in when people reply :-)
Bill
--
INTERNET: bi...@Celestial.COM Bill Campbell; Celestial Software
UUCP: uunet!camco!bill 6641 East Mercer Way
FAX: (206) 232-9186 Mercer Island, WA 98040; (206) 947-5591
Equinox allows having its memory map above 1MB. This fails miserably
if this area of memory is RAM cached. The same applies to putting the
memory map between A000-EFFF. I suggest you eliminate this as a
possible issue by putting the memory map between A000-EFFF (per
recommendations in the manual) and turning off the RAM caching
for this area of memory.
Otherwise let's assume that:
1. We're dealing with a hardware issue
2. That the cache is OFF so that the unx329 fix is not the issue.
3. That the RAM may be OK.
4. You have a valid tape backup and can afford to play.
5. The panics are reproducible.
Here's how I would troubleshoot the problem.
1. Verify that the CMOS setup is correct and does not exceed the
recommended wait states for the speed of RAM installed.
2. Run some DOS diagnostics. Checkit or QA Plus is fine. Don't
ignore the serial and parallel ports. Verify that you don't
have two devices on one IRQ. In particular, check IRQ=7 (lpt1:).
3. Replace the parallel printer cable. I've had so many screwed
up cables that I always carry spares. They also tend to get
plugged in half-way.
4. Verify that the case is not "hot" relative to AC ground. I was
nearly incinerated by a miswired clone power supply. Check this with
everything (including the printer and UPS) turned on and connected.
5. Go into the setup. Slow the machine down and turn off everything.
No cache, no shadow, no high mem remap, no internal cache, no nothing.
See if it works. If it does, then start turning things on in the
order of maximum suspicion until the culprit is found. If it still
crashes, continue...
6. Remove ALL cards not necessary to troubleshoot the problem. This
means leave the drive, controller, keyboard, video, monitor, and
not much else. Be sure to remove the floppy disk and tape drives
(DMA hogs), Equinox, lan cards, AT/IO, and what not. Beat it up
some more. If it works, put the cards back in order of maximum
suspicion. If it's still there, you may have a wasted video,
hd, or motherboard. Don't ignore the keyboard. I had a DOS machine
that would chronically hang. T'was the keyboard.
7. Reduce the amount of RAM in the machine to the minimum (4Mb?).
This should aggravate the problem and cause the system to swap,
thereby exercising the hard disk and RAM.
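
Steps 5 and 6 follow the same pattern: strip the machine to a bare
baseline, confirm it is stable, then add things back one at a time, most
suspicious first. A minimal sketch in Python, assuming a hypothetical
is_stable(enabled) callback that stands in for "boot with these items
enabled and beat on it for a while" (in real life, that is you and a
stopwatch). All names here are illustrative only.

```python
def find_culprit(suspects, is_stable):
    """Re-enable suspects one at a time, most suspicious first.

    suspects  -- items (cards, setup options) ordered by suspicion
    is_stable -- hypothetical callback: takes the list of enabled items
                 and returns True if the system survives a soak test
    Returns ("baseline", None) if the stripped system still crashes,
    ("culprit", item) for the first item whose addition causes a crash,
    or ("clean", None) if everything can be enabled without a crash.
    """
    enabled = []
    if not is_stable(enabled):
        return ("baseline", None)
    for item in suspects:
        enabled.append(item)
        if not is_stable(enabled):
            return ("culprit", item)
    return ("clean", None)


if __name__ == "__main__":
    # Made-up example: pretend the external cache is the problem.
    order = ["external cache", "shadow RAM", "Equinox board", "LAN card"]
    print(find_culprit(order, lambda on: "external cache" not in on))
```

A "baseline" result means the fault is in whatever was left installed
(video, disk, controller, motherboard, keyboard), which is where step 6
ends up.
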
Note the order of troubleshooting. Do what's easiest first and
save the messy stuff for last. Don't try to "fix" by replacement.
Interactions (address conflicts) are a verrrrry common problem.
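
As an illustration of the kind of interaction check meant in step 2, here
is a small Python sketch that flags any IRQ claimed by more than one device.
The device table is made up for the example (only lpt1 on IRQ 7 comes from
the list above); on a real machine you would fill it in from the card
jumpers and the setup screens.

```python
from collections import defaultdict


def find_irq_conflicts(assignments):
    """Return {irq: [devices]} for every IRQ claimed by two or more devices."""
    by_irq = defaultdict(list)
    for device, irq in assignments.items():
        by_irq[irq].append(device)
    return {irq: sorted(devs) for irq, devs in by_irq.items() if len(devs) > 1}


if __name__ == "__main__":
    # Hypothetical inventory, not taken from the system in question.
    cards = {"lpt1": 7, "com1": 4, "com2": 3, "multiport board": 7}
    for irq, devs in sorted(find_irq_conflicts(cards).items()):
        print(f"IRQ {irq} is shared by: {', '.join(devs)}")
```
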
My gut feel, wild guess is that something changed or was moved.
My money's on a cheapo printer cable or a partially installed
printer cable.
I hope this helps.
--
# Jeff Liebermann Box 272 1540 Jackson Ave Ben Lomond CA 95005
# 408.336.2558 voice wb6ssy@ki6eh.#nocal.ca.usa 73557,2074 cis
# 408.699.0483 digital pager wb6ssy.ampr.org [44.4.1.86]
# je...@comix.santa-cruz.ca.us uunet!comix!jeffl jeffl%co...@ucscc.ucsc.edu