
hardware failing


tho...@antispam.ham

Aug 1, 2020, 9:49:26 PM
One of my OS/2 systems must be pushing 20 years old and is starting to
show signs of hardware failure. I'm faced with the choice of either
getting new, faster, more modern hardware and fighting with OS/2
compatibility issues, or trying to repair the existing system, which
is still surprisingly capable despite its age. The trick to the
latter choice is to figure out what component is failing.

I've not yet noticed any pattern to the random reboots. The machine
can be up for many days when it decides to just stop without any kind
of error message and reboot. However, by far the most frequent
occurrence of a failure is associated with the reboot process itself,
specifically when the white square and OS/2 appear in the upper left
corner of the screen. If it gets past that point, the system can
remain up for days at a time, but I've seen the system reboot process
interrupted at that particular point many consecutive times. Could
be an ambient temperature issue, with some component having become
particularly sensitive to overheating.

This much I can say. The Adaptec 29160 SCSI bus scan happens just
fine. The system can boot from a MEMTEST 3.0 bootable CD just fine
and run tests on memory for hours without failure. Furthermore, the
machine has error-correcting memory, and the BIOS hasn't recorded any
events, so I'm fairly confident that the SCSI adapter and memory aren't
the source of the problem.

I have Boot Manager installed, and that always comes up just fine, so
the Boot Manager partition appears to be okay, as is the disk controller.
I have the operating system installed on two different partitions, both
on the same physical disk as the Boot Manager partition. Doesn't matter
which one I try to boot from; I get the reboot when "white square OS/2"
appears with either, so I doubt the failure is associated with a
particular disk partition.

Once booted, I need to run CHKDSK on the partitions located on other
physical disks, and those always complete without problem, so I don't
think the disk drives represent the failing component.

I have seen more than one random reboot occur while using FTP to
transfer files to another computer, so the network card is a suspect,
and if the system touches the network card during the boot process,
particularly immediately after the "white square OS/2" appears in the
upper left corner, then I'd have something to try.

Does anybody familiar with the details of the OS/2 boot process know
exactly what is happening immediately after the "white square OS/2"
appears? Narrowing down the list of possible failing components is
the goal here.

Lars Erdmann

Aug 2, 2020, 3:41:33 AM
I think the component most likely to fail is the power supply,
particularly if it is underpowered.
I cannot say for sure that this is the problem with your system, but power
supplies are not that expensive (compared to the MOBO, at least), so
I would give that a try.
And make sure it offers plenty of watts. Pick a larger one than what you
currently have.

Lars

On 02.08.20 03.49, tho...@antispam.ham wrote:
> One of my OS/2 systems must be pushing 20 years old and is starting to
> show signs of hardware failure. I'm faced with the choice of either
> getting new, faster, more modern hardware and fighting with OS/2
> compatibility issues, or trying to repair the existing system, which
> is still surprisingly capable despite its age. The trick to the
> latter choice is to figure out what component is failing.
...

David Wade

Aug 2, 2020, 5:09:59 AM
Almost certainly a dried-up capacitor. It's worth getting the caps on the
motherboard and in the PSU replaced.

Dave

Marcel Mueller

Aug 2, 2020, 6:40:46 AM
On 02.08.20 at 03:49, tho...@antispam.ham wrote:
> I've not yet noticed any pattern to the random reboots. The machine
> can be up for many days when it decides to just stop without any kind
> of error message and reboot. However, by far the most frequent
> occurrence of a failure is associated with the reboot process itself,
> specifically when the white square and OS/2 appear in the upper left
> corner of the screen. If it gets past that point, the system can
> remain up for days at a time, but I've seen the system reboot process
> interrupted at that particular point many consecutive times. Could
> be an ambient temperature issue, with some component having become
> particularly sensitive to overheating.
>
> This much I can say. The Adaptec 29160 SCSI bus scan happens just
> fine. The system can boot from a MEMTEST 3.0 bootable CD just fine
> and run tests on memory for hours without failure. Furthermore, the
> machine has error-correcting memory, and the BIOS hasn't recorded any
> events, so I'm fairly confident that the SCSI adapter and memory aren't
> the source of the problem.

Most likely you have an unstable power supply. Check all electrolytic
capacitors on the main board and inside the power supply for bulges. If
one is bulging, it is defective. Unfortunately, the converse does not
always hold: a capacitor can be bad without showing any bulge.

Whether you are able to replace this capacitor on your own is another
question. In the power supply this is quite easy. On the main board you
need soldering experience and a powerful soldering iron.

Alternatively, you may replace the defective component as a whole. But do
not expect any old power supply found somewhere to work. I have had as many
as 3 consecutive defective power supplies in one day.

Do not worry much about the rated maximum power. It is almost never the
cause of problems. Larger power supplies will primarily consume more
power, as they are less efficient at the same output.
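
A rough illustration of that last point, as a minimal sketch. The load and
efficiency figures below are made up for the example and are not measurements
of any real unit; actual efficiency curves vary by model. The function name
wall_power is likewise just a label chosen here. The only relation used is
that power drawn from the wall equals DC output divided by efficiency.

```python
def wall_power(load_watts: float, efficiency: float) -> float:
    """Return the AC power drawn from the wall for a given DC load.

    efficiency is a fraction in (0, 1], e.g. 0.75 for 75 %.
    """
    if load_watts < 0:
        raise ValueError("load_watts must be non-negative")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return load_watts / efficiency


# Hypothetical numbers, for illustration only.
LOAD_W = 100.0             # assumed DC load of the old machine
EFF_SMALL_PSU = 0.75       # assumed efficiency of a small unit at this load
EFF_LARGE_PSU = 0.65       # assumed efficiency of an oversized unit at this load

small = wall_power(LOAD_W, EFF_SMALL_PSU)
large = wall_power(LOAD_W, EFF_LARGE_PSU)

print(f"small unit draws {small:.1f} W from the wall")
print(f"large unit draws {large:.1f} W from the wall")
print(f"extra draw with the large unit: {large - small:.1f} W")
```

With these assumed figures the oversized unit draws more from the wall for
the same load. Whether that holds for a particular pair of power supplies
depends on their actual efficiency at the load in question.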


> I have Boot Manager installed, and that always comes up just fine, so
> the Boot Manager partition appears to be okay, as is the disk controller.
> I have the operating system installed on two different partitions, both
> on the same physical disk as the Boot Manager partition. Doesn't matter
> which one I try to boot from; I get the reboot when "white square OS/2"
> appears with either, so I doubt the failure is associated with a
> particular disk partition.

Typically, defective capacitors do not primarily cause problems under high
load. More often they cause problems with load changes. And
the first major load change is when the kernel activates the power-saving
states of the CPU. The BIOS of older hardware usually doesn't do so.


> Does anybody familiar with the details of the OS/2 boot process know
> exactly what is happening immediately after the "white square OS/2"
> appears? Narrowing down the list of possible failing components is
> the goal here.

You know Alt-F2? It will show you the drivers loaded.

But it is not too likely that it will help you much to diagnose hardware
problems. Almost any driver may fail because of this. You will just
blame the first one that executes some code that can no longer be
executed correctly.


Marcel

Dave Yeo

Aug 4, 2020, 2:33:25 AM
On 08/01/20 06:49 PM, tho...@antispam.ham wrote:
> One of my OS/2 systems must be pushing 20 years old and is starting to
> show signs of hardware failure.

While the others are likely correct about the power supply, or worse, a
system board capacitor, I'd start with the easiest thing. Unplug
everything and plug it in again. My son's computer recently died and
wouldn't boot; he removed the RAM and reinserted it, and it came back to
life and has been working fine since. Likely just some oxidation in his
case, and possibly yours. It's an easy thing to do. I've also heard of cards
working themselves slightly loose.
Dave

Dan Camp

Aug 16, 2020, 12:57:07 AM
On Sun, 2 Aug 2020 01:49:22 +0000 (UTC), tho...@antispam.ham wrote:

>One of my OS/2 systems must be pushing 20 years old and is starting to
>show signs of hardware failure. I'm faced with the choice of either
>getting new, faster, more modern hardware and fighting with OS/2
>compatibility issues, or trying to repair the existing system, which
>is still surprisingly capable despite its age. The trick to the
>latter choice is to figure out what component is failing.
>

Just throwing this out there as a possible workaround. I recently
pulled out an old OS/2 machine I hadn't booted in over 10 years because
I needed to get some files off it (ironically, to build a newer machine
running ArcaOS).

Rather than risk booting up 20+ year old equipment and having who
knows what happen, I was able to use VMware's standalone VM converter
software to do a bare-metal conversion of my OS/2 machine to a VM, which
I can run in VMware Player.

The benefit of doing this is that I now have access to my OS/2 machine
in its entirety running virtually on a modern machine.

===================================
danny...@spamsucks.attglobal.net
(Change the obvious to reply)

Windows 9x -n- (win-doze): A 32 bit Extension to
a 16 bit Graphical Shell of an 8 bit Operating System
originally coded for a 4 bit Processor by a 2 bit
company that can't stand one bit of competition!!

