Bizarro comms issue


Daemon Can

Jul 17, 2020, 3:24:55 PM
I have a 44P-170 running 4.3.

Out of the blue, it started getting "hung" from a communications perspective.
After boot, the console and telnet work fine for a time (anywhere from 3 minutes to a couple of hours), and then all connections stop working. The console appears to be connected (electrically: CD, RTS/CTS, etc.), but it stops responding to key presses or attempts to reset it. The system is pingable, and you can get an FTP "connection", but no actual transfer is possible (can't list files, etc.).
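
For anyone trying to catch the same "connects but does nothing" state from another machine, here is a minimal sketch of a probe. It assumes the service normally sends a greeting right after connect (as FTP does); the function name and the three result labels are made up for illustration and are not part of any tool on the box.

```python
import socket


def probe_service(host, port, timeout=5.0):
    """Classify a TCP service as 'unreachable', 'connected_no_data' or 'responding'.

    Hypothetical helper: the names and labels are illustrative only.
    """
    try:
        # The TCP handshake is completed by the remote kernel, so this can
        # succeed even when the daemon behind the port is hung.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        try:
            # A healthy FTP server sends a greeting line without being asked.
            data = sock.recv(256)
        except socket.timeout:
            return "connected_no_data"
        except OSError:
            return "unreachable"
        # Empty data means the peer closed the connection without sending anything.
        return "responding" if data else "connected_no_data"
```

Calling `probe_service(host, 21)` every minute or so and logging the result would show when the box slips from "responding" to "connected_no_data", which matches the pingable-but-dead symptom described above.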

Occasionally, it'll come back to life for a short while, then it'll be gone again.
Once in a while, I'll get a "respawning too quickly" message for tty25 on the console (it's disabled).

The only way to get it back is to hard reset it, and hope you get access to try a few things before it hangs up again.

Has anybody ever seen something like this before? (This is my 3rd RISC box in 30 years, and this is a new one for me)

Tks

Daemon Can

Jul 24, 2020, 11:56:42 AM
More info:

Ran diagnostics (pre-boot). System claims everything is OK. Put the system into single user mode overnight, and it was still responding this morning. Rebooted into multi-user, and it locked up shortly after displaying the login banner on the console.


Grant Taylor

Jul 25, 2020, 1:07:44 AM
Try multi-user with the network disconnected. See if the console still
responds a day later.

4.3 is rather long in the tooth. Is there any chance that something is
attacking it across the network?



--
Grant. . . .
unix || die

Daemon Can

Jul 26, 2020, 1:20:18 AM
On Saturday, 25 July 2020 01:07:44 UTC-4, Grant Taylor wrote:

>
> Try multi-user with the network disconnected. See if the console still
> responds a day later.
>

Yeah, tried that. Also with the RANs turned off.

> 4.3 is rather long in the tooth. Is there any chance that something is
> attacking it across the network?

It's possible. We've been adding a fair number of remote users recently, so one of their PCs might be "pinging" the crap out of it.

Update: Over a few reboots, in the time I had before the system locked up each time, I went into the inittab and disabled everything that I thought might cause a comms issue (faxserver, unused serial ports, etc.). I also stopped the Progress database servers from coming up after boot. The system stayed up after this (and is still running fine in multi-user). I was even able to start Progress once I was satisfied that things were going to stay up & running.
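
To make the inittab review less of a race against the next hang, a short script can list the entries that init will keep restarting. This is only a sketch: it assumes the usual `Identifier:RunLevel:Action:Command` layout and treats a leading `:` (the AIX convention, as far as I know) or `#` as a commented-out line. The sample entries below are invented for illustration and are not copied from the real file.

```python
def respawn_entries(inittab_text):
    """Return (identifier, runlevels, command) for each active 'respawn' entry."""
    entries = []
    for line in inittab_text.splitlines():
        line = line.strip()
        # Assumption: a leading ':' or '#' marks a commented-out entry.
        if not line or line[0] in ":#":
            continue
        # Split into at most 4 fields so colons inside the command survive.
        fields = line.split(":", 3)
        if len(fields) < 4:
            continue
        ident, runlevels, action, command = fields
        if action == "respawn":
            entries.append((ident, runlevels, command))
    return entries


# Hypothetical sample data, not taken from the machine in this thread.
SAMPLE = """\
: commented-out entry
init:2:initdefault:
tty25:2:off:/usr/sbin/getty /dev/tty25
tty26:2:respawn:/usr/sbin/getty /dev/tty26
"""

for ident, runlevels, command in respawn_entries(SAMPLE):
    print(ident, runlevels, command)
```

Anything this prints is a candidate for a "respawning too quickly" loop if its command exits immediately, so those are the entries worth switching off first.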

At this point, I'll have to conclude that it was one of the serial port processes that was swamping the machine's ability to communicate (time will tell). I've already advised people that, given this is a nearly 20-year-old machine and that we've been on its successor system (Windows-based) for over a year now, perhaps it's time they stopped being so reliant on its services & data (it could give up the ghost any day).

Thank you Grant, for your suggestions and for taking the time to respond.
