Ultra1 Crash - FIX?

Robert Dinse

Apr 29, 2001, 12:34:56 AM

Ultra1 crashed again this evening.

Ultra1 is an Ultra 1 with 640MB of RAM and two 30GB IBM SCSI disks
presently running Linux 2.2.17 with RAID and NFS patches.

I thought it would be good to put the web server on the machine with the
files since it eliminates NFS as a weak link, but for some reason the
combination of Apache/Sparc/Linux seems to be unstable.

This machine ran for months at a time before Apache was put on it even
with the above mentioned RAID and NFS patches.

I'm going to move the web server to a different platform because I've
become convinced the Linux/Sparc/Apache combination cannot be made
stable. We had stability problems on 32-bit Sparc. Dave Miller said, "Go to
Ultra, you won't have these problems." I did, and the problems persist,
although at a slightly attenuated level.

Failure modes include running out of memory even though there are 640MB,
and this in part is related to bandwidth (and we have another T1 on order that
should rectify that), but other failure modes include "Failure to dereference
NULL pointer" which was also a frequent crash on the 32-bit hardware and a
state where the machine is still alive in terms of echoing keystrokes and you
can switch virtual consoles, but does not respond to anything else.

I'd be very interested to hear from people running Apache web servers that
handle in the neighborhood of 2 million hits/day or better and are stable.
By stable I mean averaging at least one month between crashes at that load.
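
For scale, the load figure above can be turned into an average request rate. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes hits are spread evenly over the day (real traffic peaks well above the average) and, for the bandwidth part, a single T1 at its nominal 1.544 Mbit/s line rate. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
T1_BITS_PER_SECOND = 1544000  # nominal T1 line rate (assumption: one T1 in use)


def average_hits_per_second(hits_per_day):
    """Average request rate, assuming hits are spread evenly over the day."""
    return hits_per_day / SECONDS_PER_DAY


def bytes_per_hit_budget(hits_per_day, link_bits_per_second):
    """Average bytes the link can carry per hit at the given daily load."""
    bytes_per_second = link_bits_per_second / 8
    return bytes_per_second / average_hits_per_second(hits_per_day)


if __name__ == "__main__":
    hits = 2000000
    print("average hits/s: %.1f" % average_hits_per_second(hits))
    print("bytes per hit on one T1: %.0f"
          % bytes_per_hit_budget(hits, T1_BITS_PER_SECOND))
```

At 2 million hits/day this works out to roughly 23 hits per second on average, and a saturated link at that rate leaves only a few kilobytes per hit, so slow transfers could plausibly keep many server processes alive at once.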

Rich Teer

Apr 29, 2001, 4:58:11 PM

On Sat, 28 Apr 2001, Robert Dinse wrote:

> Ultra1 is an Ultra 1 with 640MB of RAM and two 30GB IBM SCSI disks
> presently running Linux 2.2.17 with RAID and NFS patches.
>
> I thought it would be good to put the web server on the machine with the files
> since it eliminates NFS as a weak link, but for some reason the combination of
> Apache/Sparc/Linux seems to be unstable.

Not surprising. Try using Solaris instead of Linux.

--
Rich Teer

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.rite-online.net

Phil Allen

Apr 30, 2001, 6:16:52 PM

Robert,

I run Apache, some with mod_ssl, on Sol 7 on Ultra 1s, 2s, 10s and a 3000.
I'm trying Sol 8 on another 3000 now. My Ultra 2 with 256MB and twin 167s,
and the 10 of course, are just as fast as the 3000s, it seems. Solaris, from
what I've heard, is just as fast as, and more secure than, Linux for web
servers. BSD on a Compaq or Gateway would be my second choice. But I feel
more comfortable with Sol 7 for some reason. Maybe it's because my Sol 7
E3000 production server just stays up all the time, except for patch
installs and long power outages. Why break it if it ain't broke?

Phil

--

"Rich Teer" <ri...@rite-group.com> wrote in message
news:Pine.GSO.4.21.010429...@mars.rite-group.com...

AB

Apr 30, 2001, 10:37:06 PM

Robert Dinse <nan...@eskimo.com> wrote:
> Ultra1 crashed again this evening.
>
> Ultra1 is an Ultra 1 with 640MB of RAM and two 30GB IBM SCSI disks
> presently running Linux 2.2.17 with RAID and NFS patches.
>
> I thought it would be good to put the web server on the machine with the
> files since it eliminates NFS as a weak link, but for some reason the
> combination of Apache/Sparc/Linux seems to be unstable.
<snip>

> Failure modes include running out of memory even though there are 640MB,
> and this in part is related to bandwidth

It runs out of swap, too? I presume it gets "thrashy" toward the end.

> (and we have another T1 on order that should rectify that),

                                               ^^^^^^^
                                               ITYM "exacerbate"

> but other failure modes include "Failure to dereference NULL pointer"
> which was also a frequent crash on the 32-bit hardware and a state where
> the machine is still alive in terms of echoing keystrokes and you can
> switch virtual consoles, but does not respond to anything else.

It all sounds like a nasty userspace memory leak that eventually triggers
Linux's less-than-stellar out-of-memory behavior. With the last failure
mode, can you get any response out of the Magic SysRq key?

It would be helpful to know what version of Apache you're using, how you
obtained the binary, what distro you're running and whether your kernel
source is from kernel.org, a Sparc/Ultra-specific tree or a source package
in your distro. Any Apache modules in use? What sort of load-- files,
Perl, compiled CGI, MySQL?

Have you monitored memory usage as the thing runs to try and see what's
sucking it up?
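
One way to watch memory as the machine runs is to log a few fields of /proc/meminfo at intervals and see how they trend before a crash. The following is a minimal sketch in Python, assuming a Python 3 interpreter is available on the box; the function names, the chosen fields and the one-minute interval are illustrative, not anything from the setup described above.

```python
import re
import time

# Matches lines such as "MemFree:      12345 kB"; other lines are ignored.
_FIELD_LINE = re.compile(r"^(\w+):\s+(\d+)\s+kB\s*$")


def parse_meminfo(text):
    """Return {field_name: kilobytes} for every 'Name: N kB' line in text."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD_LINE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def format_sample(fields, keys=("MemFree", "SwapFree")):
    """Format the selected fields on one line; missing fields show as '?'."""
    return " ".join("%s=%s kB" % (key, fields.get(key, "?")) for key in keys)


def watch(path="/proc/meminfo", interval=60, samples=10):
    """Print a timestamped sample every `interval` seconds, `samples` times."""
    for _ in range(samples):
        with open(path) as handle:
            fields = parse_meminfo(handle.read())
        print(time.strftime("%Y-%m-%d %H:%M:%S"), format_sample(fields))
        time.sleep(interval)
```

Calling watch() prints ten samples a minute apart. This only shows system-wide totals; if SwapFree falls steadily, ps or top can then show which processes are growing.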

I admit to having a professional interest in this, especially if it can be
reproduced on x86.
--
drop ego to email me
