strange slow down

Scott Randolph

Nov 9, 2000, 3:00:00 AM

Please be patient with me! I'm a part-time net admin and I don't work with
the hardware often enough to know all the terminology. The good news is that
my few servers just work. The bad news is that since they work so well, I
haven't learned anything <g>

I have one NW4.2 server and one NT4 SP6 server. The NT server is running NDS
for NT. I have two 24-port Cisco switches between the clients and the
servers. This configuration has been running smoothly for the past three
years or so. Suddenly, today, my clients are bogging down big time. The NW
server is loafing along at very low utilization levels. The NT server is
really only used for one SQL 7 application via TCP/IP. In other words, my
clients don't need to log in to the NT server in order to use the NT
application.

I also have one Unix-based Apache web server in a box (Whistle InterJet),
which acts as my DHCP server as well as my public web and e-mail server.

Because everyone is bogged down and because it happened suddenly, I have a
feeling that I have a hardware problem. My guess is that one or both
switches are going bad. However, I've never experienced a failing hub or
switch, so I don't know what the experience is like.

I thought that my Unix-based internet server was failing. However, if I turn
it off and only access the SQL program, I still have slow-moving clients. I
turned off both the NT server and the Unix server and things still ran
slowly. That leaves me with the switches. How do I check to see if a switch
is failing? Any other suggestions?

TIA,

Scott

Barry St.John [SysOp]

Nov 10, 2000, 3:00:00 AM

Before you reach the conclusion that it is the switches, check the NetWare
server. It's possible, depending on the NLMs you have running, that
memory has become fragmented and you need to reboot. Some NLMs will load
and use memory, but unloading them doesn't release all of the memory they
had been using. This will eventually fragment memory to the point where
the server performs poorly. This isn't exactly a NetWare problem, but
rather a problem with the particular NLM(s) that exhibit this behaviour.
Rebooting the NetWare server will clear this condition, if that is what's
causing the problem.
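
To illustrate why fragmentation hurts even when plenty of memory is free,
here is a toy sketch in Python. It is only a model: the slot list, the owner
labels "A" and "B", and the function name are made up for illustration, and
this is not how NetWare's allocator actually works.

```python
def largest_free_run(memory):
    """Return (total_free, largest_contiguous_free) for a list of slots.

    None marks a free slot; any other value is the (hypothetical) owner.
    """
    total_free = 0
    largest = 0
    run = 0
    for slot in memory:
        if slot is None:
            total_free += 1
            run += 1
            largest = max(largest, run)
        else:
            run = 0  # an occupied slot ends the current free run
    return total_free, largest


# Leftover slots from unloaded modules are scattered through memory.
fragmented = ["A", None, None, "B", None, None,
              "A", None, None, "B", None, None]
# After a reboot everything is free and contiguous again.
rebooted = [None] * 12

print(largest_free_run(fragmented))
print(largest_free_run(rebooted))
```

In the fragmented case eight slots are free, but no request for more than two
contiguous slots can be satisfied. After the "reboot" the whole range is one
free run again.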

Other things that can cause slowness:

Duplicate addresses. It is not usually possible to duplicate a MAC
address (the hardware address of a NIC), though some NIC drivers have the
capability to override the hardware address with one specified in
software. An IP address, on the other hand, can very easily be duplicated.
Either
of these conditions will cause slowness in a network, sometimes creating
an inoperable condition. Even if you use a DHCP server to dole out IP
addresses, it's possible in some cases to have a duplicate address
assigned. All it could take is one "creative" user playing around to set
a fixed address on a workstation to mess things up.
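
One way to look for a duplicated IP address is to collect (IP, MAC) pairs,
for example from the ARP tables of a few machines, and flag any IP claimed by
more than one MAC. The Python sketch below assumes the pairs have already
been gathered by some other means. The sample addresses and the function
name are invented for illustration.

```python
from collections import defaultdict


def find_duplicate_ips(observations):
    """Return {ip: sorted list of MACs} for IPs seen with more than one MAC.

    observations is an iterable of (ip, mac) string pairs.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise case so the same MAC is not counted twice.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs)
            for ip, macs in macs_by_ip.items()
            if len(macs) > 1}


# Hypothetical sample data: 192.168.1.10 is claimed by two different NICs.
sample = [
    ("192.168.1.10", "00:A0:C9:11:22:33"),
    ("192.168.1.11", "00:a0:c9:44:55:66"),
    ("192.168.1.10", "00:10:4B:77:88:99"),
]
conflicts = find_duplicate_ips(sample)
print(conflicts)
```

Any IP that shows up in the result is worth tracking down, since two
stations answering for the same address will disrupt traffic for both.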

The disk channel can also be a bottleneck. This sort of thing doesn't
typically happen overnight, but rather over time. A disk subsystem that
performs well when the server is new could bog down over time because of
the increased amount of data stored on the drives, and/or the number of
users accessing that data. A more drastic reduction in performance could
be caused by a failing or failed drive. For example, in a RAID 5
environment, a failed drive will not cause the server to crash because
data is striped across all of the drives, with parity calculated and stored
across all drives so that if one drive fails, there is enough information
on the remaining drives to rebuild the missing data. But this must be
calculated on the fly, which is not nearly as fast as reading from disk to
begin with.
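
The rebuild works because RAID 5 parity is a byte-wise XOR of the data
blocks in a stripe. The Python sketch below shows a single stripe with three
data blocks. The block contents are made up, and a real controller rotates
the parity block across drives and does this work in firmware or in the
driver.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


# One stripe: three data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[1]: XOR the surviving
# data blocks with the parity block to recompute the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

Every read of a block from the failed drive needs reads from all of the
surviving drives plus the XOR, which is why a degraded array is slower.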

Another thing that could cause poor performance is a failing server NIC.
A NIC can fail marginally so that it works, but not very well. You will
see lots of errors, resulting in lots of retransmissions, but perhaps not
enough to make it fail altogether. And since virtually all of the traffic
passes through that NIC, if it fails, it affects everybody.
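
A rough way to judge a marginal NIC is to take two snapshots of its packet
and error counters and see what fraction of the traffic in between was
counted as errors. The Python sketch below assumes you have copied the
counters by hand from whatever statistics screen your server provides. The
dictionary keys and the numbers are hypothetical.

```python
def error_rate(before, after):
    """Fraction of packets between two counter snapshots counted as errors."""
    packets = after["packets"] - before["packets"]
    errors = after["errors"] - before["errors"]
    if packets <= 0:
        return 0.0  # no traffic in the interval, nothing to measure
    return errors / packets


# Hypothetical counter snapshots taken a few minutes apart.
before = {"packets": 1000000, "errors": 120}
after = {"packets": 1050000, "errors": 2620}

rate = error_rate(before, after)
print(f"{rate:.1%} of packets in the interval had errors")
```

Comparing two snapshots matters because the counters are cumulative: a card
that has only recently started to fail can still show a small lifetime total.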

I'd check these things out first. If that isn't it, then you may well
have a faulty switch.

-Barry. [Novell Support Connection SysOp]

Scott Randolph

Nov 13, 2000, 3:00:00 AM

Barry, thanks for the great info. I'll get started on your list of things to
check. I'll see what I can find.

"Barry St.John [SysOp]" <Chicago...@compuserve.com> wrote in message
news:VA.00000d5...@gf5q0.firstfedbankkc.com...

Scott Randolph

Nov 14, 2000, 3:00:00 AM

> > Before you reach the conclusion that it is the switches, check the
> > NetWare server. It's possible, depending on the NLMs you have running,
> > that memory has become fragmented and you need to reboot. Some NLMs
> > will load and use memory, but unloading them doesn't release all of the
> > memory they had been using. This will eventually fragment memory to the
> > point where the server performs poorly. This isn't exactly a NetWare
> > problem, but one of the particular NLM(s) that exhibit this behaviour.
> > Rebooting the NetWare server will clear this condition, if that is
> > what's causing the problem.

Rebooting the NetWare server seems to have solved the problem.

Thanks again!

Barry St.John [SysOp]

Nov 14, 2000, 3:00:00 AM

In article <8urqp2$t1...@nexus.provo.novell.com>, Scott Randolph wrote:

> Rebooting the NetWare server seems to have solved the problem.

Now to determine which NLM(s) were causing the problem. Any chance you're
running ARCserve or a similar backup utility? What about anti-virus
software? There might not be anything you can do about these, except that
the mfr might have some patches to help with the problem.

Thanks for the feedback!

Scott Randolph

Nov 16, 2000, 3:00:00 AM

I've been running the same NLMs for years. One of them isn't perfect but it
has never caused this kind of network slowdown before. I did get this e-mail
today about my corporate-wide anti-virus software:
"There were a number of customers who reported computer slow downs
after running LiveUpdate and updating to the Nov. 6, 2000 virus
definitions. The Symantec AntiVirus Research Center quickly
identified the problem and within a matter of hours built a new virus
definition package. If you have experienced this slow down after
updating to the Nov. 6, 2000 definitions, run LiveUpdate again and
select "Virus Definitions (3)." When LiveUpdate is complete, your
virus definitions will be dated Nov. 9, 2000 and the slow down should
be resolved."

Perhaps this was my culprit?

Barry St.John [SysOp]

Nov 27, 2000, 3:00:00 AM

In article <8v14hl$nt...@nexus.provo.novell.com>, Scott Randolph wrote:

> Perhaps this was my culprit?

Sounds plausible to me.