eDir 8.7.0.4 & NW5.1 - utilization problems

al...@canisius.edu

Jan 29, 2004, 9:03:32 AM

We've begun experiencing some high utilization problems on some
of our servers since shortly after the new semester began. Posting
to the os.server.netware5x.utilities group only received a suggestion
to purge the SYS volume on one machine that has consistently high
utilization. The only other change (which I noted early in my other
thread) was flattening our eDirectory tree structure.

All 4 servers are NW5.1 SP6, eDir 8.7.0.4. I had put off 8.7.1x or
8.7.3 due to a hardware failure in my test lab & having read some
messages that alluded to problems with NDPS serving queues.

Our tree formerly looked something like this

|_amh
|_buf
| |_adm
| |_fac
|_stu
| |_a-c
| |_d-f
| |_g-i
| |_j-l
| |_m-o
| |_p-r
| |_s-t
| |_u-z
|_tech

where servers reside in amh, buf, adm.buf, tech

The bindery context for the main server in buf had 13 levels. The
split structure for the stu container was to keep under 1500 objects
per container as recommended in the Hughes/Thomas Four Principles of
NDS Design (1996).

Our local Novell system engineer told me that this restriction no
longer existed with eDirectory 8.7. He also noted that eDir 8.7 supports
up to 12 levels in a bindery context. (This was needed to support
Macintosh logins; the main server is our academic support machine.)

Over the semester break, I moved all students from their subcontainers
to the stu level, so now our tree looks more like this:

|_amh
|_buf
| |_adm
| |_fac
|_stu
|_tech

with close to 6000 objects in the stu container. Student logins are
faster now that the registry "hack" that was needed to search the
tree has fewer places to look.

Last week we started seeing performance problems on the main server.
Looking at kernel threads, things appeared normal, although the AV
software (Symantec AntiVirus Corp. Ed. 8.0) was appearing frequently.
I have unloaded the SAV server from startup & the utilization on the
main server has been nominal. However, I still see high utilization on
the server in the amh container ("remote" site - connected via T1). I
ran local db repairs on all 4 servers this morning, with 5
modification time corrections on this amh container server. I repeated
the repairs until I had 0 errors (only the one server required a 2nd
run; all others were clean to start).

Now I have 2 servers that have high utilization, and 2 that are OK.
None are running the SAV CE software, which has a number of us on
edge with the mydoom worm & others in apparent full swing.

I hope this is sufficient background - now the questions:

1. Is it possible that this has caused/contributed to my performance
issues? Top threads are usually server threads.

2. Have I now exceeded any design capabilities of 8.7.0.4 on a NW51
network?

3. Are there tools available that can tell me exactly what is going
on inside these servers?

4. Is there a better resource for me to be looking for assistance
on these issues?

Any help/guidance will be greatly appreciated!

Alan D. Weitzsacker, Sys Admin III
Canisius College, Buffalo, NY

Bert Boleij

Jan 29, 2004, 9:26:15 AM

Any difference between the servers concerning the amount of physical memory?

--
Bert Boleij
Novell Engineer TRN
Life is too short to drink bad wine...

Donald Albury

Jan 29, 2004, 11:15:13 AM

Alan,

I don't see a mention of it, so I'll ask if you have Client File
Caching Enabled set to OFF on the servers? Oplocks can cause high
utilization. Also, are there a lot of purgeable files on any volume
(not just SYS)? Are the volumes using compression, and are they also
relatively full?

Donald Albury
Novell Product Support Volunteer SysOp
Sorry, no replies to e-mail responses

Andrew C Taubman

Jan 29, 2004, 5:38:28 PM

> 1. Is it possible that this has caused/contributed to my performance
> issues? Top threads are usually server threads.

Not sure what you're referring to as "this" here. I see no evidence
whatsoever this is a DS problem.

> 2. Have I now exceeded any design capabilities of 8.7.0.4 on a NW51
> network?

Nowhere near it. We have trees with tens of millions of objects working
fine.

> 3. Are there tools available that can tell me exactly what is going on
> inside these servers?

NRM - profile/debug info - profile cpu util by NLM

> 4. Is there a better resource for me to be looking for assistance on these
> issues?

Better than here on the forums, you mean? Or better than Monitor?
--
Andrew C Taubman
Novell Support Forums Volunteer SysOp
http://support.novell.com/forums
(Sorry, support is not provided via e-mail)

Opinions expressed above are not
necessarily those of Novell Inc.

al...@canisius.edu

Jan 30, 2004, 1:29:26 PM

The machine with the single CPU at the remote site has 640MB. All the
others have 1100MB.

al...@canisius.edu

Jan 30, 2004, 1:36:57 PM

I just verified - all have client file caching enabled set to off.

I have a lot of free space on all volumes, compression is enabled, and I
try to purge them on a fairly regular basis (once a week or so). I had
purged the volumes on the ProLiant 1600 & it didn't seem to make any
difference. While I was verifying the settings just now, that machine was
in a nominal utilization state (0-10% around 10 minutes ago, but it was in
the 90-98% range around 9:30). I do have compression on, but

al...@canisius.edu

Jan 30, 2004, 1:40:57 PM

> 1. Is it possible that this has caused/contributed to my performance
> issues? Top threads are usually server threads.
>
> Not sure what you're referring to as "this" here. I see no evidence
> whatsoever this is a DS problem.

I guess I could have been more specific - I was referring to the
consolidation of all the stu subcontainers into the stu container.

>
> 2. Have I now exceeded any design capabilities of 8.7.0.4 on a NW51
> network?
>
> Nowhere near it. We have trees with tens of millions of objects working
> fine.

I'm glad to see this - I doubt we'd ever see that. I think I have around
11000 objects total, including all the NDPS printers.

>
> 3. Are there tools available that can tell me exactly what is going on
> inside these servers?
>
> NRM - profile/debug info - profile cpu util by NLM
>

Thanks!

>
> 4. Is there a better resource for me to be looking for assistance on these
> issues?
>
> Better than here on the forums, you mean? Or better than Monitor?

I wasn't sure if I was posting to the right area. I've been able to
resolve nearly all issues from the forums & as always appreciate all the
great info & direction I gain from them.

Although monitor does leave some things to be desired at times...

David Gersic

Jan 30, 2004, 5:06:23 PM

On Thu, 29 Jan 2004 14:03:32 GMT, al...@canisius.edu wrote:

>Our local Novell system engineer told me that this restriction no
>longer existed with eDirectory 8.7.

Even in 1996, you could put many thousands of objects in a container, and NDS
ran fine. The management utility (NWAdmin), a 16-bit app running on Win31 would
puke on it, but that wasn't a limitation of NDS. I know this for a fact, as we
did it. I had well over 3000 objects per container back then.

>1. Is it possible that this has caused/contributed to my performance
> issues? Top threads are usually server threads.

My guess: your changes are what is causing your current problem. I've seen
Windows boxes get confused when something changes in eDir and some object it
wants to see is no longer where it used to be. It then goes into a busy loop
looking for it. If the server threads you're seeing in Monitor keep changing
(server 45, server 42, server 10, etc.) every couple of seconds, then you're
seeing a lot of inbound requests (a request gets a thread allocated to it) that
are being serviced. When the request is complete, the thread is released, so
lots of requests will show up as constantly changing thread numbers in Monitor.

To track this down, your best bet is a packet sniffer to spot the offending
workstation(s). Get the server-based packet capture NLM (free) and a copy of
Ethereal (also free, and VERY nice, I prefer it to Sniffer). Grab some traffic
from the server and have a look at it to see what's going on.
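
As a rough illustration of the "spot the offending workstation(s)" step, here is a minimal Python sketch that tallies packets per source address from a capture summary exported as CSV. It assumes the export has "Source" and "Destination" columns; the column names, the sample addresses, and the function name top_talkers are all made up for the example, so adjust them to whatever your export actually contains.

```python
import csv
import io
from collections import Counter

# Hypothetical capture summary; the column names and addresses are assumptions.
SAMPLE_CSV = """No.,Time,Source,Destination,Protocol,Info
1,0.000,10.0.0.21,10.0.0.5,NCP,request
2,0.001,10.0.0.5,10.0.0.21,NCP,reply
3,0.002,10.0.0.21,10.0.0.5,NCP,request
4,0.003,10.0.0.34,10.0.0.5,NCP,request
5,0.004,10.0.0.21,10.0.0.5,NCP,request
"""


def top_talkers(csv_text, server_addr, limit=5):
    """Count packets sent to server_addr, grouped by source address."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        row["Source"] for row in reader if row["Destination"] == server_addr
    )
    # Busiest sources first.
    return counts.most_common(limit)


if __name__ == "__main__":
    for source, packets in top_talkers(SAMPLE_CSV, "10.0.0.5"):
        print(source, packets)
```

A workstation stuck in a busy loop should stand out at the top of this list by a wide margin, which tells you whose traffic to read in detail.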

Once you find the machine(s) that are pounding you with requests, you'll find
something like a no-longer-where-it-used-to-be print queue object reference
buried in the registry. You'll know what to look for, as it will be what the
machine is searching for in the packet trace.
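
If you can dump the suspect strings to text (for example, from a registry export), a small filter can flag anything that still points at one of the removed subcontainers. This is only a sketch under assumptions: the subcontainer names come from the tree shown earlier in the thread, while the object names, the "org" suffix, and the helper names are hypothetical.

```python
import re

# Subcontainers that used to sit under stu, per the tree earlier in the thread.
REMOVED_SUBCONTAINERS = ["a-c", "d-f", "g-i", "j-l", "m-o", "p-r", "s-t", "u-z"]


def stale_reference_pattern(parent, removed):
    """Build a regex matching '.<removed>.<parent>' as whole name components."""
    names = "|".join(re.escape(name) for name in removed)
    return re.compile(
        rf"\.(?:{names})\.{re.escape(parent)}(?:\.|$)", re.IGNORECASE
    )


def find_stale(values, parent="stu", removed=REMOVED_SUBCONTAINERS):
    """Return the values that still reference a removed subcontainer."""
    pattern = stale_reference_pattern(parent, removed)
    return [value for value in values if pattern.search(value)]


if __name__ == "__main__":
    # Hypothetical object names, for illustration only.
    candidates = ["Q1.a-c.stu.org", "jdoe.stu.org", "PRN1.fac.buf.org"]
    for hit in find_stale(candidates):
        print("stale reference:", hit)
```

The pattern only matches a removed name when it appears directly under the parent container, so objects that now live in stu itself are not flagged.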

>2. Have I now exceeded any design capabilities of 8.7.0.4 on a NW51
> network?

Nope. Got lots bigger here and it works fine.

---------------------------------------------------------------------------
David Gersic dgersic_@_niu.edu

I'm tired of receiving rubbish in my mailbox, so the E-mail address is
munged to foil the junkmail bots. Humans will figure it out on their own.

Bert Boleij

Jan 31, 2004, 5:17:29 AM

What about the patch level? Any differences there? Try to get them to
the same patch level...


Bert Boleij

Jan 31, 2004, 5:21:55 AM

Compression on a server with e.g. GroupWise is NOT advised by Novell,
and the same goes for BorderManager (BM)... BM should NOT use NSS volumes
either... Anything in that area?


al...@canisius.edu

Feb 2, 2004, 9:34:28 AM

Not using either one of those products. Stopped using the inventory
portion of ZfD32 SP1 this past fall.

al...@canisius.edu

Feb 2, 2004, 9:33:02 AM

Don't have the post-SP6 patches on this machine (only on 1 so far). Will
be updating this one tomorrow morning (soonest possible to schedule).