Non-paged pool quota exceeded (ERROR_NOT_ENOUGH_QUOTA)

Simon (Sly)

Mar 17, 2005, 8:37:11 AM

We’re experiencing problems due to exceeding the non-paged kernel memory
quota under Windows 2003. The problem has been traced to the creation of
kernel objects such as semaphores and events.

The application involved uses a single process, using a large number of
synchronisation objects (critical sections/condition variables implemented
using semaphores and events and read write locks). When approximately 2
million handles are in use (combinations of semaphores/events etc),
CreateSemaphore and CreateEvent start to fail with ERROR_NOT_ENOUGH_QUOTA.

I believe this relates to our non-paged per-process quota being exceeded.
The system non-paged pool bytes size as reported by performance monitor is
approximately 40MB when we get the out of quota error, significantly less
than the 256MB limit for 32bit Windows 2003
(http://support.microsoft.com/default.aspx?scid=kb;en-us;294418). This
implies that the calculated quota value for our process/user/system is too
low (if that’s the limit we are hitting).

My question now is: how can we alter this quota, and how can we determine
what it is currently set to? I can get the current non-paged pool usage value
for the process using GetProcessMemoryInfo, but this does not give us the
quota value itself. How do we establish the current quota value?

The registry has a NonPagedPoolQuota value
(http://www.microsoft.com/resources/documentation/Windows/2000/server/reskit/en-us/Default.asp?url=/resources/documentation/Windows/2000/server/reskit/en-us/regentry/29934.asp)
but it advises not changing that value. The same goes for the
NonPagedPoolSize value. The question here is: can we influence these values
to give us a larger quota? Should we go tinkering in here to solve it?

However, when this error occurs, it affects other processes and other users.
In one situation a different user (but within the same group as the user
running the out-of-quota process) couldn’t map a network drive (an
out-of-quota error was raised). For another user VC++ failed with a similar
error. This implies it is not a per-process problem; more a per-system/group
quota?

MSDN states
(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sysinfo/base/kernel_objects.asp)
there is a limit of ‘significantly less than 2^24’ kernel handles that can be
active at any one time, but I am unable to locate how significantly less it
may be. From empirical tests, the out of quota error occurs when
approximately 2 million handles (~2^21) are active. Is this significant
enough? ;)

There appears to be very little information on how these quotas are
dynamically calculated. Does anyone have any information on how we can
increase/disable these quotas? The system itself is essentially running a
single (primary) process, in terms of resource usage, so we’re not concerned
with equal resource usage over accounts etc.

If 2 million handles sounds like a lot (specifically within a single 32 bit
process), note we’re using AWE to allocate all core memory, and each object
within this core memory requires synchronisation objects to control access
(current read/write lock implementation uses two condition variables each,
which in turn use two handles for a semaphore and an event, leading to the 2
million handles in a short period of time).
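To make the handle arithmetic above concrete, here is a small sketch. The ratios (two condition variables per read/write lock, one semaphore plus one event per condition variable) come from the description above. That every object carries exactly one read/write lock is an assumption, and the constant and function names are purely illustrative.

```cpp
#include <cstdint>

// Ratios taken from the post: each read/write lock uses two condition
// variables, and each condition variable holds one semaphore and one event.
constexpr std::uint64_t kCondVarsPerRwLock = 2;
constexpr std::uint64_t kHandlesPerCondVar = 2;
constexpr std::uint64_t kHandlesPerRwLock  = kCondVarsPerRwLock * kHandlesPerCondVar;

// Assumption (not stated outright in the post): one read/write lock per object.
constexpr std::uint64_t handles_for_objects(std::uint64_t objects) {
    return objects * kHandlesPerRwLock;
}

// Inverse: how many locked objects a given handle budget can cover.
constexpr std::uint64_t objects_for_handles(std::uint64_t handles) {
    return handles / kHandlesPerRwLock;
}

// Four kernel handles per locked object.
static_assert(kHandlesPerRwLock == 4, "two condition variables x two handles");
// About 2 million handles therefore corresponds to about 500,000 objects.
static_assert(objects_for_handles(2000000) == 500000, "2M handles / 4");
```

Under those assumptions, roughly 500,000 locked objects are enough to reach the 2 million handle mark.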

Example code:

Something like the following:

while (true) HANDLE foo = CreateSemaphore (NULL, 0, 0x7fffffff, NULL);

i.e. just create a load of handles. This dies with ERROR_NOT_ENOUGH_QUOTA
after approximately 2 million calls.

On a slightly unrelated issue, can anyone explain the memory requirements
for CriticalSections? Creation of the actual CRITICAL_SECTION results in 4
bytes being allocated, but calling InitializeCriticalSection causes an
additional ~72 bytes in the process address space to be allocated. When
you’re using a lot of CriticalSections (millions), the usage starts to become
significant. Is there any more information on this?

For the record we’re using a single process, using AWE with /PAE and /3GB
switches. We’re not hitting any other memory limits when these quota errors
occur.

System:
Windows 2003 Enterprise Edition 32-bit (PAE enabled kernel)
ProLiant DL580 G2 (4x2.6Ghz Opteron - 16GB physical)

Any help on this matter would be greatly appreciated!

Thanks,

Simon

Slava M. Usov

Mar 17, 2005, 9:36:27 AM

"Simon (Sly)" <Simon (Sly)@discussions.microsoft.com> wrote in message
news:010D289F-DE6A-48E1...@microsoft.com...

[...]

> For the record we're using a single process, using AWE with /PAE and /3GB
> switches.

/3GB is just the wrong switch to use in your circumstances, especially with
/PAE. It limits both paged and non-paged pools very significantly.

You might tinker with the pool limits in the registry, but what you really
want to do is reduce the number of locks. Having that many just does not make
sense.

S

Simon (Sly)

Mar 17, 2005, 10:11:05 AM

Thanks for your reply. I did wonder whether the presence of /3GB was adding
to the limitation (less space for kernel objects etc), but linking the
application with /LARGEADDRESSAWARE:NO didn't seem to affect the point at
which it runs out of resources/quota. With that switch set, the process
behaves (as in only gives 2GB to user space) as if no /3GB switch was
specified. So is data associated with objects such as semaphores and events
stored in the kernel segments of a process' address space (i.e. the left over
1GB in this case)?

Do you know if just having the /3GB switch enabled, but running an
application that is not linked to use >2GB would still limit the
paged/non-paged address space? I'll experiment without it nonetheless to see
if it helps.

Could you point me towards any docs with any info on expected degradation of
non-paged/paged pool sizes when using /3GB?

Unfortunately we're seriously limited on memory usage (hence the need for
AWE), so we will start hitting some brick walls somewhat sooner without /3GB.
Business reasons prevent us from going to 64-bit in the near future (which is
what we really need!).

Point noted on the locks; we need that number of locks, but maybe
implementing them without so many handles will be more appropriate.

Looking at the process with memmonitor, it's the paged pool bytes that are
increasing (by ~20MB), not the non-paged pool bytes. Got my wires crossed on
that, but from your reply the limiting factor of /3GB still applies.

Thanks for your help so far,

Simon

Leo Havmøller

Mar 17, 2005, 10:29:18 AM

> The application involved uses a single process, using a large number of
> synchronisation objects (critical sections/condition variables implemented
> using semaphores and events and read write locks). When approximately 2
> million handles are in use (combinations of semaphores/events etc),
> CreateSemaphore and CreateEvent start to fail with ERROR_NOT_ENOUGH_QUOTA.

Why are you using a home-grown critical section implementation?
If it's a single process, why are you using semaphores and events?

IMHO if you need 2 million handles, your design is broken.

Leo Havmøller.

Simon (Sly)

Mar 17, 2005, 11:05:05 AM

Leo,

We're using a home-grown read/write lock implementation; I'm guessing this
is due to there being no native read/write lock or condition variables in the
SDK (not .NET; I may be wrong there as I didn't work on that area of the
application, but that's my understanding).

Semaphores and events are used in the implementation of condition variables,
which are used by read-write locks. Essentially it's a multi-threaded
application manipulating a single (large) in memory database. Locks are
essential to maintain consistency.

It's not the handles we need, more the synchronisation objects. It's a
matter of implementing read/write locks without semaphores/conditions, if
that's possible...

Thanks,

Simon

> Why are you using a home-grown critical section implementation?
> If it's a single process, why are you using semaphores and events?
>
> IMHO if you need 2 million handles, your design is broken.

Slava M. Usov

Mar 17, 2005, 11:53:51 AM

"Simon (Sly)" <Simo...@discussions.microsoft.com> wrote in message
news:6F3EC2CD-EDB6-410F...@microsoft.com...

[...]

> So is data associated with objects such as semaphores and events stored in
> the kernel segments of a process' address space (i.e. the left over 1GB in
> this case)?

Yes.

> Do you know if just having the /3GB switch enabled, but running an
> application that is not linked to use >2GB would still limit the
> paged/non-paged address space?

Yes. The 2/2 vs 3/1 repartition is global and is done early in the boot
sequence. /PAE, as I said, makes it even worse because /PAE consumes more
system address space for bookkeeping.

[...]

> Could you point me towards any docs with any info on expected degradation
> of non-paged/paged pool sizes when using /3GB?

Sorry, I tend to forget these things. I think that the non-paged pool is
always limited to 256M, and /3GB probably halves it. Which seems to
correlate with your observations: if we assume that one object needs 50 bytes
of non-paged kernel memory, then 2M objects will consume 100M.

S

Simon (Sly)

Mar 17, 2005, 3:49:02 PM

Okay thanks, it's all starting to make sense now. As you said, without /3GB
approximately twice the number of semaphores/events can be created.

I'm beginning to think it's more a limit associated with the number of
handles (or at least a resource that is connected to handles), rather than a
specific limit on paged/non-paged memory. It seems to use a similar amount of
paged memory when creating twice the number of kernel objects (without /3GB)
as it did with that switch, and still a fair way away from the max (obvs
don't know what our quota is though).

What I'm still unsure about is:

(a) if we're hitting a hard limit, wouldn't we expect to get an error
relating to out of handles/memory rather than not enough quota? (a quota
usually implies we can turn it up/off?)

(b) we seem to be using up all the handles/resources on the entire box
(breaks other users' sessions/processes), so is this a system-wide limit
rather than per-process?

(c) if above is true, is this limit something that's architecturally related
in any way to a 32bit implementation, or otherwise going to increase when
moving to 64bit Windows?

Thanks again!

Simon

Slava M. Usov

Mar 17, 2005, 4:45:16 PM

"Simon (Sly)" <Simo...@discussions.microsoft.com> wrote in message

news:E99A0703-EDD8-49B9...@microsoft.com...

[...]

> I'm beginning to think it's more a limit associated with the number of
> handles (or at least a resource that is connected to handles), rather than
> a specific limit on paged/non-paged memory.

Handles are just indices in process-specific handle tables that point to the
real objects. Each handle consumes 8 bytes of paged memory; for 2 million
objects it is not insignificant but far from being a decisive factor. The
underlying objects, on the other hand, consume more; and what they consume
is non-paged memory. Any synchronization object has, at a minimum, the
Object Manager prefix [undocumented], which is about 40 bytes and the
DISPATCHER_HEADER [see ntddk.h], 16 bytes; so, more than 50 bytes already.
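As a rough worked example of the estimate above, the sketch below multiplies the per-object figures out. The 40-byte and 16-byte sizes and the 8 bytes per handle are the approximate values quoted in this reply, and the 128 MB figure is only the guess made earlier in the thread that /3GB halves the 256 MB limit. Any allocator overhead is ignored, and all names are illustrative.

```cpp
#include <cstdint>

// Approximate sizes quoted in the reply above (not measured values).
constexpr std::uint64_t kObjectHeaderBytes     = 40;  // Object Manager prefix, "about 40 bytes"
constexpr std::uint64_t kDispatcherHeaderBytes = 16;  // DISPATCHER_HEADER
constexpr std::uint64_t kHandleEntryBytes      = 8;   // handle table entry, paged memory

// Non-paged pool consumed by the objects themselves.
constexpr std::uint64_t nonpaged_bytes(std::uint64_t objects) {
    return objects * (kObjectHeaderBytes + kDispatcherHeaderBytes);
}

// Paged memory consumed by the handle table entries.
constexpr std::uint64_t paged_handle_bytes(std::uint64_t handles) {
    return handles * kHandleEntryBytes;
}

constexpr std::uint64_t kMiB = 1024 * 1024;
constexpr std::uint64_t kNonPagedLimit        = 256 * kMiB;          // 32-bit limit cited in the thread
constexpr std::uint64_t kNonPagedLimitWith3GB = kNonPagedLimit / 2;  // assumption: /3GB halves it

// 2 million objects at 56 bytes each.
static_assert(nonpaged_bytes(2000000) == 112000000, "56 bytes x 2M objects");
// 2 million handle entries at 8 bytes each.
static_assert(paged_handle_bytes(2000000) == 16000000, "8 bytes x 2M handles");
// Share of the (assumed) halved non-paged pool, as a whole percentage.
static_assert(nonpaged_bytes(2000000) * 100 / kNonPagedLimitWith3GB == 83, "about 83 percent");
```

On these figures, 2 million objects would need about 112 MB of non-paged pool, around 83% of a halved 128 MB pool, before anything else on the system is counted.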

> It seems to use a similar amount of paged memory

Paged memory is not the limiting factor here.

[...]

> (a) if we're hitting a hard limit, wouldn't we expect to get an error
> relating to out of handles/memory rather than not enough quota? (a quota
> usually implies we can turn it up/off?)

There are process quotas as well, which are configured through the
NonPagedPoolQuota registry setting. You may indeed hit this limitation
first.

> (b) we seem to be using up all the handles/resources on the entire box
> (breaks other users' sessions/processes), so is this a system-wide limit
> rather than per-process?

It may be that the process quota that you're hitting is pretty close to or
even less than the entire available non-paged pool. Which is I think what
actually happens.

> (c) if above is true, is this limit something that's architecturally
> related in any way to a 32bit implementation, or otherwise going to
> increase when moving to 64bit Windows?

Definitely. Have a look here:

http://support.microsoft.com/default.aspx?scid=kb;EN-US;q294418

Just one line from that page:

Non-paged pool 128 GB 256 MB

The right-most value is that of 2K3 32 bit, the one in the middle of 2K3 64
bit.

Again, you ought to consider reducing the number of locks. You mentioned
that your application is an in-memory database, but perhaps you should
recall that RDBMSs normally escalate row locks to page or even table locks
when there are too many of them.

S

Ivan Brugiolo [MSFT]

Mar 17, 2005, 4:52:27 PM

If you have a kernel debugger, or even a local kernel debugger,
you can use `!vm` to check on the status of the
paged-pool, non-paged-pool, session-pool, system-pte, etc.

Open handles to kernel objects do hold onto non-paged-pool.
The number of handles in a process can be limited by a Job-Object setting,
please check that one as well.

To investigate pool exhaustion, you can use the
`!poolused 2` debugger extension command in KD.

If you cannot interpret the results, post the output and we can help.

Empirical evidence shows that a few tens of thousands
of simple kernel objects (events, semaphores)
can be created without any problem.
An event consumes a handful of bytes of non-paged-pool, while
sophisticated objects such as Files or Sockets consume way more.

Did you check for a handle leak in your process?
You can use `!handle` and/or `!htrace` and/or oh.exe to investigate those.

Non-paged pool is a scarce and limited resource, for which there is an upper
limit.
IA64/AMD64 architectures move that limit way higher than what it is today on
x86.

--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of any included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm

Simon (Sly)

Mar 18, 2005, 5:01:02 AM

Okay the numbers are starting to add up! With the 50 bytes you mentioned I
can see how we exhaust the non-paged pool so quickly (especially with /3GB
user/kernel division switch).

Point noted on the quota; I think you're right on the system hitting a
non-paged pool max rather than just hitting a quota.

I'm with you on the fact that it is the non-paged pool filling up, but
performance monitor definitely shows the paged bytes value increasing (for
both the process and the system)?

Before:

Memory:
Pool Nonpaged Bytes   20701184
Pool Paged Bytes      17297408

After saturation (whilst running):

Memory:
Pool Nonpaged Bytes   20701184
Pool Paged Bytes      54206464

Process:
Pool Nonpaged Bytes   1080
Pool Paged Bytes      34953988

Maybe I'm mis-interpreting these results?

Point noted on the locks, I was maybe using the term 'database' out of
context. Essentially there are a large number of objects (2-12kb) indexed by
a single key; it's more a lump of indexed data rather than a sequence of
tables etc. Multiple threads then perform updates to individual objects in an
ad-hoc manner (either read or write); there's no relation/correlation between
objects so unfortunately we can't lock at a higher level. However, there's
scope to modify the behaviour of the locks I think.

Thanks again for your input,

Simon

Slava M. Usov

Mar 18, 2005, 7:41:18 AM

"Simon (Sly)" <Simo...@discussions.microsoft.com> wrote in message

news:6E5E92D2-DD1B-4748...@microsoft.com...

[...]

> I'm with you on the fact that it is the non-paged pool filling up, but
> performance monitor definitely shows the paged bytes value increasing (for
> both the process and the system)?

Hmm, no explanation. It is either paged or non-paged pool, then. /3GB hits
them both, though. I would follow Ivan's advice and have a look at the pool
details with the kernel debugger.

[...]

> Essentially there are a large number of objects (2-12kb) indexed
> by a single key; it's more a lump of indexed data rather than a sequence
> of tables etc. Multiple threads then perform updates to individual objects
> in an ad-hoc manner (either read or write); there's no
> relation/correlation between objects so unfortunately we can't lock at a
> higher level. However, there's scope to modify the behaviour of the locks
> I think.

You could have two dozen "global" locks in a table. When an object must
be locked, compute a hash function of that object -- if your key is numeric
you can just do "object_key modulo number_of_locks" -- or "object_ptr modulo
..." -- and lock the corresponding global lock. Because you clearly cannot
have thousands or, indeed, hundreds of threads simultaneously active, this
locking scheme should be sufficient.
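The lock-table idea described above could be sketched roughly as follows. This is a portable C++17 illustration, not the application's code: `StripedLocks`, `with_shared` and `with_exclusive` are hypothetical names, `std::shared_mutex` stands in for whatever read/write lock the application already has, and the default of 24 stripes simply mirrors the "two dozen" suggested above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical helper: a small fixed table of reader/writer locks that is
// shared by all objects, instead of one lock (and its handles) per object.
class StripedLocks {
public:
    explicit StripedLocks(std::size_t stripe_count = 24) : locks_(stripe_count) {
        assert(stripe_count > 0 && "need at least one lock");
    }

    // "object_key modulo number_of_locks", as suggested in the post.
    std::size_t stripe_of(std::uint64_t object_key) const {
        return static_cast<std::size_t>(object_key % locks_.size());
    }

    // Every object whose key maps to the same stripe shares this lock.
    std::shared_mutex& lock_for(std::uint64_t object_key) {
        return locks_[stripe_of(object_key)];
    }

private:
    std::vector<std::shared_mutex> locks_;
};

// Run fn while holding the object's stripe in shared (reader) mode.
template <typename Fn>
auto with_shared(StripedLocks& table, std::uint64_t key, Fn&& fn) -> decltype(fn()) {
    std::shared_lock<std::shared_mutex> guard(table.lock_for(key));
    return fn();
}

// Run fn while holding the object's stripe in exclusive (writer) mode.
template <typename Fn>
auto with_exclusive(StripedLocks& table, std::uint64_t key, Fn&& fn) -> decltype(fn()) {
    std::unique_lock<std::shared_mutex> guard(table.lock_for(key));
    return fn();
}
```

A caller would wrap each object access, for example `with_exclusive(table, key, [&] { /* update object */ });`. The trade-off is that unrelated objects hashing to the same stripe contend with each other, and a thread must not take a second stripe while holding one unless a consistent lock order is enforced.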

If operations on your objects are fast, then you might even have just one RW
lock and be done with that.

S

m

Mar 18, 2005, 6:18:40 PM

FYI:

You can also implement a 'lock' system using only InterlockedXXX and Sleep
that will not use any non-paged memory
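A minimal sketch of that idea, assuming a simple reader/writer lock is what is wanted. To keep it portable, `std::atomic` compare-and-swap is used here in place of `InterlockedCompareExchange`, and `std::this_thread::sleep_for` in place of `Sleep`; the class name `SpinSleepRwLock` and the 1 ms back-off are illustrative choices, not anything from the thread.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical reader/writer lock held entirely in one user-mode word:
// state_ > 0 is the number of readers, 0 is free, -1 means one writer.
// No kernel object (and so no handle) is created per lock.
class SpinSleepRwLock {
public:
    bool try_lock_shared() {
        std::int32_t seen = state_.load(std::memory_order_relaxed);
        while (seen >= 0) {  // no writer present: try to add one reader
            if (state_.compare_exchange_weak(seen, seen + 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed)) {
                return true;
            }
            // On failure, seen now holds the current value; re-check it.
        }
        return false;
    }

    bool try_lock() {
        std::int32_t expected = 0;  // writer needs the lock completely free
        return state_.compare_exchange_strong(expected, -1,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed);
    }

    void lock_shared() { while (!try_lock_shared()) back_off(); }
    void lock()        { while (!try_lock()) back_off(); }

    void unlock_shared() {
        std::int32_t prev = state_.fetch_sub(1, std::memory_order_release);
        assert(prev > 0 && "unlock_shared without a reader");
        (void)prev;
    }

    void unlock() {
        std::int32_t prev = state_.exchange(0, std::memory_order_release);
        assert(prev == -1 && "unlock without a writer");
        (void)prev;
    }

private:
    // Stand-in for Sleep(1): give up the CPU instead of spinning hot.
    static void back_off() {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    std::atomic<std::int32_t> state_{0};
};
```

Because the member names follow the standard lock interface, the class can be used with `std::unique_lock` and `std::shared_lock`. The cost of this design is latency and fairness: waiters poll rather than being woken, and a steady stream of readers can starve a writer, so it suits short critical sections with few contending threads.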

"Slava M. Usov" <stripit...@gmx.net> wrote in message
news:euT#5f7KFH...@TK2MSFTNGP12.phx.gbl...

Simon (Sly)

Mar 21, 2005, 5:35:04 AM

Slava,

Yep, I'll have a look with the kernel debugger to see what's going on.

Indeed, we typically only have 10-15 threads (mixture of readers/writers)
operating on the data at any one time, so your suggestion of a set of global
locks could be used (need to do some experimenting).

The system has recently been ported from a 64-bit UltraSparc III Solaris
platform, hence the current locking implementation inherited from that. It
looks like the suggested (business driven) approach will be simply to reduce
the maximum number of objects we can hold at any one time, which will see us
through until we can go over to Win64. However, it may still bite us if we
run two instances of the application (each with a smaller database/cache) as
the pool usage issues appear to be system wide.

The InterlockedXXX operations mentioned by 'm' look like an interesting
lead; we could use them for the semaphore implementation and hence (roughly)
halve the usage in the pool, so that's something to try.

Thanks all for your help!

Simon

Simon (Sly)

Mar 21, 2005, 5:41:06 AM

Ivan,

Excuse my ignorance, but do I need to get the DDK to get the kernel
debugger, or can it be obtained as a separate application (or resource kit)?

I'll definitely dig around to see what's going on with the pool usage.

It's not an issue of resource leak, we can account for all of them, just
there are a lot! The system has been ported from Solaris, so it is looking as
if a like-for-like locking implementation is not suitable.

Thanks for your help,

Simon

Ivan Brugiolo [MSFT]

Mar 21, 2005, 12:25:19 PM

The host part of the kernel debugger is a free download.
For example, this link should work:
http://www.microsoft.com/whdc/devtools/debugging/installx86.mspx
Make sure you understand the symbols set-up part.
The target part is built into the operating system.

