MmAllocateContiguousMemory and non-paged pool size

mirage2k2

Mar 2, 2010, 11:32:01 PM

The documentation on MmAllocateContiguousMemory says that it attempts to
allocate memory from the non-paged pool, and if this fails then "it attempts
to perform the allocation from available unused pages".

Firstly, what on earth does that mean? Secondly, will the memory "from
available unused pages" still be non-paged? I will be accessing it at
DISPATCH_LEVEL so I need to be certain.

My guess is that the memory available in "unused pages" is somehow mapped
into the non-paged pool ... thus increasing the size of the pool.

My next question is about the size of the non-paged pool. I've read a few
articles that suggest that there is a limit of 250MB, irrespective of total
RAM. Does anyone know if there is a way of changing the configuration of
the system to make this value higher, i.e. some weird registry setting or
some system API call.

Basically I need to allocate between 32MB - 64MB of non-paged memory, at any
given time, and I need it to work. Allocating the memory in DriverEntry at
startup is probably my best shot but there will be times when I don't need to
allocate anything at all ... I don't want to allocate 32MB and then never use
it ... or hog it until I do need it.

Does anyone have any ideas?
Thanks
Mirage2k

Tim Roberts

Mar 3, 2010, 1:13:34 AM

mirage2k2 <mira...@discussions.microsoft.com> wrote:
>
>The documentation on MmAllocateContiguousMemory says that it attempts to
>allocate memory from the non-paged pool, and if this fails then "it attempts
>to perform the allocation from available unused pages".
>
>Firstly, what on earth does that mean? Secondly, will the memory "from
>available unused pages" still be non-paged? I will be accessing it at
>DISPATCH_LEVEL so I need to be certain.

The current documentation does not make that statement. Yes,
MmAllocateContiguousMemory will always allocate non-paged memory.

>My next question is about the size of the non-paged pool. I've read a few
>articles that suggest that there is a limit of 250MB, irrespective of total
>RAM.

Depends on the operating system. In some versions, the total amount
depends on the size of the RAM, but the upper limit is approximately what
you describe.

>Does anyone know if there is a way of changing the configuration of
>the system to make this value higher, i.e. some weird registry setting or
>some system API call.

Nope.

>Basically I need to allocate between 32MB - 64MB of non-paged memory, at any
>given time, and I need it to work. Allocating the memory in DriverEntry at
>startup is probably my best shot but there will be times when I don't need to
>allocate anything at all ... I don't want to allocate 32MB and then never use
>it ... or hog it until I do need it.

If you absolutely need it, then allocate it in DriverEntry and fail the
call if you can't get it. Physical memory gets more and more fragmented as
the system continues to run. It doesn't take very long for chaos to take
over, so that dozens of megabytes are simply not available.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

David Schwartz

Mar 3, 2010, 2:57:42 AM

On Mar 2, 8:32 pm, mirage2k2 <mirage...@discussions.microsoft.com>
wrote:

> The documentation on MmAllocateContiguousMemory says that it attempts to
> allocate memory from the non-paged pool, and if this fails then "it attempts
> to perform the allocation from available unused pages".
>
> Firstly, what on earth does that mean?

Exactly what it says. The non-paged pool is tried first, but if it's
empty, unused pages are used.

> Secondly, will the memory "from
> available unused pages" still be non-paged? I will be accessing it at
> DISPATCH_LEVEL so I need to be certain.

Yes.

> My guess is that the memory available in "unused pages" is somehow mapped
> into the non-paged pool ... thus increasing the size of the pool.

Nope. Because the memory is allocated directly from unused pages, it
never technically becomes part of the non-paged pool. But the logical
effect is the same as if the non-paged pool were grown and then the
pages were allocated from it.

> My next question is about the size of the non-paged pool. I've read a few
> articles that suggest that there is a limit of 250MB, irrespective of total
> RAM. Does anyone know if there is a way of changing the configuration of
> the system to make this value higher, i.e. some weird registry setting or
> some system API call.

http://blogs.technet.com/markrussinovich/archive/2009/03/26/3211216.aspx
It's not 250MB any more.

> Basically I need to allocate between 32MB - 64MB of non-paged memory, at any
> given time, and I need it to work. Allocating the memory in DriverEntry at
> startup is probably my best shot but there will be times when I don't need to
> allocate anything at all ... I don't want to allocate 32MB and then never use
> it ... or hog it until I do need it.

I would strongly advise you to hog the whole 64MB immediately and
never let it go. Physical memory tends to get fragmented and the odds
of being able to allocate 64MB of contiguous physical memory later on
are not good.

DS

Maxim S. Shatskih

Mar 3, 2010, 6:18:34 AM

> The documentation on MmAllocateContiguousMemory

Forget this call altogether.

The only need for _contiguous_ memory is DMA, and for DMA you must use ->AllocateCommonBuffer.

MmAllocateContiguousMemory is provided only so that DMA adapter object implementations can implement ->AllocateCommonBuffer (which usually just calls MmAllocateContiguousMemory with the proper parameters).

This is not a general memory allocation routine.

--
Maxim S. Shatskih
Windows DDK MVP
ma...@storagecraft.com
http://www.storagecraft.com

alberto

Mar 3, 2010, 1:57:43 PM

I would just read http://msdn.microsoft.com/en-us/library/ms801986.aspx,
which has all the information you need. Here's my experience:

1. Your only guarantee is that the space is both physically and
logically contiguous,
2. Once you own the physical memory, you can always remap it any way
you want,
3. Sometimes you may get away by fixing the user buffer to physical
memory, mapping it into kernel space, and DMAing directly from it,
4. I may be wrong, but I wouldn't swear by the statement that
nonpaged pool allocations larger than one page are physically
contiguous,
5. I don't know about the nonpaged pool, but my driver sometimes
allocates hundreds of megabytes of contiguous physical memory without
a problem,
6. If you don't grab that memory at start time, chances are that you
won't get it after a long enough period of time,
7. Grab your physical memory at start time, not at DriverEntry time,
8. You may run out of kernel-side address space before you run out
of physical memory.

What I do is, I grab a sizeable chunk of physically contiguous memory
at driver start time, and I suballocate space from it as my DMA needs
it. I do it at start time, when I handle the Start IRP; doing it at
DriverEntry time may be too early, because you may need data from your
hardware to figure out how large a buffer pool you will need: for
example, I need more memory if I have a quad-core board than if I only
have a one-core board, and I need even more if I have two, four, or
even six boards in the system; and that information isn't available
until start time.
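The "grab one big chunk, then suballocate from it" approach can be sketched roughly as follows. This is a user-mode model with fixed-size slots, not driver code: the names chunk_pool, pool_init, pool_alloc, pool_free and the POOL_SLOTS constant are hypothetical, and a real driver would also need a spin lock around the bitmask because the pool would be touched at DISPATCH_LEVEL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 64u  /* hypothetical: number of equal-sized slots */

typedef struct {
    uint8_t *base;       /* start of the big chunk reserved at start time */
    size_t   slot_size;  /* bytes per slot */
    uint64_t used_mask;  /* bit i set = slot i is in use */
} chunk_pool;

/* Carve an already-reserved chunk into POOL_SLOTS equal slots. */
void pool_init(chunk_pool *p, void *chunk, size_t chunk_bytes)
{
    assert(p != NULL && chunk != NULL && chunk_bytes >= POOL_SLOTS);
    p->base = (uint8_t *)chunk;
    p->slot_size = chunk_bytes / POOL_SLOTS;
    p->used_mask = 0;
}

/* Hand out the lowest free slot, or NULL when the chunk is exhausted. */
void *pool_alloc(chunk_pool *p)
{
    for (unsigned i = 0; i < POOL_SLOTS; i++) {
        uint64_t bit = (uint64_t)1 << i;
        if ((p->used_mask & bit) == 0) {
            p->used_mask |= bit;
            return p->base + (size_t)i * p->slot_size;
        }
    }
    return NULL;
}

/* Return a slot to the pool; the big chunk itself is never released. */
void pool_free(chunk_pool *p, void *slot)
{
    size_t index = (size_t)((uint8_t *)slot - p->base) / p->slot_size;
    assert(index < POOL_SLOTS);
    p->used_mask &= ~((uint64_t)1 << index);
}
```

Because the chunk is obtained once and only ever subdivided, later requests never go back to the system allocator, so they cannot fail due to physical fragmentation; they can only fail when the chunk itself is full.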

One thing you must watch out for: sometimes you run out of address
space before you run out of physical memory, especially on 32-bit in a
/3GB configuration. For example, I sometimes do DMA directly from user
space, when the user buffer is well-behaved enough; if I map user
space to a kernel-side address range, my kernel-side address
space can get pretty crowded. Sometimes your call to
MmAllocateContiguousMemory fails not because you've run out of memory,
but because you've run out of page slots!

Also, I use MmAllocateContiguousMemorySpecifyCache: it gives you more
flexibility, since you may not need your memory to be cached.

One more thing: if you can afford it, put a Registry entry or two in
your driver, so that you can configure it without having to rebuild.

Hope this helps,

Alberto.

Maxim S. Shatskih

Mar 3, 2010, 2:22:28 PM

>Also, I use MmAllocateContiguousMemorySpecifyCache: gives you more
>flexibility, you may not need that your memory is cached.

->AllocateCommonBuffer is the proper way.

On x86 and x64, there is no need for uncached DMA memory.

mirage2k2

Mar 3, 2010, 5:04:01 PM

I should probably tell you a bit more about my driver: it is an NDIS
intermediate driver, and I do not use DMA or deal directly with any hardware. I
need to allocate large chunks of contiguous memory that will be accessed at
DISPATCH_LEVEL. It must be completely contiguous. I don't necessarily need
one 64MB chunk but I might need 4 * 16MB ... and not necessarily all at once
... what I really need is a failsafe way of getting chunks of at least 16MB.
I'm guessing that after a few days of system up-time even this might fail.

I'm currently using NdisAllocateMemoryWithTag() but I'm considering changing
this to MmAllocateContiguousMemory() because it sounds like this function is
far more likely to succeed. I don't care at all if this function is special
and should only be used for DMA ... if it works I will use it.

I don't like the idea of allocating 32MB-64MB at startup because I don't want
to hog it. If the system only has 128MB of non-paged pool I can't just take
half!

Mirage2k2

Maxim S. Shatskih

Mar 3, 2010, 5:14:39 PM

>I should probably tell you a bit more about my driver, it is an ndis
> intermediate driver, I do not use DMA or deal directly with any hardware.

Then you do not need contiguous memory.

>I need to allocate large chunks of contiguous memory that will be accessed at
> DISPATCH_LEVEL.

No, you need to allocate just _large chunks of nonpaged memory_, regardless of whether it is contiguous or not.

>It must be completely contiguous.

No, it must not.

Your driver is hardware-less, so it does not matter which physical pages underlie its allocated region.

> I'm currently using NdisAllocateMemoryWithTag() but I'm considering changing
> this to MmAllocateContiguousMemory() because it sounds like this function is
> far more likely to succeed.

Just plain vice versa. MmAllocateContiguousMemory will fail if the physical memory pages are present but not contiguous. NdisAllocateMemoryWithTag (==ExAllocatePoolWithTag) will succeed in this case.
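The difference between the two kinds of request can be illustrated with a toy model in user-mode C. It treats physical memory as an array of per-page "free" flags; this is only an illustration and not how the Windows memory manager is implemented, and the names can_alloc_scattered and can_alloc_contiguous are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of physical memory: one flag per page frame, true = free. */

/* Scattered request (pool-style): succeeds if at least `want` frames
 * are free anywhere, because page tables can stitch them together. */
bool can_alloc_scattered(const bool *frame_free, size_t nframes, size_t want)
{
    size_t free_count = 0;
    assert(frame_free != NULL);
    for (size_t i = 0; i < nframes; i++) {
        if (frame_free[i]) {
            free_count++;
        }
    }
    return free_count >= want;
}

/* Contiguous request: succeeds only if `want` free frames are adjacent. */
bool can_alloc_contiguous(const bool *frame_free, size_t nframes, size_t want)
{
    size_t run = 0;
    assert(frame_free != NULL);
    if (want == 0) {
        return true;
    }
    for (size_t i = 0; i < nframes; i++) {
        run = frame_free[i] ? run + 1 : 0;  /* a used frame breaks the run */
        if (run >= want) {
            return true;
        }
    }
    return false;
}
```

With every other frame in use, half of memory is free, yet a contiguous request for just two frames fails while a scattered request for the same amount succeeds. That is the situation a long-running system tends toward.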

Doron Holan [MSFT]

Mar 3, 2010, 6:24:30 PM

Why does the memory need to be physically contiguous if you are a
software-only driver? It will be virtually contiguous, obviously. Why is that
not enough?

--

This posting is provided "AS IS" with no warranties, and confers no rights.

mirage2k2

Mar 3, 2010, 7:19:01 PM

To be honest, I don't really know the difference between physically
contiguous and virtually contiguous ... I guess virtually contiguous is
fragmented blocks that are somehow made to look like one big block.

Anyway, I'm using the memory as a packet data history buffer ... I use
offsets into it, I search for recurring patterns in it, I might also need to
iterate all the way through it (by incrementing a UCHAR pointer) ... to me
this sounds like it needs to be physically contiguous. I also need to do all
of this in my packet receive/send handlers which run @ DISPATCH_LEVEL.

Any comments?
Mirage2k.

Maxim S. Shatskih

Mar 4, 2010, 2:11:29 AM

> To be honest, I don't really know the difference between physically
> contiguous and virtually contiguous ...

Any allocation call (except MmAllocatePagesForMdl which does not allocate virtual addresses at all) will give you virtually contiguous memory.

Physically contiguous means that the underlying physical pages are also contiguous. This is only important for DMA, since no other code (except DMA and MM's internals) even cares about the underlying physical addresses.

And yes, MmAllocateContiguousMemory is provided for _DMA adapter object implementors_ to implement ->AllocateCommonBuffer. The DMA-capable drivers must use ->AllocateCommonBuffer.

>I guess virtually contiguous is
> fragmented blocks that are somehow made to look like one big block.

Do you know what physical and virtual addresses are?

Virtual addresses are the pointers, or address values in the machine code.

Physical addresses are the addresses placed on the CPU's physical address bus.

The CPU has the page tables to perform transparent automatic translations virtual -> physical. These page tables are maintained by the OS's MM, and there are APIs which you can use to govern this translation.

This is how per-process address spaces are implemented in the OS, and also some other things like MmMapLockedPages/MmGetSystemAddressForMdlSafe.

> Anyway, I'm using the memory as a packet data history buffer ... I use
> offsets into it, I search for recurring patterns in it, I might also need to
> iterate all the way through it (by incrementing a UCHAR pointer) ...

Bad idea. Implement a parser state machine instead, as GNU grep or sed do (see the sources).

> this sounds like it needs to be physically contiguous.

No.

mirage2k2

Mar 4, 2010, 6:38:01 AM

I get what you are saying about virtual and physical contiguous memory. So
what function should I call to allocate virtual contiguous memory that can be
accessed at DISPATCH_LEVEL? I'm hoping that the answer is not a function
that allocates from non-paged pool ... because then I am back to square 1 ...
there just isn't enough of it.

Mirage2k2

mirage2k2

Mar 4, 2010, 7:33:01 AM

The help for MmAllocateContiguousMemory says that it will try non-paged pool
first and if the allocation fails then it will allocate from unused pages
(physical I guess). So why do you think that MmAllocateContiguousMemory
could fail in a situation where NdisAllocateMemoryWithTag would not?

My requirement is large allocations of memory that cannot be paged out, so
if MmAllocateContiguousMemory can get me these allocations what is there for
me to worry about?

Is there another alternative ... can I allocate from paged memory and then
call some memory manager function to get the memory paged in and locked in?

Mirage2k2

Don Burn

Mar 4, 2010, 7:49:49 AM

MmAllocateContiguousMemory will fail because the physical memory needs to
be contiguous! The odds of getting a contiguous block of memory of
size X are much lower than the odds of getting X worth of random pages. You
should never go near MmAllocateContiguousMemory unless your adapter needs it,
because if you do you are taking memory the next device may need.

As for non-paged versus paged: on Windows 7 it won't make a difference;
on earlier OSes there are limits to the pools. If you need to go to that
extreme, look at other approaches.

Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

Scott Noone

Mar 4, 2010, 11:37:14 AM

> My requirement is large allocations of memory that cannot be paged out, so
> if MmAllocateContiguousMemory can get me these allocations what is there
> for
> me to worry about?

The fact that you're asking for physically contiguous memory, which is more
likely to fail.

> Is there another alternative ... can I allocate from paged memory and then
> call some memory manager function to get the memory paged in and locked
> in?

Allocating from paged pool and locking the memory down via an MDL works
(IoAllocateMdl, MmProbeAndLockPages).

You might also want to look into MmAllocatePagesForMdl and then mapping the
MDL in smaller chunks as you need them. That would on the surface appear to
be a more conservative approach since you wouldn't be hammering the general
use pools and you wouldn't be asking the Mm for contiguous memory. You would
have to worry about the resources consumed by mapping the MDL into the
kernel virtual address space though (system PTEs, namely) and the time
needed to build/teardown the mappings (which can be expensive if you're
doing it often).

In the end, you're going to need to strike a nice balance between
performance and resource consumption. This is the sort of thing that makes
driver writing hard (and fun!).

-scott

--
Scott Noone
Consulting Associate
OSR Open Systems Resources, Inc.
http://www.osronline.com

Maxim S. Shatskih

Mar 4, 2010, 2:18:35 PM

> The help for MmAllocateContiguousMemory says that it will try non-paged pool
> first and if the allocation fails then it will allocate from unused pages
> (physical I guess).

Nonpaged pool itself will try unused pages :-)

>So why do you think that MmAllocateContiguousMemory
> could fail in a situation where NdisAllocateMemoryWithTag would not?

Because NPP has relaxed requirement of not being physically contiguous.

On a loaded machine, you can only call MmAllocateContiguousMemory successfully for a large amount _at boot_.

> if MmAllocateContiguousMemory can get me these allocations

It will not give you this under load.

> Is there another alternative ... can I allocate from paged memory and then
> call some memory manager function to get the memory paged in and locked in?

Yes.

mirage2k2

Mar 4, 2010, 5:54:01 PM

Thanks everybody for all your help. So it seems that the best option for me
is to use paged memory and then use mm functions to page it in and lock it
down so that it can never be paged out. Does anyone see a reason why this
will not work? Also, is it definitely ok to access paged memory at
DISPATCH_LEVEL ... provided that it is paged in and locked down?

Mirage2k2.

m

Mar 4, 2010, 8:16:03 PM

None at all - but the better question is why you want to scan a pattern
buffer at DISPATCH! If you find something bad, then, since the upper level
has already received some of it, your response, in general, has no different
efficacy than if you were scanning on a passive thread / work item, but the
scanning has much less impact on overall system perf.

mirage2k2

Mar 4, 2010, 9:08:01 PM

I need to decompress a compressed inbound packet in my receive packet handler
... it runs at DISPATCH_LEVEL. The packet cannot be passed up the network
stack until it has been decompressed. Deferring the handling of the packet
to a system thread that runs at PASSIVE_LEVEL will have too much of an impact
on the packet flow.

Maxim S. Shatskih

Mar 5, 2010, 3:47:47 AM

> will not work? Also, is it definitely ok to access paged memory at
> DISPATCH_LEVEL ... provided that it is paged in and locked down?

Yes, but you cannot lock at DISPATCH_LEVEL.

Maxim S. Shatskih

Mar 5, 2010, 3:48:59 AM

>I need to decompress a compressed inbound packet in my receive packet handler

Then why do you need these huge megabytes?

Allocate the space enough for, say, 100 packets - i.e. 150KB.

mirage2k2

Mar 5, 2010, 7:16:02 AM

the huge megabytes are for packet history, the larger the history the better
the compression. Anyway, thanks for all your help. I now do this and it
works ok ...

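PMDL mdl;
UCHAR *testPagedMem;

/* Run this at PASSIVE_LEVEL; the pages cannot be locked at DISPATCH_LEVEL. */
testPagedMem = (UCHAR *)ExAllocatePoolWithTag(PagedPool, MEM_SIZE, 'aaa1');
if (testPagedMem == NULL)
{
    return;
}

mdl = IoAllocateMdl(testPagedMem, MEM_SIZE, FALSE, FALSE, NULL);
if (mdl == NULL)
{
    ExFreePoolWithTag(testPagedMem, 'aaa1');  /* do not leak the buffer */
    return;
}

/* MmProbeAndLockPages raises an exception on failure, so guard the call. */
__try
{
    MmProbeAndLockPages(mdl, KernelMode, IoWriteAccess);
}
__except (EXCEPTION_EXECUTE_HANDLER)
{
    IoFreeMdl(mdl);
    ExFreePoolWithTag(testPagedMem, 'aaa1');
    return;
}
/* On teardown: MmUnlockPages(mdl), IoFreeMdl(mdl), then free the pool. */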

Mirage2k

Aram Hăvărneanu

Mar 5, 2010, 8:07:20 AM

"mirage2k2" <mira...@discussions.microsoft.com> wrote in message

news:22D262D5-0AFB-4DE3...@microsoft.com...

> the huge megabytes are for packet history, the larger the history the
> better
> the compression.

True, but hundreds of megabytes?!? Not to mention increasing compression
cost (CPU time) with packet history...

I estimate keeping track of more than 1,000 is useless.

--
Aram Hăvărneanu

Maxim S. Shatskih

Mar 5, 2010, 9:01:07 AM

> the huge megabytes are for packet history, the larger the history the better
> the compression.

This is true for PPM algorithms only (all other algorithms are fine with a <1M buffer), and sorry, but PPM is so slow that offloading to a worker thread will not cost you anything significant.

Also, I cannot understand what you are trying to implement. Is it an implementation of some RFC, or some design of your own?

If the latter, then I would write a WinSock LSP in user mode which will compress all per-socket data flow. Much simpler.

Probably not even an LSP, but an explicit tunnel tool like ZeeBeeDee. This would be the simplest solution.

Deep compression on _network_ level is a major problem. For instance, some packet can disappear _forever_, and the retransmitted packet will not save you - it is different. Also, the packets can be reordered, and so on.
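Why a lost packet is fatal for history-based compression can be shown with a deliberately tiny model in user-mode C. Here the "history" is just the previous packet and the "compression" is an XOR against it. This is not the poster's library or any real algorithm, and the names history, hist_encode, hist_decode and PKT_LEN are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PKT_LEN 4u  /* tiny packets keep the example readable */

/* One side's history: just the previous packet it processed. */
typedef struct {
    uint8_t prev[PKT_LEN];
} history;

/* Sender: put (packet XOR history) on the wire, then remember the packet. */
void hist_encode(history *h, const uint8_t *plain, uint8_t *wire)
{
    assert(h != NULL && plain != NULL && wire != NULL);
    for (size_t i = 0; i < PKT_LEN; i++) {
        wire[i] = plain[i] ^ h->prev[i];
    }
    memcpy(h->prev, plain, PKT_LEN);
}

/* Receiver: undo the XOR with its own history, then remember the result. */
void hist_decode(history *h, const uint8_t *wire, uint8_t *plain)
{
    assert(h != NULL && plain != NULL && wire != NULL);
    for (size_t i = 0; i < PKT_LEN; i++) {
        plain[i] = wire[i] ^ h->prev[i];
    }
    memcpy(h->prev, plain, PKT_LEN);
}
```

If the sender encodes packets 1, 2 and 3 but the receiver only sees 1 and 3, packet 3 is decoded against the wrong history and comes out corrupted. Every later packet is wrong as well, because the bad output becomes the receiver's new history. Retransmission by a higher layer does not help, since the retransmitted data is encoded against yet another sender state.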

Don Burn

Mar 5, 2010, 9:06:40 AM

So you think it is acceptable to take most of the memory that can be
allocated in the kernel for your driver to keep packet history? Remember,
in most systems out there you can have 128MB total of non-paged pool and
256MB of paged pool. Consider that XP still runs on a lot of systems with
512MB total. Even if the systems have a ton of memory you are taking a
large portion on 32-bit of the address space of the kernel since that is
2GB.

So no matter how you look at this your design is flawed.

Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

mirage2k2

Mar 5, 2010, 3:32:17 PM

Firstly, I do not need 100's of MB, I need 32-64MB. Secondly, it is not my
design, I have been provided with a compression/decompression library which
requires me to allocate memory for packet history (1 history buffer for
compressing outbound packets and 1 for decompressing inbound packets).

I do not think it is acceptable for my driver to use a large percentage of
available resources ... and this is why I've posted ... I'm looking for
alternatives to non-paged pool.

What I will probably do is limit my history buffer to about 16MB on xp
32-bit. I might allocate 1 history buffer from non-paged pool and the other
from paged pool ... one buffer can typically be larger than the other ... the
biggest one will come from paged pool. I will try to limit total allocation
(on xp 32-bit) to about 32MB.

Mirage2k2.

Maxim S. Shatskih

Mar 5, 2010, 3:50:41 PM

> Firstly, I do not need 100's of MB, I need 32-64MB. Secondly, it is not my
> design, I have been provided with a compression/decompression library which
> requires me to allocate memory for packet history (1 history buffer for
> compressing outbound packets and 1 for decompressing inbound packets).

This does not work.

Packet history is unreliable. Packets are often lost on the network (due to router overflow usually) and can be reordered.

And, if you're compressing, then surely offloading to a thread (in a paged pool) will not be so large a burden.

Don Burn

Mar 5, 2010, 4:11:58 PM

Whether you were provided with the library or not, you are using it so it
is your design. In fields with professional engineering credentials it is
the person who is responsible for the delivery whose design it is. Yes it
may have come from a manager's requirements, but it is your responsibility
to indicate the design has a problem and the driver will not conform to
what most people expect in Windows.

If the driver is for an embedded type of situation where the environment
is controlled this needs to be communicated so a decision can be made if it
is acceptable. If the driver is for general purpose, then it is almost
sure to be unacceptable since you are going to at a minimum significantly
degrade the performance of the system.

m

Mar 5, 2010, 11:18:39 PM

My apologies: when I first saw your post, I assumed that you were using
these buffers to do content filtering as part of some kind of firewall
driver and derided the design as wasteful (notwithstanding my opinion of
security software in general); but you have been trying to develop something
altogether more useless and damaging!

As Maxim has pointed out, a differential algorithm is useless for packet
compression, and for stateless packet compression a single packet sized
buffer is sufficient. Of course stateless packet compression is mostly
useless unless you are trying to avoid fragmentation on media with
nonstandard MTU or you need to add encapsulation headers.

As Don has pointed out, while there is no problem using tens or hundreds of
megabytes of memory when it is needed, in KM every page should be valued and
conserved since excessive consumption of these resources impairs the ability
of the kernel to service applications.

Please don't take my jest personally, but it is humorous to see how much
havoc one simple question can cause!
