Is there a maximum contiguous memory allocation?

Peter Olcott

Dec 19, 2009, 9:42:50 AM

My application needs to create a std::vector > 5GB, is that
possible in x64?

Bo Persson

Dec 19, 2009, 9:48:28 AM

Peter Olcott wrote:
> My application needs to create a std::vector > 5GB, is that
> possible in x64?

Yes, if you have enough virtual memory.

The limitation is not in the amount of RAM, but in the address space.

Bo Persson

Peter Olcott

Dec 19, 2009, 10:15:38 AM

"Bo Persson" <b...@gmb.dk> wrote in message
news:7p47dj...@mid.individual.net...

I have read that Microsoft has placed an artificial 2GB
limit on the size of an array. I also read that this same
limit applies to 64 bit .NET applications, maximum object
size of 2GB.

My application requires a single contiguous block of
physical memory, is this possible?

Giovanni Dicanio

Dec 19, 2009, 2:07:29 PM

"Peter Olcott" <NoS...@SeeScreen.com> ha scritto nel messaggio
news:GsSdnVt8x8CAc7HW...@giganews.com...

> I have read that Microsoft has placed an artificial 2GB limit on the size
> of an array. I also read that this same limit applies to 64 bit .NET
> applications, maximum object size of 2GB.

Considering your request of allocating a huge std::vector, I wrote this
simple console-mode program in VS2008 SP1, that you can build in x64:

<code>

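#include <Windows.h>
#include <stdlib.h>

#include <vector>
#include <iostream>
#include <exception>

using namespace std;

int main(int argc, char * argv[])
{
    try
    {
        if ( argc != 2 )
        {
            cout << "Syntax: " << argv[0] << " <gigabytes>" << endl;
            return EXIT_FAILURE;
        }

        // Number of gigabytes requested on the command line.
        int gigabyteCount = atoi(argv[1]);
        if ( gigabyteCount <= 0 )
        {
            cout << "*** ERROR: <gigabytes> must be a positive integer." << endl;
            return EXIT_FAILURE;
        }

        cout << "Try allocating " << gigabyteCount << "GB in a vector..." << endl;

        // size_t is 64 bits wide in an x64 build, so the product below does not wrap.
        static const size_t gigabyte = 1024 * 1024 * 1024;
        vector< BYTE > big( gigabyteCount * gigabyte );

        cout << "Vector allocated." << endl;
    }
    catch (const exception & e)
    {
        // A failed allocation (std::bad_alloc) is reported here.
        cout << "*** ERROR: " << e.what() << endl;
        return EXIT_FAILURE;
    }

    return 0;
}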

</code>

I tested it on Windows 7 x64, and the allocation of a 4 GB std::vector
seemed to succeed; so I would think that the limit of 2GB does not apply to
native code.

Giovanni

Joseph M. Newcomer

Dec 20, 2009, 2:17:04 AM

There are a lot of nonsensical rumors out there. For example, what do you think is meant
by the word "array"? Unless you have a SPECIFIC definition of that term, there is no way
to tell if one of these random rumors is relevant or not.

Of course, the trivial way to test this is to actually TEST it!

I spent most of the evening looking at a truly crufty set of documentation that advises
things like "don't use LSH of shift distances longer than 1, it is slow", so I ran some
tests, and can prove that shift time is constant independent of shift amount (the old
documents applied to Pentium III and earlier).

The way to check out a rumor is to first run the test. If the test works, the rumor is
false.

Run a test!
joe

Joseph M. Newcomer [MVP]
email: newc...@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm

Bo Persson

Dec 20, 2009, 5:32:35 AM

Peter Olcott wrote:
> "Bo Persson" <b...@gmb.dk> wrote in message
> news:7p47dj...@mid.individual.net...
>> Peter Olcott wrote:
>>> My application needs to create a std::vector > 5GB, is
>>> that
>>> possible in x64?
>>
>> Yes, if you have enough virtual memory.
>>
>> The limitation is not in the amount of RAM, but in the
>> address space.
>>
>>
>> Bo Persson
>>
>>
>
> I have read that Microsoft has placed an artificial 2GB
> limit on the size of an array. I also read that this same
> limit applies to 64 bit .NET applications, maximum object
> size of 2GB.

I don't know about .NET, I see std::vector as a part of real C++.

>
> My application requires a single contiguous block of
> physical memory, is this possible?

It doesn't require a contiguous block of physical memory, it requires
a large enough block of virtual memory. How that is mapped to physical
memory is not very important.

Bo Persson

Joseph M. Newcomer

Dec 20, 2009, 1:53:48 PM

See below...

On Sat, 19 Dec 2009 09:15:38 -0600, "Peter Olcott" <NoS...@SeeScreen.com> wrote:

****
I'd missed this, and I'm only coming back to it based on another reply.

Unless you are writing a device driver for a piece of hardware designed by an amateur
designer, the chance that you will require contiguous physical memory is zero.
(Professional hardware designers as a matter of course specify what are called "infinite
scatter-gather DMA controllers". "Infinite" is a bit of a misnomer, since you are
usually limited to 4GB of descriptors, but each descriptor specifies a 32-bit address and a
32-bit length, allowing a single DMA transfer to move as many discontiguous blocks of
data as are needed to complete the I/O in a single operation.)

You may require contiguous *virtual* memory, which is a different question, and when you
start looking at objects the size you are describing, either you have to assume that you
will be working with discontiguous memory or you have to go to a 64-bit native platform.
There are no other solutions.
joe
****

Peter Olcott

Dec 21, 2009, 7:53:57 AM

"Bo Persson" <b...@gmb.dk> wrote in message

news:7p6cpo...@mid.individual.net...

My DFA Image recognizer needs a contiguous block of physical
memory, in some cases possibly > 4GB. If I allow time for
page swaps a very fast process becomes infeasibly slow.

>
>
> Bo Persson
>
>

Peter Olcott

Dec 21, 2009, 8:17:18 AM

"Joseph M. Newcomer" <newc...@flounder.com> wrote in
message news:0dssi5thoj8m18oc3...@4ax.com...

My DFA recognizer needs contiguous physical memory, or disk
swap time would make this process infeasibly slow. There
would be a possible disk read for every pixel on the screen.
Current whole screen response time <= 100 ms.

If there was some disk equivalent technology that was
comparable in speed to RAM, then this limitation would not
have the same degree of impact. Conventional disk seek time
would kill my performance. The alternatives that I have
examined are solid state drives and various types (and
redesigns) of RAID arrays.

It would be really great if this problem did not exist
because I would then be able to process Chinese glyphs
efficiently. The current process is estimated to require
about 2.0 TB RAM. I am working on redesigning the process to
eliminate this restriction.

Scott McPhillips [MVP]

Dec 21, 2009, 9:22:35 AM

"Peter Olcott" <NoS...@SeeScreen.com> wrote in message
news:upqdnUNYt6F78rLW...@giganews.com...

> My DFA Image recognizer needs a contiguous block of physical memory, in
> some cases possibly > 4GB. If I allow time for page swaps a very fast
> process becomes infeasibly slow.

Contiguous physical memory has nothing to do with avoiding page swaps. If
the pages are loaded then virtual memory is the same speed as physical
memory.

--
Scott McPhillips [VC++ MVP]

Alexander Grigoriev

Dec 21, 2009, 9:40:15 AM

"Peter Olcott" <NoS...@SeeScreen.com> wrote in message

news:6padnbezC7kp6LLW...@giganews.com...

>
>
> My DFA recognizer needs contiguous physical memory, or disk swap time
> would make this process infeasibly slow. There would be a possible disk
> read for every pixel on the screen. Current whole screen response time <=
> 100 ms.
>

I'm not sure why you think you need physically contiguous memory. You
actually need memory which is just unlikely to page out. Address
Windowing Extensions (AWE) will do that. See AllocateUserPhysicalPages etc.

Peter Olcott

Dec 21, 2009, 9:55:09 AM

"Scott McPhillips [MVP]" <org-dot-mvps-at-scottmcp> wrote in
message news:%23G$lQkkgK...@TK2MSFTNGP05.phx.gbl...

The DFA recognizer is stored in a std::vector, and from my
understanding this is only available as contiguous memory.

Peter Olcott

Dec 21, 2009, 10:00:38 AM

"Alexander Grigoriev" <al...@earthlink.net> wrote in message
news:e$TCEukgK...@TK2MSFTNGP06.phx.gbl...

Will this be able to allocate an amount of physical memory
comparable to the amount of physical memory installed on the
machine, or is there some artificial limit that is
substantially less than this amount? For example I have read
that the maximum object size of a 64-bit .NET component is 2
GB, regardless of the amount of RAM installed.

Scott McPhillips [MVP]

Dec 21, 2009, 10:06:39 AM

"Peter Olcott" <NoS...@SeeScreen.com> wrote in message

news:34CdnaWqM6HTEbLW...@giganews.com...

>>> My DFA Image recognizer needs a contiguous block of physical memory, in
>>> some cases possibly > 4GB. If I allow time for page swaps a very fast
>>> process becomes infeasibly slow.
>>
>>
>> Contiguous physical memory has nothing to do with avoiding page swaps.
>> If the pages are loaded then virtual memory is the same speed as physical
>> memory.
>
> The DFA recognizer is stored in a std::vector, and from my understanding
> this is only available as contiguous memory.

only available as contiguous virtual memory, not contiguous physical memory

Joseph M. Newcomer

Dec 21, 2009, 11:18:04 AM

Yes, it requires contiguous memory. But since you can only work in terms of virtual
memory, that obviously means "contiguous virtual memory". Physical memory is irrelevant
and in any case you have no control over it, so there's no reason to even use the phrase
"physical memory" except in contexts like "virtual memory may or may not be resident in
physical memory" or "contiguous virtual memory is usually not contiguous in physical
memory".

You have no real control of page swapping, anyway; it either happens or it doesn't. If you
use large amounts of virtual memory, it most likely will be paged out, and you will
almost always have page faults.

There is an API, VirtualLock, that locks pages down. But its use, and the quantity of
pages you can lock down, is subject to administrative controls; furthermore, you cannot
lock down more memory than you have available, and that means even in the unlikely
circumstance that a user was given a 2GB lock-down quota, you are unlikely to get more than
about 1GB of contiguous virtual memory (on a good day, with a tail wind from the south).

If you need massive amounts of memory, you either have to use 64 bit Windows or use some
mechanism like a mapped file. An algorithm that requires more than about 100MB of
contiguous storage really needs to be reconsidered in the light of needing to deal with
smaller pieces of storage. It essentially is not a very good design to assume that you
could get large contiguous blocks of storage of unlimited size.

And I have no idea why you have fixated on 4GB for Win32. The maximum limit for a typical
installation is 2GB, period. An Enterprise Server allows 3GB maximum user virtual address
space, but it is unusual to have an ordinary version of Windows with a 3GB user partition
(which drops the kernel partition to 1GB). Of that 2GB, some of it is your application,
some of it is system DLLs, some of it is end-user DLLs, some of it is assorted library
DLLs. You are unlikely to get large contiguous blocks of storage because of memory
fragmentation caused by all these DLLs, some of which have their own private heaps.

You've had this problem before; you really have to understand how memory allocation works,
and how the address space is allocated. I've tried to explain this in the past, but you
seem to keep ignoring it and fixate on meaningless values like 4GB, which are impossible
to achieve in Win32. And even if you went with meaningful values, like 2GB, you are not
going to get massive blocks of contiguous storage in Win32. So you either have to
redesign your code to work with discontiguous blocks, with blocks in memory mapped files,
or you have to move to 64-bit Windows. There really aren't any other choices.
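
A minimal sketch of what "working with discontiguous blocks" could look like, assuming a fixed block size chosen by the caller. `ChunkedArray` and its members are hypothetical names, not part of the standard library or of any code discussed in this thread; the price is one extra division and one extra indirection per access.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container: stores elements in fixed-size blocks so that no
// single allocation has to be larger than one block.
// Precondition: block_elems > 0.
template <typename T>
class ChunkedArray {
public:
    ChunkedArray(std::size_t count, std::size_t block_elems)
        : size_(count), block_elems_(block_elems) {
        const std::size_t blocks = (count + block_elems - 1) / block_elems;
        blocks_.reserve(blocks);
        for (std::size_t b = 0; b < blocks; ++b) {
            // Each block is a separate, value-initialized allocation.
            // For simplicity the last block is full-size even if partly unused.
            blocks_.emplace_back(block_elems);
        }
    }

    // Map a flat index to (block, offset within block).
    T& operator[](std::size_t i) {
        return blocks_[i / block_elems_][i % block_elems_];
    }
    const T& operator[](std::size_t i) const {
        return blocks_[i / block_elems_][i % block_elems_];
    }

    std::size_t size() const { return size_; }
    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t size_;
    std::size_t block_elems_;
    std::vector<std::vector<T>> blocks_;
};
```

With, say, 64 MB blocks, a table of several hundred megabytes only ever asks for 64 MB of contiguous address space at a time, which is generally much easier to find in a fragmented 32-bit process than one large run.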

I've looked at std::vector; the code is very hard to read, but it appears that the size is
limited by size_t. In Win64, size_t is a 64-bit value, which means that allocating
massively large vectors should not be a problem. But again, physical memory is not a
concept that comes into play here; you are *always* working in terms of virtual memory.
joe

Joseph M. Newcomer

Dec 21, 2009, 12:35:49 PM

See below...

****
Nonsense. Complete and utter nonsense, beyond any shadow of a doubt. Why do you keep
talking about "contiguous physical memory"?
(a) it doesn't matter if the physical memory is or is not contiguous
(b) from an application, you cannot control physical memory
(c) even if you could control physical memory, you can't allocate large blocks of it
(d) what part of "virtual memory" are you failing to understand?
****

>or disk
>swap time would make this process infeasibly slow.

****
In the trade, we call this "life is hard". Meaning, there's nothing you can do about it.
You are making an impossible request, which
(a) has no meaning
(b) makes no sense because it is impossible to achieve
(c) requires something that makes no sense in a virtual memory world
(d) is impossible to achieve even for a kernel programmer working with physical memory
(MmAllocateContiguousMemory)
(e) even if it was possible, it would not change anything, since you can't address more
than 2GB
(f) that 2GB has to include space for all your application, other structures you use,
DLLs, all the storage they use, and the OS interface, so you are reduced to something less
than 2GB
(g) since those various pieces I just alluded to can fragment memory, in practice you
cannot get arbitrarily large contiguous blocks any time you feel like it; there is a
practical limit to the maximum block size, which varies from moment to moment in your
program; the longer your program runs, the smaller this size becomes.
****

>There would be a possible disk read for every pixel on the screen.
>Current whole screen response time <= 100 ms.

****
This is called "need to redesign the algorithm". Typically, in VM systems, you have to
consider things that repack FSM models to maximize locality of reference. This is a
problem that has been known and understood since at least 1961, and was well-understood
when I started using virtual storage in 1968 (that's 41 years ago). In 1969, we were
spending hours analyzing our algorithms and repacking data to minimize paging; in fact, we
were even using features of our linker to pack code adjacent. In 1971, I wrote a
diagnostic program that measured code page transitions during execution of an application
so we could understand how to pack our code to minimize page faults by studying its actual
behavior. The first LISP machines (in the 1980s) did not store lists as lists but as
contiguous arrays to minimize page faults (the extra cost and complexity of handling
complex array/list structures including automatic repacking of lists into arrays more than
paid off in terms of performance gains achieved by avoiding page faults). Once we got
machines with caches, we started redesigning algorithms to maximize cache hits even for
pages that were resident. Cache hit performance can improve your program performance by a
factor of 10; paging optimization can improve your program performance by a factor of
100,000 to 1,000,000. Or more.

Note that you can use raw VirtualAlloc to improve your chances of getting storage (malloc
already guarantees fragmentation most of the time). But you are still going to hit limits
far smaller than 2GB. I just tried an experiment; I ran a program that tried to allocate
storage. If it succeeded, I would exit the program and try again.

Using either VirtualAlloc or malloc, the largest size I could allocate was 1100MB; 1200MB
failed. I did not try values between these two. And that was in a trivial MFC program
that did no other allocation, had no user DLLs loaded (just what MFC loads), and allocated
essentially immediately upon startup. Your Mileage May Vary, but it shows that hopes of
getting larger allocations are very unlikely. Note that it took about 6 seconds to do the
allocation.

An assumption of uniform time to access large data arrays is not a valid assumption and
has NEVER been a valid assumption in virtual memory systems. If you created an algorithm
whose success depends on a performance that is in practice impossible to achieve, then you
need to rethink your design. It can be as simple as repacking your FSM so adjacent states
are packed adjacent. Or it simply may be that it is impossible to achieve the performance
you thought was possible.
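
One way to read "repacking your FSM so adjacent states are packed adjacent" is to renumber the states in the order a breadth-first walk from the start state reaches them. The sketch below assumes the DFA is a dense table indexed as `table[state][symbol]`; `State`, `Table`, and `repack_bfs` are illustrative names, and breadth-first order is only one possible heuristic, not necessarily the one meant above.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

using State = std::uint32_t;
using Table = std::vector<std::vector<State>>;  // table[state][symbol] -> next state

// Relabel states in breadth-first order from `start`, so that a state and
// its successors tend to land in nearby rows (and thus nearby pages).
// Precondition: the table is non-empty and `start` is a valid state.
Table repack_bfs(const Table& old_table, State start) {
    const State unvisited = static_cast<State>(-1);
    std::vector<State> new_id(old_table.size(), unvisited);
    std::vector<State> order;  // order[new id] = old id
    order.reserve(old_table.size());

    std::queue<State> pending;
    new_id[start] = 0;
    order.push_back(start);
    pending.push(start);
    while (!pending.empty()) {
        const State s = pending.front();
        pending.pop();
        for (State next : old_table[s]) {
            if (new_id[next] == unvisited) {
                new_id[next] = static_cast<State>(order.size());
                order.push_back(next);
                pending.push(next);
            }
        }
    }
    // Unreachable states keep a slot at the end so the table stays complete.
    for (State s = 0; s < old_table.size(); ++s) {
        if (new_id[s] == unvisited) {
            new_id[s] = static_cast<State>(order.size());
            order.push_back(s);
        }
    }

    // Rebuild the table with both rows and targets translated to new ids.
    Table packed(old_table.size());
    for (State n = 0; n < order.size(); ++n) {
        for (State next : old_table[order[n]]) {
            packed[n].push_back(new_id[next]);
        }
    }
    return packed;
}
```

Whether this helps depends entirely on the access pattern of the real recognizer; counting page faults before and after is the only reliable check.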

Note that the issues of working set and VM do not go away in Win64. Paging does not go
away. Physical memory still has no meaning.
****

>
>If there was some disk equivalent technology that was
>comparable in speed to RAM, then this limitation would not
>have the same degree of impact. Conventional disk seek time
>would kill my performance. The alternatives that I have
>examined are solid state drives and various types (and
>redesigns) of RAID arrays.

****
Yes, those help. Sometimes the only solution is faster technology. Raw hardware can
solve problems. So can algorithm redesign. Those of us who grew up on machines with slow
swapping files and small address spaces learned these lessons. The current generation
thinks that memory is uniform, and it always comes as a surprise when they discover it is
not.
****

>
>It would be really great if this problem did not exist
>because I would then be able to process Chinese glyphs
>efficiently. The current process is estimated to require
>about 2.0 TB RAM. I am working on redesigning the process to
>eliminate this restriction.

****
Sounds like Win64 to me (8TB limit). But note that you will still be limited by how many
pages are available in physical memory, and that's not going to change a lot in the
foreseeable future because of memory costs. Memory costs are not only the cost of the
chips, but the cost of the space made available on the motherboards to plug the memory
into (sockets cost money; printed circuit board space costs money, and there are physical
limits to how many sockets you can place on a motherboard). For example, a 2GB chip costs
about $80. So a 2TB RAM system requires 1000 chips and would cost $80,000. But note that
this means you would need 1000 sockets on your motherboard! Not going to happen. So you
are going to be paging. Take that as a given. It is not negotiable, it is not avoidable,
it is going to be part of what you live with and you cannot change that fact. So your
algorithms have to change to account for that.
joe
****

Joseph M. Newcomer

Dec 21, 2009, 12:38:08 PM

So are you programming in .NET? I thought you were programming in MFC, or at least C++!

What does .NET have to do with native code programming? I've heard that you don't use
memory addresses in .NET, but use "references". I've heard that there is garbage
collection in .NET. I've heard lots of things about .NET. If you are programming in C#
or VB, they would matter. Are you?
joe

Peter Olcott

Dec 21, 2009, 3:50:06 PM

"Scott McPhillips [MVP]" <org-dot-mvps-at-scottmcp> wrote in

message news:ewH148kg...@TK2MSFTNGP06.phx.gbl...

So contiguous virtual memory could be mapped to fragmented
physical memory?

Joseph M. Newcomer

Dec 21, 2009, 5:33:03 PM

See below...

****
There is no "could" about it. That is EXACTLY how it is designed to work, and how it does
work. The general assumption is that contiguous virtual memory is not only mapped to
discontiguous physical memory, some of it isn't even in physical memory.
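
To put numbers on this: a buffer that is contiguous in virtual address space is still managed one page at a time, and each page can sit in any physical frame or in the paging file. The helper below only counts how many pages a byte range touches; `pages_spanned` is a hypothetical name, and the 4096-byte default is an assumption (the usual small-page size on x86 and x64).

```cpp
#include <cstddef>
#include <cstdint>

// Number of distinct pages touched by the byte range [addr, addr + bytes),
// assuming a fixed page size.
inline std::size_t pages_spanned(std::uintptr_t addr, std::size_t bytes,
                                 std::size_t page_size = 4096) {
    if (bytes == 0) {
        return 0;
    }
    const std::uintptr_t first = addr / page_size;
    const std::uintptr_t last = (addr + bytes - 1) / page_size;
    return static_cast<std::size_t>(last - first + 1);
}
```

Under that assumption a 5 GB vector covers roughly 1.3 million pages, each of which is mapped independently of its neighbours.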

Device driver writers are painfully aware of this issue. It gets Real Exciting if a
driver writer has to be able to support "any size buffer" because it may not be possible
to lock down all the pages in physical memory, requiring the use of "mode Neither", and
expensive and complex driver design, for example, the addition of the need to create
partial MDLs and the need to use a passive driver thread to do the MmProbeAndLockPages to
lock down each portion of the buffer in turn.
joe
****

>
>>
>> --
>> Scott McPhillips [VC++ MVP]
>

Peter Olcott

Dec 21, 2009, 8:38:14 PM

"Joseph M. Newcomer" <newc...@flounder.com> wrote in

message news:gu7vi5d9q8ojc8ae0...@4ax.com...

So contiguous virtual memory can be mapped to fragmented
physical memory?

>>or disk

>>swap time would make this process infeasibly slow.
> ****
> In the trade, we call this "life is hard". Meaning,
> there's nothing you can do about it.
> You are making an impossible request, which
> (a) has no meaning
> (b) makes no sense because it is impossible to achieve
> (c) requires something that makes no sense in a virtual
> memory world
> (d) is impossible to achieve even for a kernel programmer
> working with physical memory
> (MmAllocateContiguousMemory)
> (e) even if it was possible, it would not change anything,
> since you can't address more
> than 2GB
> (f) that 2GB has to include space for all your
> application, other structures you use,
> DLLs, all the storage they use, and the OS interface, so
> you are reduced to something less
> than 2GB

Win64 has a 2GB Limit?

In my case any change to the algorithm would degrade its
performance. In the case of recognizing Asian glyphs I will
have no choice, 2.0 TB of RAM is not yet cost-effective.

>
> Note that you can use raw VirtualAlloc to improve your
> chances of getting storage (malloc
> already guarantees fragmentation most of the time). But
> you are still going to hit limits
> far smaller than 2GB.

Even in the case of a machine that only has Win XP x64, and
my application with 32GB of RAM? That seems implausible. If
it is true then I could only conclude a horribly bad
architecture design.

> I just tried an experiment; I ran a program that tried to
> allocate
> storage. If it succeeded, I would exit the program and
> try again.
>
> Using either VirtualAlloc or malloc, the largest size I
> could allocate was 1100MB; 1200MB
> failed. I did not try values between these two. And that
> was in a trivial MFC program

Someone else on this same thread was able to allocate 4 GB.

Yes so I have several alternatives.
(1) Conventional Windows like glyphs can easily be
recognized with (by today's standards) small amounts of RAM
(2) Complex glyphs such as those that Apple, PDF, and some
Unix systems have may sometimes exceed the limits of Win32,
and thus require a 64 bit OS. (it will sometimes require
std::vector::size() > 4.0 GB)
(3) The recognition of Asian glyphs will require a redesign
that will save memory at the cost of speed. I think that I
have the elements of this redesign figured out.

Geoff

Dec 21, 2009, 9:28:33 PM

On Mon, 21 Dec 2009 19:38:14 -0600, "Peter Olcott"
<NoS...@SeeScreen.com> wrote:

>Yes so I have several alternatives.
>(1) Conventional Windows like glyphs can easily be
>recognized with (by today's standards) small amounts of RAM
>(2) Complex glyphs such as those that Apple, PDF, and some
>Unix systems have may sometimes exceed the limits of Win32,
>and thus require a 64 bit OS. (it will sometimes require
>std::vector::size() > 4.0 GB)
>(3) The recognition of Asian glyphs will require a redesign
>that will save memory at the cost of speed. I think that I
>have the elements of this redesign figured out.

Unix or Windows, 32-bit OS means a hard limit on a single allocation:

#include <vector>
#include <iostream>

int main(int argc, char* argv[])
{
    using namespace std;
    vector <char> v1;
    vector <char>::size_type i;

    i = v1.max_size( );
    cout << "Maximum possible length of the vector is " << i << "." << endl;
    return 0;
}

OS X and Linux as 32 bit systems are constrained just as much as
Windows. You seriously need to examine your problem space and
partition it so your application isn't such a memory pig.

Joseph M. Newcomer

Dec 21, 2009, 9:29:49 PM

Followup: with my trivial MFC app, the largest allocation I could get with VirtualAlloc or
malloc was 1194*1024*1024 bytes. 1195 failed.

For a real app, one with lots of DLLs, threading, etc. the value will typically be
smaller. Perhaps much smaller.

Your Mileage May Vary.
joe
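
The experiment described above amounts to a bisection over allocation sizes. A minimal sketch, assuming success is monotonic in the requested size: `largest_allocation` and `malloc_probe` are hypothetical names, and the posts describe restarting the program for each attempt, which this in-process version does not do.

```cpp
#include <cstddef>
#include <cstdlib>

// Returns the largest n in [lo, hi] for which try_alloc(n) succeeds,
// assuming it succeeds at lo and that success is monotonic in n.
template <typename TryAlloc>
std::size_t largest_allocation(std::size_t lo, std::size_t hi, TryAlloc try_alloc) {
    while (lo < hi) {
        // Round the midpoint up so the loop always makes progress.
        const std::size_t mid = lo + (hi - lo + 1) / 2;
        if (try_alloc(mid)) {
            lo = mid;       // mid works, so the answer is at least mid
        } else {
            hi = mid - 1;   // mid fails, so the answer is below mid
        }
    }
    return lo;
}

// One possible probe: attempt a malloc of `megabytes` MB and release it at once.
inline bool malloc_probe(std::size_t megabytes) {
    void* p = std::malloc(megabytes * 1024 * 1024);
    const bool ok = (p != nullptr);
    std::free(p);
    return ok;
}
```

A call such as `largest_allocation(1100, 1200, malloc_probe)` would narrow the range the same way the manual runs did. On systems that overcommit memory, a successful `malloc` does not prove the pages can ever be backed, so the result is best treated as an upper bound.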

Joseph M. Newcomer

Dec 21, 2009, 9:46:05 PM

See below...

****
As I responded in another answer, this is EXACTLY how it is designed to work.
****

>
>>>or disk
>>>swap time would make this process infeasibly slow.
>> ****
>> In the trade, we call this "life is hard". Meaning,
>> there's nothing you can do about it.
>> You are making an impossible request, which
>> (a) has no meaning
>> (b) makes no sense because it is impossible to achieve
>> (c) requires something that makes no sense in a virtual
>> memory world
>> (d) is impossible to achieve even for a kernel programmer
>> working with physical memory
>> (MmAllocateContiguousMemory)
>> (e) even if it was possible, it would not change anything,
>> since you can't address more
>> than 2GB
>> (f) that 2GB has to include space for all your
>> application, other structures you use,
>> DLLs, all the storage they use, and the OS interface, so
>> you are reduced to something less
>> than 2GB
>
>Win64 has a 2GB Limit?

****
No. We are speaking about Win32 here. As I've already said, Win64 does not have these
limitations. Even 32-bit code in Win64 does not have a 2GB limit if you link
/LARGEADDRESSAWARE (it does have a 4GB limit for total address space. How much of that
you can get in a single allocation depends on a lot of features of your application)

A native Win64 program running in Win64 probably has a much larger limit. Note that
malloc, which will try to initialize the pages in debug mode, can take a long time to
perform this allocation in debug mode (but will run much faster in release mode). So
there you are going to be limited by the size of your swap space. If you want 10GB of
address space, you had better have a 10GB swapfile.
*****

*****
Note that you are confusing "degrading performance" with "making performance worse
overall". As I already pointed out, if your algorithm executes lots of extra code to
avoid page faults, it will be *faster*, possibly by orders of magnitude. You are making
the error of confusing instruction cycles with performance.
*****

>
>>
>> Note that you can use raw VirtualAlloc to improve your
>> chances of getting storage (malloc
>> already guarantees fragmentation most of the time). But
>> you are still going to hit limits
>> far smaller than 2GB.
>
>Even in the case of a machine that only has Win XP x64, and
>my application with 32GB of RAM? That seems implausible. If
>it is true then I could only conclude a horribly bad
>architecture design.

*****
Note that I have been talking always in terms of Win32, except when I explicitly refer to
Win64. In Win64, you will be able to allocate MASSIVE structures. However, they are
going to page like crazy, and you need a paging file at least as large as your largest
expected app usage. So if you need 32GB of data space, you will expect to need 32GB of
swap space for it. Now, given particular physical configurations of RAM (e.g., 64GB), you
may not need much of that swap space much of the time. There is even a slight chance you
will never have to page. Of course, this also requires that your app's working space size
be set high enough that it will not get trimmed. But what we have been saying for several
days now: if you need lots of space, you need Win64. That is not an option, that is a
necessity. If you cannot live with a 1GB or smaller contiguous allocation, you have no
choice but to move to Win64. Perhaps the simplest solution for you is simply to move to
Win64 now, and stop worrying about this, because otherwise you are just going to keep
insisting that you want the OS to do something it is incapable of.

Architecture is a collection of interacting decisions, many of which (such as working set)
are user-definable, but whose default values may be unsuitable for you. You have to find
out what all the parameters are, and make sure you have configured them correctly. So you
start by buying a machine with mongo memory, and proceed from there.
*****

>
>> I just tried an experiment; I ran a program that tried to
>> allocate
>> storage. If it succeeded, I would exit the program and
>> try again.
>>
>> Using either VirtualAlloc or malloc, the largest size I
>> could allocate was 1100MB; 1200MB
>> failed. I did not try values between these two. And that
>> was in a trivial MFC program
>
>Someone else on this same thread was able to allocate 4 GB.

*****
Not in this world. It is simply impossible. The system has no concept of allowing more
than either 2GB or 3GB TOTAL USER SPACE. So nobody is EVER going to be able to allocate
4GB of memory on Win32. On Win64, it is equally impossible in a 32-bit program, because
some segment of that memory holds your code, stacks, static data, and heap other than this
one massive structure. So you MIGHT get around 3.5GB plus or minus some change, but never
4GB (don't forget that the first 64K and the last 64K don't exist).

In Win64, 4GB is just a toy allocation. Sort of like 4K in Win32. Serious allocations
are in the terabytes.
*****

****
These all sound reasonable. The time.vs.memory tradeoffs have always been with us.
joe
****

Peter Olcott

Dec 21, 2009, 11:21:39 PM

"Geoff" <ge...@invalid.invalid> wrote in message
news:bqa0j5ds19b0lhbiu...@4ax.com...

It is not a pig. It does something that nothing else in the
world can do: it recognizes an entire screen of character
glyphs with 100% accuracy in 1/10 second. The only possible
way to reduce the memory requirements will necessarily
substantially increase the response time.

Peter Olcott

Dec 21, 2009, 11:46:54 PM

"Joseph M. Newcomer" <newc...@flounder.com> wrote in

message news:3qb0j5tnmvbfof6s3...@4ax.com...

I am talking about response time performance, and page
faults in my case could ruin response time. There are some
uses of my technology where 100 ms response time would be
the maximum tolerable amount. If for whatever reason it
takes more than 100 ms, then (in some applications of this
technology) it fails.

> *****
>>
>>>
>>> Note that you can use raw VirtualAlloc to improve your
>>> chances of getting storage (malloc
>>> already guarantees fragmentation most of the time). But
>>> you are still going to hit limits
>>> far smaller than 2GB.
>>
>>Even in the case of a machine that only has Win XP x64,
>>and
>>my application with 32GB of RAM? That seems implausible.
>>If
>>it is true then I could only conclude a horribly bad
>>architecture design.
> *****
> Note that I have been talking always in terms of Win32,
> except when I explicitly refer to
> Win64. In Win64, you will be able to allocate MASSIVE
> structures. However, they are
> going to page like crazy, and you need a paging file at
> least as large as your largest

The ONLY reason that memory is ever paged out is if
something else needs more physical memory than is currently
available. If the ONLY thing running on the machine besides
the OS is my application, then it should have no reason to
page memory out. This same reason should also apply
regardless of the number of processes on the machine as long
as nothing ever needs more memory than the amount of
physical installed RAM. It might be nice if you could tell
it to defrag memory once in a while. XP apparently lacked
this feature.

Thus if I need a 30 GB std::vector from Win64 XP, I should
be able to get a 30 GB std::vector from 32 GB of installed
RAM. If I can't then something is broken somewhere. An OS
should not ever grow its memory requirements without reason.

> expected app usage. So if you need 32GB of data space,
> you will expect to need 32GB of
> swap space for it. Now, given particular physical
> configurations of RAM (e.g., 64GB), you
> may not need much of that swap space much of the time.
> There is even a slight chance you
> will never have to page. Of course, this also requires
> that your app's working space size
> be set high enough that it will not get trimmed. But what
> we have been saying for several
> days now: if you need lots of space, you need Win64. That
> is not an option, that is a

I figured that was true before my first recent posting. I
was hoping for some sort of exception to the rule like the
DOS extender types of solutions that were used in years gone
by. A 32-bit OS could directly access far more memory than
can be addressed by 32-bits by using the 32 bit address,
prefixed by a page address. This was done on the original
8088 architecture.

> necessity. If you cannot live with a 1GB or smaller
> contiguous allocation, you have no
> choice but to move to Win64. Perhaps the simplest
> solution for you is simply to move to
> Win64 now, and stop worrying about this, because otherwise
> you are just going to keep
> insisting that you want the OS to do something it is
> incapable of.

There are a range of problems, and correspondingly a range
of solutions.

Joseph M. Newcomer

Dec 22, 2009, 1:26:50 AM

See below...

Well, as I said, "Life Is Hard". There really isn't a whole lot you can do. You can
consider things like VirtualLock, but that only works if the target machine has enough
physical memory to lock down your data table and still leave room for other programs to
run. And only if the user has been granted VirtualLock privileges, and only if the
VirtualLock quota is set high enough to allow it to work. Thus, it requires serious admin
privileges, and even then I don't know if there are upper bounds to VirtualLock that are
hardwired. But any time you are running with locked pages, you are having a serious
impact on the overall system performance unless the system is incredibly memory-rich.

****
Agreed. But remember "working set"? You have to modify the working set of your process
to be large enough to avoid paging. Again, administrative controls and quotas apply, so
you need to deal with how these are set up (and I don't know, so I can't tell you)

There is never a need to "defrag memory" in the system relative to any one application;
and since separate applications have separate memory maps, they are independent of memory
fragmentation in real memory, and there is no way to "defrag memory" in your app. It is
not a "feature" that XP "lacks", it was NEVER possible to "defrag" an application's memory
for any C or C++ application. That only applies to .NET, where the runtime knows where
every pointer can be found so every pointer can be updated.
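
A minimal sketch of why compaction is off the table for C or C++ code: when storage
moves, its address changes, and nothing in the runtime knows where the program has stashed
raw pointers to the old location. The helper name buffer_moves_on_growth is made up for
this illustration, and std::vector growth stands in for a hypothetical "defrag" step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns true if growing the vector moved its storage.
// The old address is saved as an integer so a dangling pointer is never read.
bool buffer_moves_on_growth(std::size_t initial, std::size_t extra)
{
    assert(initial > 0 && extra > 0);

    std::vector<int> v(initial);
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(v.data());

    // Reserving beyond the current capacity allocates a new buffer, copies the
    // elements across, and only then releases the old buffer.
    v.reserve(v.capacity() + extra);

    const std::uintptr_t after = reinterpret_cast<std::uintptr_t>(v.data());
    return before != after;
}
```

Calling buffer_moves_on_growth(16, 1024) reports that the storage moved. Any int* taken
before the reserve() call would now be stale, which is exactly the bookkeeping a managed
runtime does for you and a C++ program cannot.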

Perhaps you are confusing this with the old 16-bit Windows "Compact" operation on memory,
which was something entirely different and only applied to real-mode 16-bit Windows, that
is, all versions < Windows 386. Windows 3.0 is > Windows 386.
****

>
>Thus if I need a 30 GB std::vector from Win64 XP, I should
>be able to get a 30 GB std::vector from 32 GB of installed
>RAM. If I can't then something is broken somewhere. An OS
>should not ever grow its memory requirements without reason.

*****
You really, really, REALLY need to understand the difference between physical memory and
virtual memory. A std::vector is ALWAYS allocated in virtual memory. If it happens that
that virtual memory can all be resident in physical memory, hey, like, that's cool and
all, but otherwise they are independent concepts that work together by this piece of magic
we call an "operating system". An OS doesn't grow its memory requirements without reason;
the "reason" is always there: the applications need the memory!
****

>
>> expected app usage. So if you need 32GB of data space,
>> you will expect to need 32GB of
>> swap space for it. Now, given particular physical
>> configurations of RAM (e.g., 64GB), you
>> may not need much of that swap space much of the time.
>> There is even a slight chance you
>> will never have to page. Of course, this also requires
>> that your app's working space size
>> be set high enough that it will not get trimmed. But what
>> we have been saying for several
>> days now: if you need lots of space, you need Win64. That
>> is not an option, that is a
>
>I figured that was true before my first recent posting. I
>was hoping for some sort of exception to the rule like the
>DOS extender types of solutions that were used in years gone
>by. A 32-bit OS could directly access far more memory than
>can be addressed by 32-bits by using the 32 bit address,
>prefixed by a page address. This was done on the original
>8088 architecture.

****
DOS extenders were a poor substitute for real virtual memory. And note that they did NOT
extend memory by requiring more bits of addressability than existed (32). You seem to
think that there is some way you can use a larger-than-32-bit address on Win32, but that
is not possible, any more than it was possible in a DOS extender. The way you get a
larger-than-32-bit address space is to use Win64.
****
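
For the 8088 case mentioned in the quoted text, a small sketch of real-mode address
formation may help: two 16-bit values are combined, but the result is still bounded by the
20 address lines the chip has. The function name real_mode_address is invented here for
illustration.

```cpp
#include <cassert>
#include <cstdint>

// Real-mode 8086/8088 address formation: segment * 16 + offset, truncated to
// the 20 address lines of the chip.
std::uint32_t real_mode_address(std::uint16_t segment, std::uint16_t offset)
{
    const std::uint32_t linear =
        (static_cast<std::uint32_t>(segment) << 4) + offset;
    assert(linear <= 0x10FFEFu);   // largest sum two 16-bit inputs can produce
    return linear & 0xFFFFFu;      // an 8088 wraps around at 1MB
}
```

Segment 0xFFFF with offset 0x000F reaches the top of the 1MB space (0xFFFFF), and one byte
further wraps to 0. The segment register rearranges how the address is expressed; it does
not add address bits beyond what the hardware provides.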

>
>> necessity. If you cannot live with a 1GB or smaller
>> contiguous allocation, you have no
>> choice but to move to Win64. Perhaps the simplest
>> solution for you is simply to move to
>> Win64 now, and stop worrying about this, because otherwise
>> you are just going to keep
>> insisting that you want the OS to do something it is
>> incapable of.
>
>There are a range of problems, and correspondingly a range
>of solutions.

****
That's pretty much what I'm saying. And some of those solutions trade off instruction
cycles against page faults. And some just require money (more physical memory, 64-bit
processor)
*****

Liviu

Dec 22, 2009, 1:27:16 AM

"Peter Olcott" <NoS...@SeeScreen.com> wrote...

>
> It might be nice if you could tell it to defrag memory once
> in a while. XP apparently lacked this feature.

Sigh... MS also forgot to bundle the DVD rewinder with XP.

Peter Olcott

Dec 22, 2009, 8:39:14 AM

"Joseph M. Newcomer" <newc...@flounder.com> wrote in

message news:qjb0j55qf7r19bdd2...@4ax.com...

> Followup: with my trivial MFC app, the largest allocation
> I could get with VirtualAlloc or
> malloc was 1194*1024*1024 bytes. 1195 failed.

Yes but then are you testing this on Win64 with 8 GB + RAM?
I need to know that Win64 with 32 GB of RAM could provide me
with a 30 GB std::vector that maps to that much physical
memory.

Joseph M. Newcomer

Dec 22, 2009, 10:52:48 PM

See below...

On Tue, 22 Dec 2009 07:39:14 -0600, "Peter Olcott" <NoS...@SeeScreen.com> wrote:

>
>"Joseph M. Newcomer" <newc...@flounder.com> wrote in

>message news:qjb0j55qf7r19bdd2...@4ax.com...
>> Followup: with my trivial MFC app, the largest allocation
>> I could get with VirtualAlloc or
>> malloc was 1194*1024*1024 bytes. 1195 failed.
>
>Yes but then are you testing this on Win64 with 8 GB + RAM?
>I need to know that Win64 with 32 GB of RAM could provide me
>with a 30 GB std::vector that maps to that much physical
>memory.

****
No, I'm testing it in the context of Win32. If you have Win64, then try the experiment
yourself!

Note that if I had my Win64 system up, with 4GB of RAM, I could STILL allocate 30GB of
vector. Most of it would be paged out most of the time, but I could ALLOCATE it! I could
allocate it if I had 2GB of physical memory! I see no reason I would need 8GB+ of RAM to
do a simple allocation. In fact, if the allocation failed I would seriously start
worrying about Windows. Note, of course, that I also need, in order to allocate 30GB of
VM, probably on the order of 60GB of paging file, which would consume pretty much my
entire 68GB C: drive (which won't happen because I have Office and VS installed on it, so
I don't have enough space left for that massive pagefile.sys), so I'd have to install a
second hard drive just to hold the paging file. So I have to meet the minimum system
requirements. But with a 60GB paging file I can allocate 30GB of vector independent of
the amount of physical memory I have installed.
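
The arithmetic behind that can be sketched as follows. This is a simplified model, assuming
the commit limit is roughly installed RAM plus paging file size; the exact Windows
accounting differs in detail, and the function name fits_commit_limit and the "already
committed" figure used below are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t GB = 1024ull * 1024 * 1024;

// Simplified model (an assumption, not the exact Windows accounting):
// the commit limit is roughly physical RAM plus total paging file size.
bool fits_commit_limit(std::uint64_t request,
                       std::uint64_t already_committed,
                       std::uint64_t physical_ram,
                       std::uint64_t paging_file)
{
    const std::uint64_t limit = physical_ram + paging_file;
    assert(already_committed <= limit);
    return request <= limit - already_committed;
}
```

With 4GB of RAM, a 60GB paging file, and a made-up 2GB already committed,
fits_commit_limit(30 * GB, 2 * GB, 4 * GB, 60 * GB) is true. Shrink the paging file to 8GB
and the same request fails, even though the installed RAM has not changed.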

PLEASE STOP USING THE PHRASE "PHYSICAL MEMORY". IT IS NEVER GOING TO HAPPEN! Your
continued use of this phrase in consistently erroneous ways is not helping make your case.
It keeps screaming "I have no idea what I'm talking about, but answer my question anyway".

If you allocate a 30GB std::vector, you get 30GB of virtual memory. The chances that this
will all be resident in a 32GB physical memory configuration is vanishingly small, let's
just call it "no chance whatsoever" to simplify the discussion. If you want 30GB to be
largely resident, you must
- make sure the user has the privileges to modify the working set quota
- set the working set quota to be large enough to encompass your
code, your data, AND your 30GB vector

Note that this merely IMPROVES the likelihood that your vector will be not paged out when
you go to access part of it. In NO WAY does it make any guarantees! And "working set" is
a "request", not a "non-negotiable demand". The system is completely free to ignore
anyone's working set request if it needs pages for any purpose it deems suitable.

If you want a 100% guarantee that the vector will be resident, and not get paged out, you
need to
- make sure the user has the privileges to use VirtualLock, and to be able
to lock at least 30GB down
- use VirtualLock to lock the memory [probably will fail on a system with
32GB, so plan on 64GB as your minimum config]

NOW you have locked down 30GB of VIRTUAL memory. It won't page out. But on a machine of
< 64GB, it is unlikely this can happen. It may not even be possible because there might
be no way to allow a user to set a working set large enough, or do a VirtualLock large
enough; I don't know what the administrative limits are. But these are the minimum steps
you need to take.
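
As a rough sketch of those two steps, assuming the Win32 calls SetProcessWorkingSetSize and
VirtualLock are used in that order: the helper names, the WorkingSetRequest struct, and the
headroom parameter are all invented for illustration, and whether the calls succeed for a
request this large depends on the quotas and privileges discussed above, which I have not
verified. On a non-Windows build the locking function is only a stub.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical pair of working set bounds to ask the OS for.
struct WorkingSetRequest {
    std::size_t minimum;
    std::size_t maximum;
};

// Size the working set so the locked region plus some headroom fits.
// The headroom policy is an assumption, not a documented rule.
WorkingSetRequest working_set_for_lock(std::size_t bytes_to_lock,
                                       std::size_t headroom)
{
    assert(headroom > 0);
    const std::size_t minimum = bytes_to_lock + headroom;
    return WorkingSetRequest{ minimum, minimum + headroom };
}

// Raise the working set, then try to lock the region. Returns false on
// failure, or always on platforms without these APIs.
bool lock_resident(void* region, std::size_t bytes, std::size_t headroom)
{
#ifdef _WIN32
    const WorkingSetRequest r = working_set_for_lock(bytes, headroom);
    if (!SetProcessWorkingSetSize(GetCurrentProcess(), r.minimum, r.maximum))
        return false;                       // quota or privilege refused
    return VirtualLock(region, bytes) != 0; // pages stay resident if this succeeds
#else
    (void)region; (void)bytes; (void)headroom;
    return false;                           // stub: not a Windows build
#endif
}
```

A caller would pass the vector's data() pointer and its size in bytes, and should treat a
false return as the normal case to plan for rather than an exceptional one.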

Note that both of these require administrative controls be exercised. Note that virtually
none of your customers would have a clue as to how to set any of these parameters, so you
will have to give them a script to follow (and I have never worked with these parameters,
and I have no idea how to access them). The script itself will require someone with
administrative privileges to set these parameters.

Note that a system that can survive with only 2GB to run the kernel and the application
you are working with is extremely unlikely. So if you want 30GB, you had better have a LOT
of spare memory around! So don't even think of trying to do this on a system < 64GB.

You will never, ever, under any circumstances imaginable be able to allocate physical
memory. You can only allocate virtual memory.
joe
****

Peter Olcott

Dec 23, 2009, 10:02:36 AM

"Joseph M. Newcomer" <newc...@flounder.com> wrote in

message news:io33j5tengqp4pfn2...@4ax.com...

> See below...
> On Tue, 22 Dec 2009 07:39:14 -0600, "Peter Olcott"
> <NoS...@SeeScreen.com> wrote:
>
>>
>>"Joseph M. Newcomer" <newc...@flounder.com> wrote in
>>message news:qjb0j55qf7r19bdd2...@4ax.com...
>>> Followup: with my trivial MFC app, the largest
>>> allocation
>>> I could get with VirtualAlloc or
>>> malloc was 1194*1024*1024 bytes. 1195 failed.
>>
>>Yes but then are you testing this on Win64 with 8 GB +
>>RAM?
>>I need to know that Win64 with 32 GB of RAM could provide
>>me
>>with a 30 GB std::vector that maps to that much physical
>>memory.
> ****
> No, I'm testing it in the context of Win32. If you have
> Win64, then try the experiment
> yourself!

I am trying to determine the benefits of this $800 upgrade
before I do it.

>
> Note that if I had my Win64 system up, with 4GB of RAM, I
> could STILL allocate 30GB of
> vector. Most of it would be paged out most of the time,
> but I could ALLOCATE it! I could
> allocate it if I had 2GB of physical memory! I see no
> reason I would need 8GB+ of RAM to
> do a simple allocation. In fact, if the allocation failed
> I would seriously start
> worrying about Windows. Note, of course, that I also
> need, in order to allocate 30GB of
> VM, probably on the order of 60GB of paging file, which
> would consume pretty much my
> entire 68GB C: drive (which won't happen because I have
> Office and VS installed on it, so

You can upgrade to a 500 GB WD SATA for $50

> I don't have enough space left for that massive
> pagefile.sys), so I'd have to install a
> second hard drive just to hold the paging file. So I have
> to meet the minimum system
> requirements. But with a 60GB paging file I can allocate
> 30GB of vector independent of
> the amount of physical memory I have installed.
>
> PLEASE STOP USING THE PHRASE "PHYSICAL MEMORY". IT IS
> NEVER GOING TO HAPPEN! Your
> continued use of this phrase in consistently erroneous
> ways is not helping make your case.
> It keeps screaming "I have no idea what I'm talking about,
> but answer my question anyway".

So then what does virtual memory map to? The VM pages are
swapped in and out of something, right?

Yes, but, I see no good reason why this last step should be
necessary in my case. I also see no reason why a 32 GB
system could not easily keep an entire 30 GB memory
allocation resident in RAM. Would Windows arbitrarily
allocate more than 2GB of memory to itself to do useless
things? Why can't windows keep itself in the extra 2 GB? If
it does this then a 30 GB resident allocation should be a
given.

>
> NOW you have locked down 30GB of VIRTUAL memory. It won't
> page out. But on a machine of
> < 64GB, it is unlikely this can happen. It may not even
> be possible because there might
> be no way to allow a user to set a working set large
> enough, or do a VirtualLock large
> enough; I don't know what the administrative limits are.
> But these are the minimum steps
> you need to take.
>
> Note that both of these require administrative controls be
> exercised. Note that virtually
> none of your customers would have a clue as to how to set
> any of these parameters, so you
> will have to give them a script to follow (and I have
> never worked with these parameters,
> and I have no idea how to access them). The script itself
> will require someone with
> administrative privileges to set these parameters.
>
> Note that a system that can survive with only 2GB to run
> the kernel and the application
> you are working with is extremely unlikely. So if you want
> 30GB, you had better have a LOT
> of spare memory around! So don't even think of trying to
> do this on a system < 64GB.

Win XP 32 has run fine for years with far less than 1.0 GB.
Does Windows 7 insist on wasting much more than this to do
useless things? Can we turn those useless things off?

>
> You will never, ever, under any circumstances imaginable
> be able to allocate physical
> memory. You can only allocate virtual memory.
> joe

It seems to me that is merely a mathematical mapping with
one corresponding to the other equivalent to each other
except for speed. When two things are equivalent, then they
can be readily substituted, one for the other. As it
actually turns out, I would allocate physical memory, that
is what malloc() does. The OS re-interprets this to mean the
allocation of virtual memory on some, not all systems.

Joseph M. Newcomer

Dec 23, 2009, 11:41:07 AM

See below...

****
But I don't want to do that, because I don't need to. Also, I don't have a free slot in
the machine to mount the drive, so it would have to dangle somehow (I have other, more
interesting, devices in the slots)
****

>
>> I don't have enough space left for that massive
>> pagefile.sys), so I'd have to install a
>> second hard drive just to hold the paging file. So I have
>> to meet the minimum system
>> requirements. But with a 60GB paging file I can allocate
>> 30GB of vector independent of
>> the amount of physical memory I have installed.
>>
>> PLEASE STOP USING THE PHRASE "PHYSICAL MEMORY". IT IS
>> NEVER GOING TO HAPPEN! Your
>> continued use of this phrase in consistently erroneous
>> ways is not helping make your case.
>> It keeps screaming "I have no idea what I'm talking about,
>> but answer my question anyway".
>
>So then what does virtual memory map to? The VM pages are
>swapped in and out of something, right?

****
Didn't I just say that? The file is called pagefile.sys, and it is where pages are
swapped. And you have to have a pagefile.sys file big enough to hold your entire virtual
memory. If you do not, then any attempt to allocate additional memory fails because there
isn't any place to swap it out to. Pages, when "committed", are given a fixed place in
pagefile.sys to which they will be swapped. So when you do the allocation (VirtualAlloc
with MEM_COMMIT, which is ultimately what malloc, new, and HeapAlloc call when they need
to expand), it checks to see that there is someplace to swap the pages TO, and that's why
you need a paging file on the order of 50GB, to allow for your large vector and everything
else that might be swapped (which includes parts of the kernel!)
****

****
Because it leaves only 2GB for the kernel, for all I/O, and for all other processes that
are running. Did you think your two programs are the ONLY things that are running? Look
at task manager, processes, and check the option to show ALL processes. Some of these are
critical to Windows, others may be optional but a machine would be severely crippled
without them. Windows has parameters that say how much space needs to be reserved. Note
also that you have not accounted for the non-paged pool. There is no reason to expect
that Win64 thinks 2GB of kernel space is going to be sufficient. And therefore, it will
probably swap. You need to think of LARGE headroom here, and the next step up from 32GB
is the MINIMUM you should consider. And I wouldn't bet that anything less than 64GB is
going to make this viable. So at 32GB you are working with between zero and negative
headroom, and that makes it unlikely that you will be able to get such a large allocation
and avoid swapping and/or be able to use VirtualLock.

And remember, unless you change the working set parameters, Windows doesn't care HOW much
space is there, it is going to trim that working set down by shoving the pages out,
INDEPENDENT of how much space you actually have. And setting working set size requires
administrative authorization (not administrative privileges, although the authorization
has to be set by someone with admin privileges, and as I said, I don't know how to do it,
so I don't know if a value that high can actually be set).
****

****
I run WinXP on a 512MB machine (the machine I am using right now to answer this). Note
that (a) every version of Windows is larger (b) you are trying to compare Win32 with
Win64, and Win64 is inherently larger (c) the processes on Win64 might be larger (d) maybe
you can, maybe you can't, and even if you can, your customers might find it objectionable
to do so, because turning off some of those processes really does cripple the machine
****

>
>>
>> You will never, ever, under any circumstances imaginable
>> be able to allocate physical
>> memory. You can only allocate virtual memory.
>> joe
>
>It seems to me that is merely a mathematical mapping with
>one corresponding to the other equivalent to each other
>except for speed. When two things are equivalent, then they
>can be readily substituted, one for the other. As it
>actually turns out, I would allocate physical memory, that
>is what malloc() does. The OS re-interprets this to mean the
>allocation of virtual memory on some, not all systems.
****

It is, if you take into consideration the fact that there is this massively complex
mapping engine (called Windows) in between. And the complex mapping engine does NOT
guarantee that physical allocation == virtual allocation, and in fact part of its role is
to enforce the fact that virtual allocation is most definitely != physical allocation. So
while in the abstract they look like a smooth mathematical mapping, the reality of the
mapping function says that you have to go to a LOT of effort, some of which may not even be
achievable (until you have researched the administrative controls, for example), to make
that a "pure" mathematical mapping.

The OS has only one role: to convert virtual allocations to physical allocations. And it
does that in real-time, by pushing some virtual allocations out to the paging file to make
room for other virtual allocations.

malloc does not, and never has, not once in its entire existence on Win32, EVER allocated
physical memory. That's because it ultimately calls VirtualAlloc, whose name says it all:
allocate virtual memory. There is NO other mechanism you have available in user space to
allocate memory. There is not, and never has been, a method in user applications to
allocate physical memory (ignoring the weird address extension mechanism, which gets even
stranger once you start looking at it in detail).

The OS has ALWAYS, and FOREVER, interpreted VirtualAlloc as "allocate virtual memory". It
means that on all systems. It meant this in Windows NT 3.1, and 3.5, 3.51, 4.0, 2000, XP,
Vista, and now Windows 7. It never meant anything else, ever. Functions you see like new
just call malloc. The malloc function is interpreted in terms of HeapAlloc, and deep in
its guts, HeapAlloc calls VirtualAlloc. End of discussion. It is virtual memory, all the
time, every time. Physical memory is a happy place for virtual memory, but physical
memory is a transient condition, and subject to Change Without Notice. Always has been.
Unless you use VirtualLock to lock a page down, it is subject to being paged out. And
VirtualLock requires administrative rights be given to the user to use it.

So you are confused. malloc NEVER, EVER meant "allocate physical memory"; it has ALWAYS
meant "allocate virtual memory". Physical memory is a concept managed by the operating
system, which is why when on my little 512MB machine I switch from my newsreader to
Outlook, or vice-versa, there is a measurable delay (and the disk light blinks
frantically) as each of these massive programs contend for the little physical memory
available.

Note, by the way, that the KERNEL *ALSO* works in terms of virtual memory; the kernel
cannot access physical memory. It can only ask the lowest-level memory manager to map a
physical address to a virtual address, and it manages all the physical memory as virtual
memory in a 4GB virtual address space thereafter. So it doesn't care about physical
memory either. For I/O devices, HalTranslateBusAddress maps physical device addresses to
virtual memory addresses, so even a device driver programmer cannot work with physical
memory as physical memory. Deep, deep, DEEP inside Windows is a physical-to-virtual
mapper, but it always works at a VERY low level. Even on a 64GB Xeon server, it can't do
anything with that physical address of memory except pass it to a DMA chip (and there is a
LOT of fun with mapping registers to go above 4GB there!) or convert it to a virtual
address.

So your fixation on physical memory as an allocation mechanism is always wrong. Only in
the kernel is there a call to allocate physical memory, MmAllocateContiguousMemory, and
(a) all the kernel sees is the virtual address of this physical memory (b) it won't work
for large allocations because it probably can't get enough contiguous physical memory to
satisfy them and (c) once gotten, if you would ever need to use it for non-DMA work
(e.g., MmGetSystemAddressForMdlSafe) then it is likely to fail because the kernel would
need a block of that many virtual addresses to handle it, which is unlikely. (I know (b)
is true because a fellow device driver writer hit this limitation.)

And you are not working in the kernel on a device driver. So the ONLY thing you can
allocate is virtual memory. As far as the OS is concerned, physical memory is a
transient, somewhat accidental condition.

For example, if I am running a debug program (remember, the debug allocator initializes
all storage) and I call malloc(1GB), and I'm running on this poor 512MB machine, I will see
massive page swapping as the allocator initializes that 1GB, because on a 512MB machine, to
access all that 1GB, the stuff at the front is going to get swapped out before I get to
the back. And the memory at the back was already swapped out, so it has to be swapped in.
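
To put numbers on it, assuming the usual 4KB x86 page size: 1GB is 262,144 pages, and a
512MB machine can hold at most 131,072 of them at once, so a front-to-back pass cannot
keep the whole block resident. The sketch below walks a buffer one stride at a time the
way an initializing allocator effectively does; the helper name touch_every_page is made
up, and the page size is passed in rather than queried from the OS.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: write one byte per page-sized stride of the buffer and
// report how many strides were touched.
std::size_t touch_every_page(std::vector<char>& buffer, std::size_t page_size)
{
    assert(page_size > 0);
    std::size_t touched = 0;
    for (std::size_t i = 0; i < buffer.size(); i += page_size) {
        buffer[i] = 1;   // an access to a non-resident page causes a page fault
        ++touched;
    }
    return touched;
}
```

On a 16KB buffer with a 4096-byte stride this touches 4 pages. Scale the buffer to 1GB on
a machine with half that much RAM and each of those touches can turn into disk traffic.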

It's virtual memory, all the way.
joe
****