Cannot activate sbcl

iu2

Nov 4, 2007, 8:47:05 AM
Hi all,

I (am trying to) use sbcl on Windows. On some PCs it runs ok, while
on others I get the message (upon running sbcl.exe):

VirtualAlloc: 0x1e7.
ensure_space: failed to validate 536870912 bytes at 0x09000000
(hint: Try "ulimit -a"; maybe you should increase memory limits.)

What does it mean? I get this message on a PC with 1 GB RAM so I
believe memory is not the issue.

I'll appreciate your help

iu2

Robert Dodier

Nov 4, 2007, 12:14:29 PM
iu2 wrote:

I believe the problem is that SBCL is attempting to execute code
from a data area of memory, and Windows prevents that (because
some malicious software does that too).

If I remember correctly, the Windows feature which governs this
is Data Execution Prevention. Try to find DEP in the control panel
and disable it for sbcl.exe. I think you have to enter the entire
path.

If that doesn't help, you might get better-informed help via one of
the SBCL-specific mailing lists.

Good luck

Robert Dodier

Alex Mizrahi

Nov 4, 2007, 3:00:56 PM
i> I (am trying to) use sbcl on Windows. On some PCs it runs ok, while
i> on others I get the message (upon running sbcl.exe):

i> VirtualAlloc: 0x1e7.
i> ensure_space: failed to validate 536870912 bytes at 0x09000000
i> (hint: Try "ulimit -a"; maybe you should increase memory limits.)

i> What does it mean? I get this message on a PC with 1 GB RAM so I
i> believe memory is not the issue.

probably that means that the address space is fragmented on those windozes --
there can still be gigabytes of _virtual_ memory, but not in a contiguous
chunk at that location. i think you can do nothing about this, except not
use SBCL, which uses hostile memory management..

p.s.: i haven't ever launched SBCL on Windows, so that's just my guess. i
also think that if you are not the one porting SBCL to Windoze, you don't
need to launch it until it becomes fully supported (i suspect it never will
be). i can barely imagine a reason why you'd really need to do this..


Bernd Schmitt

Nov 4, 2007, 3:26:42 PM
On 04.11.2007 14:47, iu2 wrote:
> I (am trying to) use sbcl on Windows.
> What does it mean? I get this message on a PC with 1 GB RAM so I
> believe memory is not the issue.

You are right, 1GB RAM can be sufficient ;)
I have used* emacs/slime + sbcl since 0.9.x on a 1GB machine (WinXP).
Which version of Win do you use?


Bernd

* periodically to (re-)start learning lisp ...

Joost Diepenmaat

Nov 4, 2007, 5:20:11 PM
On Sun, 04 Nov 2007 05:47:05 -0800, iu2 wrote:
> VirtualAlloc: 0x1e7.
> ensure_space: failed to validate 536870912 bytes at 0x09000000 (hint:
> Try "ulimit -a"; maybe you should increase memory limits.)
>
> What does it mean? I get this message on a PC with 1 GB RAM so I believe
> memory is not the issue.

Having 1 Gb of RAM (or even having 512+ Mb of memory "free") may not in
general mean you can allocate 1/2 a Gb of RAM in one giant blob (as seen
by the allocating program).

The ulimit remark points at an (as far as I know Unix/POSIX-specific)
command that allows per-process limits like memory/stack allocation and
CPU time to be specified. On Unix you can use it to specify limits for
any child process you may start - they're passed down from parent
processes to their children and so on.

Possibly the Windows OS you're using has a similar facility that
currently restricts the maximum allocation/stack size, so if you can
influence it, the problem can be eliminated (if, as I said, that much
memory is available as a single blob at all).

I only have a very superficial knowledge of windows, so I can't help you
with the details.

Joost.

iu2

Nov 5, 2007, 1:00:47 AM
Guys, thanks for your replies. As I get the picture, I won't be able
to deploy my SBCL Lisp program to other PCs in my group without
hacking around with some of the Windows settings...
That's a pity, because I liked SBCL so much (when it worked). It's 10
times faster than CLISP, and keeps all debugging information.

In that case I need to gnu.. ;-)

Thanks
iu2

ggga...@gmail.com

Nov 5, 2007, 2:08:29 AM


Searching for VirtualAlloc on the sbcl-help mailing list gives:

> I keep getting this error when I try to run sbcl. Running ulimit will
> be tough. Any thoughts?
>
> C:\Program Files\Steel Bank Common Lisp\1.0>sbcl
>
> VirtualAlloc: 0x1e7.
> ensure_space: failed to validate 536870912 bytes at 0x09000000
> (hint: Try "ulimit -a"; maybe you should increase memory limits.)

Try

sbcl --dynamic-space-size 128

(or even something smaller). The issue is with SBCL needing certain
chunks of address space for itself, and Windows loading e.g. DLLs there
before SBCL has mapped its spaces. A proper fix is forthcoming, maybe
for 1.0.1, but more likely for 1.0.2.

Don't know if it helps -- Gerard

D Herring

Nov 5, 2007, 2:18:25 AM
iu2 wrote:
> Guys, thanks for your replies. As I get the picture, I won't be able
> to deploy my sbcl lisp program to other PC's in my group without
> hacking with some of the Windows...

Assuming you're just hitting the DEP issue, that's just part of the
headache of configuring an MSWin installer package.

> That's a pity, because I liked sbcl so much (when it worked). It's 10
> times faster than CLISP, and keeps all debugging information.
>
> In that case I need to gnu.. ;-)

If you are doing a lot of Windows devel, Corman Lisp might be a good
investment (or allegro or lispworks).

- Daniel

Alex Mizrahi

Nov 5, 2007, 2:29:01 AM
i> Guys, thanks for your replies. As I get the picture, I won't be able
i> to deploy my sbcl lisp program to other PC's in my group without
i> hacking with some of the Windows...

you can deploy via a VMware virtual machine -- create the machine with the
free VMware Server, and "play" it with the free VMware Player.
Resource usage shouldn't be a problem -- a virtual machine instance with
Linux eats memory comparable to one Eclipse IDE instance..

i> That's a pity, because I liked sbcl so much (when it worked). It's 10
i> times faster than CLISP

have you tried ECL?


Tim Bradshaw

Nov 5, 2007, 5:17:50 AM
On Nov 4, 10:20 pm, Joost Diepenmaat <jo...@zeekat.nl> wrote:

> Having 1 Gb of RAM (or even having 512+ Mb of memory "free") may not in
> general mean you can allocate 1/2 a Gb of RAM in one giant blob (as seen
> by the allocating program).

It should do, unless that program has already done something to
fragment its address space, or there is some limit in place. Remember
it's virtual address space - the physical addresses do not need to be
contiguous.

However what is much less likely to succeed is a request for a chunk
of virtual address space starting from a specific address.

David Lichteblau

Nov 5, 2007, 9:20:09 AM
On 2007-11-05, Tim Bradshaw <tfb+g...@tfeb.org> wrote:
> However what is much less likely to succeed is a request for a chunk
> of virtual address space starting from a specific address.

IIRC the person who sent the original bug report to sbcl-devel about
this also tried my relocation patch with no success, which would imply
that no contiguous space large enough for the default dynamic space size
was available on his system at all.

Now that the --dynamic-space-size option is available, it should be
easier to work around this issue.


And in reply to other articles in this thread:

- sbcl-devel is really a better place for this kind of question.

- If --dynamic-space-size alone does not help, or only works for
really small values, it might be worth trying it in combination with
the relocation patch.

- As far as I know, DEP has nothing to do with this.

- There have been reports that starting SBCL from emacs results in a
memory layout different from that of SBCL started in cmd.exe, but I
don't know the details. Perhaps someone else can say more about
that.


d.

George Neuner

Nov 5, 2007, 9:22:43 AM
On Sun, 04 Nov 2007 05:47:05 -0800, iu2 <isr...@elbit.co.il> wrote:

>Hi all,
>
>I (am trying to) use sbcl on Windows. On some PCs it runs ok, while
>on others I get the message (upon running sbcl.exe):
>
>VirtualAlloc: 0x1e7.
>ensure_space: failed to validate 536870912 bytes at 0x09000000
>(hint: Try "ulimit -a"; maybe you should increase memory limits.)

I posted a longish answer to this question a while back giving some
reasons why Windows might cause the problem:

http://groups.google.com/group/comp.lang.lisp/msg/e03a8ebf612f6a0d

The upshot is you need to reconfigure either SBCL or Windows.

George
--
for email reply remove "/" from address

David Lichteblau

Nov 5, 2007, 9:51:47 AM
On 2007-11-05, George Neuner <gneuner2/@/comcast.net> wrote:
> http://groups.google.com/group/comp.lang.lisp/msg/e03a8ebf612f6a0d

[...]
| This can cause fragmentation of the global user address space such
| that there might not be enough contiguous addresses to satisfy a large
| request.

> The upshot is you need to reconfigure either SBCL or Windows.

What I would like to understand is whether relocation together with a
limit on the size of dynamic space is enough to support Windows, or
whether fragmentation on Windows is so bad that SBCL would need to leave
holes in dynamic space in order to support a reasonably-sized dynamic
space.

- Does placement of dlls affect only those processes that actually
load these dlls or is there an effect on unrelated processes?

- Is there a reason a process might load more dlls on startup under
some circumstances and not others? (As in the case of SBCL being
started from Emacs.)

- Is the situation on 64 bit Windows equally difficult?


Thanks,
David

Dimiter "malkia" Stanev

Nov 5, 2007, 5:43:04 PM

I've had a similar issue in a C program, trying to allocate a huge
contiguous buffer. What I ended up doing was kind of hacky: instead
of relying on HeapAlloc, VirtualAlloc, etc., I made a huge 1.5GB array

char memory[1024*1024*1024 + 512*1024*1024]; /* Uninitialized, in BSS */

and then routed all my memory allocations through it and reused it from
there (using Doug Lea's dlmalloc mspace API).

Q: Why did this work, when just using VirtualAlloc didn't?
A: It looks like if Windows sees that your BSS or DATA section is
enormous, it will relocate any DLLs that might be mapped into that
area somewhere else (HIGH in the memory space, I guess).

Q: Why might a 1.5GB VirtualAlloc almost never work?
A: Possibly because once your application is started, you can't move
code around (or you can't easily move it; there might be some tricks), so
it has to be done before the actual application - SBCL.EXE - is started.

For a crude example, take any application and open it with
Dependency Walker (depends.exe from http://www.dependencywalker.com/);
see how various DLLs (hooks, Google Desktop search is there, IMEs,
etc.) are mapped into it. Most likely on other people's machines they
would be mapped even more differently.

So I would say, if you always want safe deployment, hack SBCL this way:
instead of using VirtualAlloc, make it use a huge uninitialized BSS
section, and use the data from there. It might be quite a lot of hacking,
or not much (I haven't looked into the source code).

George Neuner

Nov 6, 2007, 6:53:49 PM
On 5 Nov 2007 14:51:47 GMT, David Lichteblau
<usene...@lichteblau.com> wrote:

>On 2007-11-05, George Neuner <gneuner2/@/comcast.net> wrote:
>> http://groups.google.com/group/comp.lang.lisp/msg/e03a8ebf612f6a0d
>
>[...]
>| This can cause fragmentation of the global user address space such
>| that there might not be enough contiguous addresses to satisfy a large
>| request.
>
>> The upshot is you need to reconfigure either SBCL or Windows.
>
>What I would like to understand is whether relocation together with a
>limit on the size of dynamic space is enough to support Windows, or
>whether fragmentation on Windows is so bad that SBCL would need to leave
>holes in dynamic space in order to support a reasonably-sized dynamic
>space.
>
> - Does placement of dlls affect only those processes that actually
> load these dlls or is there an effect on unrelated processes?

I hadn't wanted to go into gory detail, but the problem isn't
widely known, so I guess it's time.

First a couple of words about virtual memory - I'm sure you know most
or all of it already, but I want to make sure we're using the same
terminology.

Ignoring physical addresses, each process has 2 sets of corresponding
linear addresses which I'll call "local" and "global". The global
address space is shared by all processes and is the sum total of the
virtual memory being managed by the OS. The local address space is
private to each process. Addresses may overlap in different spaces -
the value stored at a local address in process A may be different than
the value stored at the same local address in process B. Every local
address corresponds to a single global address, however a single
global address may correspond to multiple different local addresses.

Now to DLLs

Windows memory use problems come because it uses relocatable[*] code.
When a program or DLL is loaded for the first time, the system loader
tries to find a contiguous chunk of global address space for it. If
there are multiple segments (the norm), the various segments may be
loaded into separate chunks that may not be adjacent, but each segment
will, itself, be contiguous. The global chunks are then mapped into
the process's local address space and the code is patched to execute
at those addresses. Note: code executes in _local_ address space.

The first issue with DLLs is that most developers just use the default
base address settings when they compile - resulting in every DLL they
create having the same base address. When a program uses several such
DLLs, all but the first loaded will have to be relocated to work.
[Actually even the first DLL may have to be relocated. Read on...]

When a process tries to load an already loaded DLL, the system loader
attempts to reuse it. The loader tries to map the DLL into the new
process using the same local addresses as in the process that first
loaded the DLL. If the mapping succeeds all is well ... but if the
mapping fails, the existing DLL can't be reused (because it has been
patched to work at a particular set of addresses that aren't
available). The loader then has to start over, loading another copy
of the "shared" DLL and patching it to work somewhere in the new
process.

The final bit of bad news is that different programs that use the same
DLLs don't necessarily load them in the same order - it depends on how
they were linked (or _if_ they were linked. see below).


Program executables also share base addresses by default, however when
a program is loaded, all its "preload" and "normal" segments are
mapped before execution begins, so if another copy were to be started,
the addresses would all be the same and so "preload" and "normal"
program segments can be shared. However, developers can still screw
up VMM by marking segments "dynamic". "Dynamic" segments are not
mapped until touched - they act like and share the traits of private
DLLs even though they are structurally part of the program executable.
Dynamic segments go all the way back to the 80286 and Windows 2.0
(maybe even further). Unfortunately 32-bit Windows still supports
them for backward compatibility and idiots still occasionally discover
and use them. I don't know offhand if 64-bit Windows supports them.


Unix's shared libraries are position independent[*], which is why you
don't see the same problems on Unix as on Windows. Windows' memory-use
issues are the compatibility legacy of poor design choices made
back in the days of DOS. PI code was the norm before micros - why the
early micro compiler writers decided to switch to relocatable code is a
mystery to me. Anyone still around who'd like to explain it?

[*] "Relocatable" code is not "position independent" code. See
http://en.wikipedia.org/wiki/Position_independent_code for a pretty
good treatise on both including a section covering the highlights of
Windows DLLs.


> - Is there a reason a process might load more dlls on startup under
> some circumstances and not others? (As in the case of SBCL being
> started from Emacs.)

DLLs can be used without having been linked to the program by using
LoadLibrary() and GetProcAddress(). Unix provides a similar facility
using dlopen() and dlsym(). Many programs use DLLs dynamically in
this fashion to support things like user customization, plug-n-play
modules, levels of service, etc.

And as explained above, even when loading only normal DLLs a program
can get into trouble itself and cause even more trouble for subsequent
programs.


> - Is the situation on 64 bit Windows equally difficult?

64-bit Windows has the same potential problem, but its global address
space is so much larger that it can avoid a lot of address contention
by spacing out the loads.

The only real fix is to return to position independent code.

>Thanks,
>David

You're welcome.

To answer your original question, the problem is a function of
application load order. It is solvable in situations where the load
order can be controlled - such as on a server.

For a more general use workstation, you might try running it solo in a
VM - the illusion of lots of space might allow it to work even if the
reality is lots of paging.

George Neuner

Nov 6, 2007, 7:31:17 PM

It's an interesting suggestion. It still requires enough contiguous
free address space to eventually map the array though. BSS addresses
are reserved at load time (using VirtualAlloc).

As you speculated, the difference between this approach and using
VirtualAlloc dynamically in your code is that your program and all its
DLLs were loaded first and possibly fragmented the space you could
have used.

It is NOT safe for deployment - at least not by itself. Fragmentation
is a function of application load order and it is not limited to the
set of currently running applications - their loading may have been
influenced by other programs that are no longer executing. You would
still have to guarantee that SBCL was loaded at a time when there is
enough contiguous global address space to satisfy the reservation.

Dimiter "malkia" Stanev

Nov 6, 2007, 10:00:43 PM
> It's an interesting suggestion. It still requires enough contiguous
> free address space to eventually map the array though. BSS addresses
> are reserved at load time (using VirtualAlloc).

I could be wrong, but I think what happens is that Windows sees that
your executable has a DATA section that might overlap a DLL's address
space, and then it relocates that DLL (or others) to a different address
range. This happens before your application is started, or even fully
loaded (not sure here; the Wine source code might reveal something in
that direction).

> As you speculated, the difference between this approach and using
> VirtualAlloc dynamically in your code is that your program and all its
> DLLs were loaded first and possibly fragmented the space you could
> have used.

Something like that. Of course if your "array" was, say, 1.9GB, and there
wasn't room for the DLLs to fit alongside it, it won't run. What can be
done here (another tricky thing) is to have many tiny executables with
different BSS sections in them, which then load a bigger executable in
the form of a DLL (e.g. the real executable is in the DLL).

That's not a nice solution, and there might be a way better solution than
this one. But it worked for us, on one specific tool that required such a
big contiguous address space.

> It is NOT safe for deployment - at least not by itself. Fragmentation
> is a function of application load order and it is not limited to the
> set of currently running applications - their loading may have been
> influenced by other programs that are no longer executing. You would
> still have to guarantee that SBCL was loaded at a time when there is
> enough contiguous global address space to satisfy the reservation.

Each application has its own virtual address space. The DLLs are
usually loaded at some "standard" addresses, which is all for the benefit
of not having more than one copy in memory. The hack I'm describing
makes those DLLs get relocated (probably wasting some real memory).

I mean, really: in a perfect world you would say "load all my DLLs at
any address", and this way make sure you have contiguous memory. But
what Windows does (and I guess other OSes too) is a kind of premature
(not really... or maybe) optimization for the sake of saving memory -
e.g. use the same code for any other application - and because it's
loaded at the same addresses, the relocations are the same...

Here is an example to test the idea. Write the following to a file called
a.c, and use the Microsoft Visual Studio compiler to compile it; just
type "cl a.c". (If you use cygwin or mingw, remove the pragmas and add
the libraries to the command line.)

#include <windows.h>

/* 1.85GB of contiguous memory that can be used for allocs. */
unsigned char a[1024*1024*1024 + 850*1024*1024];

#pragma comment(lib, "GDI32")
#pragma comment(lib, "USER32")
#pragma comment(lib, "KERNEL32")

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance,
                   LPSTR lpCmdLine, int nCmdShow)
{
    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0))
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
        a[0] = 10; /* So that the compiler/linker keeps the array a */
    }
    return msg.wParam;
}

Now compile that one to, say, "a.exe", then run depends.exe on a.exe and
press F7. Then sort by Actual Base. Here's what I've got:

Then change the source code of a.c into a2.c by modifying the array a to
be just a[1]:

unsigned char a[1];

Do the same and see the results:

http://img144.imageshack.us/my.php?image=a2oh2.png

As you can see, when the buffer is 1.85GB three DLLs were relocated -
the Google Desktop one, WS2_32 and WS2HELP.DLL. In the second case
(a[1]) the Google Desktop DLL was just sitting "in the middle" of the
virtual space. Once it's there after your application is loaded, I don't
think you can move it.

Note: If that number a[1024*1024*1024 + 850*1024*1024] is too high for
your system, decrease it (otherwise it will say something like "Access
Denied").

Yes, this is the kind of problem you end up with if you want to deliver
it; this number will vary on different systems, but you can either come
up with requirements for the client's machine, or find some other
solution (for example different executables, or an executable that's
generated on the fly, or something even more low-level).

I guess that's now quite off-topic for Common Lisp :)

George Neuner

Nov 7, 2007, 2:00:09 AM
On Tue, 06 Nov 2007 19:00:43 -0800, "Dimiter \"malkia\" Stanev"
<mal...@gmail.com> wrote:

>> It's an interesting suggestion. It still requires enough contiguous
>> free address space to eventually map the array though. BSS addresses
>> are reserved at load time (using VirtualAlloc).
>
>I could be wrong, but I think what happens is that Windows sees that
>your executable has a DATA section that might overlap a DLL's address
>space then it would reallocate that DLL (or others) to different address
>space. This happens before your application is started, or even fully
>loaded (not sure here, the wine source code might reveal something in
>that direction).

No. Load segments are mapped into a newly created process in the
following order:

1. the OS kernel
2. the program executable.
3. statically linked system DLLs
4. statically linked user DLLs

DLLs referenced at runtime using LoadLibrary() are mapped at the time
of the call.

See my reply to David Lichteblau for details on the DLL load process.
http://groups.google.com/group/comp.lang.lisp/msg/03f9f6e0bacd851b


The kernel and many of the system DLLs live above the kernel address
boundary which is normally 2GB (but optionally may be 3GB for a
server). Memory allocations that are marked to be shared between
processes also go above the kernel boundary whenever possible to keep
the user space as open as possible.

The heavily used system DLLs have been deliberately compiled so that
their default load addresses do not conflict. Since they are pretty
much always loaded, they take priority over other lesser used system
DLLs that might have a conflict. The conflicting DLLs will be
relocated at load time. There may also be (and usually are) conflicts
between user written DLLs which may be significant if they are shared.


>> As you speculated, the difference between this approach and using
>> VirtualAlloc dynamically in your code is that your program and all its
>> DLLs were loaded first and possibly fragmented the space you could
>> have used.
>
>Something like that. Of course if your "array" was say 1.9GB, and there
>wasn't memory for the DLLs to fit with it, it won't run. What can be
>done here (another tricky thing) is to have many tiny executables with
>different BSS sections in them, which then load a bigger executable in
>the form of a DLL (e.g. the real executable is in the DLL).

It still won't work if there aren't enough contiguous addresses.


>That's not a nice solution, and there might be way better solution than
>this one. But it worked for us, on one specific tool that required such
>big contigous address space.
>
>> It is NOT safe for deployment - at least not by itself. Fragmentation
>> is a function of application load order and it is not limited to the
>> set of currently running applications - their loading may have been
>> influenced by other programs that are no longer executing. You would
>> still have to guarantee that SBCL was loaded at a time when there is
>> enough contiguous global address space to satisfy the reservation.
>
>Each application has its own virtual address space. The DLL's are
>usually loaded at some "standard" addresses, which is all for benefit of
>not having more than one copy in memory. In the hack I'm describing, it
>would make those DLL's to be reallocated (and probably wasting some real
>memory).

You may actually understand the concepts of virtual memory - that
isn't clear to me just now. What is clear to me is that you don't
understand how Windows uses virtual memory.

32-bit Windows can manipulate 4GB of addresses - period. (A 32-bit x86
CPU can actually handle more but Windows VMM design doesn't permit
it). All of the operating system, concurrently running process
executables, all of their DLLs, and all of their in-memory data must
all fit into that 4GB address space.

This is a separate issue from the paging system which allows a small
physical RAM to simulate a much larger memory.

The VMM gives each process the illusion that it can use all the memory
by creating for it a private parallel address space. Allocated
address regions in that private space are mapped one-to-one with
corresponding address regions carved out of the shared global space.

Your hack may run on your machine, but I can guarantee it won't be
portable to other configurations.

If you want proof that your understanding is wrong, try running 3
instances of your program at the same time. If each process really
gets a separate address space, you should be able to do it. I'm
willing to bet that you can't.

>I guess that's now quite off-topic Common Lisp :)

Common Lisp runs on Windows. This thread is becoming exotic but is
still marginally on-topic AFAICT.

Dimiter "malkia" Stanev

Nov 7, 2007, 2:33:08 AM
Thanks, George!

Good to find such knowledgeable people here. I think I have to read very
carefully what you've posted to David.

And you are right: I can't run more than 8 copies of my app. That's
more than the 3 you speculated, but your speculation is right - it's
because I've set the virtual memory page file to 16GB, and usually
people will have it set lower (4GB max or something like that).

I've also browsed the SBCL source code a bit, and I don't think my hack
would help there unless some other heavy hacking is done. My hack can
give you contiguous memory (if you set the VM pagefile big enough), but
later you can't reuse that memory with VirtualAlloc() - e.g. you can't
give it back to the VM for other stuff. Which might mean that you can't
map it for code execution - pretty useless for a lisp system :) Or
can you?

One of these days I have to dabble more in the Windows VM stuff -
pretty much my daily job is optimizing the tools used to build two games
here, and those are constantly running out of memory (that would be C++
memory, hehehe)

Cheers!

Tim Bradshaw

Nov 7, 2007, 6:53:23 AM
On Nov 6, 11:53 pm, George Neuner <gneuner2/@/comcast.net> wrote:

> [Many interesting things ]

> Unix's shared libraries are position independent[*] which is why you
> don't see the same problems on Unix as on Windows. Windows' memory
> use issues are the compatibility legacy of poor design choices made
> back in the days of DOS.

A light has just come on. I did not realise that Windows used
relocatable, not position-independent, libraries (coming from an
entirely Unix background it just never occurred to me), and I now
understand why virtual mappings can be such a horrific saga for
Windows, especially 32-bit Windows, where there's not a vast amount of
address space to go around.

Thanks! I know a new thing now.

--tim

Michael Livshin

Nov 7, 2007, 7:27:48 AM
George Neuner <gneuner2/@/comcast.net> writes:

> PI code was the norm before micros - why the early micro compiler
> writers decided to switch to relocatable is a mystery to me. Any
> still around who'd like to explain it?

let me guess: because PI code eats up a /whole register/, of which x86
doesn't exactly have a lot to begin with -- so, together with the
observation that those early micros didn't do multiprocessing anyway,
the cost/benefit analysis went pretty straightforwardly?

cheers,
--m

John Thingstad

Nov 7, 2007, 12:05:27 PM
On Wed, 07 Nov 2007 08:33:08 +0100, Dimiter "malkia" Stanev
<mal...@gmail.com> wrote:

>
> One of these days I have to dabble more into the Windows VM stuff -
> pretty much my daily job is optimizing the tools used to build two games
> here and those are constantly running out of memory (that would be C++
> memory, hehehe)
>
> Cheers!

Microsoft Windows Internals is your friend.
Available in a dead-tree version from Microsoft Press, or you can dig up
something with Google.

--
Sent with Opera's revolutionary e-mail program: http://www.opera.com/mail/

George Neuner

Nov 8, 2007, 4:33:01 PM
On Tue, 06 Nov 2007 23:33:08 -0800, "Dimiter \"malkia\" Stanev"
<mal...@gmail.com> wrote:

>Thanks, George!
>
>Good to find such knowledgeable people here. I think I have to read very
>carefully what you've posted to David.
>
>And you are right, I can't run more than 8 copies of my app, even that's
>more than 3 as you've speculated, but your speculation is right - it's
>because I've set the virtual memory page file to be 16GB of memory, but
>then again usually people will have that at lower (4GB max or something
>like that).

You must be running a 64-bit version of Windows - 32-bit versions are
limited to 4GB page files. Either way you can't allocate any more
than the VMM model allows. Even though the 32-bit x86 can address 64TB
using segments, 32-bit Windows uses only 2 segment selectors (kernel
and user) and 4GB of addresses. The segment base registers are all
zeroed and addresses are just the 32-bit offset. 64-bit Windows
doesn't use segments at all - the x86-64 doesn't have segmenting -
instead each page has a kernel/user protection bit.

But if you are running 64-bit, then I'm very interested in why you had
to use a hack to allocate a 2GB array. Do you have a resource limit
set? Or is it just some old code from 32-bit land?


>I've also browsed a bit the SBCL source code, and I don't think my hack
>would help there, unless some other heavy hacking is done. Somehow my
>hack can give you contiguous memory (if you set VM pagefile to be big
>enough), but later you can't reuse that memory for VirtualAlloc() - e.g.
>you can't give it back to VM for other stuff. Which might mean that you
>can't map it for code execution - pretty useless for a lisp system :) Or
>can you?

Yes you can give it back (see VirtualFree), but it's dangerous because
C still thinks it is a valid array: you'll get a protection fault if
you touch any part of it that has been released to the VMM.


>One of these days I have to dabble more into the Windows VM stuff -
>pretty much my daily job is optimizing the tools used to build two games
>here and those are constantly running out of memory (that would be C++
>memory, hehehe)
>
>Cheers!

And to you!

George Neuner

Nov 8, 2007, 5:01:09 PM

Actually the question was mostly rhetorical. There is a dearth of
registers as you suggest, but also the x86 instruction set makes PI
code difficult to write:

- there is no unconditional PC-relative branch, the unconditional JMP
takes an absolute address or an absolute offset within the current
segment.
- the conditional PC-relative branch instructions were limited to ±127
bytes on the 16-bit chips and (somewhat ridiculously) limited to ±32KB
on the 32-bit chips.
- there is no branch-to-subroutine instruction, the CALL instruction
takes an absolute address.

On some other difficult CPUs you can get around the short branch
problem by calculating the effective address of the target, pushing it
on the stack and simulating a return from subroutine. But on the x86,
the RET instruction pulls several register values from the stack - not
just the PC - so it is complex to do and rarely worth the effort.

Anyway - enough ancient history. We now return to your regularly
scheduled c.l.l.

Dimiter "malkia" Stanev

Nov 9, 2007, 3:01:40 PM
Hi George,

> You must be running a 64-bit version of Windows - 32-bit versions are
> limited to 4GB page files.

Somehow I've managed to use 16GB and I'm using the 32-bit version of
Windows XP (latest service pack, updates, etc.), and I was able to use
all that memory (e.g. I was able to run 9 copies of the "a.exe" with
1.85GB of memory in each = 16.65GB). My machine has 4GB of RAM plus an
additional 16GB of virtual memory.

Here are my VM settings:

http://img220.imageshack.us/my.php?image=vmyo2.png

> Either way you can't allocate any more
> than the VMM model allows. Even though the 32-bit x86 can address 64TB
> using segments, 32-bit Windows uses only 2 segment selectors (kernel
> and user) and 4GB of addresses.

But each application has its own virtual address space (2GB user, 2GB
kernel), or you can set up your Windows and recompile your executables
for 3GB user, 1GB kernel.

> But if you are running 64-bit, then I'm very interested in why you had
> to use a hack to allocate a 2GB array. Do you have a resource limit
> set? Or is it just some old code from 32-bit land?

Well, I'm not :) But I wish to (dreaming that one day everybody would
switch to OS X at the workplace - still not as far-fetched as dreaming
everybody would switch to Common Lisp).

>> I've also browsed a bit the SBCL source code, and I don't think my hack
>> would help there, unless some other heavy hacking is done. Somehow my
>> hack can give you contiguous memory (if you set VM pagefile to be big
>> enough), but later you can't reuse that memory for VirtualAlloc() - e.g.
>> you can't give it back to VM for other stuff. Which might mean that you
>> can't map it for code execution - pretty useless for a lisp system :) Or
>> can you?
>
> Yes you can give it back (see VirtualFree), but it's dangerous because
> C still thinks it is a valid array, but you'll get a protection fault
> if you touch any part of it that has been released to the VMM.

Ah, that's an interesting idea, but I was not able to get it working.
It might work, as long as this whole array is in a separate "segment"
(not sure about the correct term). This little application might show
what I mean by that (I've actually tried something like this):

#include <stdio.h>
#include <string.h>
#include <windows.h>

int stump1;
char huge[1024*1024*1800];
int stump2;

int main()
{
    unsigned i = 0;
    printf( "BaseAddr,AllocBas,AllocPro,Reg.Size, State, Protect, Type\n" );

    huge[1024*1024*1024] = 100;
    huge[1024*1024*1500] = 100;

    memset( huge, 0, 1024*1024 );

    for( ;; )
    {
        MEMORY_BASIC_INFORMATION mbi;
        memset( &mbi, 0, sizeof(mbi) );
        VirtualQuery( (LPCVOID)(ULONG_PTR)i, &mbi, sizeof(mbi) );

        printf( "%p,%p,%p,%p,%p,%p,%p\n",
            mbi.BaseAddress,
            mbi.AllocationBase,
            (void*)(ULONG_PTR)mbi.AllocationProtect,
            (void*)mbi.RegionSize,
            (void*)(ULONG_PTR)mbi.State,
            (void*)(ULONG_PTR)mbi.Protect,
            (void*)(ULONG_PTR)mbi.Type );

        if( i + mbi.RegionSize <= i )   /* wrapped around: end of address space */
            break;
        i += mbi.RegionSize;
    }

    for( ;; ) {}  /* spin so the memory map can be inspected */
}

Cheers!

George Neuner

Nov 10, 2007, 2:56:46 AM
On Fri, 09 Nov 2007 12:01:40 -0800, "Dimiter \"malkia\" Stanev"
<mal...@gmail.com> wrote:

>Hi George,
>
>> You must be running a 64-bit version of Windows - 32-bit versions are
>> limited to 4GB page files.
>
>Somehow I've managed to use 16GB and I'm using the 32bit version of
>Windows XP (Latest service pack, updates, etc), and I was able to use
>all that memory (e.g. I was able to run 9 copies of the "a.exe" with
>1.85GB of memory in each = 16.65GB). My machine has 4GB itself + 16GB
>virtual memory additional.

Ok, I was just reminded privately that XP Workstation (kind of)
supports PAE.

PAE allows access to RAM above 4GB and virtual memory (RAM + page
file) up to 128GB. Applications still are limited to 4GB of addresses
at any one time, but can access RAM above 4GB through AWE, a paging
scheme similar to Expanded memory under DOS (see VirtualAlloc's
MEM_PHYSICAL switch, AllocateUserPhysicalPages, MapUserPhysicalPages).
32-bit server versions of Windows allow more than 4GB of RAM to be
installed so enabling PAE for them makes sense - in addition to use by
applications, the OS can use high RAM to hot swap programs.

However, XP allows only 4GB of RAM regardless of PAE, so AWE doesn't
work. But when PAE is enabled, XP will allow a very large page file
to be created and use the extra space to allow running more programs.
You must enable PAE to get a 4GB configuration on XP (actually to use
more than 3GB of physical RAM), but IMO it really doesn't make sense to
configure an enormous page file. VMM page table entries double in
size under PAE and the swapped pages on disk have to be tracked, so
you're actually giving up a significant bit of memory without gaining
any of the benefits of AWE.

I had known about the PAE support in XP, but had dismissed it because
it is crippled and essentially useless from the application's point of
view. Please forgive me if I got overly pedantic arguing with you.

Dimiter "malkia" Stanev

Nov 10, 2007, 4:13:38 AM
George Neuner wrote:
> On Fri, 09 Nov 2007 12:01:40 -0800, "Dimiter \"malkia\" Stanev"
> <mal...@gmail.com> wrote:
>
>> Hi George,
>>
>>> You must be running a 64-bit version of Windows - 32-bit versions are
>>> limited to 4GB page files.
>> Somehow I've managed to use 16GB and I'm using the 32bit version of
>> Windows XP (Latest service pack, updates, etc), and I was able to use
>> all that memory (e.g. I was able to run 9 copies of the "a.exe" with
>> 1.85GB of memory in each = 16.65GB). My machine has 4GB itself + 16GB
>> virtual memory additional.
>
> Ok, I was just reminded privately that XP Workstation (kind of)
> supports PAE.

That could be the case. I've heard of it, but never really thought that
Windows XP might use it in 32-bit mode without some extra API calls.

It looks like we do have some pretty cool machines at work :) - the HP
xw8400 desktop:

http://h10010.www1.hp.com/wwpc/us/en/sm/WF05a/12454-12454-296719-307907-296721-1844968.html

Ours have the quad-core Intel Xeon 5355 CPU at 2.66 GHz, with 8 MB L2
cache, a 1333 MHz bus, and custom video cards.

> PAE allows access to RAM above 4GB and virtual memory (RAM + page
> file) up to 128GB. Applications still are limited to 4GB of addresses
> at any one time, but can access RAM above 4GB through AWE, a paging
> scheme similar to Expanded memory under DOS (see VirtualAlloc's
> MEM_PHYSICAL switch, AllocateUserPhysicalPages, MapUserPhysicalPages).
> 32-bit server versions of Windows allow more than 4GB of RAM to be
> installed so enabling PAE for them makes sense - in addition to use by
> applications, the OS can use high RAM to hot swap programs.
>
> However, XP allows only 4GB of RAM regardless of PAE, so AWE doesn't
> work. But when PAE is enabled, XP will allow a very large page file
> to be created and use the extra space to allow running more programs.
> You must enable PAE to get a 4GB configuration on XP (actually to use
> more than 3GB of physical RAM) but IMO it really doesn't make sense to
> configure an enormous page file. VMM page table entries double in
> size under PAE and the swapped pages on disk have to be tracked, so
> you're actually giving up a significant bit of memory without gaining
> any of the benefits of AWE.

Aha, that clears it up then. Fortunately all our machines should support
it, so I can deploy the hacky mechanism I've described. I'm already using
it in an application that serves as a quick/dirty replacement for a
swapping memory manager: it uses another application's virtual memory for
address space and accesses it through ReadProcessMemory /
WriteProcessMemory (a highly hacked mechanism that is not recommended for
general usage, especially Read/WriteProcessMemory).

> I had known about the PAE support in XP, but had dismissed it because
> it is crippled and essentially useless from the application's point of
> view.

> Please forgive me if I got overly pedantic arguing with you.

On the contrary! The information you provided gave me insights into some
of the things I thought I knew, and made me go read about them (and
other things) again.

That's what I like about comp.lang.lisp - you can find people with great
knowledge on lots of different topics :)

>
> George
> --
> for email reply remove "/" from address

Cheers,
malkia

George Neuner

Nov 10, 2007, 11:24:36 PM
On Sat, 10 Nov 2007 01:13:38 -0800, "Dimiter \"malkia\" Stanev"
<mal...@gmail.com> wrote:

>... I can deploy the hacky mechanism I've described. I'm already using it

>in an application that serves as a quick/dirty replacement for a swapping
>memory manager that would use another application's virtual memory for
>address space and through ReadProcessMemory WriteProcessMemory would
>access it (highly hacked mechanism, that is not recommended for general
>usage, especially Read/WriteProcessMemory).

If you can stipulate execution on a server, I think it would be better
to use AWE and directly access high RAM. Or better yet, move to
64-bits and forget the hacks altogether.

If you have to run on a 32-bit workstation, I think you're fooling
yourself with this memory server hack. A workstation can't use more
than 4GB RAM, so the "server" processes get swapped and all your
"memory" data ends up being stored on disk in the page file. Plus the
xxProcessMemory functions copy data rather than accessing it directly,
so the pages you're accessing may have to be swapped in anyway and
then you add security overhead moving data across process boundaries.

It would be more straightforward, and probably more efficient to just
create one or more contiguous files and map views of them as necessary
to access the data. If the data is transient and you don't care to
retain it, you can allocate space directly from the page file.
(see CreateFileMapping w/ hFile = INVALID_HANDLE_VALUE)

I think we've beat this horse to death now and we've definitely
wandered off topic for c.l.l. Good developing!
