Hi, I have run into some problems with allocating numpy.memmaps
exceeding an accumulated size of about 2 GB. I have found out that
the real problem relates to numpy.memmap using mmap.mmap.
I've written a small test program to illustrate it:
import itertools
import mmap
import os

files = []
mmaps = []
file_names = []
mmap_cap = 0
bytes_per_mmap = 100 * 1024 ** 2
try:
    for i in itertools.count(1):
        file_name = "d:/%d.tst" % i
        file_names.append(file_name)
        f = open(file_name, "w+b")
        files.append(f)
        mm = mmap.mmap(f.fileno(), bytes_per_mmap)
        mmaps.append(mm)
        mmap_cap += bytes_per_mmap
        print "Created %d writeable mmaps containing %d MB" % (i,
            mmap_cap / (1024 ** 2))
finally:
    # Clean up
    print "Removing mmaps..."
    for mm, f, file_name in zip(mmaps, files, file_names):
        mm.close()
        f.close()
        os.remove(file_name)
    print "Done..."
which creates this output:
Created 1 writeable mmaps containing 100 MB
Created 2 writeable mmaps containing 200 MB
....
Created 17 writeable mmaps containing 1700 MB
Created 18 writeable mmaps containing 1800 MB
Removing mmaps...
Done...
Traceback (most recent call last):
File "C:\svn-sandbox\research\scipy\scipy\src\com\terma\kha
\mmaptest.py", line 16, in <module>
mm = mmap.mmap(f.fileno(), bytes_per_mmap)
WindowsError: [Error 8] Not enough storage is available to process
this command
There is more than 25 GB of free space on drive d: at this stage.
Is it a bug or a "feature" of the 32-bit OS?
I am surprised about it as I have not found any notes about these
kinds of limitations in the documentation.
I am in dire need of these large memmaps for my task, and it is not an
option to change OS due to other constraints in the system.
Is there anything I can do about it?
Best wishes,
Kim
It's a limitation, yes. That's what 64-bit OSes are for.
> I am surprised about it as I have not found any notes about these
> kinds of limitations in the documentation.
>
> I am in dire need of these large memmaps for my task, and it is not an
> option to change OS due to other constraints in the system.
>
> Is there anything I can do about it?
Only by partitioning the data yourself and accessing those partitions. Like
in the good old days of DOS programming.
Diez
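The partitioning idea suggested above could be sketched like this (a minimal
illustration in modern Python 3 with numpy, not the poster's actual code; the
ChunkedArray class, file names, and sizes are all invented for the example,
and real chunks would of course be much larger):

```python
import os
import tempfile

import numpy as np

class ChunkedArray:
    """One logical 1-D float64 array stored as several on-disk chunks,
    only one of which is memory-mapped at any time."""

    def __init__(self, directory, n_chunks, chunk_len):
        self.chunk_len = chunk_len
        self.paths = []
        for i in range(n_chunks):
            path = os.path.join(directory, "chunk_%d.dat" % i)
            # Create each chunk file up front; it stays unmapped until used.
            np.memmap(path, dtype=np.float64, mode="w+", shape=(chunk_len,))
            self.paths.append(path)

    def __setitem__(self, index, value):
        chunk, offset = divmod(index, self.chunk_len)
        # Map only the single chunk we need, so address-space use stays small.
        mm = np.memmap(self.paths[chunk], dtype=np.float64, mode="r+",
                       shape=(self.chunk_len,))
        mm[offset] = value
        del mm  # unmap (and flush) this chunk before touching another one

    def __getitem__(self, index):
        chunk, offset = divmod(index, self.chunk_len)
        mm = np.memmap(self.paths[chunk], dtype=np.float64, mode="r",
                       shape=(self.chunk_len,))
        return float(mm[offset])

directory = tempfile.mkdtemp()
arr = ChunkedArray(directory, n_chunks=4, chunk_len=1000)
arr[3500] = 42.0          # lands in chunk 3, offset 500
value = arr[3500]
print(value)              # -> 42.0
```

Because each access maps only one chunk, the total on-disk size can exceed
the 2 GB address-space limit as long as each individual chunk fits.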
>S> OS: Win XP SP3, 32 bit
>S> Python 2.5.4
>S> Hi I have run into some problems with allocating numpy.memmaps
>S> exceeding and accumulated size of about 2 GB. I have found out that
>S> the real problem relates to numpy.memmap using mmap.mmap
On Windows XP the virtual address space of a process is limited to 2 GB
unless the /3GB switch is used in the Boot.ini file.
http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org
Years ago I worked on a Sun Sparc system which had much more limited
shared memory access, due to hardware limitations. So 2 GB seems pretty
good to me.
There is supposed to be a way to tell Windows to reserve only 1 GB of
virtual space for the kernel, leaving 3 GB for application use. But there
are some limitations, and I don't recall what they are. I believe it has
to be done globally (probably in Boot.ini), rather than per process. And
some things didn't work in that configuration.
DaveA
Perhaps you still don't recognize what the limit is. 32 bits can only
address 4 gigabytes of things as first-class addresses. So roughly the
same limit that's on mmap is also on list, dict, bytearray, or anything
else. If you had 20 lists taking 100 meg each, you would fill up
memory. If you had 10 of them, you might have enough room for a 1gb
mmap area. And your code takes up some of that space, as well as the
Python interpreter, the standard library, and all the data structures
that are normally ignored by the application developer.
BTW, there is one difference between mmap and most of the other
allocations. Most data is allocated out of the swapfile, while mmap is
allocated from the specified file (unless you use -1 for fileno).
Consequently, if the swapfile is already clogged with all the other
running applications, you can still take your 1.8gb or whatever of your
virtual space, when much less than that might be available for other
kinds of allocations.
Executables and dlls are also (mostly) mapped into memory just the same
as mmap. So they tend not to take up much space from the swapfile. In
fact, with planning, a DLL needn't take up any swapfile space (well, a
few K is always needed, realistically). But that's a linking issue for
compiled languages.
DaveA
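The anonymous-versus-file-backed distinction mentioned above can be seen
directly in the mmap module (a small sketch in modern Python 3 syntax; the
tiny sizes are just for demonstration):

```python
import mmap
import tempfile

# Anonymous mapping (fileno of -1): pages are backed by the swapfile/pagefile.
anon = mmap.mmap(-1, 4096)
anon[:5] = b"hello"
anon_data = bytes(anon[:5])
print(anon_data)          # b'hello'
anon.close()

# File-backed mapping: pages come from, and are written back to, the file
# itself, so they don't compete for swapfile space.
with tempfile.TemporaryFile() as f:
    f.truncate(4096)                  # the file must cover the mapped range
    fm = mmap.mmap(f.fileno(), 4096)
    fm[:5] = b"world"
    file_data = bytes(fm[:5])
    fm.close()
print(file_data)          # b'world'
```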
I do understand the 2 GB address space limitation. However, I think I
have found a solution to my original numpy.memmap problem (which spun
off into this problem), and that is PyTables, where I can address 2^64
bytes of data on a 32-bit machine using HDF5 files, thus circumventing
the "implementation detail" of the intermediate 2^32 memory address limit
in the numpy.memmap/mmap.mmap implementation.
I just watched the first tutorial video, and that seems like just what
I am after (if it works as well in practice as it appears to do).
http://showmedo.com/videos/video?name=1780000&fromSeriesID=178
Cheers,
Kim
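The PyTables approach described above could look roughly like this (a minimal
Python 3 sketch, not the poster's code; the file name, array name, and sizes
are invented, and it assumes PyTables 3.x with its `open_file`/`create_earray`
API):

```python
import os
import tempfile

import numpy as np
import tables

path = os.path.join(tempfile.mkdtemp(), "big.h5")
h5 = tables.open_file(path, mode="w")
# An EArray is extendable along its first axis, so the on-disk data can
# grow far beyond what a single 32-bit mmap could address; PyTables reads
# and writes it in slices instead of mapping the whole file at once.
arr = h5.create_earray(h5.root, "data", tables.Float64Atom(), shape=(0, 1000))
for i in range(10):
    arr.append(np.full((1, 1000), float(i)))   # append one row at a time
row = h5.root.data[7]                          # only this slice is loaded
print(row[0])                                  # 7.0
h5.close()
```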