>>> a=[str(i) for i in xrange(10000000)]
This takes 635m/552m/2044 memory (VIRT/RES/SHR)
>>> b={}
>>> for i in xrange(10000000):
...     b[str(i)]=i
Then the memory usage increased to 1726m/1.6g/2048
>>> del b
I expected the memory usage to drop to the amount before b was
created (635m/552m/2044), but it is actually 1341m/1.2g/2048
Could anyone please explain why this happens? It seems some memory
is not freed. I'm running into problems with this, as my program is
very memory-consuming and needs to frequently free objects to reuse
the memory. What is the best way to free the memory of b completely
(i.e., back to the state before b was created)? I'm using Python 2.5.2.
Thanks.
Yuanxin
My understanding is that for efficiency purposes Python hangs on to
the extra memory even after the object has been GC-ed and doesn't give
it back to the OS right away. This shouldn't matter to your program
though, as the "extra" memory will still be available for its use. And
assuming you're using CPython and a/b isn't referred to by anything
else, I believe they should be GC-ed immediately by the refcount
system after the 'del' is executed. So, in theory you shouldn't need to
worry about any of this, unless your example is not accurate and
you're dealing with cyclical references or C objects or something.
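A tiny sketch of that refcounting behavior (CPython, Python 2; the
Noisy class is just an illustration):

class Noisy(object):
    def __del__(self):
        print "freed"

n = Noisy()
del n       # prints "freed" immediately: the refcount hit zero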
Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com
> I'm having some problems with the memory recycling/garbage collecting of
> the following testing code:
>
>>>> a=[str(i) for i in xrange(10000000)]
> This takes 635m/552m/2044 memory (VIRT/RES/SHR)
>
>>>> b={}
>>>> for i in xrange(10000000):
> ...     b[str(i)]=i
>
> Then the memory usage increased to 1726m/1.6g/2048
>
>>>> del b
> I expected the memory usage to drop to the amount before b was
> created (635m/552m/2044), but it is actually 1341m/1.2g/2048
>
> Could anyone please explain why this happens? It seems some memory is
> not freed.
It seems the memory is not given back to the operating system. This
doesn't mean that it is not freed by Python and can't be used again by
Python. Create the dictionary again and see if the memory usage rises
again or if it stays stable.
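For example (a rough sketch; watch the VIRT/RES columns in top while
it runs):

b = {}
for i in xrange(10000000):
    b[str(i)] = i
del b
# Second build: if Python reuses its freed arenas, the process size
# should stay roughly where it was after the first build.
b = {}
for i in xrange(10000000):
    b[str(i)] = i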
Ciao,
Marc 'BlackJack' Rintsch
'gc.collect()' might help, I believe, but I'm not a specialist in it.
If I understand correctly, that only affects objects that are part of
a reference cycle, and it doesn't necessarily force the freed memory to
be released to the OS.
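A small sketch of the distinction (Python 2):

import gc

a = []
a.append(a)         # the list refers to itself: a reference cycle
del a               # the refcount never reaches zero here
print gc.collect()  # the cycle collector reclaims it; prints >= 1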
I believe that's correct.
If the OP is worrying about memory usage then they should also be aware
that Python's built-in types do a lot of very clever pre-allocation and
retention of memory to let them scale well, which can be confusing when
you're looking at memory usage.
Basically malloc() and free() are computationally expensive, so Python
tries to call them as little as possible - but it's quite clever at
knowing what to do - e.g. if a list has already grown large then Python
assumes it might grow large again and keeps hold of a percentage of the
memory.
The outcome is that trying to reduce memory usage can change what data
structures you should use - tuples use less space than lists, etc.
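For instance, a quick comparison (this needs Python 2.6+, where
sys.getsizeof was added):

import sys

items = range(1000)
# A tuple is allocated to its exact size; a list carries extra header
# fields and may keep over-allocated spare slots on top of that.
print sys.getsizeof(tuple(items))
print sys.getsizeof(list(items))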
Tim W
Even if Python did free() the space no longer used by its own memory
allocator (PyMem_Malloc(), PyMem_Free() & Co), the OS usually doesn't
return this space to the global free memory pool but instead leaves it
assigned to the process, again for performance reasons. Only when the
OS is running low on memory will it go and reclaim the free()d memory
of processes. There might be a way to force your OS to do so earlier
manually if you really want to, but I'm not sure how you'd do that.
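On Linux with glibc you could try malloc_trim() through ctypes (a
speculative sketch, not a supported Python API; whether it helps
depends on heap fragmentation):

import ctypes

# Ask glibc's allocator to hand free()d heap pages back to the kernel.
# Linux/glibc only; other platforms have no such call.
libc = ctypes.CDLL("libc.so.6")
libc.malloc_trim(0)   # pad=0: trim as much as possible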
Regards
Floris
Python uses malloc() and free() to allocate memory on the heap. Most
malloc() implementations don't give back memory to the system. Instead
the memory segment is still assigned to the process. In order to give
back memory to the system pool, a memory manager has to use mapped
memory (mmap()) instead of increasing the heap by changing the data
segment size with brk(). This isn't a Python flaw but a general issue
with malloc() based memory management. [1]
By the way Python has its own memory management system on top of the
system's malloc() system. The memory arena system is explained in great
detail in the file obmalloc.c [2].
Christian
[1] http://en.wikipedia.org/wiki/Malloc#Implementations
[2]
http://svn.python.org/view/python/branches/release25-maint/Objects/obmalloc.c?revision=65261&view=markup
You are almost right. Python's mutable container types like lists have
a non-linear growth rate.
From the file listobject.c:
/*
This over-allocates proportional to the list size, making room
for additional growth. The over-allocation is mild, but is
enough to give linear-time amortized behavior over a long
sequence of appends() in the presence of a poorly-performing
system realloc().
The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
*/
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
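To see where the quoted growth pattern comes from, here is a small
sketch (note: in the full list_resize() code, newsize itself is added
to the slack computed on the line above):

def new_allocated(newsize):
    # slack from the quoted C line, plus the requested size itself
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)

allocated = 0
sizes = []
for n in range(1, 89):
    if n > allocated:          # an append would overflow: grow
        allocated = new_allocated(n)
        sizes.append(allocated)
print sizes   # -> [4, 8, 16, 25, 35, 46, 58, 72, 88]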
Sorry, I think I didn't phrase myself very well - I was trying to
explain that de-allocation of memory follows different scaling
behaviour from allocation - so a large list that has been shrunk is
likely to take more memory than a small list that has grown to the
same length, i.e. the part just above your quote:
/*
Bypass realloc() when a previous overallocation is large enough
to accommodate the newsize. If the newsize falls lower than half
the allocated size, then proceed with the realloc() to shrink the list.
*/
if (allocated >= newsize && newsize >= (allocated >> 1)) {
assert(self->ob_item != NULL || newsize == 0);
Py_SIZE(self) = newsize;
return 0;
}
it's all very clever stuff btw.
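A rough way to observe it (again assuming Python 2.6+ for
sys.getsizeof):

import sys

x = []
for i in range(1000):
    x.append(i)
del x[600:]   # 600 is still >= half the allocated size: realloc skipped
y = []
for i in range(600):
    y.append(i)
print sys.getsizeof(x) > sys.getsizeof(y)   # expected: True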
There is a "bug" in versions of Python prior to 2.5 where memory
really isn't released back to the OS. Python 2.5 contains a new object
allocator that is able to return memory to the operating system, which
fixes this issue. Here's an explanation:
http://evanjones.ca/python-memory-part3.html
What version of Python are you using? I have a machine running several
long-running processes, each of which occasionally spike up to 500M
memory usage, although normally they only require about 25M. Prior to
2.5, those processes never released that memory back to the OS and I
would need to periodically restart them. With 2.5, this is no longer a
problem. I don't always see memory usage drop back down immediately
but the OS does recover the memory eventually. Make sure you use 2.5
if this is an issue for you.
--David
http://mail.python.org/pipermail/python-dev/2006-March/061991.html
> For simpler fun, run this silly little program, and look at memory
> consumption at the prompts:
>
> """
> x = []
> for i in xrange(1000000):
>     x.append([])
> raw_input("full ")
> del x[:]
> raw_input("empty ")
> """
>
> For example, in a release build on WinXP, VM size is about 48MB at the
> "full" prompt, and drops to 3MB at the "empty" prompt. In the trunk
> (without this patch), VM size falls relatively little from what it is
> at the "full" prompt (the contiguous vector holding a million
> PyObject* pointers is freed, but the obmalloc arenas holding a
> million+1 list objects are never freed).
>
> For more info about the patch, see Evan's slides from _last_ year's PyCon:
>
> http://evanjones.ca/memory-allocator.pdf
I'm not sure what deleting a slice accomplishes (del x[:]); the
behavior is the same whether I do del x or del x[:]. Any ideas?
--David
If there is another reference to the list, which there well might be in
an actual application with memory problems, then 'del x' only unbinds
the name; the object and its large contents are not gc'ed.
> del x[:] leaves x referencing a cleared list.
which is guaranteed to be cleared, regardless of other refs.
Terry
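A minimal sketch of that difference (Python 2):

x = [[] for i in xrange(5)]
y = x              # a second reference to the same list object
del x              # only unbinds the name; the list survives via y
print len(y)       # -> 5

x = [[] for i in xrange(5)]
y = x
del x[:]           # clears the contents in place, for every reference
print len(y)       # -> 0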