I am having some difficulty with memory management with numpy arrays. I have some C code that creates a fairly large numpy array (2 GB) and passes it back to Python. Checking the reference count at this point gives 2. After performing a further operation, the reference count is still 2, and then I delete it.
The problem is that the memory never gets released. It does not take too many passes for this to basically fail as the system runs out of memory.
Does anyone have any ideas? I can send code later. What should the reference count be before del to ensure that the object is garbage collected?
I find calling gc.collect() does not solve the problem.
Any ideas?
S
_______________________________________________
SciPy-User mailing list
SciPy...@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user
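On the gc.collect() point above, a minimal sketch of the distinction that matters here, assuming CPython: an object with no reference cycle is freed as soon as its last reference goes away, and gc.collect() only matters for objects caught in a cycle. Block and the two function names are hypothetical, made up for this illustration; a weak reference is used as a probe because it does not keep its target alive.

```python
import gc
import weakref


class Block:
    """Hypothetical stand-in for a large object; it supports weak references."""

    def __init__(self):
        self.partner = None


def freed_by_del_alone():
    """Return True if an acyclic object is gone right after del (CPython refcounting)."""
    obj = Block()
    probe = weakref.ref(obj)  # a weak reference does not keep obj alive
    del obj                   # last strong reference dropped here
    return probe() is None


def cycle_needs_collector():
    """Return (alive_before, alive_after) around gc.collect() for a two-object cycle."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep an automatic collection from running mid-demonstration
    try:
        a, b = Block(), Block()
        a.partner, b.partner = b, a   # a and b now refer to each other
        probe = weakref.ref(a)
        del a, b                      # the cycle keeps both refcounts above zero
        alive_before = probe() is not None
        gc.collect()                  # the cycle collector reclaims the pair
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


print(freed_by_del_alone())
print(cycle_needs_collector())
```

If the array is not part of a cycle, then gc.collect() having no effect is expected and does not by itself say where the memory is going.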
David
Thanks for your reply. So far I am not inc'refing the variable, as far as I can tell.
My C code produces the array as:
qOut = PyArray_SimpleNew(2, dims, NPY_DOUBLE);
I assign the data using the pointer I get from:
qOutp = (_float *)PyArray_DATA(qOut);
When the routine ends, it is passed back by:
return Py_BuildValue("N", qOut);
(I have also tried just "return qOut;")
If I check the refcount when the routine returns, I get a value of 2 from the code below:
totSet = ctrans.ccdToQ(...)
print(sys.getrefcount(totSet))
If I Py_DECREF before the C routine returns, then the code segfaults.
Thanks,
Stuart
I wish I could help, but what I can tell you is that this stuff is a
pain. When I was writing C extensions by hand, I think I managed to
always pass in a results array, so I didn't have to create anything new
in the extension to avoid these issues.
Now I use Cython to avoid them -- you may want to give that a look-see.
It can save a lot of pain.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Is it like that on the C side?
In Python it is always one larger than it should be:
In [17]: import sys
In [18]: import numpy
In [19]: x = numpy.array([1,2,3])
In [20]: sys.getrefcount(x)
Out[20]: 2
Sebastian
That's because the refcount is incremented when x is passed to the
getrefcount() function itself -- it will drop back to one when the
function is done running.
-Chris
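The getrefcount() behaviour described above can be checked with a small sketch. Because the reported number includes the temporary reference made by the call itself, and the exact absolute value can vary between interpreter versions, this compares counts rather than trusting any single reading. The function name is hypothetical, and a plain list stands in for the array.

```python
import sys


def refcount_readings():
    """Return getrefcount readings before, during, and after a second name exists."""
    x = [1, 2, 3]                    # one name bound to the object
    before = sys.getrefcount(x)      # includes the call's own temporary reference
    y = x                            # a second name: one more real reference
    during = sys.getrefcount(x)
    del y                            # drop the second name again
    after = sys.getrefcount(x)
    return before, during, after


before, during, after = refcount_readings()
print(before, during, after)
print("extra references from the second name:", during - before)
```

The difference between readings is the useful part: binding one more name raises the count by exactly one, and deleting it brings the count back.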
You also might want to try calling gc.get_referrers(arr) before
deleting it, to check if you have any stray references to the array
still around. (gc.get_referrers isn't guaranteed to detect every
reference, I think, but it will detect many.)
-- Nathaniel
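A minimal sketch of the gc.get_referrers() check suggested above. Block, find_holders, and the container names are all hypothetical, invented for this illustration. As the gc documentation notes, only containers that support garbage collection are reported, so an extension type holding a reference without GC support would not show up.

```python
import gc


class Block:
    """Hypothetical stand-in for the array under suspicion."""


def find_holders(obj, candidates):
    """Return the names of the candidate containers that directly refer to obj."""
    referrers = gc.get_referrers(obj)
    return [
        name
        for name, container in candidates.items()
        if any(r is container for r in referrers)
    ]


big = Block()
cache = {"last_result": big}   # a stray reference kept in a dict
history = [big]                # another one kept in a list
unrelated = [Block()]          # holds a different object entirely

holders = find_holders(
    big, {"cache": cache, "history": history, "unrelated": unrelated}
)
print(holders)
```

In real code the candidates are not known in advance, so one would usually just print the types of everything gc.get_referrers(arr) returns and look for anything unexpected.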
Thanks for the info. All the refcounts reported by getrefcount() are 2, which means they are really 1 (from what you say).
In that case I am really confused: if I "del" the array, or even "del" the whole class in which the arrays are defined, the memory still leaks. Is it possible that these numpy arrays are not being freed by the gc, or that the memory is not being freed when the numpy array is deleted?
This is really frustrating!
S
I suspect that it's not a problem with the python object sticking
around, but with the underlying memory. Are you quite sure you're not
mallocing something you're not freeing?
Sorry, I've deleted your original post, but are you passing in a data
pointer to PyArray_NewFromDescr() or PyArray_New()? If so, then you need
to free that memory in your own code.
Anyway, perhaps you could post your array creation code again, and
someone will see something (or just use cython!)
-Chris
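One way to follow up, from the Python side, on the suspicion that the underlying memory rather than the Python object is leaking: a rough sketch using the standard-library tracemalloc module. It only sees memory obtained through Python's own allocators, so a buffer taken with a bare malloc() in C would not appear in these numbers. If traced memory stays flat across passes while the process keeps growing, that would point at the C side. run_pass is a hypothetical stand-in for one pass of the real routine, and a bytearray stands in for the array.

```python
import tracemalloc


def run_pass(n_bytes):
    """Hypothetical stand-in for one pass: allocate a large buffer and return it."""
    return bytearray(n_bytes)


def traced_growth(n_passes, n_bytes):
    """Return (net growth, peak) in traced bytes over n_passes create/delete cycles."""
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()
        for _ in range(n_passes):
            buf = run_pass(n_bytes)
            del buf                   # refcount hits zero, buffer is released
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current - baseline, peak - baseline


growth, peak = traced_growth(5, 10_000_000)
print("net growth in traced bytes:", growth)
print("peak traced bytes:", peak)
```

Here the peak reflects one live buffer at a time and the net growth stays small, which is what a pass that releases its memory should look like.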