I just spent a few days tracking down a memory leak in NLMSASlice that
was causing me some serious problems. I've submitted three patches and
a brief description to the issue tracker, here:
http://code.google.com/p/pygr/issues/detail?id=27
Description:
cnestedlist.NLMSASlice has a memory leak (under Pyrex 0.9.8.4) where
__dealloc__ cannot properly delete cache objects from
seqdb.cacheProxyDict stored under self.deallocID, because __dealloc__
cannot call Python methods.
---
The first patch adds tests (to seqdb_test) that demonstrate the problem;
the second patch fixes the problem with minimal changes to the code, by
changing seqdb.cacheProxyDict to be a weakref dict; and the third patch
does some wholesale refactoring, resulting in what I think is much
simpler code. Comments -- here or on the tracker -- are appreciated.
You can grab the patches to trunk at
git://iorich.caltech.edu/git/public/pygr-ctb
I've also attached them.
I suspect that this problem is only revealed by the newer Pyrex
versions, but it may have lurked for a while. I'm not sure. In any
case, it's definitely a problem *now* and I think I've fixed it :).
Paranthetically, I think this was the most difficult issue I've actually
succeeded in patching in pygr. Yay me! Luckily the code didn't
actually seem to be that entangled...
cheers,
--titus
--
C. Titus Brown, c...@msu.edu
One question: why not just keep the owner reference directly as an
attribute on the NLMSASlice, rather than employing an intermediate
layer (appending it to a list, then saving the list as an attribute on
the NLMSASlice)?
-- Chris
Weakref support has been added to Pyrex in recent days; I'll go
implement that tonight, it should be pretty easy.
-> One question: why not just keep the owner reference directly as an
-> attribute on the NLMSASlice, rather than employing an intermediate
-> layer (appending it to a list, then saving the list as an attribute on
-> the NLMSASlice)?
Because initially I was concerned with preserving your code
structure, and when I started refactoring I didn't think of it. <sigh>
That is, no good reason :). I will fix.
--t
Note for getting rid of the intermediate list layer: "cdef object" is
how Pyrex lets you declare a "reference to arbitrary Python object"...
Yours,
Chris
Hi, Chris,
I don't know what to say about __dealloc__; I could only reproduce the
error when I tried to call __delitem__ on a dict from within a Pyrex
__dealloc__ function. See
http://iorich.caltech.edu/~t/transfer/public/a.pyx
http://iorich.caltech.edu/~t/transfer/public/b.py
http://iorich.caltech.edu/~t/transfer/public/run.py
for a small example test suite.
-> Note for getting rid of the intermediate list layer: "cdef object" is
-> how Pyrex lets you declare a "reference to arbitrary Python object"...
yep :). I thought using a list() was fine, tho...?
thanks,
--titus
Good ideas all, patch attached and also available at
git://iorich.caltech.edu/git/public/pygr-ctb
> -> One question: why not just keep the owner reference directly as an
> -> attribute on the NLMSASlice, rather than employing an intermediate
> -> layer (appending it to a list, then saving the list as an
> attribute on
> -> the NLMSASlice)?
>
> Good ideas all, patch attached and also available at
>
> git://iorich.caltech.edu/git/public/pygr-ctb
>
> --titus
Hi Titus,
in what branch did you make this change? I just tried pulling again
from your staging branch, and this change did not show up. The last
commit I see on that branch is 9/1 8d996689...
Of course I can try to apply your email patch instead...
Yours,
Chris
Hmm, I thought I'd put it there, but maybe not!? Sometimes I feel like
git loses things... I know that it's more likely I'm scatterbrained.
Anyway, it's been updated; see bbfe9424b29617214afc9f3f0e871a854424b161
on branch staging.
Using my patch is also of course fine :)
------------------------------------------------------------
FAILED:seqdb_test.SeqDBCache_Test.nlmsaslice_cache_test()
------------------------------------------------------------
OUTPUT:
this should not be empty: [{'seq2': [5, 10], 'seq1': [5, 10]}]
this should be empty: [{'seq2': [5, 10], 'seq1': [5, 10]}]
STDERR:
Traceback (most recent call last):
File "protest.py", line 123, in <module>
if do_test(sys.argv[2],sys.argv[3],sys.argv[4]):
File "protest.py", line 33, in do_test
m()
File "/Users/leec/projects/pygr/tests/seqdb_test.py", line 377, in
nlmsaslice_cache_test
assert n2 == 0, '%d objects remain; cache memory leak!' % n2
AssertionError: 1 objects remain; cache memory leak!
Any suggestions?
0.9.8.4
That's odd -- perhaps try starting from scratch? Is the __weakref__
attribute present on NLMSASlice? If so, then it's probably the Pyrex
version.
If you want to keep supporting the old Pyrex -- reasonable enough! -- I
can revert that patch and go back to the slightly messier previous patch
which solves the memory leak AND is compatible with older versions of
Pyrex. Let me know.
--titus
> That's odd -- perhaps try starting from scratch? Is the __weakref__
> attribute present on NLMSASlice? If so, then it's probably the Pyrex
> version.
>
> If you want to keep supporting the old Pyrex -- reasonable enough!
> -- I
> can revert that patch and go back to the slightly messier previous
> patch
> which solves the memory leak AND is compatible with older versions of
> Pyrex. Let me know.
Hi Titus,
I think this is why I gave up on Pyrex's advertised __weakref__
support originally -- it didn't work, at least on my Mac, and it
*still* doesn't. Rather than try to debug Pyrex (version issues?),
I've switched to using your "intermediate object" approach, and that
works fine. I.e. I added an attribute to NLMSASlice called
"weakestLink"; I save a generic Python object there, and hand that
reference to saveCache() as the owner argument. This passes your
cache cleanup test (on my mac) just the way I expect your original
version did...
- Chris