Memory leak in NLMSASlice -- fixed.

1 view
Skip to first unread message

C. Titus Brown

unread,
Sep 1, 2008, 2:19:47 PM9/1/08
to pygr...@googlegroups.com
Hi everyone,

I just spent a few days tracking down a memory leak in NLMSASlice that
was causing me some serious problems. I've submitted three patches and
a brief description to the issue tracker, here:

http://code.google.com/p/pygr/issues/detail?id=27

Description:

cnestedlist.NLMSASlice has a memory leak (under Pyrex 0.9.8.4) where
__dealloc__ cannot properly delete cache objects from
seqdb.cacheProxyDict stored under self.deallocID, because __dealloc__
cannot call Python methods.

---

The first patch adds tests (to seqdb_test) that demonstrate the problem;
the second patch fixes the problem with minimal changes to the code, by
changing seqdb.cacheProxyDict to be a weakref dict; and the third patch
does some wholesale refactoring, resulting in what I think is much
simpler code. Comments -- here or on the tracker -- are appreciated.

You can grab the patches to trunk at

git://iorich.caltech.edu/git/public/pygr-ctb

I've also attached them.

I suspect that this problem is only revealed by the newer Pyrex
versions, but it may have lurked for a while. I'm not sure. In any
case, it's definitely a problem *now* and I think I've fixed it :).

Paranthetically, I think this was the most difficult issue I've actually
succeeded in patching in pygr. Yay me! Luckily the code didn't
actually seem to be that entangled...

cheers,
--titus
--
C. Titus Brown, c...@msu.edu

0002-fixed-NLMSASlice-memory-leak-by-switching-to-using-w.patch
0003-cleanup-and-refactoring-of-cache-code.patch
0001-added-cache-tests-for-basic-seqslice-cache-NLMSASl.patch

Christopher Lee

unread,
Sep 2, 2008, 1:19:03 PM9/2/08
to pygr...@googlegroups.com
Hi Titus,
all this nastiness arose from the fact that Pyrex is supposed to
support weak references, but when I tried it I got a mysterious
crash. Out of laziness I didn't try to investigate this at all, and
instead just used a proxy intermediate. I.e. the proxy keeps a
dictionary that maps ID values to python objects, allowing the client
to delete a specified ID. Now seems like a good time to clean all
this up. Either we should revisit the question of Pyrex's weak
reference support and get that working, so that the cache "owner"
object can just be the NLMSASlice object itself (my original design),
or else just go with your solution.

One question: why not just keep the owner reference directly as an
attribute on the NLMSASlice, rather than employing an intermediate
layer (appending it to a list, then saving the list as an attribute on
the NLMSASlice)?

-- Chris

C. Titus Brown

unread,
Sep 2, 2008, 2:03:35 PM9/2/08
to pygr...@googlegroups.com
On Tue, Sep 02, 2008 at 10:19:03AM -0700, Christopher Lee wrote:
-> Hi Titus,
-> all this nastiness arose from the fact that Pyrex is supposed to
-> support weak references, but when I tried it I got a mysterious
-> crash. Out of laziness I didn't try to investigate this at all, and
-> instead just used a proxy intermediate. I.e. the proxy keeps a
-> dictionary that maps ID values to python objects, allowing the client
-> to delete a specified ID. Now seems like a good time to clean all
-> this up. Either we should revisit the question of Pyrex's weak
-> reference support and get that working, so that the cache "owner"
-> object can just be the NLMSASlice object itself (my original design),
-> or else just go with your solution.

Weakref support has been added to Pyrex in recent days; I'll go
implement that tonight, it should be pretty easy.

-> One question: why not just keep the owner reference directly as an
-> attribute on the NLMSASlice, rather than employing an intermediate
-> layer (appending it to a list, then saving the list as an attribute on
-> the NLMSASlice)?

Because initially I was concerned with preserving your code
structure, and when I started refactoring I didn't think of it. <sigh>

That is, no good reason :). I will fix.

--t

Christopher Lee

unread,
Sep 2, 2008, 2:23:46 PM9/2/08
to pygr...@googlegroups.com
Hi Titus,
another thing that puzzles me: I thought the prohibition in
self.__dealloc__() is against calling Python methods of self (because
they may already have been deleted). It would really surprise me if
__dealloc__() is not allowed to call *any* Python functions, as you
seemed to imply. That doesn't fit the Pyrex object model as I
understand it. Note also that the NLMSASlice.deallocID attribute was
a C type, not a Python object, so accessing it in __dealloc__() is
guaranteed to be valid, at least as far as I know. After all,
__dealloc__'s job is to perform any actions on those C type attributes
necessary for final clean up (e.g. free())...

Note for getting rid of the intermediate list layer: "cdef object" is
how Pyrex lets you declare a "reference to arbitrary Python object"...

Yours,

Chris

C. Titus Brown

unread,
Sep 2, 2008, 3:14:27 PM9/2/08
to pygr...@googlegroups.com
On Tue, Sep 02, 2008 at 11:23:46AM -0700, Christopher Lee wrote:
->
-> Hi Titus,
-> another thing that puzzles me: I thought the prohibition in
-> self.__dealloc__() is against calling Python methods of self (because
-> they may already have been deleted). It would really surprise me if
-> __dealloc__() is not allowed to call *any* Python functions, as you
-> seemed to imply. That doesn't fit the Pyrex object model as I
-> understand it. Note also that the NLMSASlice.deallocID attribute was
-> a C type, not a Python object, so accessing it in __dealloc__() is
-> guaranteed to be valid, at least as far as I know. After all,
-> __dealloc__'s job is to perform any actions on those C type attributes
-> necessary for final clean up (e.g. free())...

Hi, Chris,

I don't know what to say about __dealloc__; I could only reproduce the
error when I tried to call __delitem__ on a dict from within a Pyrex
__dealloc__ function. See

http://iorich.caltech.edu/~t/transfer/public/a.pyx
http://iorich.caltech.edu/~t/transfer/public/b.py
http://iorich.caltech.edu/~t/transfer/public/run.py

for a small example test suite.

-> Note for getting rid of the intermediate list layer: "cdef object" is
-> how Pyrex lets you declare a "reference to arbitrary Python object"...

yep :). I thought using a list() was fine, tho...?

thanks,
--titus

C. Titus Brown

unread,
Sep 2, 2008, 3:15:12 PM9/2/08
to pygr...@googlegroups.com
On Tue, Sep 02, 2008 at 10:19:03AM -0700, Christopher Lee wrote:
-> all this nastiness arose from the fact that Pyrex is supposed to
-> support weak references, but when I tried it I got a mysterious
-> crash. Out of laziness I didn't try to investigate this at all, and
-> instead just used a proxy intermediate. I.e. the proxy keeps a
-> dictionary that maps ID values to python objects, allowing the client
-> to delete a specified ID. Now seems like a good time to clean all
-> this up. Either we should revisit the question of Pyrex's weak
-> reference support and get that working, so that the cache "owner"
-> object can just be the NLMSASlice object itself (my original design),
-> or else just go with your solution.
->
-> One question: why not just keep the owner reference directly as an
-> attribute on the NLMSASlice, rather than employing an intermediate
-> layer (appending it to a list, then saving the list as an attribute on
-> the NLMSASlice)?

Good ideas all, patch attached and also available at

git://iorich.caltech.edu/git/public/pygr-ctb

0001-Switched-to-using-the-NLMSASlice-as-the-cache-holder.patch

Christopher Lee

unread,
Sep 10, 2008, 4:35:03 PM9/10/08
to C. Titus Brown, Pygr Development Group

On Sep 2, 2008, at 12:15 PM, C. Titus Brown wrote:

> -> One question: why not just keep the owner reference directly as an
> -> attribute on the NLMSASlice, rather than employing an intermediate
> -> layer (appending it to a list, then saving the list as an
> attribute on
> -> the NLMSASlice)?
>
> Good ideas all, patch attached and also available at
>
> git://iorich.caltech.edu/git/public/pygr-ctb
>
> --titus

Hi Titus,
in what branch did you make this change? I just tried pulling again
from your staging branch, and this change did not show up. The last
commit I see on that branch is 9/1 8d996689...

Of course I can try to apply your email patch instead...

Yours,

Chris

C. Titus Brown

unread,
Sep 10, 2008, 4:37:18 PM9/10/08
to Christopher Lee, Pygr Development Group
On Wed, Sep 10, 2008 at 01:35:03PM -0700, Christopher Lee wrote:
->
-> On Sep 2, 2008, at 12:15 PM, C. Titus Brown wrote:
->
-> >-> One question: why not just keep the owner reference directly as an
-> >-> attribute on the NLMSASlice, rather than employing an intermediate
-> >-> layer (appending it to a list, then saving the list as an
-> >attribute on
-> >-> the NLMSASlice)?
-> >
-> >Good ideas all, patch attached and also available at
-> >
-> > git://iorich.caltech.edu/git/public/pygr-ctb
-> >
-> >--titus
->
-> Hi Titus,
-> in what branch did you make this change? I just tried pulling again
-> from your staging branch, and this change did not show up. The last
-> commit I see on that branch is 9/1 8d996689...

Hmm, I thought I'd put it there, but maybe not!? Sometimes I feel like
git loses things... I know that it's more likely I'm scatterbrained.

Anyway, it's been updated; see bbfe9424b29617214afc9f3f0e871a854424b161
on branch staging.

Using my patch is also of course fine :)

Christopher Lee

unread,
Sep 10, 2008, 4:51:23 PM9/10/08
to C. Titus Brown, Pygr Development Group
Hmm, your patch isn't working for me:

------------------------------------------------------------
FAILED:seqdb_test.SeqDBCache_Test.nlmsaslice_cache_test()
------------------------------------------------------------
OUTPUT:
this should not be empty: [{'seq2': [5, 10], 'seq1': [5, 10]}]
this should be empty: [{'seq2': [5, 10], 'seq1': [5, 10]}]

STDERR:
Traceback (most recent call last):
File "protest.py", line 123, in <module>
if do_test(sys.argv[2],sys.argv[3],sys.argv[4]):
File "protest.py", line 33, in do_test
m()
File "/Users/leec/projects/pygr/tests/seqdb_test.py", line 377, in
nlmsaslice_cache_test
assert n2 == 0, '%d objects remain; cache memory leak!' % n2
AssertionError: 1 objects remain; cache memory leak!

Any suggestions?

Christopher Lee

unread,
Sep 10, 2008, 5:08:17 PM9/10/08
to Pygr Development Group
Hi Titus,
what version of pyrex are you using? I have 0.9.4.1, installed via
Fink.
-- Chris

C. Titus Brown

unread,
Sep 10, 2008, 5:09:41 PM9/10/08
to pygr...@googlegroups.com
On Wed, Sep 10, 2008 at 02:08:17PM -0700, Christopher Lee wrote:
-> what version of pyrex are you using? I have 0.9.4.1, installed via
-> Fink.

0.9.8.4

C. Titus Brown

unread,
Sep 10, 2008, 5:11:18 PM9/10/08
to pygr...@googlegroups.com
On Wed, Sep 10, 2008 at 01:51:23PM -0700, Christopher Lee wrote:
->
-> Hmm, your patch isn't working for me:

That's odd -- perhaps try starting from scratch? Is the __weakref__
attribute present on NLMSASlice? If so, then it's probably the Pyrex
version.

If you want to keep supporting the old Pyrex -- reasonable enough! -- I
can revert that patch and go back to the slightly messier previous patch
which solves the memory leak AND is compatible with older versions of
Pyrex. Let me know.

--titus

Christopher Lee

unread,
Sep 10, 2008, 5:33:29 PM9/10/08
to pygr...@googlegroups.com

On Sep 10, 2008, at 2:11 PM, C. Titus Brown wrote:

> That's odd -- perhaps try starting from scratch? Is the __weakref__
> attribute present on NLMSASlice? If so, then it's probably the Pyrex
> version.
>
> If you want to keep supporting the old Pyrex -- reasonable enough!
> -- I
> can revert that patch and go back to the slightly messier previous
> patch
> which solves the memory leak AND is compatible with older versions of
> Pyrex. Let me know.

Hi Titus,
I think this is why I gave up on Pyrex's advertised __weakref__
support originally -- it didn't work, at least on my Mac, and it
*still* doesn't. Rather than try to debug Pyrex (version issues?),
I've switched to using your "intermediate object" approach, and that
works fine. I.e. I added an attribute to NLMSASlice called
"weakestLink"; I save a generic Python object there, and hand that
reference to saveCache() as the owner argument. This passes your
cache cleanup test (on my mac) just the way I expect your original
version did...

- Chris

Reply all
Reply to author
Forward
0 new messages