On Mon, Jun 24, 2013 at 8:37 AM, Zak <cyt...@m.allo.ws> wrote:
> I want to dynamically generate a Python dictionary, with two caveats:
>
> 1. The Python dictionary should not be accessible from interpreted Python
> (only accessible within the Cython module).
>
> 2. The Python dictionary should NOT be garbage collected.
Never, ever? Or simply not while you still need it?
> First, I tried this (inside a Cython module, my_code.pyx):
>
> cdef object make_dict():
>     the_dict = {'foo': 'bar'}
>     return the_dict
I'd probably write "cdef dict make_dict():". Why not tell Cython that
this always returns a dict?
> def do_stuff():
>     the_dict = make_dict()
>     # I'm worried the_dict will be garbage collected right here
>     # I want to keep it and use it, for example:
>     return the_dict['foo']
>
> The code above works, but I am afraid the_dict will be garbage collected. I
> am not sure how Cython works, but it seems like the function make_dict() may
> be returning a pointer to a Python dictionary.
Yes.
> Once make_dict() finishes and
> returns, it seems like the memory actually storing the dictionary may be
> garbage collected (freed), leaving us with a dangling pointer.
No. Cython uses Python's reference counting for Python objects. That's
part of the point: you still get Python's memory management.
So when you write
    the_dict = make_dict()
the name the_dict holds a reference to the dict, so the dict won't be
freed until that reference goes away.
You might want to play with calling sys.getrefcount() in various
places to watch what happens:
"""
sys.getrefcount(object)
Return the reference count of the object. The count returned is
generally one higher than you might expect, because it includes the
(temporary) reference as an argument to getrefcount().
"""
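Here is a rough sketch of that experiment in plain Python (the same
counting applies to Python objects handled by Cython). The name holder
is made up for illustration, and I deliberately compare counts rather
than print absolute numbers, since those vary between Python versions:

```python
import sys

def make_dict():
    # Build and return a new dict; the caller receives a reference to it.
    the_dict = {'foo': 'bar'}
    return the_dict

the_dict = make_dict()
count_one_name = sys.getrefcount(the_dict)

# Storing the dict in a list adds one more reference to the same object.
holder = [the_dict]
count_with_holder = sys.getrefcount(the_dict)

# Deleting the list releases that extra reference again.
del holder
count_after_del = sys.getrefcount(the_dict)

print(count_one_name, count_with_holder, count_after_del)
```

The middle number should be one higher than the other two: the count
follows the references you hold, not the function call that created
the dict.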
> I am afraid it is just because garbage
> collection hasn't happened yet.
CPython uses a reference counting scheme, so an object is deleted as
soon as its reference count goes to zero, and never while you still
hold a reference to it.
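If you want to see that for yourself, a weak reference can watch an
object without keeping it alive. This is only a sketch in plain Python,
assuming CPython; TrackedDict and watcher are names I made up, and the
subclass is there because a plain dict cannot be weakly referenced:

```python
import weakref

class TrackedDict(dict):
    """dict subclass; plain dicts do not support weak references."""

def make_dict():
    return TrackedDict(foo='bar')

the_dict = make_dict()

# A weak reference observes the object but does not count as a reference.
watcher = weakref.ref(the_dict)

# The function returned long ago, yet the object is still there.
alive_while_referenced = watcher() is not None

# Drop the only strong reference; on CPython the dict is freed right here.
del the_dict
alive_after_del = watcher() is not None

print(alive_while_referenced, alive_after_del)
```

So the thing that decides the dict's lifetime is whether some name
still refers to it, not whether make_dict() has finished.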
> Is the following code safer?
>
> cdef object THE_DICT
>
> cdef void make_dict(output_dict):
>     output_dict = {'foo': 'bar'}
>
> make_dict(THE_DICT)
>
> def do_stuff():
>     return THE_DICT['foo']
Does this even work? In:
    cdef void make_dict(output_dict):
        output_dict = {'foo': 'bar'}
you are passing output_dict in to the function, but then in:
    output_dict = {'foo': 'bar'}
you are assigning a NEW dict to the local name output_dict, so you
would not have changed the object that was passed in.
If you really want to do this, you need to mutate the dict passed in,
rather than making a new one:
    cdef void make_dict(output_dict):
        output_dict.clear()
        output_dict['foo'] = 'bar'
But this would let you have only a single dict in the module
namespace, and I suspect you're trying to solve a problem you don't
have.
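The difference between rebinding a name and mutating the object is
ordinary Python behaviour, so it can be tried without Cython. A small
sketch, where rebind and mutate are illustrative names of mine, and
THE_DICT is created as an empty dict first because there has to be an
object to mutate:

```python
def rebind(output_dict):
    # Assigns a NEW dict to the local name; the caller's object is untouched.
    output_dict = {'foo': 'bar'}

def mutate(output_dict):
    # Changes the dict object that was passed in, so the caller sees it.
    output_dict.clear()
    output_dict['foo'] = 'bar'

THE_DICT = {}

rebind(THE_DICT)
after_rebind = dict(THE_DICT)   # snapshot taken after rebind()

mutate(THE_DICT)
after_mutate = dict(THE_DICT)   # snapshot taken after mutate()

print(after_rebind, after_mutate)
```

Only the second call changes what THE_DICT contains.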
> Is there some explicit way to tell Cython "do not garbage collect this"?
I'm not sure about that, though you could explicitly increase the
reference count, which is an ugly hack.
If it's a cdef class attribute, it won't get deleted as long as the
class instance is there. Note that in that case, and with your module
attribute THE_DICT above, the cdef declaration simply tells Cython that
there should be a name there with that type (creating a pointer).
The actual object still needs to be created somewhere. That could be
on the same line, but it's an additional operation:
    cdef dict THE_DICT = {}
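To illustrate the attribute case, here is a plain-Python analogue, not
a cdef class: an ordinary instance attribute keeps the dict alive for
as long as the instance exists. Holder, TrackedDict and watcher are
hypothetical names, the weak reference is only there to observe the
object, and the immediate cleanup assumes CPython:

```python
import weakref

class TrackedDict(dict):
    """dict subclass so that a weak reference can observe it."""

class Holder:
    def __init__(self):
        # The attribute holds a strong reference to the dict.
        self.the_dict = TrackedDict(foo='bar')

    def do_stuff(self):
        return self.the_dict['foo']

holder = Holder()
watcher = weakref.ref(holder.the_dict)

value = holder.do_stuff()
alive_with_instance = watcher() is not None

# Once the instance goes away, so does the reference held by its attribute.
del holder
alive_without_instance = watcher() is not None

print(value, alive_with_instance, alive_without_instance)
```

Unlike a cdef attribute, the_dict here is visible from Python code, so
this only demonstrates the lifetime rule, not the hiding.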
HTH,
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception
Chris....@noaa.gov