From the manual [8.11]:
> A weak reference to an object is not enough to keep the object alive:
> when the only remaining references to a referent are weak references,
> garbage collection is free to destroy the referent and reuse its
> memory for something else.
This leads to a difference in behaviour between CPython and the other implementations. CPython will (currently) immediately destroy any object that has only weak references to it, so trying to access that object requires making a new one. Other implementations (at least PyPy, and presumably the others that don't use a ref-counting GC) can "reach into the grave" and pull back objects that no longer have any strong references.
I would like to have the guarantees for weakrefs strengthened such that any weakref'ed object that has no strong references left will return None instead of the object, even if the object has not yet been garbage collected.
Without this stronger guarantee, programs that rely on weakrefs disappearing when the strong refs are gone end up relying on the implementation's gc method instead, with the result that the program behaves differently on different implementations.
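A minimal sketch of the difference being described, assuming a trivial class (`Record` and `drop_last_strong_ref` are made-up names for illustration). On CPython the weak reference is expected to be dead as soon as the last strong reference is deleted; on PyPy and other non-refcounting implementations it may stay alive until a collection actually runs.

```python
import gc
import weakref


class Record:
    """Hypothetical stand-in for any weakly referenceable object."""


def drop_last_strong_ref():
    obj = Record()
    ref = weakref.ref(obj)
    alive_before = ref() is not None

    del obj  # the last strong reference goes away here

    # CPython: reference counting clears the weakref immediately.
    # PyPy and others: ref() may still return the object at this point.
    immediately_dead = ref() is None

    gc.collect()  # request a collection explicitly
    dead_after_collect = ref() is None
    return alive_before, immediately_dead, dead_after_collect
```

Only the middle value is implementation-dependent, which is exactly the portability problem raised above.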
Ethan Furman <et...@stoneleaf.us> wrote:
> From the manual [8.11]:
> > A weak reference to an object is not enough to keep the object alive:
> > when the only remaining references to a referent are weak references,
> > garbage collection is free to destroy the referent and reuse its
> > memory for something else.
> This leads to a difference in behaviour between CPython and the other
> implementations: CPython will (currently) immediately destroy any
> objects that only have weak references to them with the result that
> trying to access said object will require making a new one;
This is only true if the object isn't caught in a reference cycle.
> Without this stronger guarantee programs that are relying on weakrefs to
> disappear when strong refs are gone end up relying on the gc method
> instead, with the result that the program behaves differently on
> different implementations.
Why would they "rely on weakrefs to disappear when strong refs are
gone"? What is the use case?
On Thu, May 17, 2012 at 8:44 AM, Antoine Pitrou <solip...@pitrou.net> wrote:
> On Thu, 17 May 2012 08:10:40 -0700
> Ethan Furman <et...@stoneleaf.us> wrote:
> > From the manual [8.11]:
> > > A weak reference to an object is not enough to keep the object alive:
> > > when the only remaining references to a referent are weak references,
> > > garbage collection is free to destroy the referent and reuse its
> > > memory for something else.
> > This leads to a difference in behaviour between CPython and the other
> > implementations: CPython will (currently) immediately destroy any
> > objects that only have weak references to them with the result that
> > trying to access said object will require making a new one;
> This is only true if the object isn't caught in a reference cycle.
To further this, consider the following example, run in CPython 2.6:
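A minimal sketch of the reference-cycle case (an illustration only, not the poster's original code, and written for Python 3; `Node` and `cycle_demo` are made-up names). Even on CPython, an object caught in a cycle keeps a non-zero reference count after the last outside reference is dropped, so its weakref stays alive until the cyclic collector runs.

```python
import gc
import weakref


class Node:
    """Hypothetical object that holds a strong reference to itself."""

    def __init__(self):
        self.me = self  # reference cycle


def cycle_demo():
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector from running mid-demo
    try:
        node = Node()
        ref = weakref.ref(node)
        del node  # no outside strong references remain

        # The cycle keeps the object alive, so the weakref still resolves.
        still_reachable = ref() is not None

        gc.collect()  # the cyclic collector breaks the cycle
        cleared = ref() is None
    finally:
        if was_enabled:
            gc.enable()
    return still_reachable, cleared
```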
Ethan Furman wrote:
> I would like to have the guarantees for weakrefs strengthened such that
> any weakref'ed object that has no strong references left will return
> None instead of the object, even if the object has not yet been garbage
> collected.
Why do you want this guarantee? It would complicate
implementations for which ref counting is not the
native method of managing memory.
>> A weak reference to an object is not enough to keep the object alive:
>> when the only remaining references to a referent are weak references,
>> garbage collection is free to destroy the referent and reuse its
>> memory for something else.
> This leads to a difference in behaviour between CPython and the other
> implementations: CPython will (currently) immediately destroy any
> objects that only have weak references to them with the result that
> trying to access said object will require making a new one; other
> implementations (at least PyPy, and presumably the others that don't use
> ref-count gc's) can "reach into the grave" and pull back objects that
> don't have any strong references left.
Antoine Pitrou wrote:
> This is only true if the object isn't caught in a reference cycle.
Good point -- so I would like the proposed change in CPython as well.
Ethan Furman wrote:
> I would like to have the guarantees for weakrefs strengthened such that
> any weakref'ed object that has no strong references left will return
> None instead of the object, even if the object has not yet been garbage
> collected.
> Without this stronger guarantee programs that are relying on weakrefs to
> disappear when strong refs are gone end up relying on the gc method
> instead, with the result that the program behaves differently on
> different implementations.
Antoine Pitrou wrote:
> Why would they "rely on weakrefs to disappear when strong refs are
> gone"? What is the use case?
Greg Ewing wrote:
> Why do you want this guarantee? It would complicate
> implementations for which ref counting is not the
> native method of managing memory.
My dbf module provides direct access to dbf files. A retrieved record is
a singleton object, and allows temporary changes that are not written to
disk. Whether those changes are seen by the next incarnation depends on
(I had thought) whether or not the record with the unwritten changes has
gone out of scope.
I see two questions that determine whether this change should be made:
1) How difficult it would be for the non-ref counting implementations
   to implement
2) Whether it's appropriate to have objects be changed, but not saved,
   and then discarded when the strong references are gone so the next
   incarnation doesn't see the changes, even if the object hasn't been
   destroyed yet.
~Ethan~
FYI: For dbf I am going to disallow temporary changes so this won't be
an immediate issue for me.
_______________________________________________
Python-ideas mailing list
Python-id...@python.org
http://mail.python.org/mailman/listinfo/python-ideas
> My dbf module provides direct access to dbf files. A retrieved record is
> a singleton object, and allows temporary changes that are not written to
> disk. Whether those changes are seen by the next incarnation depends on
> (I had thought) whether or not the record with the unwritten changes has
> gone out of scope.
If a record is a singleton, that singleton-ification would be handled
through weakrefs would it not?
In that case, until the GC is triggered (and the weakref is
invalidated), you will keep getting your initial singleton and there
will be no "next record", I fail to see why that would be an issue.
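A rough sketch of what weakref-based singleton records could look like, assuming a `weakref.WeakValueDictionary` cache (the `Record` and `Table` classes here are hypothetical and are not the dbf module's actual API):

```python
import weakref


class Record:
    """Hypothetical record; `fields` holds in-memory (possibly unsaved) values."""

    def __init__(self, number):
        self.number = number
        self.fields = {}


class Table:
    """Hypothetical table that hands out one Record object per record number."""

    def __init__(self):
        # Entries vanish only once the Record has actually been collected.
        self._live = weakref.WeakValueDictionary()

    def get(self, number):
        record = self._live.get(number)
        if record is None:
            record = Record(number)  # a real module would re-read from disk here
            self._live[number] = record
        return record
```

As long as the old `Record` has not been collected, `get()` returns that same object, unsaved changes included. When collection happens is the part that varies between implementations.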
> I see two questions that determine whether this change should be made:
> 1) How difficult it would be for the non-ref counting implementations
>    to implement
Pretty much impossible I'd expect, the weakrefs can only be broken on GC
runs (at object deallocation) and that is generally non-deterministic
without specifying precisely which type of GC implementation is used.
You'd need a fully deterministic deallocation model to ensure a weakref
is broken as soon as the corresponding object has no outstanding strong
(and soft, in some VMs like the JVM) reference.
> 2) Whether it's appropriate to have objects be changed, but not saved,
>    and then discarded when the strong references are gone so the next
>    incarnation doesn't see the changes, even if the object hasn't been
>    destroyed yet.
If your saves are synchronized with the weakref being broken (the object
being *effectively* collected) and the singleton behavior is as well,
there will be no difference, I'm not sure what the issue would be, you
might just have a second change cycle using the same unsaved (but still
modified) object.
Although frankly speaking such reliance on non-deterministic events would
scare the shit out of me.
> On 2012-05-18, at 18:08 , stoneleaf wrote:
>> My dbf module provides direct access to dbf files. A retrieved record is
>> a singleton object, and allows temporary changes that are not written to
>> disk. Whether those changes are seen by the next incarnation depends on
>> (I had thought) whether or not the record with the unwritten changes has
>> gone out of scope.
> If a record is a singleton, that singleton-ification would be handled
> through weakrefs would it not?
Indeed, that is the current behavior.
> In that case, until the GC is triggered (and the weakref is
> invalidated), you will keep getting your initial singleton and there
> will be no "next record", I fail to see why that would be an issue.
Because, since I had only been using CPython, I was able to count on
records that had gone out of scope disappearing along with their
_temporary_ changes. If I get that same record back the next time I loop
through the table -- well, then the changes weren't temporary, were they?
>> I see two questions that determine whether this change should be made:
>> 1) How difficult it would be for the non-ref counting
>> implementations to implement
> Pretty much impossible I'd expect, the weakrefs can only be broken on GC
> runs (at object deallocation) and that is generally non-deterministic
> without specifying precisely which type of GC implementation is used.
> You'd need a fully deterministic deallocation model to ensure a weakref
> is broken as soon as the corresponding object has no outstanding strong
> (and soft, in some VMs like the JVM) reference.
>> 2) Whether it's appropriate to have objects be changed, but not
>> saved, and then discarded when the strong references are gone so the
>> next incarnation doesn't see the changes, even if the object hasn't
>> been destroyed yet.
> If your saves are synchronized with the weakref being broken (the object
> being *effectively* collected) and the singleton behavior is as well,
> there will be no difference, I'm not sure what the issue would be, you
> might just have a second change cycle using the same unsaved (but still
> modified) object.
And that's exactly the problem -- I don't want to see the modifications the
second time 'round, and if I can't count on weakrefs invalidating as soon as
the strong refs are gone I'll have to completely rethink how I handle records
from the table.
> Although frankly speaking such reliance on non-deterministic events would
> scare the shit out of me.
Indeed -- I hadn't realized that I was until somebody using PyPy noticed the
problem.
> On May 18, 9:38 am, Masklinn wrote:
> > On 2012-05-18, at 18:08 , stoneleaf wrote:
> >> My dbf module provides direct access to dbf files. A retrieved record is
> >> a singleton object, and allows temporary changes that are not written to
> >> disk. Whether those changes are seen by the next incarnation depends on
> >> (I had thought) whether or not the record with the unwritten changes has
> >> gone out of scope.
> > If a record is a singleton, that singleton-ification would be handled
> > through weakrefs would it not?
> Indeed, that is the current behavior.
> > In that case, until the GC is triggered (and the weakref is
> > invalidated), you will keep getting your initial singleton and there
> > will be no "next record", I fail to see why that would be an issue.
> Because, since I had only been using CPython, I was able to count on
> records that had gone out of scope disappearing along with their
> _temporary_ changes. If I get that same record back the next time I loop
> through the table -- well, then the changes weren't temporary, were they?
So you're taking a *dependence* on the reference counting garbage
collection of the CPython implementation and, when that doesn't work for
you with other implementations, trying to force the same semantics on them.
Your proposal can't reasonably be implemented by other implementations as
working out whether there are any references to an object is an expensive
operation.
A much better technique would be for you to use explicit
life-cycle-management (like the with statement) for your objects.
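One way the `with`-statement suggestion could look, as a sketch only: the `Record` class and the `temporary_changes` helper are hypothetical names, not part of the dbf module. The temporary edits are discarded when the block exits, regardless of when (or whether) the garbage collector runs.

```python
from contextlib import contextmanager


class Record:
    """Hypothetical record with `saved` (on-disk) and `fields` (in-memory) values."""

    def __init__(self, **saved):
        self.saved = dict(saved)
        self.fields = dict(saved)


@contextmanager
def temporary_changes(record):
    """Allow edits inside the block, then restore the saved values on exit."""
    try:
        yield record
    finally:
        # Runs on normal exit and on exceptions alike.
        record.fields = dict(record.saved)
```

Usage would be along the lines of `with temporary_changes(rec) as r: r.fields["name"] = "scratch"`. After the block, `rec.fields` matches `rec.saved` again on any implementation.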
> >> I see two questions that determine whether this change should be made:
> >> 1) How difficult it would be for the non-ref counting
> >> implementations to implement
> > Pretty much impossible I'd expect, the weakrefs can only be broken on GC
> > runs (at object deallocation) and that is generally non-deterministic
> > without specifying precisely which type of GC implementation is used.
> > You'd need a fully deterministic deallocation model to ensure a weakref
> > is broken as soon as the corresponding object has no outstanding strong
> > (and soft, in some VMs like the JVM) reference.
> >> 2) Whether it's appropriate to have objects be changed, but not
> >> saved, and then discarded when the strong references are gone so the
> >> next incarnation doesn't see the changes, even if the object hasn't
> >> been destroyed yet.
> > If your saves are synchronized with the weakref being broken (the object
> > being *effectively* collected) and the singleton behavior is as well,
> > there will be no difference, I'm not sure what the issue would be, you
> > might just have a second change cycle using the same unsaved (but still
> > modified) object.
> And that's exactly the problem -- I don't want to see the modifications the
> second time 'round, and if I can't count on weakrefs invalidating as soon as
> the strong refs are gone I'll have to completely rethink how I handle records
> from the table.
> > Although frankly speaking such reliance on non-deterministic events would
> > scare the shit out of me.
> Indeed -- I hadn't realized that I was until somebody using PyPy noticed the
> problem.
May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
> So you're taking a *dependence* on the reference counting garbage
> collection of the CPython implementation and, when that doesn't work for
> you with other implementations, trying to force the same semantics on them.
I am not trying to force anything. I stated what I would like, and followed
up with questions to further the discussion.
> Your proposal can't reasonably be implemented by other implementations as
> working out whether there are any references to an object is an expensive
> operation.
Then that nixes it. The (debatable) advantages aren't worth a large
expenditure in programmer time, nor a large hit in performance.
> A much better technique would be for you to use explicit
> life-cycle-management (like the with statement) for your objects.
I'm leaning strongly towards just not allowing temporary changes, which will
also solve my problem.