OneToOneField clarifications

96 views
Skip to first unread message

George Sakkis

unread,
Oct 7, 2010, 5:30:21 AM10/7/10
to Django developers
There are at least three open tickets related to OneToOneFields
(#10227, #14043, #14368) that, even if deemed invalid, hint at lack of
adequate documentation. After reading the docs on OneToOneField, I
don't think one can easily answer the following questions:

- It is mentioned that multi-table inheritance in implemented with an
implicit O2O relation from the child to the parent model, but O2O is
also useful for cases where neither model "extends" (in the OO sense)
the other. What are the implications of model A linking to model B
versus the inverse ?
- Does A.save() propagate to the linked B instance (if any) or vice
versa ?
- Does A.delete() propagate to the linked B instance (if any) or vice
versa ?
- Is "A.link_to_B = B" equivalent to "B.link_to_A = A" ?
- Is "A.link_to_B = None" or "B.link_to_A = None" valid ?
- Are the assignments above done exclusively in memory or they may hit
the database for reading and/or writing ?
- How does null=True affect the answers to all the above ?

Having clear answers to these would be a good first step to moving on
with the mentioned tickets and perhaps augmenting the docs.

George

Russell Keith-Magee

unread,
Oct 8, 2010, 8:20:12 AM10/8/10
to django-d...@googlegroups.com

The problem we have here is that O2O fields are the worst possible
case for the leaky abstraction that forms the basis of the ORM.
ForeignKeys have the same problems, but they're not quite as acute
because they at least *appear* asymmetrical.

*In theory* there shouldn't be a difference between which side of the
fence you operate upon.

However, in practice, there is. One of the two models has to store the
column that represents the relation. If A contains the field
definition, you need to save the instance of A in order to make a
relation change persist.

This means that the pure object abstraction gets leaky, and you have
to care a little bit about which model holds the physical
manifestation of your relation. And pretty much all the questions you
ask related to exactly how leaky the abstraction is (or should be).

So - the overarching behavior should be that it the ORM looks and
feels like you're manipulating a bunch of objects. However, there are
practical limitations to that pure vision, and places where the
abstraction leaks, and the solution we've implemented is mindful of
that fact.

Looking at the specific bugs in question:

****

#10227 is a bit of a weird one (which is why it's DDN). I can
certainly see the argument for why the reverse case should return None
rather than an exception. If you consider a O2O field to be a
ForeignKey with unique=True (which is essentially how they're
implemented), you don't get an exception when there is no objects in
your object_set.all(); you get an empty query set. I think there's
grounds to say that "None" should be the O2O interpretation of an
empty query set, rather than a DoesNotExist exception.

The patch as proposed breaks 2 tests in the current suite, but the
error occurs in deletion code, so it's possible it's just a case of an
incomplete patch, rather than an outright problem.

The bigger problem is that the proposed change would change existing
API usage. I'm not quite ready to call it backwards-incompatible,
because our backwards-compatibility policy documents the fact that we
don't consider bugfixes to be a backwards incompatible change, and it
could be argued that this is bufix.

Providing additional wiggle room: the 'does it return None or raise an
exception' doesn't appear to be tested, and it isn't clearly
documented. This makes it much easier to argue that this is
undocumented and undesigned behavior.

The fact that this bug has existed for a long time without being
reported suggests that this isn't a high-traffic area of the API,
either, which means that the impact footprint of any change will be
smaller.

However, this isn't an area where only one opinion should matter. I'm
not comfortable making a judgement here without hearing the opinion of
others.

****

#14043 is clearly a bug to me (hence the accepted status). If I had to
guess at a cause, I'd say it's either:
* The OneToOneField special case not being handled by deletion traversal
* The related object cache on the o2o field not being cleared
correctly when null is assigned, causing the delete cascade to operate
on the older cached object.

The 'on delete set null' comment doesn't seem to be related here -- if
I've understood the ticket correctly, the relation has been broken by
'become_ghost()', so deletion shouldn't be cascading, regardless of
whether the relation implements ON DELETE CASCADE or not.

****

#14368 also strikes me as a bug, but it's one that's a little hard to
account for without some other changes.

In saying bob.soul = None, you then need to save the soul object in
order for the change to take effect. That isn't something that is
currently done. However, interestingly, it *is* done if you do the
analogous operation with a foreign key -- if you assign a queryset to
a reverse FK relation, every object in the queryset is modified and
saved (and looking at the code, this is something we can *massively*
optimize with an update statement). In the interests of removing a
wierd inconsistency, I can cer

However, this is a situation where pragmatism will need to beat
purity. OneToOne fields are also the cornerstone of inheritance, and
autosaving OneToOne reverse relations strikes me as something that
could backfire in the internals to inheritance. I'd need to see a
sample patch that doesn't break the existing test suite before I could
make any more firm judgements.

****

I hope that sheds some light on the situation.

Yours,
Russ Magee %-)

George Sakkis

unread,
Oct 9, 2010, 5:13:35 AM10/9/10
to Django developers
Thanks for the thorough reply, it was helpful, even without replying
directly to any of the specific questions about the leakiness of the
abstraction :-)

I'll try to come up with patches+tests for #14043 and #14368 since
they strike you as bugs. As for #10227, what do you think about my
suggestion at the end for a new optional 'related_default' parameter ?
To reiterate:

- It's fully backwards compatible (if necessary); things can stay as
they are by default.
- Returning None is trivial by passing ``related_default=lambda
instance: None``.
- More flexible in cases where None is not ideal or suitable.
- More appropriate than the suggested ``related_null`` parameter
because "null" indicates a DB-level parameter (translating directly to
NULL or NOT NULL db column). The related model of a O2O field does not
even have such a column, so it doesn't make sense to say it is or is
not null.

George

Russell Keith-Magee

unread,
Oct 9, 2010, 11:41:37 AM10/9/10
to django-d...@googlegroups.com
On Sat, Oct 9, 2010 at 5:13 PM, George Sakkis <george...@gmail.com> wrote:
> Thanks for the thorough reply, it was helpful, even without replying
> directly to any of the specific questions about the leakiness of the
> abstraction :-)

Damn. You noticed :-)

> I'll try to come up with patches+tests for #14043 and #14368 since
> they strike you as bugs. As for #10227, what do you think about my
> suggestion at the end for a new optional 'related_default' parameter ?

I'm not sold on related_default. My problem with this is that it's not
just a callable that you need -- you need an object factory. We're
dealing with one-to-one relations, so we can't just point to a single
object, because a single object can only be used in one case. Tthere's
no way to guarantee that a query for existing unassigned objects will
return a result, so that means we need to be creating new objects.
Spontaneous object creation when someone requests an attribute sounds
like a recipe for disaster to me. And spontaneous related-object
creation as part of object creation sounds just as bad.

I'm slightly less concerned about related_null; My hesitation here is
that while it's an elegant solution, I'm not sure whether we actually
have the problem that it is solving.

*If* you accept that there is a reasonable use case for having both
"returns None" *and*' "raises exception" as responses for the reverse
lookup case, then I think it's a pretty good solution.

I also like the fact that it is a completely backwards compatible
solution to the problem. From a purely pragmatic point of view, this
is a big plus.

However, that all hinges on the "if". I'm not completely convinced
that base proposition holds. I can't think of an interpretation of the
reverse nullable O2O raising an exception that makes sense in an ORM
sense. I don't want to introduce a flag that provides two different
interpretations of behavior just because we can; there needs to be a
legitimate interpretation of that flag.

One other reason to introduce such a flag would be as a gateway to
gradually changing the behavior in a backwards compatible way. So - we
could start raising PendingDeprecationWarnings if a DoesNotExist
exception is raised on a nullable OneToOne lookup, and provide the
'reverse_null' as a way to opt into the new behavior; in a couple of
releases, we flick the default value of reverse_null to make 'return
none' the default, and then start deprecating the use of the flag.
It's a long term migration process, but it provides plenty of warning
to those affected.

And, of course, one option that is on the table is to say "Yeah, it
doesn't entirely make sense -- what of it?". It's entirely predictable
behavior, it's not *obviously* wrong, and we've lived with it for a
long time. There's an argument to be made that it's "right by
defacto". This is a leaky abstraction, after all, so we should expect
to get our socks wet occasionally. As long as the boat isn't in danger
of sinking, there's no real harm in closing the ticket and moving on.

However, any of these plans hinge on us determining the right behavior
in the first place. Like I said last time -- I really need to hear
other opinions on this.

Yours,
Russ Magee %-)

George Sakkis

unread,
Oct 10, 2010, 3:15:12 PM10/10/10
to Django developers
On Oct 8, 2:20 pm, Russell Keith-Magee <russ...@keith-magee.com>
wrote:

> #14043 is clearly a bug to me (hence the accepted status). If I had to
> guess at a cause, I'd say it's either:
>  * The OneToOneField special case not being handled by deletion traversal
>  * The related object cache on the o2o field not being cleared
> correctly when null is assigned, causing the delete cascade to operate
> on the older cached object.

Turns out the related object didn't have a cache at all. The fix is a
single line in SingleRelatedObjectDescriptor.__get__() that sets the
back link from the related object, which ends up setting the cache
[1].

> #14368 also strikes me as a bug, but it's one that's a little hard to
> account for without some other changes.
>
> In saying bob.soul = None, you then need to save the soul object in
> order for the change to take effect. That isn't something that is
> currently done. However, interestingly, it *is* done if you do the
> analogous operation with a foreign key -- if you assign a queryset to
> a reverse FK relation, every object in the queryset is modified and
> saved (and looking at the code, this is something we can *massively*
> optimize with an update statement). In the interests of removing a
> wierd inconsistency, I can cer
>
> However, this is a situation where pragmatism will need to beat
> purity. OneToOne fields are also the cornerstone of inheritance, and
> autosaving OneToOne reverse relations strikes me as something that
> could backfire in the internals to inheritance. I'd need to see a
> sample patch that doesn't break the existing test suite before I could
> make any more firm judgements.

Just posted a patch [2] that does the autosave and passes all the
tests. In addition to the set to None issue, it also fixes the child
reassignment problem. For the example above, doing ``bob.soul =
other_soul`` currently raises a unique constraint integrity error when
attempting to save() since bob would end up with two souls. With the
patch, the existing bobs_soul is unlinked from bob and therefore no
integrity error is raised. This is consistent with the FK behaviour.
If for some reason you object to autosave for this case, it's
straightforward to restrict it to only when setting to None.

Since you mentioned the optimization of using an update statement
instead of save(), the patch uses an update statement. I'm not sure if
this is right though; shouldn't the pre_save/post_save signals be sent
here (as well as in the FK case) ?

George


[1] http://code.djangoproject.com/attachment/ticket/14043/ticket-14043.patch
[2] http://code.djangoproject.com/attachment/ticket/14368/ticket-14368.patch

George Sakkis

unread,
Oct 10, 2010, 3:53:22 PM10/10/10
to Django developers
On Oct 9, 5:41 pm, Russell Keith-Magee <russ...@keith-magee.com>
wrote:

> > I'll try to come up with patches+tests for #14043 and #14368 since
> > they strike you as bugs. As for #10227, what do you think about my
> > suggestion at the end for a new optional 'related_default' parameter ?
>
> I'm not sold on related_default. My problem with this is that it's not
> just a callable that you need -- you need an object factory. We're
> dealing with one-to-one relations, so we can't just point to a single
> object, because a single object can only be used in one case. Tthere's
> no way to guarantee that a query for existing unassigned objects will
> return a result, so that means we need to be creating new objects.
> Spontaneous object creation when someone requests an attribute sounds
> like a recipe for disaster to me. And spontaneous related-object
> creation as part of object creation sounds just as bad.

I'm not sure I'm following. It's up to the user to pass whatever
object factory is appropriate for the use case at hand, Django will
not be creating objects automatically. I could pass ``related_default
= lambda instance: MyModel.objects.get_or_create(fk=instance)`` so
that at most one MyModel is created for any given instance, or
``related_default = lambda instance: MyModel(fk=instance)`` to return
new unsaved instances that it's my responsibility to save later. And
of course the most common case would be simply ``related_default =
lambda instance: None`` where no object is created or retrieved.

> I'm slightly less concerned about related_null; My hesitation here is
> that while it's an elegant solution, I'm not sure whether we actually
> have the problem that it is solving.
>
> *If* you accept that there is a reasonable use case for having both
> "returns None" *and*' "raises exception" as responses for the reverse
> lookup case, then I think it's a pretty good solution.
>
> I also like the fact that it is a completely backwards compatible
> solution to the problem. From a purely pragmatic point of view, this
> is a big plus.
>
> However, that all hinges on the "if". I'm not completely convinced
> that base proposition holds. I can't think of an interpretation of the
> reverse nullable O2O raising an exception that makes sense in an ORM
> sense. I don't want to introduce a flag that provides two different
> interpretations of behavior just because we can; there needs to be a
> legitimate interpretation of that flag.

I found convincing the comment posted by olau on 05/10/10, that
basically there are four cases:
(1) one to one # null=False, related_null=False
(2) zero or one to one # null=True, related_null=False
(3) one to zero or one # null=False, related_null=True
(4) zero or one to zero or one # null=True, related_null=True

The current behavior of reverse nullable O2O raising an exception
corresponds to (2). For the Person/Soul example, a Soul may belong to
one or zero Persons but a Person must have exactly one Soul, so
Person().soul should raise a DoesNotExist. OTOH the Place/Shop example
the relationship is probably (4); a Shop may or may not be located at
a known Place and a Place may or may not be the location of a Shop, so
DoesNotExist should not be raised.

George

George Sakkis

unread,
Nov 1, 2010, 9:14:37 AM11/1/10
to Django developers
On Oct 9, 4:41 pm, Russell Keith-Magee <russ...@keith-magee.com>
wrote:

> However, any of these plans hinge on us determining the right behavior
> in the first place. Like I said last time -- I really need to hear
> other opinions on this.

Bumping this up (#10227). Btw the other two tickets (#14043, #14368)
are RFC; would be great if some Django dev can verify and commit them.

George
Reply all
Reply to author
Forward
0 new messages