I'm being bitten by http://code.djangoproject.com/ticket/6886. Short
version: ``object.related = obj`` doesn't stick the assigned object in
the model cache (the same way accessing ``object.related`` does) and
can lead to weird mismatches between modified objects and ones fresh
from the db.
I've attached a fix to that ticket -- it's a very simple patch -- but
I'm not sure why we weren't simply doing this in the first place, and
I'm a bit concerned that there's a good reason...
Can someone with some familiarity with related fields take a look at
this and reassure me that I'm not doing something stupid here?
Thanks!
Jacob
SVN knows all: http://code.djangoproject.com/changeset/2598
The checkin message is slightly misleading: the real problem wasn't
the code duplication it identifies, but the need to validate a value
before it is put into the cache.
> Can someone with some familiarity with related fields take a look at
> this and reassure me that I'm not doing something stupid here?
The edge case to look at is assigning None to a foreign key where
null=False.
>>> book.author = None
>>> print book.author
As the code currently stands, the second line will raise a
DoesNotExist exception. If you apply the patch (or revert 2598,
however you want to look at it), the None is dutifully returned - you
don't get the exception because the invalid value has been forced into
the cache.
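To make the failure mode concrete, here's a toy descriptor -- my own
sketch, not Django's actual related-object descriptor -- showing how an
unconditional cache in ``__set__`` lets the invalid None come back out
of ``__get__``:

```python
class DoesNotExist(Exception):
    pass

class RelatedDescriptor:
    """Toy stand-in for Django's related-object descriptor."""

    def __init__(self, name):
        self.cache_name = '_%s_cache' % name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        try:
            # Cache hit: return whatever was stored, valid or not.
            return getattr(instance, self.cache_name)
        except AttributeError:
            # Cache miss: in Django this is the database fetch; a
            # missing non-null FK raises DoesNotExist here.
            raise DoesNotExist('no related object set')

    def __set__(self, instance, value):
        # The naive fix: cache unconditionally. A None assigned to a
        # null=False field sneaks into the cache and is dutifully
        # returned on the next access.
        setattr(instance, self.cache_name, value)

class Book:
    author = RelatedDescriptor('author')

book = Book()
book.author = None
print(book.author)  # prints None instead of raising DoesNotExist
```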
Looking more broadly, a similar problem exists if you were to assign a
non-Author object. However, the currently deployed code doesn't handle
this case in a particularly elegant fashion: you're guaranteed to get
back either an object or an exception, but the object might not be
particularly meaningful. This suggests to me that the current code
could do with some validation improvements.
Ticket #6886 poses a problem. I can see that having to refetch from
the database is inefficient, and I can see how race conditions could
emerge. The change you are proposing would work if you expanded it to
bring the validation check into the __set__ method, but it would have
the side effect of introducing field validation at time of assignment.
This would make ForeignKeys inconsistent with other attribute
assignments. That said, I could probably live with this inconsistency.
Foreign keys are already slightly special cases; I don't see any
particular harm in making them a little more special.
Yours,
Russ Magee %-)
D'oh, your SVN-fu is greater than mine, apparently.
> The edge case to look at is assigning None to a foreign key where
> null=False.
Ah, there it is. There's nothing in the test suite that relies on that
``DoesNotExist`` exception, so all the tests pass with [2598]
reverted, which is why I had trouble figuring out what difference it
made. 'spose I'll need to add a test verifying the correct behavior
once we figure out what that is.
I'd noticed the inconsistency about assigning the "wrong" type; right
now, this happens::
    >>> book.author = Place(id=7)
    >>> book.author
    <Author id=7>
That's kinda ridiculous, actually. And I don't really feel bad
breaking it -- anyone relying on that actually working is seriously
twisted.
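My reading of why that happens, sketched with toy classes rather than
the real descriptor: the pre-patch ``__set__`` keeps only ``value.id``
for the underlying column, so the next access refetches by that
primary key as an Author, whatever type you actually assigned:

```python
class Place:
    def __init__(self, id):
        self.id = id

class Author:
    def __init__(self, id):
        self.id = id
    def __repr__(self):
        return '<Author id=%s>' % self.id

class Book:
    def _get_author(self):
        # stand-in for the database refetch by primary key
        return Author(self.author_id)

    def _set_author(self, value):
        # only the id survives the assignment; the type is never checked
        self.author_id = value.id

    author = property(_get_author, _set_author)

book = Book()
book.author = Place(id=7)
print(book.author)  # <Author id=7>
```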
And, for the record, the race conditions and inconsistencies with the
current approach aren't in any way academic: I'm running into those
types of problems quite a bit, actually.
So, a proposal: I'd like to change FK assignment to be "special" like
you suggested. I'd like to:
* Raise a ``ValueError`` if you try to assign an object of the wrong
type to a FK.
* Raise a ``ValueError`` if you try to assign ``None`` to an FK that's
``null=False``.
* Cache the value in ``__set__``.
This will be slightly backwards-incompatible, but only the second
point -- dealing with ``None`` -- is likely to bite anyone. Dunno if
it's worth being "loose" there just to be nice...
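Roughly what that "special" assignment could look like -- a toy sketch
of the proposal, not an actual patch against Django:

```python
class ForeignKeyDescriptor:
    """Toy descriptor implementing the three points above."""

    def __init__(self, name, rel_model, null=False):
        self.cache_name = '_%s_cache' % name
        self.rel_model = rel_model
        self.null = null

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.cache_name)

    def __set__(self, instance, value):
        if value is None:
            if not self.null:                        # point two
                raise ValueError('Cannot assign None: field does not '
                                 'allow null values.')
        elif not isinstance(value, self.rel_model):  # point one
            raise ValueError('Cannot assign %r: %s instance expected.'
                             % (value, self.rel_model.__name__))
        setattr(instance, self.cache_name, value)    # point three: cache it

class Author:
    def __init__(self, id=None):
        self.id = id

class Place:
    def __init__(self, id=None):
        self.id = id

class Book:
    author = ForeignKeyDescriptor('author', Author, null=False)

book = Book()
book.author = Author(id=7)   # fine, and now cached -- no refetch needed
try:
    book.author = Place(id=7)
except ValueError as e:
    print(e)
try:
    book.author = None
except ValueError as e:
    print(e)
```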
Thoughts?
Jacob
Why not defer it to save()? Currently you can assign invalid values to
other fields and it won't break until save():
obj.time = 125 # Ok
obj.save() # ProgrammingError
And many things won't break at all, like assigning a string to a
boolean attribute.
Anyway, my point is that early validation will break code that relies
on legitimate behavior: "assign values in bulk, then fix some, then
save".
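A runnable sketch of that pattern, with a toy model whose validation
lives entirely in save() (standing in for the database-level check):

```python
class Event:
    def save(self):
        # all validation deferred to here, standing in for the
        # database-level check that raises ProgrammingError
        if not isinstance(self.time, str):
            raise TypeError('time must be a string like "09:30"')

obj = Event()
obj.time = 125        # invalid for now; nothing complains yet
obj.name = 'kickoff'  # bulk-assign the other fields
obj.time = '09:30'    # fix the bad value before saving
obj.save()            # validation only happens here
```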
I was just bitten by this (having since svn up'd trunk). I have some
data migration scripts that make a lot of assignments up front, run
some extra logic to clean up a few things, and then wrap the save() in
a try block.
-Rob
Hrm, I was afraid of that. Unfortunately there isn't a "right" way
that's gonna work for everyone -- if we're sloppy about FK assignments
you get nasty race conditions (especially when signals are involved);
if we're tight then data imports must be more careful.
One suggestion is to assign to "foo.bar_id" instead of "foo.bar"; that
skips the validation hook. But if you've got more suggestions I'm
listening...
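Sketched with toy classes (hypothetical names, not real models), the
difference between the two assignment paths:

```python
class Author:
    def __init__(self, id):
        self.id = id

class Book:
    def __init__(self):
        self.author_id = None   # raw column attribute: plain, unvalidated

    def _set_author(self, value):
        # stand-in for the descriptor's validating __set__
        if value is None:
            raise ValueError('Cannot assign None to a null=False FK')
        self.author_id = value.id

    # write-only property; the read path isn't the point of this sketch
    author = property(None, _set_author)

book = Book()
book.author_id = 99      # raw attribute: skips validation entirely
try:
    book.author = None   # descriptor path: validated at assignment
except ValueError as e:
    print(e)
```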
Jacob
I got around it just by putting the FK assignments in a try block and
catching and echoing the exceptions. (This is an old database that
has no constraints and I'm migrating it to one that does -- so echoing
out the errors helps point out places that need manual updates
anyhow.)
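Roughly what that looks like, with made-up stand-in data: trap the
error per row, echo it, and keep going so one run surfaces every row
needing a manual fix.

```python
authors = {1: 'legacy author A', 2: 'legacy author B'}  # valid FK targets

def assign_author(book, author_id):
    """Stand-in for ``book.author = ...`` with eager FK validation."""
    if author_id not in authors:
        raise ValueError('invalid author id: %r' % author_id)
    book['author_id'] = author_id

legacy_rows = [{'id': 10, 'author_id': 1},
               {'id': 11, 'author_id': 99},  # dangling FK in the old data
               {'id': 12, 'author_id': 2}]

needs_manual_fix = []
for row in legacy_rows:
    book = {'id': row['id']}
    try:
        assign_author(book, row['author_id'])
    except ValueError as err:
        # echo and continue: the errors point at rows needing hand-fixes
        needs_manual_fix.append((row['id'], str(err)))

print(needs_manual_fix)  # [(11, 'invalid author id: 99')]
```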
Personally I'm OK with the change, but it does change the way I think
about trapping errors with Django models somewhat -- being more aware
that assignment, rather than save(), could raise an error.
-Rob