I'm confused by Django's behaviour when saving related models. Take for example:

    class X(models.Model):
        pass

    class Y(models.Model):
        x = models.ForeignKey(X)

Now if I create some objects (unsaved):

    x = X()
    y = Y(x=x)

All well so far. But now odd things happen when I save:

A) y.save() throws an IntegrityError because there's no PK for x.

I kind of understand this, but it's not obvious to me why Django doesn't at least try to save the related object first.
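(For what it's worth, the database error in case A can be reproduced outside Django entirely; a minimal raw-SQLite sketch, where the table and column names are just illustrative stand-ins for what the ORM would generate:)

```python
# Case A at the SQL level: saving y with an unsaved x means binding
# NULL into a NOT NULL foreign-key column, which the database rejects.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE y (id INTEGER PRIMARY KEY, "
             "x_id INTEGER NOT NULL REFERENCES x(id))")
try:
    # x was never saved, so the value bound here is None/NULL
    conn.execute("INSERT INTO y (x_id) VALUES (?)", (None,))
except sqlite3.IntegrityError as e:
    print("IntegrityError:", e)
```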
B) y.x.save(); y.save() also throws an IntegrityError, because y.x_id is None. However, y.x.id is not None, so I don't understand why it can't update y.x_id (and thus make the save succeed).

C) y.x.save(); y.x = y.x; y.save() succeeds, but I don't see why the y.x = y.x is needed.

Is this a deliberate design decision, something I'm misunderstanding, or a bug/implementation artefact?
I'm running into this with serialization in Django REST Framework: my API provides a facade over something that's actually stored across two models, so when creating the resource I want to deserialise the data into the two related models. DRF serializers by default return unsaved instances of the model, but this is broken by the behaviour above.

Any insight into what's going on and why would be much appreciated.
On Sun, Jun 8, 2014 at 10:34 PM, Malcolm Box <mal...@tellybug.com> wrote:
> I'm confused by Django's behaviour when saving related models. Take for
> example:

<snip details of classes and saving behaviour>

> I kind of understand this, but it's not obvious to me why Django doesn't
> at least try to save the related object first.
OK, so how does Django decide that the related object needs to be saved?

If it saves all related objects, then saving one object could result in a save call being invoked on every object in the database (since y points to x, which points to a, which points to b, ...). I hope we can agree that a cascading save like this would be a bad idea.

If it's not *every* related object, then we need to make a decision: which ones get saved? OK, so let's say we just save the newly created objects (i.e., objects with no primary keys).
That means that the following would work:

    x = X(value=37)
    y = Y(x=x)
    y.save()

and on retrieval, y.x.value == 37. Sure, that makes sense. But what about:

    x = X(value=37)
    x.save()
    x.value = 42
    y = Y(x=x)
    y.save()

On retrieval, y.x.value == 37. Huh? Why? Oh, it's because in *that* case, x was already in existence, so it wasn't re-saved as a result of y being created. So now we've got inconsistent behaviour, depending on when save() has been called on an object.
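(To spell out the inconsistency without any Django at all, here is a toy sketch of the hypothetical "auto-save only brand-new related objects" policy being discussed. This is explicitly *not* Django's behaviour, and all the names are illustrative:)

```python
# Hypothetical policy: on y.save(), auto-save the related object only
# if it has never been saved (no pk). This reproduces the inconsistency
# described above; it is NOT what Django actually does.

DB = {}           # pretend database: pk -> stored value
_next_pk = [1]

class X:
    def __init__(self, value):
        self.pk = None
        self.value = value
    def save(self):
        if self.pk is None:
            self.pk = _next_pk[0]
            _next_pk[0] += 1
        DB[self.pk] = self.value

class Y:
    def __init__(self, x):
        self.x = x
    def save(self):
        if self.x.pk is None:   # auto-save only brand-new objects
            self.x.save()

# Case 1: x never saved explicitly -> auto-saved, DB sees value 37
x1 = X(value=37)
Y(x1).save()
print(DB[x1.pk])   # 37

# Case 2: x saved, then modified -> the modification never hits the DB
x2 = X(value=37)
x2.save()
x2.value = 42
Y(x2).save()
print(DB[x2.pk])   # 37, not 42: inconsistent with case 1's intuition
```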
The only way I can see to rectify *this* problem would be to keep track of every value that has been modified, and save any "modified" objects. This is in the realm of the possible -- and it has been proposed in the past -- but it means carrying a lot of accounting baggage around on *every* attribute change.
> B) y.x.save(); y.save() also throws an integrity error because y.x_id
> is None. However, y.x.id is not None, so I don't understand why it
> can't update y.x_id (and thus make the save succeed).
>
> C) y.x.save(); y.x = y.x; y.save() - succeeds, but I don't see why the
> y.x = y.x is needed.
>
> Is this a deliberate design decision, something I'm misunderstanding,
> or a bug/implementation artefact?

It's a deliberate design decision, for reasons that my example above hopefully makes clear. The reason the re-assignment is needed in your example is that y.x implies a query; if you directly save the original object (i.e., x.save(), not y.x.save()), you should find the reassignment isn't needed.
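(To make the mechanics of case C concrete, here is a toy, pure-Python model of a descriptor that snapshots the related object's pk at *assignment* time rather than at save() time. It's a simplified sketch of the general mechanism, not Django's actual descriptor implementation:)

```python
# Toy model: the raw id column (x_id) is copied from the related
# object's pk when the attribute is ASSIGNED, so a pk acquired later
# (via save()) is not seen until the attribute is re-assigned.

class ForeignKeyDescriptor:
    """Copies the related object's pk into <name>_id on assignment."""
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, obj):
        instance.__dict__[self.name] = obj
        # pk is snapshotted here; saving obj later does not update it
        instance.__dict__[self.name + "_id"] = obj.pk

class X:
    def __init__(self):
        self.pk = None
    def save(self):
        self.pk = 1  # pretend the database assigned a primary key

class Y:
    x = ForeignKeyDescriptor("x")
    def __init__(self, x):
        self.x = x   # triggers __set__, snapshotting x.pk

x = X()
y = Y(x=x)                  # x has no pk yet, so x_id is snapshotted as None
x.save()                    # x now has a pk...
print(y.__dict__["x_id"])   # None: ...but y's snapshot is stale
y.x = y.x                   # re-assignment re-copies the (now set) pk
print(y.__dict__["x_id"])   # 1
```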