We use UUIDs as primary keys.
Because a collision is unlikely to happen, we set the default value of the
model's PK field to uuid.uuid4.
But then, we had some issues with Django trying to UPDATE and then INSERT
(OK we can understand the logic, but its boring). So we created a small
mixin that set the value of force_insert to self._state.adding.
Unfortunately we discovered that the pre_save method of the fields aren't
called if insert is forced:
https://github.com/django/django/blob/master/django/db/models/base.py#L786.
That's an undocumented and unexpected feature. The documentation specify
that the Field's pre_save methods are called on save, and its a save.
Expected: Field.pre_save called unconditionally.
Current: Field.pre_save called at some conditions.
Workaround: Connect to model pre_save signal ...
--
Ticket URL: <https://code.djangoproject.com/ticket/25363>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
I'm not sure if there might be any problems in changing this behavior.
Would you be able to investigate and offer a patch?
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:1>
* owner: nobody => davidfischer-ch
* status: new => assigned
Comment:
Yes I will.
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:2>
* has_patch: 0 => 1
Comment:
I just wrote a test, and discovered the pre_save method is always called,
but in two different places depending if its an INSERT or an UPDATE.
INSERT: in SqlInsertCompiler and UPDATE: in Model._save_table.
https://github.com/django/django/blob/stable/1.8.x/django/db/models/sql/compiler.py#L927
https://github.com/django/django/blob/stable/1.8.x/django/db/models/base.py#L823
Please, can anyone explain me why?
Also, unfortunately the fact that pre_save is called lately during the
save defeat the implementation of a clean mix-in to track updated fields
and make UPDATE queries lighter. But that's another story.
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:3>
* has_patch: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:4>
Comment (by timgraham):
Those types of "why" questions are often easiest answered by trying to
make a change you think will solve your problem and then seeing if any
tests break. Looking through `git blame` can also be helpful.
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:5>
* status: assigned => closed
* resolution: => needsinfo
Comment:
Feel free to reopen if you have a suggestion on what changes we could
make. I'm not sure how to proceed with this ticket otherwise. Thanks!
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:6>
Comment (by akaariai):
A clean mix-in to track updated fields to both make UPDATE queries
lighter, and to also skip update completely when no data has changed is a
big issue for me. So, if we can make that easier for users, I think we
should do that.
BTW here is my approach for this issue:
{{{
class ModTrackingMixin(object):
@classmethod
def from_db(cls, *args, **kwargs):
new = super().from_db(*args, **kwargs)
new._state.db_vals = new.__dict__.copy()
return new
def save(self, update_fields=None):
if hasattr(self._state, 'db_vals') and update_fields is None:
changed = []
orig = self._state.db_vals
for f, fname in [(f, f.attname) for f in
self._meta.get_fields()
if f.concrete]:
if (fname in orig and
f.to_python(getattr(self, fname)) !=
orig.get(fname, NO_VALUE)):
changed.append(fname)
return super().save(update_fields=changed)
else:
return super().save(update_fields=update_fields)
class Organization(models.Model):
code = models.CharField(max_length=20)
name= models.CharField(max_length=100)
class Meta:
db_table = 'organization'
class ModTrackingOrganization(ModTrackingMixin, Organization):
class Meta:
proxy = True
}}}
The idea is that if you do:
{{{
o = ModTrackingOrganization.objects.first()
o.name = o.name
o.save()
}}}
Then the save will actually be silently skipped (as no data has changed in
the DB).
I wonder if this is actually broken for some cases by the pre_save logic
we have currently in Django?
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:7>
Comment (by davidfischer-ch):
Hi akaariai,
Your mix-in is interesting but it has some issues, hopefully easy to fix.
1. This is actually broken by the `pre_save` logic for a simple reason :
The `pre_save` methods are called inside the save, a.k.a inside the
`super().save(...)` so your mix-in and my mix-in [https://github.com
/davidfischer-
ch/pytoolbox/blob/master/pytoolbox/django/models/mixins.py#L115
AutoUpdateFieldsMixin] will not detect those. Its a reason why I
implemented [https://github.com/davidfischer-
ch/pytoolbox/blob/master/pytoolbox/django/models/mixins.py#L153
CallFieldsPreSaveMixin].
I also implemented another mix-in (sometimes I feel like a DJ, doing many
mixes),[https://github.com/davidfischer-
ch/pytoolbox/blob/master/pytoolbox/django/models/mixins.py#L89
AutoRemovePKFromUpdateFieldsMixin] that will check if the primary key is
set to a new value, then the intention of the developer is probably to
duplicate the model by saving it with a new primary key. So the mix-in let
Django save with its own default options for save.
2. `__dict__.copy()` is not sufficient, if one of the fields, is, for
example, a `JSONField` then the content is a mutable type (`dict`). This
container may contains nested mutable types, and you have to `deepcopy` to
ensure your copy will reflect the unchanged data from the database and
will not be modified.
3. your overload of `save()` isn't generic enough to support all possible
arguments.
Interesting you saved the data in `_state`.
That's cleaner than my mix-in.
I am thinking loudly, but is it more performant that any field set by the
code should be updated regardless the value, something that will not
happen often, or is it better to check for diff. Checking for diff will
also double the memory usage for the data (is it any copy-on-write mix-in
out here?).
We also need to think about concurrent updates when designing a save-the-
changes mix-in. For that reason I implemented another mix-in
[https://github.com/davidfischer-
ch/pytoolbox/blob/master/pytoolbox/django/models/mixins.py#L243
UpdatePreconditionsMixin]. I don't know what is the best option, but maybe
its more explicit and intuitive that if the program sets the fields, that
they will be saved regardless the value.
Yes, many questions and thoughts.
I am eager to heard about you guys.
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:8>
Comment (by davidfischer-ch):
Interesting:
https://docs.python.org/3.4/library/collections.html#collections.ChainMap.
That can be our copy-on-write "kind of" data structure!
--
Ticket URL: <https://code.djangoproject.com/ticket/25363#comment:9>