I have a small improvement to suggest for one-to-one fields: Make them cache
back-references on related objects. That is, assume
class A(model):
pass
class B(model):
a = OneToOneField(A)
a1 = A.objects.get(...) # assume a1 has a related B record
b1 = B.objects.get(...)
Today, both (a1.b.a is a1) and (b1.a.b is b1) are false; each of the
expressions generates two database hits. I propose to fix this.
I've run into the issue while doing some performance-improvement work on a
large project; in this project, the painful instance of the problem was a
user-profile model. The developers used the reverse of the 1-to-1 on the profile
model instead of django.contrib.auth.models.User.get_profile(), and they often
navigated between the user object and the profile object in both ways
(sometimes even for good reasons), generating redundant database hits.
I wrote tests for the problem, and a fix, in the context of the project; I've
now ported it to a fix in Django itself. It is a small fix, and I think beyond
its general value, it will allow removing the special-case-caching from
User.get_profile().
Do you see a reason why I should not post a ticket?
Thanks,
Shai.
+1
Do you see a reason why I should not post a ticket?
Yes! Good improvement. And we should do the same thing for one-to-many
fields (ForeignKeys):
"""
class Author(models.Model):
name = models.CharField(...)
class Book(models.Model):
author = models.ForeignKey(Author)
john = Author.objects.get(name='john')
books = list(john.book_set.all())
# Does a database query, but should be smart enough to simply return john.
books[0].author
"""
I'm pretty sure there's a long-standing ticket for this, but I'm not
sure which one it is. Shai, does your solution approach this in a way
that can solve the issue for ForeignKeys as well?
Adrian
This one seems much more likely to be a breaking change - it isn't
that unusual to do something like:
special_book = Book.objects.get(...)
special_book.fiddly_bit = True
# note no save
for author in Author.objects.all():
for book in author.books.all():
if book.fiddly_bit:
book.related_books.add(special_book)
The semantics of this change under a new caching layer.
If the intention is that it's really the same object (and not a new
instance of the same db object retrieved from a caching layer) then
this introduces actions at a distance.
This is ye olde Django ticket:
https://code.djangoproject.com/ticket/17
And for context: ;-)
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
I don't think Adrian is proposing anything as extensive as #17. What
he's proposing (IIUC) wouldn't change the semantics of your sample code
at all. All it would do is prepopulate the FK field on the results of a
reverse-FK query, so the innermost "book.author" here doesn't need to do
a query:
for author in Author.objects.all():
for book in author.books.all():
book.author
There wouldn't be any action-at-a-distance with other Book instances.
Carl
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAk8YXbgACgkQ8W4rlRKtE2fWPQCfXUp1NVmTaNvwZVPpyOsFVAtP
+UsAoJqJ5l0nz+98L+Z+dJ8rdPe5DeM8
=BTg6
-----END PGP SIGNATURE-----
>
> john = Author.objects.get(name='john')
> books = list(john.book_set.all())
>
> # Does a database query, but should be smart enough to simply return john.
> books[0].author
> """
>
> I'm pretty sure there's a long-standing ticket for this, but I'm not
> sure which one it is. Shai, does your solution approach this in a way
> that can solve the issue for ForeignKeys as well?
>
Only the tests...
My solution is a little modification on the descriptors used for the related
objects. But for foreign keys, the relevant descriptor returns a manager which
uses the QuerySet of the related model's default manager. To get the desired
behavior, that manager should modify (or wrap) this QuerySet, to seed the
cache on each object it returns.
I may have time to work on something like this, but probably only next week.
Shai.
Right, what Carl said -- no action-at-a-distance, just mere cache
population on each related object. I don't think there would be a risk
of backwards-incompatible side effects.
Adrian