New "pristine" managers kill overriding of "get"

4 views
Skip to first unread message

Ivan Sagalaev

unread,
Jul 23, 2008, 9:41:32 AM7/23/08
to django-d...@googlegroups.com
I'd like to second magneto's concern[1] about new "pristine" managers
that access parent object. As it's implemented now it kills our memory
map for commonly accessed models. For example we have a City model with
the contents that changes very rarely. We overrode manager's get to
cache objects in memory:

def get(self, *args, **kwargs):
if not hasattr(self, '_cache'):
self._cache = # build a dictself.all())
return self._cache.get(...)

This was working for every some_object.city event across requests. Now
it doesn't work because parent is not accessed through custom manager.

For now I can't propose any nice solution apart from backing out that
revision because it needs to be reworked.


[1]: http://code.djangoproject.com/ticket/7666#comment:7

Johannes Dollinger

unread,
Jul 23, 2008, 10:52:58 AM7/23/08
to django-d...@googlegroups.com
#7566[1] and #2698[2] are somewhat related to #7666.

After reading Malcolm's comment on #7154 (concerning #7566 and #2698)
I expected #7666 to be wontfix:
> I think the two tickets you pointed out aren't really related to
> this, since they're based on people hoping for different behaviour
> from the default manager than what actually happens. Default
> manager == default behaviour. So I'm not going to worry about those
> tickets yet.

Perhaps a more transparent explanation than "several parts of Django
will use that Manager exclusively for that model" would be all that's
needed. So the following warning won't be taken lightly: "it’s often
a good idea to be careful in your choice of default manager, in order
to avoid a situation where overriding of get_query_set() results in
an inability to retrieve objects you’d like to work with.".

[1] http://code.djangoproject.com/ticket/7566
[2] http://code.djangoproject.com/ticket/2698

Ivan Sagalaev

unread,
Jul 23, 2008, 12:14:02 PM7/23/08
to django-d...@googlegroups.com
Johannes Dollinger wrote:
> Perhaps a more transparent explanation than "several parts of Django
> will use that Manager exclusively for that model" would be all that's
> needed.

I don't need an explanation why my code doesn't work :-). For that
matter I was the one who was objecting to this change because "managers
are used not only for filtering".

So I want to solve the problem, not document that it's insolvable.

Malcolm Tredinnick

unread,
Jul 23, 2008, 1:29:40 PM7/23/08
to django-d...@googlegroups.com

On Wed, 2008-07-23 at 16:52 +0200, Johannes Dollinger wrote:
> #7566[1] and #2698[2] are somewhat related to #7666.
>
> After reading Malcolm's comment on #7154 (concerning #7566 and #2698)
> I expected #7666 to be wontfix:

Don't confuse things I would like to happen as a broad goal with things
that will happen. I was pointing out in that comment that those two
tickets were certainly independent of #7154. But I also realise that the
whole default manage issue is controversial. There are two reasonably
legitimate sides (of course the side that doesn't agree with *me* is
obviously less legitimate, but they don't seem to have read that memo
yet :-) ) to this and I wasn't about to go around wontfixing tickets
based on what I'd like to see happen eventually.

There are smart people with real use-cases involved in this thread, so
I'm leaving it up to them to come up with a reasonable solution. Yes,
personally I think people should treat the default manager as the thing
that is selected by default, but to make that work conherently, we would
probably need to be able to specify managers on related fields
(currently we use the default manager of the related object) for things
like Joseph's use-case. That's quite probably required for multi-db work
anyway, but that's work for the far future, rather than now.

Regards,
Malcolm

Ivan Sagalaev

unread,
Jul 23, 2008, 4:16:43 PM7/23/08
to django-d...@googlegroups.com
Ivan Sagalaev wrote:
> For now I can't propose any nice solution apart from backing out that
> revision because it needs to be reworked.

Looks like I've got something that could work...

Manager class should have a separate method get_full_query_set() that
will return an unfiltered queryset. Current get_query_set() will just
call get_full_query_set() by default. This way users could override
get_full_query_set however they like but it will be required to return
all records. get_query_set() will remain a "filtering point".

Overriding Manager.get still won't work as it was working before but at
least we will be able to subclass a QuerySet, override "get" there and
return this QuerySet from get_full_query_set()

/me went away to write a patch...

Johannes Dollinger

unread,
Jul 23, 2008, 4:54:16 PM7/23/08
to django-d...@googlegroups.com
I somehow assumed "default manager" was already well defined. Sorry
about the confused noise then. It's probably that I still don't get
why one would ever want to restrict get_query_set() for any other
reason than to restrict the set of db entries visible to django. Or
why one would want more than one manager. But that's a different
discussion.

Johannes Dollinger

unread,
Jul 23, 2008, 5:06:12 PM7/23/08
to django-d...@googlegroups.com

I was arguing for reverting r8017 and wontfixing the issue that
caused it (#7666).
That would have solved your problem.

Unfortunatelly "default manager" is not as clearly defined as I hoped.

>
> >


Ivan Sagalaev

unread,
Jul 23, 2008, 5:08:47 PM7/23/08
to django-d...@googlegroups.com
Johannes Dollinger wrote:
> It's probably that I still don't get
> why one would ever want to restrict get_query_set() for any other
> reason than to restrict the set of db entries visible to django.

It doesn't have to *restrict* a queryset. Here's a usecase (it might
seem boring but it's real). I have a forum where Article refers to User
and Profile refers to User. I want Articles selected with their Users
*and* their Profiles. select_related() would fetch Users just fine but
it won't fetch their Profiles because it's a reverse relation (Profiles
refers to Users). So I subclass a QuerySet with ArticleQuerySet where I
have one additional query for Profiles which I then just assign to their
Articles. This is where I have a manager whose get_query_set() returns
ArticleQuerySet instead of standard QuerySet. It's not restricted but
rather "enhanced".

Ivan Sagalaev

unread,
Jul 23, 2008, 5:10:30 PM7/23/08
to django-d...@googlegroups.com
Johannes Dollinger wrote:
> I was arguing for reverting r8017 and wontfixing the issue that
> caused it (#7666).
> That would have solved your problem.

Ah, I see now, thanks for clarification. However the issue that caused
#7666 is also real and I think it should be fixed also. I have proposed
a solution (though not perfect) elsewhere in the thread.

Johannes Dollinger

unread,
Jul 23, 2008, 5:40:22 PM7/23/08
to django-d...@googlegroups.com

That's fine. Then you don't have problems as in #7666, #2698, or
#7566. I understand that overriding get_query_set() is useful, I do
that myself. But if you want subsets of your data, I think you're
better off providing a filter method on a custom QuerySet class and
optionally providing a proxy on your manager for consistency. That
gives you more flexibility (http://dpaste.com/hold/67035/) and
bypasses the problems in the mentioned tickets.

>
> >


Ivan Sagalaev

unread,
Jul 23, 2008, 5:52:17 PM7/23/08
to django-d...@googlegroups.com
Ivan Sagalaev wrote:
> Looks like I've got something that could work...
>
> Manager class should have a separate method get_full_query_set() that
> will return an unfiltered queryset. Current get_query_set() will just
> call get_full_query_set() by default. This way users could override
> get_full_query_set however they like but it will be required to return
> all records. get_query_set() will remain a "filtering point".
>
> Overriding Manager.get still won't work as it was working before but at
> least we will be able to subclass a QuerySet, override "get" there and
> return this QuerySet from get_full_query_set()
>
> /me went away to write a patch...

Actually I didn't... I still won't to invent some way of preserving
current behavior where we use manager instance to store a cache that
"get" uses. I'm thinking of something along the lines `Manager.get(self,
use_full_queryset=False)` for all proxy methods. Or may be really having
a pre-defined separate `all_objects` property on a model for a second
manager with full results... If anyone have any ideas don't hesitate to
share!

Justin Bronn

unread,
Jul 23, 2008, 7:23:15 PM7/23/08
to Django developers
> Actually I didn't... I still won't to invent some way of preserving
> current behavior where we use manager instance to store a cache that
> "get" uses. I'm thinking of something along the lines `Manager.get(self,
> use_full_queryset=False)` for all proxy methods. Or may be really having
> a pre-defined separate `all_objects` property on a model for a second
> manager with full results... If anyone have any ideas don't hesitate to
> share!

I attached a patch to #7904. Basically, if you add a
`use_for_related_fields` attribute to your custom manager, the field
will respect that and use the manager when calling `get`. This allows
those using their managers to be able to disable the new behavior.

-Justin

Ivan Sagalaev

unread,
Jul 24, 2008, 2:19:56 AM7/24/08
to django-d...@googlegroups.com
Johannes Dollinger wrote:
> That's fine. Then you don't have problems as in #7666

I provided this example just because you asked what people can do apart
from filtering.

My current problem is described in first post of the thread: accessing
`child.parent` doesn't call Parent.objects.get. It's not #7666 by itself
it's just mentioned there in comments.

Ivan Sagalaev

unread,
Aug 5, 2008, 2:51:38 AM8/5/08
to django-d...@googlegroups.com
Ivan Sagalaev wrote:
> Looks like I've got something that could work...
>
> Manager class should have a separate method get_full_query_set() that
> will return an unfiltered queryset. Current get_query_set() will just
> call get_full_query_set() by default. This way users could override
> get_full_query_set however they like but it will be required to return
> all records. get_query_set() will remain a "filtering point".
>
> Overriding Manager.get still won't work as it was working before but at
> least we will be able to subclass a QuerySet, override "get" there and
> return this QuerySet from get_full_query_set()
>
> /me went away to write a patch...

Hello!

I've managed to come to agreement with all talking voices in my head and
come up with a patch[1] that seems sane to me. The solution is explained
in the ticket an I ask someone of the triagers or core devs to take a
look at it when you have time.

Thanks!

[1]: http://code.djangoproject.com/ticket/8119

Justin Bronn

unread,
Aug 5, 2008, 10:56:25 AM8/5/08
to Django developers
> I've managed to come to agreement with all talking voices in my head and
> come up with a patch[1] that seems sane to me. The solution is explained
> in the ticket an I ask someone of the triagers or core devs to take a
> look at it when you have time.

I closed #8119 as a duplicate and moved your patch and comments to
#7904, which is the original ticket that magneto opened for this
issue. My first impression is that your patch is more thought out than
my attribute approach.

Any thoughts from the core devs?

-Justin

magneto

unread,
Aug 5, 2008, 11:06:31 AM8/5/08
to Django developers

> I closed #8119 as a duplicate and moved your patch and comments to
> #7904, which is the original ticket that magneto opened for this
> issue. My first impression is that your patch is more thought out than
> my attribute approach.
>
> Any thoughts from the core devs?

That patch at least maintains some standard OO subclassing. The only
thing that's kinda weird to me is the naming of them. 'get_full' in a
'get' context seems like it should be be 'get_queryset_for_one', and
likewise 'get_full_query_set' in a 'filter' context sounds like a
'get_queryset_for_many' (or 'get_default_queryset').

bo

Ivan Sagalaev

unread,
Aug 6, 2008, 3:30:08 AM8/6/08
to django-d...@googlegroups.com
magneto wrote:
> That patch at least maintains some standard OO subclassing. The only
> thing that's kinda weird to me is the naming of them. 'get_full' in a
> 'get' context seems like it should be be 'get_queryset_for_one', and
> likewise 'get_full_query_set' in a 'filter' context sounds like a
> 'get_queryset_for_many' (or 'get_default_queryset').

I'm not very fond of "get_full" either :-). The reason is since I have
"get_full_query_set" (which is OK since we already have
get_empty_query_set) then "get_full" should point that it's different
from "get" by accessing "get_full_query_set". But in the context of
calling it looks strange indeed. For alternatives:

- get_unfiltered
- unfiltered_get
- get_from_full
- get_for_parent (but it restricts its usage)

?

James Bennett

unread,
Aug 6, 2008, 3:58:54 AM8/6/08
to django-d...@googlegroups.com
On Wed, Aug 6, 2008 at 2:30 AM, Ivan Sagalaev
<man...@softwaremaniacs.org> wrote:
> I'm not very fond of "get_full" either :-). The reason is since I have
> "get_full_query_set" (which is OK since we already have
> get_empty_query_set) then "get_full" should point that it's different
> from "get" by accessing "get_full_query_set". But in the context of
> calling it looks strange indeed. For alternatives:
>
> - get_unfiltered
> - unfiltered_get
> - get_from_full
> - get_for_parent (but it restricts its usage)
>
> ?


super(SomeManagerClass, self).get_query_set()

And you're done.


--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

Ivan Sagalaev

unread,
Aug 6, 2008, 12:20:06 PM8/6/08
to django-d...@googlegroups.com
James Bennett wrote:
>> I'm not very fond of "get_full" either :-). The reason is since I have
>> "get_full_query_set" (which is OK since we already have
>> get_empty_query_set) then "get_full" should point that it's different
>> from "get" by accessing "get_full_query_set". But in the context of
>> calling it looks strange indeed. For alternatives:
>>
>> - get_unfiltered
>> - unfiltered_get
>> - get_from_full
>> - get_for_parent (but it restricts its usage)
>>
>> ?
>
>
> super(SomeManagerClass, self).get_query_set()
>
> And you're done.

No, James, you're missing the point here. This will bypass custom
manager entirely and will use queryset instance directly. This is how it
works now (after #7666) and this is what we want to fix. We want to be
able to override a method on a manager that will be called when looking
up parent object. In the old days before #7666 it was {{{get}}}. Now it
won't work because {{{get}}} uses a queryeset pre-filtered by custom
{{{get_query_set}}}. This is why I'm introducing a new method that a)
uses unfiltered queryset and b) can be overridden.

Waylan Limberg

unread,
Aug 6, 2008, 12:45:36 PM8/6/08
to django-d...@googlegroups.com
Um, doesn't changeset [8212] (from #7904) address this? Just wondering
why this discussion is still going when the core devs have already
made a decision on what approach they're taking and implemented it. Is
there a use case that is still not addressed?

--
----
Waylan Limberg
way...@gmail.com

Ivan Sagalaev

unread,
Aug 6, 2008, 3:09:04 PM8/6/08
to django-d...@googlegroups.com
Waylan Limberg wrote:
> Um, doesn't changeset [8212] (from #7904) address this? Just wondering
> why this discussion is still going when the core devs have already
> made a decision on what approach they're taking and implemented it. Is
> there a use case that is still not addressed?

For some reason I didn't receive email from Trac saying about this
commit... Anyway, this looks strange because Jacob has applied the first
patch (by Justin Bronn) instead of the second one (by me) though
everyone (including Justin) seem to agree that the second approach is
better.

Jacob, can you explain the reason? My own concerns about the approach
with switching attribute are explained in the ticket[1].

[1]: http://code.djangoproject.com/ticket/7904#comment:5

Jacob Kaplan-Moss

unread,
Aug 6, 2008, 3:25:32 PM8/6/08
to django-d...@googlegroups.com
On Wed, Aug 6, 2008 at 2:09 PM, Ivan Sagalaev
<man...@softwaremaniacs.org> wrote:
> Jacob, can you explain the reason? My own concerns about the approach
> with switching attribute are explained in the ticket[1].

I'm not thrilled with adding more methods onto the manager; further,
what happens if someone filters get_full_query_set()? Do we then need
to add a get_really_full_query_set_no_I_mean_it_this_time()? The same
reasoning applies to adding methods to QuerySet itself.

Further, what, exactly, does "full" mean? Someone who's followed this
issue likely understands well enough, but most people won't; it's Yet
Another API Method to learn.

Finally, this is at best a stop-gap: the current handling of related
managers is badly designed, and due for a refactor. So I did the
simplest thing that could possibly work so that we can get 1.0 out on
time.

Jacob

Ivan Sagalaev

unread,
Aug 6, 2008, 5:01:40 PM8/6/08
to django-d...@googlegroups.com
Jacob Kaplan-Moss wrote:
> I'm not thrilled with adding more methods onto the manager; further,
> what happens if someone filters get_full_query_set()?

Not to argue but for the sake of completeness... It's the same situation
if someone, say, returns a string from get_query_set. In other words,
yes, if someone doesn't follow documentation Bad Staff happens :-)

> Finally, this is at best a stop-gap: the current handling of related
> managers is badly designed, and due for a refactor.

Fair enough. Thanks for the explanation...

Reply all
Reply to author
Forward
0 new messages