M2M Intermediary Usage Question


Joshua Uziel

Jul 30, 2008, 2:55:55 PM
to django...@googlegroups.com
Got a question about the new M2M Intermediary bits from #6095... pertinent model bits:

class Document(models.Model):
    ....
    members = models.ManyToManyField(User, through='Role')
    ....

class Role(models.Model):
    ROLES = (('owner', 'Owner'), ('author', 'Author'), ('reviewer', 'Reviewer'))
    document = models.ForeignKey(Document)
    user = models.ForeignKey(User)
    role = models.CharField(max_length=8, choices=ROLES)

So, if I want to get all users who are members of a document, in the old world without that "through=":

>>> doc.role_set.all()
[<Role: uzi:owner:This is a test>, <Role: juziel:author:This is a test>]
>>> [role.user for role in doc.role_set.all()]
[<User: uzi>, <User: juziel>]

but now I can

>>> doc.members.all()
[<User: uzi>, <User: juziel>]

So far so good.  What if I only want users who are owners?  The old way:

>>> doc.role_set.filter(role="owner")
[<Role: uzi:owner:This is a test>]
>>> [role.user for role in doc.role_set.filter(role="owner")]
[<User: uzi>]

I would think that the new way would be something like this for owners:

>>> doc.members.filter(role__role="owner")
[<User: uzi>, <User: uzi>, <User: uzi>]

or this for authors:

>>> doc.members.filter(role__role="author")
[<User: uzi>, <User: juziel>]

Looking at all roles I have in my test data:

>>> Role.objects.all()
[<Role: uzi:owner:This is a test>, <Role: uzi:author:Another test>, <Role: uzi:owner:FooBar>, <Role: juziel:author:This is a test>, <Role: uzi:owner:BazQux>]

So what's happening is that I'm getting the document's members who hold the "owner" role on any document in the first query, or the "author" role on any document in the second, once per matching Role.  Instead, I need to do:

>>> doc.members.filter(role__role="owner", document=doc)
[<User: uzi>]
>>> doc.members.filter(role__role="author", document=doc)
[<User: juziel>]

I just think the previous queries make more sense... shouldn't filter() return a subset of all()?  If this is a bug, by all means, let me know and I'll file it.  I just wanted to make sure I wasn't misunderstanding things first.  Thanks!

Malcolm Tredinnick

Jul 30, 2008, 4:30:37 PM
to django...@googlegroups.com

I think what's happening here is that you're stumbling over a really
subtle bug that I only noticed yesterday. It's not really related to
"through", since I saw it with many-to-many relations prior to that
field.

I may be off on your particular case, since I haven't actually typed in the
models and looked at the query, but if you look at the SQL output
(my_queryset.query.as_sql()), I think you'll see it joining twice on the
intermediate table. Which is actually a good thing normally, but wrong
in this case. Here are the ugly details.

Firstly, you have to know about the difference between one filter() call
with multiple conditions and two chained filter() calls for multi-valued
(many-to-many, reverse many-to-one, etc) relations. See [1] for the
description from the Django docs. There are two legitimate desired
behaviours here, so Django provides a way to do both without adding
burdensome API, but it's a little tricky until you try it out.
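
To make the two behaviours concrete, here is a plain-Python model of them. It is not Django code: the Role tuple, ROLES list and both helper functions are made up for illustration, using the test data quoted earlier in the thread. It ignores duplicates and only shows which users match.

```python
from collections import namedtuple

# Hypothetical stand-in for rows of the intermediate Role table.
Role = namedtuple("Role", "user role document")

ROLES = [
    Role("uzi", "owner", "This is a test"),
    Role("uzi", "author", "Another test"),
    Role("uzi", "owner", "FooBar"),
    Role("juziel", "author", "This is a test"),
    Role("uzi", "owner", "BazQux"),
]


def unique_users(rows):
    """Return user names in first-seen order, without duplicates."""
    seen = []
    for row in rows:
        if row.user not in seen:
            seen.append(row.user)
    return seen


def one_call(document, role):
    """Model of a single filter(): one Role row must satisfy both conditions."""
    return unique_users(
        r for r in ROLES if r.document == document and r.role == role
    )


def chained_calls(document, role):
    """Model of chained filter()s: each condition may match a different row."""
    on_document = {r.user for r in ROLES if r.document == document}
    return unique_users(
        r for r in ROLES if r.role == role and r.user in on_document
    )


print(one_call("This is a test", "author"))
print(chained_calls("This is a test", "author"))
```

In this model the chained form also returns uzi for "author", because uzi is on "This is a test" through one row and is an author through a different row.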

The trick (problem!) here is that a related manager is really just a
queryset under the covers. So doc.members is a filter (on the doc_id
being equal to some specific value) across a reverse many-to-one
relation. That is, there are potentially multiple members for a single
doc instance. So when you write doc.members.filter(role__role=...),
Django is treating the filter() call as a separate distinct filter() in
the sense of the above documentation. Thus it's joining to a new copy of
the Role table.
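
The effect of that second join can be reproduced outside Django. The sketch below uses the standard-library sqlite3 module with invented table and column names (users, documents, roles) and hand-written SQL, so it only approximates the shape of the query described above and is not what Django actually generates. The rows are the test data quoted earlier in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE roles (
        id INTEGER PRIMARY KEY,
        document_id INTEGER REFERENCES documents(id),
        user_id INTEGER REFERENCES users(id),
        role TEXT
    );
    INSERT INTO users VALUES (1, 'uzi'), (2, 'juziel');
    INSERT INTO documents VALUES
        (1, 'This is a test'), (2, 'Another test'), (3, 'FooBar'), (4, 'BazQux');
    INSERT INTO roles (document_id, user_id, role) VALUES
        (1, 1, 'owner'), (2, 1, 'author'), (3, 1, 'owner'),
        (1, 2, 'author'), (4, 1, 'owner');
""")

# Two copies of the intermediate table: r1 restricts the document,
# r2 restricts the role, and nothing ties r1 and r2 to the same row.
TWO_JOINS = """
    SELECT u.username FROM users u
    JOIN roles r1 ON r1.user_id = u.id
    JOIN roles r2 ON r2.user_id = u.id
    WHERE r1.document_id = ? AND r2.role = ?
    ORDER BY u.id
"""

# One copy of the intermediate table: both conditions apply to the same row.
ONE_JOIN = """
    SELECT u.username FROM users u
    JOIN roles r ON r.user_id = u.id
    WHERE r.document_id = ? AND r.role = ?
    ORDER BY u.id
"""


def usernames(sql, document_id, role):
    """Run one of the queries above and return the matching user names."""
    return [row[0] for row in conn.execute(sql, (document_id, role))]


print(usernames(TWO_JOINS, 1, "owner"))   # ['uzi', 'uzi', 'uzi']
print(usernames(TWO_JOINS, 1, "author"))  # ['uzi', 'juziel']
print(usernames(ONE_JOIN, 1, "owner"))    # ['uzi']
print(usernames(ONE_JOIN, 1, "author"))   # ['juziel']
```

With two joins, uzi shows up once for every "owner" row he has on any document, which matches the duplicated results in the original report. With one join, only the role on this document counts.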

What I need to do is make the first filter() after a related manager
"sticky" in the sense that it's treated as the same filter call, since
that's sort of the natural behaviour. This isn't entirely trivial to do
because querysets don't really know "where they've come from", but I've
been thinking about a solution as I wander around today and I think I
can make something work in the next day or two.

Your timing here is spectacular. Like I said, I had no idea this was an
issue 24 hours ago, but saw it in some work I was doing for a client as
we were wondering why they were seeing some duplicated results. Thus,
I'm on the case. If you'd reported this yesterday, I would have just
looked confused. Today I can at least say I know what's happening, but I
don't have a solution yet.

Regards,
Malcolm

P.S. Whilst writing this reply, I opened #8046 so that people can track
progress on this issue if they want. It's pretty high on my TODO list,
since it's tricky enough to be hard for people to debug if they
encounter it.


Joshua Uziel

Jul 30, 2008, 5:01:09 PM
to django...@googlegroups.com
Oh good, so I'm not crazy.  I ran into this yesterday as well and decided to play with it a bit more before firing off an email.  Thanks Malcolm!  You missed sending the link, but I found what you meant to send, just in case anyone else is curious (and for future people searching archives):

  http://www.djangoproject.com/documentation/db-api/#spanning-multi-valued-relationships

Let me know if you'd like me to test anything on my side.  BTW, it was great to meet you and chat at the Sausalito Sprint... I hope we get to do so again soon...