[Django] #35587: Add QuerySet.partition(*args, **kwargs)


Django

Jul 9, 2024, 4:49:13 PM
to django-...@googlegroups.com
#35587: Add QuerySet.partition(*args, **kwargs)
-------------------------------------+-------------------------------------
     Reporter:  micahcantor          |                     Type:  New feature
       Status:  new                  |                Component:  Database layer (models, ORM)
      Version:  5.0                  |                 Severity:  Normal
     Keywords:                       |             Triage Stage:  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
A common task with a Django model is to partition the model instances into
two sets: those that are selected by some filters, and those that are not.
Naively, the following utility can accomplish this with
QuerySet.filter() and QuerySet.exclude():

{{{
from django.db.models import QuerySet
from django.db.models.manager import BaseManager


def partition(self, *args, **kwargs):
    # Two independent querysets: rows matching the conditions, and the rest.
    filtered = self.filter(*args, **kwargs)
    excluded = self.exclude(*args, **kwargs)
    return filtered, excluded


# Monkey-patch so the method is available on querysets and managers.
QuerySet.partition = partition
BaseManager.partition = partition
}}}

For instance, if we have a Book model, we can divide its instances into
fiction and nonfiction.

{{{
fiction, nonfiction = Book.objects.partition(genre="fiction")
}}}

Obtaining two separate QuerySets is often helpful if we want to add further
filters, ordering, or prefetches to one set but not the other.

Adding this method to Django would be a helpful utility, and could also be
implemented more efficiently than my own naive implementation. It would be
difficult for me to suggest a better implementation without a deeper
understanding of the implementations of filter() and exclude().
--
Ticket URL: <https://code.djangoproject.com/ticket/35587>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Jul 9, 2024, 5:15:28 PM
to django-...@googlegroups.com
#35587: Add QuerySet.partition(*args, **kwargs)
-------------------------------------+-------------------------------------
     Reporter:  Micah Cantor         |                    Owner:  (none)
         Type:  New feature          |                   Status:  closed
    Component:  Database layer       |                  Version:  5.0
                (models, ORM)        |
     Severity:  Normal               |               Resolution:  wontfix
     Keywords:                       |             Triage Stage:  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* resolution: => wontfix
* status: new => closed

Comment:

I don't think it's worth extending the `QuerySet` API with a method that
can be emulated through various means and that would entertain the idea that
the returned sets of objects will always be mutually exclusive. This is not
a guarantee that the ORM can provide, for a few reasons.

First, the querysets are going to hit the database serially, and thus
they won't be executed against the same ''snapshot'', so an object could be
changed in a way that makes it appear in both partitions. Secondly, while
the ORM goes to great lengths to make `exclude` the complement of `filter`,
[https://code.djangoproject.com/query?description=~exclude&status=assigned&status=new&order=id&desc=1
it has a few known bugs] which could also manifest themselves in these
scenarios.
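
To make the first point concrete, here is a small sketch that uses an in-memory list as a stand-in for a database table. It is not Django code; `run_filter` and `run_exclude` are hypothetical helpers that only imitate two queries executed one after the other, with a write landing in between.

```python
# In-memory stand-in for a table. This is NOT Django code, only an
# illustration of two serial reads with a write in between.
books = [
    {"id": 1, "genre": "fiction"},
    {"id": 2, "genre": "history"},
]


def run_filter(rows, genre):
    """Imitates the first query: copies of rows matching the condition."""
    return [dict(row) for row in rows if row["genre"] == genre]


def run_exclude(rows, genre):
    """Imitates the second query: copies of rows not matching it."""
    return [dict(row) for row in rows if row["genre"] != genre]


# The first query runs.
fiction = run_filter(books, "fiction")

# A concurrent writer changes book 1 before the second query runs.
books[0]["genre"] = "history"

# The second query runs against the modified data.
nonfiction = run_exclude(books, "fiction")

fiction_ids = {row["id"] for row in fiction}
nonfiction_ids = {row["id"] for row in nonfiction}

# Book 1 was read as fiction by the first query and as non-fiction by the
# second, so it ends up in both partitions.
overlap = fiction_ids & nonfiction_ids
print(overlap)
```

With a real database the same interleaving can happen whenever the two querysets are evaluated outside a shared snapshot.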

You are likely better off with a single query that uses an annotation as
the Python-level predicate for partitioning:

{{{#!python
from itertools import filterfalse
from operator import attrgetter
from django.db.models import Q

def partition(self, *args, **kwargs):
    queryset = self.annotate(_partition_predicate=Q(*args, **kwargs))
    predicate = attrgetter("_partition_predicate")
    return filter(predicate, queryset), filterfalse(predicate, queryset)
}}}

But that doesn't allow chaining, which, for the aforementioned reasons, I
believe is not achievable.
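
As a rough illustration of why the single-query approach yields disjoint partitions but no chaining, the sketch below applies the same `filter` / `filterfalse` split to plain Python objects. The `SimpleNamespace` rows and their titles are made up for the example; they stand in for model instances that already carry a `_partition_predicate` attribute from one annotated query.

```python
from itertools import filterfalse
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical rows, as they might look after ONE annotated query has been
# evaluated. SimpleNamespace stands in for model instances.
rows = [
    SimpleNamespace(title="Dune", _partition_predicate=True),
    SimpleNamespace(title="Cosmos", _partition_predicate=False),
    SimpleNamespace(title="Emma", _partition_predicate=True),
]

predicate = attrgetter("_partition_predicate")

# Both partitions are computed from the same fetched rows, so every row
# lands in exactly one of them.
matched = list(filter(predicate, rows))
unmatched = list(filterfalse(predicate, rows))

matched_titles = [row.title for row in matched]
unmatched_titles = [row.title for row in unmatched]
print(matched_titles, unmatched_titles)
```

The results are ordinary Python iterables rather than querysets, so further `.filter()`, `.order_by()` or prefetch calls cannot be applied to either half.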
--
Ticket URL: <https://code.djangoproject.com/ticket/35587#comment:1>