[Django] #30863: Queryset __repr__ can overload a database server in some cases


Django

Oct 9, 2019, 1:34:51 PM
to django-...@googlegroups.com
#30863: Queryset __repr__ can overload a database server in some cases
-------------------------------------+-------------------------------------
Reporter: Matt Johnson                  | Owner: nobody
Type: Uncategorized                     | Status: new
Component: Database layer (models, ORM) | Version: 2.2
Severity: Normal                        | Keywords: queryset repr __repr__
Triage Stage: Unreviewed                | Has patch: 0
Needs documentation: 0                  | Needs tests: 0
Patch needs improvement: 0              | Easy pickings: 1
UI/UX: 0                                |
-------------------------------------+-------------------------------------
Consider a model like this:


{{{
from django.db import models


class Result(models.Model):
    # A Result object represents someone who took a quiz
    result_id = models.AutoField(primary_key=True, ...)
    # Assume this boils down to an integer field
    quiz = models.ForeignKey("Quiz", ...)
    name = models.CharField(...)

    class Meta:
        ordering = ['name']
}}}

Assume it has hundreds of millions of records, and no index on the "name"
column.

Typical usage might be something like
{{{
Result.objects.filter(quiz_id=123)
}}}

Now consider a bug in the usage, like:

{{{
# Notice we used a string to filter
Result.objects.filter(quiz_id="somestring")
}}}
Django will raise an exception (rightly so).

As part of the usual error reporting process in debug mode, Django may
eventually call repr() on the "base" queryset (that is essentially
Result.objects.all()).

QuerySet.__repr__ tries to be helpful by evaluating the query and fetching
its first 21 results for display. Because the base queryset orders by the
unindexed "name" column, this can easily overload the database when it
runs "SELECT ... FROM Result ORDER BY name LIMIT 21", which has to sort
hundreds of millions of rows by an unindexed column.
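
To make the mechanism concrete, here is a minimal sketch in plain Python
that does not use Django. FakeQuerySet, its _fetch() method and its
"executed" log are hypothetical stand-ins invented for illustration; only
the shape of __repr__ (fetch 21 rows, show 20 plus a truncation marker) is
modelled on what QuerySet does. It shows that building the object runs
nothing, while a single repr() call issues the ordered, limited query.

```python
REPR_OUTPUT_SIZE = 20  # 20 rows shown, one extra fetched to detect truncation


class FakeQuerySet:
    """Hypothetical stand-in for a lazy queryset; not Django's code."""

    def __init__(self, rows, order_by):
        self._rows = rows
        self._order_by = order_by
        self.executed = []  # log of simulated SQL statements

    def _fetch(self, limit):
        # Simulates "SELECT ... ORDER BY <col> LIMIT <n>": every row is
        # sorted before the limit applies, which is the expensive part
        # when the column has no index.
        self.executed.append(
            "SELECT ... ORDER BY %s LIMIT %d" % (self._order_by, limit)
        )
        return sorted(self._rows)[:limit]

    def __repr__(self):
        # Same shape as the behaviour described above: evaluate, then print.
        data = self._fetch(REPR_OUTPUT_SIZE + 1)
        if len(data) > REPR_OUTPUT_SIZE:
            data[-1] = "...(remaining elements truncated)..."
        return "<%s %r>" % (self.__class__.__name__, data)


qs = FakeQuerySet(rows=["name%03d" % i for i in range(100)], order_by="name")
queries_before = list(qs.executed)  # nothing has run yet
text = repr(qs)                     # what an error reporter would call
queries_after = list(qs.executed)   # one ordered, limited query has run
```

The point of the sketch is only that repr() is the trigger: nothing in the
calling code asked for rows, yet a full sort is requested from the backend.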

Even with debug mode turned off, some error reporting tools like Sentry
will call repr on the queryset, creating the same problem in production.

I suggest not showing any query data in QuerySet.__repr__.
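
As a rough illustration of that suggestion, the sketch below uses a
hypothetical LazyRows class (plain Python, not a Django API) whose
__repr__ only describes the pending query and never loads rows. The
"description" string, the "loader" callable and the "loads" counter are
assumptions made up for this example.

```python
class LazyRows:
    """Hypothetical lazy container standing in for a queryset."""

    def __init__(self, description, loader):
        self.description = description  # human-readable summary of the query
        self._loader = loader           # callable that would hit the database
        self.loads = 0                  # how many times rows were loaded

    def __iter__(self):
        # Rows are loaded only when the caller explicitly iterates.
        self.loads += 1
        return iter(self._loader())

    def __repr__(self):
        # Describes the query without evaluating it, so error reporters
        # that call repr() cannot trigger an expensive sort.
        return "<%s: %s>" % (type(self).__name__, self.description)


rows = LazyRows("Result ORDER BY name", loader=lambda: ["alice", "bob"])
summary = repr(rows)          # "<LazyRows: Result ORDER BY name>"
loads_after_repr = rows.loads # still 0: repr() did not load anything
```

The trade-off is that the repr is less informative in an interactive
shell, where seeing the first few rows is often what the user wants.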

--
Ticket URL: <https://code.djangoproject.com/ticket/30863>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Oct 10, 2019, 2:05:35 AM
to django-...@googlegroups.com
#30863: Queryset __repr__ can overload a database server in some cases
-------------------------------------+-------------------------------------
Reporter: Matt Johnson                  | Owner: nobody
Type: Cleanup/optimization              | Status: closed
Component: Database layer (models, ORM) | Version: 2.2
Severity: Normal                        | Resolution: duplicate
Keywords: queryset repr __repr__        | Triage Stage: Unreviewed
Has patch: 0                            | Needs documentation: 0
Needs tests: 0                          | Patch needs improvement: 0
Easy pickings: 1                        | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* status: new => closed
* type: Uncategorized => Cleanup/optimization
* resolution: => duplicate


Comment:

Duplicate of #20393 (see
[https://code.djangoproject.com/ticket/20393#comment:5 comment]).

--
Ticket URL: <https://code.djangoproject.com/ticket/30863#comment:1>
