The above changes generates a “SELECT QUOTE(?), ..., QUOTE(?)” query with
as many values as items in the array. This will hit the SQLite limit
SQLITE_MAX_COLUMN which defaults to 2000 and can only be increased up to
32767.
Before this change we were only limited by SQLITE_MAX_VARIABLE_NUMBER
which is lower by default (999) but which can be increased to a much
higher value. In Debian for example we use a value of 250000.
While the latter limit was unreasonably low (it just costs a bit of memory
and you can expect to have many parameters), the former limit make much
more sense IMO and I don't expect distributions to override the upstream
value.
I'm not sure what's the best way forward. Possibly process parameters by
batch of 2000.
(This bug has been initially reported in https://bugs.debian.org/809211 )
--
Ticket URL: <https://code.djangoproject.com/ticket/26063>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
I forgot to put the link documenting the above limits:
https://sqlite.org/limits.html
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:1>
* severity: Normal => Release blocker
* stage: Unreviewed => Accepted
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:2>
* owner: nobody => aaugustin
* status: new => assigned
Comment:
Yes, we should batch the escaping function.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:3>
Comment (by aaugustin):
I'm stuck writing tests for this.
The problem only occurs when SQLITE_MAX_VARIABLE_NUMBER has been changed
to be greated than SQLITE_MAX_COLUMN, which isn't the default, and there's
no way to introspect these values -- the Python bindings don't expose the
`get_limit` API of sqlite.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:4>
Comment (by aaugustin):
I ended up exercising directly a private API, which isn't great, but
should suffice to prevent regressions.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:5>
* has_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:6>
Comment (by shaib):
Replying to [comment:4 aaugustin]:
> I'm stuck writing tests for this.
>
> The problem only occurs when SQLITE_MAX_VARIABLE_NUMBER has been changed
to be greated than SQLITE_MAX_COLUMN, which isn't the default, and there's
no way to introspect these values -- the Python bindings don't expose the
`get_limit` API of sqlite.
It is enough, I think, to try to generate a query with 2001 variables and
see if it works. That may be less accurate for the reasons, but it checks
the symptoms, and doesn't need private APIs. `cursor.execute('select %s' +
sum(2000*[' + %s']), 2001*[1])` or something like that.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:7>
* cc: d@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:8>
Comment (by tobald):
Aymeric Augustin wrote:
> If the reporter could test it and confirm that it fixes the issue, that
would be great.
I get an error futher:
{{{
File "/usr/lib/python2.7/dist-packages/django/db/models/query.py",
line 258, in __iter__
self._fetch_all()
File "/usr/lib/python2.7/dist-packages/django/db/models/query.py",
line 1074, in _fetch_all
self._result_cache = list(self.iterator())
File "/usr/lib/python2.7/dist-packages/django/db/models/query.py",
line 52, in __iter__
results = compiler.execute_sql()
File "/usr/lib/python2.7/dist-
packages/django/db/models/sql/compiler.py", line 852, in execute_sql
cursor.execute(sql, params)
File "/usr/lib/python2.7/dist-packages/django/db/backends/utils.py",
line 83, in execute
sql = self.db.ops.last_executed_query(self.cursor, sql, params)
File "/usr/lib/python2.7/dist-
packages/django/db/backends/sqlite3/operations.py", line 141, in
last_executed_query
return sql % params
TypeError: not enough arguments for format string
}}}
Here params is a list of 25000 integers.
Thanks,
Christophe
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:9>
Comment (by aaugustin):
Shai: the problem in that case is that there's no way to tell apart these
two cases:
- `cursor.execute()` raised `OperationalError` because
`SQLITE_MAX_VARIABLE_NUMBER = 999` (the default value) and the query
contains more than 999 parameters
- `cursor.execute()` raised `OperationalError` while attempting to
generate the representation of the last query because of the bug reported
in this ticket
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:10>
Comment (by aaugustin):
Well we could look at the exception message. The former gives "too many
SQL variables"; I assume the latter gives something else.
Unfortunately I don't have access to a Debian system affected by this non-
standard configuration right now, so I cannot reproduce the issue easily.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:11>
* needs_better_patch: 0 => 1
Comment:
Marking the patch as needing improvement based on tobald's comment.
Unfortunately I cannot determine what happens based on code inspection.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:12>
Comment (by shaib):
Replying to [comment:10 aaugustin]:
> Shai: the problem in that case is that there's no way to tell apart
these two cases:
>
> - `cursor.execute()` raised `OperationalError` because
`SQLITE_MAX_VARIABLE_NUMBER = 999` (the default value) and the query
contains more than 999 parameters
> - `cursor.execute()` raised `OperationalError` while attempting to
generate the representation of the last query because of the bug reported
in this ticket
...but the statement I gave only has one column. It will never fail over
the bug, only on number of variables.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:13>
Comment (by aaugustin):
Let's assume a statement with C columns and P parameters. With the current
implementation, on SQLite, when `DEBUG = True`:
(1) The statement itself is executed: C columns, P parameters. This
requires `C ≤ SQLITE_MAX_COLUMN` and `P ≤ SQLITE_MAX_VARIABLE_NUMBER`.
(2) Another statement is executed to escape the parameters: P columns, P
parameters. This requires `P ≤ MIN(SQLITE_MAX_COLUMN,
SQLITE_MAX_VARIABLE_NUMBER)`
The default values of SQLite are `SQLITE_MAX_COLUMN = 2000 >
SQLITE_MAX_VARIABLE_NUMBER = 999`. With these values,
`MIN(SQLITE_MAX_COLUMN, SQLITE_MAX_VARIABLE_NUMBER) =
SQLITE_MAX_VARIABLE_NUMBER`. As a consequence, any statement that will go
through (1) without raising an exception will also go through (2) without
raising an exception.
In your example, C = 1 and P = 2000. This will fail at step (1) with
SQLite's default values. It will reproduce the original reporter's
problem, but only on platforms where `SQLITE_MAX_VARIABLE_NUMBER` has been
increased to over 2000.
We need the test to pass or be skipped on all platforms, though.
I see two ways to achieve this:
- detecting the values of `SQLITE_MAX_COLUMN` and
`SQLITE_MAX_VARIABLE_NUMBER` and ajusting accordingly — which doesn't seem
possible in Python
- writing a test just for (2), without going through (1) — which I did.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:14>
Comment (by shaib):
My statement, as far as I understand, does not reproduce the OP's problem,
because their problem is caused by C > 2000 and my statement has C = 1. It
can be used to verify if statements with P>2000 are accepted (it has
P=2001); so,
* if the statement failed declare the test skipped or an expected failure.
* If it succeeded, go on to test with the OP's ORM query
You can skip or xfail a test from within its code by raising the
appropriate exception, AFAIK.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:15>
Comment (by aaugustin):
The OP mentions `Model.objects.filter(foo__in=array)` and says `params is
a list of 25000 integers.`
I understand that as C = <number of fields in Model> and P = 25000.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:16>
Comment (by rhertzog):
You are both correct AFAIK.
But if you do what shaib suggest, then the unit test is only useful on
systems where SQLITE_MAX_VARIABLE_NUMBER has been increased. I would
suggest that maybe two tests are warranted, one to verify that the the
quoting function does it by batches (whatever configuration is in place)
and one to test we can have many values in a real query run through the
ORM if sqlite does accept many parameters.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:17>
Comment (by shaib):
Replying to [comment:16 aaugustin]:
> The OP mentions `Model.objects.filter(foo__in=array)` and says `params
is a list of 25000 integers`.
>
> I understand that as C = <number of fields in Model> and P = 25000.
Right. But the quoting query, with the current code, turns that into
C=P=25000.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:18>
Comment (by rhertzog):
Replying to [comment:11 aaugustin]:
> Well we could look at the exception message. The former gives "too many
SQL variables"; I assume the latter gives something else.
>
> Unfortunately I don't have access to a Debian system affected by this
non-standard configuration right now, so I cannot reproduce the issue
easily.
The Debian bug report linked has the stacktrace, it ends with this:
{{{
File "/usr/lib/python2.7/dist-
packages/django/db/backends/sqlite3/operations.py", line 128, in
last_executed_query
params = self._quote_params_for_last_executed_query(params)
File "/usr/lib/python2.7/dist-
packages/django/db/backends/sqlite3/operations.py", line 117, in
_quote_params_for_last_executed_query
return cursor.execute(sql, params).fetchone()
OperationalError: too many columns in result set
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:19>
Comment (by aaugustin):
Shai: exactly — that's why we can't tell an `OperationalError` when
running the query because P > SQLITE_MAX_VARIABLE_NUMBER from an
`OperationalError` while quoting the params because P > SQLITE_MAX_COLUMN,
which in turn is why I ended up testing the private API that does just the
quoting.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:20>
Comment (by rhertzog):
Replying to [comment:12 aaugustin]:
> Marking the patch as needing improvement based on tobald's comment.
>
> Unfortunately I cannot determine what happens based on code inspection.
I can reproduce the problem too. I don't know what happens but when I try
to run a query like described here (with 2000 items in the array), and
when I add some print statement, I see in last_executed_query that the
"sql" variable only contains 1024 "%s" whereas the "params" variable
correctly contains the 2000 quoted parameters.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:21>
Comment (by aaugustin):
Thanks Raphael. There's a very stupid mistake in my code :-( I'll fix it
tonight.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:22>
Comment (by rhertzog):
Found it:
{{{
diff --git a/django/db/backends/sqlite3/operations.py
b/django/db/backends/sqlite3/operations.py
index 0d6f084..f56f72e 100644
--- a/django/db/backends/sqlite3/operations.py
+++ b/django/db/backends/sqlite3/operations.py
@@ -114,7 +114,7 @@ class DatabaseOperations(BaseDatabaseOperations):
# limits are in effect and split the work in batches if needed.
BATCH_SIZE = 999
if len(params) > BATCH_SIZE:
- results = []
+ results = tuple()
for index in range(0, len(params), BATCH_SIZE):
chunk = params[index:index + BATCH_SIZE]
results +=
self._quote_params_for_last_executed_query(chunk)
}}}
But I still don't understand why I see only 1024 "%s"... and why it
doesn't fail even with the above fix.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:23>
Comment (by rhertzog):
Forget the point about having only 1024 %s. It looks like my command line
to count them did not work as I expected...
{{{tr -dc %|wc -c}}} and pasting, press enter, type CTRL D gives back
1024. But the same with {{{tr -dc % <<END | wc -c}}} gives the correct
value... possibly some line buffering issue from the shell. Bah.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:24>
Comment (by shaib):
Is there a limit on the length of a query's SQL text in SQLite or its
Python driver?
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:25>
Comment (by rhertzog):
Yes, SQLITE_MAX_SQL_LENGTH which defaults to 1000000.
See https://sqlite.org/limits.html
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:26>
Comment (by shaib):
Replying to [comment:20 aaugustin]:
> Shai: exactly — that's why we can't tell an `OperationalError` when
running the query because P > SQLITE_MAX_VARIABLE_NUMBER from an
`OperationalError` while quoting the params because P > SQLITE_MAX_COLUMN,
which in turn is why I ended up testing the private API that does just the
quoting.
We can if we run the "big P, small C" query on a raw cursor rather than a
Django cursor.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:27>
Comment (by rhertzog):
Hi, it looks like you haven't updated your pull request yet. What's the
status?
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:28>
Comment (by aaugustin):
The status is "still intending to work on it when I find time"... I'd like
to fix it in 1.9.2, which should be released in February.
(Also my theory of the "very stupid mistake" turned out to be wrong, so I
still don't know why my patch doesn't work.)
If you're interested in improving the patch, just assign this ticket to
yourself so we don't duplicate effort. Thanks!
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:29>
* needs_better_patch: 1 => 0
Comment:
I amended my pull request but I don't really have time to install a Debian
to test it.
If you don't mind testing that version, that would be really helpful.
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:30>
Comment (by rhertzog):
I tested the updated patch on Debian and it works fine for me now. Thanks
for your work!
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:31>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"f91b5a7e4b2ac436b1fe3065b23217543d0f54c1" f91b5a7e]:
{{{
#!CommitTicketReference repository=""
revision="f91b5a7e4b2ac436b1fe3065b23217543d0f54c1"
Fixed #26063 -- Crash when passing > 2000 params.
If SQLITE_MAX_VARIABLE_NUMBER (default = 999) is changed at compile time
to be greater than SQLITE_MAX_COLUMN (default = 2000), which Debian does
by setting the former to 250000, Django raised an exception on queries
containing more than 2000 parameters when DEBUG = True.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:32>
Comment (by Aymeric Augustin <aymeric.augustin@…>):
In [changeset:"cfe4ba6e905eefc7e53de37226e0f342a68dc47b" cfe4ba6]:
{{{
#!CommitTicketReference repository=""
revision="cfe4ba6e905eefc7e53de37226e0f342a68dc47b"
[1.9.x] Fixed #26063 -- Crash when passing > 2000 params.
If SQLITE_MAX_VARIABLE_NUMBER (default = 999) is changed at compile time
to be greater than SQLITE_MAX_COLUMN (default = 2000), which Debian does
by setting the former to 250000, Django raised an exception on queries
containing more than 2000 parameters when DEBUG = True.
Backport of f91b5a7e4b from master
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26063#comment:33>