Apologies for the slightly lengthy email.
I'm working on getting ticket #2705 (Add optional FOR UPDATE clause to QuerySets) into shape, primarily by adding tests to the patch. I'm writing a test for a backend's FOR UPDATE NOWAIT, and have come across some interesting behaviour where DatabaseError appears to be swallowed when converting a queryset to a list. This doesn't happen when simply iterating over a queryset.
My gut feeling is that this boils down to this vastly simplified demonstration of how list() works:
>>> class Foo(object):
... def __len__(self):
... print 'len called'
... raise ValueError
... def __iter__(self):
... return iter([1,2,3])
...
>>> a = Foo()
>>> list(a)
len called
[1, 2, 3]
Here, you can see that when converting to a list, Python calls __len__, and if that raises an exception, discards it and goes on to call __iter__.
So - my hypothesis (unproved, as I could benefit from someone with deeper ORM knowledge) is that the call to list() in my original test case calls QuerySet.__len__(), which ends up raising a DatabaseError (caused by an underlying database lock, the behaviour I'm actually testing for). Python's subsequent call to QuerySet.__iter__() succeeds, but ends up returning an empty iterator due to some pre-existing state *handwaving here*.
It's the handwaving bit I'm not sure about :). Does that hypothesis sound plausible? It seems to be borne out by the snippet below, where I've removed the underlying table:
>>> from piston.models import Token
>>> list(Token.objects.all())
[]
>>> Token.objects.create()
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/manager.py", line 138, in create
return self.get_query_set().create(**kwargs)
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/query.py", line 352, in create
obj.save(force_insert=True, using=self.db)
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/base.py", line 434, in save
self.save_base(using=using, force_insert=force_insert, force_update=force_update)
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/base.py", line 527, in save_base
result = manager._insert(values, return_id=update_pk, using=using)
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/manager.py", line 198, in _insert
return insert_query(self.model, values, **kwargs)
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/query.py", line 1492, in insert_query
return query.get_compiler(using=using).execute_sql(return_id)
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/sql/compiler.py", line 787, in execute_sql
cursor = super(SQLInsertCompiler, self).execute_sql(None)
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/sql/compiler.py", line 731, in execute_sql
cursor.execute(sql, params)
File "/Users/dan/virtual/ism/django/parts/django/django/db/backends/util.py", line 15, in execute
return self.cursor.execute(sql, params)
File "/Users/dan/virtual/ism/django/parts/django/django/db/backends/mysql/base.py", line 86, in execute
return self.cursor.execute(query, args)
File "/Users/dan/.buildout/eggs/MySQL_python-1.2.3-py2.6-macosx-10.6-universal.egg/MySQLdb/cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "/Users/dan/.buildout/eggs/MySQL_python-1.2.3-py2.6-macosx-10.6-universal.egg/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
DatabaseError: (1146, "Table 'ismdj.piston_token' doesn't exist")
... and of course:
>>> for token in Token.objects.all():
... print token
...
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/query.py", line 106, in _result_iter
self._fill_cache()
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/query.py", line 773, in _fill_cache
self._result_cache.append(self._iter.next())
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/query.py", line 269, in iterator
for row in compiler.results_iter():
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/sql/compiler.py", line 676, in results_iter
for rows in self.execute_sql(MULTI):
File "/Users/dan/virtual/ism/django/parts/django/django/db/models/sql/compiler.py", line 731, in execute_sql
cursor.execute(sql, params)
File "/Users/dan/virtual/ism/django/parts/django/django/db/backends/util.py", line 15, in execute
return self.cursor.execute(sql, params)
File "/Users/dan/virtual/ism/django/parts/django/django/db/backends/mysql/base.py", line 86, in execute
return self.cursor.execute(query, args)
File "/Users/dan/.buildout/eggs/MySQL_python-1.2.3-py2.6-macosx-10.6-universal.egg/MySQLdb/cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "/Users/dan/.buildout/eggs/MySQL_python-1.2.3-py2.6-macosx-10.6-universal.egg/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
DatabaseError: (1146, "Table 'ismdj.piston_token' doesn't exist")
Is this expected behaviour, where an exception is only raised when models are iterated over?
For reference, here's a rough, slightly abbreviated version of the test code I have so far for the patch I'm working on. The interesting code is marked with a # XXX, but the context may be useful (I'm trying to look for a DatabaseError that will be returned by the database due to being unable to get a lock):
from models import Person
class SelectForUpdateTests(TransactionTestCase):
def setUp(self):
connection._rollback()
connection._enter_transaction_management(True)
self.new_connections = ConnectionHandler(settings.DATABASES)
self.person = Person.objects.create(name='Reinhardt')
def start_blocking_transaction(self):
self.new_connection = self.new_connections[DEFAULT_DB_ALIAS]
self.new_connection._enter_transaction_management(True)
self.cursor = self.new_connection.cursor()
sql = 'SELECT * FROM %(db_table)s %(for_update)s;' % {
'db_table': Person._meta.db_table,
'for_update': self.new_connection.ops.for_update_sql(),
}
self.cursor.execute(sql, ())
result = self.cursor.fetchone()
def end_blocking_transaction(self):
self.new_connection._rollback()
self.new_connection.close()
self.new_connection._leave_transaction_management(True)
def get_exception(self):
from django.db.models.sql.query import LockNotAvailable
return LockNotAvailable
def run_select_for_update(self, status, nowait=False):
status.append('started')
try:
connection._rollback()
people = Person.objects.all().select_for_update(nowait=nowait)
# for person in people:
# person.name = 'Fred'
# person.save()
# XXX
people = list(people)
connection._commit()
status.append('finished')
except Exception, e:
status.append(e)
@skipUnlessDBFeature('has_select_for_update_nowait')
def test_nowait_raises_error_on_block(self):
"""
If nowait is specified, we expect an error to be raised rather
than blocking.
"""
self.start_blocking_transaction()
status = []
thread = threading.Thread(
target=self.run_select_for_update,
args=(status,),
kwargs={'nowait': True},
)
thread.start()
# We should find the thread threw an exception
time.sleep(1)
self.end_blocking_transaction()
thread.join()
self.assertTrue(isinstance(status[-1], self.get_exception()))
The interesting part is the code marked with # XXX. As it stands, that line produces a simple empty list. Commenting that and uncommenting the 'for person in people' stanza causes a DatabaseError to be (correctly) raised.
This is a problem for this patch, as it's difficult to tell the difference between no records returned, and the database being unable to get a row lock in the case for FOR UPDATE NOWAIT.
Cheers,
Dan
--
Dan Fairs | dan....@gmail.com | www.fezconsulting.com
On 12/02/2010 03:43 PM, Dan Fairs wrote:
> My gut feeling is that this boils down to this vastly simplified demonstration of how list() works:
>
>>>> class Foo(object):
> ... def __len__(self):
> ... print 'len called'
> ... raise ValueError
> ... def __iter__(self):
> ... return iter([1,2,3])
> ...
>>>> a = Foo()
>>>> list(a)
> len called
> [1, 2, 3]
>
> Here, you can see that when converting to a list, Python calls __len__, and if that raises an exception, discards it and goes on to call __iter__.
>
> So - my hypothesis (unproved, as I could benefit from someone with deeper ORM knowledge) is that the call to list() in my original test case calls QuerySet.__len__(), which ends up raising a DatabaseError (caused by an underlying database lock, the behaviour I'm actually testing for). Python's subsequent call to QuerySet.__iter__() succeeds, but ends up returning an empty iterator due to some pre-existing state *handwaving here*.
>
> It's the handwaving bit I'm not sure about :). Does that hypothesis sound plausible? It seems to be borne out by the snippet below, where I've removed the underlying table:
I ran into this issue, too. My workaround was to place a
if hasattr(self._iter, '__len__'):
len(self._iter)
and the same for self.generator in QuerySet.__len__.
Btw, there's a bug report for this from 2009:
http://bugs.python.org/issue1242657
I was wondering whether the bug still exists in Python 2.x because I
take it only AttributeError and TypeError should be ignored while
calling __len__ in list() -- if that's the case, the bug is definitely
still present.
Anyway, I propose to call len() on those iterators, if possible, before
calling list() because otherwise all bugs in backends will be swallowed.
Jonas
> On 12/02/2010 03:43 PM, Dan Fairs wrote:
>> My gut feeling is that this boils down to this vastly simplified demonstration of how list() works:
>>
>>>>> class Foo(object):
>> ... def __len__(self):
>> ... print 'len called'
>> ... raise ValueError
>> ... def __iter__(self):
>> ... return iter([1,2,3])
>> ...
>>>>> a = Foo()
>>>>> list(a)
>> len called
>> [1, 2, 3]
>>
>> Here, you can see that when converting to a list, Python calls __len__, and if that raises an exception, discards it and goes on to call __iter__.
>>
>> So - my hypothesis (unproved, as I could benefit from someone with deeper ORM knowledge) is that the call to list() in my original test case calls QuerySet.__len__(), which ends up raising a DatabaseError (caused by an underlying database lock, the behaviour I'm actually testing for). Python's subsequent call to QuerySet.__iter__() succeeds, but ends up returning an empty iterator due to some pre-existing state *handwaving here*.
>>
>> It's the handwaving bit I'm not sure about :). Does that hypothesis sound plausible? It seems to be borne out by the snippet below, where I've removed the underlying table:
>
> I ran into this issue, too. My workaround was to place a
>
> if hasattr(self._iter, '__len__'):
> len(self._iter)
>
> and the same for self.generator in QuerySet.__len__.
>
> Btw, there's a bug report for this from 2009: http://bugs.python.org/issue1242657
>
Right - I remember that regression actually.
> I was wondering whether the bug still exists in Python 2.x because I take it only AttributeError and TypeError should be ignored while calling __len__ in list() -- if that's the case, the bug is definitely still present.
True - I'll give it a try in 2.7 when I have a moment to grab and build it. That doesn't help us that much though, as 2.6 (which I'm running on, 2.6.1 to be precise) is a supported version.
>
> Anyway, I propose to call len() on those iterators, if possible, before calling list() because otherwise all bugs in backends will be swallowed.
That makes sense. I'd considered proposing storing any exception raised in __len__ on an instance variable to be re-raised later; but your solution is better, as you'd get better tracebacks.
Should I raise a bug for this behaviour? Working up a test and patch should be straightforward for this one, if we agree it's a bug that needs fixing (even if it's a workaround for the underlying Python issue).
<snip>
I tested it on my machine and it raises ValueError on:
* Python 2.4.6
* Python 2.5.5
* Python 2.6.6
* Python 2.7.0+
* Python 3.1.2
If you have version 2.6.1, you should upgrade. I don't think Django
should support a Python bug that was fixed almost 2 years ago.
--
Łukasz Rekucki
Cool - the Python I'm using is a Mac OS X Snow Leopard system Python. I'll rebuild my dev environment with a newer Python.
I just checked, it doesn't exist in 2.7...
>> Anyway, I propose to call len() on those iterators, if possible, before calling list() because otherwise all bugs in backends will be swallowed.
>
> That makes sense. I'd considered proposing storing any exception raised in __len__ on an instance variable to be re-raised later; but your solution is better, as you'd get better tracebacks.
>
> Should I raise a bug for this behaviour? Working up a test and patch should be straightforward for this one, if we agree it's a bug that needs fixing (even if it's a workaround for the underlying Python issue).
... so even though it's not a Python bug but a Python "feature" I find
it rather annyoing that list(x) swallows all TypeErrors in x.__len__ --
think about all the cases in that TypeErrors are raised! Wrong number of
arguments, unsupported objects passed as argument e.g. to str/dict/list
methods, ...
Regarding the fact that the calls to list() in QuerySet.__len__ is the
central point to backend invocation I'm strongly for special-handling
this case.
Jonas