[Q] Why the following emits SQL for slot access?

34 views
Skip to first unread message

Ladislav Lenart

unread,
Sep 27, 2012, 1:55:04 PM9/27/12
to sqlal...@googlegroups.com
Hello again.

I have the following test query:

def test_slot_access_after_query(self):
q = self.session.query(Foo, Bar).with_labels()
q = q.filter(Foo.bar_id == Bar.id)
rows = q.all()
assert len(rows) > 0
with self.assert_no_sql_while():
for (each_foo, _each_bar) in rows:
each.bar.data # <-- FAILS HERE

SQLAlchemy emits SQL to fetch each.bar's data. I do not understand why. I
expected that:
* Bar is already cached in the session.
* each.bar first checks if a bar with the appropriate id is present in the
session and if so, return it.

What am I doing wrong?


Thank you,

Ladislav Lenart

Michael Bayer

unread,
Sep 27, 2012, 2:06:08 PM9/27/12
to sqlal...@googlegroups.com
not sure, is "each.bar" a typo and you meant "_each_bar" ? or are you trying to fetch a relationship along Foo-> Bar ? if its a relationship and the linkage is not a simple many-to-one, then it emits SQL.


Ladislav Lenart

unread,
Sep 27, 2012, 2:31:42 PM9/27/12
to sqlal...@googlegroups.com, Michael Bayer
Hello.

> not sure, is "each.bar" a typo and you meant "_each_bar" ?

No, I mean each.bar.


> or are you trying to fetch a relationship along Foo-> Bar ? if its a
relationship and the linkage is not a simple many-to-one, then it emits SQL.

Yes, I am trying to fetch Foo -> Bar relationship. It is a one-way many-to-one
relationship: Foo has bar_id FK to bar (id).


I still don't understand what I see, but I figured out the cause.

This works:

def test_access_ok(self):
q = self.session.query(Foo, Bar).with_labels()
q = q.filter(Foo.bar_id == Bar.id)
rows = q.all() # IMPORTANT LINE
assert len(rows) > 0
with self.assert_no_sql_while():
for (each_foo, _each_bar) in rows:
each.bar.data

This does not:

def test_access_ko(self):
q = self.session.query(Foo, Bar).with_labels()
q = q.filter(Foo.bar_id == Bar.id)
rows = list(unwrap(q)) # IMPORTANT LINE
assert len(rows) > 0
with self.assert_no_sql_while():
for each in rows:
each.bar.data # <-- FAILS HERE

unwrap is defined as

def unwrap(query):
for each in query:
if isinstance(each, tuple) and len(each) > 0:
yield each[0]
else:
yield each

I wanted to use unwrap to minimize the impact of the queries I have to modify
slightly to overcome SA 0.7 limitation of of_type() in combination with a table
inheritance (we spoke about this in another thread). To make SA fetch everything
I need, the query must return more results (tuples of instances instead of the
instances directly), so I can joinedload from them too. With unwrap in place,
the rest of the query processing can remain intact.

Do you have any idea why the unwrap version does not work?

Thank you,

Ladislav Lenart

Ladislav Lenart

unread,
Sep 27, 2012, 3:17:24 PM9/27/12
to sqlal...@googlegroups.com, Michael Bayer
Update.

Te unwrap version works only if the original result of q.all() is kept around:

def test_access(self):
q = self.session.query(Foo, Bar).with_labels()
q = q.filter(Foo.bar_id == Bar.id)
# Only this combination works:
_rows = q.all()
rows = unwrap(_rows)
assert len(rows) > 0
with self.assert_no_sql_while():
for each in rows:
each.bar.data

I have also rewritten unwrap to:

def unwrap(items):
def f(each):
if isinstance(each, tuple) and len(each) > 0:
return each[0]
else:
return each
return map(f, items)

because it is now completely useless as a generator.

I am completely lost now. Does this have to do something with weak references
and too eager garbage collection? I would expect that if a query fetches some
objects, session will keep them all until explicitly removed from it to
eliminate successive SQL queries.


Ladislav Lenart

Michael Bayer

unread,
Sep 27, 2012, 4:20:03 PM9/27/12
to sqlal...@googlegroups.com
the Session does not strongly reference items that have no pending changes on them. This allows code such as this:

for criterion in crit:
objects = session.query(Cls).filter(criterion).all()
_process_objects(objects)

above, if the Session strongly referenced all objects until explicitly removed, logic like the above would cause memory to grow unbounded.

Of course, if the objects were referred to by other objects, such as members of a collection or inter-object reference, they get a strong reference via that collection, but in your example there is no linkage between "Foo" and "Bar" until you actually invoke the path between two particular instances.

cPython uses referencing counting for garbage collection, so assuming an object has no reference cycles, it is garbage collected immediately as soon as its strong reference count goes to zero.




Ladislav Lenart

unread,
Sep 27, 2012, 4:37:59 PM9/27/12
to sqlal...@googlegroups.com, Michael Bayer
Hello.

Thank you for the explanation. I was relatively close :-)

I understand the rationale but don't like the impact it has:
If I query the database for something, I expect it to be available via slot
access afterwards, but unless I hold onto the results, it won't be.

I guess this is one of those tricky bits one has to be caution about :-)

Ladislav Lenart

Michael Bayer

unread,
Sep 27, 2012, 7:34:14 PM9/27/12
to sqlal...@googlegroups.com

On Sep 27, 2012, at 4:37 PM, Ladislav Lenart wrote:

> Hello.
>
> Thank you for the explanation. I was relatively close :-)
>
> I understand the rationale but don't like the impact it has:
> If I query the database for something, I expect it to be available via slot
> access afterwards, but unless I hold onto the results, it won't be.
>
> I guess this is one of those tricky bits one has to be caution about :-)

typically when I really want things to be "cached", I do this:

cache = dict((obj.id, obj) for obj in session.query(Class))

with those strong references set up, I control the scope of objects as though the Session weren't really involved.

The Session has an option for a "strong referencing" identity map, which is a holdover from the old days when the whole "weak" idea wasn't available yet. I've tried to deprecate this approach though there are some users still using it. It should be in the docs if you look for it.
> --
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To post to this group, send email to sqlal...@googlegroups.com.
> To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
>

lena...@volny.cz

unread,
Sep 28, 2012, 6:59:53 AM9/28/12
to mik...@zzzcomputing.com, sqlal...@googlegroups.com
Hello.

Thank you for the tips. I stick to my simple approach for the time being:

_rows = q.all()
rows = unwrap(_rows)

I need the cache within the context of one method only.

Ladislav Lenart
Reply all
Reply to author
Forward
0 new messages