Text vs. PGText possible bug

David Gardner

ungelesen,

24.09.2009, 20:21:0924.09.09

an sqlal...@googlegroups.com

Ran across something that I suspect might be a bug. If I define my
table like:

asset_table = Table('asset', metadata,
Column('path', Text, primary_key=True,
server_default=FetchedValue(),
server_onupdate=FetchedValue()),
autoload=True)

Then anytime I query for an asset and eagerload a related table the
backref on the related table isn't populated, causing a second query to
the DB.
If instead I define that column of type PGText then the backrefs are
populated properly. I attached a test which is a simplified version of
my table mappings.

Attached is a test of this behavior. The output when the column is
defined as Text or String looks like:
testshow/eps/201/s01/t01
testshow/chr/test/test
2009-09-24 17:17:03,214 INFO sqlalchemy.engine.base.Engine.0x...1f10
SELECT asset.updated AS asset_updated, asset.name AS asset_name,
asset.type AS asset_type, asset.path AS asset_path, asset.parent AS
asset_parent, asset.is_file AS asset_is_file, asset.created_by AS
asset_created_by
FROM asset
WHERE asset.path = %(param_1)s
2009-09-24 17:17:03,214 INFO sqlalchemy.engine.base.Engine.0x...1f10
{'param_1': 'testshow/eps/201/s01/t01'}
testshow/eps/201/s01/t01

When defined as PGText the output is:
testshow/eps/201/s01/t01
testshow/chr/test/test
testshow/eps/201/s01/t01

--
David Gardner
Pipeline Tools Programmer
Jim Henson Creature Shop
dgar...@creatureshop.com

eager_loadtest.py

Michael Bayer

ungelesen,

24.09.2009, 21:05:4924.09.09

an sqlal...@googlegroups.com

On Sep 24, 2009, at 8:21 PM, David Gardner wrote:

> Ran across something that I suspect might be a bug. If I define my
> table like:
>
> asset_table = Table('asset', metadata,
> Column('path', Text, primary_key=True,
> server_default=FetchedValue(),
> server_onupdate=FetchedValue()),
> autoload=True)
>
> Then anytime I query for an asset and eagerload a related table the
> backref on the related table isn't populated, causing a second query
> to
> the DB.
> If instead I define that column of type PGText then the backrefs are
> populated properly. I attached a test which is a simplified version of
> my table mappings.

what does the logging output say if you turn on logging.INFO for the
"sqlalchemy.orm" logger ? that would illustrate some things about the
join conditions.

For me to test this I'd have to build up some table names and guess
what you have for those defaults....can you share your table
definitions ?

>
> Attached is a test of this behavior. The output when the column is
> defined as Text or String looks like:
> testshow/eps/201/s01/t01
> testshow/chr/test/test
> 2009-09-24 17:17:03,214 INFO sqlalchemy.engine.base.Engine.0x...1f10
> SELECT asset.updated AS asset_updated, asset.name AS asset_name,
> asset.type AS asset_type, asset.path AS asset_path, asset.parent AS
> asset_parent, asset.is_file AS asset_is_file, asset.created_by AS
> asset_created_by
> FROM asset
> WHERE asset.path = %(param_1)s
> 2009-09-24 17:17:03,214 INFO sqlalchemy.engine.base.Engine.0x...1f10
> {'param_1': 'testshow/eps/201/s01/t01'}
> testshow/eps/201/s01/t01
>
>
> When defined as PGText the output is:
> testshow/eps/201/s01/t01
> testshow/chr/test/test
> testshow/eps/201/s01/t01
>
>
> --
> David Gardner
> Pipeline Tools Programmer
> Jim Henson Creature Shop
> dgar...@creatureshop.com
>
>
> >

> import sys
> from sqlalchemy import *
> from sqlalchemy.orm import *
> from sqlalchemy.types import *
> from sqlalchemy.databases.postgres import PGText
>
> DB_HOST = 'localhost'
> DB_NAME = 'test_db'
> DB_USER = 'testuser'
> DB_PASS = 'testpass'
> db_uri = 'postgres://%s:%s@%s/%s' % (DB_USER,DB_PASS,DB_HOST,DB_NAME)
>
> db = create_engine (db_uri)
> metadata = MetaData(db)
>
> class Asset(object):
> pass
>
> class AssetRelation(object):
> pass
>
> #asset_table = Table('asset', metadata,autoload=True)
>
> #asset_table = Table('asset', metadata,
> # Column('path', Text, primary_key=True,
> # server_default=FetchedValue(),
> # server_onupdate=FetchedValue()),
> # autoload=True)

>
>
> asset_table = Table('asset', metadata,

> Column('path', PGText, primary_key=True,
> server_default=FetchedValue(),
> server_onupdate=FetchedValue()),
> autoload=True)
>
> relation_table = Table('relation',metadata, autoload=True)
>
> asset_mapper = mapper(Asset, asset_table,
> properties = {
> 'Related' : relation(AssetRelation, backref='Source',
> primaryjoin
> =
> asset_table.c.path
> =
> =
> relation_table.c.src_asset
> ,order_by=relation_table.c.target_asset,lazy=True)
> })
>
> mapper(AssetRelation, relation_table, properties = {
> 'Target' : relation(Asset, backref='Relatee',
> primaryjoin=asset_table.c.path==relation_table.c.target_asset,
> viewonly=True,lazy=False)
> })
>
> session=create_session()
> a=session.query(Asset).options(eagerload(Asset.Related)).get
> ('testshow/eps/201/s01/t01')
> db.echo=True
> print a.path
> r=a.Related[0]
> print r.target_asset
> b=r.Source
> print b.path
> session.close()
>
> sys.exit(0)

David Gardner

ungelesen,

25.09.2009, 13:10:5425.09.09

an sqlal...@googlegroups.com

I re-worked my test a little bit to use logging, and to switch table definitions based on a command line argument.
I was able to trim down my schema to just the two tables in question, although the behavior is the same regardless
of the related table, and attached is that schema along with some data to make the test work.

Also I'm using SA 0.5.6, Postgres 8.4.1, psycopg2 2.0.12, python 2.5.4 on Debian Squeeze.

dgardner@cssun32 ~/assetdb/one-time:$ python eager_loadtest.py
using PGText
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Source lazy loading clause asset.path = %(param_1)s
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Source will use query.get() to optimize instance loads
INFO:sqlalchemy.orm.strategies.LazyLoader:Asset.Related lazy loading clause %(param_1)s = relation.src_asset
INFO:sqlalchemy.orm.strategies.LazyLoader:Asset.Relatee lazy loading clause %(param_1)s = relation.target_asset
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Target lazy loading clause asset.path = %(param_1)s
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Target will use query.get() to optimize instance loads
testshow/eps/201/s01/t01
testshow/chr/test/test
testshow/eps/201/s01/t01
dgardner@cssun32 ~/assetdb/one-time:$ python eager_loadtest.py text
using Text
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Source lazy loading clause asset.path = %(param_1)s
INFO:sqlalchemy.orm.strategies.LazyLoader:Asset.Related lazy loading clause %(param_1)s = relation.src_asset
INFO:sqlalchemy.orm.strategies.LazyLoader:Asset.Relatee lazy loading clause %(param_1)s = relation.target_asset
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Target lazy loading clause asset.path = %(param_1)s
testshow/eps/201/s01/t01
testshow/chr/test/test
testshow/eps/201/s01/t01
dgardner@cssun32 ~/assetdb/one-time:$ python eager_loadtest.py auto
using auto
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Source lazy loading clause asset.path = %(param_1)s
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Source will use query.get() to optimize instance loads
INFO:sqlalchemy.orm.strategies.LazyLoader:Asset.Related lazy loading clause %(param_1)s = relation.src_asset
INFO:sqlalchemy.orm.strategies.LazyLoader:Asset.Relatee lazy loading clause %(param_1)s = relation.target_asset
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Target lazy loading clause asset.path = %(param_1)s
INFO:sqlalchemy.orm.strategies.LazyLoader:AssetRelation.Target will use query.get() to optimize instance loads
testshow/eps/201/s01/t01
testshow/chr/test/test
testshow/eps/201/s01/t01

eager_loadtest.py

testdb_schema.sql

testdb_data.sql

Michael Bayer

ungelesen,

25.09.2009, 14:42:2325.09.09

an sqlal...@googlegroups.com

David Gardner wrote:
> I re-worked my test a little bit to use logging, and to switch table
> definitions based on a command line argument.
> I was able to trim down my schema to just the two tables in question,
> although the behavior is the same regardless
> of the related table, and attached is that schema along with some data
> to make the test work.

I have a theory what this is. Can you try this patch against 0.5.6:

Index: lib/sqlalchemy/sql/expression.py
===================================================================
--- lib/sqlalchemy/sql/expression.py (revision 6377)
+++ lib/sqlalchemy/sql/expression.py (working copy)
@@ -2011,7 +2011,11 @@
the same type.

"""
- return isinstance(other, _BindParamClause) and
other.type.__class__ == self.type.__class__ and self.value == other.value
+ return isinstance(other, _BindParamClause) and \
+ len(set(other.type.__class__.__mro__).intersection(
+ self.type.__class__.__mro__
+ )) > 3 \
+ and self.value == other.value

def __getstate__(self):
"""execute a deferred value for serialization purposes."""

if that works let's make a trac ticket to add tests for this.

David Gardner

ungelesen,

25.09.2009, 15:45:1525.09.09

an sqlal...@googlegroups.com

Thanks for taking time on this one. That definitely fixes the problem. I
entered it in as Ticket #1556.
I am willing to write a test for this defect, but have no idea how to
check for this other than to turn
on engine.echo and spot it on the console.

> I have a theory what this is. Can you try this patch against 0.5.6:
>
> Index: lib/sqlalchemy/sql/expression.py
> ===================================================================
> --- lib/sqlalchemy/sql/expression.py (revision 6377)
> +++ lib/sqlalchemy/sql/expression.py (working copy)
> @@ -2011,7 +2011,11 @@
> the same type.
>
> """
> - return isinstance(other, _BindParamClause) and
> other.type.__class__ == self.type.__class__ and self.value == other.value
> + return isinstance(other, _BindParamClause) and \
> + len(set(other.type.__class__.__mro__).intersection(
> + self.type.__class__.__mro__
> + )) > 3 \
> + and self.value == other.value
>
> def __getstate__(self):
> """execute a deferred value for serialization purposes."""
>
>
> if that works let's make a trac ticket to add tests for this.
>
>
>
>
>

Michael Bayer

ungelesen,

25.09.2009, 15:54:4825.09.09

an sqlal...@googlegroups.com

David Gardner wrote:
>
> Thanks for taking time on this one. That definitely fixes the problem. I
> entered it in as Ticket #1556.
> I am willing to write a test for this defect, but have no idea how to
> check for this other than to turn
> on engine.echo and spot it on the console.

oh its fine I can do it. we have a testing style that is like this:

def go():
eq_(query.all(), [Foo(), Bar(), Bat()])
self.assert_sql_count(testing.db, go, 0)

i.e. run go(), do the eq_() assertion, then when complete assert that 0
SQL statements were executed.

David Gardner

ungelesen,

25.09.2009, 17:29:1025.09.09

an sqlal...@googlegroups.com

So I have applied the patch on my development box, but my production
server is still running 0.5.6 un-patched.
Can you think of any reason that defining the column as PGText won't
work as a temporary workaround?

> oh its fine I can do it. we have a testing style that is like this:
>
> def go():
> eq_(query.all(), [Foo(), Bar(), Bat()])
> self.assert_sql_count(testing.db, go, 0)
>
> i.e. run go(), do the eq_() assertion, then when complete assert that 0
> SQL statements were executed.
>
>

Allen antworten

Antwort an Autor

Weiterleiten