duplicate column name error when subquery() or join() a subclass

1,737 views
Skip to first unread message

Wubin

unread,
May 3, 2012, 4:05:14 PM5/3/12
to sqlalchemy
Hi, I created a class "Resource", and its subclass "BasePrAc" as
below:

class Resource(BaseClass.BaseClass):
__tablename__ = "resources"
id = Column("id", Integer, primary_key=True, key="id")

class BasePrAc(Resource.Resource):
__tablename__ = "base_products_accessories"
id = Column("id", Integer, ForeignKey("resources.id"),
primary_key=True)


And when I try to "query.join" BasePrAc or its subquery, it always
gives me the OperationalError(1060, "Duplicate column name 'id'"). I
checked the generated SQL statement, and found the subquery part shown
as below:

"..... INNER JOIN (SELECT resources.id AS id,
base_products_accessories.id AS id,
base_products_accessories.polymorphic_identity AS
polymorphic_identity, ...."

As you see, there are two "AS id" in the statement and causing the
error. I was wondering why it asks "resource.id" which I didn't
specify, and it would know that BasePrAc.id is equal to Resource.id
since BasePrAc class is the subclass of Resource class. Some suggested
to use "alias" for joining, and I tried but it didn't work. Besides, I
am hoping not to use alias everytime, so joining things could be more
generic.

I will appreciate if you have any advise for this situation.

Thank you very much, and have a nice day.
W

Michael Bayer

unread,
May 3, 2012, 4:28:01 PM5/3/12
to sqlal...@googlegroups.com
the test script below fails to reproduce this issue. Can you provide a fully reproducing test script as well as version information /DB backend ?

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base= declarative_base()

class Resource(Base):
__tablename__ = "resources"
id = Column("id", Integer, primary_key=True, key="id")
type = Column(String)
parent_id = Column(Integer, ForeignKey('parent.id'))
__mapper_args__ = {'polymorphic_on':type}

class BasePrAc(Resource):
__tablename__ = "base_products_accessories"
id = Column("id", Integer, ForeignKey("resources.id"), primary_key=True)
__mapper_args__ = {'polymorphic_identity':'bpa'}


class Parent(Base):
__tablename__ = "parent"
id = Column("id", Integer, primary_key=True)
resources = relationship(Resource)

e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)
s = Session(e)
s.add_all([
Parent(
resources=[
BasePrAc(),
BasePrAc(),
]
)
])
print s.query(Parent, BasePrAc).join(BasePrAc).all()

output:

CREATE TABLE parent (
id INTEGER NOT NULL,
PRIMARY KEY (id)
)


2012-05-03 16:25:28,155 INFO sqlalchemy.engine.base.Engine ()
2012-05-03 16:25:28,155 INFO sqlalchemy.engine.base.Engine COMMIT
2012-05-03 16:25:28,155 INFO sqlalchemy.engine.base.Engine
CREATE TABLE resources (
id INTEGER NOT NULL,
type VARCHAR,
parent_id INTEGER,
PRIMARY KEY (id),
FOREIGN KEY(parent_id) REFERENCES parent (id)
)


2012-05-03 16:25:28,155 INFO sqlalchemy.engine.base.Engine ()
2012-05-03 16:25:28,156 INFO sqlalchemy.engine.base.Engine COMMIT
2012-05-03 16:25:28,156 INFO sqlalchemy.engine.base.Engine
CREATE TABLE base_products_accessories (
id INTEGER NOT NULL,
PRIMARY KEY (id),
FOREIGN KEY(id) REFERENCES resources (id)
)


2012-05-03 16:25:28,156 INFO sqlalchemy.engine.base.Engine ()
2012-05-03 16:25:28,156 INFO sqlalchemy.engine.base.Engine COMMIT
2012-05-03 16:25:28,160 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2012-05-03 16:25:28,160 INFO sqlalchemy.engine.base.Engine INSERT INTO parent DEFAULT VALUES
2012-05-03 16:25:28,161 INFO sqlalchemy.engine.base.Engine ()
2012-05-03 16:25:28,161 INFO sqlalchemy.engine.base.Engine INSERT INTO resources (type, parent_id) VALUES (?, ?)
2012-05-03 16:25:28,162 INFO sqlalchemy.engine.base.Engine ('bpa', 1)
2012-05-03 16:25:28,162 INFO sqlalchemy.engine.base.Engine INSERT INTO resources (type, parent_id) VALUES (?, ?)
2012-05-03 16:25:28,162 INFO sqlalchemy.engine.base.Engine ('bpa', 1)
2012-05-03 16:25:28,162 INFO sqlalchemy.engine.base.Engine INSERT INTO base_products_accessories (id) VALUES (?)
2012-05-03 16:25:28,162 INFO sqlalchemy.engine.base.Engine ((1,), (2,))
2012-05-03 16:25:28,163 INFO sqlalchemy.engine.base.Engine SELECT parent.id AS parent_id, anon_1.base_products_accessories_id AS anon_1_base_products_accessories_id, anon_1.resources_id AS anon_1_resources_id, anon_1.resources_type AS anon_1_resources_type, anon_1.resources_parent_id AS anon_1_resources_parent_id
FROM parent JOIN (SELECT resources.id AS resources_id, resources.type AS resources_type, resources.parent_id AS resources_parent_id, base_products_accessories.id AS base_products_accessories_id
FROM resources JOIN base_products_accessories ON resources.id = base_products_accessories.id) AS anon_1 ON parent.id = anon_1.resources_parent_id
2012-05-03 16:25:28,164 INFO sqlalchemy.engine.base.Engine ()
[(<__main__.Parent object at 0x1014fddd0>, <__main__.BasePrAc object at 0x101604390>), (<__main__.Parent object at 0x1014fddd0>, <__main__.BasePrAc object at 0x101604490>)]
> --
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To post to this group, send email to sqlal...@googlegroups.com.
> To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
>

Wu-bin Zhen

unread,
May 3, 2012, 7:00:16 PM5/3/12
to sqlal...@googlegroups.com
Hi Michael,

I modified your code and got the error with subquery(). I remember I used to have the same error with the join() method but couldn't reproduce it with this sample. The SQLAlchemy version is still 0.6.8.
I really appreciate your reply and help.

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base= declarative_base()

class BaseClass(object):
id = Column("id", Integer, primary_key=True, key="id")

class Resource(Base, BaseClass):
__tablename__ = "resources"

class BasePrAc(Resource):
__tablename__ = "base_products_accessories"
_polymorphicIdentity = Column("polymorphic_identity", String(20), key="polymorphicIdentity")
__mapper_args__ = {
'polymorphic_on': _polymorphicIdentity,
'polymorphic_identity': None
}
id = Column("id", Integer, ForeignKey("resources.id"), primary_key=True)

class Product(BasePrAc):
__tablename__ = "products"
__mapper_args__ = {
'polymorphic_identity': 'Product'
}
id = Column("id", Integer, ForeignKey("base_products_accessories.id"), primary_key=True)

class Accessory(BasePrAc):
__tablename__ = "accessories"
__mapper_args__ = {
'polymorphic_identity': 'Accessory'
}
id = Column("id", Integer, ForeignKey("base_products_accessories.id"), primary_key=True)

class Post(Base, BaseClass):
__tablename__ = "posts"
basePrAcId = Column(Integer, ForeignKey('base_products_accessories.id'))
basePrAc = relationship("BasePrAc", uselist=False, backref=backref("_posts", collection_class=set))

e = create_engine("mysql://root@localhost/sample?charset=utf8", echo=True)
Base.metadata.create_all(e)
s = Session(e)
s.add_all([
Post(
basePrAc=Product()
),
Post(
basePrAc=Accessory()
),
Post(
basePrAc=Accessory()
)
])
sq = s.query(BasePrAc).subquery()
q = s.query(Post).join((sq, sq.c.id == Post.basePrAcId)).all()
print q
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2012-05-03 18:42:03,539 INFO sqlalchemy.engine.base.Engine.0x...ae90 
CREATE TABLE resources (
id INTEGER NOT NULL AUTO_INCREMENT, 
PRIMARY KEY (id)
)


2012-05-03 18:42:03,539 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
2012-05-03 18:42:03,657 INFO sqlalchemy.engine.base.Engine.0x...ae90 COMMIT
2012-05-03 18:42:03,658 INFO sqlalchemy.engine.base.Engine.0x...ae90 
CREATE TABLE base_products_accessories (
polymorphic_identity VARCHAR(20), 
id INTEGER NOT NULL, 
PRIMARY KEY (id), 
FOREIGN KEY(id) REFERENCES resources (id)
)


2012-05-03 18:42:03,658 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
2012-05-03 18:42:03,769 INFO sqlalchemy.engine.base.Engine.0x...ae90 COMMIT
2012-05-03 18:42:03,770 INFO sqlalchemy.engine.base.Engine.0x...ae90 
CREATE TABLE accessories (
id INTEGER NOT NULL, 
PRIMARY KEY (id), 
FOREIGN KEY(id) REFERENCES base_products_accessories (id)
)


2012-05-03 18:42:03,770 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
2012-05-03 18:42:03,893 INFO sqlalchemy.engine.base.Engine.0x...ae90 COMMIT
2012-05-03 18:42:03,893 INFO sqlalchemy.engine.base.Engine.0x...ae90 
CREATE TABLE posts (
id INTEGER NOT NULL AUTO_INCREMENT, 
`basePrAcId` INTEGER, 
PRIMARY KEY (id), 
FOREIGN KEY(`basePrAcId`) REFERENCES base_products_accessories (id)
)


2012-05-03 18:42:03,893 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
2012-05-03 18:42:04,016 INFO sqlalchemy.engine.base.Engine.0x...ae90 COMMIT
2012-05-03 18:42:04,017 INFO sqlalchemy.engine.base.Engine.0x...ae90 
CREATE TABLE products (
id INTEGER NOT NULL, 
PRIMARY KEY (id), 
FOREIGN KEY(id) REFERENCES base_products_accessories (id)
)


2012-05-03 18:42:04,017 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
2012-05-03 18:42:04,140 INFO sqlalchemy.engine.base.Engine.0x...ae90 COMMIT
/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/sql/expression.py:2104: SAWarning: Column 'id' on table <sqlalchemy.sql.expression.Select at 0x1bb1390; Select object> being replaced by another column with the same key.  Consider use_labels for select() statements.
  self[column.key] = column
2012-05-03 18:42:04,145 INFO sqlalchemy.engine.base.Engine.0x...ae90 BEGIN (implicit)
2012-05-03 18:42:04,146 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO resources () VALUES ()
2012-05-03 18:42:04,146 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
2012-05-03 18:42:04,147 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO resources () VALUES ()
2012-05-03 18:42:04,147 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
2012-05-03 18:42:04,147 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO resources () VALUES ()
2012-05-03 18:42:04,147 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
2012-05-03 18:42:04,148 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO base_products_accessories (polymorphic_identity, id) VALUES (%s, %s)
2012-05-03 18:42:04,148 INFO sqlalchemy.engine.base.Engine.0x...ae90 ('Product', 1L)
2012-05-03 18:42:04,148 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO base_products_accessories (polymorphic_identity, id) VALUES (%s, %s)
2012-05-03 18:42:04,148 INFO sqlalchemy.engine.base.Engine.0x...ae90 ('Accessory', 2L)
2012-05-03 18:42:04,149 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO base_products_accessories (polymorphic_identity, id) VALUES (%s, %s)
2012-05-03 18:42:04,149 INFO sqlalchemy.engine.base.Engine.0x...ae90 ('Accessory', 3L)
2012-05-03 18:42:04,149 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO accessories (id) VALUES (%s)
2012-05-03 18:42:04,149 INFO sqlalchemy.engine.base.Engine.0x...ae90 (2L,)
2012-05-03 18:42:04,150 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO accessories (id) VALUES (%s)
2012-05-03 18:42:04,150 INFO sqlalchemy.engine.base.Engine.0x...ae90 (3L,)
2012-05-03 18:42:04,150 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO products (id) VALUES (%s)
2012-05-03 18:42:04,150 INFO sqlalchemy.engine.base.Engine.0x...ae90 (1L,)
2012-05-03 18:42:04,151 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO posts (`basePrAcId`) VALUES (%s)
2012-05-03 18:42:04,151 INFO sqlalchemy.engine.base.Engine.0x...ae90 (1L,)
2012-05-03 18:42:04,152 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO posts (`basePrAcId`) VALUES (%s)
2012-05-03 18:42:04,152 INFO sqlalchemy.engine.base.Engine.0x...ae90 (2L,)
2012-05-03 18:42:04,152 INFO sqlalchemy.engine.base.Engine.0x...ae90 INSERT INTO posts (`basePrAcId`) VALUES (%s)
2012-05-03 18:42:04,152 INFO sqlalchemy.engine.base.Engine.0x...ae90 (3L,)
2012-05-03 18:42:04,154 INFO sqlalchemy.engine.base.Engine.0x...ae90 SELECT posts.id AS posts_id, posts.`basePrAcId` AS `posts_basePrAcId` 
FROM posts INNER JOIN (SELECT base_products_accessories.polymorphic_identity AS polymorphic_identity, resources.id AS id, base_products_accessories.id AS id 
FROM resources INNER JOIN base_products_accessories ON resources.id = base_products_accessories.id) AS anon_1 ON anon_1.id = posts.`basePrAcId`
2012-05-03 18:42:04,154 INFO sqlalchemy.engine.base.Engine.0x...ae90 ()
Traceback (most recent call last):
  File "sample-20120503.py", line 56, in <module>
    q = s.query(Post).join((sq, sq.c.id == Post.basePrAcId)).all()
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/orm/query.py", line 1611, in all
    return list(self)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/orm/query.py", line 1721, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/orm/query.py", line 1726, in _execute_and_instances
    mapper=self._mapper_zero_or_none())
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/orm/session.py", line 724, in execute
    clause, params or {})
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/engine/base.py", line 1191, in execute
    params)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/engine/base.py", line 1271, in _execute_clauseelement
    return self.__execute_context(context)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/engine/base.py", line 1302, in __execute_context
    context.parameters[0], context=context)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/engine/base.py", line 1401, in _cursor_execute
    context)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/engine/base.py", line 1394, in _cursor_execute
    context)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.8-py2.7.egg/sqlalchemy/engine/default.py", line 299, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
sqlalchemy.exc.OperationalError: (OperationalError) (1060, "Duplicate column name 'id'") 'SELECT posts.id AS posts_id, posts.`basePrAcId` AS `posts_basePrAcId` \nFROM posts INNER JOIN (SELECT base_products_accessories.polymorphic_identity AS polymorphic_identity, resources.id AS id, base_products_accessories.id AS id \nFROM resources INNER JOIN base_products_accessories ON resources.id = base_products_accessories.id) AS anon_1 ON anon_1.id = posts.`basePrAcId`' ()
------------------------------------------------------------------------------------------------------------------------------

Michael Bayer

unread,
May 3, 2012, 8:15:13 PM5/3/12
to sqlal...@googlegroups.com

On May 3, 2012, at 7:00 PM, Wu-bin Zhen wrote:

OK that test illustrates what's going on, thanks for that.


>
> As you see, there are two "AS id" in the statement and causing the
> error. I was wondering why it asks "resource.id" which I didn't
> specify, and it would know that BasePrAc.id is equal to Resource.id
> since BasePrAc class is the subclass of Resource class. Some suggested
> to use "alias" for joining, and I tried but it didn't work. Besides, I
> am hoping not to use alias everytime, so joining things could be more
> generic.

It's true, the column loader here is actually aware of both columns at the same time.   It's very easy to make it render one or the other and no others, such as via a patch like this:

diff -r 572d4ebbca4b lib/sqlalchemy/orm/strategies.py
--- a/lib/sqlalchemy/orm/strategies.py Sun Apr 29 18:53:29 2012 -0400
+++ b/lib/sqlalchemy/orm/strategies.py Thu May 03 19:27:30 2012 -0400
@@ -114,10 +114,11 @@
 
     def setup_query(self, context, entity, path, reduced_path, 
                             adapter, column_collection, **kwargs):
-        for c in self.columns:
+        for c in reversed(self.columns):
             if adapter:
                 c = adapter.columns[c]
             column_collection.append(c)
+            break
 
     def init_class_attribute(self, mapper):
         self.is_class_level = True

However with this, many tests fail.  There are lots of situations where columns from either the parent or child tables are expected.

The "id" column of both tables are considered as different within SQLAlchemy, and most ORM queries qualify the column names with the tablename so that they are separately targeted during automated joins and in the result set.

The hierarchy of mappers considers the primary key columns to be the columns within the base table only, so that polymorphic queries make sense.    So the column from the "base" table needs to be present.   

But then there's lots of cases where things refer to the sub-table column.  SQLAlchemy would need to have translation logic at all points to accommodate this, which becomes very awkward for edge cases like eager loading from a polymorphic relationship.   It also means things like "primaryjoin" conditions that refer to the non-base column need to be translated explicitly even if that's not what the user defined.

So this is a very complicated issue.   For now, two ways to work around:


1. put the tablename qualification back on:

sq = s.query(BasePrAc).with_labels().subquery()

But then explicit reference to the columns needs to include the tablename:

q = s.query(Post).join((sq, sq.c.base_products_accessories_id == Post.basePrAcId)).all()

Though in this case you can just join using the relationship:

q = s.query(Post).join((sq, Post.basePrAc)).all()

2. build a join using fold_equivalents:

This is something that could theoretically make it's way into Query with some complexity.   join() specifically supports "folding" the equivalently named and FK-linked columns.   It's a little tricky to get it into the mapped join but you can do it by hand:

sq = BasePrAc.__mapper__.mapped_table.select(fold_equivalents=True).alias()

above, fold_equivalents is only accepted by the select() that comes from a Join object, which in this case BasePrAc.__mapper__.mapped_table is.

















Michael Bayer

unread,
May 3, 2012, 8:27:48 PM5/3/12
to sqlal...@googlegroups.com

On May 3, 2012, at 8:15 PM, Michael Bayer wrote:

> 1. put the tablename qualification back on:
>
> sq = s.query(BasePrAc).with_labels().subquery()
>
> But then explicit reference to the columns needs to include the tablename:
>
> q = s.query(Post).join((sq, sq.c.base_products_accessories_id == Post.basePrAcId)).all()
>
> Though in this case you can just join using the relationship:
>
> q = s.query(Post).join((sq, Post.basePrAc)).all()
>
> 2. build a join using fold_equivalents:
>
> This is something that could theoretically make it's way into Query with some complexity. join() specifically supports "folding" the equivalently named and FK-linked columns. It's a little tricky to get it into the mapped join but you can do it by hand:
>
> sq = BasePrAc.__mapper__.mapped_table.select(fold_equivalents=True).alias()
>
> above, fold_equivalents is only accepted by the select() that comes from a Join object, which in this case BasePrAc.__mapper__.mapped_table is.

OK, here is better than using fold_equivalents. This might make it into 0.8 in some form:

from sqlalchemy.sql.util import reduce_columns

def reduced(query):
stmt = query.statement
return stmt.with_only_columns(reduce_columns(stmt.inner_columns))

sq = reduced(s.query(BasePrAc)).alias()

q = s.query(Post).join((sq, sq.c.id == Post.basePrAcId)).all()
print q


reduce_columns() will limit the set of columns in a list of columns to those that are "equivalent" based on foreign key.

Michael Bayer

unread,
May 3, 2012, 8:48:16 PM5/3/12
to sqlal...@googlegroups.com

On May 3, 2012, at 8:27 PM, Michael Bayer wrote:

>
> OK, here is better than using fold_equivalents. This might make it into 0.8 in some form:
>
> from sqlalchemy.sql.util import reduce_columns
>
> def reduced(query):
> stmt = query.statement
> return stmt.with_only_columns(reduce_columns(stmt.inner_columns))
>
> sq = reduced(s.query(BasePrAc)).alias()
>
> q = s.query(Post).join((sq, sq.c.id == Post.basePrAcId)).all()
> print q
>
>
> reduce_columns() will limit the set of columns in a list of columns to those that are "equivalent" based on foreign key.

this is ticket #1729 moved up to 0.8. It will look like this:

sq = s.query(BasePrAc).subquery(reduce_equivalents=True)

thanks for the background on this !
Reply all
Reply to author
Forward
0 new messages