Column name override with a relation requiring a primaryjoin

kremlan

unread,

May 28, 2008, 6:11:03 PM5/28/08

to sqlalchemy

My goal is to have a one-to-one relation defined using the same name
as the foreign key column underneath. I have 'contacts' table with
'created_by' and 'updated_by' columns which are FKs to contacts.id.

contacts = Table('contacts', meta,
Column('id', Integer, primary_key=True),
Column('first_name', String(25)),
Column('middle_name', String(25)),
Column('last_name', String(25)),
# etc...
Column('created_at', DateTime),
Column('updated_at', DateTime),
Column('created_by', Integer),
Column('updated_by', Integer),
ForeignKeyConstraint(['created_by'], ['contacts.id']),
ForeignKeyConstraint(['updated_by'], ['contacts.id'])
)

mapper(Contact, contacts, properties={
'_created_by': contacts.c.created_by,
'_updated_by': contacts.c.updated_by,
'created_by': relation(Contact,
primaryjoin=contacts.c.created_by==contacts.c.id, uselist=False),
'updated_by': relation(Contact,
primaryjoin=contacts.c.updated_by==contacts.c.id, uselist=False),
})

The primaryjoin is necessary due to the multiple FKs back to contacts.

This follows the pattern suggested here:
http://groups.google.pl/group/sqlalchemy/browse_thread/thread/e20bb32241ced699

No errors occur but the 'created_by' column in the table is not
actually updated upon flush/commit. Instead the _created_by attribute
is updated.

Michael Bayer

unread,

May 28, 2008, 6:16:32 PM5/28/08

to sqlal...@googlegroups.com

this is a many-to-one self-referential relationship. the
"remote_side" attribute is needed on both relations() to establish
this, and the uselist=False is not needed (its hiding the actual error
here). See the docs on "self-referential mappings".

kremlan

unread,

May 28, 2008, 7:16:35 PM5/28/08

to sqlalchemy

following the example in the mapper configuration documentation i've
come up with:

mapper(Contact, contacts,, properties={

'_created_by': contacts.c.created_by,
'_updated_by': contacts.c.updated_by,
'created_by': relation(Contact,
primaryjoin=contacts.c.created_by==contacts.c.id,

remote_side=[contacts.c.id]),
'updated_by': relation(Contact,
primaryjoin=contacts.c.updated_by==contacts.c.id,
backref=backref('updated_by_me',
primaryjoin=contacts.c.updated_by==contacts.c.id,
remote_side=[contacts.c.id])),
})

Both of these configurations set their respective _ properties rather
than update the actual database row.

Thank you very much for your help.

> >http://groups.google.pl/group/sqlalchemy/browse_thread/thread/e20bb32...

Michael Bayer

unread,

May 28, 2008, 8:47:20 PM5/28/08

to sqlal...@googlegroups.com

Here's a script which exercises your mapping against 0.4 - the
remote_side should be on the "many-to-one" side of "updated_by".

Addtionally, depending on what kind of combinations of Contact objects
you want to store, you might run into the scenario where ContactA
references ContactB, and ContactB references ContactA. Theres no way
to INSERT data like that without an UPDATE (assuming FK integrity and
no sequences), so SQLA wants you to use a flag called "post_update" if
that occurs. It needs the flag if the co-dependent items are being
UPDATEd too since it works out dependencies for UPDATE and INSERTs in
the same way...this is something that could perhaps be improved upon.
So the second half of the script illustrates that mapping as an
alternative.

cyclical_contacts.py

Brad Wells

unread,

May 29, 2008, 11:58:50 AM5/29/08

to sqlal...@googlegroups.com

Thank you very much for your help. The post_update=True addition is what I was missing.

The test case I was using was c1.created_by = c1 where created_by was previously null.
(c1 having previously been saved and flushed) This seems to be the one corner case where
post_update=True was necessary.

I have attached a script to demonstrate the problem I encountered. I don't know if this is
expected or known behavior or not, but it was certainly confusing to me. I would assume that saving
against a null vs non-null value should not result in different behavior.

Thank you again.

-brad

> cyclical_contacts.py
> 3KDownload

columns.py

Michael Bayer

unread,

May 29, 2008, 1:55:44 PM5/29/08

to sqlal...@googlegroups.com

On May 29, 2008, at 11:58 AM, Brad Wells wrote:

Thank you very much for your help. The post_update=True addition is what I was missing.

The test case I was using was c1.created_by = c1 where created_by was previously null.
(c1 having previously been saved and flushed) This seems to be the one corner case where
post_update=True was necessary.

I have attached a script to demonstrate the problem I encountered. I don't know if this is
expected or known behavior or not, but it was certainly confusing to me. I would assume that saving
against a null vs non-null value should not result in different behavior.

that exact issue is this:

INSERT INTO TABLE (id, data, parent_id) VALUES (x, y, z)

When using a database like SQLite or MySQL which uses an "identity" scheme to generate primary keys, we are not able to populate the "parent_id" column in the above statement; we don't yet know what "id" is. Therefore, an INSERT followed by an UPDATE with the newly generated ID is required. With databases that use sequences like PG or Oracle, SQLA abstracts the usage of the sequence so that it operates like an "identity" (the lowest common denominator of behavior) to the ORM, so the same issue exists. It's obvious that if parent_id is meant to be NULL, nothing would go wrong.

For that exact example, SQLA will not have a "circular dependency" error since the topological algorithm doesn't represent the same object more than once, but its likely that it should at some point be made to recognize that case. So the error isn't raised but the actual case fails.

Thank you again.

-brad

On May 28, 8:47 pm, Michael Bayer <mike...@zzzcomputing.com> wrote:
> Here's a script which exercises your mapping against 0.4 - the
> remote_side should be on the "many-to-one" side of "updated_by".
>
> Addtionally, depending on what kind of combinations of Contact objects
> you want to store, you might run into the scenario where ContactA
> references ContactB, and ContactB references ContactA. Theres no way
> to INSERT data like that without an UPDATE (assuming FK integrity and
> no sequences), so SQLA wants you to use a flag called "post_update" if
> that occurs. It needs the flag if the co-dependent items are being
> UPDATEd too since it works out dependencies for UPDATE and INSERTs in
> the same way...this is something that could perhaps be improved upon.
> So the second half of the script illustrates that mapping as an
> alternative.
>
> cyclical_contacts.py
> 3KDownload

from sqlalchemy import *
from sqlalchemy import exceptions
from sqlalchemy.orm import *

import logging
#logging.basicConfig()
#logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
#logging.getLogger('sqlalchemy.orm.properties.PropertyLoader').setLevel(logging.INFO)
#logging.getLogger('sqlalchemy.orm.strategies.LazyLoader').setLevel(logging.INFO)

meta = MetaData(create_engine('sqlite://'))

contacts = Table('contacts', meta,
Column('id', Integer, primary_key=True),

Column('name', String(50)),

  Column('created_by', Integer),
  Column('updated_by', Integer),
  ForeignKeyConstraint(['created_by'], ['contacts.id']),
  ForeignKeyConstraint(['updated_by'], ['contacts.id'])
)

meta.create_all()

class Contact(object):
   def __init__(self, name, created_by=None):
       self.name = name
       self.created_by = created_by

   def __eq__(self, other):
       return other.name == self.name

   def __repr__(self):
       return "Contact(created_by=%r, updated_by=%r)" % (self.created_by, self.updated_by)

# create a sample mapper with and without post_update=True
# updated_by uses post_update, created_by does not.

mapper(Contact, contacts, properties={
  '_created_by': contacts.c.created_by,
  '_updated_by': contacts.c.updated_by,
  'created_by': relation(Contact, primaryjoin=contacts.c.created_by==contacts.c.id,

          remote_side=[contacts.c.id],
   ),
  'updated_by': relation(Contact, primaryjoin=contacts.c.updated_by==contacts.c.id,
           remote_side=[contacts.c.id],
           post_update=True,
   )
})

sess = create_session()

# saving indivisually seems to be necessary to ensure ID ordering
c1 = Contact('c1')
sess.save(c1)
sess.flush()

c2 = Contact('c2')
sess.save(c2)
sess.flush()

# assert proper data
assert contacts.select().order_by(contacts.c.name).execute().fetchall() == [
   (1, 'c1', None, None),
   (2, 'c2', None, None),
]
sess.clear()

[c1, c2] = sess.query(Contact).order_by(Contact.name).all()

c1.created_by = c2
c1.updated_by = c2
sess.flush()

# saving against a null key with a non self referential key works as expected
assert contacts.select().order_by(contacts.c.name).execute().fetchall() == [
   (1, 'c1', 2, 2),
   (2, 'c2', None, None),
]
sess.clear()

[c1, c2] = sess.query(Contact).order_by(Contact.name).all()

c1.created_by = c1
c1.updated_by = c1
c2.created_by = c2
c2.updated_by = c2
sess.flush()

# updating a non-null key with a self referential key works as expected
# with and without post_update=True
#
# saving against a null key with a self referential key works only with
# post_update=True.
#
# This does not seem to be a true circular relationship as the resulting SQL
# can be one simple UPDATE.
#
# expected outcome (to me):
#   c1 created_by=1, updated_by=1
#   c2 created_by=2, updated_by=2
# actual outcome:
#   c1 created_by=1, updated_by=1
#   c2 created_by=None, updated_by=2
assert contacts.select().order_by(contacts.c.name).execute().fetchall() == [
   (1, 'c1', 1, 1),
   (2, 'c2', 2, 2),
]
sess.clear()

Reply all

Reply to author

Forward