0.4 to 0.5 upgrade and 'One-to-many relation fails with "unsaved, pending instance and is an orphan"' error

16 views
Skip to first unread message

Brad Wells

unread,
Jul 2, 2009, 7:21:42 PM7/2/09
to sqlalchemy
In the process of upgrading from 0.4 to 0.5 I've come across a
troubling issue. With the following setup:

####################################################################################

from sqlalchemy import Table, Column, Integer, String, MetaData,
create_engine, ForeignKey
from sqlalchemy.orm import relation, sessionmaker, scoped_session

engine = create_engine('sqlite:///mystuff.sqlite', echo=True)

Session = scoped_session(sessionmaker(autoflush=True,
transactional=True, bind=engine))
metadata = MetaData()
mapper = Session.mapper

time_zones = Table('time_zones', metadata,
Column('id', Integer, primary_key=True),
Column('name', String(35)))

contacts = Table('contacts', metadata,
Column('id', Integer, primary_key=True),
Column('display_as', String(35)),
Column('time_zone_id', Integer, ForeignKey
('time_zones.id')))

phone_numbers = Table('phone_numbers', metadata,
Column('id', Integer, primary_key=True),
Column('number', String(35)),
Column('contact_id', Integer, ForeignKey
('contacts.id')))

class TimeZone(object): pass
class Contact(object): pass
class PhoneNumber(object): pass

mapper(TimeZone, time_zones)
mapper(Contact, contacts, properties={
'time_zone': relation(TimeZone, backref='contacts'),
'phone_numbers': relation(PhoneNumber, backref='contact',
cascade='all, delete-orphan')
})
mapper(PhoneNumber, phone_numbers)

metadata.create_all(bind=engine)

##############################################################################

Under 0.4 the following code executes fine:


############

c = Contact(display_as='Fake, User')
c.time_zone = TimeZone.query.filter_by(name='Africa/Algiers').first()
ph = PhoneNumber(full='1234567890')
c.phone_numbers.append(ph)
Session.commit()

############

But under 0.5 I receive an orphaned object error (see below for full
output). I understand that the TimeZone query causes a flush in
between the creation of the Contact and of the PhoneNumber. Without
the flush in between (if the TimeZone query line is removed) SA 0.5 is
correctly able execute the sample script.

As per this thread (http://groups.google.com/group/sqlalchemy/
browse_thread/thread/6c71c61bc59223f?tvc=2) I see that a suggested
remedy is to change the relation to cascade='all' rather than
cascade='all, delete-orphan'.
I would prefer not to do this as it really does make no sense in this
case to have a PhoneNumber without a Contact. I could also set the
relation via 'ph.contact = contact' but I would prefer to not have to
comb all of our existing code for this new class of bug.

What doesn't make sense to me is why 0.4 was able to correctly delay
the insert of the new phone number record until after the query for
the collection and now 0.5 can't.

Thank you for the assistance
-Brad


Below is the full echo output of 0.4 and 0.5

0.4.8dev_r5095:
2009-07-02 18:51:17,391 INFO sqlalchemy.engine.base.Engine.0x..70
BEGIN
2009-07-02 18:51:17,391 INFO sqlalchemy.engine.base.Engine.0x..70
INSERT INTO contacts (display_as, time_zone_id) VALUES (?, ?)
2009-07-02 18:51:17,391 INFO sqlalchemy.engine.base.Engine.0x..70
['Fake, User', None]
2009-07-02 18:51:17,391 INFO sqlalchemy.engine.base.Engine.0x..70
SELECT time_zones.id AS time_zones_id, time_zones.name AS
time_zones_name FROM time_zones WHERE time_zones.name = ? ORDER BY
time_zones.oid LIMIT 1 OFFSET 0
2009-07-02 18:51:17,391 INFO sqlalchemy.engine.base.Engine.0x..70
['Africa/Algiers']
2009-07-02 18:51:17,407 INFO sqlalchemy.engine.base.Engine.0x..70
SELECT phone_numbers.id AS phone_numbers_id, phone_numbers.number AS
phone_numbers_number, phone_numbers.contact_id AS
phone_numbers_contact_id FROM phone_numbers WHERE
phone_numbers.contact_id = ? ORDER BY phone_numbers.oid
2009-07-02 18:51:17,407 INFO sqlalchemy.engine.base.Engine.0x..70 [4]
2009-07-02 18:51:17,407 INFO sqlalchemy.engine.base.Engine.0x..70
INSERT INTO phone_numbers (number, contact_id) VALUES (?, ?)
2009-07-02 18:51:17,407 INFO sqlalchemy.engine.base.Engine.0x..70
[None, 4]
2009-07-02 18:51:17,407 INFO sqlalchemy.engine.base.Engine.0x..70
COMMIT




0.5.4p2:
sa_test.py:8: SADeprecationWarning: The 'transactional' argument to
sessionmaker() is deprecated; use autocommit=True|False instead.
Session = scoped_session(sessionmaker(autoflush=True,
transactional=True, bind=engine))
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
PRAGMA table_info("time_zones")
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
()
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
PRAGMA table_info("contacts")
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
()
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
PRAGMA table_info("phone_numbers")
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
()
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
BEGIN
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
INSERT INTO contacts (display_as, time_zone_id) VALUES (?, ?)
2009-07-02 18:54:41,658 INFO sqlalchemy.engine.base.Engine.0x...2f30
['Fake, User', None]
2009-07-02 18:54:41,674 INFO sqlalchemy.engine.base.Engine.0x...2f30
SELECT time_zones.id AS time_zones_id, time_zones.name AS
time_zones_name FROM time_zones WHERE time_zones.name = ? LIMIT 1
OFFSET 0
2009-07-02 18:54:41,674 INFO sqlalchemy.engine.base.Engine.0x...2f30
['Africa/Algiers']
Traceback (most recent call last):
File "sa_test.py", line 47, in <module>
c.phone_numbers.append(ph)
File "c:\program files\python\lib\site-packages\sqlalchemy-0.5.4p2-
py2.5.egg\sqlalchemy\orm\attributes.py", line 158,
in __get__
return self.impl.get(instance_state(instance), instance_dict
(instance))
File "c:\program files\python\lib\site-packages\sqlalchemy-0.5.4p2-
py2.5.egg\sqlalchemy\orm\attributes.py", line 374,
in get
value = callable_()
File "c:\program files\python\lib\site-packages\sqlalchemy-0.5.4p2-
py2.5.egg\sqlalchemy\orm\strategies.py", line 568,
in __call__
result = q.all()
File "c:\program files\python\lib\site-packages\sqlalchemy-0.5.4p2-
py2.5.egg\sqlalchemy\orm\query.py", line 1193, in a
ll
return list(self)
File "c:\program files\python\lib\site-packages\sqlalchemy-0.5.4p2-
py2.5.egg\sqlalchemy\orm\query.py", line 1286, in _
_iter__
self.session._autoflush()
File "c:\program files\python\lib\site-packages\sqlalchemy-0.5.4p2-
py2.5.egg\sqlalchemy\orm\session.py", line 899, in
_autoflush
self.flush()
File "c:\program files\python\lib\site-packages\sqlalchemy-0.5.4p2-
py2.5.egg\sqlalchemy\orm\session.py", line 1354, in
flush
self._flush(objects)
File "c:\program files\python\lib\site-packages\sqlalchemy-0.5.4p2-
py2.5.egg\sqlalchemy\orm\session.py", line 1414, in
_flush
mapperutil.state_str(state), path))
sqlalchemy.orm.exc.FlushError: Instance <PhoneNumber at 0xf8ae30> is
an unsaved, pending instance and is an orphan (is n
ot attached to any parent 'Contact' instance via that classes'
'phone_numbers' attribute)

Michael Bayer

unread,
Jul 3, 2009, 1:19:36 AM7/3/09
to sqlal...@googlegroups.com

On Jul 2, 2009, at 7:21 PM, Brad Wells wrote:

>
> But under 0.5 I receive an orphaned object error (see below for full
> output). I understand that the TimeZone query causes a flush in
> between the creation of the Contact and of the PhoneNumber. Without
> the flush in between (if the TimeZone query line is removed) SA 0.5 is
> correctly able execute the sample script.
>
> As per this thread (http://groups.google.com/group/sqlalchemy/
> browse_thread/thread/6c71c61bc59223f?tvc=2) I see that a suggested
> remedy is to change the relation to cascade='all' rather than
> cascade='all, delete-orphan'.
> I would prefer not to do this as it really does make no sense in this
> case to have a PhoneNumber without a Contact. I could also set the
> relation via 'ph.contact = contact' but I would prefer to not have to
> comb all of our existing code for this new class of bug.
>
> What doesn't make sense to me is why 0.4 was able to correctly delay
> the insert of the new phone number record until after the query for
> the collection and now 0.5 can't.
>

its because accessing the lazy collection on c.phone_numbers doesn't
trigger autoflush in 0.4, while it does in 0.5. Below is a test case
which fails only on 0.4 due to this behavior. 0.5 IMO is the one
which is more "correct", but its an unfortunate feature since this is
a frequent inconvenience. The issue demonstrated below is probably
not as common of a use case as that of just appending a non-
orphanable, pending object to a collection. But, an argument in favor
of 0.5's behavior is that the orphan error, while inconvenient,
complains loudly and is easily remedied (assuming one understands lazy
collections and autoflush), whereas the failure in 0.4 is silent.

a possible workaround would be to make the autoflush on the relation()
optional based on a configuration option, although that seems like yet
another obscure flag nobody would ever find out about and it also
dilutes the consistency of autoflush's behavior.

What I've advised to people in the past, and have also used myself, is
to temporarily disable autoflush on the session during code sections
that a lot of object manipulation is to occur, such as a form-to-model
population method in a web app. I made a decorator which
accomplishes this and it also fits nicely into a "with:" construct
(i.e. with no_autoflush():)".

using a "dynamic" relation (i.e. lazy="dynamic") would also alleviate
this problem since those collections don't request data unless they
are iterated. "dynamic" relations are the style of collection most
other Python ORMs use in all cases.

from sqlalchemy import *
from sqlalchemy.orm import *

engine = create_engine('sqlite:///', echo=True)

metadata = MetaData()
users = Table('users', metadata,


Column('id', Integer, primary_key=True),

Column('name', String(30), nullable=False),
)

addresses = Table('addresses', metadata,


Column('id', Integer, primary_key=True),

Column('user_id', None, ForeignKey('users.id')),
Column('email_address', String(50), nullable=False))

metadata.create_all(engine)
engine.execute(users.insert(),dict(id=8, name='ed'))

engine.execute(addresses.insert(),
[dict(id=x, user_id=y, email_address=z) for x, y, z in [
(2, 8, "e...@wood.com"),
(3, 8, "e...@bettyboop.com"),
(4, 8, "e...@lala.com"),
]])

class User(object):
pass
class Address(object):
pass

mapper(User, users, properties={
'addresses':relation(Address, order_by=addresses.c.email_address)
})
mapper(Address, addresses)

s = create_session(autoflush=True, transactional=True, bind=engine)

u = s.query(User).get(8)

ad2 = s.query(Address).get(2)
ad2.email_address = 'aaaaa'

assert [a.email_address for a in u.addresses] == ['aaaaa', 'e...@bettyboop.com
', 'e...@lala.com']

Michael Bayer

unread,
Jul 3, 2009, 10:14:49 AM7/3/09
to sqlal...@googlegroups.com
oh man, i missed the biggest part of this one, after I noticed that
you weren't doing anything to put the PhoneNumber in the session.
Don't use Session.mapper. That is the main problem you're having in
this specific case. Auto-adding free-standing objects to the session
is a bad idea and I've downplayed the crap out of this legacy feature
for years now.


On Jul 2, 2009, at 7:21 PM, Brad Wells wrote:

>

Brad Wells

unread,
Jul 8, 2009, 5:35:32 PM7/8/09
to sqlalchemy
Thank you for the detailed explanation of the problem and my possible
solutions.
Reply all
Reply to author
Forward
0 new messages