Updating using merge(), getting NULL foreign key reference

1,088 views
Skip to first unread message

Sam Magister

unread,
Jul 25, 2008, 4:03:20 PM7/25/08
to sqlalchemy
Hi,

I've been playing with SQLAlchemy v0.5b1 for the past few weeks and I
like it a _lot_. I'm stuck on one ORM detail which I'm hoping someone
might be kind enough to help me with.

I've constructed a pared down example (code below) to illustrate the
problem I'm having. I want to store an object representing a car part
(for example) using two tables: a 'part' table and a 'manufacturer'
table. Each row in the part table references a manufacturer by a
foreign key reference to the manufacturer table.

For both the part table and the manufacturer table, I want to have a
unique, auto-increment id column because there are other tables which
refer to each by foreign key reference.

Additionally, I want the values in each of these tables to be
constrained: There should be no two rows in the part table with the
same values of (part_number, fk_manufacturer). Similarly there should
be no two rows in the manufacturer table with the same values of
(name, city). These constraints model what I consider to be unique
parts and unique manufacturers.

I would like the interface to the Part() and Manufacturer() objects to
not have to worry about these constraints. Ideally a row in the db
would be updated or inserted depending on whether it conflicted with
the unique constraints. If it did, it would be updated, if not, it
would be inserted. This is the interface I've set up in the if name ==
'__main__' section of the code.

Presently I am enforcing these constraints in the __init__() methods
of both the Part() class and the Manufacturer() class. Every time a
Part or a Manufacturer object is created, I check to see if an
identical row already exists. If it does, the id is changed to that of
the existing row. If there is not, then nothing happens. My thought
was that by setting the id, this indicates to sqlalchemy that the row
should be updated, not inserted (if this is an incorrect assumption,
is there a way to indicate this to sqlalchemy?)

The problem that arises is upon updating a part: The fk_manufacturer
value gets set to NULL, rather than the foreign key of the already
existing manufacturer. Running the code below I get:

created
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-
packages/SQLAlchemy-0.5.0beta1-py2.5.egg/sqlalchemy/databases/mysql.py:
1487: Warning: Column 'fk_manufacturer' cannot be null
cursor.execute(statement, parameters)
updated

Any help with an elegant way to deal with this type of design pattern
would be greatly appreciated.

Thanks,

Sam

import sqlalchemy
from sqlalchemy import Table, Column, ForeignKey, Integer, String,
Text, create_engine, MetaData
from sqlalchemy.orm import mapper, relation
import meta
from meta import Session
from sqlalchemy.orm import sessionmaker, scoped_session

engine = create_engine('mysql://user:####@localhost:3306/new_db')
metadata = MetaData()
metadata.bind = engine

Session = scoped_session(sessionmaker(autoflush=False,
autocommit=False, bind=engine))

part_table = Table('part', metadata,\
Column('id', Integer, primary_key=True),\
Column('part_number', String(64)),\
Column('fk_manufacturer', Integer,
ForeignKey('manufacturer.id'), nullable=False),\
)

manufacturer_table = Table('manufacturer', metadata,\
Column('id', Integer, primary_key=True),\
Column('name', String(30)),\
Column('city', String(30)),\
Column('description', Text),\
)

class Part(object):
def __init__(self, part_number, manufacturer):
self.part_number = part_number
self.manufacturer = manufacturer
session = Session()
# Impose unique values of of (fk_manufacturer, part_number)
try:
existing_part = session.query(Part)\
.filter(Part.part_number==self.part_number)
\
.filter(Part.fk_manufacturer==self.manufacturer.id)
\
.one()
self.id = existing_part.id
except sqlalchemy.orm.exc.NoResultFound:
pass

class Manufacturer(object):
def __init__(self, name, city):
self.name = name
self.city = city
session = Session()
# Imposed unique values of (name, city)
try:
existing = session.query(Manufacturer)\
.filter(Manufacturer.name==self.name)\
.filter(Manufacturer.city==self.city)\
.one()
self.id = existing.id
except sqlalchemy.orm.exc.NoResultFound:
pass
session.close()

mapper(Manufacturer, manufacturer_table)

mapper(Part, part_table,\
properties={'manufacturer':relation(Manufacturer, cascade='save-
update, merge')})

if __name__ == '__main__':
metadata.create_all(meta.engine)
session = Session()
# set the values of the manfuacturer and the part
manufacturer_name = 'Applied Materials Inc.'
city = 'Portland'
part_number = '1234-4321'
# create manufacturer
m = Manufacturer(manufacturer_name, city)
m.description = 'A material company.'
# create part (works ok)
p = Part(part_number, m)
session.merge(p)
session.commit()
print 'created'
# update part (fails to correctly store the fk_manufacturer of the
original manufacturer. Has a NULL value instead)
m = Manufacturer(manufacturer_name, city)
m.description = 'An applied material company.'
p = Part(part_number, m)
session.merge(p)
session.commit()
print 'updated'


import sqlalchemy
from sqlalchemy import Table, Column, ForeignKey, Integer, String,
Text, create_engine, MetaData
from sqlalchemy.orm import mapper, relation
import meta
from meta import Session
from sqlalchemy.orm import sessionmaker, scoped_session

engine = create_engine('mysql://user:####@localhost:3306/new_db')
metadata = MetaData()
metadata.bind = engine

Session = scoped_session(sessionmaker(autoflush=False,
autocommit=False, bind=engine))

part_table = Table('part', metadata,\
Column('id', Integer, primary_key=True),\
Column('part_number', String(64)),\
Column('fk_manufacturer', Integer,
ForeignKey('manufacturer.id'), nullable=False),\
)

manufacturer_table = Table('manufacturer', metadata,\
Column('id', Integer, primary_key=True),\
Column('name', String(30)),\
Column('city', String(30)),\
Column('description', Text),\
)

class Part(object):
def __init__(self, part_number, manufacturer):
self.part_number = part_number
self.manufacturer = manufacturer
session = Session()
# Impose unique values of of (fk_manufacturer, part_number)
try:
existing_part = session.query(Part)\
.filter(Part.part_number==self.part_number)
\
.filter(Part.fk_manufacturer==self.manufacturer.id)
\
.one()
self.id = existing_part.id
except sqlalchemy.orm.exc.NoResultFound:
pass

class Manufacturer(object):
def __init__(self, name, city):
self.name = name
self.city = city
session = Session()
# Imposed unique values of (name, city)
try:
existing = session.query(Manufacturer)\
.filter(Manufacturer.name==self.name)\
.filter(Manufacturer.city==self.city)\
.one()
self.id = existing.id
except sqlalchemy.orm.exc.NoResultFound:
pass
session.close()

mapper(Manufacturer, manufacturer_table)

mapper(Part, part_table,\
properties={'manufacturer':relation(Manufacturer, cascade='save-
update, merge')})

if __name__ == '__main__':
metadata.create_all(meta.engine)
session = Session()
# set the values of the manfuacturer and the part
manufacturer_name = 'Applied Materials Inc.'
city = 'Portland'
part_number = '1234-4321'
# create manufacturer
m = Manufacturer(manufacturer_name, city)
m.description = 'A material company.'
# create part (works ok)
p = Part(part_number, m)
session.merge(p)
session.commit()
print 'created'
# update part (fails to correctly store the fk_manufacturer of the
original manufacturer. Has a NULL value instead)
m = Manufacturer(manufacturer_name, city)
m.description = 'An applied material company.'
p = Part(part_number, m)
session.merge(p)
session.commit()
print 'updated'

Michael Bayer

unread,
Jul 25, 2008, 5:39:09 PM7/25/08
to sqlal...@googlegroups.com

On Jul 25, 2008, at 4:03 PM, Sam Magister wrote:

>
> Presently I am enforcing these constraints in the __init__() methods
> of both the Part() class and the Manufacturer() class. Every time a
> Part or a Manufacturer object is created, I check to see if an
> identical row already exists. If it does, the id is changed to that of
> the existing row. If there is not, then nothing happens. My thought
> was that by setting the id, this indicates to sqlalchemy that the row
> should be updated, not inserted (if this is an incorrect assumption,
> is there a way to indicate this to sqlalchemy?)

unfortunately this is not how it works. An instance when loaded from
the database has state information associated with it which tells SQLA
that this is a "persistent" or "detached" instance, and not a
"pending" or "transient" instance (these four terms are described in
the session documentation). We don't currently have an API method
to flip an instance from "transient" to "persistent" in place without
going through the usual save mechanism. session.merge() is close
since it accepts a transient instance with a primary key identifier,
and returns to you the persistent version of it, copying all the state
from the instance you give to the persistent one.

Using either session.merge() or just a get() with the given id,
neither solve the issue of checking for the id within the __init__
method. Fortunately Python makes it super easy to control the
initialization process of an object, and for a case like this you can
use __new__():

class MyClass(object):
def __new__(cls, id=None):
if id:
obj = return Session.query(cls).get(id)
if obj:
return obj
return object.__new__(cls)

Above, saying MyClass() will give you a MyClass() instance. Saying
MyClass(7) will look up #7 in the database first and return that
instance if found, otherwise a new MyClass() object. You can enhance
this to take in a full list of data and merge it in with merge():

class MyClass(object):
def __new__(cls, id=None, **kwargs):
newinstance = object.__new__(cls)
if id:
obj = return Session.query(cls).get(id)
if obj:
return Session.merge(newinstance)

return newinstance

The object.__new__(cls) call does *not* require any arguments (and I
think won't accept them in more recent versions of Python), but the
__init__ method of the object still gets at the full set of id +
**kwargs when you call it. So this is an easy way to control the
initialization of an object such that you may or may not actually
construct an object locally.


Sam Magister

unread,
Jul 26, 2008, 2:15:32 PM7/26/08
to sqlalchemy
Michael,

Thanks very much. I wasn't familiar with the __new__ built in method.
After reading up, I've set up the table definitions and classes as
follows and it seems to work. (I also set up the multiple column
indexes for speed). I'm curious if this is a common design pattern so
that the interface with the classes themselves is dead-simple (all the
checks for the what constitutes a unique part/manufacturer happen
within the __new__ method and not in any external code that
manipulates them).

Best,

Sam

------------

part_table = Table('part', metadata,\
Column('id', Integer, primary_key=True),\
Column('part_number', String(64)),\
Column('fk_manufacturer', Integer,
ForeignKey('manufacturer.id'), nullable=False),\
)

Index('idx_part', part_table.c.part_number,
part_table.c.fk_manufacturer, unique=True)

manufacturer_table = Table('manufacturer', metadata,\
Column('id', Integer, primary_key=True),\
Column('name', String(30)),\
Column('city', String(30)),\
Column('description', Text),\
)

Index('idx_manufacturer', manufacturer_table.c.name,
manufacturer_table.c.city, unique=True)

class Part(object):
def __init__(self, part_number, manufacturer):
self.part_number = part_number
self.manufacturer = manufacturer

def __new__(cls, part_number=None, manufacturer=None):
if part_number and manufacturer:
try:
obj =
session.query(cls).filter(cls.part_number==part_number)\
.filter(cls.manufacturer==manufacturer)
\
.one()
return obj
except sqlalchemy.orm.exc.NoResultFound:
pass
return object.__new__(cls)

class Manufacturer(object):
def __init__(self, name, city):
self.name = name
self.city = city

def __new__(cls, name=None, city=None):
if name and city:
try:
obj = session.query(cls).filter(cls.name==name)\
.filter(cls.city==city)\
.one()
return obj
except sqlalchemy.orm.exc.NoResultFound:
pass
return object.__new__(cls)
Reply all
Reply to author
Forward
0 new messages