session merge sets missing children foreign keys to null


Gabriel Smith

Dec 8, 2021, 10:38:33 AM
to sqlalchemy
Hi, I'm a recent adopter of SQLAlchemy, starting fresh with all the 2.0 stuff. Thanks so much for the entire team's hard work!

I have a small issue with the merge functionality, though it could honestly just be a misunderstanding of the function on my part. I have a fairly complicated database structure with 15 nested child tables, some of which have thousands of read-only records. I'd prefer not to have to add all those records to the session in order to run a merge, but when I try that, the subsequent flush emits SQL UPDATEs that set the child foreign keys to null.

If there is a way to avoid those updates for orphaned children, and simply ignore children that aren't included in the incoming data by primary key, that would really help me. If not, I can look into adding them all to the incoming entity so they'll be ignored.

MCVE:
```
from sqlalchemy import Column, Integer, String, ForeignKey, create_engine, select
from sqlalchemy.orm import registry, declarative_base, relationship, sessionmaker
from sqlalchemy.sql.sqltypes import Numeric
import unittest

Base = declarative_base()
Mapper_Registry = registry()

### MODELS

@Mapper_Registry.mapped
class Item:
    __tablename__ = 'item'
   
    id = Column(Integer, primary_key=True)
    model_number = Column(String)

    item_prices = relationship("Item_Price", back_populates="item", lazy="joined")

@Mapper_Registry.mapped
class Item_Price:
    __tablename__ = 'item_price'
   
    id = Column(Integer, primary_key=True)
    item_id = Column(Integer, ForeignKey('item.id'))
    price = Column(Numeric)

    item = relationship("Item", back_populates="item_prices", lazy="joined", viewonly=True)

### TESTS

class Test_OrphanRecordFKMerge(unittest.TestCase):
    engine = create_engine('sqlite:///:memory:', echo=True, echo_pool='debug', future=True)
    Session = sessionmaker(bind=engine)
    session = Session()

    def setUp(self):
        Mapper_Registry.metadata.create_all(self.engine)
        # Create a base item to run tests on
        t_item = Item()
        t_item.model_number = 'TestItem'
        t_price1 = Item_Price()
        t_price1.price = 1.00
        t_item.item_prices.append(t_price1)
        t_price2 = Item_Price()
        t_price2.price = 4.00
        t_item.item_prices.append(t_price2)

        self.session.add(t_item)
        self.session.commit()

    def tearDown(self):
        # drop the tables registered on Mapper_Registry (Base.metadata is empty)
        Mapper_Registry.metadata.drop_all(self.engine)

    def test_item_update(self):
        self.session.expunge_all()
        # Incoming item data from remote api or flat file
        incoming_item = Item()
        incoming_item.model_number = 'TestItem'
        incoming_price1 = Item_Price()
        incoming_price1.price = 777.00
        incoming_item.item_prices.append(incoming_price1)
        # Now we have an incoming item, we need to query the database for the existing item and reconcile the primary keys
        # so that it can be updated correctly
        persisted_item = self.session.execute(select(Item).where(Item.model_number == 'TestItem')).scalars().first()
        incoming_item.id = persisted_item.id
        # let us imagine that the new price should not overwrite either old price
        self.session.merge(incoming_item)
        self.session.commit()
        self.session.expunge_all()
        final_result = self.session.execute(select(Item).where(Item.model_number == 'TestItem')).scalars().first()
        # the following test fails as both the other price records have had their foreign keys set to null after the merge
        # so the len(final_result.item_prices) == 1
        self.assertEqual(len(final_result.item_prices), 3)

if __name__ == '__main__':
    unittest.main()
```

Output:
```
2021-12-08 09:34:46,053 INFO sqlalchemy.engine.Engine UPDATE item_price SET item_id=? WHERE item_price.id = ?
2021-12-08 09:34:46,053 INFO sqlalchemy.engine.Engine [generated in 0.00056s] ((None, 1), (None, 2))
2021-12-08 09:34:46,055 INFO sqlalchemy.engine.Engine INSERT INTO item_price (item_id, price) VALUES (?, ?)
2021-12-08 09:34:46,055 INFO sqlalchemy.engine.Engine [cached since 0.03964s ago] (1, 777.0)
2021-12-08 09:34:46,057 INFO sqlalchemy.engine.Engine COMMIT
```

The update statement above is what I am trying to avoid, but I'd still like to use the merge functionality if possible.

Thanks for any guidance and for all you've put into this amazing library.

Mike Bayer

Dec 8, 2021, 2:48:45 PM
to noreply-spamdigest via sqlalchemy


On Wed, Dec 8, 2021, at 10:38 AM, Gabriel Smith wrote:
Hi, I'm a recent adopter of SQLAlchemy, starting fresh with all the 2.0 stuff. Thanks so much for the entire team's hard work!

great!


I have a small issue with the merge functionality, though it could honestly just be a misunderstanding of the function on my part. I have a fairly complicated database structure with 15 nested child tables, some of which have thousands of read-only records. I'd prefer not to have to add all those records to the session in order to run a merge, but when I try that, the subsequent flush emits SQL UPDATEs that set the child foreign keys to null.

If there is a way to avoid those updates for orphaned children, and simply ignore children that aren't included in the incoming data by primary key, that would really help me. If not, I can look into adding them all to the incoming entity so they'll be ignored.

yeah this is happening because of this line:

   incoming_item.item_prices.append(incoming_price1)

this means you are merging the item like this:

  Item(item_prices = [ItemPrice(...)])

noting that the other two ItemPrice objects are no longer in that collection, which is why they are seen as a net remove. Merge doesn't add an incoming collection to the one that's already there; it treats the collection coming in as "this is the new collection" and replaces the old collection with the new one. So the incoming collection would still need to contain the other two objects, either as the same persistent prices you already have or as additional transient objects with the correct primary keys. Below we can do it in the former style if we use a future-style Session.
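To make that second option concrete, here's a minimal self-contained sketch of the "transient objects with the correct primary key" style. Note this uses simplified one-directional models for brevity, not the exact MCVE mappings:

```python
from sqlalchemy import Column, ForeignKey, Integer, Numeric, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Item(Base):
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    model_number = Column(String)
    item_prices = relationship("ItemPrice")

class ItemPrice(Base):
    __tablename__ = "item_price"
    id = Column(Integer, primary_key=True)
    item_id = Column(Integer, ForeignKey("item.id"))
    price = Column(Numeric)

engine = create_engine("sqlite://", future=True)
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine, future=True)()

# seed one item with two existing prices
session.add(Item(id=1, model_number="TestItem",
                 item_prices=[ItemPrice(id=1, price=1), ItemPrice(id=2, price=4)]))
session.commit()
session.expunge_all()

# the incoming state carries the FULL collection; existing rows are
# represented as transient objects that name their primary keys
incoming = Item(id=1, model_number="TestItem", item_prices=[
    ItemPrice(id=1, price=1),  # matched by PK -> merged into the existing row
    ItemPrice(id=2, price=4),  # matched by PK -> merged into the existing row
    ItemPrice(price=777),      # no PK -> INSERT
])
session.merge(incoming)
session.commit()
session.expunge_all()

final = session.get(Item, 1)
assert len(final.item_prices) == 3  # nothing was orphaned
```

Since every member of the collection is transient here, nothing persistent is being moved around, so this style works with either legacy or future Session behavior.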

This is not a simple case to figure out, as you want to get those price objects from the persistent item and re-merge them, all the while not tripping up and accidentally cascading the transient "incoming_item" into the session. This is how we can do it:

   
    # first copy persisted_item's item_prices collection
    incoming_item.item_prices = persisted_item.item_prices

    # then append to that
    incoming_item.item_prices.append(incoming_price1)

but with "legacy" ORM behavior, this will get confused and raise an error, since it will try to cascade "incoming_item" into the Session prematurely and then try to flush it. There's a way to work around that with make_transient(), but more succinctly, if we turn on 2.0 behavior for the Session, this unwanted cascade is prevented:

    Session = sessionmaker(bind=engine, future=True)

That way there isn't even much overhead in re-merging those existing prices, since they are already persistent in that session.

As you gave me a full, runnable script, I was able to make those adjustments and your assertion at the end succeeds.
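For reference, here's a condensed, runnable version of that adjusted flow. The models are simplified (no viewonly/back_populates pair and no joined eager loading, so the merge mechanics stand out), and the primary-key reconciliation step that the MCVE's comment describes is included explicitly:

```python
from sqlalchemy import (Column, ForeignKey, Integer, Numeric, String,
                        create_engine, select)
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Item(Base):
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    model_number = Column(String)
    item_prices = relationship("ItemPrice")

class ItemPrice(Base):
    __tablename__ = "item_price"
    id = Column(Integer, primary_key=True)
    item_id = Column(Integer, ForeignKey("item.id"))
    price = Column(Numeric)

engine = create_engine("sqlite://", future=True)
Base.metadata.create_all(engine)

# future=True gives 2.0-style behavior: the transient incoming item
# is not prematurely cascaded into the session
Session = sessionmaker(bind=engine, future=True)
session = Session()

# seed one item with two existing prices
session.add(Item(model_number="TestItem",
                 item_prices=[ItemPrice(price=1), ItemPrice(price=4)]))
session.commit()
session.expunge_all()

# incoming item data from a remote API or flat file
incoming_item = Item(model_number="TestItem")
incoming_price = ItemPrice(price=777)

persisted_item = session.execute(
    select(Item).where(Item.model_number == "TestItem")
).scalars().first()

incoming_item.id = persisted_item.id                    # reconcile primary key
incoming_item.item_prices = persisted_item.item_prices  # copy persistent collection
incoming_item.item_prices.append(incoming_price)        # then append the new price

session.merge(incoming_item)
session.commit()
session.expunge_all()

final = session.execute(
    select(Item).where(Item.model_number == "TestItem")
).scalars().first()
assert len(final.item_prices) == 3  # both old prices survive, new one inserted
```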

Great job with the MCVE and paying close attention to the docs, as well as making our jobs easier!


--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 

Gabriel Smith

Dec 9, 2021, 11:21:25 AM
to sqlalchemy

Thank you for the quick and clear answer that solves the test case.

I've applied the same approach to the issue in our current codebase and it completely fixed the issue.

You're the best!