Soft Delete Pattern in SQLAlchemy

2,878 views
Skip to first unread message

Mahmoud Abdelkader

unread,
Sep 24, 2012, 1:48:49 PM9/24/12
to sqlal...@googlegroups.com
I wanted to know some views about the soft-delete (anti?) pattern and whether or not I'm going about it the right way in my code base.

We have a piece of code:

class NoDeletedQuery(Query):
    def __new__(cls, *args, **kwargs):
        if args and hasattr(args[0][0], "deleted_at"):
            return Query(*args, **kwargs).filter_by(deleted_at=None)
        else:
            return object.__new__(cls)


Session = scoped_session(sessionmaker(query_cls=NoDeletedQuery))

This is pretty obvious, just auto adds a filter that ignores the deleted at, but, I think this is a very hard thing to get right, especially with joins and whatnot. This works, but I would like to know what's the recommended idiom? I looked at this stack overflow question:


The mapper solution seems ideal, which would look like:

active_accts_q = (select([
    accounts_t
])
.where(accounts_t.c.deleted_at == None).alias()
)

class Account(Base):
    __table__ = active_accts_q

    def soft_delete(self): 
           self.deleted_at = func.clock_timestamp()
           

The problem I get here is that in my code, I do something like:

account.soft_delete()
Session.commit()

and I get a 

ObjectDeletedError: Instance '<Account at 0x102b2d390>' has been deleted, or its row is otherwise not present.

So, I'm guessing I'm just doing it wrong. Any suggestions? 

For a little context, we're using soft deletes and and for external purposes, we need to simulate a resource deletion but *still* have some attributes of the "deleted" row come back. There are other ways to solve this, which can be solved by triggers and the like, but I think updating a foreign key and adding a new column that represents a deleted view is too complicated for something that can be simplified by just having this "deleted_at" property.

Thanks!





Michael Bayer

unread,
Sep 24, 2012, 2:45:53 PM9/24/12
to sqlal...@googlegroups.com
On Sep 24, 2012, at 1:48 PM, Mahmoud Abdelkader wrote:

I wanted to know some views about the soft-delete (anti?) pattern and whether or not I'm going about it the right way in my code base.

We have a piece of code:

class NoDeletedQuery(Query):
    def __new__(cls, *args, **kwargs):
        if args and hasattr(args[0][0], "deleted_at"):
            return Query(*args, **kwargs).filter_by(deleted_at=None)
        else:
            return object.__new__(cls)


Session = scoped_session(sessionmaker(query_cls=NoDeletedQuery))

This is pretty obvious, just auto adds a filter that ignores the deleted at, but, I think this is a very hard thing to get right, especially with joins and whatnot. This works, but I would like to know what's the recommended idiom? I looked at this stack overflow question:



the recipe we have in this regard is similar and is here:


however, it is still pretty simplistic and doesn't produce the sub-selectable you prefer here.

A hybrid of both approaches can be achieved with select_from():

class SomeQuery(Query):
   def __new__(cls, *arg, **kw):
        if (<we should use deleted_at>):
            return Query(*arg, **kw).select_from(select([table]).where(table.c.deleted_at==None).alias())
       else:
            return object.__new__(cls)

that is, query(cls).select_from(selectable) is roughly equivalent to having a mapper() against "cls" that's mapped to "selectable".

Though the ORM still uses query() internally and I'm not sure that will get around the object deleted exception you're getting.     It's also not clear why you'd be getting that exception anyway, unless you are trying to access an object which corresponds to a row that's marked "deleted".   Only a full stack trace and preferably a simple reproducing case would show exactly why it's getting there.


Mahmoud Abdelkader

unread,
Sep 24, 2012, 8:26:18 PM9/24/12
to sqlal...@googlegroups.com
I ended up going with your recipe and this is my final result:

class NoDeletedQuery(Query):
    """
    Subclass query and provide a pre-fabricated WHERE clause that is
    applied to all queries.

    It uses the enable_assertions() method available in SA v0.5.6 and
    above to bypass the Query object's usual checks.


    """
    def get(self, ident):
        # override get() so that the flag is always checked in the
        # DB as opposed to pulling from the identity map. - this is optional.
        return Query.get(self.populate_existing(), ident)

    def __iter__(self):
        return Query.__iter__(self._criterion_filter())

    def from_self(self, *ent):
        # override from_self() to automatically apply
        # the criterion too.   this works with count() and
        # others.
        return Query.from_self(self._criterion_filter(), *ent)

    def _criterion_filter(self):
        # get the mapper here
        mzero = self._mapper_zero()
        # if its a mapped class and our criteria is satisfied..
        if mzero is not None and 'deleted_at' in mzero.mapped_table.c:
            table = mzero.mapped_table
            not_deleted_q = select([table]).where(table.c.deleted_at == None)
            # need an alias to use this as a mapper because some dbs don't
            # support anonymous sub queries and sqlalchemy enforces.
            not_deleted_q = not_deleted_q.alias()
            return self.enable_assertions(False).select_from(not_deleted_q)
        else:
            return self

Thanks for the tip!

I'm now encountering another issue -- how can I ignore this query class on certain relationships? You can pass the query_class keyword argument to the relationship, but it's only for dynamic relationships. How can I change this behavior?

Thanks again!


--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To post to this group, send email to sqlal...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.

Michael Bayer

unread,
Sep 25, 2012, 10:19:50 AM9/25/12
to sqlal...@googlegroups.com
your query would have to detect when it's being called in that context.    there's several contexts for that including the lazy load and the subquery load, and I'm not sure there's any API supported way to really detect those, as well as other usages of Query such as the one you saw where it attempts to refresh a row.   Like even if the API sent some kind of signal in, you'd have to be aware of *all* such signals, be able to distinguish them, etc.

If it were me, I'd not get in the way of any of those and I'd keep things explicit on the "filter deleted" side - that is, session.query().my_special_option(), or session.our_query(), but I'm pretty sure you don't want to do that.

which leads us to a whole other approach which is to in fact use mapper() again like you did originally, using multiple mappers for the class - either like this:

non_deleted = mapper(MyClass, special_selectable, non_primary=True)

then "non_deleted" (or if you want to reverse it) becomes the target that you can send to relationship() or query():

stuff = relationship(non_deleted)


this kind of thing can also be done using multiple classes mapped individually.   but it seems like you're looking for a lot of cross-talk between the "deleted-OK" and "no deleted" series.   its a tricky problem in general.
Reply all
Reply to author
Forward
0 new messages