Automatic data partitioning using a custom Session class

47 views
Skip to first unread message

Paul Johnston

unread,
Dec 6, 2012, 11:59:42 AM12/6/12
to sqlal...@googlegroups.com
Hi,

I hope everyone's keeping well. It's been ages since I've been on the list. I do use SQLAlchemy from time to time, but now it generally works so well, that I don't have any questions to ask!

But I would appreciate some thoughts on the approach I've taken with a multi-tennant SaaS web app. It's a multichannel stock management system for online retailers. All the user data is attached to a merchant - products, variations, categories, orders, etc. It's important that one merchant cannot access data belonging to another merchant. When handling a request, the active merchant can be determined from the logged-in user, which is kept in thread local storage.

So I started with lots of code like:
    db.Order.query.filter_by(merchant_id = twa.get_user().merchant_id)
   
Now, this is fine, but it's repetitive, and it's risky for security - it just takes me to forget one filter_by merchant_id and we've got a security vulnerability.

So, what I wanted to do is create a custom session that will do this automatically. It needs to do two things:
 1) Any query object against an entity that has a merchant_id property is filtered on that
 2) Any new object that has a merchant_id property has the property automatically set
 
I don't think a session extension can do (1), so I created MySession subclassing Session, and passed this as class_ to sessionmaker. Here's my initial attempt at MySession:

    class MySession(sa.orm.Session):
        def query(self, *entities, **kwargs):
            query = super(MySession, self).query(*entities, **kwargs)
            for e in entities:
                if e.tables[0].name == 'user':
                    continue
                if e.has_property('merchant_id') and twa.get_user():
                    query = query.filter(e.class_.merchant_id == twa.get_user().merchant_id)
            return query

Now, I faced on major problem - seeing these errors:

    InvalidRequestError: Query.get() being called on a Query with existing criterion.

As a temporary workaround, I edited query.py and disabled the check that causes this. That's got me going for now, although obviously a proper fix is needed. I haven't actually attempted (2) yet, but I will be trying that shortly.

I'd really appreciate some feedback on this, particularly ideas to fix the InvalidRequestError. I think this is a very powerful technique that would be useful to many developers. Once my app is working I will see about writing a tutorial on the matter.

Many thanks,

Paul

Michael Bayer

unread,
Dec 6, 2012, 12:40:27 PM12/6/12
to sqlal...@googlegroups.com

On Dec 6, 2012, at 11:59 AM, Paul Johnston wrote:

> Hi,
>
> I hope everyone's keeping well. It's been ages since I've been on the list. I do use SQLAlchemy from time to time, but now it generally works so well, that I don't have any questions to ask!
>
> But I would appreciate some thoughts on the approach I've taken with a multi-tennant SaaS web app. It's a multichannel stock management system for online retailers. All the user data is attached to a merchant - products, variations, categories, orders, etc. It's important that one merchant cannot access data belonging to another merchant. When handling a request, the active merchant can be determined from the logged-in user, which is kept in thread local storage.
>
> So I started with lots of code like:
> db.Order.query.filter_by(merchant_id = twa.get_user().merchant_id)
>
> Now, this is fine, but it's repetitive, and it's risky for security - it just takes me to forget one filter_by merchant_id and we've got a security vulnerability.
>
> So, what I wanted to do is create a custom session that will do this automatically. It needs to do two things:
> 1) Any query object against an entity that has a merchant_id property is filtered on that
> 2) Any new object that has a merchant_id property has the property automatically set
>
> I don't think a session extension can do (1), so I created MySession subclassing Session, and passed this as class_ to sessionmaker. Here's my initial attempt at MySession:
>
> class MySession(sa.orm.Session):
> def query(self, *entities, **kwargs):
> query = super(MySession, self).query(*entities, **kwargs)
> for e in entities:
> if e.tables[0].name == 'user':
> continue
> if e.has_property('merchant_id') and twa.get_user():
> query = query.filter(e.class_.merchant_id == twa.get_user().merchant_id)
> return query
>
> Now, I faced on major problem - seeing these errors:
>
> InvalidRequestError: Query.get() being called on a Query with existing criterion.

we have a recipe that's all about the "built in filter" which also illustrates how to work around that existing criterion thing:

http://www.sqlalchemy.org/trac/wiki/UsageRecipes/PreFilteredQuery


Paul Johnston

unread,
Dec 7, 2012, 9:17:15 AM12/7/12
to sqlal...@googlegroups.com
Hi Mike,


we have a recipe that's all about the "built in filter" which also illustrates how to work around that existing criterion thing:
http://www.sqlalchemy.org/trac/wiki/UsageRecipes/PreFilteredQuery

Thanks for that - exactly what I needed. I'm still tweaking my app to use this approach, but it seems to be working really well.

When (if) I get round to writing a tutorial I'll let you know. Hope you're keeping well,

Paul
Reply all
Reply to author
Forward
0 new messages