Improving performance for ORM queries: skip IdentityMap / create detached instances


Theo Nonsense

Dec 7, 2012, 3:01:20 PM12/7/12
to sqlal...@googlegroups.com
I like using the ORM to query and have results translated to objects.  I'm currently using declarative for mapping, and I'm trying to figure out a good way to avoid the overhead of the IdentityMap and other ORM niceties when they're not needed.  Specifically, when dealing with a relatively low number of results (~30k), the overhead of mapper._instance(), identity.add(), mapper.populate_state(), and the other functions that keep track of the state / changes of an object is outweighing the benefit in some cases.  In circumstances where the results are retrieved but never need a database connection after that point, I'd like to be able to avoid that overhead.  In other words, I'd like to have simple objects with business logic that are used throughout the codebase, but also use the ORM wherever it isn't causing a performance bottleneck.  Something like:

Session.query(User).options(instrument_results=False, create_detached=True).all()

This way whenever there is some location where the results do not need to be tracked, I can just specify it in the query and not incur the cost of the tracking.  This is just an example.  I'm looking for ways to achieve the same goal.  They don't necessarily have to be args to .options().

- The first option is to use the ORM to build the queries but issue them directly through the Core / Session.execute().  The results would then be translated to objects manually, which is my first objection to this approach: the ORM already knows how to create the objects.  I'd also like to use the same classes as the ORM ones, so that the business logic can all live in one place.  However, creating the ORM objects means that they'll be instrumented, which is exactly what I'd like to avoid.

Also, I've seen a few other posts here and on StackExchange regarding the notion of read-only or long-lived objects, but none seem to be what I'm looking for.

- Custom Mapper / ClassManager / Instrumentation manager for immutable domain models - https://github.com/andreypopp/saimmutable

This approach is interesting, but doesn't seem to allow toggling for queries / loads that are performance bottlenecks.  I'd like to be able to only enable the quicker / simpler path when needed.  I suppose I could have a table mapped to two different classes through different mappers.  One mapper is the default one and one that ignores instrumentation.  It might be possible to make one a sub-class of the other or provide a mixin for business logic.  Or it might be possible to have one class mapped through two different mappers?  Even if this all worked, it would mean multiple classes for each table and doesn't avoid the overhead of the IdentityMap.


- The second option is to query through the ORM as usual and expunge() the results from the Session afterwards.  This approach still incurs the cost of the IdentityMap and instrumentation while the results are being translated; by the time expunge() can be called, it is too late.

----

It seems that at the very least the Session would need a way to know to ignore the IdentityMap and the mapper would need a way to know to ignore instrumentation.  Any thoughts on how to elegantly solve the problem?  Is there a way to tell the Session to create detached instances, possibly through before_attach()?  Maybe it is possible to have a custom mapper that knows how to ignore instrumentation and identity mapping for specific results based on a flag?

Michael Bayer

Dec 7, 2012, 3:45:42 PM12/7/12
to sqlal...@googlegroups.com
Well, there's a bit of a contradiction here: you're saying you don't want the identity map or mapper._instance() or any of that, but then you're saying "the ORM already knows how to create the objects".    I'd advise a deep dive into the mechanics to learn intimately how that all works.  In particular, the identity map is central to how relationship loading works: eager loading requires such a construct, and lazy loading gets a critical performance boost from it, since objects that are already present can be used without any SQL, or at least without being loaded redundantly.  Instrumentation is required for lazy loading; without lazy loading, you'd need to ensure that all queries for all attributes occur up front.

If you don't like the performance hit of the identity map, a lower-effort path would be to contribute one written in C for us.   Or see if PyPy can help.

I would note that Query can load individual columns, where you do get to skip all the overhead of object loads, and you get back a named-tuple-like object.  So if you don't care about relationship loading and just want tuple-like objects, that mechanism is there right now, and it wouldn't be much effort at all to add a helper that expands a given mapped object into its individual per-column attributes.
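The column-loading path described here looks roughly like the sketch below.  This assumes SQLAlchemy is installed; the User model and data are invented for illustration:

```python
# Column-based loading: querying individual columns returns lightweight
# named-tuple-like rows and skips per-object instrumentation.
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine('sqlite://')          # in-memory database
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([User(name='alice'), User(name='bob')])
session.commit()

# Full-entity query: instrumented objects, tracked in the identity map.
users = session.query(User).order_by(User.id).all()

# Column query: plain tuple-like rows, no instrumentation overhead.
rows = session.query(User.id, User.name).order_by(User.id).all()
print(rows[0].name)   # attribute-style access still works
```

The second query never builds mapped instances at all, which is why it is so much cheaper for large result sets.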

It's really relationships that account for much of the complexity in loading.     Other ORMs take the approach where a relationship attribute basically lazy-loads the related collection every time.   SQLAlchemy's approach saves on SQL, as an already-loaded object caches its related collections and object references.
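The SQL-saving behavior the identity map provides is easiest to see in a toy model.  This is a pure-Python sketch with invented names; SQLAlchemy's real IdentityMap is weak-referencing too, but considerably more involved:

```python
# Minimal sketch of what an identity map buys you: one object per
# primary key, so "loading" the same row twice returns the cached
# instance instead of building (and re-querying) a duplicate.
import weakref

class IdentityMap:
    def __init__(self):
        self._objects = weakref.WeakValueDictionary()

    def get_or_create(self, cls, pk, loader):
        key = (cls, pk)
        obj = self._objects.get(key)
        if obj is None:           # not cached: pay the load cost once
            obj = loader()
            self._objects[key] = obj
        return obj

class Article:
    def __init__(self, article_id):
        self.article_id = article_id

imap = IdentityMap()
a1 = imap.get_or_create(Article, 1, lambda: Article(1))
a2 = imap.get_or_create(Article, 1, lambda: Article(1))
print(a1 is a2)   # same identity, second loader never ran
```

Relationship loading leans on exactly this property: a lazy load whose target is already present can be satisfied without emitting SQL.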



Theo Nonsense

Dec 10, 2012, 6:33:39 PM12/10/12
to sqlal...@googlegroups.com
Thanks for the detailed reply!  Comments inline below.

On Friday, December 7, 2012 12:45:42 PM UTC-8, Michael Bayer wrote:

Well, there's a bit of a contradiction here: you're saying you don't want the identity map or mapper._instance() or any of that, but then you're saying "the ORM already knows how to create the objects".    I'd advise a deep dive into the mechanics to learn intimately how that all works.  In particular, the identity map is central to how relationship loading works: eager loading requires such a construct, and lazy loading gets a critical performance boost from it, since objects that are already present can be used without any SQL, or at least without being loaded redundantly.  Instrumentation is required for lazy loading; without lazy loading, you'd need to ensure that all queries for all attributes occur up front.

I spoke too broadly when mentioning the parts I'd like to avoid.  Mapper._instance() does a lot of work, and I'd only like a flag to skip some of that work, some of the time.  Specifically, I'm looking at some timing profiles and noticing that a significant amount of the overall time is spent in the following (in decreasing order of time spent):

sqlalchemy/orm/identity.py:119(add)
sqlalchemy/orm/mapper.py:1996(_instance)
sqlalchemy/orm/state.py:42(__init__)
sqlalchemy/orm/mapper.py:1953(populate_state)
sqlalchemy/orm/attributes.py:160(__get__)
sqlalchemy/orm/state.py:401(commit_all)
sqlalchemy/orm/instrumentation.py:278(new_instance)
sqlalchemy/orm/identity.py:140(get)
...

The time for each call seems acceptable for a lower quantity of results, given the ease and features of the ORM.  When the quantity gets larger (around 30k for the models I'm dealing with), the overall cost is not always worth it.
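A per-function profile like the one above can be produced with the standard-library profiler; the workload below is just a stand-in for the ORM load, not SQLAlchemy itself:

```python
# Profile a result-building loop and report the top entries by
# cumulative time, the same shape of output as the list above.
import cProfile
import io
import pstats

def fake_load(n):
    # stand-in for building 30k result objects
    return [{'id': i} for i in range(n)]

pr = cProfile.Profile()
pr.enable()
fake_load(30000)
pr.disable()

s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats('cumulative').print_stats(5)
report = s.getvalue()
print('fake_load' in report)
```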

I should've also mentioned that eager loading all of the data that will be needed is acceptable, and is what I'm currently doing via the Core.  When using both ORM and Core, I have a more complex set of models: base models that contain the business logic, and ORM models for persistence, which are subclasses / mixins of the base models.  It seems unnecessary to maintain all of that if I can tell the ORM to create the base models in the cases where performance matters.
 
If you don't like the performance hit of the identity map, a lower-effort path would be to contribute one written in C for us.   Or see if PyPy can help.

Good point.  Both of these options should increase overall performance and not just for the case I'm talking about.  They may be enough to address the performance impact I'm concerned about.  However, I believe there is still a use case here and am interested in a solution.
 
I would note that Query can load individual columns, where you do get to skip all the overhead of object loads, and you get back a named-tuple-like object.  So if you don't care about relationship loading and just want tuple-like objects, that mechanism is there right now, and it wouldn't be much effort at all to add a helper that expands a given mapped object into its individual per-column attributes.

I don't think tuple results are the right approach here.  It would seem that I'd need some tuple->object conversion to be able to use the objects throughout the rest of the application, along with their attached business logic.  And as soon as I create those objects, I'd incur the instrumentation hit.
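The conversion step in question might look like the following sketch.  These are plain stdlib stand-ins: `Row` imitates the named-tuple-like result a column Query returns, and `SimpleArticle` is an uninstrumented class carrying the business logic:

```python
# Turning tuple-like rows into plain objects.  Because SimpleArticle
# is not mapped, no instrumentation fires here, which is the point of
# contention: converting to a *mapped* class would reinstate it.
from collections import namedtuple

Row = namedtuple('Row', ['article_id', 'article_title'])

class SimpleArticle:
    __slots__ = ('article_id', 'article_title')

    def __init__(self, article_id, article_title):
        self.article_id = article_id
        self.article_title = article_title

    def do_interesting(self):
        # stand-in for real business logic
        return self.article_title.upper()

rows = [Row(1, 'a cool title'), Row(2, 'another title')]
articles = [SimpleArticle(*r) for r in rows]
print(articles[0].do_interesting())   # A COOL TITLE
```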
 
It's really relationships that account for much of the complexity in loading.     Other ORMs take the approach where a relationship attribute basically lazy-loads the related collection every time.   SQLAlchemy's approach saves on SQL, as an already-loaded object caches its related collections and object references.

In a lot of circumstances, the ORM is wonderful.  In the cases where it isn't quite as performant, I'd like to be able to skip the pieces that won't be needed.

I'm not familiar enough yet to discuss the internals of _instance() and the IdentityMap; that's the next item to investigate.  I'm hoping it is possible to create a custom Mapper / MapperProperty whose behavior can be toggled per query invocation.

Example:

Base = declarative_base()

class Article(Base):
    __tablename__ = 'articles'
    article_id = Column(Integer, primary_key=True)
    article_title = Column(String)
    
    def __init__(self, article_id=None, article_title=None):
        self.article_id = article_id
        self.article_title = article_title
    
    def do_interesting(self):
        # do some interesting business logic here
        return
    
class Tag(Base):
    __tablename__ = 'tags'
    tag_id = Column(Integer, primary_key=True)
    tag_name = Column(String, unique=True)
    
    # (foreign key / association table omitted for brevity)
    articles = relationship('Article', backref='tags')

# querying using ORM
###################################
articles = session.query(Article).filter(Article.article_title=='a cool title').all()

# querying using Core from ORM -- avoids IdentityMap, but not instrumentation
###################################
q_table = Article.__table__
query = q_table.select(q_table.c.article_title == 'a cool title')
results = session.execute(query)

objs = []
for row in results:
    # custom code to create an Article from a row.
    # incurs penalty of instrumentation
    obj = Article(row[q_table.c.article_id], row[q_table.c.article_title])
    objs.append(obj)

# to avoid instrumentation and IdentityMap, we'd need unmapped classes
###################################

# unmapped article with business logic... i.e. do_interesting()
class SimpleArticle(object):
    def __init__(self, article_id=None, article_title=None):
        self.article_id = article_id
        self.article_title = article_title
    
    def do_interesting(self):
        # do some interesting business logic here
        return

# mapped article - mixin SimpleArticle
class Article(SimpleArticle, Base):
    __tablename__ = 'articles'
    article_id = Column(Integer, primary_key=True)
    article_title = Column(String)

# unmapped tag
class SimpleTag(object):
    def __init__(self, tag_id=None, tag_name=None):
        self.tag_id = tag_id
        self.tag_name = tag_name

# mapped tag
class Tag(SimpleTag, Base):
    __tablename__ = 'tags'
    tag_id = Column(Integer, primary_key=True)
    tag_name = Column(String, unique=True)
    
    # (foreign key / association table omitted for brevity)
    articles = relationship('Article', backref='tags')

# query...
q_table = Article.__table__
query = q_table.select(q_table.c.article_title == 'a cool title')
results = session.execute(query)

objs = []
for row in results:
    # custom code to create a SimpleArticle from a row.
    obj = SimpleArticle(row[q_table.c.article_id], row[q_table.c.article_title])
    objs.append(obj)


# desired query using ORM with flag
###################################
query = session.query(Article).filter(Article.article_title=='a cool title')
query = query.options(instrument=False, detached=True)
articles = query.all()

# end example

Notice how the last one is almost identical to the normal ORM case with the added .options().  No extra unmapped classes to define or deal with.  No need to drop to Core and write the row conversions for each query.
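If the Core route does end up being the fallback, the per-query conversion boilerplate can at least be factored into one generic helper.  This is a hedged sketch: rows are modeled as plain dicts, while a real SQLAlchemy result row also supports this mapping-style access by column name:

```python
# Generic row -> plain-object loader, so the conversion loop in the
# example above doesn't have to be handwritten for every model.
class SimpleArticle:
    def __init__(self, article_id=None, article_title=None):
        self.article_id = article_id
        self.article_title = article_title

def load_detached(cls, rows, fields):
    """Build one uninstrumented `cls` per row, copying only `fields`."""
    return [cls(**{f: row[f] for f in fields}) for row in rows]

rows = [
    {'article_id': 1, 'article_title': 'a cool title'},
    {'article_id': 2, 'article_title': 'another title'},
]
articles = load_detached(SimpleArticle, rows,
                         ('article_id', 'article_title'))
print(articles[0].article_title)   # a cool title
```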

Michael Bayer

Dec 10, 2012, 7:36:58 PM12/10/12
to sqlal...@googlegroups.com
On Dec 10, 2012, at 6:33 PM, Theo Nonsense wrote:

# desired query using ORM with flag
###################################
query = session.query(Article).filter(Article.article_title=='a cool title')
query = query.options(instrument=False, detached=True)
articles = query.all()

# end example

Notice how the last one is almost identical to the normal ORM case with the added .options().  No extra unmapped classes to define or deal with.  No need to drop to Core and write the row conversions for each query.


Well yes I'd certainly go with query.options() to enable such a feature.  

But as far as _instance() and all that goes, I hardly know where to start in trying to explain the situation.   That function and everything around it is not something that took only a few days, or a few weeks, or even a few months to write.  For that whole system to work the way it does took *years*.   It is one of the most complicated things I've ever had to come up with, and the loading approach has been rewritten from scratch many times over.

Years ago, SQLAlchemy was challenged by a competing ORM called Storm.  Storm was a tiny ORM with only a fraction of SQLA's capability, and at that time it was much faster - it does not support eager loading or any up-front collection loading of any kind, or class inheritance loading, or any of the wide range of other features our loading has to take into account.   I spent months refactoring SQLA's loading approach and began to approach Storm's loading speed, but by that time they had rewritten about 60% of Storm's internals in C.  While SQLAlchemy has some C code to speed up some sections, we don't have any in the ORM yet; Storm's C code replaced the majority of their object loading routines.   I continued to spend months reworking, tweaking, and redesigning how this system works, constantly trying to catch up to them.     I achieved dramatic speedups at many junctures - in 0.5 I doubled its speed, and in 0.7 I sped things up quite a bit further.

Some valuable links here are "A Tale of Three Profiles" (http://techspot.zzzeek.org/2010/12/12/a-tale-of-three-profiles/) as well as the AOSA chapter I wrote, which features lots of detail on how this works ("Query and Loading Behavior" at http://www.aosabook.org/en/sqlalchemy.html).   As it stands today, we are again nearly as fast as Storm, with its fraction-of-a-featureset and predominantly C architecture, and in some cases SQLA is even a little faster - and this is just for straight loads.   The whole concept of SQL queries saved due to eager loading and collection caching they can't compete with at all.

One big focus of 0.8 has been not so much loading speed but more fluency in loading - we can now load multiple collections of the same name on different subclasses of the same base class, all simultaneously, for example.   The refactoring to make this happen again took me weeks of effort.   

But I continue to try to find ways to make _instance() go faster.   In recent months I've had several rewrites where it was required for me to spend days rearranging the code completely in order to see if the approach would be faster, only to see that it wasn't, and those many days of effort went in the garbage.

So as it stands, _instance() is by leaps and bounds the most performance-critical system in the whole library, and every line of code added to it makes it slower.   When you suggest that you'd like it to run in two entirely different modes, with a whole series of features disabled, here are the *possible* implications of that (note my qualification of *possible* at the end):

A. all kinds of generalizations and/or conditionals will need to be installed into _instance() and similar functions to make them run in two modes, inherently making them slower and much more complicated

B. we will need to create a whole new series of tests to cover the altered loading contract, as this new loading contract will be entirely different

C. the new loading system won't be able to handle many of the patterns users take for granted, such as eager loading and polymorphic loading

D. users will start using this new mode for "better performance" and will immediately begin requesting more and more features within it that weren't part of its original contract (since you didn't need them for your use case), which will add to my support burden, complicate _instance() even more, slow it down, and push it towards being a redundant "more than one way to do it" versus the current system.   You can say "well, just ignore them", but then here you are asking for something too :).

E. The public API and usage pattern of the ORM becomes much more difficult to explain, more complicated and confusing for new users and veteran users alike.

F. All new bugs and issues in this system I will probably have to support myself, as historically all new features, whether or not contributed by other users, end up being supported by me in any case.   Contributing authors typically hang around for about a year at most.  This is normal, but as the person left taking care of the house, any system added to the library is inherently something I have to support forever, or until I can get it removed.

Now with all of those *possible* outcomes, by *possible*, it means we would need to figure out exactly why each of these unwanted outcomes will *not* happen. 

Now, if there's some approach we can take such that we can actually open _instance() up in a way that makes it more composable without a vast set of complications, that would be very interesting.  In particular, if the loading could be made "pluggable", such that you could produce a separate extension of your own that uses these hooks to implement the loading scheme you want, that would be much more of a win - I would only need to maintain that API, and it would be on your end to maintain the "alternate SQLAlchemy loader" system.   In fact that might be the best approach to take: if your loader system turns out to be really spectacular and simple based on this "loading" API, then it's an easy add.   So even though I've got all those reservations above, there may be architectural ways to solve them.

But this is still a very tall order that is not a priority on my end right now.    I'm spending what time I have slogging through the day-to-day issues needed for upcoming releases, getting 0.8 out, and trying to figure out when I'm going to call it 1.0.
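The "pluggable loader" idea can be sketched in miniature.  To be clear, none of these hook names exist in SQLAlchemy; this is purely illustrative of a loading API where an extension swaps in its own row-to-instance routine:

```python
# Hypothetical pluggable-loading sketch: the library calls a loader
# hook to turn a row into an instance, and a user extension can
# register a lightweight alternative.
class Article:
    def __init__(self, article_id=None, article_title=None):
        self.article_id = article_id
        self.article_title = article_title

def default_loader(cls, row):
    # the "full" path: runs __init__, could do bookkeeping here
    return cls(**row)

def instance_from_row(cls, row, loader=default_loader):
    # the single extension point the library would maintain
    return loader(cls, row)

def plain_loader(cls, row):
    # a user-supplied fast path: bypass __init__ and any bookkeeping
    obj = cls.__new__(cls)
    obj.__dict__.update(row)
    return obj

row = {'article_id': 1, 'article_title': 'a cool title'}
a = instance_from_row(Article, row, loader=plain_loader)
print(a.article_title)   # a cool title
```

The maintenance split Michael describes falls out naturally: the library owns only `instance_from_row`'s contract, while the alternate loader lives, and is supported, outside it.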
