You may want to consider such a "defensive programming" approach.
ciao
svilen
>
> Loading these objects with any new or clear()ed session would
> drastically impact performance.
I'm assuming the performance increase is due to the reduced overhead in
creating objects? From what I can see, you are still issuing SQL,
still processing result rows, and still populating state in the given
instances, all of which is very time-consuming. Plus, your scheme
won't actually save the object creation overhead, since you can't
do it exactly that way...
>
> I'm currently using this recipe:
> Provide a MapperExtension with create_instance and maintain a custom
> cache for the immutable objects (in addition to the short-lived
> identity maps of the sessions).
>
> from sqlalchemy.orm import MapperExtension
>
> class UniqueByPrikeyMapper(MapperExtension):
>     def create_instance(self, mapper, selectcontext, row, class_):
>         key = row[class_.c.id]  # there's always a primary key "id"
>         try:
>             return get_instance_from_somewhere(class_, key=key)
>         except KeyError:  # instance not in the cache: put it there
>             res = class_.__new__(class_)
>             # reuse the key computed above and hand over the new instance
>             register_instance_somewhere(class_, key=key, instance=res)
>             return res
>
> Now, any persistent instance gets cached fine when loaded. But freshly
> created objects need separate treatment.
This won't work out of the box because a particular object instance can
only be in one Session at a time. For this kind of use case we
provide a method called "merge()", which takes a dont_load=True
argument to create local copies of objects in sessions.
>
> Since we use the auto-generated "id" as key, any new instance needs to
> be "manually" put into the cache after it was flush()ed. I've fiddled
> with a SessionExtension/after_flush(), where the new instances can be
> found in flush_context.uow.new. This could be a place to put new
> instances into the cache, although it seems somewhat hackish.
Why not session.new? "context.uow.new" is gone in 0.5.
> Alternatively, we can re-load any instance after flush() to let
> create_instance() run, and throw away the original reference.
> An attempt to use populate_instance() for caching instances there (and
> only there) failed. populate_instance() is *not* run when the auto-
> generated id is populated after flush().
after_flush() is a decent place to do things like this (or even
after_insert()). populate_instance() corresponds to object state
loaded from the database but it would be wasteful and complex to rely
upon an expunge-reload scheme just for state management.
> Required would be the event
> semantics "always called when instance got sync'ed with the DB".
That is the populate_instance() method on the load side. On the save
side it's after_insert()/after_update().
> Does anyone have a better practice or comments, or see any shortcomings?
What is the specific overhead you are looking to reduce? I don't yet
see much savings here (well, I don't see any, but it's only 10:30 AM
for me). An ORM-external approach could reduce SQL and result
processing overhead far more.
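
To make "ORM-external" a bit more concrete, here is a minimal sketch of a
read-through cache that sits in front of the database and hands out plain
immutable tuples instead of mapped instances. Everything in it is an
assumption for illustration: ReadThroughCache, load_country and the sample
rows are hypothetical names, and load_country stands in for whatever
SELECT you would really issue. A hit skips SQL, row processing and
instance population entirely.

```python
import threading


class ReadThroughCache:
    """Process-wide cache of plain, immutable row data keyed by primary key."""

    def __init__(self, loader):
        self._loader = loader      # callable: key -> row data (would hit the DB)
        self._data = {}
        self._lock = threading.Lock()
        self.loads = 0             # number of times the loader result was used

    def get(self, key):
        with self._lock:
            if key in self._data:
                return self._data[key]
        value = self._loader(key)  # only reached on a cache miss
        with self._lock:
            self.loads += 1
            # setdefault keeps the first stored value if another thread raced us
            return self._data.setdefault(key, value)


def load_country(key):
    # hypothetical stand-in for a real SELECT against the database
    rows = {1: ("DE", "Germany"), 2: ("FR", "France")}
    return rows[key]


countries = ReadThroughCache(load_country)
first = countries.get(1)    # miss: calls load_country
second = countries.get(1)   # hit: served from the dict
```

Since the cached values are plain tuples and not mapped instances, they
never belong to any Session, so the one-Session-per-object restriction
does not come into play.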
>
>> this won't work out of the box because a particular object instance
>> can only be in one Session at a time. For this kind of use case we
>> provide a method called "merge()" which takes a dont_load=True
>> argument to create local copies of objects in sessions.
>
> Yes, the "custom identity map" cannot be a session's identity map; it
> is a plain dict. Otherwise, the objects would indeed have to be merged
> into any new session. But copies of the objects are not desired anyway,
> since identity is to be kept.
>>
So, since these objects are returned by create_instance(), that
means they are getting sent straight into a Session. I'm assuming you
didn't write your own Session, so what happens when the
populate_instance() step gets run on the objects, and several
concurrent threads are all issuing populate_instance() on those items
at the same time? There is no guarantee within the load of objects
that there are no state changes, including internal state variables
like "state.runid", which definitely will not work with concurrent
modifications. Similarly, I don't see how very basic functions
necessary for SQLA's operation, such as object_session(), can possibly
work correctly here, since you are attempting to place the same object
in multiple sessions: an object's session is identified by a single
attribute placed upon the object's state (in 0.4 it's on the object
itself).
Unless your app is using only one Session, that is. Then it could work;
however, it would be extremely difficult to take advantage of multiple
threads, since you'd have to mutex virtually all access to that single
Session.
If there is some central core of "state" that you don't want to
replicate, it's still possible to have many copies of an object all
reference that same state using a proxying pattern. As far as hash
identity goes, the __hash__() and __cmp__() methods work fine in that
regard.
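
A rough sketch of that proxying idea, in plain Python with no SQLAlchemy
involved: SharedState and Proxy are hypothetical names made up for this
example, and __eq__ is used where older Pythons would use __cmp__. Each
Proxy is a separate, cheap object (so each session could hold its own),
while the expensive state exists only once.

```python
class SharedState:
    """The central data that should exist only once per key."""

    def __init__(self, key, name):
        self.key = key
        self.name = name


class Proxy:
    """A cheap per-session copy that delegates attribute access to one SharedState."""

    def __init__(self, state):
        self._state = state

    def __getattr__(self, attr):
        # Only called when normal lookup fails; guard against recursion
        # if _state itself has not been set yet.
        if attr == "_state":
            raise AttributeError(attr)
        return getattr(self._state, attr)

    def __eq__(self, other):
        # Two proxies are "the same object" if they wrap the same key.
        return isinstance(other, Proxy) and self._state.key == other._state.key

    def __hash__(self):
        return hash(self._state.key)


state = SharedState(7, "immutable thing")
a, b = Proxy(state), Proxy(state)   # two distinct copies, one shared state
unique = {a, b}                     # collapses to one entry via __hash__/__eq__
```

The copies compare and hash as equal, so they behave as one identity in
dicts and sets, even though "a is b" is false.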