Querying for the existence of an object in the database

18 views
Skip to first unread message

pyplexed

unread,
Mar 31, 2008, 12:02:13 PM3/31/08
to sqlalchemy
The continuing adventures of a newbie SA user...

Could anyone tell me if there is a simple way of setting up a query
which takes an object as an argument and checks for the existence of
an object in the database?

I know I can construct a query hard coding attributes to filter on,
but I'm sure there is a better way to do this.

Ideally, the query would need to handle attributes containing multiple
objects expressing many to many relationships.

I hope the question makes sense.

Many thanks for any help.

Michael Bayer

unread,
Mar 31, 2008, 2:01:15 PM3/31/08
to sqlal...@googlegroups.com

On Mar 31, 2008, at 12:02 PM, pyplexed wrote:

>
> The continuing adventures of a newbie SA user...
>
> Could anyone tell me if there is a simple way of setting up a query
> which takes an object as an argument and checks for the existence of
> an object in the database?

by primary key ? assuming as below you don't have any prior
knowledge of the object's properties or type,
sess
.query
(obj.__class__).get(object_mapper(obj).primary_key_from_instance(obj))

> I know I can construct a query hard coding attributes to filter on,
> but I'm sure there is a better way to do this.

the above is a little ugly so you could make a function that does it,
given any object (or even make it a classmethod on a base class).

> Ideally, the query would need to handle attributes containing multiple
> objects expressing many to many relationships.

Im not entirely sure what this means. You'd like to query for an
object based on its many-to-many association to another object ? This
would look like:

sess
.query
(DesiredClass
).filter(DesiredClass.somerelation.contains(some_given_object)).all()

pyplexed

unread,
Apr 1, 2008, 2:09:50 AM4/1/08
to sqlalchemy
> On Mar 31, 2008, at 12:02 PM, pyplexed wrote:
>
>
>
> > The continuing adventures of a newbie SA user...
>
> > Could anyone tell me if there is a simple way of setting up a query
> > which takes an object as an argument and checks for the existence of
> > an object in the database?
>
> by primary key ? assuming as below you don't have any prior
> knowledge of the object's properties or type,
> sess
> .query
> (obj.__class__).get(object_mapper(obj).primary_key_from_instance(obj))

Actually, I would know the type, so maybe that makes it simpler.

I'm in a situation where I want to store data which is generated
externally to my system, and which is polled regularly but individual
items change relatively infrequently. It seemed sensible to try to
quickly spot the items which are already in the database (i.e. all
fields the same), as these require no further action.

I would then process any of the remaining items which have changed
singe the last data update, and create records for the new versions,
and finally insert any completely new items into the database.

>
> > I know I can construct a query hard coding attributes to filter on,
> > but I'm sure there is a better way to do this.
>
> the above is a little ugly so you could make a function that does it,
> given any object (or even make it a classmethod on a base class).
>
> > Ideally, the query would need to handle attributes containing multiple
> > objects expressing many to many relationships.
>
> Im not entirely sure what this means. You'd like to query for an
> object based on its many-to-many association to another object ? This
> would look like:
>
> sess
> .query
> (DesiredClass
> ).filter(DesiredClass.somerelation.contains(some_given_object)).all()

Sorry. I'm still learning the language here. As things stand, my code
just inserts the new records, including some items in tables
associated via many-to-many relationships.

I'd like to add the logic described above, and be able to query the
database using the same externally derived data object as I am
currently saving in the session and database, using it instead
(initially) as a query argument to see if the data it contains already
exists in the database - i.e. it contains no new information to add,
including any data in joined tables..

I wondered if there was a query syntax in SA which would take a full
or partially populated data object and return a match or matches on
the basis of the populated fields. (Null values would probably have to
be ignored, thinking about it).

Being a newbie, I'm not sure if I'm asking something obvious, or if
I'm asking for the Moon on a stick. :-)

I'll take a look at the suggestions you made. Many thanks for them.

Michael Bayer

unread,
Apr 1, 2008, 10:28:13 AM4/1/08
to sqlal...@googlegroups.com

On Apr 1, 2008, at 2:09 AM, pyplexed wrote:

>
>>
> Actually, I would know the type, so maybe that makes it simpler.
>
> I'm in a situation where I want to store data which is generated
> externally to my system, and which is polled regularly but individual
> items change relatively infrequently. It seemed sensible to try to
> quickly spot the items which are already in the database (i.e. all
> fields the same), as these require no further action.
>
> I would then process any of the remaining items which have changed
> singe the last data update, and create records for the new versions,
> and finally insert any completely new items into the database.
>
>>

It seems like you want to build a simple marshalling system which,
using a session, can convert data from your external system into a
mapped object graph. When doing this, you do the "query or create"
pattern you're hinting at, but thats about as hard as it gets. You
can fully repopulate all data each time without care for whether or
not its "changed" or not, the Session handles that all for you and
will only issue SQL for what's changed. I'd place the information
about how to "marshall" a particular class using classmethods, as below:

Session = scoped_session(....)

class MyClass(object):
@classmethod
def from_external_datasource(cls, arg1, arg2, arg3, arg4):
# locate existing record
obj = Session.query(cls).filter_by(arg1=arg1, arg2=arg2).first()

# create if not found
if not obj:
obj = MyClass(arg1, arg2)
Session.save(obj)

# populate "new" data
obj.arg3 = arg3
obj.arg4 = arg4

return obj

then you could loop through your external structure and build up the
collection using the load() method on each object. You dont need to
worry where the objects come from or if their data is the same or
different, that will all be sorted out when the session flushes. A
related collection, such as many-to-many, same idea:

@classmethod
def from_external_datasource(cls, arg1, arg2, arg3, arg4,
some_collection):
obj = <get as before>
obj.some_collection = [RelatedClass.from_external_datasource(s.id)
for s in some_collection]
return obj

Note that you can entirely rebuild the collection from scratch - the
Session will still only issue INSERTs or DELETEs for the items which
are different.

Michael Bayer

unread,
Apr 1, 2008, 12:51:13 PM4/1/08
to sqlal...@googlegroups.com

im actually making a change to merge() that also might help you with
this particular issue. stay tuned for that.


pyplexed

unread,
Apr 8, 2008, 6:33:12 AM4/8/08
to sqlalchemy
Thanks Michael. The classmethod code worked a treat.

I see merge() has changed in the new release of SA. Does this affect
the solution?

Michael Bayer

unread,
Apr 8, 2008, 10:39:49 AM4/8/08
to sqlal...@googlegroups.com
it does. you can now build up your structure of objects entirely
fresh and just merge() the whole thing; objects that have primary key
attributes set will be "looked up" in the database beforehand. if you
try that route, let me know how it works out.
Reply all
Reply to author
Forward
0 new messages