Load sqlalchemy orm model from dict

9,912 views
Skip to first unread message

Daniel Kraus

unread,
Jan 3, 2017, 3:10:54 AM1/3/17
to sqlal...@googlegroups.com
Hi,

how can I load a sqlalchemy orm model from a dictionary?

Let's say I have a `User` model with attributes `id`, `name`, `email`
and a relationship `languages`.

I have a dict with `id`, `name`, `email` from users in my cache,
but not `languages` that's rarely used.

Is it possible to create a `User` model from a `user_dict` that behaves
like I would have queried it with dbsession.query(User).get(42)?
What I mean in particular is that I want that an access to
`user.languages` creates a subquery and populates the attribute.

I now about the dogpile caching example which is similar but looks
way too much for what I want.

PS, I asked the same question on stackoverflow:
http://stackoverflow.com/questions/41158307/load-sqlalchemy-orm-model-from-dict

Maybe I didn't make clear what I want.
Please tell me if I should rephrase my question.

Thanks in advance,
Daniel

mike bayer

unread,
Jan 5, 2017, 10:32:57 AM1/5/17
to sqlal...@googlegroups.com


On 01/03/2017 03:10 AM, Daniel Kraus wrote:
> Hi,
>
> how can I load a sqlalchemy orm model from a dictionary?
>
> Let's say I have a `User` model with attributes `id`, `name`, `email`
> and a relationship `languages`.
>
> I have a dict with `id`, `name`, `email` from users in my cache,
> but not `languages` that's rarely used.
>
> Is it possible to create a `User` model from a `user_dict` that behaves
> like I would have queried it with dbsession.query(User).get(42)?

I will interpret this literally. This would be:

d = {"id": 1, "name": "x", "email": "y"}

u1 = User(**d)

then you'd persist u1 with session.add() / commit() and you're done.

> What I mean in particular is that I want that an access to
> `user.languages` creates a subquery and populates the attribute.

OK, so I can't imagine any connection between "I want a model from a
dict" and "I want user.languages to create a subquery", so again lets
take this one phrase literally.

"user.languages" would imply that "user" is an instance of User that's
already in memory. So "user.languages", being a relationship, will emit
a lazy load to the database to load the list of languages. It's not
exactly a subquery, it's a single SELECT statement.


>
> I now about the dogpile caching example which is similar but looks
> way too much for what I want.

right, so, this is yet another phrase that appears to have no connection
to the previous two phrases that also appear to have no connection :).
So yes, your SO question is unlikely to get any answers because it is
not apparent what you're looking for.


Here's a way that might make it clear - show some pseudocode. That is:


# "I have a dict"
d = {"id": 1, "name": "x", "email": "y"}


# I want to create a user model, something something about attributes,
subqueries

obj = <something something that shows what you want to do>

session.add(obj)

query = session.query(<something>).filter(<something something>)

assert result == <something something that shows what you want>


These are just some examples of things you can try filling in.

Daniel Kraus

unread,
Jan 5, 2017, 8:34:52 PM1/5/17
to sqlal...@googlegroups.com

Hi!

I don't even quote my old message since it's just confusing.
In my head the question made sense ;)

So I try again with a code example:

I have a class `User`:

class User(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    email = Column(String(64))

    languages = relationship('Language', secondary='user_languages')

I already have a lot of users stored in my DB.
And I know that I have, for example, this user in my DB:

user_dict = {
    'id': 23,
    'name': 'foo',
    'email': 'foo@bar',
}

So I have all the attributes but the relations.

Now I want to make a sqlalchemy `User` instance
and kind of register it in sqlalchemy's system
so I can get the `languages` if needed.

user = User(**user_dict)

# Now I can access the id, name email attributes
assert user.id == 23

# but since sqlalchemy thinks it's a new model it doesn't
# lazy load any relationships
assert len(user.languages) == 0
# I want here that the languages for the user with id 23 appear

# So I want that `user` is the same as when I would have done
user_from_db =  DBSession.query(User).get(23)
assert user == user_from_db

The use-case is that I have a big model with lots of complex
relationships but 90% of the time I don't need the data from those.
So I only want to cache the direct attributes plus what else I need
and then load those from the cache like above and be able to
use the sqlalchemy model like I would have queried it from the db.

Hope it makes a bit more sense now.
Sorry for the confusing first question and wasting your time.

Thanks,
Daniel

Jonathan Vanasco

unread,
Jan 6, 2017, 11:01:34 AM1/6/17
to sqlalchemy


On Thursday, January 5, 2017 at 8:34:52 PM UTC-5, Daniel Kraus wrote:

The use-case is that I have a big model with lots of complex

relationships but 90% of the time I don't need the data from those.


If I'm reading your question correctly, most of what sqlalchemy does (and excels at) is specifically keeping people from doing what you're trying to do.  It seems like you're trying to avoid all the work that is done to ensure data integrity across sessions and transactions.  (Which is a common need)

Read up on the `merge` session method (http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#merging)  The dogpile caching section is largely based on that (though goes beyond it).  

You would do something like this:

    user = User(**userdata) 
    user = session.merge(user)

That will merge the user object into the session (and return the merged object).

You will run into problems if your cached data is incomplete though -- AFAIK, there is no way to tell sqlalchemy that you've only loaded data for certain columns.  If you don't populate all the columns in your cache, but have it in the db, I have no idea how to get that info from the db.

mike bayer

unread,
Jan 6, 2017, 11:11:36 AM1/6/17
to sqlal...@googlegroups.com
you're looking for session.merge() but if you're looking to save on a
SELECT you might also want to send in load=False - and if you are
starting with a fresh (non-pickled) object you probably need to call
make_transient_to_detached first so that it acts like it was loaded from
the database first.
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy+...@googlegroups.com>.
> To post to this group, send email to sqlal...@googlegroups.com
> <mailto:sqlal...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

Daniel Kraus

unread,
Jan 6, 2017, 7:21:04 PM1/6/17
to sqlal...@googlegroups.com
Hi!

mike bayer <mik...@zzzcomputing.com> writes:
> you're looking for session.merge() but if you're looking to save on a
> SELECT you might also want to send in load=False - and if you are
> starting with a fresh (non-pickled) object you probably need to call
> make_transient_to_detached first so that it acts like it was loaded from
> the database first.

Nice. Works like I wanted :)

Thanks,
Daniel

Daniel Kraus

unread,
Jan 6, 2017, 7:47:17 PM1/6/17
to sqlal...@googlegroups.com

Jonathan Vanasco <jvan...@gmail.com> writes:
> On Thursday, January 5, 2017 at 8:34:52 PM UTC-5, Daniel Kraus wrote:
>>
>> The use-case is that I have a big model with lots of complex
>> relationships but 90% of the time I don't need the data from those.
>
> If I'm reading your question correctly, most of what sqlalchemy does (and
> excels at) is specifically keeping people from doing what you're trying to
> do.

I think you got me wrong then.

> It seems like you're trying to avoid all the work that is done to
> ensure data integrity across sessions and transactions. (Which is a common
> need)

Nope,
I just want to use a cache where I only store only the DB row for my
model and not all data from relation tables as well.

> Read up on the `merge` session
> method (http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#merging)
> The dogpile caching section is largely based on that (though goes beyond
> it).
>
> You would do something like this:
>
> user = User(**userdata)
> user = session.merge(user)
>
> That will merge the user object into the session (and return the merged
> object).

Thanks.


> You will run into problems if your cached data is incomplete though --
> AFAIK, there is no way to tell sqlalchemy that you've only loaded data for
> certain columns. If you don't populate all the columns in your cache, but
> have it in the db, I have no idea how to get that info from the db.

`session.merge` populates the missing attributes from the db.
But in my case, where I have all the data and know it's up to date,
I can even skip that one select and use the `load=False` parameter.
So for my example I can (thanks to Mike's reply):

make_transient_to_detached(user)
user = session.merge(user, load=False)

and after that I can do my `user.big_n_to_m_relation_data` and it will
correctly query the db.


Thanks,
Daniel

Chris Withers

unread,
Jan 7, 2017, 6:34:33 AM1/7/17
to sqlal...@googlegroups.com
In fact, if your dict is created from a User object in the first place,
I'd just keep that User object, detach it from the session and put it in
your cache...

Chris

Jonathan Vanasco

unread,
Jan 9, 2017, 10:11:20 AM1/9/17
to sqlalchemy


On Friday, January 6, 2017 at 7:47:17 PM UTC-5, Daniel Kraus wrote:
> If I'm reading your question correctly, most of what sqlalchemy does (and
> excels at) is specifically keeping people from doing what you're trying to
> do.

I think you got me wrong then.

> It seems like you're trying to avoid all the work that is done to
> ensure data integrity across sessions and transactions.  (Which is a common
> need)

Nope,
I just want to use a cache where I only store only the DB row for my
model and not all data from relation tables as well.

Yeah, that's the "dangerous" type of stuff I'm talking about. SqlAlchemy tries to keep you from doing that all the time, because that pattern can eventually lead to inconsistencies.  So much of Mike's various details are focused on maintaining database integrity, like how a `commit` defaults to clearing out the session and triggers a reload/new transaction.

We all end up needing to get around this stuff though.  I have many blocks like your eventual solution.  Something that I've found useful is marking 'cached' objects like yours with an attribute like "is_cached" or "is_readonly", and adding an event listener to raise errors if you try to modify them.  This helps stop errors with potentially stale data (which are likely to happen in a few months).


Reply all
Reply to author
Forward
0 new messages