Non-repeatable reads, vectorized refresh

245 views
Skip to first unread message

Evgenii

unread,
Aug 23, 2022, 1:50:12 PM8/23/22
to sqlalchemy

Hi there!

Please help me to understand:
I want to make two equal queries, that sends real queries to database each time:

session.query(InstrumentTable).get(instr_id)
<some other code>
session.query(InstrumentTable).get(instr_id)

The problem is that second query returns instance from identity_map.
I know how to hardcode it, but I want a beautiful solution.

  1. No, I cannot use refresh method, because it is not vectorized
    (this part of code is used in custom vectorized refresh method,
    because in our case each transaction is slow and therefore expensive) -
    having vectorized SA refresh is a perfect solution, but it does not exist.
  2. rollback make real second query, but it seems to be hardcode.
  3. session.identity_map._dict = {} hardcode too
  4. Opening another session hardcode too.
  5. Using one of AUTOCOMMIT, READ COMMITTED, READ UNCOMMITTED,
    REPEATABLE READ, SERIALIZABLE as isolation_level seems to be a true way,
    but none of them does not work as I expect.

Mike Bayer

unread,
Aug 23, 2022, 2:04:05 PM8/23/22
to noreply-spamdigest via sqlalchemy


On Tue, Aug 23, 2022, at 1:50 PM, Evgenii wrote:

Hi there!


Please help me to understand:
I want to make two equal queries, that sends real queries to database each time:


session.query(InstrumentTable).get(instr_id)
<some other code>
session.query(InstrumentTable).get(instr_id)


The problem is that second query returns instance from identity_map.
I know how to hardcode it, but I want a beautiful solution.


  1. No, I cannot use refresh method, because it is not vectorized
    (this part of code is used in custom vectorized refresh method,
    because in our case each transaction is slow and therefore expensive) -
    having vectorized SA refresh is a perfect solution, but it does not exist.

"vectorized".   googled it, seems to mean "Vectorization is the process of converting an algorithm from operating on a single value at a time to operating on a set of values at one time. "  OK.   which would mean that get() is also not "vectorized" since it also operates on one value at a time...so..not really sure what that means.

Anyway, don't use query.get(), use session.get() and pass populate_existing


session.get(InstrumentTable, id, populate_existing=True)




  1. rollback make real second query, but it seems to be hardcode.
  2. session.identity_map._dict = {} hardcode too
  3. Opening another session hardcode too.
  4. Using one of AUTOCOMMIT, READ COMMITTED, READ UNCOMMITTED,
    REPEATABLE READ, SERIALIZABLE as isolation_level seems to be a true way,
    but none of them does not work as I expect.


--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.

Evgenii

unread,
Aug 23, 2022, 3:00:34 PM8/23/22
to sqlalchemy

Thanks a lot!
I used “vectorized” for sqlalchemy.orm.Session.refresh method and mean that it would be nice to use:

session.refresh(instances_list)

that make a single query to database. Instead of:

for el in instances_list:
    session.refresh(el)

that make N queries.

like:

res = session.query(InstrumentTable).filter(InstrumentTable.id.in_([id1, id2, ..., idn])).all()
# and sort res with given ids order

is much faster than (especially in low network connections):

instuments = [session.query(InstrumentTable).get(id_) for id_ in [id1, id2, ..., idn]]
вторник, 23 августа 2022 г. в 21:04:05 UTC+3, Mike Bayer:

Mike Bayer

unread,
Aug 23, 2022, 3:34:23 PM8/23/22
to noreply-spamdigest via sqlalchemy


On Tue, Aug 23, 2022, at 3:00 PM, Evgenii wrote:

Thanks a lot!
I used “vectorized” for sqlalchemy.orm.Session.refresh method and mean that it would be nice to use:

session.refresh(instances_list)

that make a single query to database. Instead of:



do this:

session.execute(
   select(Instance).where(Instance.id.in([list_of_ids]), execution_options={"populate_existing": True})
)

all objects refreshed



Evgenii

unread,
Aug 24, 2022, 6:32:46 AM8/24/22
to sqlalchemy

Thank you one more time!
Just in case I leave fixed expression here:

session.execute(
   select(Instance).where(Instance.id.in([list_of_ids])),
    execution_options={"populate_existing": True}
)

Is there any way to set populate_existing = True for engine or sessionmaker?

вторник, 23 августа 2022 г. в 22:34:23 UTC+3, Mike Bayer:

Mike Bayer

unread,
Aug 24, 2022, 9:53:26 AM8/24/22
to noreply-spamdigest via sqlalchemy


On Wed, Aug 24, 2022, at 6:32 AM, Evgenii wrote:

Thank you one more time!
Just in case I leave fixed expression here:

session.execute(
   select(Instance).where(Instance.id.in([list_of_ids])),
    execution_options={"populate_existing": True}
)

Is there any way to set populate_existing = True for engine or sessionmaker?


not right now, there is an issue to maybe have session-wide execution options.      the option can be set on an engine but I'm not sure it takes effect for the ORM when set only at the engine level.


Evgenii

unread,
Aug 29, 2022, 10:08:16 AM8/29/22
to sqlalchemy

What I’m doing wrong?

class ModelObjectTable(BaseTable, BaseOwner):
    __tablename__ = 'model_object'

    model_id = Column(Integer, ForeignKey('model.id', ondelete='cascade'))

    # other columns

    model = relationship('ModelTable', uselist=False, lazy='joined',
                         join_depth=1)

    def __init__(self):
        ...

mo_list = session.execute(select(ModelObjectTable).where(
    ModelObjectTable.id.in_(id_list)),
    execution_options={'populate_existing': True}).all()

get this error:

Traceback (most recent call last):
  File "/home/user/anaconda3/envs/python37/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/home/user/anaconda3/envs/python37/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 417, in _allrows
    for made_row in made_rows
  File "/home/user/anaconda3/envs/python37/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 417, in <listcomp>
    for made_row in made_rows
  File "/home/user/anaconda3/envs/python37/lib/python3.7/site-packages/sqlalchemy/orm/loading.py", line 194, in require_unique
    "The unique() method must be invoked on this Result, "
sqlalchemy.exc.InvalidRequestError: The unique() method must be invoked on this Result, as it contains results that include joined eager loads against collections
среда, 24 августа 2022 г. в 16:53:26 UTC+3, Mike Bayer:
Reply all
Reply to author
Forward
0 new messages