garbage collection of instances() generator causing psycopg2.ProgrammingError: named cursor isn't valid anymore


David Vitek

Feb 16, 2022, 4:41:22 PM
to sqlal...@googlegroups.com
Hi all,

I have a situation where we are using transactions and Postgres server-side cursors. The troublesome sequence is something like:

x = s.query(...)
next(x)
if bail_out:
    s.commit()
    del x

I'm not entirely sure that this simple example can reproduce the problem; the true code is more complex, but hopefully this helps paint a picture of the general control flow.

The problem I'm running into has to do with what Python does to garbage collect a still-running generator. The generator is still running since we have not exhausted x (assume the query returns many rows). Python throws GeneratorExit into the generator when it is destroyed, which in this case triggers an except block that attempts to close the cursor. The generator is instances() from SQLAlchemy's loading.py. The del x line causes the generator's refcount to drop to 0, which causes the GeneratorExit to be raised and get caught by:

# from loading.py
def instances(cursor, context):
    ...
    except Exception:
        with util.safe_reraise():
            cursor.close()

Upon attempting to close the cursor, postgres complains "psycopg2.ProgrammingError: named cursor isn't valid anymore" because it was closed when the enclosing transaction closed.
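The garbage-collection mechanism itself can be reproduced without SQLAlchemy or a database at all. A minimal sketch (all names here are made up for illustration):

```python
import gc

log = []

def rows():
    try:
        for i in range(10):
            yield i
    finally:
        # Runs when the generator is closed or garbage collected:
        # Python throws GeneratorExit into the suspended frame,
        # which unwinds through this finally block.
        log.append("cleanup")

g = rows()
next(g)       # generator is now suspended mid-iteration
del g         # refcount hits zero: GeneratorExit is raised inside rows()
gc.collect()  # belt-and-braces for non-refcounting implementations
print(log)    # ['cleanup']
```

In the SQLAlchemy case, the cleanup code reached this way is the cursor.close() call above, which fails because the transaction has already closed the named cursor on the server side.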

We consider it a good thing that the cursor closes when the transaction closes, but we find the exception raised when doing GC undesirable.

I'm wondering if there is any way to make cursors more cognizant of the transactions that they live in, so that they might realize they are already closed when their surrounding transaction has closed?

In the meantime, we are working around this problem by checking whether the session in the query object inside instances() is already dead:

# from loading.py
def instances(cursor, context):
    ...
    except Exception:
        with util.safe_reraise():
            if query.session.is_active:  # <--- New line
                cursor.close()

The main shortcoming of this solution is that it only helps this particular cursor.

While we could try to do things like move the "del x" line up before the commit(), this isn't a real fix since python doesn't promise to GC things promptly. Explicitly closing the cursor would be another option, but it's not readily available to the sqlalchemy client when using the ORM interface. Even if it were, I imagine we'd forget to do it from time to time.
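In plain Python terms, the ordering we want is achievable by closing the generator explicitly before the commit, which runs the cleanup at a deterministic point rather than whenever GC happens. A sketch with made-up names:

```python
events = []

def cursor_rows():
    try:
        for i in range(100):
            yield i
    except BaseException:
        # GeneratorExit lands here when close() is called;
        # stands in for the cursor.close() in loading.py
        events.append("cursor cleanup")
        raise

g = cursor_rows()
next(g)
g.close()                  # deterministic: cleanup runs now, not at GC time
events.append("commit")    # the "transaction" ends after the cursor is gone
print(events)              # ['cursor cleanup', 'commit']
```

The catch, as noted, is that the underlying generator isn't readily reachable through the ORM's Query interface, and an explicit close is easy to forget.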

Any thoughts?


Mike Bayer

Feb 16, 2022, 6:15:07 PM
to noreply-spamdigest via sqlalchemy
hi there -

This issue does not sound familiar. I guess you are trying to discard the Query iterator before it's exhausted, and problems are occurring as a result.

I would need to see a stack trace to understand the issue.  Also, you shouldn't use Query for cases like these, use modern 1.4 style statement executions which will return a Result object that is more robustly designed for different row fetching scenarios (though it still uses loading.py internally).
-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper


To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example.  See  http://stackoverflow.com/help/mcve for a full description.


Mike Bayer

Feb 17, 2022, 12:18:19 AM
to noreply-spamdigest via sqlalchemy
Here's a script that gets as close as possible to what you describe. It uses a psycopg2 server-side cursor, queries for 300 rows, and fetches in groups of 10 at a time, cutting off the result in the middle of a batch. It doesn't show any problem, but I'm not sure what else you have going on that triggers it, so see if you can modify this program to illustrate the problem.


from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy.orm import declarative_base
from sqlalchemy.orm import Session
Base = declarative_base()


class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    data = Column(String)

e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
Base.metadata.drop_all(e)
Base.metadata.create_all(e)

s = Session(e)

s.add_all([A(data=f"{i}") for i in range(300)])
s.commit()

# SS cursor, will fetch every 10 rows
x = iter(s.query(A).yield_per(10))
for i in range(157):
    o = next(x)
s.commit()
del x
import gc
gc.collect()
print("done")