Best practices for cleaning up SQLAlchemy engine and sessionmaker instances?

42 views
Skip to first unread message

Vincent Chan

unread,
Jan 21, 2020, 1:31:09 PM1/21/20
to sqlalchemy
According to SQLAlchemy documentation, engine and sessionmaker instances should be created in the application's global scope:

When do I make a sessionmaker?
Just one time, somewhere in your application’s global scope. It should be looked upon as part of your application’s configuration. If your application has three .py files in a package, you could, for example, place the sessionmaker line in your __init__.py file; from that point on your other modules say “from mypackage import Session”. That way, everyone else just uses Session(), and the configuration of that session is controlled by that central point.

Questions:

1. What is the best practice for cleaning up SQLAlchemy engine and sessionmaker instances? Please refer to my example below, while I could call engine.dispose() in main.py, it does not seem like a good practice to clean up a global object from a different module (database.py) in main.py, is there a better way to do it?

2. Do we need to clean up sessionmaker instances? It seems there is no method for closing the sessionmaker instance (sessionmaker.close_all() is deprecated, and session.close_all_sessions() is a session instance method and not sessionmaker method.)

Example:

- I created the engine and sessionmaker object in a module called database.py:

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from contextlib import contextmanager


DB_ENGINE
= create_engine(DB_CONNECTION_STRING, pool_size=5, max_overflow=10)
DB_SESSION
= sessionmaker(bind=DB_ENGINE, autocommit=False, autoflush=True, expire_on_commit=False)


@contextmanager
def db_session(db_session_factory):
   
"""Provide a transactional scope around a series of operations."""
    session
= db_session_factory()
   
try:
       
yield session
        session
.commit()
   
except:
        session
.rollback()
       
raise
   
finally:
        session
.close()



- In my main application main.py, I import the module and use the engine and sessionmaker instances as follows. I cleaned up the engine instance at the end of __main__.

from multiprocessing.pool import ThreadPool
from database import DB_ENGINE, DB_SESSION, db_session


def worker_func(data):
   
with db_session(DB_SESSION) as session:
       
[...database operations using session object...]


if __name__ == '__main__':
   
try:
        data
= [1,2,3,4,5]
       
with ThreadPool(processes=5) as thread_pool:
            results
= thread_pool.map(worker_func, data)
   
except:
       
raise
   
finally:
       
# Cleanup
        DB_ENGINE
.dispose()

Thank you very much!

Jonathan Vanasco

unread,
Jan 21, 2020, 6:04:40 PM1/21/20
to sqlalchemy
There are some sections in the docs that may help you understand things a bit more.

1. Connections- Disposal https://docs.sqlalchemy.org/en/13/core/connections.html#engine-disposal
2. Connections- Threadlocal https://docs.sqlalchemy.org/en/13/core/connections.html#using-the-threadlocal-execution-strategy
3. Connections FAQ - https://docs.sqlalchemy.org/en/13/faq/connections.html
4. Sessions FAQ - https://docs.sqlalchemy.org/en/13/faq/sessions.html

Generally you only need to call `dispose` when you're dealing with multiple processes or forks - as it's designed to help get around situations where you have database connections that are not safe to cross boundaries.

Personally, I prefer to wrap the database intialization in a function.  using your example... something more like this:


database.py 
    DB_ENGINE = None
    DB_SESSION 
= None

    def initialize_db():
       global 
DB_ENGINE, DB_SESSION

       DB_ENGINE 
= create_engine(DB_CONNECTION_STRING, pool_size=5, max_overflow=10)
       DB_SESSION = sessionmaker(bind=DB_ENGINE, autocommit=False, autoflush=True, expire_on_commit=False)


This way you can import the file without establishing a db connection at that moment, and can then control exactly when the connection is made.

My usage patterns are a bit different than yours, so I'm usually stashing everything in a dict that handles the application context... and not using globals.  this seemed easier to explain though.
Reply all
Reply to author
Forward
0 new messages