Delay in accessing table with 2 enum columns using asyncpg + ORM

236 views
Skip to first unread message

Michaël Van de Steene

unread,
Mar 25, 2021, 8:31:13 AM3/25/21
to sqlalchemy
Hi everyone,

We recently started using sqlalchemy in combination with asyncpg and are observing some behaviour we can't quite figure out. I hope this is the right place to get help, if it would be better addressed elsewhere please let me know.

To frame the issue, we have a table with several enum columns. The first time we access this table, there is a roughly 600 ms delay before any results are returned. This seems to apply both for insert and select operations. After that first access, everything seems speedy returning in just a few milliseconds.

I've created a short example application to show this problem:
import asyncio
import enum
from sqlalchemy import Enum, Column, Integer
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()


class A(enum.Enum):
    ONE = 1
    TWO = 2


class B(enum.Enum):
    THREE = 3
    FOUR = 4


class C(enum.Enum):
    FIVE = 5
    SIX = 6


class RecordA(Base):
    __tablename__ = "TableA"

    id = Column(Integer, primary_key=True, autoincrement=True)
    a = Column(Enum(A))


class RecordB(Base):
    __tablename__ = "TableB"

    id = Column(Integer, primary_key=True, autoincrement=True)
    b = Column(Enum(B))
    c = Column(Enum(C))


async def main(db_url):
    engine = create_async_engine(db_url, echo="debug")
    Session = sessionmaker(engine, expire_on_commit=False, class_=AsyncSession)

    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.drop_all)
        await conn.run_sync(Base.metadata.create_all)

    async with Session() as session:
        async with session.begin():
            session.add(RecordA(a=A.ONE))
        async with session.begin():
            session.add(RecordB(b=B.THREE, c=C.FIVE))
        async with session.begin():
            session.add(RecordB(b=B.FOUR, c=C.SIX))


asyncio.run(main(
    "postgresql+asyncpg://postgres:exa...@host.docker.internal/postgres"))

The logging of the three insert operations shows:
2021-03-25 12:14:10,224 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 12:14:10,225 INFO sqlalchemy.engine.Engine INSERT INTO "TableA" (a) VALUES (%s) RETURNING "TableA".id
2021-03-25 12:14:10,225 INFO sqlalchemy.engine.Engine [generated in 0.00011s] ('ONE',)
2021-03-25 12:14:10,296 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 12:14:10,296 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 12:14:10,296 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 12:14:10,299 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 12:14:10,299 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (%s, %s) RETURNING "TableB".id
2021-03-25 12:14:10,299 INFO sqlalchemy.engine.Engine [generated in 0.00012s] ('THREE', 'FIVE')
2021-03-25 12:14:10,906 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 12:14:10,906 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 12:14:10,906 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 12:14:10,924 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 12:14:10,924 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (%s, %s) RETURNING "TableB".id
2021-03-25 12:14:10,924 INFO sqlalchemy.engine.Engine [cached since 0.625s ago] ('FOUR', 'SIX')
2021-03-25 12:14:10,926 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 12:14:10,926 DEBUG sqlalchemy.engine.Engine Row (2,)
2021-03-25 12:14:10,926 INFO sqlalchemy.engine.Engine COMMIT

I've highlighted the timestamps showing the 600 ms delay in red.
It seems as if:
  • An insert with just 1 enum does not incur delay
  • An insert with 2 enums incurs delays
  • A subsequent insert using those same enums doesn't incur delay
I'm at a loss to explain this behaviour. As I mentioned we're quite new to SQLAlchemy. Is there anything we're doing wrong?

The output is generated using:
  • PostgreSQL 13.2
  • Python 3.9.2
  • SQLAlchemy 1.4.2
  • Asyncpg 0.22.0

Finally, run instructions using docker just in case it can help to quickly reproduce:
  1. Save the example as `asyncpg_enum.py` in the current directory
  2. Run docker run -d --name=postgres -e POSTGRES_PASSWORD="example" -p 5432:5432 postgres
  3. Run docker run -it -v ${PWD}/asyncpg_enum.py:/asyncpg_enum.py python bash -c 'pip install sqlalchemy asyncpg && python /asyncpg_enum.py'
Any insight or things to check would be appreciated.

Thanks,
Michael

Mike Bayer

unread,
Mar 25, 2021, 9:02:29 AM3/25/21
to noreply-spamdigest via sqlalchemy
Hi there -

thanks for the clear example.  I just ran it against a several PG databases, including a PG 13 and PG 12 on the local network, and I am not observing any delay of that magnitude, SQL output with timestamps follow.   

what OS are you running on ?       I would say you might want to try replicating these commands to plain asyncpg, but that will not necessarily produce the same sequence as we use prepared statements explicitly in all cases, or try running the equivalent commands with psycopg2 to see if there's some database-specific issue going on.   but you're running against a vanilla docker container so that's a little strange, try running the script from your workstation instead perhaps.

2021-03-25 08:56:24,722 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 08:56:24,723 INFO sqlalchemy.engine.Engine INSERT INTO "TableA" (a) VALUES (%s) RETURNING "TableA".id
2021-03-25 08:56:24,723 INFO sqlalchemy.engine.Engine [generated in 0.00023s] ('ONE',)
2021-03-25 08:56:24,754 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 08:56:24,754 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 08:56:24,754 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 08:56:24,758 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 08:56:24,759 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (%s, %s) RETURNING "TableB".id
2021-03-25 08:56:24,759 INFO sqlalchemy.engine.Engine [generated in 0.00019s] ('THREE', 'FIVE')
2021-03-25 08:56:24,767 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 08:56:24,767 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 08:56:24,768 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 08:56:24,770 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 08:56:24,771 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (%s, %s) RETURNING "TableB".id
2021-03-25 08:56:24,771 INFO sqlalchemy.engine.Engine [cached since 0.01149s ago] ('FOUR', 'SIX')
2021-03-25 08:56:24,772 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 08:56:24,772 DEBUG sqlalchemy.engine.Engine Row (2,)
2021-03-25 08:56:24,772 INFO sqlalchemy.engine.Engine COMMIT
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.

Michaël Van de Steene

unread,
Mar 25, 2021, 10:45:17 AM3/25/21
to sqlalchemy
Thanks for the quick reply! It does appear to be an environment issue on my end then.

I've already tried a few things:
  • Postgres 12
  • Python 3.7
So far no dice. This was all within docker though, I'll try a few more things outside of the docker environment and report back.

Mike Bayer

unread,
Mar 25, 2021, 10:57:17 AM3/25/21
to noreply-spamdigest via sqlalchemy
so with PG12 and python 3.7 you *do* or *do not* see the slowdown?

Michaël Van de Steene

unread,
Mar 25, 2021, 1:54:37 PM3/25/21
to sqlalchemy
Sorry for the slow reply, been dealing with a few other issues while also working on this.
To confirm your question: using PG12 and Python 3.7 I do see the slowdown.

Since then I've taken to some extra tests (all using SQLAlchemy 1.4.2 and asyncpg 0.2.2.0)
  1. PG12.6 (no docker) with Python 3.8 (in docker): 700 ms slowdown
  2. PG12.6 (no docker) with Python 3.9 (in docker): 700 ms slowdown
  3. PG12.6 (no docker) with Python 3.8.8 (no docker): 700 ms slowdown
    • I am also getting a 'OSError: [Errno 9] Bad file descriptor' error in this configuration. Seems as if there is still communication with the database when the asyncio loop is torn down.
      Figure this may be related to the compile process. I've installed it from source using ./configure --enable-optimizations
  4. PG12.6 (no docker) with Python 3.7.10 (no docker): 700 ms slowdown
In short, I am always seeing this delay on the second insert in the example script regardless of python version and use of docker.

In all these tests the PG instance was running on a Ubuntu 20.04.2 virtual machine.
The python (and docker) side of things was running in Debian 10 (in WSL2 on a windows machine, if that matters).

In an attempt to exclude the WSL2 environment, I also attempted to run (dockerised) python in the same Ubuntu 20.04.2 virtual machine but that too is showing the delay.

Mike Bayer

unread,
Mar 25, 2021, 2:25:11 PM3/25/21
to noreply-spamdigest via sqlalchemy
what happens if you run the asyncio.run() part in a loop?   is the 700 ms every time ?   

Michaël Van de Steene

unread,
Mar 25, 2021, 2:33:57 PM3/25/21
to sqlalchemy
what happens if you run the asyncio.run() part in a loop?   is the 700 ms every time ?  
Yes, it happens each time.

I also just ran the same test on a different device, a raspberry pi as that's what I had quickly available. Logging from that run is attached below.
There's a slowdown at two points, and it's substantially larger. This slowdown in the first query is also present in the original log, it's just less noticeable (70 ms).

Gives me the impression the slowdown is caused by some CPU bound process, as the raspberry pi's performance is a lot worse than my workstation.

2021-03-25 18:26:23,807 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 18:26:23,810 INFO sqlalchemy.engine.Engine INSERT INTO "TableA" (a) VALUES (%s) RETURNING "TableA".id
2021-03-25 18:26:23,811 INFO sqlalchemy.engine.Engine [generated in 0.00055s] ('ONE',)
2021-03-25 18:26:24,274 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 18:26:24,275 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 18:26:24,276 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 18:26:24,289 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 18:26:24,291 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (%s, %s) RETURNING "TableB".id
2021-03-25 18:26:24,291 INFO sqlalchemy.engine.Engine [generated in 0.00056s] ('THREE', 'FIVE')
2021-03-25 18:26:28,102 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 18:26:28,102 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 18:26:28,103 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 18:26:28,115 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 18:26:28,116 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (%s, %s) RETURNING "TableB".id
2021-03-25 18:26:28,116 INFO sqlalchemy.engine.Engine [cached since 3.825s ago] ('FOUR', 'SIX')
2021-03-25 18:26:28,119 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 18:26:28,119 DEBUG sqlalchemy.engine.Engine Row (2,)
2021-03-25 18:26:28,120 INFO sqlalchemy.engine.Engine COMMIT

Federico Caselli

unread,
Mar 25, 2021, 3:18:59 PM3/25/21
to sqlalchemy
Hi,

I've tried the script and I also cannot reproduce it:

py3.7 + pg 13.1 (non-docker) on windows

2021-03-25 19:35:15,977 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 19:35:15,979 INFO sqlalchemy.engine.Engine INSERT INTO "TableA" (a) VALUES (?) RETURNING "TableA".id
2021-03-25 19:35:15,980 INFO sqlalchemy.engine.Engine [generated in 0.00080s] ('ONE',)
2021-03-25 19:35:15,999 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 19:35:16,000 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 19:35:16,001 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 19:35:16,004 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 19:35:16,006 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (?, ?) RETURNING "TableB".id
2021-03-25 19:35:16,006 INFO sqlalchemy.engine.Engine [generated in 0.00070s] ('THREE', 'FIVE')
2021-03-25 19:35:16,014 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 19:35:16,014 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 19:35:16,015 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 19:35:16,016 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 19:35:16,017 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (?, ?) RETURNING "TableB".id
2021-03-25 19:35:16,017 INFO sqlalchemy.engine.Engine [cached since 0.0114s ago] ('FOUR', 'SIX')
2021-03-25 19:35:16,018 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 19:35:16,019 DEBUG sqlalchemy.engine.Engine Row (2,)
2021-03-25 19:35:16,020 INFO sqlalchemy.engine.Engine COMMIT

Federico Caselli

unread,
Mar 25, 2021, 3:23:01 PM3/25/21
to sqlalchemy
I can reproduce using postgres in docker. About 1s of delay

2021-03-25 20:22:12,488 INFO sqlalchemy.engine.Engine INSERT INTO "TableA" (a) VALUES (?) RETURNING "TableA".id
2021-03-25 20:22:12,488 INFO sqlalchemy.engine.Engine [generated in 0.00091s] ('ONE',)
2021-03-25 20:22:12,631 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 20:22:12,632 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 20:22:12,632 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 20:22:12,637 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 20:22:12,639 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (?, ?) RETURNING "TableB".id
2021-03-25 20:22:12,639 INFO sqlalchemy.engine.Engine [generated in 0.00055s] ('THREE', 'FIVE')
2021-03-25 20:22:13,692 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 20:22:13,693 DEBUG sqlalchemy.engine.Engine Row (1,)
2021-03-25 20:22:13,694 INFO sqlalchemy.engine.Engine COMMIT
2021-03-25 20:22:13,700 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-03-25 20:22:13,701 INFO sqlalchemy.engine.Engine INSERT INTO "TableB" (b, c) VALUES (?, ?) RETURNING "TableB".id
2021-03-25 20:22:13,701 INFO sqlalchemy.engine.Engine [cached since 1.063s ago] ('FOUR', 'SIX')
2021-03-25 20:22:13,705 DEBUG sqlalchemy.engine.Engine Col ('id',)
2021-03-25 20:22:13,706 DEBUG sqlalchemy.engine.Engine Row (2,)
2021-03-25 20:22:13,706 INFO sqlalchemy.engine.Engine COMMIT

Michaël Van de Steene

unread,
Mar 25, 2021, 3:28:25 PM3/25/21
to sqlalchemy
Thanks for the feedback! So that would imply the issue is in running postgres in docker, I guess?

I'm still somewhat confused by my tests earlier where I also had the problem when running against a postgres instance without docker in a Ubuntu VM.
Then again, that does also use a virtualised network adapter. Perhaps the issue is in that direction...

I'll redo my tests with postgres outside of docker to confirm that aspect.
Any other thoughts of things to check?

Federico Caselli

unread,
Mar 25, 2021, 3:38:29 PM3/25/21
to sqlalchemy
So, I've tried with psycpog2 and I have no issue in docker:
def main(db_url):
    engine = create_engine(db_url, echo="debug")
    Session = sessionmaker(engine, expire_on_commit=False)

    with engine.begin() as conn:
        Base.metadata.drop_all(conn)
        Base.metadata.create_all(conn)

    for i in range(100):
        with Session() as session:
            with session.begin():
                session.add(RecordA(a=A.ONE))
            with session.begin():
                session.add(RecordB(b=B.THREE, c=C.FIVE))
            with session.begin():
                session.add(RecordB(b=B.FOUR, c=C.SIX))
main("postgresql://scott:tiger@localhost/test")


Also using asyncpg directly does not have any issue (also with docker)
async def main(db_url):
    engine = create_async_engine(db_url, echo="debug")

    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.drop_all)
        await conn.run_sync(Base.metadata.create_all)
    conn = await asyncpg.connect(db_url.replace('postgresql+asyncpg','postgresql'))
    async with conn.transaction():
        await conn.execute('INSERT INTO "TableA" (a) VALUES ($1) RETURNING "TableA".id', A.ONE.name)
    async with conn.transaction():
        await conn.execute('INSERT INTO "TableB" (b, c) VALUES ($1, $2) RETURNING "TableB".id', B.THREE.name, C.FIVE.name)
    async with conn.transaction():
        await conn.execute('INSERT INTO "TableB" (b, c) VALUES ($1, $2) RETURNING "TableB".id', B.FOUR.name, C.SIX.name)
    await conn.close()
asyncio.run(main("postgresql+asyncpg://scott:tiger@localhost/test"))

It seems to be somehow connected to sqlalchemy, but I'm not really sure how. Ideas Mike?

Michaël Van de Steene

unread,
Mar 25, 2021, 4:04:34 PM3/25/21
to sqlalchemy
Not sure I can still add much value to the discussion at this point, but just to confirm: on a Hyper-V based Ubuntu VM I always see the delay when using SQLAlchemy's ORM in combination with asyncpg, also without docker.

Mike Bayer

unread,
Mar 25, 2021, 5:41:24 PM3/25/21
to noreply-spamdigest via sqlalchemy
you would need to use:

1. a transaction

2. prepared = prepare(sql)

3. await prepared.execute()

4. the prepare() and execute() need to be inside of an asyncio.Mutex()

try commenting out the asyncio.Mutex() in the SQLAlchemy source and see if that's where it's going wrong.

Federico Caselli

unread,
Mar 26, 2021, 2:30:11 PM3/26/21
to sqlalchemy
The issue is indeed of asyncpg with prepared statements. Here is a reproduction using asyncpg prepared statements

async def main(db_url):
    from datetime import datetime

    engine = create_async_engine(db_url, echo="debug")

    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.drop_all)
        await conn.run_sync(Base.metadata.create_all)
    conn = await asyncpg.connect(db_url.replace('postgresql+asyncpg','postgresql'))
    async with conn.transaction():
        print(datetime.now().isoformat(), 's1 prepare')
        prep = await conn.prepare('INSERT INTO "TableA" (a) VALUES ($1) RETURNING "TableA".id')
        print(datetime.now().isoformat(), 's1 fetch')
        await prep.fetch(A.ONE.name)
    async with conn.transaction():
        print(datetime.now().isoformat(), 's2 prepare')
        prep = await conn.prepare('INSERT INTO "TableB" (b, c) VALUES ($1, $2) RETURNING "TableB".id')
        print(datetime.now().isoformat(), 's2 fetch')
        await prep.fetch(B.THREE.name, C.FIVE.name)
    async with conn.transaction():
        print(datetime.now().isoformat(), 's3 prepare')
        prep = await conn.prepare('INSERT INTO "TableB" (b, c) VALUES ($1, $2) RETURNING "TableB".id')
        print(datetime.now().isoformat(), 's3 fetch')
        await prep.fetch(B.FOUR.name, C.SIX.name)
    await conn.close()

asyncio.run(main("postgresql+asyncpg://postgres:example@localhost"))

This logs:
2021-03-26 19:25:45,813 INFO sqlalchemy.engine.Engine COMMIT
2021-03-26T19:25:45.828939 s1 prepare
2021-03-26T19:25:45.963418 s1 fetch
2021-03-26T19:25:45.973970 s2 prepare
2021-03-26T19:25:46.999589 s2 fetch

2021-03-26T19:25:47.015053 s3 prepare
2021-03-26T19:25:47.019035 s3 fetch

Can you also reproduce with this example Michaël ?
I think you can open an issue on their issue tracker at https://github.com/MagicStack/asyncpg/issues for further assistence. Feel free to use the snipped I created.

Michaël Van de Steene

unread,
Mar 26, 2021, 4:05:21 PM3/26/21
to sqlalchemy
Yes, this also leads to reproduction on my end. Thank you for the help, much appreciated!

I'll reach out to the folks at asyncpg for further follow up.
Reply all
Reply to author
Forward
0 new messages