On Sat, Sep 2, 2017 at 6:34 AM, Zsolt Ero <
zsol...@gmail.com> wrote:
> After reading Mike's great blog post:
>
http://techspot.zzzeek.org/2015/02/15/asynchronous-python-and-databases/ and
> SO answer:
https://stackoverflow.com/a/16503103/518169 I wanted to use
> gevent / sqlalchemy / psycopg2 / gunicorn in a new application.
>
> However I have immediately backtracked after a long time Pyramid users
> shared his experience as:
>
>> I had a project using SQLAlchemy, Pyramid and gevent
>> (gevent.monkey.patch_all()).
>> Before engaging to this interesting experience, I suggest you prebook a
>> bed in an asylum. That is the level of problems you need to debug with
>> Python interpreter, stdlib, web server threading, etc. All of those had
>> subtle but hard to debug threading issues that took days and days to debug
>> when you no longer can trust that lower levels of your stack (web server,
>> database connections, etc.) correctly behaving under asyncio. It's
>> especially fun if the problems only appear under a production load.
>> It was very happy moment when I could finally pip uninstall gevent and
>> move back to well proven threading model.
>
>
> Now, I'm in a hole as there seems to be no way to use the stability and
> reliability of SQLAlchemy ORM + psycopg2 in any async way.
I don't know what that person is talking about. If you're using
gevent there is no "threading" and you would want to make sure the web
server container you're using is also not trying to put threads on top
of that. Per
http://docs.gunicorn.org/en/stable/settings.html, the
default for "threads" is 1. Openstack is still largely based on
using eventlet wtih monkeypatching. Even though Openstack services
like Keystone and Nova have migrated towards mod_wsgi, it is still
feasible to run implicit async in production - I'm not aware of
problems with stdlib, or Python interpreters. The main gotcha is
looking out for requests that are blocked either because the other
requests are doing too much CPU work, or because the connection pool
was not configured to allow enough database connections for the much
larger set of concurrent requests per process one usually is trying to
achieve with event-based (but that same problem can happen with
traditional threads too).
I maintain there is usually no point to it unless you are doing a
long-lived request model like websockets, and even then I'd possibly
look into separating out the "business transaction" parts of the
system from the "keep lots of long connections open to clients" aspect
(e.g. running them in different services). And additionally, when
people want to get into async as a means to avoid understanding how
threading works, that's a recipe for disaster too - I think the async
model is more difficult to understand than the threading one, in
contrast to what its proponents always argue, and my evidence for this
is how often I see people using async who don't even understand it has
nothing to do with the GIL or speed.
>
> Some projects like GINO
https://github.com/fantix/gino look interesting as
> it uses SQLAlchemy core and specializes in Postgresql, but I'm not sure I
> want to trust an ORM with 124 commits and 50 github stars.
>
> So I either have to use peewee + peewee_async (an other alpha version of a
> lib), or SQLAlchemy ORM in a blocking way.
There are multiple asyncio interfaces for SQLAlchemy as well, the one
I see the most is
https://github.com/aio-libs/aiopg.
Hence my question: what would
> happen if I try to use blocking Sqlalchemy ORM in an asyncio app? I mean, I
> need async because of slow endpoints, for example doing HTTP requests to 3rd
> parties.
Yeah I would offload that work into a queue and separate the CRUD part
of the application from the "make lots of slow backend requests" part
of it. Explicit async and ORMs are not compatible with each other
from a programming point of view - the philosophy of explicit async is
that no IO should ever occur without the programmer typing out a bunch
of boilerplate saying that it is OK to do IO, whereas the ORM's main
purpose is to transparently fetch and persist data *without*
boilerplate.
Additionally, if you mix up making lots of slow HTTP requests to
backends with your CRUD, you also have database transactions being
held open for a long time just waiting - that will require many more
database connections which are fairly expensive in Postgresql as well
as idle transactions locking rows, which you want to avoid.
> The database and ORM is really fast, at least for the common
> queries what you do in a REST API. So what would happen in theory if I'd try
> to use SQLAlchemy ORM from something like sanic?
>
https://github.com/channelcat/sanic
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
>
http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See
http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to
sqlalchemy+...@googlegroups.com.
> To post to this group, send email to
sqlal...@googlegroups.com.
> Visit this group at
https://groups.google.com/group/sqlalchemy.
> For more options, visit
https://groups.google.com/d/optout.