DATABASES setting for async usage


Andrew Wang

Jan 28, 2022, 8:42:43 PM
to Django developers (Contributions to Django itself)
Hi, I'm interested in helping develop the async ORM engines (link to random PR with aiosqlite and base async engine). The biggest roadblock is writing test cases that run only against async engines. For instance, introspection/test_async.py can't run, because the test suite first needs to migrate using a sync engine as the default database connection and then switch the default alias to an async engine. Additionally, most of the test suite uses the sync engine. The approaches I'm currently considering:

1. Each async engine would get its own CI worker. I think that's a waste of resources, and it wouldn't help Django users who also need to write test cases with both an async engine and a sync engine (for migrations).
2. Async and sync engines for the same database run at the same time. Every async engine requires a parallel sync engine, specified as an alias in the async engine's DATABASES options. This way, while testing, migrations can be performed with a designated sync engine. This is great for the end user who may be async-centric, but it doesn't really resolve the current problem: most of the Django test suite is designed for synchronous db engines, so the default alias's database engine would switch around a bunch of times.
3. Implement a test decorator that switches the default alias connection.
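
A minimal sketch of what option 3 could look like. It does not touch Django: CONNECTIONS is a hypothetical stand-in for the real connection registry, and use_default_alias is an invented name, so treat this as an illustration of the swap-and-restore idea rather than a proposed API.

```python
import functools

# Hypothetical stand-in for Django's connection registry: alias -> engine path.
CONNECTIONS = {
    "default": "django.db.backends.sqlite3",
    "async": "django.db.backends.aiosqlite",
}


def use_default_alias(alias):
    """Run the decorated test with CONNECTIONS['default'] pointing at `alias`."""

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            original = CONNECTIONS["default"]
            CONNECTIONS["default"] = CONNECTIONS[alias]
            try:
                return test_func(*args, **kwargs)
            finally:
                # Always restore the sync default, even if the test fails.
                CONNECTIONS["default"] = original

        return wrapper

    return decorator


@use_default_alias("async")
def current_default_engine():
    # Inside the decorated function, "default" resolves to the async engine.
    return CONNECTIONS["default"]
```

The try/finally is the important part: a failing async test must not leave the rest of the (sync) suite running against the wrong engine.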

Lemme know if that's confusing.

Thanks,
Andrew

Adam Johnson

Jan 31, 2022, 6:48:13 PM
to Django developers (Contributions to Django itself)
Hi Andrew,

I'm afraid I don't know much about async, but I can point you at some recent changes. Andrew Godwin created a PR with the draft of the async ORM API. Carlton recently asked for tests: https://twitter.com/carltongibson/status/1486281689265545221 . Perhaps check out those PRs and see if you can contribute further?

Adam


Andrew Wang

Jan 31, 2022, 9:10:04 PM
to Django developers (Contributions to Django itself)

I'm thinking that for our purposes:

  1. Because several Django projects need a migration from sync to async, which requires time and resources, sync engines can specify a new key called ASYNC_DATABASE_ALIAS whose value is a string naming a database alias in the DATABASES dict. The purpose is two-fold: A) ease Django projects' migration to async and B) allow for testing in Django's internal test suite and in third-party packages. This also helps with backwards compatibility for the templating system: when we call Model.objects.acreate, the a prefix lets us implicitly use the async engine.
  2. Projects that start off as async, or have an async engine as their default, need an option called SYNC_DATABASE_ALIAS that works exactly like the ASYNC_DATABASE_ALIAS option. Its purpose is to fill the gap in DEP 9 by designating an engine for migrations and schema changes, since async engines aren't supposed to handle those. In the database wrapper, we would grab the Schema class from the sync database's database wrapper.

It'll look like this:

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.aiosqlite',
            'SYNC_DATABASE_ALIAS': 'sync',
        },
        'sync': {
            'ENGINE': 'django.db.backends.sqlite3',
        },
    }

or

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'ASYNC_DATABASE_ALIAS': 'async',
        },
        'async': {
            'ENGINE': 'django.db.backends.aiosqlite',
        },
    }

I'm not sure if either of these options is possible, though, since sqlite is tested in memory.
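
To make the lookup concrete, here is a rough sketch of how the two proposed keys might be resolved. resolve_alias is a hypothetical helper (nothing like it exists in Django), and the fallback-to-self behaviour when a key is missing is my assumption.

```python
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "ASYNC_DATABASE_ALIAS": "async",
    },
    "async": {"ENGINE": "django.db.backends.aiosqlite"},
}


def resolve_alias(databases, alias, is_async):
    """Return the alias to use for an async (a-prefixed) or a sync call.

    Hypothetical helper: falls back to `alias` itself when the relevant
    key is not configured for that database.
    """
    key = "ASYNC_DATABASE_ALIAS" if is_async else "SYNC_DATABASE_ALIAS"
    target = databases[alias].get(key, alias)
    if target not in databases:
        raise KeyError(f"{key} of {alias!r} points at unknown alias {target!r}")
    return target
```

With the settings above, an acreate-style call on "default" would be routed to "async", while a plain sync call stays on "default".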

Carlton Gibson

Feb 1, 2022, 3:55:35 AM
to Django developers (Contributions to Django itself)
Hey Andrew. 

Thanks for pushing this work. It's very cool! 

So... Andrew (Godwin)'s current PR blocks out the async interface for QuerySet. For now, it's a light-weight wrapper around the existing sync code, using sync_to_async to hand-off the DB operations to a thread pool executor. (Flavio Curella contributed initial tests last week, so I've currently asked that Simon and Mariusz have a look so we can assemble a TODO list to get it in.) 
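
As a rough illustration of that wrapper pattern (not the code in the PR): Repo is a made-up stand-in for a QuerySet-like object, and loop.run_in_executor stands in for asgiref's sync_to_async so the sketch needs only the standard library.

```python
import asyncio


class Repo:
    """Hypothetical stand-in for a QuerySet-like object."""

    def __init__(self, rows):
        self._rows = list(rows)

    def count(self):
        # The existing blocking implementation.
        return len(self._rows)

    async def acount(self):
        # Hand the blocking call off to the default thread pool executor.
        # (Stand-in for what sync_to_async would do in the real wrapper.)
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.count)
```

So acount() adds no new database logic; it only moves the existing sync call off the event loop thread.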

In terms of your PR, swapping connections __seems__ troublesome. (Perhaps not but 😬.) Until (if ever) we can work async into all the right places, I'd imagined we'd provide both (sync and async) interfaces for the code that needs it. So, for example, and just as an initial thought: since you're coming from the bottom, can we, to begin, add a sync interface to the cursor (execute() and aexecute(), say, using async_to_sync) that would allow the schema editor (& co) to think it's still talking to a sync driver? 🤔
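
Something along these lines, as a sketch only. DualCursor is a hypothetical name, plain sqlite3 plays the part of the async driver, and asyncio.run stands in for async_to_sync (so, unlike the real thing, it only works when no event loop is already running in the thread).

```python
import asyncio
import sqlite3


class DualCursor:
    """Hypothetical cursor exposing both aexecute() and execute()."""

    def __init__(self, connection):
        self._cursor = connection.cursor()

    async def aexecute(self, sql, params=()):
        # Yield to the event loop once, as a real async driver would
        # while waiting on I/O, then run the statement.
        await asyncio.sleep(0)
        self._cursor.execute(sql, params)
        return self

    def execute(self, sql, params=()):
        # Sync facade over the async method. asyncio.run is a stand-in
        # for async_to_sync here (assumption: no loop is running).
        return asyncio.run(self.aexecute(sql, params))

    def fetchall(self):
        return self._cursor.fetchall()


connection = sqlite3.connect(":memory:")
cursor = DualCursor(connection)
cursor.execute("CREATE TABLE t (id INTEGER)")
cursor.execute("INSERT INTO t VALUES (?)", (1,))
result = cursor.execute("SELECT id FROM t").fetchall()
```

The schema editor would only ever call execute(), so it never needs to know the driver underneath is async.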

If you do want to swap connections (even if only to make progress now...):

-  Using an on-disk DB, specifying TEST["NAME"] may allow you to use the existing backend to create the tests, and then the async alias after that. (Exactly how the flow into setup_databases() should work needs some thought, but hard-coding a value or two should be enough for a PoC… 🤔)
- Searching for "sqlite multiple connections to in-memory database" doesn't come up empty. ...
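
One technique that search turns up is SQLite's shared-cache mode for in-memory databases, shown here with the standard library only. The database name shared_demo is arbitrary, and whether an async driver such as aiosqlite can open the same URI from its own thread is something I'd want to verify rather than assume.

```python
import sqlite3

# A named in-memory database with a shared cache: every connection in this
# process that opens the same URI sees the same data.
URI = "file:shared_demo?mode=memory&cache=shared"

first = sqlite3.connect(URI, uri=True)
second = sqlite3.connect(URI, uri=True)

with first:  # the context manager commits on successful exit
    first.execute("CREATE TABLE item (name TEXT)")
    first.execute("INSERT INTO item VALUES ('a')")

# The second connection reads what the first one committed.
rows = second.execute("SELECT name FROM item").fetchall()
```

The database lives only as long as at least one connection to it stays open, so the "sync" connection would have to be kept alive while the async one is in use.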

More generally, I'd imagine setting the backend up as a third-party package would be the way to go, to allow space to experiment. Tim Graham's work on the CockroachDB backend has that run against the Django test suite, which is a good pattern I'd guess. Exactly where low-end driver work meets higher-end, API-like work such as Andrew's PR, I don't think we yet know. 

Kind Regards,

Carlton