Database "execute hooks" for instrumentation

Shai Berger

Apr 7, 2017, 9:05:14 AM
to django-d...@googlegroups.com
Hello,

This is an idea that came up during the djangocon-europe conference: Add the
ability to install general instrumentation hooks around the database "execute"
and "executemany" calls.

Such hooks would allow all sorts of interesting features. For one, they could
replace the current special-case allowing assertNumQueries & friends to record
queries out of debug mode (it's an ugly hack, really), but they could also
support my imagined use case -- a context-manager which could prevent database
access during execution of some code (I'm thinking mostly of using it around
"render()" calls and serialization, to make sure all database access is being
done in the view).

My idea for implementation is to keep a thread-local stack of context-
managers, and have them wrap each call of "execute". We could actually even
use one such context-manager instead of the existing CursorDebugWrapper.
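
A minimal sketch of what such a stack might look like, assuming a toy
connection object. `ToyConnection`, `instrument` and `forbid_queries` are
hypothetical names for illustration only, not existing Django API:

```python
from contextlib import ExitStack, contextmanager


class ToyConnection:
    """Hypothetical stand-in for a database connection (not Django's)."""

    def __init__(self):
        self.hooks = []     # stack of context-manager factories
        self.executed = []  # SQL that actually reached the "database"

    @contextmanager
    def instrument(self, hook):
        # Push a hook for the duration of a with-block, then pop it.
        self.hooks.append(hook)
        try:
            yield
        finally:
            self.hooks.pop()

    def execute(self, sql, params=None):
        # Enter every installed hook around the "real" call.
        with ExitStack() as stack:
            for hook in self.hooks:
                stack.enter_context(hook(sql, params))
            self.executed.append(sql)


@contextmanager
def forbid_queries(sql, params):
    # A hook that refuses any database access while it is installed.
    raise RuntimeError("database access is not allowed here: %s" % sql)
    yield  # unreachable; only here to make this a generator function
```

With this, wrapping a block in `with conn.instrument(forbid_queries):` makes
any `execute` inside it raise, which is roughly the "no queries during
render()" use case.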

Thoughts?

Carl Meyer

Apr 7, 2017, 10:48:33 AM
to django-d...@googlegroups.com
Hi Shai,

On 04/07/2017 06:02 AM, Shai Berger wrote:
> This is an idea that came up during the djangocon-europe conference: Add the
> ability to install general instrumentation hooks around the database "execute"
> and "executemany" calls.
>
> Such hooks would allow all sorts of interesting features. For one, they could
> replace the current special-case allowing assertNumQueries & friends to record
> queries out of debug mode (it's an ugly hack, really), but they could also
> support my imagined use case -- a context-manager which could prevent database
> access during execution of some code (I'm thinking mostly of using it around
> "render()" calls and serialization, to make sure all database access is being
> done in the view).

Another use-case is for preventing database access during tests unless
specifically requested by the test (e.g. pytest-django does this,
currently via monkeypatching).

> My idea for implementation is to keep a thread-local stack of context-
> managers, and have them wrap each call of "execute". We could actually even
> use one such context-manager instead of the existing CursorDebugWrapper.

Why a new thread-local instead of explicitly per-connection and stored
on the connection?

Carl

Shai Berger

Apr 7, 2017, 11:10:32 AM
to django-d...@googlegroups.com
On Friday 07 April 2017 17:47:51 Carl Meyer wrote:
> Hi Shai,
>
> On 04/07/2017 06:02 AM, Shai Berger wrote:
> > This is an idea that came up during the djangocon-europe conference: Add
> > the ability to install general instrumentation hooks around the database
> > "execute" and "executemany" calls.
> >
> > Such hooks would allow all sorts of interesting features. For one, they
> > could replace the current special-case allowing assertNumQueries &
> > friends to record queries out of debug mode (it's an ugly hack, really),
> > but they could also support my imagined use case -- a context-manager
> > which could prevent database access during execution of some code (I'm
> > thinking mostly of using it around "render()" calls and serialization,
> > to make sure all database access is being done in the view).
>
> Another use-case is for preventing database access during tests unless
> specifically requested by the test (e.g. pytest-django does this,
> currently via monkeypatching).
>

Yep. This feels right.

> > My idea for implementation is to keep a thread-local stack of context-
> > managers, and have them wrap each call of "execute". We could actually
> > even use one such context-manager instead of the existing
> > CursorDebugWrapper.
>
> Why a new thread-local instead of explicitly per-connection and stored
> on the connection?
>

Sorry for implying that it would be a new thread-local, I just hadn't thought
it through when I wrote this. Of course it goes on the (already thread-local)
connection.

Shai.

Adam Johnson

Apr 13, 2017, 7:33:40 PM
to django-d...@googlegroups.com
django-perf-rec would love this; it currently monkey-patches connection.ops.last_executed_query to listen to all the queries being run.
--
Adam

Shai Berger

Sep 14, 2017, 6:20:57 AM
to django-d...@googlegroups.com
In case you're interested and want to see this in 2.0, please help:

https://code.djangoproject.com/ticket/28595
https://github.com/django/django/pull/9078

On Friday 14 April 2017 02:33:06 Adam Johnson wrote:
> django-perf-rec <https://github.com/yplan/django-perf-rec> would love this,

Anssi Kääriäinen

Sep 15, 2017, 4:09:59 AM
to Django developers (Contributions to Django itself)
Would it make sense to use the same technique used for HTTP request/response middleware? That is, the hook would look a bit like this:

def simple_execute_hook(execute):
    # One-time configuration and initialization.
    def execute_hook(sql, params, many, context):
        # Code to be executed for each cursor.execute() call.
        # If many is True, the final call will be executemany().
        # The context parameter might contain stuff like the
        # connection in use.
        execute(sql, params, many, context)
        # Code to be executed after the SQL has been run.
    return execute_hook

You would then add the hook with the connection's context manager.

The reason I'm asking is that this way the coding style would be immediately familiar if you have used the request/response middlewares.
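
To make the middleware analogy concrete, here is a rough sketch of how such hooks could be chained. `build_chain`, `logging_hook` and `fake_execute` are made-up names, and the way the chain is assembled is an assumption rather than anything decided in this thread:

```python
def fake_execute(sql, params, many, context):
    # Stand-in for the real cursor.execute()/executemany() call.
    context["log"].append("run " + sql)
    return 1


def logging_hook(name):
    # Returns a middleware-style factory: it receives the next `execute`
    # in the chain and returns a callable with the same signature.
    def factory(execute):
        def hook(sql, params, many, context):
            context["log"].append(name + " before")
            result = execute(sql, params, many, context)
            context["log"].append(name + " after")
            return result
        return hook
    return factory


def build_chain(factories, execute):
    # Wrap from the inside out so the first factory ends up outermost,
    # mirroring how request/response middleware is layered.
    for factory in reversed(factories):
        execute = factory(execute)
    return execute
```

Calling the result of `build_chain([logging_hook("outer"), logging_hook("inner")], fake_execute)` runs the "outer" code first on the way in and last on the way out.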

- Anssi

Shai Berger

Sep 15, 2017, 4:36:10 AM
to django-d...@googlegroups.com
That's an interesting suggestion. At first look, it seems a nicer API than the
context manager. I'm a little worried about how errors would be handled,
though.

Shai Berger

Sep 15, 2017, 7:58:15 AM
to django-d...@googlegroups.com
Well, of course, error handling was a red herring. It's up to the hook author.

But while looking deeper into it, I noted something else: The "One time
configuration" part doesn't really make sense. I'll explain:

Since the `execute` argument is really a method on a cursor object which,
potentially, doesn't even exist when the hook is installed, it needs to be
passed into simple_execute_hook() near invocation time, rather than at
registration time; and in fact, within the scope of one hook registration, we
may need to handle separate cursors. So, the external function must be called
again for every execution. Thus, the difference between code in the "one time
configuration" (actually, "each time configuration") part and code in the "code
to execute before query" part becomes arcane and hard to explain.

So, it becomes much more sensible to turn the hook into a wrapper, defined as:

def execute_wrapper(execute, sql, params, many, context):
    # Code to be executed for each cursor.execute() call.
    # If many is True, the final call will be executemany().
    # The context parameter might contain stuff like the
    # connection in use.
    result = execute(sql, params, many, context)
    # Code to be executed after the SQL has been run.
    return result

Or even:

def execute_wrapper(execute, sql, params, many, context):
    try:
        # Code to be executed for each cursor.execute() call.
        return execute(sql, params, many, context)
    finally:
        # Code to be executed after the SQL has been run.
        pass

For this, I just need to figure out the currying of the execute parameter.
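
One possible way to do the currying is `functools.partial`: bind `execute` as
the first argument of each wrapper, innermost first. A sketch under that
assumption, where `run_with_wrappers`, `fake_execute`, `count_queries` and
`block_queries` are hypothetical names:

```python
import functools


def fake_execute(sql, params, many, context):
    # Stand-in for the real cursor.execute()/executemany() call.
    context["ran"].append(sql)
    return len(context["ran"])


def run_with_wrappers(wrappers, execute, sql, params, many, context):
    # Bind `execute` into each wrapper, starting from the last one,
    # so the first wrapper in the list ends up outermost.
    for wrapper in reversed(wrappers):
        execute = functools.partial(wrapper, execute)
    return execute(sql, params, many, context)


def count_queries(execute, sql, params, many, context):
    # Example wrapper: count every attempted query, then pass it on.
    context["count"] = context.get("count", 0) + 1
    return execute(sql, params, many, context)


def block_queries(execute, sql, params, many, context):
    # Example wrapper: never call `execute`, so nothing reaches the database.
    raise RuntimeError("query blocked: %s" % sql)
```

Each wrapper keeps the flat five-argument signature, and the partially
applied callable it receives as `execute` takes the remaining four.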

Shai