Revisiting lazy middleware initialization


David Evans

Mar 24, 2016, 12:39:50 PM
to Django developers (Contributions to Django itself)
Hi all,

Currently, middleware is initialized lazily on serving the first request, rather than on application start. There may well have been good reasons for this historically, but I don't think they apply any longer. Lazy initialization is unhelpful if a middleware class throws an error (e.g. to report a misconfiguration) because the application will appear to start successfully and only later report the error when a request is made.
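A minimal sketch of the failure mode, using a toy handler rather than Django's real one. The names `MisconfiguredMiddleware`, `LazyHandler` and `first_request_error` are hypothetical and exist only for illustration; the point is that constructing the application succeeds and the error only appears once a request arrives.

```python
class MisconfiguredMiddleware:
    """Hypothetical middleware that validates its configuration on init."""

    def __init__(self):
        raise RuntimeError("middleware is misconfigured")


class LazyHandler:
    """Toy stand-in for a WSGI handler; not Django's actual implementation."""

    middleware_classes = [MisconfiguredMiddleware]

    def __init__(self):
        # Nothing is instantiated at construction time.
        self._middleware = None

    def load_middleware(self):
        self._middleware = [cls() for cls in self.middleware_classes]

    def __call__(self, environ, start_response):
        # Middleware is only loaded when the first request is served.
        if self._middleware is None:
            self.load_middleware()
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]


def first_request_error(handler):
    """Serve one fake request and return the error message, if any."""
    try:
        handler({}, lambda status, headers: None)
    except RuntimeError as exc:
        return str(exc)
    return None


# "Startup" succeeds even though the middleware is broken.
application = LazyHandler()
```

Calling `first_request_error(application)` is what finally surfaces the misconfiguration, well after the process has reported a clean start.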

I'd like to propose initializing middleware when `get_wsgi_application` is called. This solves the problem described above and, as far as I can tell, raises no backwards compatibility issues.

More details on all this below.


### 1. Specific example of the problem

I recently wrote an adapter for the WhiteNoise static file server so it could function as Django middleware as well as WSGI middleware (https://github.com/evansd/whitenoise). WhiteNoise may be unusual in doing a non-trivial amount of work on initialization, but it doesn't seem unreasonable. When used as WSGI middleware any errors are triggered immediately on start up, but not so when used as Django middleware. This makes for a worse developer experience and an increased chance of deployment errors.


### 2. Reasons previously given for lazy initialization

There was some brief discussion in this ticket 4 years ago:
https://code.djangoproject.com/ticket/18577

The reason given there is that "resolving on first request makes most sense, especially for the case where you might not be serving requests at all". Presumably this refers to running non-http-related management commands. But in those cases we never instantiate a WSGI application anyway (wsgi.py is just never imported) so this is no reason not to initialize eagerly when constructing the WSGI application. (Of course, things may have been different 4 years ago.)

Another reason is given in the comments in django.core.handlers.wsgi:
https://github.com/django/django/blob/3c1b572f1815c295878795b183b1957d0df2ca39/django/core/handlers/wsgi.py#L154

This says "Set up middleware if needed. We couldn't do this earlier, because settings weren't available". However `get_wsgi_application` (the only public WSGI API) now calls `django.setup()` before constructing the handler so settings are in fact available.


### 3. Proposed solution

My proposal is simply to amend `get_wsgi_application` as follows:

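    import django
    from django.core.handlers.wsgi import WSGIHandler

    def get_wsgi_application():
        django.setup(set_prefix=False)
        handler = WSGIHandler()
        # Load middleware eagerly so misconfiguration fails at startup.
        handler.load_middleware()
        return handler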

It's possible that this logic could be moved into the handler's `__init__` method. This caused no problems with existing applications when I tried it, but it did cause problems with the test suite, which seems to rely on the old behaviour in places. The above proposal passes all existing tests as is.
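The effect of loading middleware in the factory function can be sketched outside Django with a toy handler. Everything here (`BrokenMiddleware`, `ToyHandler`, `get_application`, `startup_error`) is a hypothetical stand-in, not Django's code. The factory fails at startup, while a handler constructed directly keeps its lazy behaviour.

```python
class BrokenMiddleware:
    """Hypothetical middleware that fails during initialization."""

    def __init__(self):
        raise RuntimeError("middleware is misconfigured")


class ToyHandler:
    """Toy stand-in for a WSGI handler; not Django's actual implementation."""

    middleware_classes = [BrokenMiddleware]

    def __init__(self):
        self._middleware = None

    def load_middleware(self):
        self._middleware = [cls() for cls in self.middleware_classes]

    def __call__(self, environ, start_response):
        # The lazy path is still present for directly constructed handlers.
        if self._middleware is None:
            self.load_middleware()
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]


def get_application():
    """Factory mirroring the proposal: load middleware before returning."""
    handler = ToyHandler()
    handler.load_middleware()
    return handler


def startup_error():
    """Return the error raised while building the application, if any."""
    try:
        get_application()
    except RuntimeError as exc:
        return str(exc)
    return None
```

In this sketch the error is reported by `get_application()` itself, so a broken deployment never gets as far as accepting requests, whereas `ToyHandler()` on its own is unchanged.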


### 4. Backwards compatibility issues

Middleware constructors have no means of accessing the request object or anything that depends on it. They are called right at the start of the handler's `__call__` method before the `request_started` signal is sent and before the `script_prefix` thread-local is set. Therefore it cannot matter, from the middleware class's perspective, whether it is instantiated before or after the first request comes in.
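The ordering argument can be checked with a toy model. This is a simplified sketch under the assumption that the handler loads middleware first and only then sets per-request state; `RecordingMiddleware`, `ToyHandler` and `run` are hypothetical names, and the event labels merely mimic the steps described above.

```python
class RecordingMiddleware:
    """Hypothetical middleware that records what it can see at init."""

    def __init__(self, state):
        # Snapshot of per-request state visible at construction time.
        self.seen_at_init = dict(state)


class ToyHandler:
    """Toy stand-in for a WSGI handler; not Django's actual implementation."""

    def __init__(self):
        self.state = {}   # stand-in for per-request thread-local state
        self.events = []  # order in which things happen
        self.middleware = None

    def load_middleware(self):
        self.events.append("middleware_init")
        self.middleware = [RecordingMiddleware(self.state)]

    def __call__(self, environ):
        # Middleware is loaded before any per-request state exists.
        if self.middleware is None:
            self.load_middleware()
        self.state["script_prefix"] = environ.get("SCRIPT_NAME", "/")
        self.events.append("script_prefix_set")
        self.events.append("request_started")
        return "response"


def run(eager):
    """Serve one request, optionally loading middleware up front."""
    handler = ToyHandler()
    if eager:
        handler.load_middleware()
    handler({"SCRIPT_NAME": "/app"})
    return handler.events, handler.middleware[0].seen_at_init
```

Both `run(eager=True)` and `run(eager=False)` produce the same event order and the same empty snapshot, which is the sense in which the middleware cannot tell the two apart.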


I'm aware this issue probably isn't high on anyone else's priority list, but I think it would count as a genuine -- if small -- improvement to Django.

Thanks,

Dave

Aymeric Augustin

Mar 25, 2016, 6:02:26 AM
to django-d...@googlegroups.com
On 24 Mar 2016, at 17:39, David Evans <drh....@gmail.com> wrote:

> Currently, middleware is initialized lazily on serving the first request, rather than on application start. There may well have been good reasons for this historically, but I don't think they apply any longer.

Indeed, since 1.7 and the app-loading refactor, Django does its best to crash when it starts rather than when it serves the first request. +1 to making this change.

-- 
Aymeric.

Florian Apolloner

Mar 25, 2016, 6:08:35 AM
to Django developers (Contributions to Django itself)
+1 -- Patches welcome :D

David Evans

Mar 25, 2016, 7:12:47 AM
to django-d...@googlegroups.com

Great!

Well, the two-line patch I suggested to `get_wsgi_application` solves the problem, but it still leaves the lazy loading mechanism in the code base. Getting rid of it would obviously be preferable, but is more complex. I'll work on producing a patch for that though.

Dave

--
You received this message because you are subscribed to a topic in the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/django-developers/iIm7M6aUJmU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/3d63f9d8-8781-4b9e-8498-9cc7c091831b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

bliy...@rentlytics.com

Mar 27, 2016, 4:15:57 PM
to Django developers (Contributions to Django itself)
I'm not too familiar with the code you're referencing, but I'm personally really annoyed by lazy loading. It tends to make Selenium tests time out inconsistently in CI, and it gives my bosses the impression that the app is slow, when it is only the first load (usually what they see on new features) that is slow.

-Ben

Fred Stluka

Mar 27, 2016, 6:42:36 PM
to django-d...@googlegroups.com
Ben,

If lazy loading is causing you problems, here's good info on how to
force Django to load everything up front, by calling select_related()
and prefetch_related() in cases where you need to.  And also how to
make that the default via use_for_related_fields and custom managers:
- https://docs.djangoproject.com/en/1.9/topics/db/optimization/

--Fred
Fred Stluka -- mailto:fr...@bristle.com -- http://bristle.com/~fred/
Bristle Software, Inc -- http://bristle.com -- Glad to be of service!
Open Source: Without walls and fences, we need no Windows or Gates.

Shai Berger

Mar 27, 2016, 6:49:33 PM
to django-d...@googlegroups.com
Fred,

The thread (and Ben) is talking about "lazy load" in the sense of "wait until
the first request comes in, only then load things". This used to be the way all
of Django behaved until 1.7 and the app refactor. These days, it only applies
to some parts -- middleware is the topic of this thread, and templates is
another example that comes to mind (at least when cached -- otherwise they may
be loaded on every request).

HTH,
Shai.


Ben Liyanage

Mar 27, 2016, 7:01:39 PM
to django-d...@googlegroups.com
I'm not currently suffering from this.  In the past with .NET I have definitely suffered from some long memory loads of DLLs. 

The stuff I'm doing now is a lot of back-end batch processing, where the speed of the initial response is pretty negligible compared to a process that lasts for 10+ minutes.

Just weighing in on the general design principle of long initial lazy loads in frameworks.

-Ben

--
Ben Liyanage | Software Engineer | Rentlytics, Inc.
Phone: (410) 336-2464 | Email: bliy...@rentlytics.com
1132 Howard Street, San Francisco CA 94107