Refactoring the autoreloader

263 views
Skip to first unread message

Tom Forbes

unread,
Sep 29, 2017, 3:04:17 PM9/29/17
to django-d...@googlegroups.com
Hello,
I've been thinking on and off about how to improve the autoreloader implementation and I wanted to gather some feedback on potential solutions.

For some background, Django uses a fairly basic autoreload implementation that simply polls the last modified time for loaded Python files once a second. While this isn't the most efficient, it does work and has worked quite well for a long time. When running manage.py runserver, the autoreloader will launch a child "manage.py" with the same arguments and this child process actually runs Django and serves requests. To reload, the child process exits with exit code 3 and the parent restarts it. The code is some of the oldest in Django, with a fair bit of it not touched in 9-12 years.

While it works (and I'm a believer in "if it isn't broke don't fix it") there are some architectural and performance issues with the current code:

- Polling every second is not very efficient
- Detecting when the child process has exited during startup (i.e problem in settings.py) is problematic and the code is rather nasty
- i18n files are 'reloaded' when they change in a rather hacky way (resetting private attributes in another module)
- There is limited support for extending the current implementation, and there are cases during development where the parent autoreloader will terminate.

I don't want this email to be too long, so I'm going to summarize what I think would be a good approach to tackling these problems.

1. Refactor the current implementation by removing `pyinotify`, redundant python 2 checks and implement a 'file_changed' signal so other parts of Django can react to file changes (i.e the i18n module can handle resetting it's own state).
2. Add support for the "watchdog" library as a replacement for pyinotify. Watchdog implements file system notifications for all major platforms using a fairly simple API, so we can remove polling and have instant reloading. Also support Watchman, a notification Daemon from Facebook.
3. Add support for more advanced features, like proper handing of startup errors and socket sharing.

I've got a merge request that implements all three stages as a proof of concept, but I think it's far too much a change to be done at once and should be done carefully stage by stage. One and two are fairly simple to implement, but three requires see careful consideration as to the best approach (this message is long enough already, I don't want to describe them here).

Does anyone have any feedback on these ideas? Is it worth persuing even if the current implementation works ok-ish?

François Freitag

unread,
Oct 2, 2017, 7:58:02 PM10/2/17
to django-d...@googlegroups.com
Hi Tom,

Thank you for the great summary.

> 2. Add support for the "watchdog" library as a replacement for
> pyinotify. Watchdog implements file system notifications for all major
> platforms using a fairly simple API, so we can remove polling and have
> instant reloading. Also support Watchman, a notification Daemon from
> Facebook.

Filesystem polling is required for some setup, such as mounting code
using NFS or rsync, for example using vagrant synced folders[1].
Although it does not prevent from using the watchdog library, which
provides a PollingObserver[2], I think it's worth keeping that use case
in mind. The PR has a StatReloader which seem able to handle filesystem
polling, I would suggest either keeping it or delegating the polling to
watchdog.

[1] https://www.vagrantup.com/docs/synced-folders/nfs.html
[2]
https://pythonhosted.org/watchdog/api.html#module-watchdog.observers.polling

Cheers,
François
> --
> You received this message because you are subscribed to the Google
> Groups "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to django-develop...@googlegroups.com
> <mailto:django-develop...@googlegroups.com>.
> To post to this group, send email to django-d...@googlegroups.com
> <mailto:django-d...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/CAFNZOJMT9qDk-4pKKXSJysEQCmd6CGxMZBYZs_7BQs_WbAqL6g%40mail.gmail.com
> <https://groups.google.com/d/msgid/django-developers/CAFNZOJMT9qDk-4pKKXSJysEQCmd6CGxMZBYZs_7BQs_WbAqL6g%40mail.gmail.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.

Tom Forbes

unread,
Oct 7, 2017, 1:43:19 PM10/7/17
to django-d...@googlegroups.com
Thanks for the feedback! Watchdog does include a stat watcher, and I think we can delegate to that. Currently we turn on the file system notifications if the library is available, but we should add a '--stat' flag to disable this it you are running on filesystems that don't support this. At which point we can delegate to the watchdog stat watcher if available, else fallback to our current implementation.

There hasn't been much feedback on this, so I'm guessing there isn't much appetite for these changes. I think improving the current code is a good idea though, even without the filesystem notifications there is some work that could be done.

To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com <mailto:django-developers+unsubsc...@googlegroups.com>.
To post to this group, send email to django-developers@googlegroups.com <mailto:django-developers@googlegroups.com>.
--
You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.

Aymeric Augustin

unread,
Oct 7, 2017, 2:24:31 PM10/7/17
to django-d...@googlegroups.com
Hello Tom,

I think the PR is on the right track. Now it needs to be taken beyond the "proof of concept" stage.

Once you have code, docs, and if possibly some tests — currently there are few due to the difficulty of testing auto reloading, especially edge cases — for steps 1 and 2, I can review changes again and help you get it merged.

Since this is a major overhaul, we'll have to think carefully about how the behavior changes and possible backwards incompatibilities. That said, the autoreloader is a dev tool. Changing it isn't going to break anyone's production. This makes the big-bang approach viable.

Thanks for working on this!

-- 
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.

Tim Allen

unread,
Oct 8, 2017, 7:34:09 AM10/8/17
to Django developers (Contributions to Django itself)
If we need some dogfooding with real-world candidates, I'd be happy to provide some. We have a project which currently takes ~20 minutes to start up (automated data models created by introspecting a database) that might be a good edge-case candidate.

Josh Smeaton

unread,
Oct 11, 2017, 7:42:41 PM10/11/17
to Django developers (Contributions to Django itself)
I'd just like to second (or third) an appetite for this or a similar change. Lately I've started to notice very poor performance (high cpu usage, slow shutdowns, etc) with the django dev server on my project. I've moved to runserver_plus/werkzeug with watchdog in the mean time but it'd be good to have similar improvements via django natively.

Tom Forbes

unread,
Dec 10, 2017, 1:35:52 PM12/10/17
to django-d...@googlegroups.com
Thank you all for your feedback! I have fixed up my PR (https://github.com/django/django/pull/8819) to add support for watchman-based reloading, and it seems to be working well locally. After further investigation I don't think using the Watchdog library is suitable for Django, it seems to be un-maintained right now (which is a real shame). Watchman seems to be faster even with constant reloading though, so I think it's fine to just add support for that.

If anyone is interested they could take the changes to `django/core/management/commands/runserver.py` and `django/utils/autoreload.py`, and apply them to their installations. You just need to install the `pywatchman` package + watchman itself, then run manage.py with '--watchman'. I'd love some real-world feedback as to if this improves reload speed and reduces CPU usage. Unfortunately Watchman won't work on NFS volumes or virtualbox folders, so if you're using Vagrant there isn't much that will help you here, which is also why we cannot make this a default option - watchman will appear to work, but not pick up any changes, if its run on such a volume.

I'll start work on adding documentation and tests next, but I'm quite lost as to how I should test the Watchman integration. Should I be testing using a real Watchman service, or should I mock it and hope that's enough?


To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages