Remove pyinotify autoreloader support

121 views
Skip to first unread message

Tom Forbes

unread,
Oct 6, 2018, 3:32:33 PM10/6/18
to django-d...@googlegroups.com

What do we think about removing the pyinotify functionality in the autoreloader? For context, if pyinotify is installed (Linux only) the autoreloader will attempt to use it to detect file changes over the standard polling-based approach. It is generally much more efficient but is not cross platform, and is not well documented currently IMO.

I’m hacking away at my attempt at refactoring the autoreloader (https://github.com/django/django/pull/8819) and I’ve made some good progress but I am worried about the lack of tests for pyinotify (there are none!). The current code in master works but it’s really hard to refactor in any meaningful way without them. I would very much like to explore adding support for watchdog (https://pythonhosted.org/watchdog/) instead of pyinotify and these changes I’m working on are a means to that end.

Wwatchdog is a library that wraps efficient platform-specific filesystem notifications in cross-platform way, and it includes an inotify implementation.

So I think there are three ways forward:

  1. Spend time and effort adding special tests for PyInotify (testing this stuff is not simple, especially with the current code)

  2. Remove the functionality with an eye to using something like watchdog in the near future

  3. Never really touch the autoreloader much (it is some of the oldest and nastiest code we have).

Does anyone have any strong opinions on this? Does anyone use the pyinotify speedup while developing?

Tom



charettes

unread,
Oct 6, 2018, 3:56:08 PM10/6/18
to Django developers (Contributions to Django itself)
While I understand the complexity of 1. I think shipping a version of Django without
an equivalent inotify replacement such as watchdog could be problematic.

From my personal experience when using Django's development server in Docker
containers sharing local volumes installing pyinotify resulted a significant performance
CPU and I/O improvement.

Simon

Tom Forbes

unread,
Oct 8, 2018, 7:04:20 AM10/8/18
to django-d...@googlegroups.com

Thanks for the feedback! In the pull request I have re-added support for pyinotify with tests, it was not as hard to write them as I believed. They still fail, but I’m working on that!

I’ve found an interesting module confusingly called watchgod. The author says[1] that with the new os.scandir() method added in python 3.5 a stat based method can scan a project of 850 files in about 24ms.

While this is watching a single directory tree and not the entirety of sys.modules I believe we could get similar speedups by being a bit more intelligent in our approach. For example we really don’t need to stat every single django.* module every second. I’m guessing that those change very rarely (unless you’re hacking on Django!), so could maybe stat them every 2–3 seconds or even not at all. We could reduce the stating time on site-packages modules as well for similar reasons.

If we do this right we could potentially get the overhead down to an acceptable level that we don’t necessarily need platform specific watchers.

  1. https://github.com/samuelcolvin/watchgod#why-no-inotify–kqueue–fsevent–winapi-support

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/62d4d8da-b5ca-4494-96ef-f58917b44e63%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tom Forbes

unread,
Nov 4, 2018, 10:14:12 AM11/4/18
to django-d...@googlegroups.com

I have done a lot more reading on this and I really feel pyinotify may not be required, or we could at least switch to a watchman based service for simplicity right away. Projects like uwsgi use a stat based approach, as do most other projects I’ve seen and it appears to work ok for them. watchman has the advantage of being easier to test, as we can mock the actual watchman service itself, and it supplies a sane, simple API across all supported platforms. However it suffers from some limitations I need to work around.

As an example of a fairly large project I took a look at Sentry. After running the test suite (and excluding test modules) there are 1,000 loaded Python modules under sentry.* and 4,000 total entries in sys.modules. Django accounts for about 600, by far the biggest third party module.

During development it seems likely that it’s not worth polling any builtin python modules at all, and drastically reducing the polling time for site packages including Django. That would reduce the number of files to poll by 75%. Perhaps we could even just skip watching site-packages entirely - I think it’s not very common to edit these in a usual development session, and if the user finds themselves editing them we could add a flag to include them in the watch list?

If anyone here has a fairly large Django app they work on could I request that they run this snippet at some point after it has booted and send me the output, to see if Sentry is representative?

from collections import Counter
counter = Counter(m.split('.')[0] for m in sys.modules.keys())
total = sum(counter.values())
project_total = counter['PROJECT NAME HERE']
print('Project:', project_total)
print('Total:', total)
print(counter.most_common(20))

Thanks,

Tom

Josh Smeaton

unread,
Nov 5, 2018, 6:22:54 PM11/5/18
to Django developers (Contributions to Django itself)
I ran a slightly modified version of your test code since we have many modules at "top level". "k3" is one of the top level modules that houses a bunch of apps. In total there are about 1340 python files (including recently squashed migrations and tests). Here's the results:

Project: 1575
Total: 7381
[(u'django', 890),
 (u'k3', 675),
 (u'nltk', 426),
 (u'newrelic', 294),
 (u'google', 172),
 (u'braintree', 171),
 (u'requests', 144),
 (u'rest_framework', 108),
 (u'boto', 101),
 (u'suds', 95),
 (u'numpy', 94),
 (u'core', 90),
 (u'weasyprint', 87),
 (u'PIL', 87),
 (u'docutils', 73),
 (u'cssutils', 70),
 (u'hyper', 68),
 (u'Crypto', 68),
 (u'email', 62),
 (u'oauthlib', 56),
 (u'cryptography', 55),
 (u'pygments', 55),
 (u'celery', 54),
 (u'oauth2client', 53),
 (u'elasticsearch', 50),
 (u'coreapi', 48),
 (u'jinja2', 47),
 ]
Reply all
Reply to author
Forward
0 new messages