App-loading reloaded

1,379 views
Skip to first unread message

Aymeric Augustin

unread,
Dec 15, 2013, 3:08:38 PM12/15/13
to django-d...@googlegroups.com
**Warning: Wall of Text** (and there’s no tl;dr).

History

The reference ticket for the project known as app-loading is https://code.djangoproject.com/ticket/3591.

Joseph Kocherhans attaches an initial patch when he files the ticket in February 2007. That patch changes INSTALLED_APPS to a list of “app directives”, which are simple data structures with a path (eg. “django.contrib.sites”), a label (eg. “sites”), and a verbose name for the admin (eg. _(“Sites”)). It updates various parts of Django that deal with INSTALLED_APPS to account for this change. Vinay Sajip completes this patch and keeps it up to date from May 2007 to May 2009.

In the discussion Johannes Dollinger points out that app definitions shouldn’t contain a verbose_name that’s only used by a contrib app, namely the admin. I disagree for two reasons. Third-party apps could use verbose_name to reference other apps in user interfaces. Model._meta.verbose_name lives just fine in core and there’s no obvious reason to treat applications differently.

Significant efforts occur at the DjangoCon US 2008 sprints but they’re ultimately unsuccessful. A post-mortem is available at: https://code.djangoproject.com/wiki/InstalledAppsRevision. It highlights several issues encountered when attempting to change the format of entries in INSTALLED_APPS.

At this point there’s a lot of bikeshedding about the format of INSTALLED_APPS. It’s a red herring but it’s a recurring theme in further discussions, especially GSoC proposals. Jannis Leidel will eventually propose a sane API in response to Arthur Koziel’s GSoC proposal.

Arthur Koziel developed an extensive patch during GSoC 2010. His proposal is at https://groups.google.com/forum/#!msg/django-developers/TMoght1IWHk/IUEHrOx5pFQJ and his patch at https://github.com/django/django/tree/soc2010/app-loading. Jannis Leidel, Travis Swicegood and Preston Holmes continued his work. Marc Tamlyn described the goals in https://gist.github.com/mjtamlyn/2897920. The most recent version of the branch is at https://github.com/ptone/django/commits/app-loading. Development stalled after DjangoCon US 2012.

While a lot of work went into that patch, unfortunately, it’s hardly a good starting point:
- it was written at various stages of the history of Django, which means it may contain incomplete refactorings due to later changes,
- many parts are a “work in progress” and it’s hard to tell the intent of the successive authors.
However, it contains tons of tests that should still be relevant.

Goals

Many related issues have been closed as duplicates of #3591. As a consequence, its scope has become frighteningly ambitious over the years. This is certainly one of the reasons why it hasn’t been resolved yet.

Since I have time and motivation to work on this project from Dec 10th to Dec 24th, I’ve sorted its goals from the most to the least important and selected three that will hopefully fit in this time frame:

1) Allow apps without a models module or package
2) Provide verbose_name for the admin
3) Provide a reliable initialization signal

In fact, they’re quite independent from one another.

Many more ideas were mentioned on the ticket; here’s a list of goals I’m not actively pursuing, roughly sorted from most to least likely to happen:

4) Allow changing app_label in case of name clashes
5) Allow changing db_prefix
6) Clarify the app cache population sequence
7) Enforce consistency between the contents of INSTALLED_APPS and the AppCache
8) Provide APIs to discover things at arbitrary Python or filesystems subpaths inside apps
9) AppAdmin — #7497
10) App-specific settings in application objects
11) Order of INSTALLED_APPS — #21018
12) Support deploying multiple copies of an app — most likely not worth solving, if possible at all

Once again, they’re quite independent from one another.

If some of these become trivial as a side effect of my work, fantastic. Otherwise, I’ll leave them for later. My work will certainly make them easier to achieve in the future.

I will follow-up with detailed emails for each of my three goals.

--
Aymeric.



Aymeric Augustin

unread,
Dec 16, 2013, 6:02:25 AM12/16/13
to django-d...@googlegroups.com
Merge request

I sent a pull request implementing my first goal: https://github.com/django/django/pull/2076. It allows creating apps without a models module, and that’s it. As far as I can tell it’s fully backwards-compatible. It even includes deprecation paths for some private APIs for extra safety.

It was reviewed by Florian Apolloner and Marc Tamlyn. They had some suggestions which I implemented and they gave it a +1. It was tested by Loic Bistuer. He found a small issue with migrations which I fixed.

I want to merge it soon because it contains huge refactorings that could go stale. Since it has some overlap with Andrew’s work on migrations, it’s safer to perform continuous integration by merging to master often.

I haven’t written documentation yet. It’s just a matter of removing all suggestions that Django apps must contain a models module and adding an entry in the release notes. I’ll do that in time for 1.7 alpha.

Even if failed to reach my other goals, this patch would be worth merging because it improves the design of the app cache and paves the way for interesting features.

If you see a good reason not to merge it, please tell me as soon as possible!

Notes for reviewers

Here’s a digest of the daily notes I sent to potential reviewers while I was working on this patch.

Day 1

My priorities are backwards-compatibility, minimalism, and reviewability. I want something to be merged even if it’s just the first step in a series of incremental improvements.

After spending some time studying Jannis’ and Preston’s latest patches [1] [2], sadly, I came to the conclusion that it would be inefficient for me to start by rebasing Preston’s branch on top of master. First, it’s quite outdated. Then, there’s just much code that I don’t understand to make rebasing a realistic option. Finally, some code was written to support goals that I’ve excluded from my scope, some is a work in progress, and some looks left over from previous iterations of the patch.

I’m sorry. I’ll still try to extract as much value as I can from past efforts. My plan is to get to a minimal working solution with incremental changes, then review Preston’s diff to see if I’ve missed something interesting, and finally steal the tests and docs in bulk. I know that TDD would be cleaner but I don’t have enough motivation and time to do it that way.

I started coding today: https://github.com/aaugustin/django/tree/yet-another-app-loading-branch

[1] https://code.djangoproject.com/attachment/ticket/3591/app-loading.2.diff
[2] https://github.com/ptone/django/compare/master...app-loading

Day 2

Yesterday I replaced all direct calls to the get_app, get_model, etc. functions by method calls on the app_cache object, in order to normalize the code base and make it more obvious when we’re relying on global state. I saw how pervasive our use of the app cache was, wept quietly and moved on.

Today I created an AppConfig class. I gathered from past discussions that there was a consensus on this name. (“App” already means something else.) Then I refactored the app cache to store state in one AppConfig instance per application. I haven't introduced any changes in behavior — at least not knowingly.

The code is too complicated for my tastes because Django allows using models from apps that aren’t in INSTALLED_APPS. I’ve known about this behavior for some time and I’m still wondering why we need it. I’d like to deprecate this possibility by raising a warning and then an exception when a model is imported from an app that isn’t in INSTALLED_APPS. But I’ll leave that for later in order to keep my branch backwards-compatible.

I plan to continue to refactor code that uses the models module as a key. That means replacing all uses of app_cache.get_app(label), which returns a models module, by something that operates on an AppConfig instance. I will go as far as possible with 100% backwards-compatible refactoring before introducing changes.

Day 3

Today I’ve introduced get_app_config() and get_app_configs() methods and refactored get_app() and get_apps() to use them. The first one was tricky because get_app() has grown some cruft over the years that doesn’t seem useful any more. I’ve written a detailed explanation in the commit message.

I’ve then deprecated get_app_package(), get_app_path() and get_app_paths(). They were barely used. I’ve also simplified some parts of the app cache code.

I broke the test suite countless times and it’s still broken right now. That’s good news, because it means that the existing tests go though many code paths in the app cache. I’m relying on that to validate my refactoring. Tomorrow I’ll make sure that the test suite passes after each commit. (I’m rewriting history heavily to keep the branch reviewable.)

Then my next step will be to replace all uses of get_app() and get_apps() in Django and deprecate these two methods. All this aims at making the app cache code as tight as possible and at bringing it to a point where it’s trivial to provide full support for apps without models.py, with a small change in load_app().

Day 4

Today went pretty much as planned. I started by fixing the test suite and deprecating get_app() and get_apps(). Then I made it possible to create apps without a models module. The deprecated (pre-1.6) test runner relied heavily on models module; I refactored it in a separate commit for clarity.

--
Aymeric.

Aymeric Augustin

unread,
Dec 17, 2013, 4:36:58 AM12/17/13
to django-d...@googlegroups.com
On 16 déc. 2013, at 12:02, Aymeric Augustin <aymeric....@polytechnique.org> wrote:

> I sent a pull request implementing my first goal: https://github.com/django/django/pull/2076.

Merged: https://github.com/django/django/compare/fe1389e911b0...4a56a93cc458

--
Aymeric.




Aymeric Augustin

unread,
Dec 20, 2013, 10:37:07 AM12/20/13
to django-d...@googlegroups.com
Merge request

I sent a pull request implementing my second goal: https://github.com/django/django/pull/2089. It allows customizing application names in the admin.

A handful of core developers were kind enough to oversee my efforts. Their feedback on the design has been positive. However, as far as I know, that part of the branch hasn’t received a full review yet. It would be helpful if someone had the time to read through the commits.

Custom app names are implemented in a single commit that changes four lines and doesn’t have tests and docs yet. It doesn’t even deal with breadcrumbs. I’ve included it as a proof-of-concept but I can leave it aside for now if you find that too cavalier.

What I reallly want to merge is the seventeen previous previous commits. Their sole purpose is to make the app cache the only piece of code that knows about the INSTALLED_APPS settings. That’s a prerequisite for changing its format and support custom application configurations.

It’s a fairly large refactoring that, as far as I know, should be backwards compatible. Of course, third-party apps that use INSTALLED_APPS will need to be updated to support the new format. But as long as you aren’t using custom application configurations, everything should still work as usual. Since I’ve rewritten the app cache population code, the import sequence could be different, and that could create import loops in some projects. Unfortunately here’s no way to tell for sure besides asking people to try it. It could just as well suppress import loops in other projects, except there aren’t many active projects with import loops :-)

Again, if you see a good reason not to merge it, please tell me as soon as possible!

The API for customizing an application’s configuration comes in two flavors. Currently the only attribute you can customize is “verbose_name”.

1) You’re the author of a pluggable app called Rock ’n’ roll and you’re fed up with seeing Rock_N_Roll in the admin:

# rock_n_roll/__init__.py

from django.core.apps import base

class AppConfig(base.AppConfig):
verbose_name = "Rock ’n’ roll"

The class has to be called “AppConfig” for Django to discover it.

2) You’re an user of Rock ’n’ roll and you wish it had a more palatable name:

# myproject/apps.py

from django.core.apps import base

class GypsyJazzConfig(base.AppConfig):
name = "rock_n_roll"
verbose_name = "Gypsy jazz"

# myproject/settings.py

INSTALLED_APPS = [
# …
'myproject.apps.GypsyJazzConfig',
# …
]

The class can live anywhere, but it must provide a “name” so that Django knows what application it relates to.

Notes for reviewers

There’s one part of the patch I don’t like much: the AppCache methods listed under ### DANGEROUS METHODS ###. I’ve made it very obvious that they’re private APIs and they’re only used in tests. At this time I don’t have any better ideas. It seems reasonable to decide that we’ll improve it later them if the need arises.

Day 5

I thought about the API. The end result is described above. I’m skipping the details as I’m afraid they would add more confusion than clarity.

Day 6

I began laying the ground for AppConfig classes in INSTALLED_APPS by searching a way to remove all direct references to INSTALLED_APPS to use app_cache.get_app_configs() instead.

Note that app_cache.get_app_configs() calls app_cache.populate(). Therefore iterating on app_cache.get_app_configs() will trigger some imports, while iterating on settings.INSTALLED_APPS didn’t. This may result in different behavior, including import loops, but I’m afraid there’s no way around this. Hopefully the end result will be more deterministic than what we currently have, as Django will populate the app cache earlier.

I don’t know how I’m going to deal with override_settings(INSTALLED_APPS=[…]). There’s a few dozen of these in the test suite. Until now I’ve refrained from implementing AppCache.reload() because it means https://code.djangoproject.com/attachment/ticket/3591/app-loading.2.diff#L165 and that scares me. I might end up reusing AppCache.set_available_apps(), although currently it only supports restricting the set of installed apps.

Day 7

My plan to remove direct references to INSTALLED_APPS with get_app_configs() failed miserably. get_app_configs() calls populate() which imports every application. This triggers import loops. I also spent some time implementing support for override_settings(INSTALLED_APPS=…) at the level of the app cache but that got messy very quickly.

In fact, we need to be able to iterate on applications to discover things (like translations and templates) without importing models. Currently the app cache’s API doesn’t provide that feature. My next idea is to split populate() in two stages:
- First stage does everything that doesn’t require importing models modules. Application modules aren’t supposed to import much, if anything.
- Second stage imports models modules and adds them to the app_config objects.

This design makes it possible to replace `for app in INSTALLED_APPS` with a method of the app cache without triggering an import cascade. This is especially important when it occurs at the module level (which is awful but we have such code in django/template/loaders/app_directories.py).

In the mean time, I deborgified the app cache, which went surprisingly well, and renormalized the “installed” flag. I pushed these changes to master.

Day 8

I finally broke through the wall of circular imports I kept running into as soon as I touched populate() or attempted to use the app cache during the import sequence. I replaced populate() with the two-stage population process.

This had some interesting side effects on the implementation of the app cache, specifically on the registration of models. In my opinion it has become more resilient and straightforward. For instance I find that unconditionally calling an idempotent method is more robust than conditionally calling it depending on some global state that’s mutated throughout a stack of recursive calls.

I also created private APIs to mess with the list of registered apps, because Django has many tests doing just that, and it’s better to do it through an API.

Day 9

I replaced all direct uses of INSTALLED_APPS in django/ and tests/ with methods of the app cache. It was tedious and touched lots of dubious tests.

Day 10

I implemented support for custom AppConfig subclasses in INSTALLED_APPS, and I wrote a proof-of-concept in the admin.

--
Aymeric.

Aymeric Augustin

unread,
Dec 21, 2013, 11:01:19 AM12/21/13
to django-d...@googlegroups.com
Hello,

Based on the feedback I received through several channels (GitHub, IRC, private email) I’m planning to make two API changes before merging this pull request.


(1) Remove auto-discovery of AppConfig in application modules

I implemented this shim to make it possible to take advantage of app configs without changing the format of INSTALLED_APPS. I wanted to increase backwards-compatibility and accelerate adoption in pluggable apps. I also wanted to keep INSTALLED_APPS short and readable in the common case and avoid this pattern:

INSTALLED_APPS = (
‘django.contrib.admin.app.AdminConfig',
'django.contrib.auth.app.AuthConfig',
'django.contrib.contenttypes.app.ContentTypesConfig',
'django.contrib.sessions.app.SessionsConfig',
'django.contrib.messages.app.MessagesConfig',
'django.contrib.staticfiles.app.StaticFilesConfig’,
# et caetera ad nauseam
# cold enterprisey thorny feelings
)

Astute readers will note that MIDDLEWARE_CLASSES looks even worse but I’m not buying the “make it consistently ugly" argument :-)

(Also I haven’t prepared any changes to Django’s contrib apps yet.)

However, I can also list a few reasons not to provide this shim:

- “There should be one-- and preferably only one --obvious way to do it.”
- “Explicit is better than implicit.”
- Django shouldn't encourage writing code in __init__.py files because it’s a common cause of import loops eg. when a submodule attempts to import an object defined in a parent package. Strictly speaking, importing the base AppConfig class and implementing a subclass is safe: it doesn’t even load the Django settings. But it’s still a code smell.
- Python ≥ 3.3 introduces support for namespace packages (PEP 420) by making __init__.py files optional. I’m pretty sure someone will find a good reason to implement a Django app as a namespace package; then they couldn’t take advantage of this convention any more. Besides, skipping __init__.py files entirely might become a good practice in the next years.

One could also say that the explicit version makes it obvious which applications use an application configuration and which don’t, but I don’t find the difference between “using the application’s default AppConfig” and “using Django’s default AppConfig” relevant.

I’m ambivalent about this question. Considering that it can easily be added later, I’m leaning towards not including it for now. The reverse would be much more complicated.


(2) Move the code back into django.apps

I originally wrote all the code in django.apps.

Then I realized it could cause confusion because “apps” shipped with Django are in django.contrib, not in django.apps, and I moved the code to django.contrib.apps. One could argue that apps are a core concept.

However, there are two strong arguments for moving the code back to its original location.

- Most APIs intended to be imported by user code live outside of core: “django.forms”, “django.db.models”, “django.templates”, “django.views”, etc.
- Most of the code in django.core would better live in a shallower hierarchy. For instance “django.core.urlresolvers” doesn’t strike me a superior to “django.urls”. Besides “flat is better than nested”.


Let me if you have additional arguments in favor of or against this changes. I’ll try to implement them tomorrow or on Monday.

--
Aymeric.

Aymeric Augustin

unread,
Dec 22, 2013, 6:14:02 AM12/22/13
to django-d...@googlegroups.com
I’ve updated the pull request with these changes as well as a few minor comments made by reviewers.

https://github.com/django/django/pull/2089

I think it’s worth merging in that state. I still have a list of about 20 things to check, but none of them is likely to change anything fundamental.

--
Aymeric.

Aymeric Augustin

unread,
Dec 22, 2013, 11:10:57 AM12/22/13
to django-d...@googlegroups.com

Aymeric Augustin

unread,
Dec 23, 2013, 8:04:46 AM12/23/13
to django-d...@googlegroups.com
After the merge, some Selenium tests (which I don’t run locally in general) failed on Jenkins. That’s a test isolation issue: an AppDirectoriesFinder instance kept a cache of an incomplete list of applications, preventing later tests from finding static files correctly.

I wanted to review the methods that add and remove apps to the app cache anyway — that’s the “part of the patch I don’t like much” I was talking about in my first email. I’ve sent a new pull request for this purpose: https://github.com/django/django/pull/2103.

It introduces a new API called override_list_settings to alter settings that are lists of values. It allows simply using override_settings(INSTALLED_APPS=…) or override_list_settings(INSTALLED_APPS=…). django.test now contains some references to django.apps but that seems reasonable to be as the app cache is a core concept of Django. Overall the result seems much better, both in terms of API and implementation.

I’ll merge that once I get a +1 from another core dev.

--
Aymeric.

Aymeric Augustin

unread,
Dec 23, 2013, 4:50:39 PM12/23/13
to django-d...@googlegroups.com
The time frame I had allocated to this project expires tomorrow. With reference to my original list, I’ve reached goals 1, 2, 6 and 7, with the caveat that I still have to write documentation.

I ran out of time before looking seriously at goal 3, but it appears to be within reach at this point. The basic ideas are:
- to populate the app cache “early”, for instance in get_wsgi_application() and execute_from_command_line(),
- to call a conventional method on the app config, say startup(), as soon as the app cache is populated.
Apps that need to execute code at startup would put it in AppConfig.startup(). This solution doesn’t use signals, as there is no earlier opportunity to register them.

I know that I didn’t reach the standards of communication and quality expected in the Django community while I was working on this project. I didn’t try to have a discussion on this mailing list. I asked a few core devs who worked on app-loading in the past to look at my code and ruthlessly pushed it as soon as I got their feedback. Given the staggering amount of hard problems I was trudging though, that’s all I could manage, and it was a conscious choice. Now that master has almost reached a stable state, I will do my best to gather feedback and fine tune the code until the 1.7 release.

I’ve mentioned a few times that I was maintaining a list of future tasks I was leaving aside while I was focusing on more pressing issues. Here is it.


** Must do **

* Fix the test suite. There are still a few failures, apparently caused by leaking app cache state.

* Write documentation, especially explain the app[s].py conventions and why `for app in INSTALLED_APPS` doesn't work any more.

* "UserWarning: No fixture named 'comment_tests' found." when running the test suite.

* Review previous patches. Extract relevant tests and docs, if any.

* Review past discussions once again.

* Close #3951 and file individual tickets for remaining tasks. This massive ticket is a nuisance.


** Cleanup / small tasks **

* Rename AppCache to InstalledApps? Then we should also replace app_cache by installed_apps. Originally proposed by Arthur Koziel. The parallel with INSTALLED_APPS is interesting but it may also cause confusion. I would also consider simply Apps and apps, although `from django.apps import apps` doesn't look that good. This must be done by 1.7 alpha if we want to do it.

* Review all uses of get_app_configs() and determine if the results may be cached. If so, add a receiver for the setting_changed signal to reset the cache when INSTALLED_APPS changes.

* Replace TransRealMixin by suitable receivers for the setting_changed signal.

* Refactor AppCommand to be able to handle apps without a models module.

* Adapt the migrations code:
- Refactor signals to stop relying on the models module.
- Introduce MIGRATIONS_MODULE_NAME for consistency with MODELS_MODULE_NAME.

* Refactor get_models and get_migratable_models. They can probably be implemented more straightforwardly.

* Invalidate _get_models_cache appropriately or remove it entirely. This doesn't look like a performance-sensitive area.

* Add a comment to explain how postponing works in populate_models(), or remove it if possible. See https://github.com/django/django/pull/2089#discussion_r8516536.

* populate_models() isn't exactly idempotent. A recursive invocation may return while models aren't fully populated, and it won't set _models_loaded. This could probably be an issue in some cases, but it would be moot if we got rid of the postponing.

* Consider eliminating the check for duplicate imports of models in register_model(), which has become unnecessary with the new project layout introduced in 1.4. It would probably be safer to do it through a deprecation path.

* Review the implementation of dumpdata. It looks like it could be simplified by using AppConfigs.

* Review how much django.apps and django.conf import — they should stay lean to avoid import loops.


** Further work / large tasks **

(Many of these are open questions that could be debated on this mailing list.)

1) Provide an API to run code at startup, along the lines described above.

2) Some fields should be moved from Model._meta to Model._meta.app_config, which should be set as part of the app cache population process.

3) Make it possible to change the label in AppConfig. That's half of the original feature request in #3591 (the other half being verbose_name). Requires 2 — and then lots of work.

4) Make it possible to change various prefixes (admin permissions, database tables, dumpdata/loaddata). That was part of Arthur Koziel's GSoC proposal. Requires 2 — and then some work.

5) Investigate whether app labels currently need to be unique (the docs say so, but the code doesn’t enforce it). See https://groups.google.com/d/msg/django-developers/gzBWU_fUdgc/zlTTkKx-s5QJ. Consider enforcing unicity. Easier to pull off after 3.

6) Store application-specific settings in AppConfig? I'm not sure where to draw the frontier between settings and app configs but this feels like feature creep. Maybe not a wontfix but at least a someday/maybe.

7) Store urls and views in AppConfig? It never seemed a very practical idea, and that’s why we ended up with lean AppConfig objects. We’re certainly not in a position to decide this is a feature we want on Django at this point.

I'm not listing multiple instances of a pluggable app because I believe the current way of formulating this feature request is incorrect and cannot be implemented. For starters, when you import a model class, to which copy of its app does it belong?

--
Aymeric.

Andrew Godwin

unread,
Dec 23, 2013, 5:31:15 PM12/23/13
to django-d...@googlegroups.com
Thanks for your work on this, Aymeric - the reduced scope has really done well, I think.

Couple of comments (I agree that generally the list of work is accurate):

 - There's already a MIGRATION_MODULES setting which lets you set the path to a migration module by app label. We probably just need to rename it for consistency, no need to implement it!
 - I'm not sure that installed_apps is a good name to have to replace app_cache, but I also think that having "cache" in the name is useless. I don't mind just calling it "apps".
 - App labels need to be unique, and we should enforce this. It's always been the case that they had to be, and we've never checked properly. Too much stuff relies on their uniqueness.

The big question is, of course, how feasible is it that this will all* land by the alpha? It's January 20th - less than a month away - and while I'd like to see this land in 1.7, if it misses that date it's out; the release is already complex enough as it is to have a behind-time complex feature drag it back further, and now your dedicated time on this is over it's going to move slower.

Andrew


* By "all", I mean "enough of it that it works, doesn't break anything, and is an obvious improvement"


--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/1E8B5E64-64A1-4561-A1F1-0FDD59D5336A%40polytechnique.org.

Aymeric Augustin

unread,
Dec 24, 2013, 5:23:01 AM12/24/13
to django-d...@googlegroups.com
On 23 déc. 2013, at 23:31, Andrew Godwin <and...@aeracode.org> wrote:

> The big question is, of course, how feasible is it that this will all* land by the alpha? It's January 20th - less than a month away - and while I'd like to see this land in 1.7, if it misses that date it's out; the release is already complex enough as it is to have a behind-time complex feature drag it back further, and now your dedicated time on this is over it's going to move slower.

I will deal with the items listed under “Must do”. I will try to do as many “Cleanup / small tasks” as possible.

“Further work / large tasks” are unlikely to make it into 1.7 unless they turn out to be really easy, but that’s all right. They’re incremental improvements that can be developed and documented in later releases.

Of course I’m happy to provide feedback and supervision to contributors who’d like to tackle some of these items. Just get in touch before you start coding to avoid duplicate work.

--
Aymeric.




Luc Saffre

unread,
Dec 25, 2013, 6:22:20 PM12/25/13
to django-d...@googlegroups.com
On 24/12/13 12:23, Aymeric Augustin wrote:
>
> Of course I�m happy to provide feedback and supervision to
> contributors who�d like to tackle some of these items. Just get in
> touch before you start coding to avoid duplicate work.

Hey Aymeric, thanks for your work! I am not a core developer, but feel
proud to see that Django "finally realizes" something I have always been
preaching (admittedly only in my corner). I have written something
almost similar. Your `AppConfig` corresponds to my `App` class in
https://github.com/lsaffre/djangosite/blob/master/djangosite/djangosite_site.py

But your implementation is more beautiful than mine, and I'm looking
forward to retire my djangosite project as soon as possible.

OTOH djangosite adds another concept, about which you do not seem to
speak: I currently call it a "Site" class, but to stay with your
terminology we might also call it a "ProjectConfig" class.

http://site.lino-framework.org/usage.html
http://site.lino-framework.org/application.html

I agree that my project is not very well documented, but I believe that
my work might be useful for Django, and I am ready to share it and to
help as much as possible in case you are interested.

Luc

Aymeric Augustin

unread,
Dec 26, 2013, 4:47:40 AM12/26/13
to django-d...@googlegroups.com
Hi Luc,

On 26 déc. 2013, at 00:22, Luc Saffre <luc.s...@gmx.net> wrote:

> On 24/12/13 12:23, Aymeric Augustin wrote:
>
>> Of course I’m happy to provide feedback and supervision to
>> contributors who’d like to tackle some of these items. Just get in
>> touch before you start coding to avoid duplicate work.
>
> Hey Aymeric, thanks for your work! I am not a core developer, but feel
> proud to see that Django "finally realizes" something I have always been
> preaching (admittedly only in my corner). I have written something
> almost similar. Your `AppConfig` corresponds to my `App` class in
> https://github.com/lsaffre/djangosite/blob/master/djangosite/djangosite_site.py
>
> But your implementation is more beautiful than mine, and I'm looking
> forward to retire my djangosite project as soon as possible.

I wouldn’t say “beautiful” as much as “minimal”. Django’s core must be
extremely conservative when it comes to assumptions about how it will
be used. Also, being allowed to change the internal helps ;-)

Your App class contains more data. It is CMS-oriented, with stuff like
site_js_snippets. You should look into inheriting AppConfig.

> OTOH djangosite adds another concept, about which you do not seem to
> speak: I currently call it a "Site" class, but to stay with your
> terminology we might also call it a "ProjectConfig" class.

This concept is more or less represented by the settings module and by
a bunch of classes such as Apps or ConnectionHandler that are directly
derived from some settings.

I don’t know if it’s a very good design but it has the advantage that you
can use Django as a library without having to call an arbitrary method
(e.g. django.startup()) first.

--
Aymeric.

Luc Saffre

unread,
Dec 26, 2013, 9:40:19 AM12/26/13
to django-d...@googlegroups.com
On 26/12/13 11:47, Aymeric Augustin wrote:
> On 26 d�c. 2013, at 00:22, Luc Saffre <luc.s...@gmx.net> wrote:
>
>> But your implementation is more beautiful than mine, and I'm looking
>> forward to retire my djangosite project as soon as possible.
>
> I wouldn�t say �beautiful� as much as �minimal�. Django�s core must be
> extremely conservative when it comes to assumptions about how it will
> be used. Also, being allowed to change the internal helps ;-)
>
> Your App class contains more data. It is CMS-oriented, with stuff like
> site_js_snippets. You should look into inheriting AppConfig.

I agree 100% (except that "beautiful" includes "minimal"). In fact I was
planning to do that (move the site_js_snippets stuff away from
djangosite.App and add it only in Lino).

>> OTOH djangosite adds another concept, about which you do not seem to
>> speak: I currently call it a "Site" class, but to stay with your
>> terminology we might also call it a "ProjectConfig" class.
>
> This concept is more or less represented by the settings module and by
> a bunch of classes such as Apps or ConnectionHandler that are directly
> derived from some settings.
>
> I don�t know if it�s a very good design but it has the advantage that you
> can use Django as a library without having to call an arbitrary method
> (e.g. django.startup()) first.

Thanks for your feedback. Indeed, the `djangosite.Site` class was there
before `djangosite.App`, and maybe I'd now implement these things using
one or several `AppConfig` subclasses. If one day I find a clear example
why a "ProjectConfig" class would be necessary, I'll share it.

Meanwhile thanks to you and all those working on Django.

Luc

Aymeric Augustin

unread,
Dec 26, 2013, 4:47:44 PM12/26/13
to django-d...@googlegroups.com
Hello,

All the “Must do” and "Further work” items are either done or filed as Trac tickets. I still have to work on the “Cleanup” items. Details follow, mostly for future reference.



> ** Must do **
>
> * Fix the test suite. There are still a few failures, apparently caused by leaking app cache state.

Done. It was mostly a matter of invalidating properly the cache storing the results of app_cache.get_models(). Credits go to Florian.

> * Write documentation, especially explain the app[s].py conventions and why `for app in INSTALLED_APPS` doesn't work any more.

Done. In the reference documentation, I’ve been very conservative. I’ve documented only a few Apps (ex-AppCache) methods that are obviously useful for introspection. The release notes are straightforward.

> * "UserWarning: No fixture named 'comment_tests' found." when running the test suite.

Done. It was a bug in override_settings when INSTALLED_APPS was changed to include an application that couldn’t be imported.

> * Review previous patches. Extract relevant tests and docs, if any.

Done. Compared to the last patch Jannis attached to #3951, on one hand I refactored the app cache more aggressively; on the other hand I didn't implement as many features.

Here’s a summary of the differences:
- I didn’t use a metaclass. Unlike models, app configs don’t need to be instantiated. Individual app configs can simply be instances of AppConfig. This allows easier customization of AppConfig subclasses, which is at the same time helpful and prone to abuse. Overall, keeping things simple is often a good idea.
- I didn’t implement a way to reload or reset the app cache because I don’t think it’s a good practice to remove things from sys.modules. Apps.all_models is populated by imports and treated as immutable (with the exception of tests that register broken models and must remove them to prevent other tests from failing).
- I didn't make it possible to set db_prefix or models_path.
- I didn’t provide signals, because I haven’t tackled goal 3 yet and I don’t want to use signals anyway.
- I didn’t attempt to populate the app registry earlier that currently.
- I required an explicit definition of `AppConfig.name` in all cases.
- I accessed the app cache through methods instead of attributes for encapsulation.
- I removed functions that proxied to app registry methods. The net effect is that you need `from django.apps import apps` instead of `from django import apps` before you can call `apps.xxx()`.
- I dropped Model._meta.installed.

I also looked at the tests in that patch to get started writing tests. Since my implementation is more lightweight and straightforward, I could reach a good coverage with fewer tests. Since I didn’t implement a way to reset the app cache, I can’t test much of the loading process that happens at import time, but I’ve simplified it significantly.

I found it more difficult to take advantage of Preston’s branch, because it was a work in progress and it hadn’t been polished. However, I started from his draft to write the docs.

> * Review past discussions once again.

Done.

> * Close #3951 and file individual tickets for remaining tasks. This massive ticket is a nuisance.

Done. See https://code.djangoproject.com/query?status=!closed&keywords=~app-loading.



> ** Cleanup / small tasks **
>
> * Rename AppCache to InstalledApps? Then we should also replace app_cache by installed_apps. Originally proposed by Arthur Koziel. The parallel with INSTALLED_APPS is interesting but it may also cause confusion. I would also consider simply Apps and apps, although `from django.apps import apps` doesn't look that good. This must be done by 1.7 alpha if we want to do it.

Done. Renamed to Apps.

> * Review all uses of get_app_configs() and determine if the results may be cached. If so, add a receiver for the setting_changed signal to reset the cache when INSTALLED_APPS changes.
>
> * Replace TransRealMixin by suitable receivers for the setting_changed signal.
>
> * Refactor AppCommand to be able to handle apps without a models module.
>
> * Adapt the migrations code:
> - Refactor signals to stop relying on the models module.
> - Introduce MIGRATIONS_MODULE_NAME for consistency with MODELS_MODULE_NAME.
>
> * Refactor get_models and get_migratable_models. They can probably be implemented more straightforwardly.
>
> * Invalidate _get_models_cache appropriately or remove it entirely. This doesn't look like a performance-sensitive area.

Done. Removing the caching makes the test suite more than twice slower, which I didn’t expect.

> * Add a comment to explain how postponing works in populate_models(), or remove it if possible. See https://github.com/django/django/pull/2089#discussion_r8516536.
>
> * populate_models() isn't exactly idempotent. A recursive invocation may return while models aren't fully populated, and it won't set _models_loaded. This could probably be an issue in some cases, but it would be moot if we got rid of the postponing.
>
> * Consider eliminating the check for duplicate imports of models in register_model(), which has become unnecessary with the new project layout introduced in 1.4. It would probably be safer to do it through a deprecation path.
>
> * Review the implementation of dumpdata. It looks like it could be simplified by using AppConfigs.
>
> * Review how much django.apps and django.conf import — they should stay lean to avoid import loops.




> ** Further work / large tasks **
>
> (Many of these are open questions that could be debated on this mailing list.)
>
> 1) Provide an API to run code at startup, along the lines described above.

https://code.djangoproject.com/ticket/21676

> 2) Some fields should be moved from Model._meta to Model._meta.app_config, which should be set as part of the app cache population process.

https://code.djangoproject.com/ticket/21682

> 3) Make it possible to change the label in AppConfig. That's half of the original feature request in #3591 (the other half being verbose_name). Requires 2 — and then lots of work.

https://code.djangoproject.com/ticket/21683

> 4) Make it possible to change various prefixes (admin permissions, database tables, dumpdata/loaddata). That was part of Arthur Koziel's GSoC proposal. Requires 2 — and then some work.

https://code.djangoproject.com/ticket/21684

> 5) Investigate whether app labels currently need to be unique (the docs say so, but the code doesn’t enforce it). See https://groups.google.com/d/msg/django-developers/gzBWU_fUdgc/zlTTkKx-s5QJ. Consider enforcing unicity. Easier to pull off after 3.

https://code.djangoproject.com/ticket/21679

> 6) Store application-specific settings in AppConfig? I'm not sure where to draw the frontier between settings and app configs but this feels like feature creep. Maybe not a wontfix but at least a someday/maybe.

I don’t think it’s a good idea to implement this in Django right now. But I’d be interested in seeing it done as an external package. It seems like it’s just a matter of providing a suitable subclass of AppConfig and a function to pull settings from there.

> 7) Store urls and views in AppConfig? It never seemed a very practical idea, and that’s why we ended up with lean AppConfig objects. We’re certainly not in a position to decide this is a feature we want on Django at this point.

I’m rejecting this for now.

> I'm not listing multiple instances of a pluggable app because I believe the current way of formulating this feature request is incorrect and cannot be implemented. For starters, when you import a model class, to which copy of its app does it belong?

I’ve already rejected this.

--
Aymeric.




Aymeric Augustin

unread,
Dec 27, 2013, 3:51:21 PM12/27/13
to django-d...@googlegroups.com
Hello,

Here’s an final update on the remaining items in my TODO list. If there’s anything you want to discuss, please start a new thread, as this one is just an long monologue designed to keep a public record of what I’ve been doing.

On 26 déc. 2013, at 22:47, Aymeric Augustin <aymeric....@polytechnique.org> wrote:

> I still have to work on the “Cleanup” items.
>
>> * Review all uses of get_app_configs() and determine if the results may be cached. If so, add a receiver for the setting_changed signal to reset the cache when INSTALLED_APPS changes.

https://code.djangoproject.com/ticket/21688

>> * Replace TransRealMixin by suitable receivers for the setting_changed signal.

https://code.djangoproject.com/ticket/21688

>> * Refactor AppCommand to be able to handle apps without a models module.

https://github.com/django/django/pull/2118 — please review!

>> * Adapt the migrations code:
>> - Refactor signals to stop relying on the models module.

https://github.com/django/django/pull/2119 — please review!

>> - Introduce MIGRATIONS_MODULE_NAME for consistency with MODELS_MODULE_NAME.

https://github.com/django/django/commit/2504a50cc2285b5e5f17d5403c9a79b620471b25

>> * Refactor get_models and get_migratable_models. They can probably be implemented more straightforwardly.

https://code.djangoproject.com/ticket/21677

>> * Add a comment to explain how postponing works in populate_models(), or remove it if possible. See https://github.com/django/django/pull/2089#discussion_r8516536.

https://code.djangoproject.com/ticket/21681

>> * populate_models() isn't exactly idempotent. A recursive invocation may return while models aren't fully populated, and it won't set _models_loaded. This could probably be an issue in some cases, but it would be moot if we got rid of the postponing.

https://code.djangoproject.com/ticket/21681

>> * Consider eliminating the check for duplicate imports of models in register_model(), which has become unnecessary with the new project layout introduced in 1.4. It would probably be safer to do it through a deprecation path.

https://code.djangoproject.com/ticket/21678

>> * Review the implementation of dumpdata. It looks like it could be simplified by using AppConfigs.

https://github.com/django/django/commit/efddae252ce8fe469196839a29a564c0b4942c89

>> * Review how much django.apps and django.conf import — they should stay lean to avoid import loops.

I exported DJANGO_SETTINGS_MODULE=myproject.settings and ran the following script in a Python shell:

import sys
original = set(name for name, module in sys.modules.items() if name.startswith('django') and module is not None)
from django.apps import apps
with_apps = set(name for name, module in sys.modules.items() if name.startswith('django') and module is not None)
for name in sorted(with_apps - original):
print name

The slightly complicated expression gets the list of modules imported within Django, ignoring cached misses.

This prints only modules from django.apps, django.conf, django.core.exceptions and django.utils, all of which are safe to import.

--
Aymeric.

Aymeric Augustin

unread,
Dec 30, 2013, 7:29:05 AM12/30/13
to django-d...@googlegroups.com
Hello,

There’ve been lots of brainstorming on this topic. It is now tracked in this ticket: https://code.djangoproject.com/ticket/21676. I’ve also sent a pull request: https://github.com/django/django/pull/2128.

There’s a consensus on the general idea:
- AppConfigs are a good place for startup code,
- that code should be called when the app registry gets populated,
- the app registry should be populated as early as possible.

But the details are tricky. Specifically there are two problems for which a design decision must be made. I need your feedback.


### Problem 1 — Designing the API

I’d like to make the API as simple as possible, which means:

class AppConfig(object):
# …
def setup(self):
“””Override this method in subclasses to run code when Django starts.”””

Logically, this would get called when the app cache is fully populated. Running user code while the app cache isn’t ready will backfire.

But at this point it’s too late to register listeners for the `class_prepared` signal, so we need an early entry point. I believe AppConfig.__init__ does the job.

I originally suggested apps_ready() and models_ready() but I wasn’t happy with this solution because it leaks internal implementation details and inflicts a needlessy complicated API on the end users.

Relevant commits in the pull request: “Added AppConfig.setup() to run setup code.” and “Updated advice on connecting signals at startup."


### Problem 2 — Populating the app registry when not using wsgi.py or manage.py

Everyone seems to agree with configuring the settings and populating the app registry:
- before serving requests, in django.core.wsgi.get_wsgi_application() — this will also prevent the server from starting with an invalid configuration and crashing during the first request.
- before running a command, in django.core.management.execute_from_command_line() — actually in BaseCommand.execute() when can_import_settings = True.

Relevant commit in the pull request: “Populated the app registry earlier at startup."

But these aren’t the only two entry points. It’s also possible to Django code as follows: DJANGO_SETTINGS_MODULE=myproject.settings python script.py

If one makes an ORM query in script.py, the app registry gets populated deep inside the ORM, as a side effect of get_models() when setting up relations between models:

% DJANGO_SETTINGS_MODULE=myproject.settings python
>>> from gallery.models import Album
>>> Album.objects.all()
(Pdb) bt
<stdin>(1)<module>()
django/db/models/query.py(115)__repr__()
-> data = list(self[:REPR_OUTPUT_SIZE + 1])
django/db/models/query.py(140)__iter__()
-> self._fetch_all()
django/db/models/query.py(962)_fetch_all()
-> self._result_cache = list(self.iterator())
django/db/models/query.py(264)iterator()
-> for row in compiler.results_iter():
django/db/models/sql/compiler.py(681)results_iter()
-> for rows in self.execute_sql(MULTI):
django/db/models/sql/compiler.py(752)execute_sql()
-> sql, params = self.as_sql()
django/db/models/sql/compiler.py(83)as_sql()
-> ordering, o_params, ordering_group_by = self.get_ordering()
django/db/models/sql/compiler.py(409)get_ordering()
-> self.query.get_meta(), default_order=asc):
django/db/models/sql/compiler.py(445)find_ordering_name()
-> field, targets, alias, joins, path, opts = self._setup_joins(pieces, opts, alias)
django/db/models/sql/compiler.py(477)_setup_joins()
-> pieces, opts, alias)
django/db/models/sql/query.py(1375)setup_joins()
-> names, opts, allow_many)
django/db/models/sql/query.py(1298)names_to_path()
-> field, model, direct, m2m = opts.get_field_by_name(name)
django/db/models/options.py(400)get_field_by_name()
-> cache = self.init_name_map()
django/db/models/options.py(429)init_name_map()
-> for f, model in self.get_all_related_m2m_objects_with_model():
django/db/models/options.py(546)get_all_related_m2m_objects_with_model()
-> cache = self._fill_related_many_to_many_cache()
django/db/models/options.py(560)_fill_related_many_to_many_cache()
-> for klass in self.apps.get_models():
django/utils/lru_cache.py(101)wrapper()
-> result = user_function(*args, **kwds)
django/apps/registry.py(203)get_models()
-> self.populate_models()

I considered populating the app registry when the settings are configured i.e. when the first setting is accessed or settings.configure() is called but I decided against it:
- There’s no good reason for django.conf to know about django.apps.
- The risk of creating circular imports is very high: one doesn’t expect `settings.FOO` to import all applications and models.
- It may not solve this use case; nothing guarantees that the settings will be accessed before using the ORM.

## Solution 2.a — Require users to call django.setup() before they use the ORM. setup() would configure the settings and populate the app registry.

It’s easy to implement a deprecation path. It’s just a matter of raising a warning whenever an app registry method is called while the app registry isn’t populated. Eventually this should raise an exception.

As a nice side effect, this allows us to decouple django.apps from django.conf entirely, helping our guerrilla against global state. The value of INSTALLED_APPS would always be passed explicitly to populate_apps(installed_apps).

I’m currently working on this and I’ll add a commit to the pull request (which I’ll remove if we go for solution 2.b).

## Solution 2.b — Keep the current behavior.

Barring an explicit function call, I can’t think of a way to populate the app registry before any user code runs in this scenario without getting into circular imports. Populating when importing django.apps.registry makes `from django.apps import apps` unsafe. Populating when creating the first model class immediately creates a loop.

If we can’t guarantee the app registry is populated on startup, we’d better keep the current behavior. Moving the population arbitrarily earlier would introduce action at a distance.

If we chose this solution, it would be interesting to review all uses of get_models(), examine which ones can assume the app registry is populated and which one can’t, and figure out if there’s room for refactoring.


--
Aymeric.

Aymeric Augustin

unread,
Dec 30, 2013, 11:16:21 AM12/30/13
to django-d...@googlegroups.com
On 30 déc. 2013, at 13:29, Aymeric Augustin <aymeric....@polytechnique.org> wrote:

> ## Solution 2.a — Require users to call django.setup() before they use the ORM. setup() would configure the settings and populate the app registry.
> (…)
> I’m currently working on this and I’ll add a commit to the pull request (which I’ll remove if we go for solution 2.b).

The pull request now contains that solution. It was a bit more complicated than I expected. I didn’t include a deprecation path. At worst you get an exception that tells you to call django.setup().

It turns out that this change will make it possible, and is most likely required, to remove the complex postponing behavior of populate_models (https://code.djangoproject.com/ticket/21681). This is a strong argument for that solution.

The real question — is requiring django.setup() acceptable? Explicit is better than implicit, after all…

--
Aymeric.




Ryan Hiebert

unread,
Dec 30, 2013, 11:37:53 AM12/30/13
to django-d...@googlegroups.com
On Mon, Dec 30, 2013 at 10:16 AM, Aymeric Augustin
<aymeric....@polytechnique.org> wrote:

> The real question — is requiring django.setup() acceptable? Explicit is better than implicit, after all…

It's probably obvious to others, but where would that `django.setup()` be?

Bas Peschier

unread,
Dec 30, 2013, 11:52:33 AM12/30/13
to django-d...@googlegroups.com


Op maandag 30 december 2013 17:16:21 UTC+1 schreef Aymeric Augustin:
On 30 déc. 2013, at 13:29, Aymeric Augustin <aymeric....@polytechnique.org> wrote:

[...]


The real question — is requiring django.setup() acceptable? Explicit is better than implicit, after all…


As far as I can see only people who load Django outside of get_wsgi_application are affected and they get a pretty clear error message. Sounds safe enough.

I like the explicitness of django.setup() without being so explicit that it scares people about what magic actually happens. It also makes the path of understanding how Django initializes everything simpler.

What I really like is that we introduce a set point at which Django loads and populates everything (I am looking at all the admin-esk auto-discover mechanisms).

Bas

Aymeric Augustin

unread,
Dec 30, 2013, 12:22:52 PM12/30/13
to django-d...@googlegroups.com
On 30 déc. 2013, at 17:37, Ryan Hiebert <ry...@ryanhiebert.com> wrote:

> It's probably obvious to others, but where would that `django.setup()` be?

import django
django.setup()

Since setup() couples django.apps and django.conf, putting it into the django package seemed appropriate.

--
Aymeric.

Anssi Kääriäinen

unread,
Dec 30, 2013, 3:10:33 PM12/30/13
to django-d...@googlegroups.com
First, a big +1 for improved app-loading sequence.

Problem 2: my vote for solution 2.b. This seems to be the most reliable way. Quick throwaway scripts will need two more lines of boilerplate code but that is easily worth the added reliability.

I have a question about problem 1: Are model imports going to be allowed from app modules? I believe allowing this will cause problems later on.

If app-loading and model loading happens in mixed order (due to model imports in app modules launching model loading for the imported models), then there could be project initialization differences depending on the order of import statements. This could at least cause problems when trying to connect to class_prepared signal. If the model is imported before connect is called for that model you are going to miss the signal. Another potential problem is that a model's app isn't going to be available during import from app-module. If the model setup is going to ever need anything from the model's AppConfig, then we have a problem. For example it would be nice to be able to use AppConfig to alter Permission model's permission name length constraint, but this can't be done easily if it is possible that the Permission is imported before auth's AppConfig is created.

Another question: when exactly is class-prepared signal going to be sent? It seems it happens currently after import of the model finishes. If it happens at model import time, then connecting to class-prepared signal in AppConfig.__init__ using from .models import SomeModel isn't going to work in any case.

 - Anssi

Aymeric Augustin

unread,
Dec 30, 2013, 3:49:47 PM12/30/13
to django-d...@googlegroups.com
On 30 déc. 2013, at 21:10, Anssi Kääriäinen <anssi.ka...@thl.fi> wrote:

> I have a question about problem 1: Are model imports going to be allowed from app modules? I believe allowing this will cause problems later on.

While it doesn’t seem especially useful and hardly a good practice, it’s currently possible. It’s one of the two roles of Apps.all_models.

I’d like it to remain possible in the sense that Django shouldn’t crash. Side effects on the import sequence, and especially on import-time signals, are acceptable — in fact, they’re unavoidable.

I’m thinking of discouraging this practice in the documentation but not going an extra mile to support or prevent it.

> If app-loading and model loading happens in mixed order (due to model imports in app modules launching model loading for the imported models), then there could be project initialization differences depending on the order of import statements.

Since the app registry will import (1) each app config and app package in order and then (2) each models module in order, before handling any request, the import sequence will be deterministic for a given code base even if some models modules are imported during phase (1).

This is an improvement over the current situation where, depending on which view is hit by the first request, you may get a different import sequence.

> This could at least cause problems when trying to connect to class_prepared signal. If the model is imported before connect is called for that model you are going to miss the signal.

Yes, this problem can happen, but at least it cannot happen randomly any more since the import order is deterministic.

> Another potential problem is that a model's app isn't going to be available during import from app-module. If the model setup is going to ever need anything from the model's AppConfig, then we have a problem. For example it would be nice to be able to use AppConfig to alter Permission model's permission name length constraint, but this can't be done easily if it is possible that the Permission is imported before auth's AppConfig is created.

The AppConfig is only available through the app registry’s public APIs and those will raise a runtime error if you call them before the app registry is totally ready. See check_ready() in django/apps/registry.py. So, even if you don’t import models in the app package, you can’t use the app registry during the import of the models module. You’ll have to find another way.

> Another question: when exactly is class-prepared signal going to be sent? It seems it happens currently after import of the model finishes. If it happens at model import time, then connecting to class-prepared signal in AppConfig.__init__ using from .models import SomeModel isn't going to work in any case.

I don’t know. I discovered this signal today :-)

This is a good example of:
- why one shouldn’t import random stuff in packages
- side-effects I find acceptable if people do it anyway

--
Aymeric.

Marc Tamlyn

unread,
Dec 30, 2013, 3:51:41 PM12/30/13
to django-d...@googlegroups.com

I would suggest that Anssi's issues with model loading ordering are helped best by having `django.setup()` and doing away with the magical "I created a Model subclass and my app registry hot populated" issues. The proposed solution 2a seems best to me. It has never been clear how Django initialised and used a variety of dark magic (strange postponing, pythonpath manipulation and a freaking borg pattern!), and if the best way to make this explicit is to, well, require an explicit setup() call, then this is even better.

As an aside, these changes seem to me to make it easier to make things less global. A Django project (to use our traditional terminology) consists of some settings, which determine an instance of the app registry (and a load of apps inside it). The more isolated these become, the easier it seems to get a separate bunch of settings somewhere with different app registry at the same time. Obviously we still have the issue that everything assumes there is a canonical version of settings in Django.conf, but there may well be a way to remove a lot of settings with app configs.

Marc

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.

Marc Tamlyn

unread,
Dec 30, 2013, 3:59:22 PM12/30/13
to django-d...@googlegroups.com

On another note, if you're going to muck about with a model in an existing app (perhaps for a very good reason like Permission) the best way of doing that (IMO) would be to create a subclassed AppConfig, get the relevant model from Apps in the setup() method and modify it directly there (add fields, change a fields max_length etc.)

If this pattern is OK with your design Aymeric then it may even be worth documenting at some point, at risk of encouraging hacks. It's a much more elegant way to have dynamic cms models than has ever existed before though.

Aymeric Augustin

unread,
Dec 30, 2013, 4:00:51 PM12/30/13
to django-d...@googlegroups.com
On 30 déc. 2013, at 21:51, Marc Tamlyn <marc....@gmail.com> wrote:

doing away with the magical "I created a Model subclass and my app registry hot populated" issues.

This is ModelBase.__new__, another scary beast I haven’t attacked yet :)

-- 
Aymeric.

Aymeric Augustin

unread,
Dec 30, 2013, 4:05:37 PM12/30/13
to django-d...@googlegroups.com
On 30 déc. 2013, at 21:59, Marc Tamlyn <marc....@gmail.com> wrote:

On another note, if you're going to muck about with a model in an existing app (perhaps for a very good reason like Permission) the best way of doing that (IMO) would be to create a subclassed AppConfig, get the relevant model from Apps in the setup() method and modify it directly there (add fields, change a fields max_length etc.

Yes, it’s simple, and I’m quite confident it would work:

from django.apps import AppConfig

class CustomAuthConfig(AppConfig):
    name = “django.contrib.auth”
    def setup(self):
        Permission = self.get_model('Permission’)
        # twiddle Permission

If this pattern is OK with your design Aymeric then it may even be worth documenting at some point, at risk of encouraging hacks. It's a much more elegant way to have dynamic cms models than has ever existed before though.

Unfortunately, we don’t have a public API to alter models after they’ve been created, which makes it hard to recommend this pattern.

-- 
Aymeric.

Marc Tamlyn

unread,
Dec 30, 2013, 4:23:14 PM12/30/13
to django-d...@googlegroups.com

Heh, I forgot that contribute to class isn't documented. On second thoughts, let's not document this unless we actually have a good API for it.

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.

Russell Keith-Magee

unread,
Dec 30, 2013, 10:28:54 PM12/30/13
to Django Developers
Hi Aymeric,

First off -  a *huge* thanks for all the work your doing on app reloading. I can't begin to express how much gratitude I have for the work you're doing here. 

On Mon, Dec 30, 2013 at 8:29 PM, Aymeric Augustin <aymeric....@polytechnique.org> wrote:
Hello,

There’ve been lots of brainstorming on this topic. It is now tracked in this ticket: https://code.djangoproject.com/ticket/21676. I’ve also sent a pull request: https://github.com/django/django/pull/2128.

There’s a consensus on the general idea:
- AppConfigs are a good place for startup code,
- that code should be called when the app registry gets populated,
- the app registry should be populated as early as possible.

But the details are tricky. Specifically there are two problems for which a design decision must be made. I need your feedback.


### Problem 1 — Designing the API

I’d like to make the API as simple as possible, which means:

class AppConfig(object):
    # …
    def setup(self):
        “””Override this method in subclasses to run code when Django starts.”””

Logically, this would get called when the app cache is fully populated. Running user code while the app cache isn’t ready will backfire.

But at this point it’s too late to register listeners for the `class_prepared` signal, so we need an early entry point. I believe AppConfig.__init__ does the job.

I originally suggested apps_ready() and models_ready() but I wasn’t happy with this solution because it leaks internal implementation details and inflicts a needlessy complicated API on the end users.

Relevant commits in the pull request: “Added AppConfig.setup() to run setup code.” and “Updated advice on connecting signals at startup."

An idle thought -- this problem is caused because the class_prepared signal is currently being issued at the end of ModelBase.__new__(). Is this signal location actually a requirement? Is there any reason that the prepare signal couldn't be decoupled from __new__(), and be issued at the end of app configuration (at some appropriate point in the process)? We'd need to be careful about exactly *when* the signal was issued, and what order signals were issued to make sure ordering doesn't change too drastically between releases - 
I'm not sure I'm completely convinced about this. 

I'll pay the argument that there's no good reason (from an architectural perspective) for conf to know about apps -- *unless* you're taking the view that the configuring settings *is* what passes for Django's setup sequence. There's an argument to be made that this is already the case - logging is configured as a consequence of accessing settings, for example, and this logic was put in settings specifically because configuring settings is the closest we come to a "first mover" event in the Django stack.

Is this what you would expect -- well, I suppose that comes down to what is documented. I don't think it's prima facie surprising that a method called "configure", that has access to lists of installed apps, models, logging and i18n configurations, will have the side effect of importing and configuring those modules/objects; and if the documentation says that this is what it does, then all the better.

Circular imports are obviously an issue we want to avoid - but I'm not sure how you would manufacture a circular import here. Django settings are essentially just constants plus some path joining, with some laziness thrown in; I'm not sure how you'd manufacture circular imports going through the settings layer.

As for the specific concern about the ORM -- Yes - we do have a guarantee -- you can't do anything on a database without knowing *at least* the settings in DATABASES['default'], which is enough to trigger startup logic.
 
## Solution 2.a — Require users to call django.setup() before they use the ORM. setup() would configure the settings and populate the app registry.

It’s easy to implement a deprecation path. It’s just a matter of raising a warning whenever an app registry method is called while the app registry isn’t populated. Eventually this should raise an exception.

As a nice side effect, this allows us to decouple django.apps from django.conf entirely, helping our guerrilla against global state. The value of INSTALLED_APPS would always be passed explicitly to populate_apps(installed_apps).

I’m currently working on this and I’ll add a commit to the pull request (which I’ll remove if we go for solution 2.b).

## Solution 2.b — Keep the current behavior.

Barring an explicit function call, I can’t think of a way to populate the app registry before any user code runs in this scenario without getting into circular imports. Populating when importing django.apps.registry makes `from django.apps import apps` unsafe. Populating when creating the first model class immediately creates a loop.

If we can’t guarantee the app registry is populated on startup, we’d better keep the current behavior. Moving the population arbitrarily earlier would introduce action at a distance.

If we chose this solution, it would be interesting to review all uses of get_models(), examine which ones can assume the app registry is populated and which one can’t, and figure out if there’s room for refactoring.

Personally, I'd prefer 2a over 2b -- implicit behaviours invoked at unknown times is (to my mind) one of the problems we're trying to fix. Graham Dumpleton's suggestion to invoke the validate() management command in your WSGI file is part of dealing with this implicit behaviour. Having an explicit understanding that the full system will be available at a given time (for whatever definition "full system" means)

Whether we add a django.setup() and it invokes settings, or we use settings.configure as the entry point that invokes app initialisation -- I could go either way on that. To my mind, there's value in keeping the entry point we already have unless there's a major architectural reason to do otherwise. However, I'm not fundamentally opposed to introducing a new, clean entry point whose mandate is very specifically "set things up".

Russ %-)

Daniel Lindsley

unread,
Dec 30, 2013, 11:41:16 PM12/30/13
to django-d...@googlegroups.com
Aymeric,


    As with the others, I'd like to echo my thanks! Having spent a lot of time with the old AppCache, I think these are solid improvements. Just a couple thoughts/questions from me. 

    On Problem #1, IMO, I'd prefer to see something akin to an ``AppConfig.setup`` (before anything is imported or built) & an ``AppConfig.complete`` (after all classes have been setup for the app). I think this would make it clearer as to when the behavior is taking effect, as well as granting additional time between ``__init__``, which I don't think many people would like overriding, and the actual setup. Additionally, having a place to hook into that's guaranteed to be post-model setup would be a boon.

   On Problem #2, having been one of those people who's written many scripts like that, I'd prefer #2a. It's not consistent with existing behavior, but the existing behavior can be very unstable (for all the reasons others have mentioned). A stable order would be nice from an "easier-to-rationalize-about" perspective.


Daniel

Andreas Pelme

unread,
Dec 31, 2013, 2:38:42 AM12/31/13
to django-d...@googlegroups.com
Aymeric,

Your work is amazing and has improved Django so much. Your work on timezone support, the transaction overhaul and now this app changes makes my and a lot of other peoples experience with Django so much better. Thank you!

2013/12/30 Aymeric Augustin <aymeric....@polytechnique.org>

## Solution 2.a — Require users to call django.setup() before they use the ORM. setup() would configure the settings and populate the app registry.

I think this approach is the only proper solution. We have hit multiple bugs within the import mechanism in the ORM with models not finding related models (#20143 for instance). To workaround this, we always explicitly call get_models() in the web app servers, celery workers and scripts before anything else. Whenever we forget the get_models() call, things break in a lot of interesting ways. :-)

Being explicit with an exception when trying to use the ORM in a non-initialised state would be very very useful in situations like this, and would have saved us many hours of confusion.

--
Andreas Pelme

Aymeric Augustin

unread,
Dec 31, 2013, 3:41:14 AM12/31/13
to django-d...@googlegroups.com
On 31 déc. 2013, at 04:28, Russell Keith-Magee <rus...@keith-magee.com> wrote:

On Mon, Dec 30, 2013 at 8:29 PM, Aymeric Augustin <aymeric....@polytechnique.org> wrote:

Logically, this would get called when the app cache is fully populated. Running user code while the app cache isn’t ready will backfire.

But at this point it’s too late to register listeners for the `class_prepared` signal, so we need an early entry point. I believe AppConfig.__init__ does the job.

An idle thought -- this problem is caused because the class_prepared signal is currently being issued at the end of ModelBase.__new__(). Is this signal location actually a requirement? Is there any reason that the prepare signal couldn't be decoupled from __new__(), and be issued at the end of app configuration (at some appropriate point in the process)? We'd need to be careful about exactly *when* the signal was issued, and what order signals were issued to make sure ordering doesn't change too drastically between releases -

That’s an interesting idea. To be honest, I’ve shied away from touching ModelBase.__new__ — a 230 line method in a metaclass, what could possibly go wrong? ;-)

In almost all cases AppConfig.setup() will be early enough. class_prepared is the only exception I found. I’m not sure what it’s designed for (nor that it’s a good idea at all).