loading.py - get_apps(), get_app() and load_app()

Vinay Sajip

Apr 29, 2007, 1:19:28 PM

to Django developers

I need some help with understanding some aspects of the above-named
functions. load_app is internal to loading.py and is called by both
get_apps and get_app to import a specific app module. But get_app(x)
first calls get_apps, which loads (using load_app) all the app modules
in INSTALLED_APPS. Then it calls load_app(x). Is this double calling
of load_app intentional? If so, what's the rationale for doing it this
way? Why not cache the information obtained in the initial call of
get_apps()?

Malcolm Tredinnick

Apr 29, 2007, 8:23:46 PM

to django-d...@googlegroups.com

On Sun, 2007-04-29 at 10:19 -0700, Vinay Sajip wrote:
> I need some help with understanding some aspects of the above-named
> functions. load_app is internal to loading.py and is called by both
> get_apps and get_app to import a specific app module. But get_app(x)
> first calls get_apps, which loads (using load_app) all the app modules
> in INSTALLED_APPS.

Only the first time. On subsequent calls, it returns the cached list of
installed apps without calling load_app() at all.
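A minimal sketch of the caching pattern described above. This is a simplified stand-in, not the Django source: the app names, the string returned by load_app(), and the load_calls list are assumptions made purely for illustration.

```python
# Simplified stand-in for the pattern under discussion; not the Django source.
INSTALLED_APPS = ["app_a", "app_b", "app_c"]  # hypothetical app names

_app_list = []    # cache of loaded app "modules"
_loaded = False   # set once the cache has been populated
load_calls = []   # records every load_app() call, for illustration only


def load_app(app_name):
    """Pretend to import an app's models module and return it."""
    load_calls.append(app_name)
    return "%s.models" % app_name


def get_apps():
    """Return all installed apps, loading them only on the first call."""
    global _loaded
    if not _loaded:
        _loaded = True
        for app_name in INSTALLED_APPS:
            _app_list.append(load_app(app_name))
    return _app_list


def get_app(app_label):
    """Return one app; get_apps() runs first so the cache is filled."""
    get_apps()
    for app_name in INSTALLED_APPS:
        if app_name.split(".")[-1] == app_label:
            # This is the second load_app() call the question is about.
            return load_app(app_name)
    raise LookupError("App with label %s could not be found" % app_label)
```

In this sketch, the first get_app() call triggers one load_app() per installed app plus one more for the requested app; later get_apps() calls return the cached list without calling load_app() at all.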

> Then it calls load_app(x). Is this double calling
> of load_app intentional? If so, what's the rationale for doing it this
> way? Why not cache the information obtained in the initial call of
> get_apps()?

The double call isn't explicitly intentional or unintentional; it's just
an implementation detail. Currently get_apps() returns a list of all
applications and isn't just used by get_app(). So changing the return
type to save a function call means it now becomes
get_app_that_also_returns_something_about_one_particular_app() and in
the common case (when the internal cache has already been populated), it
has to do more work than it does now. If you wanted to change this, you
would need to write something that was backwards compatible (hopefully)
and no slower in both cases. Not sure that's worth it to save what is
two function calls in the slow case, which occurs once per Django
process's lifetime, and no function calls in the common case, but I've
been wrong before.

In the future (pre-1.0), that code will be rewritten somewhat because
the side-effect we are exploiting by calling get_apps() -- ensuring the
internal cache is fully populated -- is not actually a guaranteed
side-effect. The cache isn't fully populated all the time, which is a
cause of sporadic bugs. So we need a slightly smarter internal cache
object that understands it may not be fully initialised without also
being a real performance hog (populating the internal cache can be
relatively time-consuming when you have a lot of apps and models).
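One way such a cache object might look, as a rough sketch only. The class name AppCacheSketch, its attributes, and the loader callable are all hypothetical; nothing here is taken from the Django code base or from a planned design.

```python
class AppCacheSketch:
    """Hypothetical cache that knows whether it has been fully populated."""

    def __init__(self, installed_apps, loader):
        self.installed_apps = list(installed_apps)
        self.loader = loader        # callable: app name -> loaded app object
        self.apps = {}              # app name -> loaded app object
        self.fully_loaded = False   # True only after every app has loaded

    def _populate(self):
        """Load any apps not yet cached; a cheap no-op once fully loaded."""
        if self.fully_loaded:
            return
        for name in self.installed_apps:
            if name not in self.apps:
                self.apps[name] = self.loader(name)
        # Reached only if no loader call raised, so the flag can be trusted.
        self.fully_loaded = True

    def get_apps(self):
        """Return all loaded apps in INSTALLED_APPS order."""
        self._populate()
        return [self.apps[name] for name in self.installed_apps]

    def get_app(self, app_label):
        """Return the cached app whose last dotted component is app_label."""
        self._populate()
        for name in self.installed_apps:
            if name.split(".")[-1] == app_label:
                return self.apps[name]
        raise LookupError("App with label %s could not be found" % app_label)
```

The flag is set only after the loop completes, so a failure partway through leaves the cache marked as incomplete and the next lookup resumes with the apps that are still missing. Once the flag is set, each lookup costs a single boolean check.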

Regards,
Malcolm

Vinay Sajip

Apr 30, 2007, 3:02:05 AM

to Django developers

> Only the first time. On subsequent calls, it returns the cached list of
> installed apps without calling load_app() at all.

Yep, noticed that.

> The double call isn't explicitly intentional or unintentional; it's just
> an implementation detail. Currently get_apps() returns a list of all
> applications and isn't just used by get_app(). So changing the return
> type to save a function call means it now becomes
> get_app_that_also_returns_something_about_one_particular_app() and in
> the common case (when the internal cache has already been populated), it
> has to do more work than it does now. If you wanted to change this, you
> would need to write something that was backwards compatible (hopefully)
> and no slower in both cases. Not sure that's worth it to save what is
> two function calls in the slow case, which occurs once per Django
> process's lifetime, and no function calls in the common case, but I've
> been wrong before.

I wasn't planning to change this just to save a function call - as
Knuth said, premature optimisation is the root of all evil ;-)

>
> In the future (pre-1.0), that code will be rewritten somewhat because
> the side-effect we are exploiting by calling get_apps() -- ensuring the
> internal cache is fully populated -- is not actually a guaranteed
> side-effect. The cache isn't fully populated all the time, which is a
> cause of sporadic bugs. So we need a slightly smarter internal cache
> object that understands it may not be fully initialised without also
> being a real performance hog (populating the internal cache can be
> relatively time-consuming when you have a lot of apps and models).

I noticed some unexpected behaviour in this area, which shows up when
you run tests/runtests.py. get_apps() is called twice, first with a
set of "always-loaded" apps and then with a whole load of test apps.
The second time around, get_apps() does nothing since _loaded was set
to True by the first call. Anyway, I'll just chug along looking
around...thanks for the info.
