Improving django.setup() and app loading performance

Rich Jones

Feb 26, 2016, 10:36:57 AM
to Django developers (Contributions to Django itself)
(I originally posted this as a ticket, but we can have the discussion here and keep the ticket for more specific follow-up.)

I imagine that this is an area that hasn't really been given much consideration with regard to optimization, because it isn't relevant to normal Django deployments. However, with "serverless" deployments (those without any permanent infrastructure), this becomes quite relevant, as we have to call setup() on every request.


So, I'd love to discuss ideas for performance optimizations to Django's setup() method. (A sample profile output is available here: https://github.com/Miserlou/django-zappa/issues/24 )


For starters - can we load apps in parallel?

Tim Graham

Feb 26, 2016, 11:04:56 AM
to Django developers (Contributions to Django itself)
I'm not sure we can load apps in parallel -- they are loaded in the order in which they appear in INSTALLED_APPS so the process is consistent. Maybe it would work for some setups, but I guess it would introduce subtle bugs in others.

https://docs.djangoproject.com/es/stable/ref/applications/#how-applications-are-loaded

Rich Jones

Feb 26, 2016, 11:46:19 AM
to Django developers (Contributions to Django itself)
That might be true in theory... but is it in practice?

Do any of the core/contrib/top-100 apps actually depend on loading order? I've never encountered that problem before, but I don't know.

It seems like we could easily get a 10x performance improvement by loading apps in parallel by default, and letting anything that _does_ have a dependency declare some kind of DEPENDS_ON in its package init, so that the loader loop wouldn't complete until everything it needs is ready. This would be a major benefit, and I don't think it would introduce all that much complexity.
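Just to sketch the shape of the idea - everything below, including DEPENDS_ON and the dependencies mapping, is hypothetical and not something Django supports today:

    # Purely illustrative: DEPENDS_ON / the `dependencies` mapping do not exist
    # in Django; this just shows a dependency-aware parallel import loop.
    import importlib
    from concurrent.futures import ThreadPoolExecutor

    def load_apps_parallel(app_names, dependencies):
        """Import app modules in dependency "waves".

        `dependencies` maps an app name to the set of apps it DEPENDS_ON.
        Apps within the same wave are imported concurrently, which Python 3.3+
        permits thanks to per-module import locks.
        """
        loaded = set()
        remaining = list(app_names)
        with ThreadPoolExecutor(max_workers=8) as pool:
            while remaining:
                wave = [a for a in remaining
                        if set(dependencies.get(a, ())) <= loaded]
                if not wave:
                    raise RuntimeError("Unsatisfiable dependencies: %r" % remaining)
                # Force evaluation so any ImportError surfaces here.
                list(pool.map(importlib.import_module, wave))
                loaded.update(wave)
                remaining = [a for a in remaining if a not in loaded]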

Aymeric Augustin

Feb 26, 2016, 12:52:28 PM
to django-d...@googlegroups.com
Hi Rich,

On 26 Feb 2016, at 16:36, Rich Jones <mise...@gmail.com> wrote:

> I'd love to discuss ideas for performance optimizations to Django's setup method.


In my opinion, the first concrete step would be to measure how much time is spent executing Django code rather than importing Python modules. That’s tricky because the two are intertwined: django.setup() mostly imports all the modules required by the project. If 90% of the time is spent importing modules and 10% doing other things, then there’s not much to gain by optimizing these other things!
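For example, a rough way to get that breakdown (assuming DJANGO_SETTINGS_MODULE is already set for the project) would be something like:

    # Rough measurement: profile django.setup() and sort by cumulative time to
    # see how much is spent importing modules vs. doing everything else.
    import cProfile
    import pstats

    import django

    cProfile.run("django.setup()", "setup.prof")
    pstats.Stats("setup.prof").sort_stats("cumulative").print_stats(30)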

While performance improvements are always good, I’m against changes that would re-introduce non-determinism in the app-loading process.

Django 1.7 eliminated a whole class of bugs, including some that occurred randomly, with low probability, when restarting a production server under load. These bugs aren’t fun to track down.

The global import lock was replaced by per-module locks in Python 3.3. This implies you could import things in threads on Python 3.3+ and it might be faster. However, I’m afraid any thread-based solution will re-introduce non-determinism.

I remember trying to simplify:

- building Options classes (aka. Model._meta)
- building relations between models

and failing multiple times, due to the difficulty of preserving backwards-compatibility. I think it’s doable but it will take someone smarter, more persistent and/or more familiar with the app-loading process than I am. And I’m not even sure it would have an effect on performance!

Best regards,

--
Aymeric.


Cristiano Coelho

Feb 26, 2016, 1:07:02 PM
to Django developers (Contributions to Django itself)
If by "serverless" you mean deployments such as Amazon Lambda or similar, I don't think setup is called on every request. At least for AWS Lambda, the environment is cached, so it will only happen once each time it needs to scale up. Are there any other issues?

Rich Jones

Feb 26, 2016, 3:53:00 PM
to Django developers (Contributions to Django itself)
@Aymeric

> In my opinion, the first concrete step would be to measure how much time is spent executing Django code rather than importing Python modules.

You can find a full profile of a Django request as it goes through the complete request/response loop here: https://github.com/Miserlou/django-zappa/files/141600/profile.txt

Over 70% of the total request time is spent in django.setup() - so you can see why we have an incentive to improve this!


@ Cristiano -
> If by "serverless" you mean deployments such as Amazon Lambda or similar, I don't think setup is called on every request. At least for AWS Lambda, the environment is cached, so it will only happen once each time it needs to scale up. Are there any other issues?

You're halfway there, but the process is more complicated than that. The code is cached, not the internal state of the machine. You can follow our progress here: https://github.com/Miserlou/django-zappa/

But - another type of caching could be another possibility. Can anybody think of any arrangements where we could perhaps call setup() with "pre-loaded"/cached applications?

Florian Apolloner

Feb 26, 2016, 4:34:46 PM
to Django developers (Contributions to Django itself)


On Friday, February 26, 2016 at 9:53:00 PM UTC+1, Rich Jones wrote:
@Aymeric
> In my opinion, the first concrete step would be to measure how much time is spent executing Django code rather than importing Python modules.

Over 70% of the total request time is spent in django.setup() - so you can see why we have an incentive to improve this!

That profile looks somewhat worthless, IMO; please provide it in a usable form, i.e. cProfile output. As Aymeric pointed out, most of the time in setup() is most likely spent importing Python modules, and there is nothing we can do about that.
 
You're halfway there, but the process is more complicated than that. The code is cached, not the internal state of the machine. You can follow our progress here: https://github.com/Miserlou/django-zappa/

What does this even mean? "code is cached"?! "internal state"?! Which internal state?
 
But - another type of caching could be another possibility. Can anybody think of any arrangements where we could perhaps call setup() with "pre-loaded"/cached applications?

Use a sane application stack? :D

Cristiano Coelho

Feb 26, 2016, 4:39:58 PM
to Django developers (Contributions to Django itself)
Rich,

I believe you know way more than I do about AWS Lambda, since you have made such a great project, and I'm really interested to know how it really works, since their documentation is a bit superficial.

On their FAQ this is what they state:

Q: Will AWS Lambda reuse function instances?

To improve performance, AWS Lambda may choose to retain an instance of your function and reuse it to serve a subsequent request, rather than creating a new copy. Your code should not assume that this will always happen.


I always thought, and have seen with very simple testing, that your code is basically "frozen" between requests: every import done in the main module happens only once (so your whole program is only initialized once), which means Django would only initialize once for quite a while (until your instance of the code is discarded, at which point a new request causes all modules to be imported again). So technically, if you arrange the imports so that Django's setup() runs at import time, it should only happen once. What is really happening that makes it call setup() on every request?


With the above said, if that's really the case: Python is known to be able to serialize objects in a very interesting way (pickling), where you can even send a class with its methods over the wire, and the other side can execute every method defined in there while the object keeps its state. Would it be possible to store the state with this somehow?

Collin Green

Feb 26, 2016, 4:56:33 PM
to Django developers (Contributions to Django itself)
Ugh. As a strong advocate of both django and zappa, I'd love it if we could keep the conversation on target without degenerating into stack attacks. If you don't want to weigh in, please feel no obligation to do so.

I do agree that we could pare down the profile into more actionable sections. I'll see if I can get something a bit more preprocessed in the near future. Perhaps that will help keep this thread on target.

Florian Apolloner

Feb 26, 2016, 5:09:27 PM
to Django developers (Contributions to Django itself)
Hi Collin,


On Friday, February 26, 2016 at 10:56:33 PM UTC+1, Collin Green wrote:
Ugh. As a strong advocate of both django and zappa, I'd love it if we could keep the conversation on target without degenerating into stack attacks. If you don't want to weigh in, please feel no obligation to do so.

Sorry, that wasn't meant as an attack (maybe I should have written it out a little bit better). Please understand that most of us have no experience with Lambda or whatever you are currently using. I really have no idea what "cached code" or "internal state" means in that sense. As for the application stack: since Python is slow to start up and on every bigger project you are going to need to import loads of modules, you have a long startup time -- stuff like this prompted the switch from CGI to FCGI ages ago (and it's what prompted my "sane application stack" joke). Do not get me wrong, I love Django, but if you are going back to a CGI-like environment, I am not convinced that it is the best tool (though if you can clear up what "cached code", "internal state", "serverless infrastructure" etc. actually mean, instead of throwing them around like buzzwords and assuming we all know what you mean, that would be helpful too).
 
I do agree that we could pare down the profile into more actionable sections. I'll see if I can get something a bit more preprocessed in the near future. Perhaps that will help keep this thread on target.

Please do so, especially how much time is spent importing and configuring Python modules -- then subtract that from your timings and see what else is left.

Cheers,
Florian

Rich Jones

Feb 29, 2016, 5:26:17 PM
to Django developers (Contributions to Django itself)
Hey all!

Let me clarify a few of the terms here and describe a bit how Django operates in this context.

"Serverless" in this contexts means "without any permanent infrastructure". The server is created _after_ the request comes in, and it dies once the response is returned. The means that we never have to worry about server operations or horizontal scalability, and we pay far, far far less of a cost, as we only pay for server time by the millisecond. It's also radically easier to deploy - a single 'python manage.py deploy production' gives you an infinitely scalable, zero-maintenance  app. Basically, Zappa is what comes after Heroku.

To do this, we use two services - Amazon Lambda and Amazon API Gateway.

The first, AWS Lambda, allows us to define any arbitrary function, which Amazon will then cache in memory and execute in response to any AWS system event (S3 uploads, emails, SQS events, etc.). This was designed for small functions, but I've been able to squeeze all of Django into it.

The other piece, API Gateway, allows us to turn HTTP requests into AWS events - in this case our Lambda context. This requires using a nasty language called 'VTL', but you don't need to worry about this.

Zappa converts this API Gateway request into a 'normal' Python WSGI request, feeds it to Django, gets the response back, and performs some magic on it that lets it get back out through API Gateway.
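Very roughly, the translation step amounts to building a WSGI environ from the event and calling the WSGI app. The sketch below is illustrative only - the event field names are made up, and the real Zappa code handles much more:

    # Illustrative only: the event field names are hypothetical, and the real
    # translation handles headers, encodings, and the response path properly.
    import io
    import sys
    from django.core.wsgi import get_wsgi_application

    application = get_wsgi_application()  # built once per container

    def call_django(event):
        environ = {
            "REQUEST_METHOD": event.get("method", "GET"),
            "SCRIPT_NAME": "",
            "PATH_INFO": event.get("path", "/"),
            "QUERY_STRING": event.get("query_string", ""),
            "SERVER_NAME": "lambda",
            "SERVER_PORT": "80",
            "SERVER_PROTOCOL": "HTTP/1.1",
            "wsgi.version": (1, 0),
            "wsgi.url_scheme": "https",
            "wsgi.input": io.BytesIO(event.get("body", b"")),
            "wsgi.errors": sys.stderr,
            "wsgi.multithread": False,
            "wsgi.multiprocess": False,
            "wsgi.run_once": False,
        }
        response = {}

        def start_response(status, headers, exc_info=None):
            response["status"], response["headers"] = status, headers

        body = b"".join(application(environ, start_response))
        return response["status"], response["headers"], body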

You can see my slides about this here: http://jsbin.com/movecayuba/1/edit?output
and a screencast here: https://www.youtube.com/watch?v=plUrbPN0xc8

Now, this comes with a cost, but that's the trade-off. The flip side is that it also means we need to call Django's setup() method every time. All of this currently takes ~150ms - the majority of which is spent setting up the apps. If we could do that in parallel, this would greatly improve the performance of every django-zappa request. Make sense?

We also have a Slack channel where we are working on this if you want to come by!

R

Rich Jones

Feb 29, 2016, 5:34:16 PM
to Django developers (Contributions to Django itself)
For those who are still following along, these are the lines in question:

https://github.com/Miserlou/django-zappa/blob/master/django_zappa/handler.py#L31
https://github.com/Miserlou/django-zappa/blob/master/django_zappa/handler.py#L68

It's very possible there are ways of significantly improving this. Suggestions always welcome!

R

Cristiano Coelho

Feb 29, 2016, 6:57:28 PM
to Django developers (Contributions to Django itself)
Sorry if this looks like a silly question, but have you tried moving the setup calls to the top of the lambda_handler module? I'm not sure why you need the settings data as an argument to the lambda handler, but if you find a way to move those 4 lines next to setup(), you will only load the whole Django machinery once (or until AWS decides to kill your instance). I have a wild guess that this is related to the way you have implemented the "publish to AWS" process. But the main issue here is that you are calling setup() on every request, when you really shouldn't be; it should happen at module level.
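Something along these lines is what I mean (the settings module name is just a placeholder):

    # Sketch: one-time initialization at module import time, so a reused
    # container skips it; only per-request work stays in the handler.
    import os
    import django

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")  # placeholder
    django.setup()  # runs once, when Lambda first imports this module

    def lambda_handler(event, context):
        # per-request work only: translate the event, call Django, return the response
        ...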

I'm sorry if this strays too far from the actual thread, since this looks more like a response for a Zappa forum :)

Rich Jones

Feb 29, 2016, 7:07:32 PM
to Django developers (Contributions to Django itself)
Haven't tried that! I don't _think_ that'll work... but worth a shot (I don't think they only cache the handler... I think they cache the whole environment). Will report back! And as you mention, we really shouldn't be doing it on every request, so if there were even a way to cache that manually rather than recalculate it every time, that'd be just as valuable.

There is no "zappa forum" - it's just me and a few other contributors in a Slack channel! And all of this stuff is super new, and I'm sure that you guys know a lot more about the Django internals than I do, so all suggestions are welcome!

R

Cristiano Coelho

Feb 29, 2016, 7:23:00 PM
to Django developers (Contributions to Django itself)
I'm almost sure that right now you are calling setup() with Django already initialized in some cases where the environment is reused; I'm amazed Django doesn't complain when setup() is called twice.

Rich Jones

Feb 29, 2016, 7:27:28 PM
to Django developers (Contributions to Django itself)
As I suspected, moving setup() outside of the handler had a negligible effect - in fact the test showed a slight drop in performance. :(

Testing with httping, from Berlin to us-east-1:

Before:
--- https://arb9clq9k9.execute-api.us-east-1.amazonaws.com/unicode/json_example/ ping statistics ---
52 connects, 52 ok, 0.00% failed, time 56636ms
round-trip min/avg/max = 59.1/104.8/301.9 ms

After:
--- https://arb9clq9k9.execute-api.us-east-1.amazonaws.com/unicode/json_example/ ping statistics ---
51 connects, 51 ok, 0.00% failed, time 57306ms
round-trip min/avg/max = 61.8/128.7/523.2 ms

It was a nice thought though!

Cristiano Coelho

Feb 29, 2016, 7:33:05 PM
to Django developers (Contributions to Django itself)
That's quite odd. I recall testing this once, where I created a lambda which had a datetime.now() at the top, and just returned that value. Out of a few calls, it returned two different results, meaning the module was reused "most" of the time. This was tested by calling the lambda from the AWS "Test" feature itself and not through API Gateway, so perhaps API Gateway is preventing the module from being reused? Could there be anything else that might prevent AWS from reusing the module?

Cristiano Coelho

Feb 29, 2016, 7:42:23 PM
to Django developers (Contributions to Django itself)
I have repeated the test this time through API Gateway, and out of many calls I only got two different dates that were instantiated at module level, meaning my module was only imported twice. I fail to see why it doesn't behave the same with your code.

Rich Jones

Feb 29, 2016, 7:44:43 PM
to Django developers (Contributions to Django itself)
That certainly could have something to do with it - there isn't very much transparency about how API Gateway works. It's super new and pretty janky, TBQH. However, I think the behavior you're describing is not what's expected - the caching seems to be of the assets of the whole environment, not of anything that's computed - whether or not they are held in memory or read from disk.

[[ Also, obviously it's not a fair comparison, but I thought I'd include these numbers for reference:
--- http://djangoproject.com/ ping statistics ---
52 connects, 52 ok, 0.00% failed, time 68473ms
round-trip min/avg/max = 227.9/329.3/1909.3 ms ]]

So, I think the interesting things to explore would be:
 - Loading apps in parallel
 - "Pre-loading" apps before app deployment, then loading that cached state at runtime. I guess I'd need to know more about what it means to "load" an app to see if that makes any sense at all.

I imagine the former is probably more feasible. I understand the aversion to non-determinism, but I think that shouldn't matter as long as there is a way to define inter-app dependencies.

Cristiano Coelho

Feb 29, 2016, 8:00:21 PM
to Django developers (Contributions to Django itself)
Rich, I have just performed a real test, with a simple lambda and a datetime.now() defined at the top of the module as I said, and out of 100 requests, this was the result:
{u'2016-03-01T00:37:30.476828': [43], u'2016-03-01T00:36:51.536025': [58]}
Where the date is the datetime.now() defined at the top of the module, and the number is how many times that same value was returned (oddly it sums to 101; I think I did an additional request). So the module is really being reused, even with API Gateway (I ran the test from Python against the remote address). So in your case there must be something else going on, or I did this test incorrectly.
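For reference, the test function is essentially just this (the handler name is whatever the Lambda is configured to call):

    # The module-level timestamp is computed only on a cold start; a reused
    # container returns the same value for every subsequent request.
    import datetime

    MODULE_LOADED_AT = datetime.datetime.now().isoformat()

    def lambda_handler(event, context):
        return {"module_loaded_at": MODULE_LOADED_AT}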

Rich Jones

Feb 29, 2016, 8:21:11 PM
to Django developers (Contributions to Django itself)
Hm. This is the downside of cloud services - we cannot look under the hood.

Since this is something we _want_ cached, and because it makes the executed function shorter, it's a good code decision to make regardless. Thank you for the idea! However, it looks like the actual effect on round-trip time is negligible.

Maybe you can try with httping and show me your times? Or give django-zappa a shot yourself! This is the app I'm testing with: https://github.com/Miserlou/django-helloworld

Cristiano Coelho

Feb 29, 2016, 8:34:53 PM
to Django developers (Contributions to Django itself)
I think I have found your issue: if I'm not wrong, Django won't initialize twice if you call setup() twice (which was your previous behaviour), at least not completely, since the apps registry has a "ready" flag.
Check this line: https://github.com/django/django/blob/master/django/apps/registry.py#L66

So basically, having setup() at the top of the module or inside the handler should yield almost the same results, since Django won't reload apps that are already loaded - which is why the change didn't yield any different results. Does the slowdown come from somewhere other than setup()?
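You can see that guard yourself with a quick sketch (run anywhere DJANGO_SETTINGS_MODULE is configured):

    # apps.populate() returns immediately once the registry's ready flag is set,
    # so a second django.setup() does not re-import the apps.
    import django
    from django.apps import apps

    django.setup()
    assert apps.ready   # registry fully populated after the first call
    django.setup()      # near no-op: the ready flag short-circuits populate()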

From here, doing 100 requests, these are the results; times are quite a bit higher since I have at least a 200ms round trip from here to the USA.

min: 0.34s
max: 3.58s
avg: 0.57s

Rich Jones

Feb 29, 2016, 8:47:43 PM
to Django developers (Contributions to Django itself)
Ah, interesting! Thanks for tracking that down.

In chat we basically discovered that the intercontinental latency and shoddy wifi connections were responsible for a lot of the confusion.

Testing from a US-based fiber connection, we got results of ~40ms in both scenarios.
16 connects, 16 ok, 0.00% failed, time 15989ms
round-trip min/avg/max = 32.7/42.3/142.4 ms

So, there you have it! As if you needed more of a reason to get down with Zappa?! :-D

(Now, that's with a naive example. Expect this thread to get bumped when we start really hitting the databases...)

Cristiano Coelho

Feb 29, 2016, 8:58:33 PM
to Django developers (Contributions to Django itself)
In my opinion, those latencies are still a bit high - how much of that is really spent in Python/Lambda code? On a project of mine with django-rest-framework, without hitting the database, my times were around 1-4ms excluding any network latency. Debug mode and loggers might have a high impact on your times if they are using CloudWatch.

An interesting way of testing concurrency is to use Apache Benchmark (ab). You can do it easily from an EC2 machine, ideally in the same region as your Lambdas to reduce network latency to nearly zero. There you can test a nearly real scenario, as this tool is quite good for load testing. Be careful not to run a huge test, as you might start to get charged :D

I have added a GitHub issue on Zappa, from reading your handler.py code, with a few suggestions that may help you improve performance even further. Sorry that I'm too lazy to make a real pull request.

Aymeric Augustin

Mar 1, 2016, 2:26:41 AM
to django-d...@googlegroups.com
Hello Rich,


On 01 Mar 2016, at 01:44, Rich Jones <mise...@gmail.com> wrote:

I think the interesting things to explore would be:
 - Loading apps in parallel
 - "Pre-loading" apps before app deployment, then loading that cached state at runtime. I guess I'd need to know more about what it means to "load" an app to see if that makes any sense at all.

As of Django 1.7, loading an app means:

1. importing its app config class (what the dotted Python path in INSTALLED_APPS points to; if INSTALLED_APPS points to a Python package, an app config is generated on the fly)
2. importing its models module, if there is one (determined from the app config)
3. running the ready() method of the app config

In practice Django does 1. for all apps, then 2. for all apps, then 3. for all apps.

If “load an app” confuses you, read it as “import an app” or “import the Python modules comprising this app”. Really there isn’t much going on aside from importing a bunch of Python modules. Some data structures are built dynamically, but I’d be surprised if that took a measurable share of the startup time.
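For the curious, the visible result of all of this is just the populated app registry; a quick way to poke at it (once settings are configured) is:

    # Inspect what "loading the apps" produced: one AppConfig per INSTALLED_APPS
    # entry, each with its models module imported (if it has one).
    import django
    from django.apps import apps

    django.setup()
    for config in apps.get_app_configs():
        print(config.label, config.name, config.models_module)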

As far as I can tell from the rest of the discussion, the result of django.setup() is already cached and reused anyway, but I thought I’d demystify things a bit!


I imagine the former is probably more feasible. I understand the aversion to non-determinism, but I think that shouldn't matter as long as there is a way to define inter-app dependencies.

Non-deterministic software has this strange tendency to deterministically work on developers’ laptops and deterministically fail on production servers ;-)

More seriously, there are cases where importing a Django project works when app A is imported before app B but deadlocks when app B is imported before app A. Apps that attempt to perform auto-discovery on other apps (such as the admin, debug-toolbar, haystack, etc.) are especially prone to this. This isn’t a theoretical argument. We’ve had these bugs and I spent hundreds of hours refactoring the app-loading process to eliminate them.

Also it’s unclear to me how developers could be sure that they’ve defined all required inter-app dependencies.


Best regards,

-- 
Aymeric.
