Memory profiling mod_wsgi/Django


Nik Haldimann

Apr 30, 2015, 3:09:00 PM
to mod...@googlegroups.com
Hi

I'm running a Django app on Amazon Elastic Beanstalk which means it runs on Apache 2.2.29 with mod_wsgi 3.5 in daemon mode.

In production, the app on Beanstalk is exhibiting very obvious symptoms of a memory leak. Memory usage of the httpd/wsgi processes increases steadily (roughly 2 MB per process every 5 minutes) until eventually the machine grinds to a halt. I suspect the leak is in my Python code or in some Python library I'm using.

I have not been able to reproduce this locally, either because I can't quite replicate the same environment or more likely because it's hard to simulate real production traffic and data. So I'm trying to do memory profiling on the production server but can't figure out how to do it. What are some techniques for this?

Things I've tried:

- Guppy/heapy: This seemed the most promising and worked fine locally, but when I turn on remote monitoring in the Python code run within mod_wsgi, I can't find the process. Guppy is under-documented, so I can't figure out whether there's something obvious I'm missing.

- Object-count tracking: Works as advertised, but unfortunately it isn't uncovering my particular leak. I can't see any particular object count growing out of control; I'm not sure why.
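For reference, this kind of object-count check can be approximated with just the standard library, without any third-party profiler (a sketch; the function names `type_counts` and `top_growth` are mine, not from any tool mentioned above):

```python
import gc
from collections import Counter

def type_counts():
    """Count live gc-tracked objects, keyed by type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())

def top_growth(before, after, top=10):
    """Return the types whose instance counts grew the most
    between two snapshots taken with type_counts()."""
    growth = {name: after[name] - before[name] for name in after}
    return sorted(growth.items(), key=lambda kv: -kv[1])[:top]
```

Calling `type_counts()` periodically from inside the running daemon process (e.g. from a debug view or a background thread) and diffing the snapshots will show which types keep accumulating. Note that `gc.get_objects()` only sees gc-tracked objects, so leaks of e.g. plain strings won't show up this way.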

Nik

Nik Haldimann

Apr 30, 2015, 3:36:46 PM
to mod...@googlegroups.com
One correction: I realized the Apache version Beanstalk uses by default is actually 2.4.10.

Nik

Graham Dumpleton

May 5, 2015, 10:33:20 PM
to mod...@googlegroups.com
Sorry for the slow reply. I've had various things distracting me of late, so I'm not getting to things as quickly as I should.

Anyway, have you learnt anything new about this?

One question: have you been able to determine whether the memory growth is in the Apache child worker processes or in the mod_wsgi daemon processes?
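One way to answer that on the server is to poll the resident set size of each process from /proc (Linux only; this is a sketch, and the helper names are mine). Setting the `display-name` option on `WSGIDaemonProcess` makes the daemon processes easy to tell apart from the Apache workers in `ps` output, so you know which PIDs belong to which group:

```python
def parse_vmrss_kb(status_text):
    """Pull the VmRSS value (in kB) out of the contents of a
    /proc/<pid>/status file; return None if the field is absent."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None

def rss_kb(pid):
    """Resident set size of a process in kB (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vmrss_kb(f.read())
```

Logging `rss_kb(pid)` for the worker PIDs and the daemon PIDs every few minutes will show which group is actually growing.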

Apache 2.2 and mod_wsgi 3.5 are both not really ideal versions to use.

There are certain cases where slow or blocking HTTP clients can cause memory usage in Apache 2.2 child worker processes to balloon. This is more prevalent when returning larger responses. The issue is fixed in Apache 2.4.

Large request content, where slow or blocking HTTP clients dribble in the content, can also cause a memory issue in mod_wsgi daemon mode while that content is being read. This is fixed in later mod_wsgi 4.X versions by dropping the Apache 1.3 API that caused the problem in favour of the better Apache 2.X API for reading request content. That change couldn't be made until Apache 1.3 support was dropped, which happened in mod_wsgi 4.X.

So, a second question: does your application have a high percentage of mobile browser clients? That would raise the chance of hitting these problems.

In short, no one should really be using Apache 2.2 or mod_wsgi 3.X, although I accept that on AWS you don't have much choice because they ship such out-of-date versions. The only alternative would be to switch to AWS's support for running a Docker container and build the deployment environment yourself with newer versions.

BTW, these two problems don't cause unbounded memory growth. They will, though, set a higher baseline of normal memory usage than newer Apache/mod_wsgi versions would. That higher baseline may still cause you to breach any limit they place on you.

Graham



Graham Dumpleton

May 5, 2015, 10:34:07 PM
to mod...@googlegroups.com
Now I see you are actually using Apache 2.4.10. :-(

The issue with request content would still apply though.

Graham

Nik Haldimann

May 6, 2015, 10:09:13 AM
to mod...@googlegroups.com
Thanks for the ideas, Graham. After much futzing about I was able to use Guppy on a production server and discovered that the leak was in one of the libraries I was using.

So the leak wasn't in mod_wsgi (I never thought it would be); I just didn't know how to use the memory profiling tools correctly. I'll write up a blog post about it one of these days, because it's not really obvious.

Nik

--
You received this message because you are subscribed to a topic in the Google Groups "modwsgi" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/modwsgi/9Tayl9hpFbI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to modwsgi+u...@googlegroups.com.
To post to this group, send email to mod...@googlegroups.com.
Visit this group at http://groups.google.com/group/modwsgi.
For more options, visit https://groups.google.com/d/optout.

Graham Dumpleton

May 6, 2015, 7:14:51 PM
to mod...@googlegroups.com
Nothing with guppy/heapy is obvious. The few times I've tried to use it, it has been a right pain and generally not helpful.

The new tracemalloc module in Python 3.4 is interesting, but again you need to devote some effort to working out how to use it and how to interpret what you get.
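The basic pattern with tracemalloc is to take a snapshot before and after, then diff them by source line (a minimal sketch using only the standard library; the bytearray loop just simulates a leak for illustration):

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulate a leak: hold on to roughly 1 MB of allocations.
leaked = [bytearray(1024) for _ in range(1000)]

after = tracemalloc.take_snapshot()

# Show the source lines whose allocations grew the most.
for stat in after.compare_to(before, "lineno")[:5]:
    print(stat)
```

In a long-running mod_wsgi daemon process you would call `tracemalloc.start()` at import time and take snapshots from a debug endpoint at intervals, diffing successive ones.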

Anyway, glad to hear you sorted it out.

Graham 
