First thing - make sure DEBUG is set to off.
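(With DEBUG = True, Django keeps a record of every SQL query it runs in memory, in django.db.connection.queries, which grows without bound in a long-running loop. A minimal settings.py fragment:

```python
# settings.py
# With DEBUG = True, every executed query is appended to
# django.db.connection.queries and never freed.
DEBUG = False
```

)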
AndyB wrote:
> I've got a Django app that seems to eat up a lot of memory. I posted a
> message on Stack Overflow and it got a little sidetracked into a
> debate about the merits of WSGI and Apache in Worker MPM mode.
If that's not the problem, let me take a wild shot in the dark: are you
by any chance looping through a large queryset?
I recently had a process hog about 1.8 GB of RAM when looping through a
queryset of approx. 350k items like this:

    for obj in Model.objects.all():
        do_something(obj)
I rewrote it to:

    objs = Model.objects.all().values_list("id", flat=True)
    for obj_id in objs:
        obj = Model.objects.get(pk=obj_id)
        do_something(obj)

... and my RAM usage stayed below 30 MB at all times.
Cheers,
> Maybe it's getting late and I deserve a nice gin and tonic...
--
Christian Joergensen
http://www.technobabble.dk
Well, that in itself shouldn't cause that much RAM usage - just longer
execution time.
My guess, without looking at the queryset implementation, is that it
caches the items it has already passed.
> for obj in Model.objects.all().iterator():
>     do_something(obj)
Thank you. I didn't know about that function. That is certainly prettier
than my "hack" :)
I assumed that was the default behaviour when iterating. There shouldn't
be any need to cache previous items, as there is (to my knowledge) no
way to retrieve previous items from a Python iterator.
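The caching-vs-streaming difference can be sketched in plain Python, without Django (fetch_rows here is a made-up stand-in for a database cursor):

```python
def fetch_rows(n):
    """Simulate a DB cursor yielding rows one at a time."""
    for i in range(n):
        yield {"id": i}

# Default queryset-style behaviour: materialize everything up front,
# so all n rows are alive in memory at the same time.
cached = list(fetch_rows(100_000))

# iterator()-style behaviour: consume rows one at a time; each row
# becomes garbage immediately after the loop body finishes with it.
count = 0
for row in fetch_rows(100_000):
    count += 1  # stand-in for do_something(row)

print(len(cached), count)
```

Both loops see the same 100,000 rows; only the first keeps them all alive at once.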
Regards,
That's completely normal behaviour for Unix-like memory management. Once
memory is freed, it is returned to the process for later reuse (so a
subsequent malloc() call by that process will reuse memory that it had
earlier freed). This helps avoid some causes of memory fragmentation and
allows large contiguous blocks to be malloc-ed later by the same process.
It's also precisely why long-running processes are often designed to
have their memory-intensive portions restart periodically: it's the act
of exiting the process at the system level that returns the memory to
the system pool.
Regards,
Malcolm
Well, that's not really accurate, as you realise further down. The
maximum amount of RAM used will not decrease. However, it won't increase
without bound unless you actually require a larger simultaneous amount.
[...]
> So my memory usage will always statistically converge on:
> (Amount of RAM used by most expensive request) x (Maximum simultaneous
> Django processes)
Provided the most expensive request is likely to occur frequently. In
many cases, the most expensive request could well be an outlier that
only happens infrequently. Statistically infrequent events are a smaller
concern, since they'll happen less than once, on average, between
process restarts.
>
> and nothing I do with within Django itself can reduce this.
That's correct. It's not a really bad thing, since, as I mentioned, it's
completely normal Unix process behaviour, and webservers already account
for it with settings like the maximum number of requests per child. It's
also one of the arguments sometimes made for using multiple threads
instead of multiple processes (the memory allocator operates at
per-process granularity, so freed memory can be reused across threads).
Definitely something to take into consideration, but it's manageable.
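For example, with mod_wsgi in daemon mode the recycling can be configured per daemon process (the process group name and the numbers below are illustrative, not recommendations):

```apache
# Recycle each daemon process after serving 1000 requests, so its
# peak memory is returned to the OS when the process exits.
WSGIDaemonProcess myapp processes=4 threads=1 maximum-requests=1000
```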
Regards,
Malcolm