multiprocessing / forking memory usage


Randall Smith

May 26, 2009, 1:52:22 PM
to pytho...@python.org
I'm trying to get a grasp on how memory usage is affected when forking
as the multiprocessing module does. I've got a program with a parent
process using wx and other memory intensive modules. It spawns child
processes (by forking) that should be very lean (no wx required, etc).
Based on inspection using "ps v" and psutil, the memory usage (rss) is
much higher than I would expect for the subprocess.

My understanding is that COW is used when forking (on Linux). So maybe
"ps v pid" is reflecting that. If that's the case, is there a way to
better determine the child's memory usage? If it's not the case and I'm
using modules I don't need, how can I reduce the memory usage to what
the child actually uses instead of including everything the parent is using?

Randall

Piet van Oostrum

May 26, 2009, 5:14:37 PM
>>>>> Randall Smith <ran...@tnr.cc> (RS) wrote:

>RS> I'm trying to get a grasp on how memory usage is affected when forking as
>RS> the multiprocessing module does. I've got a program with a parent process
>RS> using wx and other memory intensive modules. It spawns child processes (by
>RS> forking) that should be very lean (no wx required, etc). Based on
>RS> inspection using "ps v" and psutil, the memory usage (rss) is much higher
>RS> than I would expect for the subprocess.

The child is a clone of the parent. So both its virtual memory usage and
its resident memory usage will be equal to the parent's immediately
after the fork(). But physical memory holds only one copy of the shared
pages, although ps will count them for both processes (at least I think
that's how ps works). Of course the two will diverge later, as either
process writes to a page and gets its own private copy.
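As a rough illustration of the difference between what ps reports and what a process really holds on its own, here is a minimal sketch that assumes Linux and a kernel exposing /proc/<pid>/smaps with the Rss, Pss, Private_Clean and Private_Dirty counters. The function names summarize_smaps and summarize_process are made up for this example. Rss counts shared pages in full for every process, while Pss divides each shared page among the processes sharing it.

```python
import os


def summarize_smaps(smaps_text):
    """Sum the Rss, Pss and private-page counters (in kB) from smaps text."""
    totals = {"Rss": 0, "Pss": 0, "Private": 0}
    for line in smaps_text.splitlines():
        parts = line.split()
        # Counter lines look like "Rss:   120 kB"; skip mapping headers
        # and non-numeric lines such as "VmFlags: rd ex mr".
        if len(parts) < 2 or not parts[0].endswith(":") or not parts[1].isdigit():
            continue
        key = parts[0][:-1]
        value = int(parts[1])
        if key in ("Rss", "Pss"):
            totals[key] += value
        elif key in ("Private_Clean", "Private_Dirty"):
            totals["Private"] += value
    return totals


def summarize_process(pid=None):
    """Read /proc/<pid>/smaps (Linux only) and summarize it."""
    pid = os.getpid() if pid is None else pid
    with open("/proc/%d/smaps" % pid) as f:
        return summarize_smaps(f.read())


if __name__ == "__main__":
    print(summarize_process())
```

Calling summarize_process() in the child right after the fork should show an Rss close to the parent's, with much smaller Pss and Private figures as long as most pages are still shared.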

>RS> My understanding is that COW is used when forking (on Linux).

I think this is true of all modern Unix systems.

>RS> So maybe "ps v pid" is reflecting that. If that's the case, is
>RS> there a way to better determine the child's memory usage?

Define `memory usage' in the light of the above.

As long as the parent is still around and you don't run out of virtual
memory in the child, not much harm is done.

If the parent stops and you don't run out of virtual memory in the
child, the excess pages will eventually be paged out, and then no
longer occupy physical memory. As long as you have enough swap space it
shouldn't be a big problem. The extra paging activity is a bit of a
loss, however.

If you run out of virtual memory in the child you have a problem, however.

>RS> If it's not the case and I'm using
>RS> modules I don't need, how can I reduce the memory usage to what the child
>RS> actually uses instead of including everything the parent is using?

The best would be to fork the child before you import the excess modules
in the parent. If that is not possible you could try to delete as much
in the child as you can, for example by
del wx; del sys.modules['wx'] etc, delete all variables that you don't
need, and hope the garbage collector will clean up enough. But it will
make your application quite complicated. From the Python level you can't
get rid of loaded shared libraries, however. And trying to do that from
the C level is probably close to committing suicide.
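The "fork first, import later" ordering can be sketched as follows. This is only an illustration under stated assumptions: it needs a platform with os.fork(), the function name fork_then_import is made up, and the stdlib module colorsys stands in for a heavy package such as wx. The child reports over a pipe whether the module ever showed up in its own sys.modules.

```python
import importlib
import os
import sys

HEAVY_MODULE = "colorsys"  # hypothetical stand-in for wx or similar


def fork_then_import(module_name):
    """Fork, then import module_name in the parent only.

    Returns True if the child sees the module in its sys.modules.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: forked before the import, so it never loads the module.
        os.close(read_fd)
        loaded = module_name in sys.modules
        os.write(write_fd, b"1" if loaded else b"0")
        os.close(write_fd)
        os._exit(0)  # leave without running the parent's cleanup handlers
    # Parent: the heavy import happens only after the fork.
    os.close(write_fd)
    importlib.import_module(module_name)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return answer == b"1"


if __name__ == "__main__":
    print("child has module:", fork_then_import(HEAVY_MODULE))
```

With multiprocessing the same idea would mean creating the worker processes before the GUI imports run in the parent.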

My advice: don't worry until you really experience memory problems.
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org

Randall Smith

May 27, 2009, 1:57:29 PM
to pytho...@python.org
Thanks Piet. You gave a good explanation and I think I understand much
better now.

Aahz

May 30, 2009, 5:22:57 PM
In article <mailman.771.1243360...@python.org>,

Randall Smith <ran...@tnr.cc> wrote:
>
>I'm trying to get a grasp on how memory usage is affected when forking
>as the multiprocessing module does. I've got a program with a parent
>process using wx and other memory intensive modules. It spawns child
>processes (by forking) that should be very lean (no wx required, etc).
>Based on inspection using "ps v" and psutil, the memory usage (rss) is
>much higher than I would expect for the subprocess.

One option if you're concerned about memory usage is to call one of the
os.exec*() functions (os.execv(), for example) after forking to run
another program, which will overlay the current process.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

my-python-code-runs-5x-faster-this-month-thanks-to-dumping-$2K-
on-a-new-machine-ly y'rs - tim
