
OSError: [Errno 12] Cannot allocate memory


duncan smith

Nov 30, 2016, 12:35:02 PM

Hello,
I have had an issue with some code for a while now, and I have not
been able to solve it. I use the subprocess module to invoke dot
(Graphviz) to generate a file. But if I do this repeatedly I end up with
an error. The following traceback is from a larger application, but
repeated calls to 'to_image' appear to be the issue.


Traceback (most recent call last):
  File "<pyshell#80>", line 1, in <module>
    z = link_exp.sim1((djt, tables), variables, 1000, 400, 600,
        [0,1,2,3,4,5,6], [6,7,8,9,10], ind_gens=[link_exp.males_gen()],
        ind_gens_names=['Forename'], seed='duncan')
  File "link_exp.py", line 469, in sim1
    RL_F2 = EM_setup(data)
  File "link_exp.py", line 712, in full_EM
    last_g = prop.djt.g
  File "Nin.py", line 848, in draw_model
    dot_g.to_image(filename, prog='dot', format=format)
  File "dot.py", line 597, in to_image
    to_image(str(self), filename, prog, format)
  File "dot.py", line 921, in to_image
    _execute('%s -T%s -o %s' % (prog, format, filename))
  File "dot.py", line 887, in _execute
    close_fds=True)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory


The relevant (AFAICT) code is,


def to_image(text, filename, prog='dot', format='dot'):
    # prog can be a series of commands
    # like 'unflatten -l 3 | dot'
    handle, temp_path = tempfile.mkstemp()
    f = open(temp_path, 'w')
    try:
        f.write(text)
        f.close()
        progs = prog.split('|')
        progs[0] = progs[0] + ' %s ' % temp_path
        prog = '|'.join(progs)
        _execute('%s -T%s -o %s' % (prog, format, filename))
    finally:
        f.close()
        os.remove(temp_path)
        os.close(handle)

def _execute(command):
    # shell=True security hazard?
    p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         close_fds=True)
    output = p.stdout.read()
    p.stdin.close()
    p.stdout.close()
    #p.communicate()
    if output:
        print output


Any help solving this would be appreciated. Searching around suggests
this is something to do with file handles, but my various attempts to
solve it have failed. Cheers.

Duncan

Chris Kaynor

Nov 30, 2016, 12:54:31 PM

On Wed, Nov 30, 2016 at 9:34 AM, duncan smith <dun...@invalid.invalid> wrote:
> Hello,
> I have had an issue with some code for a while now, and I have not
> been able to solve it. I use the subprocess module to invoke dot
> (Graphviz) to generate a file. But if I do this repeatedly I end up with
> an error. The following traceback is from a larger application, but
> repeated calls to 'to_image' appear to be the issue.

I don't see any glaring problems that would obviously cause this;
however, have you checked whether the processes are actually exiting
(it looks like you are on Linux, so the top command)?

This code has a potential dead-lock. If you are calling it from
multiple threads/processes, it could cause issues, and that should be
obvious if it happens, as your program will also not exit. The
communicate call is safe, but commented out (you'd need to remove the
three lines above it as well). Additionally, you could just set
stdin=None rather than PIPE, which avoids the dead-lock entirely, and
you aren't using stdin anyway. The dead-lock arises if the subprocess
ever waits for something to be written to its stdin: it will block
forever, while your call to read will also block until the subprocess
closes its stdout (and possibly in other cases). Another option would
be to close stdin before starting the read; however, if you ever
write to stdin, you'll reintroduce the same issue, depending on OS
buffer sizes.
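
To make that concrete, here is a minimal sketch of what I mean
(untested, keeping your shell=True so the '|' pipelines still work):

def _execute(command):
    # stdin is left at its default of None: nothing is ever written
    # to the child, so this side of the dead-lock cannot happen
    p = subprocess.Popen(command, shell=True,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         close_fds=True)
    # communicate() reads stdout to EOF and then waits for the
    # process to exit, so no zombie is left behind
    output, _ = p.communicate()
    if output:
        print output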

My question above also comes from the fact that I am not 100% sure
when stdout.read() will return. It is possible that a null or EOF
could cause it to return before the process actually exits. The
subprocess could also explicitly close its stdout, causing it to
return while the process is still running. I'd recommend adding a
p.wait() or just uncommenting the p.communicate() call to avoid these
issues.

On another, unrelated note: the security hazard depends on where the
arguments to _execute are coming from. If any of them are controlled
by untrusted sources (namely, user input), you have a shell-injection
attack. Imagine, for example, if the user requests the filename
"a.jpg|wipehd" (note: I don't know the actual disk-wiping command on
Linux, so substitute your command of choice). This would cause your
code to wipe the HD by piping into that command. If all of the inputs
are 100% sanitized or come from trusted sources, you're fine;
however, that can be extremely difficult to guarantee.
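
For illustration, a rough sketch of the safer list form (the dot
arguments here are made up, not your exact command):

# Each argument is passed to dot directly, so a malicious filename
# like 'a.jpg|wipehd' is just an odd file name, never a shell command.
args = ['dot', '-Tpng', '-o', filename, temp_path]
p = subprocess.Popen(args, shell=False,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
output, _ = p.communicate()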

duncan smith

Nov 30, 2016, 12:54:42 PM

[snip]

Sorry, should have said Python 2.7.12 on Ubuntu 16.04.

Duncan

Chris Angelico

Nov 30, 2016, 12:57:45 PM

On Thu, Dec 1, 2016 at 4:34 AM, duncan smith <dun...@invalid.invalid> wrote:
>
> def _execute(command):
>     # shell=True security hazard?
>     p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
>                          stdout=subprocess.PIPE,
>                          stderr=subprocess.STDOUT,
>                          close_fds=True)
>     output = p.stdout.read()
>     p.stdin.close()
>     p.stdout.close()
>     #p.communicate()
>     if output:
>         print output

Do you ever wait() these processes? If not, you might be leaving a
whole lot of zombies behind, which will eventually exhaust your
process table.
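
Something along these lines, for instance (sketch only, spliced into
your existing code):

output = p.stdout.read()
p.stdout.close()
p.wait()  # reaps the child; without this the kernel keeps the zombie entry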

ChrisA

duncan smith

Nov 30, 2016, 7:12:43 PM

No. I've just called this several thousand times (via calls from a
higher level function) and had no apparent problem. Top reports no
zombie tasks, and memory consumption and the number of sleeping tasks
seem to be reasonably stable. I'll try running the code that generated
the error to see if I can coerce it into failing again. OK, no error
this time. Great, an intermittent bug that's hard to reproduce ;-). At
the end of the day I just want to invoke dot to produce an image file
(perhaps many times). Is there perhaps a simpler and more reliable way
to do this? Or do I just add the p.wait()? (The commented out
p.communicate() is there from a previous, failed attempt to fix this -
as, I think, are the shell=True and close_fds=True.) Cheers.

Duncan

Chris Kaynor

Nov 30, 2016, 7:47:02 PM

That would appear to rule out the most common issues I would think of.

That said, are these calls being done in a tight loop (the full
call-stack implies it might be a physics simulation)? Are you doing
any threading (either in Python or when making the calls to Python -
using a bash command to start new processes without waiting counts)?
Is there any exception handling at a higher level that might be
continuing past the error and sometimes allowing a zombie process to
stay?

If you are making a bunch of calls in a tight loop, that could be
your issue, especially as you are not waiting on the processes
(though the communicate call does so implicitly, and thus should have
fixed the issue). This could be intermittent if the processes
sometimes complete quickly and other times are delayed. In those
cases, a ton of dot processes (and shells, with shell=True) could be
created before any finish, causing massive resource usage. Some of
the processes may be hanging rather than outright crashing, and thus
leaking some resources.

BTW, the docstring in to_image implies that the shell=True is not an
attempted fix for this - the example 'unflatten -l 3 | dot'
explicitly suggests the use of shell=True.
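
If you ever do want to drop shell=True, a pipeline can also be built
by chaining Popen objects. An untested sketch, with made-up file
names:

p1 = subprocess.Popen(['unflatten', '-l', '3', 'in.dot'],
                      stdout=subprocess.PIPE)
p2 = subprocess.Popen(['dot', '-Tpng', '-o', 'out.png'],
                      stdin=p1.stdout,
                      stdout=subprocess.PIPE)
p1.stdout.close()  # so p1 receives SIGPIPE if p2 exits first
output, _ = p2.communicate()
p1.wait()          # reap the first process as well

This is the pattern the subprocess documentation describes for
replacing shell pipelines.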

duncan smith

Nov 30, 2016, 7:55:02 PM

Thanks. So something like the following might do the job?

def _execute(command):
    p = subprocess.Popen(command, shell=False,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         close_fds=True)
    # stderr is merged into stdout, so everything arrives in out_data
    out_data, _ = p.communicate()
    if out_data:
        print out_data

Duncan

Chris Kaynor

Nov 30, 2016, 8:12:49 PM

On Wed, Nov 30, 2016 at 4:54 PM, duncan smith <dun...@invalid.invalid> wrote:
>
> Thanks. So something like the following might do the job?
>
> def _execute(command):
>     p = subprocess.Popen(command, shell=False,
>                          stdout=subprocess.PIPE,
>                          stderr=subprocess.STDOUT,
>                          close_fds=True)
>     # stderr is merged into stdout, so everything arrives in out_data
>     out_data, _ = p.communicate()
>     if out_data:
>         print out_data

I did not notice when I sent my first e-mail (though I noted it in my
second) that the docstring in to_image presumes shell=True. That
said, as everybody seems to be at a loss to explain your issue,
perhaps there is some oddity here, and if everything appears to work
with shell=False, it may be worth changing just to see whether it
fixes the problem. Given the other information since provided,
however, that seems unlikely.
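
One caveat if you do try shell=False: the command then needs to be a
list of arguments rather than a single string, or the whole string
will be treated as the name of the executable. An untested sketch
using shlex to do the splitting:

import shlex

def _execute(command):
    # shlex.split("dot -Tpng -o out.png tmp") gives
    # ['dot', '-Tpng', '-o', 'out.png', 'tmp']
    p = subprocess.Popen(shlex.split(command), shell=False,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         close_fds=True)
    out_data, _ = p.communicate()
    if out_data:
        print out_data

Note that the '|' pipelines to_image builds would then need the
chained-Popen approach from my earlier message instead.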

Not specifying stdin may help, however it will only reduce the file
handle count by 1 per call (from 2), so there is probably a root
problem that it will not solve.

I would expect the communicate change to fix the problem, except for
your follow-up indicating that you had tried that before without
success.

Removing the manual stdout.read may fix it, if the problem is due to
hanging processes, but again, your follow-up indicates that's not the
problem - you should have zombie processes if that were the case.

A few new questions that you have not answered (nor have they been
asked in this thread): How much memory does your system have? Are you
running a 32-bit or 64-bit Python? Is your Python process being run
with any additional limits imposed via system commands (I don't know
the command offhand, but I know it exists; similarly, if Python is
launched from a third-party app, that app could be placing limits)?

Chris

duncan smith

Nov 30, 2016, 8:17:21 PM

In this case the calls *are* in a loop (record linkage using an
expectation maximization algorithm).

> If you are making a bunch of calls in a tight loop, that could be
> your issue, especially as you are not waiting on the processes
> (though the communicate call does so implicitly, and thus should
> have fixed the issue). This could be intermittent if the processes
> sometimes complete quickly and other times are delayed. In those
> cases, a ton of dot processes (and shells, with shell=True) could be
> created before any finish, causing massive resource usage. Some of
> the processes may be hanging rather than outright crashing, and thus
> leaking some resources.
>

I'll try the p.communicate thing again. The last time I tried it I might
have already got myself into a situation where launching more
subprocesses was bound to fail. I'll edit the code, launch IDLE again
and see if it still happens.

> BTW, the docstring in to_image implies that the shell=True is not
> an attempted fix for this - the example 'unflatten -l 3 | dot'
> explicitly suggests the use of shell=True.
>


OK. As you can see, I don't really understand what's happening under the
hood :-). Cheers.

Duncan

duncan smith

Dec 1, 2016, 12:10:39 PM

8 Gig, 64 bit, no additional limitations (other than any that might be
imposed by IDLE). In this case the simulation does consume *a lot* of
memory, but that hasn't been the case when I've hit this in the past. I
suppose that could be the issue here. I'm currently seeing if I can
reproduce the problem after adding the p.communicate(), but it seems to
be using more memory than ever (dog slow and up to 5 Gig of swap). In
the meantime I'm going to try to refactor to reduce memory requirements
- and 32 Gig of DDR3 has been ordered. I'll also dig out some code that
generated the same problem before to see if I can reproduce it. Cheers.

Duncan
