While browsing the code, I noticed this in unix_events.py:
"""
def _sig_chld(self):
    try:
        try:
            pid, status = os.waitpid(0, os.WNOHANG)
        except ChildProcessError:
            return
"""
The current code will only wait for children in the same process group, so
if a child process has called setpgrp() or setsid() (which is common, e.g.
for a daemon), the above code won't reap it: is this wanted?
Otherwise, what's the rationale behind this code?
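For illustration, here is a minimal sketch of the process-group behaviour described above. It is not Tulip code: demo_group_wait is a hypothetical helper, and it assumes a Unix system and a process with no other children. The pipe is only there so the parent knows the child has already called setsid() before it waits.

```python
import os


def demo_group_wait():
    """Return (seen_by_group_wait, exit_code) for a child that called setsid()."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(r)
        os.setsid()          # leave the parent's process group
        os.write(w, b"x")    # tell the parent the group change has happened
        os._exit(7)

    os.close(w)
    os.read(r, 1)            # block until the child has called setsid()
    os.close(r)

    try:
        # pid 0 means "any child in the caller's own process group".
        reaped, _ = os.waitpid(0, os.WNOHANG)
    except ChildProcessError:
        reaped = None        # no children left in this process group
    seen_by_group_wait = (reaped == child)

    # Waiting on the explicit pid works regardless of the child's group.
    _, status = os.waitpid(child, 0)
    return seen_by_group_wait, os.WEXITSTATUS(status)
```

With pid -1 instead of 0, waitpid() would consider every child regardless of its process group.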
I'm thinking about a couple things:
1. registering a signal handler for SIGCHLD makes it much more likely
to have syscalls failing with EINTR
2. by waiting for all children indiscriminately, this will make some
user code calling waitpid() on child processes they spawned fail with
ECHILD (subprocess is guarded against this, but I expect some
third-party code might not be)
I guess the reason is to avoid having to call waitpid() on multiple
PIDs at a periodic interval...
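Point 2 above can be reproduced with a short sketch. This is an assumption-laden illustration, not the handler itself: demo_stolen_status is a hypothetical name, and the blocking os.waitpid(0, 0) loop stands in for a handler that reaps whichever child exits.

```python
import os


def demo_stolen_status():
    """Return True if the 'user code' wait fails because the child was already reaped."""
    child = os.fork()
    if child == 0:
        os._exit(0)

    # Stand-in for an indiscriminate handler: reap any child in our
    # process group until the one we just spawned has been collected.
    while os.waitpid(0, 0)[0] != child:
        pass

    try:
        # User code that spawned the child now tries to wait for it.
        os.waitpid(child, 0)
    except ChildProcessError:
        return True          # ECHILD: the exit status is already gone
    return False
```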
Otherwise, what's the rationale behind this code?
It just sounds odd to have to do explicit regular polling for every known subprocess when the SIGCHLD feature was designed to notify you when you should poll.
1. Syscalls failing with EINTR *shouldn't* be a problem when using Tulip: the send/receive code deals with this, and all calls have an except (BlockingIOError, InterruptedError) clause.
2. I guess this is an interoperability issue between Tulip and code that manually spawns subprocesses (bypassing the subprocess module). I expect that would rarely be an issue.
Perhaps we can design the subprocess handling so that you can turn off the SIGCHLD handler and in that case let the event loop poll for specific children repeatedly?
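A rough sketch of what polling specific children could look like, assuming the event loop keeps a set of known child PIDs and calls a helper periodically. The names poll_children and run_demo are hypothetical and not part of Tulip.

```python
import os
import time


def poll_children(pids):
    """Return {pid: status} for tracked children that have exited so far."""
    finished = {}
    for pid in list(pids):
        try:
            reaped, status = os.waitpid(pid, os.WNOHANG)
        except ChildProcessError:
            finished[pid] = None      # already reaped elsewhere; status unknown
            continue
        if reaped == pid:             # 0 means the child is still running
            finished[pid] = status
    return finished


def run_demo():
    """Spawn a quick and a slow child; poll until the quick one is reaped."""
    r, w = os.pipe()
    quick = os.fork()
    if quick == 0:
        os._exit(5)
    slow = os.fork()
    if slow == 0:
        os.close(w)
        os.read(r, 1)                 # blocks until the parent closes its end
        os._exit(0)
    os.close(r)

    results = {}
    while quick not in results:
        results.update(poll_children({quick, slow}))
        time.sleep(0.01)
    slow_still_running = slow not in results

    os.close(w)                       # let the slow child finish
    os.waitpid(slow, 0)
    return os.WEXITSTATUS(results[quick]), slow_still_running
```

Only the PIDs passed in are ever waited on, so children spawned by unrelated code are left alone.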
How about another approach, such as spawning a dedicated thread that
will block on os.waitpid(<known child pid>) and will wake up the event
loop when it returns?
(if you're spawning a process, you can probably bear the cost of also
spawning an additional thread)
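A minimal sketch of the thread-per-child idea, written against the present-day asyncio API (asyncio.run, get_running_loop) rather than the Tulip code under discussion, so treat those calls as an assumption about the environment. watch_child, wait_for_child and run_demo are hypothetical names.

```python
import asyncio
import os
import threading


def watch_child(loop, pid, future):
    """Block in a helper thread until pid exits, then wake the event loop."""
    _, status = os.waitpid(pid, 0)
    # Futures are not thread-safe, so hand the result over via the loop.
    loop.call_soon_threadsafe(future.set_result, status)


async def wait_for_child(pid):
    """Await the raw wait status of pid without a SIGCHLD handler."""
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    threading.Thread(
        target=watch_child, args=(loop, pid, future), daemon=True
    ).start()
    return await future


def run_demo():
    pid = os.fork()
    if pid == 0:
        os._exit(3)
    status = asyncio.run(wait_for_child(pid))
    return os.WEXITSTATUS(status)
```

Because each thread waits on one explicit PID, this neither depends on the child's process group nor reaps children owned by other code.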
Really, EINTR should be handled by the interpreter/stdlib (yeah, I
know I signed up for that, I've just been really busy recently...), but
that could help in the meantime for restartable syscalls.
The subprocess code is definitely not as baked as I'd like it to be -- I haven't had enough time to review and try it. See e.g. http://code.google.com/p/tulip/issues/detail?id=68; that issue might be related.