After running Emacs for *some* time (like one or two days), call-process starts to fail:
(call-process "ls")
Debugger entered--Lisp error: (file-error "Creating process pipe" "no error")
  call-process("ls")
  eval((call-process "ls") nil)
  eval-last-sexp-1(nil)
  ad-Orig-eval-last-sexp(nil)
  eval-last-sexp(nil)
  call-interactively(eval-last-sexp nil nil)
What exactly does this mean?
In GNU Emacs 24.0.50.1 (i386-mingw-nt6.1.7600)
 of 2011-06-28 on 3249CTO
Windowing system distributor `Microsoft Corp.', version 6.1.7600
configured using `configure --with-gcc (4.5) --no-opt --cflags -Ic:/build/include'
That something is broken inside Emacs, but Emacs doesn't say what.
Did you build Emacs yourself? If so, could you please add a call to GetLastError to sys_pipe (defined in w32.c), right after the _pipe call, and when it fails like that, see which error code it returns?
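As a rough illustration of that kind of check (a sketch only, not the real sys_pipe): the helper name `make_pipe_reporting_errors` is hypothetical, and the non-Windows branch exists only so the sketch can be compiled elsewhere. Since `_pipe` is a C runtime function documented to return -1 and set `errno` on failure, the sketch records `errno` alongside `GetLastError`, and it reports them only when the call actually failed.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Hypothetical helper, not Emacs code: create a pipe and, only if the
   call fails, print the error codes captured right after it.  */
int
make_pipe_reporting_errors (int fds[2])
{
  int rc, saved_errno;
#ifdef _WIN32
  unsigned long win_err;
#endif

  assert (fds != NULL);
#ifdef _WIN32
  rc = _pipe (fds, 0, _O_NOINHERIT | _O_BINARY);
  win_err = GetLastError ();
#else
  rc = pipe (fds);   /* stand-in so the sketch builds off Windows */
#endif
  saved_errno = errno;   /* save before stdio calls can overwrite it */

  if (rc != 0)
    {
      fprintf (stderr, "pipe failed: errno = %d (%s)\n",
               saved_errno, strerror (saved_errno));
#ifdef _WIN32
      fprintf (stderr, "GetLastError = %lu\n", win_err);
#endif
    }
  return rc;
}
```

Printing the codes unconditionally, including after successful calls, makes the log harder to read, because a value left over from an earlier call says nothing about the current one.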
> In GNU Emacs 24.0.50.1 (i386-mingw-nt6.1.7600)
>  of 2011-06-28 on 3249CTO
This is quite old. I suggest updating to a newer version, to avoid wasting energy on an old bug that was already fixed.
I was using Sean Sieger's build. Anyway, I updated my bzr repo to "revno: 105425" and made the following change:
=== modified file 'src/w32.c'
--- src/w32.c 2011-07-09 07:00:58 +0000
+++ src/w32.c 2011-08-10 01:26:51 +0000
@@ -5218,6 +5218,7 @@
      pipes into binary mode; we will do text mode translation ourselves
      if required.  */
   rc = _pipe (phandles, 0, _O_NOINHERIT | _O_BINARY);
+  printf("xwl: error = %d\n", GetLastError ());
   if (rc == 0) {
In gdb, the error number printed is always zero, even when this file-error comes up. But it seems it would first give this error:
Not sure yet, but it sounds unlikely (the limit is on simultaneous processes). Do you see the value of rc becoming negative at some point? If so, does _pipe return a negative value, or does it become negative in this fragment below the call to _pipe?
> If the latter, it sounds like we are not closing the file handles
> somewhere.
> I put a printf after "rc = -1". I can see lots of logs from there.
> _pipe never returns a negative value.
We are close. This probably means that we are not closing file descriptors somewhere. When these printf's about rc == -1 start to appear, can you look at all the elements of the fd_info[] array (there are 64 of them), and see which flags are set on most of the elements, and whether or not the `cp' member is non-NULL? This information might give a clue as to what functionality is stealing the file descriptors and not releasing them.
> We are close. This probably means that we are not closing file
> descriptors somewhere. When these printf's about rc == -1 start to
> appear, can you look at all the elements of the fd_info[] array (there
> are 64 of them), and see which flags are set on most of the elements,
> and whether or not the `cp' member is non-NULL? This information
> might give a clue as to what functionality is stealing the file
> descriptors and not releasing them.
Here is the fd_info array. Most flags are 273 or 274 (0x111 or 0x112), namely FILE_PIPE read and write? And most of the cp members are NULL; does that imply those are the ones that were not properly released?
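To make those two values easier to read, here is a small decoding sketch. The flag values are the ones I recall from src/w32.h and should be treated as assumptions to check against the actual header; `describe_fd_flags` is a hypothetical helper, not an Emacs function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values, as recalled from src/w32.h; verify against the header. */
#define FILE_READ   0x0001
#define FILE_WRITE  0x0002
#define FILE_BINARY 0x0010
#define FILE_PIPE   0x0100

/* Hypothetical helper: write the names of the known bits set in FLAGS
   into BUF (each name followed by a space) and return BUF.  */
char *
describe_fd_flags (unsigned flags, char *buf, size_t size)
{
  assert (buf != NULL && size >= 32);   /* longest result is 23 chars */
  buf[0] = '\0';
  if (flags & FILE_PIPE)
    strcat (buf, "PIPE ");
  if (flags & FILE_BINARY)
    strcat (buf, "BINARY ");
  if (flags & FILE_READ)
    strcat (buf, "READ ");
  if (flags & FILE_WRITE)
    strcat (buf, "WRITE ");
  return buf;
}
```

Under those assumed values, 0x111 decodes as the read end of a binary pipe and 0x112 as the write end, which is consistent with the guess above.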
> It gives an error:
> while: Spawning child process: resource temporarily unavailable
I'm not sure the problem reproduced by this snippet is the same one as what you reported originally. In the above snippet, the problem happens because we never give Emacs a chance to take note of the processes that exit, and free the handles used for the 2 pipes we open for each subprocess. If I uncomment the sleep-for call, the program runs to completion with no problems, even if I replace 1 with 0.1.
The underlying issue is that the Windows build of Emacs is limited to 31 simultaneous subprocesses. That's because the APIs used on Windows to listen to subprocesses are limited to 64 handles, and we use 2 handles per pipe (3 more handles are taken by the standard I/O handles). So we cannot start 50 subprocesses unless the first few exit by the time we get to the 32nd process. Emacs checks for exited subprocesses when it is idle, but the above loop never gives it a chance to do that. Adding a call to sleep-for does, and so the problem disappears.
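A toy model of this effect, under stated assumptions: a table of 64 slots, 3 of them reserved, 2 consumed per subprocess, and every child exiting at once but holding its slots until it is reaped. All names (`toy_table`, `toy_spawn`, `toy_reap`, `toy_run`) are hypothetical, and the exact count at which the toy fails need not match the real limit, since Emacs does its accounting differently.

```c
#include <assert.h>
#include <stdbool.h>

#define TABLE_SIZE      64   /* handle limit quoted above */
#define RESERVED_SLOTS   3   /* standard I/O handles */
#define SLOTS_PER_CHILD  2   /* handles used per subprocess */

struct toy_table
{
  int in_use;            /* slots currently occupied */
  int exited_unreaped;   /* children that exited but still hold slots */
};

/* Try to start one toy child; fail if the table has no room.  */
static bool
toy_spawn (struct toy_table *t)
{
  if (t->in_use + SLOTS_PER_CHILD > TABLE_SIZE)
    return false;
  t->in_use += SLOTS_PER_CHILD;
  t->exited_unreaped++;   /* the toy child exits immediately */
  return true;
}

/* What an idle moment allows: release the slots of exited children.  */
static void
toy_reap (struct toy_table *t)
{
  t->in_use -= t->exited_unreaped * SLOTS_PER_CHILD;
  t->exited_unreaped = 0;
}

/* Start up to N children in a tight loop, optionally reaping after each
   one, and return how many were started before the first failure.  */
int
toy_run (int n, bool reap_between)
{
  struct toy_table t = { RESERVED_SLOTS, 0 };
  int started = 0;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    {
      if (!toy_spawn (&t))
        break;
      started++;
      if (reap_between)
        toy_reap (&t);
    }
  return started;
}
```

In this model `toy_run (50, false)` stops early, while `toy_run (50, true)` starts all 50, which mirrors the difference the sleep-for call makes.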
I can achieve similar results with a patch I show below, which causes sys_pipe to retry the failed _pipe call after doing the equivalent of `(sleep-for 0.1)'.
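The general shape of such a retry can be sketched independently of Emacs. Everything here is hypothetical (`retry_with_pause`, the demo callbacks, and `demo_retry`), and bounding the number of attempts is my own choice for the sketch, not something this thread shows the patch doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*attempt_fn) (void *ctx);
typedef void (*pause_fn) (void *ctx);

/* Call ATTEMPT up to MAX_TRIES times, calling PAUSE between failures so
   that resources held by exited children can be released.  */
bool
retry_with_pause (attempt_fn attempt, pause_fn pause, void *ctx,
                  int max_tries)
{
  assert (attempt != NULL && pause != NULL && max_tries > 0);
  for (int i = 0; i < max_tries; i++)
    {
      if (attempt (ctx))
        return true;
      if (i + 1 < max_tries)
        pause (ctx);
    }
  return false;
}

/* Demo state: a pool of free slots that grows a little on each pause.  */
struct demo_ctx
{
  int free_slots;
  int freed_per_pause;
};

static bool
demo_attempt (void *p)
{
  struct demo_ctx *c = p;
  if (c->free_slots < 2)   /* a pipe needs two slots */
    return false;
  c->free_slots -= 2;
  return true;
}

static void
demo_pause (void *p)
{
  struct demo_ctx *c = p;
  c->free_slots += c->freed_per_pause;
}

bool
demo_retry (int initial_free, int freed_per_pause, int max_tries)
{
  struct demo_ctx c = { initial_free, freed_per_pause };
  return retry_with_pause (demo_attempt, demo_pause, &c, max_tries);
}
```

For example, `demo_retry (0, 1, 3)` succeeds on the third attempt, while `demo_retry (0, 0, 3)` gives up, because pausing never frees anything.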
However, I'm not sure this actually solves your original problem, for two reasons:
. you said that your problem starts happening only after Emacs has been up and running for some time, whereas this recipe works right away after starting "emacs -Q"
. I really doubt that you use some code that launches many subprocesses one after the other without any idleness in between
So I think there's a different bug somewhere. Or maybe I'm missing something. Can you tell more about the context of your original problem, which produced the following backtrace:
Debugger entered--Lisp error: (file-error "Creating process pipe" "no error")
  call-process("ls")
  eval((call-process "ls") nil)
  eval-last-sexp-1(nil)
  ad-Orig-eval-last-sexp(nil)
  eval-last-sexp(nil)
  call-interactively(eval-last-sexp nil nil)
Was "ls" the only subprocess active at that time, or were you launching many more at the same time?
If none of the above gives a clue, could you please add printf's to the following functions:
. create_child and register_child, where they assign cp->fd = fd
. delete_child
In all of these places, please print cp->fd. When the problem starts to happen, it would be interesting to see which file descriptors somehow were not released.
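One possible shape for that instrumentation, as a standalone sketch: the helpers `note_fd_assigned`, `note_fd_released` and `count_unreleased_fds` are hypothetical, and the table size of 64 is taken from the fd_info[] size mentioned earlier. In Emacs itself, plain printf calls at the three places would do; the counters only make the leftovers easier to spot.

```c
#include <assert.h>
#include <stdio.h>

#define TRACKED_FDS 64   /* same size as the fd_info[] array */

static int fd_assign_count[TRACKED_FDS];

/* Call where cp->fd = fd is assigned (create_child, register_child).  */
void
note_fd_assigned (const char *where, int fd)
{
  assert (where != NULL && fd >= 0 && fd < TRACKED_FDS);
  fd_assign_count[fd]++;
  fprintf (stderr, "%s: cp->fd = %d\n", where, fd);
}

/* Call from delete_child with the descriptor being given up.  */
void
note_fd_released (int fd)
{
  assert (fd >= 0 && fd < TRACKED_FDS);
  if (fd_assign_count[fd] > 0)
    fd_assign_count[fd]--;
  fprintf (stderr, "delete_child: cp->fd = %d\n", fd);
}

/* Number of descriptors assigned more often than they were released.  */
int
count_unreleased_fds (void)
{
  int n = 0;
  for (int fd = 0; fd < TRACKED_FDS; fd++)
    if (fd_assign_count[fd] > 0)
      n++;
  return n;
}
```

If the count keeps growing while no subprocess is running, the descriptors still marked as assigned are the ones to chase.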
Here's the patch that allows your test case to run without failing:
   /* make pipe handles non-inheritable; when we spawn a child, we
      replace the relevant handle with an inheritable one.  Also put
      pipes into binary mode; we will do text mode translation ourselves
      if required.  */
+ retry:
   rc = _pipe (phandles, 0, _O_NOINHERIT | _O_BINARY);
I believe the following commit has also fixed my problem. I have not
reproduced it in days.
Author: Eli Zaretskii <e...@gnu.org>
Date: Sat May 5 11:40:31 2012 +0300
Fix failures in starting subprocesses on Windows 7.
src/w32proc.c (new_child): Force Windows to reserve only 64KB of
stack for each reader_thread, instead of defaulting to 8MB
determined by the linker. This avoids failures in creating
subprocesses on Windows 7, see the discussion in this thread:
http://lists.gnu.org/archive/html/emacs-devel/2012-03/msg00119.html
> Date: Tue, 5 Jun 2012 09:45:40 +0800
> From: William Xu <william....@gmail.com>
> Cc: 9...@debbugs.gnu.org
> I believe the following commit has also fixed my problem. i have not
> reproduced it in days.
> Author: Eli Zaretskii <e...@gnu.org>
> Date: Sat May 5 11:40:31 2012 +0300
> Fix failures in starting subprocesses on Windows 7.
> src/w32proc.c (new_child): Force Windows to reserve only 64KB of
> stack for each reader_thread, instead of defaulting to 8MB
> determined by the linker. This avoids failures in creating
> subprocesses on Windows 7, see the discussion in this thread:
> http://lists.gnu.org/archive/html/emacs-devel/2012-03/msg00119.html
Thanks. I'm therefore closing this bug; feel free to reopen with new
data if the bug recurs.