Russ
> I was just browsing some other code, and came across
> madvise(MADV_DONTFORK) which is available since Linux 2.6.16.
>
> Would it make sense for Go to mark its heap with MADV_DONTFORK so that
> a "large" Go process can still fork small processes on resource-
> constrained systems?
A more standard approach would be to call vfork() or clone(2) with suitable
flags. Martin Buchholz contributed such code to OpenJDK (which faces
exactly the same problem), and he might still have a copy under terms
which are suitable for inclusion with Go.
There is some info here:
http://developers.sun.com/solaris/articles/subprocess/subprocess.html
I'll look it over and try to submit a CL.
Regards
Albert
It seems the options are vfork and clone(CLONE_VFORK).
As also noted at
http://en.wikipedia.org/wiki/Fork_(operating_system)
the man page for vfork says:
`It is rather unfortunate that Linux revived this specter from the
past. The BSD man page states: "This system call will be eliminated
when proper system sharing mechanisms are implemented. Users should
not depend on the memory sharing semantics of vfork() as it will, in
that case, be made synonymous to fork(2)."`
which makes me wonder if madvise(MADV_DONTFORK) is the solution for
the new millennium?
If vfork is okay, it might be as simple as changing
    RawSyscall(SYS_FORK, 0, 0, 0)
to
    RawSyscall(SYS_VFORK, 0, 0, 0)
in exec_unix.go, but this would leave the forking process suspended
until the child process reaches RawSyscall(SYS_EXECVE, ...).
I'm guessing the madvise approach avoids this pause.
Thoughts on the way forward?
Regards
Albert
> `It is rather unfortunate that Linux revived this specter from the
> past. The BSD man page states: "This system call will be eliminated
> when proper system sharing mechanisms are implemented. Users should
> not depend on the memory sharing semantics of vfork() as it will, in
> that case, be made synonymous to fork(2)."`
>
> which makes me wonder if madvise(MADV_DONTFORK) is the solution for
> the new millennium?
That paragraph has been there since vfork was invented, and it's a
canard. It will always be quicker to clone a process if you don't have
to clone the memory mappings, and that is getting more true rather than
less. It will never be more efficient to implement vfork as fork.
> If vfork is okay, it might be as simple as changing
>
> RawSyscall(SYS_FORK, 0, 0, 0)
>
> to
>
> RawSyscall(SYS_VFORK, 0, 0, 0)
>
> in exec_unix.go, but this would leave the forking process suspended
> until the child process reaches RawSyscall(SYS_EXECVE, ...).
Worse, it's not clear that all systems which provide vfork also permit
the child process to do things like call setuid. vfork only knows how
to unwind a limited set of changes to the child environment, and that
set of changes is poorly documented.
> I'm guessing the madvise approach avoids this pause.
I think madvise is the right way to go. But note that it will require
copying all the parameters to forkAndExecInChild into local stack
variables before the fork, as they may otherwise be inaccessible to the
child.
Ian