clone and CLONE_NEWPID


Jonas Pfenniger (zimbatm)

Jan 23, 2013, 8:26:02 AM
to golang-nuts
Hi gophers, I'm back again with a few questions.

Because the CLONE_NEWPID flag is unavailable when using syscall.Unshare(), I am left wondering how I could make a call to SYS_CLONE under Linux.

Maybe it's trivial, but I can't seem to wrap my head around it. There are RawSyscall and RawSyscall6 that I might use, but I don't know which one to use. And if the call is successful, will Go work properly once "cloned", so that I can make further calls to syscall.Setsid, syscall.Chroot and finally syscall.Exec?

Finally, in Go's src/pkg/syscall/exec_linux.go, the forkAndExecInChild implementation seems to avoid the library implementations for all calls and go through RawSyscall instead. It left me wondering whether this is because we are left in a special mode after fork (maybe related to GC) or whether it's just a bootstrap workaround.

Cheers,
Jonas

Russ Cox

Jan 23, 2013, 10:16:02 AM
to Jonas Pfenniger (zimbatm), golang-nuts
On Wed, Jan 23, 2013 at 8:26 AM, Jonas Pfenniger (zimbatm) <jo...@pfenniger.name> wrote:
Hi gophers, I'm back again with a few questions.

Because the CLONE_NEWPID flag is unavailable when using syscall.Unshare(), I am left wondering how I could make a call to SYS_CLONE under Linux.

You really can't, except by duplicating the code in package syscall, which is careful to keep the child process from re-entering the scheduler and becoming hopelessly confused. 
 
Let's step back: what do you really need to do? You mentioned setsid and chroot but both are already available in the standard library presentation of fork+exec.

Russ
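
For readers following along, here is a minimal sketch of the standard library route mentioned above, where setsid and chroot are requested through syscall.SysProcAttr and carried out by the runtime's own fork+exec path. The helper name jailedCommand and the paths "/srv/jail" and "/bin/sh" are placeholders invented for illustration, not something from this thread. The sketch only builds the command; actually running it with a chroot would need root privileges.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// jailedCommand is a hypothetical helper. It prepares (but does not start)
// a command whose child will call setsid(2) and chroot(2) before exec.
func jailedCommand(root, path string, args ...string) *exec.Cmd {
	cmd := exec.Command(path, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Setsid: true, // start the child in a new session
		Chroot: root, // chroot the child into root before exec
	}
	return cmd
}

func main() {
	// Placeholder paths; the binary path is resolved inside the new root.
	cmd := jailedCommand("/srv/jail", "/bin/sh")
	fmt.Printf("setsid=%v chroot=%q\n", cmd.SysProcAttr.Setsid, cmd.SysProcAttr.Chroot)
	// cmd.Run() would perform the actual fork+exec.
}
```

Because the child-side work happens inside package syscall, the program never runs Go code of its own between fork and exec, which is the situation Russ warns about.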

Jonas Pfenniger (zimbatm)

Jan 23, 2013, 10:40:31 AM
to Russ Cox, golang-nuts
I'm using a combination of unshare, setuid/gid, chroot and cgroups to create a small jailer tool under Linux. CLONE_NEWNS is great: it lets you mount filesystems that are visible only to your process and its children. Go was super useful because it already contains helpers for all these tools. The only issue I had was when I wanted to atomically add the child pid to the cgroup before exec, but I got a workaround using the Ptrace flag.

Now, some of the programs I run want access to the /proc vfs, but if I mount that in the jailed chroot I give access to the whole PID namespace, which I would ideally like to restrict. CLONE_NEWPID is useful for just that: your cloned process becomes PID 1 in its own namespace, and a newly mounted procfs will only show that process and its children. Most of the CLONE_NEW* flags in clone() are also usable with unshare(), which lets you set them on your current process instead of on a child, but CLONE_NEWPID seems to be an exception, probably because of how it tampers with the pids. That's why I need access to the clone() syscall.

Does that make sense?
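
As a hedged illustration of the goal described above: current Go releases expose a Cloneflags field on syscall.SysProcAttr (Linux only), which I believe was added after this thread was written. It lets the standard fork+exec path pass CLONE_NEW* flags to clone(). The helper namespacedCommand and the constant newNamespaces are names made up for this sketch, and "/bin/sh" is a placeholder. The sketch only builds the command; starting it requires sufficient privileges (typically root).

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// newNamespaces is a hypothetical constant for this sketch: the clone(2)
// flags requesting a fresh PID namespace and a fresh mount namespace.
const newNamespaces = syscall.CLONE_NEWPID | syscall.CLONE_NEWNS

// namespacedCommand is a hypothetical helper. It prepares (but does not
// start) a command whose child is created with the flags above.
func namespacedCommand(path string, args ...string) *exec.Cmd {
	cmd := exec.Command(path, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: newNamespaces, // Linux-only field
	}
	return cmd
}

func main() {
	cmd := namespacedCommand("/bin/sh") // placeholder binary
	fmt.Printf("cloneflags=%#x\n", cmd.SysProcAttr.Cloneflags)
	// cmd.Run() would start the child as PID 1 of its own PID namespace.
}
```

A procfs mounted by the child inside its new mount namespace would then list only the processes of that PID namespace, as described in the message above.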

minux

Jan 28, 2013, 1:30:28 PM
to tob...@gmail.com, golan...@googlegroups.com, jo...@pfenniger.name

On Mon, Jan 28, 2013 at 10:36 AM, <tob...@gmail.com> wrote:
Your post inspired me to rewrite my "nschroot" tool in Go and it works fine. I found most of what I needed by sniffing around in the syscall package source.


I'm not sure if the syscall.ForkLock.Lock() is necessary, but from reading syscall/exec_unix.go, it sounded like a good idea. http://golang.org/src/pkg/syscall/exec_unix.go?s=6845:6910#L180

I couldn't find any good information on the dangers of running after fork() in Go. In my call to clone() I'm careful not to share any more of the process than is necessary, so it should be fairly safe to continue doing things, but I haven't tried it yet since nschroot execs right away. It's likely the GC/CoW interactions will use up a little extra memory, but that's normal for fork(). All the usual rules of fork() apply.

In short, there are some complex interactions between fork(2) and threads.

Putting children into cgroups can be done by a double fork. Fork once, add that pid to the cgroup's tasks file, then fork your real work inside the namespace with CLONE_PARENT and let the middle child exit.
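
The "add that pid to the cgroup's tasks file" step above can be sketched as follows. This assumes the tasks-file interface mentioned in the post, where writing a pid to the file moves that process into the cgroup. The helper names addToCgroup and demo are invented for this sketch, and the demo writes to a temporary file standing in for a real tasks file, so it runs without root or a mounted cgroup hierarchy.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// addToCgroup is a hypothetical helper that writes pid to a cgroup
// tasks file. The file must already exist.
func addToCgroup(tasksFile string, pid int) error {
	f, err := os.OpenFile(tasksFile, os.O_WRONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.WriteString(strconv.Itoa(pid) + "\n")
	return err
}

// demo exercises addToCgroup against a temporary stand-in file and
// returns what was written.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "cgroup-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	tasks := filepath.Join(dir, "tasks") // stand-in for a real tasks file
	if err := os.WriteFile(tasks, nil, 0o644); err != nil {
		return "", err
	}
	if err := addToCgroup(tasks, 1234); err != nil { // 1234 is a made-up pid
		return "", err
	}
	data, err := os.ReadFile(tasks)
	return string(data), err
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "demo failed:", err)
		os.Exit(1)
	}
	fmt.Printf("wrote %q\n", got)
}
```

In the double-fork scheme, the parent would call a helper like this with the middle child's pid before that child forks the real workload, so the workload starts inside the cgroup.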