Unkillable processes due to PTRACE_TRACEME

Dmitry Vyukov

Oct 19, 2015, 1:53:37 PM
to LKML, Oleg Nesterov, rol...@hack.frob.com, syzk...@googlegroups.com, Kostya Serebryany, Alexander Potapenko, Robert Swiecki, Kees Cook, Julien Tinnes, Eric Dumazet
Hello,

The following program hangs in some interesting state and is not
killable (started by a normal user, not root):


// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <pthread.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <stdio.h>
#include <signal.h>

void *thr(void *arg) {
  ptrace(PTRACE_TRACEME, 0, 0, 0);
  sleep(3);
  kill(getpid(), SIGCHLD);
  return 0;
}

int main() {
  if (fork() == 0) {
    sleep(1);
    pthread_t th;
    pthread_create(&th, 0, thr, 0);
    sleep(1);
  }
  return 0;
}


The child process attaches as a tracee to the init process and then hangs
in a state that I don't understand. When I did a similar thing but
attached it to a normal parent process (shell), I was still able to
get rid of it by killing the parent (shell). But you definitely don't
want to kill init.

I am not sure who is guilty here, but an unkillable process started by
a normal user looks like an issue in itself.
I am not sure whether it makes sense to allow a process to attach as a
tracee to init. But I've been told that it can make sense in some
security setups where init traces everything.
Also, what is that state that the process hangs in? It looks like a
usual un-waited process, but when I just do ptrace(PTRACE_TRACEME) in
main, the process does not hang. The additional thread somehow makes a
difference.


I am on commit f9fbf6b72ffaaca8612979116c872c9d5d9cc1f5 (Sep 24).

Found with syzkaller system call fuzzer.

Thank you

Oleg Nesterov

Oct 19, 2015, 3:52:44 PM
to Dmitry Vyukov, LKML, rol...@hack.frob.com, syzk...@googlegroups.com, Kostya Serebryany, Alexander Potapenko, Robert Swiecki, Kees Cook, Julien Tinnes, Eric Dumazet
On 10/19, Dmitry Vyukov wrote:
>
> The following program hangs in some interesting state and is not
> killable (started by a normal user, not root):

Thanks.

> #include <pthread.h>
> #include <unistd.h>
> #include <sys/ptrace.h>
> #include <stdio.h>
> #include <signal.h>
>
> void *thr(void *arg) {
>   ptrace(PTRACE_TRACEME, 0, 0, 0);
>   sleep(3);
>   kill(getpid(), SIGCHLD);
>   return 0;
> }
>
> int main() {
>   if (fork() == 0) {
>     sleep(1);
>     pthread_t th;
>     pthread_create(&th, 0, thr, 0);
>     sleep(1);
>   }
>   return 0;
> }
>
>
> The child process attaches as tracee to init process

Yes, although in a racy manner: the parent can exit after
PTRACE_TRACEME, and in this case the kernel will untrace the task
before reparenting. Not that this matters.

> and then hangs in
> a state that I don't understand. When I did a similar thing but
> attached it to a normal parent process (shell), I still was able to
> get rid of it by killing parent (shell).

See above.

So I bet the problem is that your /sbin/init doesn't use __WALL,
so wait() doesn't reap the traced zombie sub-thread, and thus it
can't release the non-empty thread group.
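
For illustration, a reaper that does pass __WALL could look like the sketch below. This is not code from any real init; the name reap_children and its extra_flags parameter are made up here, and the loop ignores the case where a traced child reports a stop instead of an exit.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from any real init): reap children with
 * waitpid() and __WALL so that traced sub-threads are collected as well.
 * Pass WNOHANG in extra_flags for a non-blocking pass, or 0 to block
 * until no children are left. Returns the number of children reaped.
 */
int reap_children(int extra_flags)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, __WALL | extra_flags);

        if (pid > 0) {
            /* Count only real exits; a traced child may also report stops. */
            if (WIFEXITED(status) || WIFSIGNALED(status))
                reaped++;
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;   /* interrupted by a signal, retry */
        break;          /* 0: nothing ready (WNOHANG); -1: ECHILD or error */
    }
    return reaped;
}
```

A SIGCHLD-driven init would presumably call reap_children(WNOHANG) from its main loop rather than block.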

Could you please verify? Just do "strace -p1" and send SIGCHLD to
init.

Perhaps eligible_child() should assume WALL if ptrace && ZOMBIE...

Oleg.

Dmitry Vyukov

Oct 19, 2015, 4:17:40 PM
to Oleg Nesterov, LKML, rol...@hack.frob.com, syzk...@googlegroups.com, Kostya Serebryany, Alexander Potapenko, Robert Swiecki, Kees Cook, Julien Tinnes, Eric Dumazet
I am using Ubuntu.
Here is the strace output from init:

waitid(P_ALL, 0, {}, WNOHANG|WEXITED|WSTOPPED|WCONTINUED, NULL) = 0

So what should be fixed here? Kernel or distro init?

Dmitry Vyukov

Oct 20, 2015, 4:35:08 AM
to Oleg Nesterov, LKML, Roland McGrath, syzk...@googlegroups.com, Kostya Serebryany, Alexander Potapenko, Robert Swiecki, Kees Cook, Julien Tinnes, Eric Dumazet
waitpid(__WALL) indeed joins these processes.
But __WALL can't be used with waitid and Ubuntu init uses waitid...

Dmitry Vyukov

Oct 20, 2015, 4:40:10 AM
to Oleg Nesterov, LKML, Roland McGrath, syzk...@googlegroups.com, Kostya Serebryany, Alexander Potapenko, Robert Swiecki, Kees Cook, Julien Tinnes, Eric Dumazet
I am thinking about how to work around this issue.

The following program joins both child processes:

#include <pthread.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <stdio.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

void *thr(void *arg) {
  ptrace(PTRACE_TRACEME, 0, 0, 0);
  return 0;
}

int main() {
  int pid = fork();
  if (pid == 0) {
    pthread_t th;
    pthread_create(&th, 0, thr, 0);
    sleep(1);
    return 0;
  }
  siginfo_t info = {};
  int status = 0;
  int res = waitpid(-1, &status, __WALL);
  printf("pid=%d res=%d errno=%d\n", pid, res, errno);
  res = waitpid(-1, &status, __WALL);
  printf("pid=%d res=%d errno=%d\n", pid, res, errno);
  return 0;
}


However, I need to wait for a particular child and if I change the
first waitpid to:

int res = waitpid(pid, &status, __WALL);

then it does not terminate.
So how can I wait for such a child process?

Oleg Nesterov

Oct 20, 2015, 6:59:15 AM
to Dmitry Vyukov, LKML, Roland McGrath, syzk...@googlegroups.com, Kostya Serebryany, Alexander Potapenko, Robert Swiecki, Kees Cook, Julien Tinnes, Eric Dumazet
On 10/20, Dmitry Vyukov wrote:
>
> On Tue, Oct 20, 2015 at 10:34 AM, Dmitry Vyukov <dvy...@google.com> wrote:
> > On Mon, Oct 19, 2015 at 10:17 PM, Dmitry Vyukov <dvy...@google.com> wrote:
> >> On Mon, Oct 19, 2015 at 9:49 PM, Oleg Nesterov <ol...@redhat.com> wrote:
> >>>
> >>> So I bet the problem is that your /sbin/init doesn't use __WALL,
> >>> so wait() doesn't reap the traced zombie sub-thread, and thus it
> >>> can't release the non-empty thread group.
> >>>
> >>> Could you please verify? Just do "strace -p1" and send SIGCHLD to
> >>> init.
> >>>
> >>> perhaps eligible_child() should assume WALL if ptrace && ZOMBIE...
> >>
> >>
> >> I am using Ubuntu.
> >> Here strace output from init:
> >>
> >> waitid(P_ALL, 0, {}, WNOHANG|WEXITED|WSTOPPED|WCONTINUED, NULL) = 0
> >>
> >> So what should be fixed here? Kernel or distro init?
> >
> > waitpid(__WALL) indeed joins these processes.

Thanks. And I just checked Fedora 22, it doesn't use __WALL either.

So I think we should change the kernel even if this is not a bug...
I'll send the patch.

> > But __WALL can't be used with waitid and Ubuntu init uses waitid...

Yes, and I never understood why. Perhaps we should change this too.

> So how can I wait for such child process?

You can't. This is one of historical oddities. You need to reap the
traced sub-thread first. And PTRACE_DETACH doesn't work.
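
One way to read "reap the traced sub-thread first" is to wait for any child with __WALL and discard results until the wanted pid shows up. The sketch below does that; wait_for_child is a hypothetical helper, and it assumes the caller does not care about the exit status of anything else it collects along the way.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until the child `target` has been reaped.
 * Anything else that waitpid(-1, ..., __WALL) returns first (for example a
 * traced sub-thread of the same child) is collected and discarded.
 * Returns 0 and stores the wait status in *status, or -1 on error.
 */
int wait_for_child(pid_t target, int *status)
{
    for (;;) {
        int st;
        pid_t pid = waitpid(-1, &st, __WALL);

        if (pid == -1) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, retry */
            return -1;      /* e.g. ECHILD: target is not our child */
        }
        if (pid == target && (WIFEXITED(st) || WIFSIGNALED(st))) {
            *status = st;
            return 0;
        }
        /* Some other child or thread: already collected, keep waiting. */
    }
}
```

A program that has other children it still needs to track would have to record the discarded statuses instead of dropping them.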

Oleg.

Pavel Machek

Dec 3, 2015, 3:56:15 PM
to Oleg Nesterov, Dmitry Vyukov, LKML, Roland McGrath, syzk...@googlegroups.com, Kostya Serebryany, Alexander Potapenko, Robert Swiecki, Kees Cook, Julien Tinnes, Eric Dumazet
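Hi!

> You can't. This is one of historical oddities. You need to reap the
> traced sub-thread first. And PTRACE_DETACH doesn't work.
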
If kill -9 does not take out the process, surely that sounds like a
security problem?

I know ptrace is old and tricky and ugly, but ....?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Oleg Nesterov

Dec 4, 2015, 2:02:22 PM
to Pavel Machek, Dmitry Vyukov, LKML, Roland McGrath, syzk...@googlegroups.com, Kostya Serebryany, Alexander Potapenko, Robert Swiecki, Kees Cook, Julien Tinnes, Eric Dumazet
Hi Pavel,

On 12/03, Pavel Machek wrote:
>
> > You can't. This is one of historical oddities. You need to reap the
> > traced sub-thread first. And PTRACE_DETACH doesn't work.
>
> If kill -9 does not take out the process,

Just in case, "kill -9" can't help because the task is already killed and
is a zombie. The problem is that /sbin/init can't reap it without __WALL
unless we change the kernel.

> surely that sounds like a
> security problem?
>
> I know ptrace is old and tricky and ugly, but ....?

Yes, this should be fixed. I'll resend the patches next week, I am a bit
busy now.

And, Dmitry, I didn't forget about another problem you reported ;) I'll try
to redo/resend the fixes for WARN_ON() in task_participate_group_stop()
as well.

Oleg.
