mtv leaves a zombie after exit


Don Lewis

May 27, 2003, 1:18:56 AM
On 12 May, To: Do...@freebsd.org wrote:
> On 12 May, Doug Barton wrote:
>> On Mon, 12 May 2003, Terry Lambert wrote:
>>
>>> A "ps -gaxl" will print the wait channel, which may be more
>>> informative.
>>
>> UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
>> 1000 0 1 0 -84 0 0 0 - ZW p4 0:00.00 (mtvp)
>>
>> BTW, inre your question about the shell, it's bash. But, I get the exact
>> same results if mtv is started as a child of the shell, as a child of
>> windowmaker, or as a child of netscape.
>
> Does this application use Linux threads? The following code in wait1()
> makes me think that if a thread somehow gets orphaned by the parent
> Linux process, it will never get reaped. The exit code for Linux should
> probably wait for any child threads to exit.
>
> 	LIST_FOREACH(p, &q->p_children, p_sibling) {
> 		PROC_LOCK(p);
> 		if (uap->pid != WAIT_ANY &&
> 		    p->p_pid != uap->pid && p->p_pgid != -uap->pid) {
> 			PROC_UNLOCK(p);
> 			continue;
> 		}
>
> 		/*
> 		 * This special case handles a kthread spawned by linux_clone
> 		 * (see linux_misc.c). The linux_wait4 and linux_waitpid
> 		 * functions need to be able to distinguish between waiting
> 		 * on a process and waiting on a thread. It is a thread if
> 		 * p_sigparent is not SIGCHLD, and the WLINUXCLONE option
> 		 * signifies we want to wait for threads and not processes.
> 		 */
> 		if ((p->p_sigparent != SIGCHLD) ^
> 		    ((uap->options & WLINUXCLONE) != 0)) {
> 			PROC_UNLOCK(p);
> 			continue;
> 		}

I did some more digging, and it doesn't look like the p_sigparent test
should be causing problems, at least in the normal case, because of the
following code in exit1():

	sx_xlock(&proctree_lock);
	q = LIST_FIRST(&p->p_children);
	if (q != NULL)		/* only need this if any child is S_ZOMB */
		wakeup(initproc);
	for (; q != NULL; q = nq) {
		nq = LIST_NEXT(q, p_sibling);
		PROC_LOCK(q);
		proc_reparent(q, initproc);
		q->p_sigparent = SIGCHLD;
		/*
		 * Traced processes are killed
		 * since their existence means someone is screwing up.
		 */
		if (q->p_flag & P_TRACED) {
			q->p_flag &= ~P_TRACED;
			psignal(q, SIGKILL);
		}
		PROC_UNLOCK(q);
	}

If the parent process exits while it still has outstanding Linux
threads, this loop reparents each thread to init and sets its
p_sigparent to SIGCHLD, which should allow init to reap them. The
locking also looks OK to me.
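
To make the wait1() test quoted above concrete, here is a small userland
model of it. This is only a sketch, not kernel code:
wait1_skips_child() and MODEL_WLINUXCLONE are names made up for
illustration, and the option value is a placeholder rather than the real
kernel constant.

```c
#include <signal.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's WLINUXCLONE wait option.
 * The value is a placeholder chosen for this model only.
 */
#define MODEL_WLINUXCLONE 0x80000000u

/*
 * Models the XOR test in wait1(): returns true when the child would be
 * skipped by the waiter. A child counts as a "thread" when its
 * p_sigparent is not SIGCHLD; the waiter asks for threads by passing
 * the clone option.
 */
static bool
wait1_skips_child(int p_sigparent, unsigned int options)
{
	bool is_thread = (p_sigparent != SIGCHLD);
	bool wants_threads = (options & MODEL_WLINUXCLONE) != 0;

	/* Skip whenever the kind of child and the kind of wait disagree. */
	return (is_thread ^ wants_threads);
}
```

In this model, a plain wait (options 0, as init would presumably use)
skips any child whose p_sigparent is not SIGCHLD, for example
wait1_skips_child(SIGUSR1, 0) is true, while
wait1_skips_child(SIGCHLD, 0) is false. That is why resetting
p_sigparent during reparenting matters.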

There are still a couple of possibilities. The first is the following
code in kern_ptrace() that handles PT_DETACH:

	if (req == PT_DETACH) {
		/* reset process parent */
		if (p->p_oppid != p->p_pptr->p_pid) {
			struct proc *pp;

			PROC_UNLOCK(p);
			pp = pfind(p->p_oppid);
			if (pp == NULL)
				pp = initproc;
			else
				PROC_UNLOCK(pp);
			PROC_LOCK(p);
			proc_reparent(p, pp);
		}
		p->p_flag &= ~(P_TRACED | P_WAITED);
		p->p_oppid = 0;

		/* should we send SIGCHLD? */
	}

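If the Linux thread in question was being traced when its original
parent exited, a later PT_DETACH would not find the old parent (pfind()
returns NULL, so pp falls back to initproc), and it looks like the
thread could get reparented to init without having p_sigparent set to
SIGCHLD.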

A more likely cause of the problem is this code in exit1():

/*
* Notify parent that we're gone. If parent has the PS_NOCLDWAIT
* flag set, or if the handler is set to SIG_IGN, notify process
* 1 instead (and hope it will handle this situation).
*/
	PROC_LOCK(p->p_pptr);
	mtx_lock(&p->p_pptr->p_sigacts->ps_mtx);
	if (p->p_pptr->p_sigacts->ps_flag & (PS_NOCLDWAIT | PS_CLDSIGIGN)) {
		struct proc *pp;

		mtx_unlock(&p->p_pptr->p_sigacts->ps_mtx);
		pp = p->p_pptr;
		PROC_UNLOCK(pp);
		proc_reparent(p, initproc);
		PROC_LOCK(p->p_pptr);
		/*
		 * If this was the last child of our parent, notify
		 * parent, so in case he was wait(2)ing, he will
		 * continue.
		 */
		if (LIST_EMPTY(&pp->p_children))
			wakeup(pp);

If the parent process of the Linux thread set its SIGCHLD handler to
SIG_IGN, then when the Linux thread exited, it would be reparented to
init but its p_sigparent would not be set to SIGCHLD, and init would not
be able to reap the thread.
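
The suspected sequence can be sketched as a tiny userland model. This is
an illustration under assumptions, not the kernel's data structures:
struct model_proc, model_exit_reparent() and model_init_can_reap() are
hypothetical names, pid 1 stands in for init, and init is assumed to
wait without WLINUXCLONE.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for struct proc. */
struct model_proc {
	int parent_pid;		/* 1 stands for init in this model */
	int p_sigparent;	/* exit signal; != SIGCHLD marks a "thread" */
};

/*
 * Models the PS_NOCLDWAIT/PS_CLDSIGIGN branch of exit1(): the exiting
 * child is handed to init. With apply_fix set, p_sigparent is also
 * reset, as a patch to this branch would do.
 */
static void
model_exit_reparent(struct model_proc *p, bool parent_ignores_sigchld,
    bool apply_fix)
{
	if (!parent_ignores_sigchld)
		return;			/* normal path: parent is notified */
	p->parent_pid = 1;		/* proc_reparent(p, initproc) */
	if (apply_fix)
		p->p_sigparent = SIGCHLD;
}

/*
 * Assumption: init waits without WLINUXCLONE, so it only collects
 * children whose p_sigparent is SIGCHLD.
 */
static bool
model_init_can_reap(const struct model_proc *p)
{
	return (p->parent_pid == 1 && p->p_sigparent == SIGCHLD);
}
```

With a thread whose p_sigparent is, say, SIGUSR1, calling
model_exit_reparent(&t, true, false) leaves it parented to init but not
reapable in this model, which would match the zombie in the ps output;
with apply_fix set to true, model_init_can_reap(&t) becomes true.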

As a quick and dirty test, try the patch below. One thing I'm not sure
about is how Linux threads and SIG_IGN should really mix.

Index: sys/kern/kern_exit.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.214
diff -u -r1.214 kern_exit.c
--- sys/kern/kern_exit.c 13 May 2003 20:35:59 -0000 1.214
+++ sys/kern/kern_exit.c 27 May 2003 04:39:19 -0000
@@ -439,6 +439,7 @@
 		pp = p->p_pptr;
 		PROC_UNLOCK(pp);
 		proc_reparent(p, initproc);
+		p->p_sigparent = SIGCHLD;
 		PROC_LOCK(p->p_pptr);
 		/*
 		 * If this was the last child of our parent, notify
Index: sys/kern/sys_process.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/sys_process.c,v
retrieving revision 1.108
diff -u -r1.108 sys_process.c
--- sys/kern/sys_process.c 25 Apr 2003 20:02:16 -0000 1.108
+++ sys/kern/sys_process.c 27 May 2003 04:39:42 -0000
@@ -587,6 +587,7 @@
 					PROC_UNLOCK(pp);
 				PROC_LOCK(p);
 				proc_reparent(p, pp);
+				p->p_sigparent = SIGCHLD;
 			}
 			p->p_flag &= ~(P_TRACED | P_WAITED);
 			p->p_oppid = 0;

_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-curre...@freebsd.org"
