
How to get rid of a zombie/defunct process?


Jonathon Sako

Mar 23, 1998, 3:00:00 AM

Is there any way to get rid of a defunct process that init (pid 1) has
assumed ownership of? It doesn't appear so...

Barry Margolin

Mar 23, 1998, 3:00:00 AM

In article <3516BB...@megsinet.com>,
Jonathon Sako <js...@megsinet.com> wrote:
>Is there any way to get rid of a defunct process that init (pid 1) has
>assumed ownership of? It doesn't appear so...

If init is the parent, it should automatically reap any <defunct>
processes. But I've seen some versions of Unix where this was broken
(I saw it in AIX about 4 years ago). If this is happening to you, contact
your vendor.

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Cambridge, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.

bre...@dickens.com

Mar 23, 1998, 3:00:00 AM

Reboot your machine.

Virus

Mar 24, 1998, 3:00:00 AM

bre...@dickens.com wrote in message <3516CC...@dickens.com>...
>Reboot your machine.


ARE YOU NUTS?!?

Jesus, you can do a kill -9 (as root or super-user) on the process or look
for its parent process via ps -efl | grep def <or> ps -efl | grep Z

Once you have found the parent process, you can sort out why the child
is turning into a zombie and attempt to debug it. If it's an inetd problem,
fix it via kill, then do 'init q' to restart the service.

Rebooting the machine is _only_ a last resort.
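
For anyone who wants more than grep, the parent lookup can be sketched in
Python roughly like this. It groups zombie PIDs by their parent PID. The
function names are invented for illustration, and the "ps -e -o pid,ppid,stat"
invocation is an assumption that holds for the Linux procps ps; other systems
may need different flags or column names.

```python
import subprocess
from collections import defaultdict


def zombies_by_parent(ps_output):
    """Group zombie PIDs by parent PID from lines of 'pid ppid stat'."""
    groups = defaultdict(list)
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip blank lines and the header row
        pid, ppid, stat = int(fields[0]), int(fields[1]), fields[2]
        if stat.startswith("Z"):  # Z marks a zombie/defunct process
            groups[ppid].append(pid)
    return dict(groups)


def current_zombies():
    """Run ps (flags are an assumption, see above) and parse its output."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid,ppid,stat"],
        capture_output=True, text=True, check=True,
    )
    return zombies_by_parent(result.stdout)
```

A parent PID of 1 in the result means the zombie already belongs to init, which
is the case the original question asked about.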

Greg Wimpey

Mar 24, 1998, 3:00:00 AM

Barry Margolin <bar...@bbnplanet.com> writes:

> In article <3516BB...@megsinet.com>,
> Jonathon Sako <js...@megsinet.com> wrote:
> >Is there any way to get rid of a defunct process that init (pid 1) has
> >assumed ownership of? It doesn't appear so...
>
> If init is the parent, it should automatically reap any <defunct>
> processes. But I've seen some versions of Unix where this was broken
> (I saw it in AIX about 4 years ago). If this is happening to you, contact
> your vendor.

Probably more likely than a bug in 'init' is a problem with the
/etc/inittab file. For instance, if there is a bogus "wait" entry in
the inittab file that never finishes at boot time, init never goes
into its loop of calling wait() on orphaned child processes (it's
supposed to do this once per minute on AIX, if I recall). I think bad
tty entries in inittab can cause these problems, too, although I've
never seen that happen personally.
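
The kind of loop described above can be sketched in Python roughly as
follows. This only illustrates non-blocking reaping with waitpid(); it is not
how any real init is written, and reap_exited_children() and demo() are
invented names.

```python
import os
import time


def reap_exited_children():
    """Collect every child that has already exited, without blocking."""
    reaped = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left
        if pid == 0:
            break  # children exist, but none of them has exited yet
        reaped[pid] = status
    return reaped


def demo():
    """Fork three short-lived children, then reap them in one pass."""
    pids = []
    for code in (0, 1, 2):
        pid = os.fork()
        if pid == 0:
            os._exit(code)  # child: exit at once and wait to be reaped
        pids.append(pid)
    time.sleep(0.3)  # the children are zombies from exit until the reap
    reaped = reap_exited_children()
    return [os.WEXITSTATUS(reaped[pid]) for pid in pids]
```

If a process that is supposed to run such a loop never reaches it, every
child it inherits stays defunct, which matches the symptom in the original
question.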

I did see an even funkier problem on AIX once, though, where inittab
was edited with a text editor that didn't put a newline at the end of
the last line in the file. But that prevented the system from booting
at all. Moral: use the AIX "chitab/mkitab/rmitab" commands rather
than trying to do it by hand.

--
Greg Wimpey Technical Coordinator, Processing and R&D
greg....@waii.com Western Geophysical Denver Processing Center

Scott L. Fields

Mar 24, 1998, 3:00:00 AM

Run "who -p" and "who -d" to make sure ALL entries in the /etc/inittab
have been run.

Specifically, make sure that all entries of type "wait" in /etc/inittab
are in the "who -d" output. If you have an entry that never completes,
this can keep init from reaping zombie processes.
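
To get the list of "wait" entries to compare against, something like the
following Python sketch could be used. It assumes the usual
id:runlevels:action:command layout of inittab lines and treats lines starting
with "#" or ":" as comments, which is an assumption about comment style; the
function name is invented for illustration.

```python
def wait_entries(inittab_text):
    """Return (id, command) for every inittab entry whose action is 'wait'."""
    found = []
    for line in inittab_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#:":
            continue  # blank or commented-out line (comment style assumed)
        parts = line.split(":", 3)  # id, runlevels, action, command
        if len(parts) == 4 and parts[2] == "wait":
            found.append((parts[0], parts[3]))
    return found


def wait_entries_from_file(path="/etc/inittab"):
    """Read the file at path and list its 'wait' entries."""
    with open(path) as handle:
        return wait_entries(handle.read())
```

Each identifier returned here can then be looked up in the "who -d" output.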

Ulrich Rohde

Mar 24, 1998, 3:00:00 AM

Virus wrote in message <6f8kk9$2b6$1...@server.cntfl.com>...
>
>bre...@dickens.com wrote in message <3516CC...@dickens.com>...
>>Reboot your machine.
>
>
>ARE YOU NUTS?!?
>
>Jesus, you can do a kill -9 (as root or super-user) on the process or look
>for its parent process via ps -efl | grep def <or> ps -efl | grep Z
>
>Once you have found the parent process, you can sort out why the child
>is turning into a zombie and attempt to debug it. If it's an inetd problem,
>fix it via kill, then do 'init q' to restart the service.
>
>Rebooting the machine is _only_ a last resort.
>

Hello Jonathon!

I have never before read an answer like this one from such an intelligent
guy (or so I think). But nobody is perfect. The question is fine and his
answer is crap. :-)
At least I have an idea about your problem.
You can run into a lot of trouble when you start jobs (shell scripts) from
the prompt. If such a process starts more than one child process and the
parent process dies, the other processes keep running under init (pid 1).
So whenever you stop processes, always make sure you stop the child
processes first. That way you will be immune to problems like yours.

Sorry for my bad English! But nobody is perfect :-)
Uli
Kassel, Germany
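
The hand-over to init described above can be watched with a small Python
sketch. It starts a process that forks a child and then dies, and the
surviving child reports which parent PID it sees afterwards. The function
name is invented for illustration, and the 5-second limit is an arbitrary
safety margin.

```python
import os
import time


def orphan_new_parent():
    """Return (pid of the dead parent, parent pid the orphan sees later)."""
    read_end, write_end = os.pipe()
    middle = os.fork()
    if middle == 0:
        me = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait until the process that started us is gone.
            deadline = time.time() + 5
            while os.getppid() == me and time.time() < deadline:
                time.sleep(0.05)
            os.write(write_end, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)  # middle process dies without waiting for its child

    os.close(write_end)
    os.waitpid(middle, 0)  # reap the middle process so it is not a zombie
    adopted_by = int(os.read(read_end, 32))
    os.close(read_end)
    return middle, adopted_by
```

On a traditional Unix the second number is 1. On some newer systems a
different process can be designated to adopt orphans, so the sketch only
relies on the number changing.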

Barry Margolin

Mar 25, 1998, 3:00:00 AM

In article <6f8kk9$2b6$1...@server.cntfl.com>, Virus <vi...@gnt.net> wrote:
>bre...@dickens.com wrote in message <3516CC...@dickens.com>...
>>Reboot your machine.
>
>
>ARE YOU NUTS?!?
>
>Jesus, you can do a kill -9 (as root or super-user) on the process or look
>for its parent process via ps -efl | grep def <or> ps -efl | grep Z

If it's a zombie, kill -9 won't have any effect -- the process is already
dead.
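
This is easy to check with a short Python sketch (function names invented for
illustration). It creates a zombie, sends it SIGKILL, and shows that the
process-table entry only goes away once the parent calls waitpid().

```python
import os
import signal
import time


def entry_exists(pid):
    """Return True if the kernel still has a process-table entry for pid."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permissions
        return True
    except ProcessLookupError:
        return False


def zombie_survives_sigkill():
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child exits at once; the parent has not waited yet

    time.sleep(0.2)  # give the child time to exit and become a zombie
    os.kill(pid, signal.SIGKILL)  # accepted, but nothing is left to kill
    time.sleep(0.2)
    still_there = entry_exists(pid)  # the zombie entry is still present

    _, status = os.waitpid(pid, 0)  # reaping is what removes the entry
    gone = not entry_exists(pid)
    return still_there, gone, os.WEXITSTATUS(status)
```

The exit status collected at the end is still the child's own 7, which shows
that the SIGKILL changed nothing.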

>Once you have found the parent process, you can sort out why the child
>is turning into a zombie and attempt to debug it. If it's an inetd problem,
>fix it via kill, then do 'init q' to restart the service.

The original post said that the parent is init, pid 1.

Mark Greene

Mar 25, 1998, 3:00:00 AM

Scott L. Fields wrote:
>
> Run "who -p" and "who -d" to make sure ALL entries in the /etc/inittab
> have been run.
>
> Specifically, make sure that all entries of type "wait" in /etc/inittab
> are in the "who -d" output. If you have an entry that never completes,
> this can keep init from reaping zombie processes.
>

If this proves to be the case, what then? Kill the process in "wait"? On
the other hand, is "wait" a legitimate state, and if so, should there
be a way to resolve the problem so the "wait" state is cleared?

> Jonathon Sako wrote:
>
> > Is there any way to get rid of a defunct process that init (pid 1) has
> > assumed ownership of? It doesn't appear so...

--
Mark Greene The above opinions are mine, not my employer's.
Work: greenemj AT hlthsrc DOT com Home: Prgrmr AT aol DOT com
http://members.aol.com/Prgrmr/basic.html

Gustav Yeung

Mar 26, 1998, 3:00:00 AM

Virus wrote:

> bre...@dickens.com wrote in message <3516CC...@dickens.com>...
> >Reboot your machine.
>
> ARE YOU NUTS?!?
>
> Jesus, you can do a kill -9 (as root or super-user) on the process or look
> for its parent process via ps -efl | grep def <or> ps -efl | grep Z
>

> Once you have found the parent process, you can sort out why the child
> is turning into a zombie and attempt to debug it. If it's an inetd problem,
> fix it via kill, then do 'init q' to restart the service.
>

This would theoretically work for all processes _except_ zombies. A zombie is
a process that has terminated but whose exit status has not yet been collected
by its parent. When a child terminates, the kernel sends SIGCHLD to the parent
and keeps the child's process structure around so that the parent can retrieve
the exit status with the wait() or waitpid() system calls; checking it is part
of robust programming and of cleaning up the resources the kernel allocated.
Until the parent makes that call, the kernel cannot destroy the process
structure. That's why you can still see zombies in the output of "ps -efl".
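
To make the wait()/waitpid() part concrete, here is a small Python sketch of
a parent collecting a child's exit status. The names describe() and
run_child() are invented for illustration.

```python
import os
import signal


def describe(status):
    """Turn a raw wait status into a readable (kind, value) pair."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        return ("killed by signal", os.WTERMSIG(status))
    return ("other", status)


def run_child(exit_code=None):
    """Fork a child that exits with exit_code, or SIGKILLs itself if None."""
    pid = os.fork()
    if pid == 0:
        if exit_code is None:
            os.kill(os.getpid(), signal.SIGKILL)
        os._exit(exit_code or 0)
    _, status = os.waitpid(pid, 0)  # blocks, then removes the child's entry
    return describe(status)
```

Between the child's exit and the parent's waitpid() call the child is a
zombie; a parent that never makes the call leaves it in that state.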

In non-technical terms, a zombie is an already-dead process (hence the name
zombie). How can you kill a dead process?

With a bit of luck and a sufficient level of patching, the zombies will
disappear by themselves.

> Rebooting the machine is _only_ last resort.
>

Rebooting is the last AND only resort.

Gustav

--
-------------------------
Gustav Yeung Kwong Fung
Senior System Engineer, Vanda Computer & Equip. Co Ltd.
Certified AIX Support Professional
Certified RS/6000 SP Specialist

(Please remove NOSPAM when replying)
gustav...@vandagroup.com gus...@netvigator.com

Tel: (852) 2197 2194
Fax: (852) 2197 2323


Virus

Mar 26, 1998, 3:00:00 AM

Some smartass said that I should apologize to the group, but I made no
direct statement. I said that rebooting should be a last resort even if
the PID of a zombie/defunct process is '1'. If you reboot and still
have that problem, don't come running back. You should have a few ideas for
debugging, even going as low as checking whether your kernel is buggy
(and btw, inetd is _not_ a child process, it's a daemon <master object>).

We are all entitled to our opinion so bugger off =)

Gustav Yeung Kwong Fung

Mar 27, 1998, 3:00:00 AM

In article <6fe1mt$ke3$1...@server.cntfl.com>, vi...@gnt.net says...

> Some smartass said that I should apologize to the group, but I made no
> direct statement. I said that rebooting should be a last resort even if
> the PID of a zombie/defunct process is '1'. If you reboot and still
> have that problem, don't come running back. You should have a few ideas for
> debugging, even going as low as checking whether your kernel is buggy
> (and btw, inetd is _not_ a child process, it's a daemon <master object>).
>

According to my Unix textbooks, init will not and cannot become a zombie.
The consequences of the premature death of child processes should have been
well understood, and the implementors of Unix should know what would happen
if init became a zombie. (init is a child of nobody, BTW.) In good
implementations there must be some hard-coded safeguard in case init could
become a zombie.

I regard zombies as a Unix design loophole rather than an implementation
error.

In my experience, when a system that is short of resources is I/O intensive,
and the applications (waiting on that I/O) are so smart that they terminate
themselves well before the kernel switches context to them, there will be
lots of zombies. I would recommend rebooting the machine with no applications
started, then starting them one by one while checking the output of
"ps -efl".

By the way, inetd is the child of the system resource controller
(/usr/sbin/srcmstr).

Gustav

--
----------------
Gustav Yeung
Senior System Engineer; Vanda Computer & Equipment Co. Ltd.


Certified AIX Support Professional
Certified RS/6000 SP Specialist

gustav...@vandagroup.com
gus...@netvigator.com

Tel : (+852) 2197 2194
Fax : (+852) 2197 2323

Michael O'Sullivan

Jul 9, 1998, 3:00:00 AM

If init has adopted the process, as far as I know the only way to get
rid of it is to reboot your system. That is the easiest way anyway.

Michael.

Jim Dennis

Jul 9, 1998, 3:00:00 AM

Michael O'Sullivan (osu...@networx.net.au) wrote:
: If init has adopted the process, as far as I know the only way to get
: rid of it is to reboot your system. That is the easiest way anyway.

If init has adopted a process *and* it's a zombie (not blocked
on I/O --- but actually a zombie) for more than a few minutes
--- there's a bug in your 'init'!

Replace your init?

: Michael.

: Mark Greene wrote:
:> Scott L. Fields wrote:
:>> Run "who -p" and "who -d" to make sure ALL entries in the /etc/inittab
:>> have been run.
:>>
:>> Specifically, make sure that all entries of type "wait" in /etc/inittab
:>> are in the "who -d" output. If you have an entry that never completes,
:>> this can keep init from reaping zombie processes.
:>
:> If this proves to be the case, what then? Kill the process in "wait"? On
:> the other hand, is "wait" a legitimate state, and if so, should there
:> be a way to resolve the problem so the "wait" state is cleared?
:>
:>> Jonathon Sako wrote:
:>>> Is there any way to get rid of a defunct process that init (pid 1) has
:>>> assumed ownership of? It doesn't appear so...

--
Jim Dennis,
Starshine Technical Services http://www.starshine.org
