zombies

tbel...@riatax.com

Jun 23, 1995

*** Tom Bellavia *** Research Institute of America, Valhalla, NY.
I have Oracle 7.1.4 with both versions of SQL*Net (1 and 2) running. I am on a
Sun 1000E Solaris 2.4 platform. I have all the problems that everyone else has
(space leaks, etc.). What I have is a different problem.

We are a big client/server shop: 300 - 400 PCs over a sizable WAN
connection. I have a lot of people doing Ctrl+Alt+Del when the net is slow.
Oracle (V1 of SQL*Net) leaves these sessions out there. Dropping the orasrv
connection at night is not any help. I need a script that runs under Unix,
looks for zombies or duplicate Unix sign-ons, and blows them away. Anybody got
any of these, or possible suggestions for a resolution? Thanks.

John Fruetel

Jun 25, 1995

tbel...@riatax.com wrote:

You might want to see if you can fiddle with the TCP/IP KEEPALIVE
option on your host. I don't know about Solaris, but this can be
changed on SCO Unix from the default of two hours to something else (5
minutes in our case) and any "zombies" get nuked after 5 minutes.
John Fruetel
jfru...@ainet.com

John Marcus

Jun 26, 1995

jfru...@ainet.com (John Fruetel) wrote:

>tbel...@riatax.com wrote:

As far as we are aware, Solaris does not have a TCP/IP KEEPALIVE
option (unlike SunOS). Oracle WWS's only answer was to upgrade from
SQL*Net V1 to SQL*Net V2, which has "KEEPALIVE" built into it rather
than relying on the O/S. I have yet to test this out... has anyone
else got this working?

John Marcus (jma...@spirit.com.au)
Database Administration
Australian Bureau of Statistics
Canberra, Australia

Chuck Hamilton

Jun 26, 1995

In <DApr6...@koko.csustan.edu> jfru...@ainet.com (John Fruetel)
writes:

>
>You might want to see if you can fiddle with the TCP/IP KEEPALIVE
>option on your host. I don't know about Solaris, but this can be
>changed on SCO Unix from the default of two hours to something else (5
>minutes in our case) and any "zombies" get nuked after 5 minutes.

Wouldn't that also "nuke" processes that were running long ( > 5
minute) queries?
--
Chuck Hamilton
chu...@ix.netcom.com

Never share a foxhole with anyone braver than yourself!

Larry Fishman

Jun 26, 1995

John Marcus (jma...@spirit.com.au) wrote:
: jfru...@ainet.com (John Fruetel) wrote:

: >tbel...@riatax.com wrote:

: >>*** Tom Bellavia *** Research Institute of America, Valhalla, NY.
: >>I have Oracle 7.1.4 with both versions of SQL*Net (1 and 2) running. I am on a
: >>Sun 1000E Solaris 2.4 platform. I have all the problems that everyone else has
: >>(space leaks, etc.). What I have is a different problem.

: >>We are a big client/server shop: 300 - 400 PCs over a sizable WAN
: >>connection. I have a lot of people doing Ctrl+Alt+Del when the net is slow.
: >>Oracle (V1 of SQL*Net) leaves these sessions out there. Dropping the orasrv
: >>connection at night is not any help. I need a script that runs under Unix,
: >>looks for zombies or duplicate Unix sign-ons, and blows them away. Anybody got
: >>any of these, or possible suggestions for a resolution? Thanks.

: >You might want to see if you can fiddle with the TCP/IP KEEPALIVE
: >option on your host. I don't know about Solaris, but this can be
: >changed on SCO Unix from the default of two hours to something else (5
: >minutes in our case) and any "zombies" get nuked after 5 minutes.

: >John Fruetel
: >jfru...@ainet.com

: As far as we are aware, Solaris does not have a TCP/IP KEEPALIVE
: option (unlike SunOS). Oracle WWS's only answer was to upgrade from
: SQL*Net V1 to SQL*Net V2, which has "KEEPALIVE" built into it rather
: than relying on the O/S. I have yet to test this out... has anyone
: else got this working?

We use SQL*Net V2.1 under Solaris 2.3 on the server, and in the
sqlnet.ora file there is a line you may put in which reads:

sqlnet.expire_time=4

This checks every 4 minutes whether the client is still there and
kills/disconnects the session if it is not. It does work. I use it for
client-server work where the client runs under Windows and may GPF, or
the user shuts off the PC, etc.

--
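
For anyone who wants to confirm what a server is actually configured with, the setting above can be read back out of the file. This is only a minimal sketch: the function name `read_expire_time` and the sample text are made up for illustration, and the only thing taken from the post is the `sqlnet.expire_time` parameter itself.

```python
import re
from typing import Optional


def read_expire_time(sqlnet_ora_text: str) -> Optional[int]:
    """Return the sqlnet.expire_time value in minutes, or None if it is not set."""
    for line in sqlnet_ora_text.splitlines():
        # Drop anything after a '#' so commented-out settings are ignored.
        line = line.split("#", 1)[0].strip()
        match = re.fullmatch(r"sqlnet\.expire_time\s*=\s*(\d+)", line,
                             flags=re.IGNORECASE)
        if match:
            return int(match.group(1))
    return None


# Hypothetical file contents, not taken from a real system.
sample = "# sample sqlnet.ora\nsqlnet.expire_time=4\n"
print(read_expire_time(sample))
```

A result of None would mean the line is missing or commented out, in which case dead clients would presumably not be probed at all.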

Tim Read - Sun Linlithgow - Principal SE and DB Specialist

Jun 27, 1995

Solaris 2.x does have a keepalive option. All the parameters can be accessed
via the ndd command.

# ndd /dev/tcp \?
? (read only)
tcp_close_wait_interval (read and write)
tcp_conn_req_max (read and write)
tcp_conn_grace_period (read and write)
tcp_cwnd_max (read and write)
tcp_debug (read and write)
tcp_smallest_nonpriv_port (read and write)
tcp_ip_abort_cinterval (read and write)
tcp_ip_abort_interval (read and write)
tcp_ip_notify_cinterval (read and write)
tcp_ip_notify_interval (read and write)
tcp_ip_ttl (read and write)
tcp_keepalive_interval (read and write) <<<<<------
tcp_maxpsz_multiplier (read and write)
tcp_mss_def (read and write)
tcp_mss_max (read and write)
tcp_mss_min (read and write)
tcp_naglim_def (read and write)
tcp_old_urp_interpretation (read and write)
tcp_rexmit_interval_initial (read and write)
tcp_rexmit_interval_max (read and write)
tcp_rexmit_interval_min (read and write)
tcp_wroff_xtra (read and write)
tcp_deferred_ack_interval (read and write)
tcp_snd_lowat_fraction (read and write)
tcp_sth_rcv_hiwat (read and write)
tcp_sth_rcv_lowat (read and write)
tcp_dupack_fast_retransmit (read and write)
tcp_ignore_path_mtu (read and write)
tcp_rwin_credit_pct (read and write)
tcp_rcv_push_wait (read and write)
tcp_smallest_anon_port (read and write)
tcp_largest_anon_port (read and write)
tcp_xmit_hiwat (read and write)
tcp_xmit_lowat (read and write)
tcp_recv_hiwat (read and write)
tcp_fin_wait_2_flush_interval (read and write)
tcp_co_min (read and write)
tcp_status (read only)
tcp_bind_hash (read only)
tcp_listen_hash (read only)
tcp_conn_hash (read only)
tcp_queue_hash (read only)

Tim
---

P.S. What space leaks? Do you mean the VM system's free memory level goes down?
If you have the DB on the file system then this is just normal UNIX file caching.
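
To tie the tcp_keepalive_interval entry in the listing above back to the "5 minutes" mentioned earlier in the thread, here is a rough sketch that only builds the ndd command lines and prints them. It assumes the parameter is expressed in milliseconds (so the two hour default would read 7200000), which should be checked against the local ndd documentation before anything is changed. The helper names are made up for this example.

```python
NDD_DEVICE = "/dev/tcp"
PARAM = "tcp_keepalive_interval"


def minutes_to_ms(minutes: float) -> int:
    """Convert minutes to milliseconds (assumed unit of the ndd parameter)."""
    return int(minutes * 60 * 1000)


def ndd_get_command() -> list:
    """Command line that reads the current keepalive interval."""
    return ["ndd", NDD_DEVICE, PARAM]


def ndd_set_command(minutes: float) -> list:
    """Command line that sets the keepalive interval (needs root to run)."""
    return ["ndd", "-set", NDD_DEVICE, PARAM, str(minutes_to_ms(minutes))]


# Nothing is executed here; the commands are only printed for review.
print(" ".join(ndd_get_command()))
print(" ".join(ndd_set_command(5)))
```

A value set this way does not survive a reboot, so it would normally also have to go into a startup script.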

Lee Parsons

Jun 27, 1995

Chuck Hamilton <chu...@ix.netcom.com> wrote:

>jfru...@ainet.com (John Fruetel) writes:
>>
>>You might want to see if you can fiddle with the TCP/IP KEEPALIVE
>>option on your host. I don't know about Solaris, but this can be
>>changed on SCO Unix from the default of two hours to something else (5
>>minutes in our case) and any "zombies" get nuked after 5 minutes.
>

>Wouldn't that also "nuke" processes that were running long ( > 5
>minute) queries?

No, because the processes will still be there to talk to the backend. It
doesn't matter that they are busy from an application POV because the system
side will take care of answering the keepalive.

Also please note that this will do nothing to stop orphan processes that
are still running something, e.g. a user selects something from a REALLY long
table and reboots before it is finished.

In this case your keepalive timer will close the socket, but the Oracle
process will not look at the socket until it is finished running. Not a
big deal for most selects, but on an endless PL/SQL loop???

The only way to catch the above is to upgrade to SQL*Net V2.? (not all
versions of V2 have the appl timer feature, although most new versions do)
or to look for Oracle processes that don't have a socket anymore and kill
them manually.

BTW, my Unix/TCP bias is showing. I have no clue how an NT server with
PC front ends would solve this problem under IPX, or if they would even have
the above problem. Anybody know?
--
Regards,

Lee E. Parsons
Systems Oracle DBA lpar...@world.std.com

Tony Jambu

Jun 29, 1995

In article <3smds6$j...@ixnews5.ix.netcom.com>, chu...@ix.netcom.com (Chuck Hamilton) writes:
> In <DApr6...@koko.csustan.edu> jfru...@ainet.com (John Fruetel)
> writes:
> >
> >You might want to see if you can fiddle with the TCP/IP KEEPALIVE
> >option on your host. I don't know about Solaris, but this can be
> >changed on SCO Unix from the default of two hours to something else (5
> >minutes in our case) and any "zombies" get nuked after 5 minutes.
>
> Wouldn't that also "nuke" processes that were running long ( > 5
> minute) queries?

No, because the TCP daemon will try to communicate with the PC by
sending a probe. If no answer is received, it waits for 75 seconds.
It then sends another probe. This goes on 10 times, and ONLY then
does it kill the process running on the server.

Having said that, I have to qualify this. The original poster's question
was to do with ZOMBIES (defunct processes) and client PCs rebooting.
These are two different problems.

Zombies occur when the child process gets killed and is trying to communicate
with the parent process to inform it and is unable to. It then becomes
a zombie or defunct process. Zombie processes DO NOT use any Oracle
resources nor take any machine resources, so they are not a problem.

If the original poster is looking for duplicate UNIX sign-ons, then that is a
different kettle of fish altogether.

ta
tony

--
_____ ________ / ___ |Tony Jambu, Database Consultant
/_ _ /_ __ / |Wizard Consulting,Aust (ACN 065934778)
/(_)/ )(_/ \_/(///(/_)/_( |CIS: 10025...@compuserve.com FAX: +61-3-4163559
\_______/ |EMAIL:TJa...@wizard.com.au PHONE: +61-3-4122905
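
The probe sequence described in the post above can be turned into a quick back-of-the-envelope calculation. This is a sketch under the figures quoted in the thread (75 seconds per unanswered probe, 10 probes), which may differ from what a given TCP stack really does. It also assumes the last probe waits the full 75 seconds before the connection is given up. The function name is made up.

```python
def seconds_until_drop(idle_s: int, probe_wait_s: int = 75, probes: int = 10) -> int:
    """Idle time before the first probe plus the wait after each unanswered probe."""
    return idle_s + probes * probe_wait_s


for label, idle_s in (("2 hour default", 2 * 60 * 60), ("5 minute setting", 5 * 60)):
    total = seconds_until_drop(idle_s)
    print(f"{label}: {total} s (about {total / 60:.1f} minutes)")
```

On those figures a dead PC would linger for 1050 seconds (17.5 minutes) with a 5 minute keepalive rather than exactly 5 minutes, and only if it never answers a probe. A client that is merely busy with a long query still answers, so it is left alone.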

Database Admin

Jun 29, 1995

to lpar...@eskimo.com

How can I find the Oracle processes which don't
have an open socket?

Thanks!

Database Admin

Jun 29, 1995

to lpar...@eskimo.com

Your article mentioned the application timer feature, but if this is not
present, how can I find the Oracle processes (backend or shadow) which don't
have a socket anymore because the application or their parent has died?

Response appreciated.
email : gsin...@us.oracle.com
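
One possible way to approach the question above, as a sketch only. It assumes a Linux-style /proc layout where each entry in /proc/<pid>/fd is a symlink and sockets show up as "socket:[inode]". That is not how the Solaris 2.4 box in this thread exposes things, where a tool such as lsof would be needed instead. All function names and the sample data are made up.

```python
import os
from typing import Dict, List


def has_socket(fd_targets: List[str]) -> bool:
    """True if any open descriptor points at a socket."""
    return any(target.startswith("socket:") for target in fd_targets)


def read_fd_targets(pid: int) -> List[str]:
    """Read the symlink targets under /proc/<pid>/fd (Linux-style layout assumed)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            continue  # descriptor closed while we were looking
    return targets


def pids_without_socket(fd_map: Dict[int, List[str]]) -> List[int]:
    """Return the PIDs whose descriptor list contains no socket."""
    return sorted(pid for pid, targets in fd_map.items() if not has_socket(targets))


# Made-up sample: PID 101 still has its client socket, PID 102 does not.
sample = {
    101: ["/dev/null", "socket:[48213]"],
    102: ["/dev/null", "/u01/oradata/system.dbf"],
}
print(pids_without_socket(sample))
```

The result is only a list of candidates. Oracle background processes and sessions connected locally through pipes have no client socket either, so each PID would need checking before anything is killed.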

Joel Garry

Jul 5, 1995

In article <3st0u8$a...@newsserver.trl.OZ.AU> a...@phantom.telecom.com.au (Tony Jambu) writes:
>In article <3smds6$j...@ixnews5.ix.netcom.com>, chu...@ix.netcom.com (Chuck Hamilton) writes:
>> In <DApr6...@koko.csustan.edu> jfru...@ainet.com (John Fruetel)
>> writes:
>> >
>> >You might want to see if you can fiddle with the TCP/IP KEEPALIVE
>> >option on your host. I don't know about Solaris, but this can be
>> >changed on SCO Unix from the default of two hours to something else (5
>> >minutes in our case) and any "zombies" get nuked after 5 minutes.
>>
>> Wouldn't that also "nuke" processes that were running long ( > 5
>> minute) queries?
>
>No, because the TCP daemon will try to communicate with the PC by
>sending a probe. If no answer is received, it waits for 75 seconds.
>It then sends another probe. This goes on 10 times, and ONLY then
>does it kill the process running on the server.
>
>Having said that, I have to qualify this. The original poster's question
>was to do with ZOMBIES (defunct processes) and client PCs rebooting.
>These are two different problems.
>
>Zombies occur when the child process gets killed and is trying to communicate
>with the parent process to inform it and is unable to. It then becomes
>a zombie or defunct process. Zombie processes DO NOT use any Oracle
>resources nor take any machine resources, so they are not a problem.
>
>

That is, if you don't consider
PROC:TABLE IS FULL
blowing off random users a problem.

_I_ know how to deal with it; the problem is that new customers who don't
know how to configure their systems find this very frightening.
--
Joel Garry joe...@amber.rossinc.com Compuserve 70661,1534
These are my opinions, not necessarily those of Ross Systems, Inc.
%DCL-W-SOFTONEDGEDONTPUSH, Software On Edge - Don't Push.
panic: ifree: freeing free inodes...
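
On the process table point raised above: a defunct process keeps its slot in the process table until its parent collects the exit status with wait(), so enough of them can fill the table. A rough sketch for spotting them follows. It assumes `ps -ef` style output where the command column shows `<defunct>` and the second and third columns are PID and PPID. The sample listing is invented.

```python
from typing import List, Tuple


def find_defunct(ps_output: str) -> List[Tuple[int, int]]:
    """Return (pid, ppid) pairs for rows marked <defunct> in ps -ef style output."""
    found = []
    for line in ps_output.splitlines():
        if "<defunct>" not in line:
            continue
        fields = line.split()
        # Expected row start: UID PID PPID ...; skip rows that do not fit.
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            found.append((int(fields[1]), int(fields[2])))
    return found


# Invented sample listing for illustration.
sample = """\
     UID   PID  PPID  C    STIME TTY      TIME CMD
  oracle  2101     1  0 09:14:02 ?        0:03 oracleORCL (LOCAL=NO)
  oracle  2140  2101  0                   0:00 <defunct>
"""
print(find_defunct(sample))
```

The PPID is the useful half of each pair. A zombie cannot be killed directly, so the usual fix is to get the parent to reap it, or to restart the parent.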