Well, I wish I ran to the server as soon as I noticed that- it's not my
workstation. It's the server itself. If I connect to SQL Plus, with all
the performance counters running, the server (cpu utilization, and all
processes' utlilization) barely blinks an eye. As soon as I disconnect from
SQL Plus, the CPU utlization jumps to 70%, falls back to about 50%, then
jumps up to about 60%, and as soon as the SQL prompt comes back, CPU
utilization goes to zero again.
I ran the counters for all the oracle background processes (PMON, LGWR,
etc), and they don't move; they're near zero. It's oracle.exe itself that
has high CPU % !! What could possibly be happening for there to be such a
load for a logoff? The database runs fast otherwise (large queries
included), but that logoff is the culprit. Even if I log on to sql plus,
and quickly disconnect without doing ANY dml/dll/select ops, the delay is
there. I even booted everyone off the system on Sunday night (okay, well,
nobody was on but me), and the same thing happens.
That was the extent of my real-life tests, by the way- I opened one sqlplus
session on the server, and after I was connected, with perfmon.msc running,
I would just issue conn/disc commands over & over again- and it doesn't
matter which Oracle user I log in as... I even created a user account for
this test, and granted only the create session privlege- so the user had no
access to any other tables other than 6 sys tables... hopefully
"eliminating" any triggers that are causing the problem. Same thing... all
the Oracle background processes (perfmon.msc- threads object) are quiet, but
the oracle.exe (perfmon.msc- process object) spikes up, and follows the CPU
(perfmon.msc- processor object) utilization graph.
The system wasn't always like this... back in October, I applied a patch to
our database (from our vendor) that changed some triggers around. I'm
wondering if something got added to the schema- but I can't seem to find it.
To perform these patches, I logged in as the application's schema owner- not
as sys- so I doubt the patches could've added a SYS trigger!
I checked all_triggers and dba_triggers for any event-driven logoff
procedures, and didn't find any. I shutdown/started the instance, nothing
changed. I disabled the server's virus scanner, nothing changed. I even
rebooted the server, and nothing changed. The disks aren't bogging down,
and if it was just the internet connection, then oracle.exe itself wouldn't
be using up CPU time. This happens for all our applications (slow logoff),
too. What's strange is that I have an old Pentium 3, 600 mhz, 512mb RAM, 6gb
HD as a test/development server, that also runs Oracle 8i. Although it's a
smaller database, it's got no problems- so it's definately not the hardware
on the production server!
Are there any triggers I've missed? Any other places I should check for
things? What happens when a database connection is closed?
We have auditing running; I searched the dba_audit_session, and found a few
"logoff by cleanup" commands, but we haven't had any of those yet today.
What is a "logoff by cleanup", anyway? Should I turn off auditing? Can
that slow a system down? The audit log is in the system tablespace... but
even if I don't do a query, if I just connect and disconnect, that shouldn't
be writing much to the audit table! We have "audit_trail = true" in the
init.ora file; should I comment that line out, and restart the instance, to
see what happens? Or... is there a spot in the audit logs that I've missed
that would help me?
I even took a peek at v$system_event, but the only timeout values were for
the events that "don't matter", like the rdbms ipc message, SQL*Net message
to client, pmon timer, etc. There was a "latch free" even that had 800
timeouts with 835 waits... but in all my simple connect/disconnect tests,
these numbers didn't budge. Any other places I should look?
I'm really confused because I would've expected a slow connection AND a slow
disconnection- or even just a slow connection! But a slow DISconnection? It
doesn't make sense. I'd increase the SGA, but that doesn't make sense
either- it's not slow to query, it's slow for a disconnection! I've got
processes=1000, which is more than adequate- especially if I'm the only one
connected! Some more specs on the parameters... buffer cache is 120 mb (8k
block, db_block_buffers =15000), dml_locks=2000, shared_pool_size=23 mb.
There are 12 datafiles, about 5.1 gb total. Total SGA is 175 mb, which I
thought was good (it's 3% of our total tablespace). Large pool is 0 mb;
we're not running MTS... plus, I seriously doubt this memory area is used by
a database disconnection!
Thanks for any/all clues or ideas or past experience! I've searched the
Oracle docs, but can't really find anything that talks about what happens
when a user closes their Oracle connection. Nothing in the tuning sections
mentions this scenario, either. Anything I've found on google regarding
"oracle.exe high cpu" deals with the background processes (dbw0 etc), not
oracle.exe itself. My main guess is that there's something going on as part
of a database disconnection that I don't know about. Sorry this question is
so long... I figure the more info I can give at the beginning, the better!
-Thomas
you aren't running a patchset such as 8.1.7.4.10, are you?
did you check the bdump and udump folders?
you may have processes leaking large amounts of memory at disconnect.
we hit this issue, the solution was to sit back on 8.1.7.4.6.
Pd
Srivenu, I don't think I have any logoff triggers... I had searched the
all_triggers and dba_triggers views' triggering_event columns, but didn't
see any. That covers all the triggers, right? Is there anywhere else they
could hide on me?
But you hit the answer! Thanks for the tip on session trace! I connected
thru SqlPlus, set the trace level to 8, and disconnected. Then I went to
the init.ora's user_dump_dest directory and picked up my trace file- Wow!
It's definately the audit table! I have over 1000 'db file scattered read'
events!
I did some quick poking around; turns out the poor thing is doing a full
table scan of the aud$ table. Apparently when the experts came out to
install our database, they moved the audit table out of the SYS tablespace
into it's own tablespace. Guess what- there's no index on aud$! My guess
is that they didn't know about the 'alter table mytable move tablespace
newtblspace' command... so they probably manually created the second aud$
table, inserted all the data into it, and forgot to recreate the index.
So I've got some work ahead... I'm going to kill auditing from the init.ora
file, cut down the current aud$ table into the useful stuff (such as the
entries at the beginning, when the system was installed- or when any table
changes were made). Then I'll create another tablespace, create an
audit_old table and copy the data in (probably with nologging). Then I'll
mark that new tablespace as read-only and take it offline. Finally I'll
truncate the original aud$ table, re-create the index on aud$ for sessionid
and sess$tid, and set up explicit auditing instead of auditing the whole
darn database. Anyone see any problems with this plan?
Thanks! I have to wonder if this over-extensive auditing was meant "just"
for installation, and was supposed to be turned off when the installation
was finished.
-Thomas
sorry - I forgot to mention, that the parameter cursor_sharing=FORCE
with 8.1.7.4.10 causes memory leaks. setting cursor_sharing back to
EXACT caused the memory leaks to stop.
Pd
regards
Srivenu