Surprising Performance Changes with Oracle 11.2.0.1 (Long Post)

Charles Hooper

unread,

Sep 4, 2009, 9:04:37 PM9/4/09

to

I am working on an Oracle performance related project, so I thought
that I would compare the performance of Oracle database 11.1.0.6 (on
64 bit Linux), 11.1.0.7 (on 64 bit Windows), and 11.2.0.1 (on 64 bit
Linux). I set up a test case with a table containing 100,000,000 rows
having an average row length of 53 bytes. A normal B*tree index
exists on a numeric column which contains numbers ranging from 0 to
10,000. A test query will be selecting 2,547,158 (2.55%) of the rows
from the table. Leaving OPTIMIZER_INDEX_COST_ADJ at the default of
100 with NOWORKLOAD system statistics results in a full table in all
three releases of Oracle. Setting OPTIMIZER_INDEX_COST_ADJ to 5, for
instance, results in an index access path. So, the question is:
should the OPTIMIZER_INDEX_COST_ADJ parameter be set to the lower
number to (quoting from a posting on the Internet) “immediately tune
all of the SQL in your database to favor index scans over full-table
scans”? I am certain that I know the answer based on extensive
testing, but what are your thoughts?

What about the performance results?
* Oracle 11.1.0.7 on 64 bit Windows finished the full tablescan access
path for the query in 35.38 seconds and the index access path in 1
hour, 38 minutes and 7 seconds.
* Oracle 11.1.0.6 on 64 bit Oracle Enterprise Linux 5 required 41.98
seconds for the full tablescan access path and one hour, 14 minutes
and 40 seconds for the index access path.
* Oracle 11.2.0.1 required 13.13 seconds (a significant drop from the
41.98 seconds received on the same server with the same operating
system and disk subsystem) and 22.13 seconds for the index access path
(an unbelievable drop from the one hour, 14 minutes and 40 seconds on
the same server, operating system, and I/O subsystem).

Obviously from the above, 11.2.0.1 is significantly faster at the
index access than is 11.1.0.6 on the same platform. But wait, I
forgot something. On 11.1.0.6 on Linux I had enabled direct I/O and
asynchronous I/O by setting the FILESYSTEMIO_OPTIONS parameter to
SETALL. I did not change that parameter on 11.2.0.1, so it defaulted
to NONE. What happens when the FILESYSTEMIO_OPTIONS parameter is set
to SETALL?
* Oracle 11.2.0.1 now required 42.45 seconds for the full tablescan
while the index access path required one hour, 16 minutes and 46
seconds.

Wait Event Summary:
SQL*Net message to client 0.025758 On Client/Network Min
Wait: 0.000001 Avg Wait: 0.000001 Max Wait: 0.000040
db file sequential read 0.048904 On DB Server Min
Wait: 0.004940 Avg Wait: 0.016301 Max Wait: 0.034032
direct path read 24.058316 On DB Server Min
Wait: 0.000001 Avg Wait: 0.003651 Max Wait: 0.121805
SQL*Net message from client 4.663131 On Client/Network Min
Wait: 0.000110 Avg Wait: 0.000183 Max Wait: 0.000638
db file scattered read 0.006442 On DB Server Min
Wait: 0.006442 Avg Wait: 0.006442 Max Wait: 0.006442

Wait Event Summary:
SQL*Net message to client 0.033969 On Client/Network Min
Wait: 0.000001 Avg Wait: 0.000001 Max Wait: 0.000010
db file sequential read 462.867187 On DB Server Min
Wait: 0.000134 Avg Wait: 0.005942 Max Wait: 0.128406
SQL*Net message from client 4.649001 On Client/Network Min
Wait: 0.000086 Avg Wait: 0.000183 Max Wait: 0.000655
db file parallel read 3988.799453 On DB Server Min
Wait: 0.003374 Avg Wait: 0.052228 Max Wait: 0.184168
db file scattered read 0.869661 On DB Server Min
Wait: 0.000189 Avg Wait: 0.005998 Max Wait: 0.048183

Wait Event Summary:
SQL*Net message to client 0.017904 On Client/Network Min
Wait: 0.000001 Avg Wait: 0.000001 Max Wait: 0.000021
Disk file operations I/O 0.001040 On DB Server Min
Wait: 0.001040 Avg Wait: 0.001040 Max Wait: 0.001040
db file sequential read 0.000047 On DB Server Min
Wait: 0.000012 Avg Wait: 0.000016 Max Wait: 0.000019
direct path read 1.674106 On DB Server Min
Wait: 0.000028 Avg Wait: 0.000262 Max Wait: 0.000703
SQL*Net message from client 4.359015 On Client/Network Min
Wait: 0.000081 Avg Wait: 0.000171 Max Wait: 0.001026
db file scattered read 0.000023 On DB Server Min
Wait: 0.000023 Avg Wait: 0.000023 Max Wait: 0.000023

Wait Event Summary:
SQL*Net message to client 0.013756 On Client/Network Min
Wait: 0.000001 Avg Wait: 0.000001 Max Wait: 0.000033
db file sequential read 0.413121 On DB Server Min
Wait: 0.000003 Avg Wait: 0.000005 Max Wait: 0.000270
SQL*Net message from client 2.358266 On Client/Network Min
Wait: 0.000066 Avg Wait: 0.000093 Max Wait: 0.000787
db file parallel read 7.574612 On DB Server Min
Wait: 0.000008 Avg Wait: 0.000099 Max Wait: 0.000482
db file scattered read 0.000842 On DB Server Min
Wait: 0.000019 Avg Wait: 0.000034 Max Wait: 0.000042

Wait Event Summary:
SQL*Net message to client 0.020031 On Client/Network Min
Wait: 0.000001 Avg Wait: 0.000001 Max Wait: 0.000027
Disk file operations I/O 0.001034 On DB Server Min
Wait: 0.001034 Avg Wait: 0.001034 Max Wait: 0.001034
db file sequential read 0.022515 On DB Server Min
Wait: 0.006175 Avg Wait: 0.007505 Max Wait: 0.008720
direct path read 24.816811 On DB Server Min
Wait: 0.000060 Avg Wait: 0.004305 Max Wait: 0.037980
SQL*Net message from client 5.124648 On Client/Network Min
Wait: 0.000159 Avg Wait: 0.000201 Max Wait: 0.001608
db file scattered read 0.005516 On DB Server Min
Wait: 0.005516 Avg Wait: 0.005516 Max Wait: 0.005516

Wait Event Summary:
SQL*Net message to client 0.028755 On Client/Network Min
Wait: 0.000001 Avg Wait: 0.000001 Max Wait: 0.000031
db file sequential read 484.066367 On DB Server Min
Wait: 0.000130 Avg Wait: 0.006024 Max Wait: 0.143995
SQL*Net message from client 4.916161 On Client/Network Min
Wait: 0.000105 Avg Wait: 0.000193 Max Wait: 0.001608
db file parallel read 4091.261439 On DB Server Min
Wait: 0.005298 Avg Wait: 0.053553 Max Wait: 0.195775
db file scattered read 0.175371 On DB Server Min
Wait: 0.000354 Avg Wait: 0.007307 Max Wait: 0.015409

Posing the question again: should the OPTIMIZER_INDEX_COST_ADJ
parameter be set to the lower number to “immediately tune all of the
SQL in your database to favor index scans over full-table scans”?
Some questions do not have simple answers.

Incidentally, essentially the same parameters were used for 11.1.0.6,
11.1.0.7 and 11.2.0.1 with the SGA_TARGET set to 8G, the
DB_KEEP_CACHE_SIZE set to 6G (the test table used the DEFAULT buffer
pool), the PGA_AGGREGATE_TARGET set to 1800M, and most other
parameters at their default values.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc .

Jonathan Lewis

unread,

Sep 5, 2009, 2:04:45 AM9/5/09

to

> "Charles Hooper" <hoope...@yahoo.com> wrote in message
> news:1ea99c67-8713-4ece...@v2g2000vbb.googlegroups.com...

> I am working on an Oracle performance related project, so I thought
> that I would compare the performance of Oracle database 11.1.0.6 (on
> 64 bit Linux), 11.1.0.7 (on 64 bit Windows), and 11.2.0.1 (on 64 bit
> Linux).

...

Charles,

Interesting figures.
A couple of follow-on questions:
How much memory in the machine ?
How many CPUs
Were you running the client locally - ipc or tcp
How much CPU usage "outside" Oracle in each test
Any significant variation in 'prefetch' blocks
Any other significant variation in stats

--
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Aman Sharma

unread,

Sep 5, 2009, 3:27:31 AM9/5/09

to

Very nice Charles!

regards
Aman....

Robert Klemme

unread,

Sep 5, 2009, 6:41:53 AM9/5/09

to

On 05.09.2009 08:04, Jonathan Lewis wrote:
>> "Charles Hooper" <hoope...@yahoo.com> wrote in message
>> news:1ea99c67-8713-4ece...@v2g2000vbb.googlegroups.com...
>> I am working on an Oracle performance related project, so I thought
>> that I would compare the performance of Oracle database 11.1.0.6 (on
>> 64 bit Linux), 11.1.0.7 (on 64 bit Windows), and 11.2.0.1 (on 64 bit
>> Linux).
>
> ...
>
>
> Charles,
>
> Interesting figures.
> A couple of follow-on questions:
> How much memory in the machine ?
> How many CPUs
> Were you running the client locally - ipc or tcp
> How much CPU usage "outside" Oracle in each test
> Any significant variation in 'prefetch' blocks
> Any other significant variation in stats

I am also wondering about table and index stats. Charles, I see you
only mention system stats with NOWORKLOAD. What about stats for the
table and what about system stats gathered with INTERVAL and / or
START/STOP?

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Matthias Hoys

unread,

Sep 5, 2009, 11:17:39 AM9/5/09

to

"Jonathan Lewis" <jona...@jlcomp.demon.co.uk> wrote in message
news:stydnRbjqr7nYjzX...@bt.com...

And what disk subsystem did you use? DAS or a shared system?

Matthias

Charles Hooper

unread,

Sep 5, 2009, 11:28:54 AM9/5/09

to

On Sep 5, 2:04 am, "Jonathan Lewis" <jonat...@jlcomp.demon.co.uk>
wrote:
> > "Charles Hooper" <hooperc2...@yahoo.com> wrote in message

> >news:1ea99c67-8713-4ece...@v2g2000vbb.googlegroups.com...
> > I am working on an Oracle performance related project, so I thought
> > that I would compare the performance of Oracle database 11.1.0.6 (on
> > 64 bit Linux), 11.1.0.7 (on 64 bit Windows), and 11.2.0.1 (on 64 bit
> > Linux).
>
> ...
>
> Charles,
>
> Interesting figures.
> A couple of follow-on questions:
> How much memory in the machine ?
> How many CPUs
> Were you running the client locally - ipc or tcp
> How much CPU usage "outside" Oracle in each test
> Any significant variation in 'prefetch' blocks
> Any other significant variation in stats
>
> --
> Regards
>
> Jonathan Lewis
> http://jonathanlewis.wordpress.com
>

> Author: Cost Based Oracle: Fundamentalshttp://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

>
> The Co-operative Oracle Users' FAQhttp://www.jlcomp.demon.co.uk/faq/ind_faq.html

12GB of memory, 1 CPU (a Core i7 940 at 2.93GHz), local connection
with no listener started.

Robert, good question. Yes, table and index stats were collected with
no histograms.

A short sampling of the waits found during the index scan (near the
start of the trace) with an unset FILESYSTEMIO_OPTIONS (69689 is the
table, 69690 is the index):
FETCH
#8:c=0,e=897,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252080204301576
WAIT #8: nam='SQL*Net message from client' ela= 103 driver
id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252080204301706
WAIT #8: nam='db file parallel read' ela= 173 files=1 blocks=39
requests=39 obj#=69689 tim=1252080204301959
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1350890
blocks=1 obj#=69689 tim=1252080204302009
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232
#bytes=1 p3=0 obj#=69689 tim=1252080204302022
WAIT #8: nam='db file parallel read' ela= 192 files=1 blocks=39
requests=39 obj#=69689 tim=1252080204302283
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1356525
blocks=1 obj#=69689 tim=1252080204302373
WAIT #8: nam='db file scattered read' ela= 33 file#=7 block#=2162857
blocks=7 obj#=69690 tim=1252080204302442
WAIT #8: nam='db file parallel read' ela= 88 files=1 blocks=19
requests=19 obj#=69689 tim=1252080204302579
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1361383
blocks=1 obj#=69689 tim=1252080204302666
FETCH
#8:c=1000,e=982,p=107,cr=102,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252080204302701
WAIT #8: nam='SQL*Net message from client' ela= 105 driver
id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252080204302823
WAIT #8: nam='db file parallel read' ela= 178 files=1 blocks=39
requests=39 obj#=69689 tim=1252080204303081
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1364081
blocks=1 obj#=69689 tim=1252080204303130
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232
#bytes=1 p3=0 obj#=69689 tim=1252080204303143
WAIT #8: nam='db file parallel read' ela= 174 files=1 blocks=39
requests=39 obj#=69689 tim=1252080204303381
WAIT #8: nam='db file sequential read' ela= 8 file#=7 block#=1369027
blocks=1 obj#=69689 tim=1252080204303468
WAIT #8: nam='db file parallel read' ela= 98 files=1 blocks=19
requests=19 obj#=69689 tim=1252080204303613
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1373988
blocks=1 obj#=69689 tim=1252080204303719
FETCH
#8:c=1000,e=921,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252080204303757

In the above, note the impossibly fast times for the waits - this is
fairly consistent through the trace file. The index blocks were read
by db file scattered read, db file sequential read, but a quick check
of the file did not seem to indicate that db file parallel read was
used for the index blocks. The table blocks were read by db file
scattered read, db file sequential read, and db file parallel read.

A short sampling of the waits found during the index scan (near the
start of the trace) with FILESYSTEMIO_OPTIONS set to SetAll (69689 is
the table, 69690 is the index):
FETCH
#8:c=2000,e=251016,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252088662939190
WAIT #8: nam='SQL*Net message from client' ela= 200 driver
id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252088662939432
WAIT #8: nam='db file parallel read' ela= 94215 files=1 blocks=39
requests=39 obj#=69689 tim=1252088663033884
WAIT #8: nam='db file sequential read' ela= 5612 file#=7
block#=1397615 blocks=1 obj#=69689 tim=1252088663039645
WAIT #8: nam='SQL*Net message to client' ela= 2 driver id=1650815232
#bytes=1 p3=0 obj#=69689 tim=1252088663039684
WAIT #8: nam='db file parallel read' ela= 74655 files=1 blocks=39
requests=39 obj#=69689 tim=1252088663114551
WAIT #8: nam='db file sequential read' ela= 3377 file#=7
block#=1402576 blocks=1 obj#=69689 tim=1252088663118159
WAIT #8: nam='db file parallel read' ela= 41403 files=1 blocks=19
requests=19 obj#=69689 tim=1252088663159699
WAIT #8: nam='db file sequential read' ela= 8198 file#=7
block#=1407689 blocks=1 obj#=69689 tim=1252088663168086
FETCH
#8:c=2999,e=228700,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252088663168161
WAIT #8: nam='SQL*Net message from client' ela= 219 driver
id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252088663168420
WAIT #8: nam='db file scattered read' ela= 7961 file#=7 block#=1346037
blocks=3 obj#=69690 tim=1252088663176493
WAIT #8: nam='db file parallel read' ela= 61848 files=1 blocks=39
requests=39 obj#=69689 tim=1252088663238666
WAIT #8: nam='db file sequential read' ela= 54241 file#=7
block#=1410476 blocks=1 obj#=69689 tim=1252088663293049
WAIT #8: nam='SQL*Net message to client' ela= 2 driver id=1650815232
#bytes=1 p3=0 obj#=69689 tim=1252088663293115
WAIT #8: nam='db file parallel read' ela= 87347 files=1 blocks=39
requests=39 obj#=69689 tim=1252088663380663
WAIT #8: nam='db file sequential read' ela= 9152 file#=7
block#=1415385 blocks=1 obj#=69689 tim=1252088663390014
WAIT #8: nam='db file parallel read' ela= 30242 files=1 blocks=19
requests=19 obj#=69689 tim=1252088663420387
WAIT #8: nam='db file sequential read' ela= 6660 file#=7
block#=1420835 blocks=1 obj#=69689 tim=1252088663427228
FETCH
#8:c=4000,e=258841,p=103,cr=102,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252088663427311

Elapsed time for the full tablescan with with direct I/O and asynch I/
O and other monitoring enabled: roughly 42 seconds

Top output with direct I/O and asynch I/O enabled during the full
tablescan operation (terminated about 5 seconds into the run):
top - 09:48:21 up 1:40, 4 users, load average: 0.42, 0.23, 0.15
Tasks: 244 total, 1 running, 243 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.7%us, 1.0%sy, 0.0%ni, 85.5%id, 9.6%wa, 0.2%hi,
0.1%si, 0.0%st
Mem: 12291184k total, 1694080k used, 10597104k free, 93000k
buffers
Swap: 14352376k total, 0k used, 14352376k free, 1123524k
cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
7090 oracle 16 0 8416m 125m 112m S 24.3 1.0 0:10.34
oracle
7082 oracle 18 0 65468 12m 9032 S 9.6 0.1 0:04.15
sqlplus
4966 oracle 15 0 284m 18m 8964 S 2.3 0.2 0:03.53 gnome-
terminal
4848 oracle 15 0 492m 30m 12m S 0.7 0.3 0:18.77
nautilus
718 root 10 -5 0 0 0 S 0.3 0.0 0:05.19 usb-
storage
...

V$OSSTAT delta values (60 second capture) with direct I/O and asynch I/
O enabled during the full tablescan operation:
STAT_NAME END_VALUE DELTA DELTA_SEC
------------------------- ---------- ----------- ------------
BUSY_TIME 5912 1759 29.32
GLOBAL_RECEIVE_SIZE_MAX 4194304 0 0.00
GLOBAL_SEND_SIZE_MAX 1048586 0 0.00
IDLE_TIME 864418 46730 778.83
IOWAIT_TIME 4973 3424 57.07
LOAD .319335938 0 0.00
NICE_TIME 0 0 0.00
NUM_CPUS 8 0 0.00
NUM_CPU_CORES 4 0 0.00
NUM_CPU_SOCKETS 1 0 0.00
PHYSICAL_MEMORY_BYTES 1.2586E+10 0 0.00
RSRC_MGR_CPU_WAIT_TIME 0 0 0.00
SYS_TIME 1353 383 6.38
TCP_RECEIVE_SIZE_DEFAULT 87380 0 0.00
TCP_RECEIVE_SIZE_MAX 4194304 0 0.00
TCP_RECEIVE_SIZE_MIN 4096 0 0.00
TCP_SEND_SIZE_DEFAULT 16384 0 0.00
TCP_SEND_SIZE_MAX 4194304 0 0.00
TCP_SEND_SIZE_MIN 4096 0 0.00
USER_TIME 4120 1266 21.10

V$SYS_TIME_MODEL delta values (60 second capture) with direct I/O and
asynch I/O enabled during the full tablescan operation:
STAT_NAME DELTA
DELTA_SEC
-------------------------------------------------- -----------
------------
DB CPU 10574389
176239.82
DB time 40284606
671410.10
Java execution elapsed time 0
0.00
PL/SQL compilation elapsed time 135017
2250.28
PL/SQL execution elapsed time 799537
13325.62
RMAN cpu time (backup/restore) 0
0.00
background cpu time 262959
4382.65
background elapsed time 6982341
116372.35
connection management call elapsed time 310170
5169.50
failed parse (out of shared memory) elapsed time 0
0.00
failed parse elapsed time 0
0.00
hard parse (bind mismatch) elapsed time 62740
1045.67
hard parse (sharing criteria) elapsed time 114156
1902.60
hard parse elapsed time 3062463
51041.05
inbound PL/SQL rpc elapsed time 0
0.00
parse time elapsed 3414383
56906.38
repeated bind elapsed time 7238
120.63
sequence load elapsed time 9007
150.12
sql execute elapsed time 39621673
660361.22

------------------------------------

Elapsed time for the full tablescan with other monitoring enabled:
11.97 seconds

Top output with unset filesystemio_options during the full tablescan
operation (terminated about 5 seconds into the run):
top - 10:26:47 up 2:18, 4 users, load average: 0.41, 0.28, 0.20
Tasks: 245 total, 2 running, 243 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.5%us, 3.6%sy, 0.0%ni, 86.8%id, 0.1%wa, 0.0%hi,
0.0%si, 0.0%st
Mem: 12291184k total, 8364088k used, 3927096k free, 95948k
buffers
Swap: 14352376k total, 0k used, 14352376k free, 7768492k
cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
7936 oracle 18 0 8416m 138m 125m S 74.2 1.2 0:20.07
oracle
4994 oracle 18 0 69240 16m 9392 R 25.6 0.1 0:16.30
sqlplus
4727 root 15 0 173m 43m 23m S 1.7 0.4 1:26.83
Xorg
4848 oracle 15 0 492m 30m 12m S 1.3 0.3 0:25.46
nautilus
7870 oracle 16 0 8401m 69m 67m S 0.7 0.6 0:00.14 oracle
...

V$OSSTAT delta values (60 second capture) with unset
filesystemio_options during the full tablescan operation:
STAT_NAME END_VALUE DELTA DELTA_SEC
------------------------- ---------- ----------- ------------
BUSY_TIME 4843 1464 24.40
GLOBAL_RECEIVE_SIZE_MAX 4194304 0 0.00
GLOBAL_SEND_SIZE_MAX 1048586 0 0.00
IDLE_TIME 194252 46582 776.37
IOWAIT_TIME 66 30 0.50
LOAD .559570313 0 0.00
NICE_TIME 0 0 0.00
NUM_CPUS 8 0 0.00
NUM_CPU_CORES 4 0 0.00
NUM_CPU_SOCKETS 1 0 0.00
PHYSICAL_MEMORY_BYTES 1.2586E+10 0 0.00
RSRC_MGR_CPU_WAIT_TIME 0 0 0.00
SYS_TIME 1218 380 6.33
TCP_RECEIVE_SIZE_DEFAULT 87380 0 0.00
TCP_RECEIVE_SIZE_MAX 4194304 0 0.00
TCP_RECEIVE_SIZE_MIN 4096 0 0.00
TCP_SEND_SIZE_DEFAULT 16384 0 0.00
TCP_SEND_SIZE_MAX 4194304 0 0.00
TCP_SEND_SIZE_MIN 4096 0 0.00
USER_TIME 3561 1067 17.78

V$SYS_TIME_MODEL delta values (60 second capture) with unset
filesystemio_options during the full tablescan operation:
STAT_NAME DELTA
DELTA_SEC
-------------------------------------------------- -----------
------------
DB CPU 9641528
160692.13
DB time 10092101
168201.68
Java execution elapsed time 0
0.00
PL/SQL compilation elapsed time 49928
832.13
PL/SQL execution elapsed time 227048
3784.13
RMAN cpu time (backup/restore) 0
0.00
background cpu time 124977
2082.95
background elapsed time 339846
5664.10
connection management call elapsed time 42929
715.48
failed parse (out of shared memory) elapsed time 0
0.00
failed parse elapsed time 0
0.00
hard parse (bind mismatch) elapsed time 718
11.97
hard parse (sharing criteria) elapsed time 48453
807.55
hard parse elapsed time 1029104
17151.73
inbound PL/SQL rpc elapsed time 0
0.00
parse time elapsed 1324548
22075.80
repeated bind elapsed time 21478
357.97
sequence load elapsed time 4464
74.40
sql execute elapsed time 9611995
160199.92

------------------------------------
here
Elapsed time for the index access with other monitoring enabled: 23.44
seconds

Top output with unset filesystemio_options during the index scan
operation (terminated about 5 seconds into the run):
top - 10:41:56 up 2:33, 4 users, load average: 0.10, 0.09, 0.12
Tasks: 247 total, 2 running, 245 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.4%us, 5.0%sy, 0.0%ni, 87.1%id, 0.4%wa, 0.1%hi,
0.0%si, 0.0%st
Mem: 12291184k total, 8811144k used, 3480040k free, 97216k
buffers
Swap: 14352376k total, 0k used, 14352376k free, 8205800k
cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
7936 oracle 18 0 8412m 652m 641m R 88.5 5.4 0:32.71
oracle
4994 oracle 18 0 69240 16m 9392 S 9.3 0.1 0:19.61
sqlplus
7924 oracle 15 0 8405m 74m 68m S 0.7 0.6 0:00.43
oracle
6 root 34 19 0 0 0 S 0.3 0.0 0:02.67 ksoftirqd/
1
718 root 10 -5 0 0 0 S 0.3 0.0 0:08.02 usb-
storage
...

V$OSSTAT delta values (60 second capture) with unset
filesystemio_options during the index scan operation:
STAT_NAME END_VALUE DELTA DELTA_SEC
------------------------- ---------- ----------- ------------
BUSY_TIME 9826 2459 40.98
GLOBAL_RECEIVE_SIZE_MAX 4194304 0 0.00
GLOBAL_SEND_SIZE_MAX 1048586 0 0.00
IDLE_TIME 913964 45579 759.65
IOWAIT_TIME 256 123 2.05
LOAD .33984375 0 0.01
NICE_TIME 0 0 0.00
NUM_CPUS 8 0 0.00
NUM_CPU_CORES 4 0 0.00
NUM_CPU_SOCKETS 1 0 0.00
PHYSICAL_MEMORY_BYTES 1.2586E+10 0 0.00
RSRC_MGR_CPU_WAIT_TIME 0 0 0.00
SYS_TIME 2807 962 16.03
TCP_RECEIVE_SIZE_DEFAULT 87380 0 0.00
TCP_RECEIVE_SIZE_MAX 4194304 0 0.00
TCP_RECEIVE_SIZE_MIN 4096 0 0.00
TCP_SEND_SIZE_DEFAULT 16384 0 0.00
TCP_SEND_SIZE_MAX 4194304 0 0.00
TCP_SEND_SIZE_MIN 4096 0 0.00
USER_TIME 6618 1459 24.32

V$SYS_TIME_MODEL delta values (60 second capture) with unset
filesystemio_options during the index scan operation:
STAT_NAME DELTA
DELTA_SEC
-------------------------------------------------- -----------
------------
DB CPU 21543721
359062.02
DB time 22548982
375816.37
Java execution elapsed time 0
0.00
PL/SQL compilation elapsed time 179573
2992.88
PL/SQL execution elapsed time 210949
3515.82
RMAN cpu time (backup/restore) 0
0.00
background cpu time 184972
3082.87
background elapsed time 482752
8045.87
connection management call elapsed time 52331
872.18
failed parse (out of shared memory) elapsed time 0
0.00
failed parse elapsed time 0
0.00
hard parse (bind mismatch) elapsed time 5838
97.30
hard parse (sharing criteria) elapsed time 49516
825.27
hard parse elapsed time 899229
14987.15
inbound PL/SQL rpc elapsed time 0
0.00
parse time elapsed 1097713
18295.22
repeated bind elapsed time 7553
125.88
sequence load elapsed time 5252
87.53
sql execute elapsed time 22255425
370923.75

Test table and index (think of this example as picking up the X or Y
positions of an engine cylinder wall during CMM testing with 32 test
data points per cylinder – the buffer cache is flushed for consistency
and to simulate an infrequently accessed table):
CREATE TABLE T1 (
ID NUMBER,
DESCRIPTION VARCHAR2(80));

INSERT INTO T1
SELECT
CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
'This is the long description for this number '|| TO_CHAR(CEIL(ABS
(SIN(ROWNUM/9.9999)*10000)))
FROM
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=10000),
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=10000);

COMMIT;

CREATE INDEX IND_T1 ON T1(ID);

EXEC DBMS_STATS.GATHER_TABLE_STATS
(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE, METHOD_OPT=>'FOR ALL
COLUMNS SIZE 1')

ANALYZE INDEX IND_T1 VALIDATE STRUCTURE;

Test script:
set linesize 150
set pagesize 2000
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH SHARED_POOL;
SET ARRAYSIZE 100

spool /u01/app/oracle/results/BCHRTestShortNoAsync.txt

SET AUTOTRACE TRACEONLY EXPLAIN
ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

SET ARRAYSIZE 100
SET AUTOTRACE TRACEONLY STATISTICS

SET TIMING ON

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_LARGE100';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_LARGE5';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

spool off

Charles Hooper

unread,

Sep 5, 2009, 12:14:43 PM9/5/09

to

On Sep 5, 11:17 am, "Matthias Hoys" <a...@spam.com> wrote:
> "Jonathan Lewis" <jonat...@jlcomp.demon.co.uk> wrote in message
>
> news:stydnRbjqr7nYjzX...@bt.com...
> >> "Charles Hooper" <hooperc2...@yahoo.com> wrote in message

Direct attached, so the only buffering was in Oracle's buffer cache
and the 32MB of cache memory built into each of the drives. The
problem, I believe, is caused by the drives having to constantly
reposition their heads for each random I/O - I believe that the drives
have either a 5ms or 6ms random I/O time rating. The average time for
the db file sequential read wait is 5.942ms, which I believe is
considered very good for a random I/O.

The trace file shows that Oracle keeps jumping between reads of the
index, followed by several usually parallel reads (I believe that each
of these block reads would also be considered a random I/O) of the
table blocks.

On an unrelated note, it may take over an hour to generate the sample
T1 table, so fair warning for anyone wanting to try the test on their
system.

Jonathan Lewis

unread,

Sep 5, 2009, 12:35:35 PM9/5/09

to

"Charles Hooper" <hoope...@yahoo.com> wrote in message

news:20921d95-425c-4a2e-8442

COMMIT;

spool /u01/app/oracle/results/BCHRTestShortNoAsync.txt

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;

SET TIMING ON

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;

spool off

Charles,

My numbers may be wrong, but I estimate that your table is
about 6.5GB and your index is about 1.5GB, on which you do
a range scan of 4% for a total 600MB likely to be buffered.

When you finish the range scan versions, how much of the
buffer cache is still free. On one hand we might expect the
entire table and the section of index to be buffered - leaving
about 1GB free - on the other hand Oracle may have been
re-using buffers for the range scan even though the number of
free buffers was huge. (Consider the possibility that you are
reading into the cold half only - this probably shouldn't be
happening after a flush buffer cache, but if it is your buffer
cache is effectively 4GB instead of 8GB).

A possible interpretation of the big difference in figures is as
follows: when doing the range scan you visit many blocks in
the table 3 or 4 times (due to the cyclic but non-uniform nature
of your data). If Oracle is re-using buffers instead of consuming
free buffers all the time then you have to re-read a lot of buffers.

When you enable direct I/O, all those reads come from disk - if
async i/o is also enabled many of those reads might be competing
with each other through different AIO processes.

When you disable direct I/O you have an extra 4GB of file system
buffer backing the Oracle buffer cache, and do far fewer real disk
accesses.

When you do the tablescan, you visit each block only once - when
you do direct I/O you get readahead benefits from the hardware
and don't waste CPU double-buffering through the file system.

When you disable direct IO you use more CPU because of the double
buffering - but because of the long physical reads you don't lose any extra
time on the physical I/O.

You might like to reboot the hardware between runs to eliminate any
filesystem and SAN caching if you want to do a painfully rigorous test,
of course, but I don't think it would affect my guess by much.

Mladen Gogala

unread,

Sep 5, 2009, 1:00:53 PM9/5/09

to

On Fri, 04 Sep 2009 18:04:37 -0700, Charles Hooper wrote:

> Obviously from the above, 11.2.0.1 is significantly faster at the index
> access than is 11.1.0.6 on the same platform. But wait, I forgot
> something. On 11.1.0.6 on Linux I had enabled direct I/O and
> asynchronous I/O by setting the FILESYSTEMIO_OPTIONS parameter to
> SETALL. I did not change that parameter on 11.2.0.1, so it defaulted to
> NONE. What happens when the FILESYSTEMIO_OPTIONS parameter is set to
> SETALL?
> * Oracle 11.2.0.1 now required 42.45 seconds for the full tablescan
> while the index access path required one hour, 16 minutes and 46
> seconds.

Charles, that is to be expected, although this is a bit drastic
difference. Most file systems do read prefetch and even asynchronous
reads of the blocks when using buffer cache. Also, just as Oracle RDBMS,
Unix derivatives also have smart strategies for keeping the hot blocks in
the cache, which results in rather drastic performance gains. If you
start doing direct I/O, lose all the help from the system buffer cache
and OS. You can help things by using readahead* RH EL services and
"blockdev" command to increase read-ahead.

--
http://mgogala.freehostia.com

Charles Hooper

unread,

Sep 5, 2009, 4:26:42 PM9/5/09

to

On Sep 5, 12:35 pm, "Jonathan Lewis" <jonat...@jlcomp.demon.co.uk>
wrote:
> "Charles Hooper" <hooperc2...@yahoo.com> wrote in message

>
> news:20921d95-425c-4a2e-8442
>
> CREATE TABLE T1 (
> ID NUMBER,
> DESCRIPTION VARCHAR2(80));
>
> INSERT INTO T1
> SELECT
> CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
> 'This is the long description for this number '|| TO_CHAR(CEIL(ABS
> (SIN(ROWNUM/9.9999)*10000)))
> FROM
> (SELECT
> ROWNUM RN
> FROM
> DUAL
> CONNECT BY
> LEVEL<=10000),
> (SELECT
> ROWNUM RN
> FROM
> DUAL
> CONNECT BY
> LEVEL<=10000);
>
> COMMIT;
>

Jonathan,

Those are very good estimates. The table and index reside in an ASSM
autoallocate tablespace. The table and index statistics looked like
this during one of the test runs (either on 64 bit Windows with
11.1.0.7 or 64 bit Windows with 10.2.0.4, but it sometimes changed in
repeated tests):
SEGMENT EXTENTS EXT_SIZE_KB TOTAL_MB
------- -------- ------------ ---------
IND_T1 16 64 1
IND_T1 63 1,024 63
IND_T1 120 8,192 960
IND_T1 10 65,536 640
T1 16 64 1
T1 63 1,024 63
T1 120 8,192 960
T1 1 15,360 15
T1 1 22,528 22
T1 1 63,488 62
T1 82 65,536 5,248

DBA_INDEXES reported the CLUSTERING_FACTOR as 101,149,320.
INDEX_STATS reported the following with an 8KB block size:
HEIGHT BLOCKS LF_BLKS LF_ROWS DISTINCT_KEYS
MOST_REPEATED_KEY PCT_USED
------ -------- -------- ------------ ------------- -----------------
----------
3 212,992 208,854 100,000,000 10,000
900,324 90

212,992 * 8,192 = 1,744,830,464 = 1.625 GB

The table extents add up to about 6.22GB, but of course it is possible
that the last 64MB extent was not fully used (this seems to be
confirmed by the number of physical reads).

Unfortunately, I did not check how much free space was available in
the buffer cache following the runs. The full tablescans primarily
performed direct path reads which I believe would prevent the blocks
from being cached in the buffer cache (this was also a bit of a
surprise as I was expecting db file scattered read waits).
Considering that there were 2,573,633 consistent block reads and
2,508,560 physical block reads during the test with 11.2.0.1 with
direct I/O and asych I/O enabled, and only 838,370 consistent block
reads and 813,120 physical block reads during the full tablescan, it
would seem that not many of the blocks remained in the buffer cache.
On Windows, __DB_CACHE_SIZE had a value of about 1,375,731,712 at the
end of the test. I did not check the value on Linux after the test
run. On Windows I actually performed one other test, which
essentially set the KEEP buffer cache to a very small value. This
allowed the __DB_CACHE_SIZE parameter to increase from roughly 1.3GB
to roughly 6.6GB which decreased the time for the full tablescan to
31.68 seconds, and the index range scan decreased to 32 minutes and
40.27 seconds.

Mladen, thanks for sharing that information.

This thread seems to fit in nicely with a couple recent blog posts on
Jonathan's website: Queue Time, and Real World. However, I must say
that everyone in this thread did not jump to the conclusion that when
selecting 2.55% of a table, that an index should be used rather than a
full tablescan.

There were, of course, several surprises:
* A tablescan of a "large" table would correctly be preferred by
Oracle when selecting 2.55% of the rows in the test table.
* A tablescan is significantly faster in some cases than an index
range scan when a small portion of the data from the table is needed.
* Direct I/O and Asynch I/O, which seem to be frequently recommended
to improve performance, do not always improve performance, and may in
fact drastically affect performance.
* 64 bit Windows faired reasonably well with Linux when Direct I/O and
Asynch I/O were enabled in Linux, when the same hardware is used for
both platforms.
* Linux 11.2.0.1 seemed to be a bit slower than 11.1.0.6 with Direct I/
O and Asynch I/O enabled, but of course 11.2.0.1 does not suffer as
badly from adaptive cursor sharing problems as 11.1.0.6.
* Oracle used direct path reads rather than db file scattered reads
during the tablescan.
* Others?

So, should the OPTIMIZER_INDEX_COST_ADJ parameter be set to the lower

number to (quoting from a posting on the Internet) “immediately tune
all of the SQL in your database to favor index scans over full-table

scans”? :-)

Please don't spend too much time considering how to fix this test case
(it was intended as a simple set up, which evolved a bit when 11.2.0.1
was released and I obtained *amazing* performance improvements. There
were other tests too, but I will save those for later.

Aman, thanks for the compliment.

John Hurley

unread,

Sep 5, 2009, 6:23:47 PM9/5/09

to

On Sep 4, 9:04 pm, Charles Hooper <hooperc2...@yahoo.com> wrote:

snip

> I am working on an Oracle performance related project, so I thought
> that I would compare the performance of Oracle database 11.1.0.6 (on
> 64 bit Linux), 11.1.0.7 (on 64 bit Windows), and 11.2.0.1 (on 64 bit
> Linux). I set up a test case with a table containing 100,000,000 rows
> having an average row length of 53 bytes. A normal B*tree index
> exists on a numeric column which contains numbers ranging from 0 to
> 10,000. A test query will be selecting 2,547,158 (2.55%) of the rows
> from the table. Leaving OPTIMIZER_INDEX_COST_ADJ at the default of
> 100 with NOWORKLOAD system statistics results in a full table in all
> three releases of Oracle. Setting OPTIMIZER_INDEX_COST_ADJ to 5, for
> instance, results in an index access path. So, the question is:
> should the OPTIMIZER_INDEX_COST_ADJ parameter be set to the lower
> number to (quoting from a posting on the Internet) “immediately tune
> all of the SQL in your database to favor index scans over full-table
> scans”? I am certain that I know the answer based on extensive
> testing, but what are your thoughts?

Hey Charles I lost a little of what your investigations found when
trying to digest your posting here ( it has been a long week so
probably my concentration is not up to task ).

It seems like at least several of ( possibly many ) oracle experts
have de-emphasized making manual adjustments to some of the
OPTIMIZER_* parameters now that system statistics are available.
( Not quite sure where Tom Kyte stands here or Jonathan Lewis for that
matter ). Any ideas on what makes the most sense based on what you
have discovered so far?

Is your test case a general argument that people should still be
making manual adjustments to some of these settings even if running
11.2?

Is it a specific example that based on specific conditions shows 11.2
is better or worse than other releases when making certain manaul
adjustments?

Is there a difference shown between NOWORKLOAD system statistics and
"real workload" statistics under 11.2 for your test case?

Charles Hooper

unread,

Sep 5, 2009, 10:19:44 PM9/5/09

to

On Sep 5, 6:23 pm, John Hurley <johnbhur...@sbcglobal.net> wrote:
> Hey Charles I lost a little of what your investigations found when
> trying to digest your posting here ( it has been a long week so
> probably my concentration is not up to task ).
>
> It seems like at least several of ( possibly many ) oracle experts
> have de-emphasized making manual adjustments to some of the
> OPTIMIZER_* parameters now that system statistics are available.
> ( Not quite sure where Tom Kyte stands here or Jonathan Lewis for that
> matter ). Any ideas on what makes the most sense based on what you
> have discovered so far?
>
> Is your test case a general argument that people should still be
> making manual adjustments to some of these settings even if running
> 11.2?
>
> Is it a specific example that based on specific conditions shows 11.2
> is better or worse than other releases when making certain manaul
> adjustments?
>
> Is there a difference shown between NOWORKLOAD system statistics and
> "real workload" statistics under 11.2 for your test case?

I will have to defer the question to someone else.

Too many recommendations may be found through Internet searches and
book searches suggesting to blindly set OPTIMIZER_INDEX_COST_ADJ to a
low number such as 10, 5, or even 1 without giving any consideration
to what just might happen with certain types of data. A nice 30
second report might become a nice 80 minute report, for example.
(This test case surprised me too.)

I used NOWORKLOAD statistics in this case to improve the chances that
if someone else ran the test case, they would also see a full
tablescan when OPTIMIZER_INDEX_COST_ADJ was set to 100 and an index
range scan when the parameter was set to 5. Incidentally I also ran
the SQL statement on 11.1.0.7 and 10.2.0.4 on 64 bit Windows with a
WHERE clause of "ID BETWEEN 1 AND 10" which selected 0.06% of the
table (I think that I also ran this test on Linux with 11.1.0.6 and
11.2.0.1 but I have not reviewed those test runs). Oracle 10.2.0.4
which had CPU statistics collected still deciided to select a full
tablescan with OPTIMIZER_INDEX_COST_ADJ set to 100, while 11.1.0.7
with NOWORKLOAD statistics selected to perform an index range scan
regardless of the value of OPTIMIZER_INDEX_COST_ADJ. Oracle 10.2.0.4
had auto-tuned the MBRC value in SYS.AUX_STATS$ to 47 - changing it to
the NOWORKLOAD default of 8 resulted in an index range scan regardless
of the value of OPTIMIZER_INDEX_COST_ADJ with the modified WHERE
clause. One would think that when selecting 0.06% of a table, a full
table scan of a table containing more than 5GB of data would be a bad
idea. The results? 34.61 seconds (with 813,876 consistent gets,
813,234 physical block reads) for the full tablescan, while the plan
using the index access required 2 minutes and 40.16 seconds (with
64,425 consistent gets and 63,789 physical block reads). One of those
plans will severely hurt the BCHR... but that is a another topic for
the chapter.

In one of the production databases I had an issue where I attempted to
leave OPTIMIZER_INDEX_COST_ADJ set at 100 while relying on the
collected CPU statistics (OK, I actually set MBRC to 128, up from the
calculated value of 8 based on the an 1MB multi-block read size which
should be easily reached given the ASSM assigned extent sizes). With
this setup Oracle selected to perform full tablescans over index range
scans of fairly selective indexes, even when much of the table blocks
were cached in the KEEP pool (this is expected as all block reads are
calculated as physical block reads, unless that changed in 11.2).
While it might seem like a good idea, it resulted in excessive CPU
utilization and very long query execution times (about 0.2 seconds for
the index range scan and about 35 seconds for the plan using the full
tablescan if I remember correctly). I compromised a bit on the two
approaches after that.

I guess the point is that one could poke their eye out by blindly
pushing buttons without understanding the potential side effects.

I was, for about 30 minutes any way, excited about the performance
improvement offered by 11.2.0.1 - until I realized that I forgot to
set the parameter enabling direct I/O and async I/O when I started
examining the reason for the difference. I hope that does not happen
to any one else, otherwise there will be articles on the web stating
"Amazing guru ____ improves Oracle performance by a factor of 236
(4474.545172/18.980762) by upgrading from 11.1.0.7 to 11.2.0.1 after
tweaking a couple guru level mystery settings."

I guess the answer is test. I think that Jonathan would likely be a
better source than myself when discussing system statistics
strategies.

Robert Klemme

unread,

Sep 6, 2009, 11:14:20 AM9/6/09

to

On 05.09.2009 22:26, Charles Hooper wrote:

> * Direct I/O and Asynch I/O, which seem to be frequently recommended
> to improve performance, do not always improve performance, and may in
> fact drastically affect performance.

Wouldn't you have to increase SGA target when switching to direct IO for
a fair comparison? Jonathan pointed out that without direct IO you are
actually increasing disk buffer size (Oracle's SGA + OS level cache).
So, assuming a dedicated machine (so not much OS buffer needed for other
tasks), the direct IO test would have to run with the SGA increased so
you get a similar amount of memory used for caching blocks.

> So, should the OPTIMIZER_INDEX_COST_ADJ parameter be set to the lower
> number to (quoting from a posting on the Internet) “immediately tune
> all of the SQL in your database to favor index scans over full-table
> scans”? :-)

I am by far not as expert as Jonathan but I have a bad gut feeling about
such a "global optimization". It is not very targeted and I have seen
too many cases where an optimization which looked good on first sight
hat bad effects overall...

Charles, thank you for sharing this!

Robert Klemme

unread,

Sep 6, 2009, 11:18:24 AM9/6/09

to

Mladen, you make it sound as if OS level caching was smarter than
Oracle's own buffer cache strategies. In theory Oracle's code would
have better knowledge about access patterns so my expectation would be
that taking memory from OS buffer cache and giving it to the Oracle
instance would lead to better performance than using OS level caching.
Otherwise we should switch off Oracle's own buffering and use only OS
features.

What am I missing?

Charles Hooper

unread,

Sep 6, 2009, 11:55:46 AM9/6/09

to

On Sep 6, 11:14 am, Robert Klemme <shortcut...@googlemail.com> wrote:
> On 05.09.2009 22:26, Charles Hooper wrote:
>
> > * Direct I/O and Asynch I/O, which seem to be frequently recommended
> > to improve performance, do not always improve performance, and may in
> > fact drastically affect performance.
>
> Wouldn't you have to increase SGA target when switching to direct IO for
> a fair comparison? Jonathan pointed out that without direct IO you are
> actually increasing disk buffer size (Oracle's SGA + OS level cache).
> So, assuming a dedicated machine (so not much OS buffer needed for other
> tasks), the direct IO test would have to run with the SGA increased so
> you get a similar amount of memory used for caching blocks.

That is a good question. The machine in this setup had 12GB of
memory. The SGA_TARGET was set to 8GB, the PGA_AGGREGATE_TARGET was
set to 1.8GB, and the KEEP buffer pool was set to 6GB. Since the
actual table which the test table mimics will likely be infrequently
queried in full, such as a report which attempts to indicate the
change in the size measured for the left side of the cylinder wall
over the production lifetime of the part, it probably would not be a
good idea to optimize the instance and operating system performance
for this one query. Other data from other tables would likely be
occupying the KEEP buffer pool, which means that in production the
available RAM for caching of data blocks at the operating system level
might be quite limited. It might be interesting to test what happens
when the KEEP buffer pool is fully utilized - will Linux start
swapping other memory out to disk to buffer the Oracle blocks in the
file system cache? What happens when a 512MB (or larger) redo log
needs to be archived, will it hinder the effects of the operating
system level caching of Oracle blocks? It might come down to how
closely the test environment is able to mimic the production
environment.

> > So, should the OPTIMIZER_INDEX_COST_ADJ parameter be set to the lower
> > number to (quoting from a posting on the Internet) “immediately tune
> > all of the SQL in your database to favor index scans over full-table
> > scans”? :-)
>
> I am by far not as expert as Jonathan but I have a bad gut feeling about
> such a "global optimization". It is not very targeted and I have seen
> too many cases where an optimization which looked good on first sight
> hat bad effects overall...
>
> Charles, thank you for sharing this!
>
> Kind regards
>
> robert

This test case is a light-weight example from the book "Expert Oracle
Practices: Oracle Database Administration from the Oak Table". The
test case took a different twist while testing on 11.2.0.1.

Gerard H. Pille

unread,

Sep 9, 2009, 5:48:45 AM9/9/09

to

Hallo Charles,

I'd like to see your statistics for this select:

SELECT /*+ index (t1 ind_t1) */
ID,
DESCRIPTION
FROM
sys.T1 t1
WHERE
ID >= 9991
/

I suppose you were aware of the skew in the index, but how does this
account for the difference in performance?

On my old PC - with a new disk - the FTS takes 2 minutes, the IS 20
minutes, selecting id between 1 and 400 via index takes over 3 hours.

Kind regards,

Gerard

Charles Hooper

unread,

Sep 9, 2009, 8:12:12 PM9/9/09

to

Yes, there is a skew in the data - it should be more densely packed at
each end of the range.

Here is a slightly modified version of the script you requested I run
so that we are able to see the costs, and potentially some of the
effects of an unset filesystemio_options parameter by executing the
same query a second time:

set linesize 150
set pagesize 2000
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH SHARED_POOL;
SET ARRAYSIZE 100

spool /u01/app/oracle/results/filesystemio_options_unset.txt

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT /*+ index (t1 ind_t1) */
ID,
DESCRIPTION
FROM

T1 t1
WHERE
ID >= 9991;

SET ARRAYSIZE 100
SET AUTOTRACE TRACEONLY STATISTICS

SET TIMING ON

SELECT /*+ index (t1 ind_t1) */
ID,
DESCRIPTION
FROM
T1 t1
WHERE
ID >= 9991;

SELECT /*+ index (t1 ind_t1) */
ID,
DESCRIPTION
FROM

T1 t1
WHERE
ID >= 9991;

spool off

This is the output:
Execution Plan
----------------------------------------------------------
Plan hash value:
634656657

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)| Time
|
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 5273K| 99458
(1)| 00:19:54
|
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 100K| 5273K| 99458
(1)| 00:19:54
|
|* 2 | INDEX RANGE SCAN | IND_T1 | 100K| | 212
(1)| 00:00:03
|
--------------------------------------------------------------------------------------

Predicate Information (identified by operation
id):
---------------------------------------------------
2 - access
("ID">=9991)

2847287 rows selected.

Elapsed: 00:05:04.91

Statistics
----------------------------------------------------------
1 recursive
calls
0 db block
gets
2568572 consistent
gets
2540316 physical
reads
0 redo
size
173286364 bytes sent via SQL*Net to
client
313716 bytes received via SQL*Net from
client
28474 SQL*Net roundtrips to/from
client
0 sorts
(memory)
0 sorts
(disk)
2847287 rows
processed

2847287 rows selected.

Elapsed: 00:00:19.39

Statistics
----------------------------------------------------------
0 recursive
calls
0 db block
gets
2568572 consistent
gets
2536953 physical
reads
0 redo
size
173286364 bytes sent via SQL*Net to
client
313716 bytes received via SQL*Net from
client
28474 SQL*Net roundtrips to/from
client
0 sorts
(memory)
0 sorts
(disk)
2847287 rows
processed

The first execution completed in 5 minutes and almost 5 seconds. The
second execution completed in 19.39 seconds with a nearly identical
value for physical block reads.

Gerard H. Pille

unread,

Sep 10, 2009, 3:37:47 AM9/10/09

to

On 10 sep, 02:12, Charles Hooper <hooperc2...@yahoo.com> wrote:
>
> Yes, there is a skew in the data - it should be more densely packed at
> each end of the range.
>

Unless I made a mistake building the test data, that is not the case:
not at each end, but only at *the* end of the range. If I remember
right, 4K records for id 1, 1M for id 10000.

> The first execution completed in 5 minutes and almost 5 seconds. The
> second execution completed in 19.39 seconds with a nearly identical
> value for physical block reads.
>

So your physical reads are anything but (the second time round). On
my poor PC, 2Gb of memory, 500Mb SGA of which only 200Mb for data, the
physical reads mostly are physical indeed.

Why would my system have such trouble fetching 2.5M records for ids 1
to 400, and perform much better fetching 2.8M records for ids 9991 to
10000? The FTS, of course, doesn't care which ids it has to fetch.
But for the IS, it seems to matter a lot.

Kind regards,

Gerard

Mladen Gogala

unread,

Sep 10, 2009, 4:26:24 AM9/10/09

to

Na Wed, 09 Sep 2009 17:12:12 -0700, Charles Hooper napisao:

> The first execution completed in 5 minutes and almost 5 seconds. The
> second execution completed in 19.39 seconds with a nearly identical
> value for physical block reads.

Question: was direct I/O enabled here? If not, the blocks may have been
in the system buffer cache in which case oracle would still report them
as "physical reads" but blocks would be coming from memory instead.

--
http://mgogala.freehostia.com

Gerard H. Pille

unread,

Sep 10, 2009, 5:20:49 AM9/10/09

to

On 10 sep, 10:26, Mladen Gogala wrote:

>
> Question: was direct I/O enabled here? If not, the blocks may have been
> in the system buffer cache in which case oracle would still report them
> as "physical reads" but blocks would be coming from memory instead.
>

Why do you ask this? Charles clearly states that it was not. Will
you believe him next time??

Mladen Gogala

unread,

Sep 10, 2009, 7:20:48 AM9/10/09

to

Because this was the new execution and I lost track of the other
conditions. If direct I/O was disabled, Linux buffer cache would have
contained the cached blocks and that would explain the difference in
performance.

--
http://mgogala.freehostia.com

Charles Hooper

unread,

Sep 11, 2009, 8:11:01 AM9/11/09

to

No direct I/O was disabled, I left the FILESYSTEMIO_OPTIONS parameter
unset, which should have caused Oracle to use file system buffered I/O
if I understand the documentation correctly.

If this post makes it out to Usenet (the posts were failing yesterday
through Google's interface), I will post the results of 3 test runs of
the second script (suggested by Gerard) with direct I/O and with a
reboot between each test run. It is interesting to see Oracle aging
out the blocks read during the test run shortly after the test run
completes, even with no other load on the system and the test run
should not have encountered an AWR collection.

Also interesting is that on 11.2.0.1 with the SGA_TARGET set to 8000M,
the DB_KEEP_CACHE_SIZE set to 6G, and the table and its index using
the default buffer pool, I am seeing somewhat unexpected values for a
couple of the hidden parameters at the end of the run with the full
index access path (1 to 400) with direct I/O enabled and asychronous I/
O enabled. __DB_CACHE_SIZE was set to 536,870,912 (512MB).
__SHARED_POOL_SIZE was set to 1,140,850,688 (1,088MB). It would seem
that Oracle should have at the least set __DB_CACHE_SIZE to a value
larger than the __SHARED_POOL_SIZE value as it did on 10.2.0.4
(Windows 64 bit), 11.1.0.7 (Windows 64 bit), and 11.1.0.6 (Linux 64
bit). This might explain why the test run required slightly longer on
11.2.0.1.

Charles Hooper

unread,

Sep 11, 2009, 8:24:57 AM9/11/09

to

On Sep 11, 8:11 am, Charles Hooper <hooperc2...@yahoo.com> wrote:
> If this post makes it out to Usenet (the posts were failing yesterday
> through Google's interface), I will post the results of 3 test runs of
> the second script (suggested by Gerard) with direct I/O and with a
> reboot between each test run. It is interesting to see Oracle aging
> out the blocks read during the test run shortly after the test run
> completes, even with no other load on the system and the test run
> should not have encountered an AWR collection.

Output from the first test run:

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 5273K| 99458
(1)| 00:19:54 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 100K| 5273K| 99458
(1)| 00:19:54 |
|* 2 | INDEX RANGE SCAN | IND_T1 | 100K| | 212
(1)| 00:00:03 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation
id):
---------------------------------------------------

2 - access("ID">=9991)

2847287 rows selected.

Elapsed: 00:08:09.31

Statistics
----------------------------------------------------------
8 recursive calls

0 db block gets
2568572 consistent gets

2540350 physical reads

0 redo size
173286364 bytes sent via SQL*Net to client
313716 bytes received via SQL*Net from
client
28474 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2847287 rows processed

2847287 rows selected.

Elapsed: 00:08:10.87

Statistics
----------------------------------------------------------
8 recursive calls

0 db block gets
2568572 consistent gets
2536953 physical reads
0 redo size
173286364 bytes sent via SQL*Net to client
313716 bytes received via SQL*Net from
client
28474 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2847287 rows processed

Output from the second test run (after a reboot):
2847287 rows selected.

Elapsed: 00:08:01.98

Statistics
----------------------------------------------------------
3 recursive calls

0 db block gets
2568572 consistent gets

2540183 physical reads

0 redo size
173286364 bytes sent via SQL*Net to client
313716 bytes received via SQL*Net from
client
28474 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2847287 rows processed

2847287 rows selected.

Elapsed: 00:08:04.46

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2568572 consistent gets
2536953 physical reads
0 redo size
173286364 bytes sent via SQL*Net to client
313716 bytes received via SQL*Net from
client
28474 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2847287 rows processed

Output from the third test run (after a reboot):
2847287 rows selected.

Elapsed: 00:08:05.50

Statistics
----------------------------------------------------------
5 recursive calls

0 db block gets
2568572 consistent gets

2540176 physical reads

0 redo size
173286364 bytes sent via SQL*Net to client
313716 bytes received via SQL*Net from
client
28474 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2847287 rows processed

2847287 rows selected.

Elapsed: 00:07:55.58

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2568572 consistent gets
2536953 physical reads
0 redo size
173286364 bytes sent via SQL*Net to client
313716 bytes received via SQL*Net from
client
28474 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2847287 rows processed

After the first execution of the script I noticed that when the server
sat idle for a couple minutes the number of blocks remaining in the
buffer cache for the table and index continued to decrease. For the
final execution of the script I monitored the number of blocks
remaining in the buffer cache:

Immediately after the two SQL statements ended (481.73 MB):
OBJECT_NAME STATUS BLOCKS
------------- ---------- ----------
T1 xcur 61507
IND_T1 xcur 155

Two minutes after the the two SQL statements ended (481.51 MB)
OBJECT_NAME STATUS BLOCKS
------------- ---------- ----------
T1 xcur 61479
IND_T1 xcur 155

Four minutes after the the two SQL statements ended (481.51 MB)
OBJECT_NAME STATUS BLOCKS
------------- ---------- ----------
T1 xcur 61417
IND_T1 xcur 154

Six minutes after the the two SQL statements ended (481.02 MB)
OBJECT_NAME STATUS BLOCKS
------------- ---------- ----------
T1 xcur 60947
IND_T1 xcur 153

----
After rebooting the server again, I executed just the index range scan
portion of the original script (BETWEEN 1 AND 400) so that I could
answer Jonathan's question about the number of blocks remaining in the
buffer cache at the end of the run:

Execution Plan
----------------------------------------------------------
Plan hash value: 634656657

| 0 | SELECT STATEMENT | | 4000K| 206M| 198K
(1)| 00:39:47 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 4000K| 206M| 198K
(1)| 00:39:47 |
|* 2 | INDEX RANGE SCAN | IND_T1 | 4000K| | 420
(1)| 00:00:06 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation
id):
---------------------------------------------------

2 - access("ID">=1 AND "ID"<=400)

System altered.

Elapsed: 00:00:00.04

System altered.

Elapsed: 00:00:00.02

Session altered.

Elapsed: 00:00:00.00

Session altered.

Elapsed: 00:00:00.04

Session altered.

Elapsed: 00:00:00.00

2547158 rows selected.

Elapsed: 01:16:03.82

Statistics
----------------------------------------------------------
6 recursive calls
0 db block gets
2573633 consistent gets
2463148 physical reads
0 redo size
17418653 bytes sent via SQL*Net to client
280705 bytes received via SQL*Net from
client
25473 SQL*Net roundtrips to/from client

0 sorts (memory)
0 sorts (disk)

2547158 rows processed

Immediately after the two SQL statements ended (488.01 MB)
OBJECT_NAME STATUS BLOCKS
------------- ---------- ----------
T1 xcur 62400
IND_T1 xcur 65

Two minutes after the the two SQL statements ended
OBJECT_NAME STATUS BLOCKS
------------- ---------- ----------
T1 xcur 62255
IND_T1 xcur 65

Four minutes after the the two SQL statements ended
OBJECT_NAME STATUS BLOCKS
------------- ---------- ----------
T1 xcur 62236
IND_T1 xcur 65

Six minutes after the the two SQL statements ended
OBJECT_NAME STATUS BLOCKS
------------- ---------- ----------
T1 xcur 62196
IND_T1 xcur 65