Planning Extents
It is often said that there is another dimension to space, which is
time. For DBAs that is the time that they spend managing their
database space. That time is often considerable, and matched only by
the strength of their opinions on the matter. The options are many,
and the differences between versions are profound, making this a
difficult topic to discuss. Nevertheless, the issue must be addressed
prior to database creation, lest you end up with a mess that is going
to waste a lot of time for both the DBA and the database.
Does extent size matter?
Yes, extent size does matter, but not greatly. Nevertheless, all
extents should be a multiple of the multiblock read size. Otherwise,
when a full table or index scan is performed, an extra multiblock read
will be required to read the last few blocks of each extent, except
probably the last one. This is because multiblock reads never span
extent boundaries, even if the extents happen to be contiguous, and
even in a locally managed tablespace.
Consider for example the table T1 illustrated below. It is comprised
of 8 extents of 50 blocks each. The first block is the segment header,
there are 389 data blocks in use, and there are 10 free blocks above
the high water mark. With a multiblock read count of 16 blocks, and
assuming none of the blocks are already in the cache, a full table
scan on this table will require 4 data block reads per extent, except
the last - a total of 31 multiblock reads.
If the table is rebuilt as T2 with an extent size that is an exact
multiple of the multiblock read size, then the number of multiblock
reads required to scan the table is minimized. This is illustrated in
the figure below. The table is now comprised of 5 extents of 80 blocks
each. So, a full table scan now requires 5 multiblock reads per
extent, or a total of 25 multiblock reads.
Please note that it was not the reduction in the number of extents as
such that made the difference. There would be no further saving in
rebuilding the table with a single extent of 400 blocks. The number of
multiblock reads required to scan the table would still be 25.
Does the number of extents matter?
It is necessary to dwell on this briefly because some DBAs are
reluctant to allow more than a few extents in any segment in the
mistaken belief that such "fragmentation" degrades performance. This
myth was started by the Oracle documentation for the export utility in
version 3, and is still abroad. Within reason, the performance impact
of multiple extents is almost insignificant if they are sized
correctly. Indeed, some senior Oracle developers have stated that
extents can be allowed to proliferate, even to thousands of extents
per segment "without a significant impact on performance". However,
that statement needs to be carefully qualified, and even then it is
too generous.
Dictionary Managed Tablespaces
In dictionary managed tablespaces, every extent is represented by a
row in the data dictionary. Free extents are recorded in the FET$
table, which is clustered in the C_TS# cluster. Because it is created
without a SIZE clause, one block is reserved in the cluster for each
tablespace. However, if a tablespace ever has more free extents than
can be accommodated in a single cluster block, then cluster block
chaining will be necessary. This chaining cannot be repaired without
recreating the database, and can have a significant impact on data
dictionary performance and space management transactions in
particular. Ideally, the number of free extents in a tablespace should
never be greater than can be recorded in the primary cluster block for
that tablespace, which is about 500 free extents for a database with
an 8K database block size.
Used extents are recorded in the data dictionary in the UET$ table,
which is clustered in the C_FILE#_BLOCK# cluster. This data dictionary
cluster is sized on the assumption that segments will have an average
of just 4 or 5 extents each. Unless your data dictionary was
specifically customized prior to database creation to allow for more
used extents per segment, then creating segments with thousands of
extents will cause excessive cluster block chaining in this cluster.
Excessive numbers of used and/or free extents can also distort the
operation of the dictionary cache LRU mechanism. So extents should not
be allowed to proliferate, but it is because of the performance impact
on the data dictionary and dictionary cache, rather than any supposed
impact of full table scan performance.
The performance impact of a proliferation of extents is most obvious
when attempting to drop a segment with a large number of extents.
There are anecdotes of such operations taking days, or taking so long
that is was quicker to rebuild the database.
Locally Managed Tablespaces
Extents in locally managed tablespaces are not represented in the data
dictionary at all. Free space is represented in a bitmap that is
stored in the tablespace itself, and used extents are only recorded in
the extent map in the segment header block of each segment, and if
necessary, in additional extent map blocks within the segments. This
is a great blessing in every respect but one - namely that this
information is not cached in the dictionary cache. It must be obtained
from the database blocks every time that it is required, and if those
blocks are not in the buffer cache, that involves I/O and potentially
lots of it. For example, a query against the DBA_EXTENTS view could
well have to read every segment header and every additional extent map
block in the entire database. For this reason, we recommend that the
number of extents per segment in locally managed tablespaces be
limited to the number of rows that can be accommodated in the extent
map within the segment header block - that is, approximately
(db_block_size / 16) - 7.
There is one case in which data dictionary updates are required for
space management operations in locally managed tablespaces. Namely, if
tablespace quotas are used. The total amount of space used against
each quota is maintained in the TSQ$ table, which is clustered in the
C_USER# cluster. However, updates to this table do not require the ST
enqueue, and so there is no serialization risk at this point.
Should all extents be the same size?
For dictionary managed tablespaces, it is strongly recommended that
all extents in each tablespace should be a uniform size, or at least a
uniform set of sizes, to prevent or control fragmentation. If all
extents are the same size, then there will be no unusable free space
fragments (except possibly one, right at the end of each datafile).
Hence it will never be necessary for Oracle to attempt to coalesce
free space fragments when searching for a free extent. This greatly
simplifies space management planning, and also optimizes the recursive
transactions used for extent space management.
The performance of these space management transactions for dictionary
managed tablespaces is critical, because they are single threaded on
the ST (Space Transaction) enqueue. This is necessary to ensure that
the changes made by space transactions to the data dictionary are
atomic. So whenever a segment in a dictionary managed tablespace is
created or extended or shrunk or freed, the ST enqueue must be held
for the duration of that transaction, to the exclusion of all other
such space management transactions. Contention for the ST enqueue can
be a problem, particularly in parallel segment creation operations. To
reduce the risk of contention for the ST enqueue, space management
transactions should be both minimized and optimized, and adopting a
fixed extent size policy is an important part of the strategy.
It is interesting to note that it is possible for a space transaction
against a dictionary managed tablespace to require a recursive space
transaction. For example, to coalesce free extents the free extent
rows for the tablespace must be sorted. This sort may require the
creation of a temporary segment, which involves a space transaction.
That space transaction may in turn trigger its rollback segment to
extend, which involves another space transaction. Recursive space
transactions do not attempt to coalesce free space. They just fail if
a big enough free extent is not immediately available.
Oracle 7.3
In version 7.3 (not that anyone should be creating 7.3 databases
anymore), a fixed extent size policy is best implemented by specifying
a DEFAULT STORAGE clause for each tablespace, with equal INITIAL and
NEXT extent size and zero PCTINCREASE, and then allowing the storage
parameters of the segments to be taken from their tablespaces.
However, there are a few traps to be aware of.
Unless your extents are less than 5 blocks, Oracle rounds the extent
size up to a multiple of 5 blocks. For example, if your multiblock
read count is 16 blocks and you attempt to allocate extents of 48
blocks (3 multiblock reads), then you will actually get extents of 50
blocks as in the example at the beginning of this tip. Extents may be
enlarged further if their allocation would otherwise leave a free
space fragment of less than 5 blocks. Oracle will only fail to round
an extent size up to a multiple of 5 blocks, if it cannot allocate an
extent of the enlarged size. It is unfortunate that Oracle chose 5 as
the multiple for rounding up extent sizes, rather than a binary power
such as 4 or 8. As a result, multiblock reads will not align to extent
boundaries unless all extents are a multiple of 5 blocks as well as
being a multiple of the multiblock read size.
Another feature of Oracle that can undermine a fixed extent size
policy is extent trimming. When Oracle7 creates a table or index in
parallel, each of the final set of parallel query slaves writes to its
own temporary segment. The temporary segments are then merged into a
single data segment, and the unused part of the final extent is
trimmed off and returned to the free extent table. Under a fixed
extent size policy, the small free extents that are trimmed off are
unusable. Extent trimming can be disabled by setting event 10901.
event = "10901 trace name context forever"
If you have adopted a fixed extent size policy, and have no
tablespaces with a non-zero default PCTINCREASE value, you can also
save SMON the trouble of checking for free extents to coalesce every 5
minutes when it wakes up to perform its timeout actions. This further
reduces the risk of contention for the ST enqueue. The event setting
to disable free extent coalescing by SMON is 10269.
event = "10269 trace name context forever"
Even if you have not adopted a fixed extent size policy, this last
event should be set in the parameter file of all but one instance of a
parallel server database to prevent contention between the instances
for space management resources. Similarly, event 10061 should be set
in all but one instance of a parallel server database, to ensure that
only that instance will check for stray temporary segments to clean
up. By default, SMON performs this check once every 2 hours and 5
minutes.
Please note that if multiple event settings are included in the
parameter file, as contemplated here, then no other parameter settings
must be placed between them. Otherwise, only the last set of
consecutive event settings will be effective.
Another event setting that is sometimes used in connection with a
fixed extent size policy is event 10268. This event prevents forward
coalescing when freeing extents. Forward coalescing means that
whenever an extent is being freed, Oracle checks to see whether there
is already a free extent immediately following the extent about to be
freed. If so, it deletes that row of the free extent table and inserts
a new free extent row for the combined free space. When freeing
multiple extents, Oracle frees them in reverse order to increase the
chances of being able to perform forward coalescing.
Forward coalescing is not necessary under a fixed extent size policy,
because the extents being freed are exactly the size that will be
required when they are later reused. However, forward coalescing also
reduces the risk of cluster block chaining in C_TS# and should
therefore not normally be disabled.
Oracle 8.0
One of the most under-valued features of Oracle8 is ability to specify
a MINIMUM EXTENT size for each tablespace. This feature makes adopting
a fixed extent size policy much easier. If you specify a minimum
extent size that is a multiple of the multiblock read size, Oracle
will not allocate extents of less than that size and will round all
extent sizes up to a multiple thereof. This prevents unwanted rounding
to a multiple of 5 blocks, prevents unwanted extent trimming, and even
lets you get away with values of PCTINCREASE other than 0 and 100
without introducing severe fragmentation problems. We strongly
recommend the use of this feature for all Oracle 8.0 tablespaces.
Oracle 8.1
Adopting a fixed extent size policy is even easier still under version
8.1, because locally managed tablespaces support it explicitly.
Locally managed tablespaces can also be created with the AUTOALLOCATE
policy. However, this is less efficient in terms of both space and
algorithmic complexity than the UNIFORM SIZE policy. We strongly
recommend the use of the locally managed uniform size policy for all
Oracle 8.1 tablespaces, except SYSTEM.
Our only reservation in recommending a fixed extent size policy is
that we have often seen it implemented badly. Firstly, a fixed extent
size policy is no excuse for a proliferation of extents. Secondly, the
fact that two segments have the same extent size requirements, does
not necessarily mean that they should be placed in the same tablespace
and reside on the same disks. Segments that have incompatible I/O
patterns should reside on separate disks. If that means having
multiple tablespaces with the same fixed extent size, so be it.
Planning Datafiles
Disk load balancing is a matter relocating datafiles on disks in order
to minimize waiting time in the I/O subsystem. To facilitate disk load
balancing, it strongly recommended that all ordinary datafiles be
exactly the same size, and that a uniform stripe breadth be adopted.
Why uniformly sized datafiles?
In disk load balancing, datafiles are moved around to spread the I/O
load evenly across the available disks. If the datafiles are
arbitrarily sized, you will be severely limited in how you can
rearrange them, but if they are uniformly sized, you will have maximum
flexibility.
The datafile size should be chosen in proportion to the total size of
the database with an allowance for database growth. If the datafiles
are too large, then the free space will be unnecessarily fragmented
between tablespaces. However, a proliferation of small datafiles
should also be avoided, because most datafile headers must be updated
for most checkpoints, and each Oracle process' memory usage is in part
proportional to the number of datafiles that it has open. There the
number of datafiles also impacts the rate at which DBWn attempts to
write, and thus a proliferation of small datafiles increases the risk
of write complete waits.
When choosing a datafile size for a database that will be created on
raw logical volumes, remember that allowance needs to be made for
logical volume control block, if any, and a single datafile header
block. That is, the SIZE specified in the filespec clause of the
CREATE TABLESPACE command must be at least that much smaller than the
logical volume in which it is being created. 128K is an adequate
allowance in all cases.
Also, remember that locally managed tablespaces have a 64K bitmap
after the datafile header block. This means that the SIZE of a locally
managed tablespace with a uniform extent allocation policy should be a
multiple of the extent size plus 64K, otherwise the final "extent"
will be too small to be used.
Why a uniform stripe breadth?
These uniformly sized datafiles should be divided into two data
protection classes, namely RAID-5 and striping with mirroring. Within
each of these classes, all datafiles should be uniformly striped. You
should adopt a modest stripe breadth that is well suited to your disk
technology, and stick to it religiously. Your logical volume naming
conventions should encode both the data protection class and disk
region, and possibly the stripe breadth as well.
Using a uniform stripe breadth within each data protection class gives
you maximum flexibility in disk load balancing. And using a modest
stripe breath enables you to maintain the required I/O separation.
It is sometimes objected that some database segments require higher
concurrency and thus broader striping than the moderate stripe breadth
proposed above for all datafiles. In most cases, this can be catered
for by ensuring that these segments have extents in a number of such
datafiles residing on different sets of disks. If however a single
extent will contain a hot spot that requires higher concurrency, then
separate provision should be made for that tablespace only.
What naming convention?
We recommended above that the naming convention for logical volumes
should encode both the data protection class and disk region, and
possibly the stripe breadth as well. However, the naming convention
for the file systems or symbolic links to which they map should encode
the database name, tablespace name and tablespace relative datafile
number instead. This is intended to reduce the risk of administrative
error during backup and restore operations. Note that it is assumed
here that symbolic links will be used to refer to raw logical volumes.
This assists both in datafile identification and in disk load
balancing.
The naming conventions adopted should not be overly verbose,
particularly those used for directory names. Long pathname components
prevent all subordinate pathnames from being cached in the DNLC (name
cache), which is used by the operating system for pathname to inode
translations. For many operating systems, a long pathname component is
defined as anything longer than 14 bytes. Further, because directories
are searched linearly, frequently opened files (such as
subdirectories) should be created first in their directories. It is
also better have a deeper directory structure with a low branching
factor, rather than a shallower directory structure with many files in
each directory. However, the absolute pathname to all datafiles (and
raw logical volumes) should be limited to 59 bytes if possible.
Also, beware of allowing large numbers of files to accumulate in the
archive, audit and dump destination directories. If this does occur,
the directory concerned should be entirely removed and then
recreated.
If you have uniformly sized datafiles with clearly differentiated I/O
characteristics, and a moderate number of tablespaces with well-
differentiated I/O requirements, then you will then be ideally
equipped to perform disk load balancing.
· Because all datafiles are the same size, any set of datafiles can be
exchanged as required.
· Because the tablespaces are well differentiated, the V$FILESTAT
statistics can be relied upon to characterize the I/O.
o The ratio of blocks read to read operations can be used to derive an
indicator of the degree of sequential reading as opposed to random
reading.
o The ratio of write operations to read operations shows whether the I/
O is at all write-intensive.
o And, most importantly, the total number of read and write operations
serves as an indicator of I/O load.
· Finally, because the characteristics of all logical volumes are well
known, it is a relatively simple matter to balance disk workload
almost perfectly, as long as you have enough distinct disks.
Planning Tablespaces
Grouping segments into tablespaces is one of the most complex tasks in
a DBA's job. There are not normally segment level I/O statistics
available, so a detailed understanding of the physical data model and
its usage characteristics is required. There may also be numerous
competing factors to weigh together.
What are the principles of I/O separation?
Firstly, segments should not be placed together in a tablespace if
they have incompatible I/O patterns. Otherwise your ability to
optimize the I/O for that tablespace, as outlined in our series of
disk configuration tips, will be compromised. There are two particular
access patterns that should be catered for.
Sequential I/O
Segments that exhibit a predominantly sequential I/O pattern, such as
single threaded full table scans or fast full index scans, should be
separated into their own tablespaces. If possible, the datafiles for
these tablespaces should be placed on disks in such a way as to
maximize the probability that there will be no concurrent access to
those disks when these segments are being scanned. Service times can
be dramatically improved if such I/O separation can be achieved,
because very little seek time will be required, and many disk reads
will be satisfied immediately from the track buffers.
Write Intensive I/O
If RAID-5 data storage is a consideration, segments that are written
to intensively, such as by parallel direct loads or bulk updates,
should be separated into distinct tablespaces. If possible, the
datafiles for these tablespaces should be placed on faster mirrored
disk rather than on RAID-5.
There is also a case for separating read-only segments so that their
tablespaces can be made read-only. This does not affect I/O
performance directly, but can have a big impact on backup times.
Secondly, segments should not be placed together if they might need to
be separated during disk load balancing. There are two main reasons
why segments might need to be separated.
Activity
Two segments with heavy I/O requirements should not be placed in the
same tablespace. Otherwise it will not normally be possible to spread
the load by moving datafiles. Even if the tablespace has multiple
datafiles, Oracle will attempt to spread the load evenly between them
by cycling the extents for each segment around the datafiles. While it
is possible to move segments between datafiles, the lack of segment
level I/O statistics makes it difficult to know which segments to move
and where to move them.
Relatedness
In general, it is safest to separate related segments into separate
tablespaces. However, not all query execution plans that access two
related segments cause repeated alternating access between them. For
example, two related tables may reside on the same disks if they are
only ever joined using a sort-merge join or hash join, because these
query execution plans cause the tables to be accessed in series. The
same applies to index merges. However, a nested loops join normally
involves repeated access to first the outer table, then the index on
the inner table, and then the inner table itself. If any 2, or all 3,
of these segments reside on the same disks, head thrashing will occur
and will result in poor I/O service times.
Similarly, an index range scan normally involves repeated alternating
access to the index and the table, and partition parallel execution
plans normally involve concurrent access to multiple partitions of the
same index. These execution plans can also cause disk head thrashing
if related segments have not been duly separated. Because of these
factors, it is safest to always separate related segments into
distinct tablespaces, unless you are confident that the optimizer will
always choose an execution plan that does not involve repeated
alternating access between the segments.
These are not the only factors that should be considered when grouping
segments into tablespaces. If a fixed extent size policy is in place,
then extent size requirements must be considered. Another particularly
important consideration is partitioning for parallel server databases.
Segments that are predominantly accessed from one instance should be
separated from segments that are predominantly accessed from another.
One final consideration is statistics. If major application segments
are each placed into their own tablespace, the V$FILESTAT statistics
can be guaranteed to correspond exactly with the I/O against those
segments. If so, it is possible to monitor trends in the I/O to these
tables and indexes and thus determine when they should be rebuilt.
These statistics can also assist with disk load balancing.
Regards,
Mohammed sheriff.s
+919894022554