I'm in charge of a datawarehouse which we update every weekend. That
means loading lots of tables with
TRUNCATE
INSERT /*+ APPEND PARALLEL */ ... SELECT ... FROM ...
COMMIT
for every table. The volume of the total load is about half to one TB.
Afterwards, when the tables are queried, every now and again an
ORA-01555 happens, sometimes even after days. In my opinion this can
only be because of delayed block cleanout, because loading and then
querying a table is *always* done serially.
If I would load the tables with
DROP TABLE
CREATE TABLE ... AS SELECT ... FROM ...
would the blocks be written pre-cleaned out, thus preventing the
ORA-01555 from happening?
Regards,
Jaap.
Oops, forgot the version: 9.2.0.8 on VMS.
how long does the truncate process take?
How long does the "insert into.. " processing take?
if it takes more than the weekend, then you need to do this
differently.
Have you tried to increase your undo_retention?
Have you tried to add more rollback segments?
ORA-01555 can only be caused by insufficient rollback segments. Make
your transactions MUCH smaller.
A good explanation can be found at: <http://asktom.oracle.com/pls/
asktom/f?p=100:11:0::::p11_question_id:1441804355350>
I am not sure I would agree with truncating a"warehouse" on a weekly
basis.?.? From your description, it sounds like the base tables are
also in the same database??? Why? And how does making a copy in these
"warehouse" tables help your overall system performance?
You might consider breaking it into smaller pieces.
insert into ... select * from oldtab where somedatefield between date0
and date1;
commit;
insert into ... select * from oldtab where somedatefield between date1
and date2;
commit;
insert into ... select * from oldtab where somedatefield between date2
and date3;
commit;
insert into ... select * from oldtab where somedatefield between date3
and date4;
commit;
etc...
and start enough of them at a time in parallel where you get the
throughput but don't saturate your I/O or CPU bandwidth - and make
sure your indexing satisfies the "where" clause.
Or, if you are really good at DCL try using exp/imp
NOT TESTED:
exp test1/test1 file = SYS$OUTPUT log = exp_XXX.log tables = XXX
feedback = 1000000 buffer = 40960000 grants = n constraints = n
indexes = n
compress = n direct = y | imp test2/test2 file=SYS$PIPE tables= XXX
rows=y ignore=y
With 10g and above, using impdp you can do this with NETWORK_LINK
without doing the expdp first.
Hi Jaap,
The next time you see this, check for "transaction table consistent
reads - undo records applied" and see if that is increasing during the
failed query.
Yes, I think CTAS is the only way of creating "clean" blocks since the
COMMIT SCN is already known as part of the DDL operation, all other
DML operations will generate blocks that need to be cleaned out
afterwards.
If you want to stick to the INSERT approach - what do you do with the
tables after loading - is there any gather statistics job running (I
hope so...).
If yes, and you can afford it, try to gather the statistics using
DBMS_STATS serially (I don't know if ANALYZE does a block cleanout, I
guess it does/has to but I haven't check that recently) which might
pose a problem in terms of runtime with that data volume. If you
gather them in parallel, the direct path read performed by the
parallel slaves can *not* do the block clean out persistently,
although it happens during the processing of the data in the PGA of
the parallel slave, the modified block won't be written back to disk.
If you run the DBMS_STATS serially the SELECT performed will do the
delayed block cleanout.
If gathering statistics serially is too expensive then executing a
serial dummy "SELECT ... FROM TAB" after the load is another idea to
force the delayed block cleanout.
Note all this also applies to the new adaptive serial direct path
reads introduced in 11.1 - since you're still on 9.2 it doesn't matter
but is interesting to know that in case the serial direct read is used
a delayed block cleanout is not written back to disk and has to be
repeated by subsequent queries.
See e.g. http://antognini.ch/2009/07/impact-of-direct-reads-on-delayed-block-cleanouts/
Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
Co-author of the "OakTable Expert Oracle Practices" book:
http://www.apress.com/book/view/1430226684
http://www.amazon.com/Expert-Oracle-Practices-Database-Administration/dp/1430226684
The dbms_stats will clean the block after an insert (conventional).
I would add two things to what has been said by Randolf
A table created with insert / * + append * /, has the blocks of the
table
with the "ITL clean" but if there are indexes, the indexes will have
the "dirty ITL, will be
therefore necessary (delayed) block cleanout
select / * + index (.....) * / from table;
because the queries that follow,
if they have a where, probably, will access to the indexes.
Instead, for "Direct Reads"mentioned Randolf,
that happen (probably when it exceeds the threshold
of _small_table_threshold), it is worth adding that,
when Oracle decides to use them, can not do cleanout, the blocks
will be in the PGA, and not SGA. It will be necessary in fact, that
DBWR writes those blocks
on disk, so that the following reading may make the cleanout.
Regards
Donatello Settembrino
That's a good point about the indexes that I forgot to mention,
although I assumed in this case due to the data volume mentioned the
load is actually done without indexes enabled / created.
Note however that there is a subtle point about the blocks generated
by the direct-path insert: Although the lock bytes on the rows are not
set and do not need to be cleaned out, the commit SCN of the ITL entry
is not known at insert time therefore the database has to look up the
commit SCN (obviously by cleaning out a single block) when
encountering such blocks and therefore I think you can still get a
ORA-01555 error even with table blocks loaded via direct-path insert
if the transaction table slot in the undo segment header has been
overridden in the meantime and Oracle can not determine the commit SCN
otherwise.
Hi Randolf,
I do not think that this happens
settembrino@ORA11>select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit
Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for IBM/AIX RISC System/6000: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
settembrino@ORA11>select name , value from v$parameter where name =
'db_block_size';
NAME VALUE
---------------------------- -----------
db_block_size 16384
I create a table, just a row for block
settembrino@ORA11>create table t (x varchar2(4000), y varchar2(4000),
z varchar2(4000), w varchar2(4000));
Table created.
I display the current SCN of the moment
settembrino@ORA11>SELECT to_char(CURRENT_SCN) current_scn FROM V
$DATABASE;
CURRENT_SCN
----------------------------------------
39099538717
settembrino@ORA11>insert /*+ append */ into t
2 select lpad('x', 4000, 'x'), lpad('y', 4000, 'y'), lpad('z',
4000, 'z'), lpad('w', 4000, 'w')
3 from dual
4 connect by level <= 100;
100 rows created.
The following select, confirm the use of direct-path insert
settembrino@ORA11>select * from t;
select * from t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel
I view the scn, before committing
settembrino@ORA11>SELECT to_char(CURRENT_SCN) FROM V$DATABASE;
CURRENT_SCN
----------------------------------------
39099538765
settembrino@ORA11>commit;
Commit complete.
displays the SCN, after the commit
settembrino@ORA11>SELECT to_char(CURRENT_SCN) FROM V$DATABASE;
CURRENT_SCN
----------------------------------------
39099538768
if you look at the SCN of rows in the table (note that this is the
first query on the table)
settembrino@ORA11>set autotrace on;
settembrino@ORA11>select to_char(min(ora_rowscn)),
to_char(max(ora_rowscn)) from t ;
TO_CHAR(MIN(ORA_ROWSCN)) TO_CHAR(MAX(ORA_ROWSCN))
----------------------------------------
----------------------------------------
39099538766 39099538766
..
..
..
Statistics
----------------------------------------------------------
9 recursive calls
1 db block gets
171 consistent gets
100 physical reads
168 redo size
295 bytes sent via SQL*Net to client
251 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SCN in the table is a number > of that before committing (39099538765)
and <= of that immediately after the commit.
Which means that Oracle is successful attributing the SCN blocks of
the table, and without running (delayed)
block cleanout (only 168 bytes of redo.)
If I create a table as the previous test but with 1000 rows instead of
100 rows, I get the same number of bytes of redo
settembrino@ORA11>insert /*+ append */ into t
2 select lpad('x', 4000, 'x'), lpad('y', 4000, 'y'), lpad('z',
4000, 'z'), lpad('w', 4000, 'w')
3 from dual
4 connect by level <= 1000;
. .
. .
. .
Statistics
----------------------------------------------------------
9 recursive calls
1 db block gets
1073 consistent gets
1000 physical reads
168 redo size
295 bytes sent via SQL*Net to client
250 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
as I said, always 168 bytes of redo but this time I have 1000 physical
reads
The insert in direct-path bypass the buffer cache and writes above HWM
It is different for an insert in conventional mode
settembrino@ORA11>drop table t purge;
Table dropped.
settembrino@ORA11>create table t (x varchar2(4000), y varchar2(4000),
z varchar2(4000), w varchar2(4000));
Table created.
settembrino@ORA11>insert into t
2 select lpad('x', 4000, 'x'), lpad('y', 4000, 'y'), lpad('z',
4000, 'z'), lpad('w', 4000, 'w')
3 from dual
4 connect by level <= 100;
100 rows created.
before reading, flush the buffer cache
settembrino@ORA11>alter system flush buffer_cache;
System altered.
settembrino@ORA11>commit;
Commit complete.
The following select will have to delayed block cleanout, because now
the blocks
are on disk
settembrino@ORA11>set autotrace on;
settembrino@ORA11>select count(*) from t ;
COUNT(*)
----------
100
Execution Plan
----------------------------------------------------------
--------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 42 (0)|
| 1 | SORT AGGREGATE | | 1 | |
| 2 | TABLE ACCESS FULL| T | 113 | 42 (0)|
--------------------------------------------------------
Statistics
----------------------------------------------------------
10 recursive calls
1 db block gets
304 consistent gets
127 physical reads
7376 redo size
224 bytes sent via SQL*Net to client
250 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
the amount of redo generated, should confirm that delayed block
cleanout occurs
Regards
Donatello Settembrino
I depends on what you exactly mean by "this" means. Note I didn't say
that delayed block cleanout will happen in the same fashion as for
conventional DML, but you simply can take some block dumps after the
direct-path insert and you'll see that there is no commit SCN
available from the ITL slots, so the block is not really "clean".
If you check the session statistics of your session running the query
after committing the direct-path insert you'll notice that the first
time the session accesses a block from the table it will perform a
block cleanout and generate redo for that single block, from then on
it will re-use the "cached" commit SCN for accessing / processing the
remaining blocks.
The corresponding statistics are:
cleanouts only - consistent read gets (increases by one after first
access in the session)
Commit SCN cached (increases depending on the number of blocks
accessed)
That is also the explanation why you don't see a difference in the
amount of redo generated if you use a larger table - it will still do
a cleanout of a single block to determine the commit SCN.
But you're right that my explanation above was not really sound
regarding direct-path inserts and cleanout since running a query
afterwards does not really clean out every block.
Regards,
Randolf
Hi guys,
What an in-depth discussion! Thank you all for increasing my
knowledge. As far as I know this stuff is not easily found, not even
at the Oracle support site. I make good use of it.
On thing is still puzzling me: to prevent an ORA-01555 due to delayed
block cleanout, in case of the rollback transaction slot being
overwritten, in Note 40689.1 on the Oracle Support site the following
is recommended:
alter session set optimizer_goal = rule;
select count(*) from table_name;
for the same reason for which Randolf Geist recommends gathering
statistics I guess, forcing a visit and cleaning out all the blocks.
But why won't this run into the same ORA-01555? Or is discovering that
the rollback transaction slot isn't there anymore somehow enough when
doing a count(*) or when gathering statistics?
Regards,
Jaap
Well, the basic idea is that you clean the blocks out while the
required information is still in the undo available, so it's all about
timing in principle.
But how about providing more details about your particular case:
1. Do you have indexes enabled during the load or not?
2. Are you sure that the insert is actually a direct-path insert?
There are various reasons why Oracle might silently fall back to
conventional insert without any notice.
You can check the latter by looking at the amount of undo generated
since a direct-path insert will not generate undo (provided that no
indexes are enabled during the load, otherwise undo will be generated
by the necessary index maintenance) or by simply attempting to query
the table after the insert but before the commit - any attempt to
access a table after a direct-path insert within the same transaction
will error out with "ORA-12838: cannot read/modify an object after
modifying it in parallel"
Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
Co-author of the "OakTable Expert Oracle Practices" book:
http://www.apress.com/book/view/1430226684
http://www.amazon.com/Expert-Oracle-Practices-Database-Administration/dp/1430226684
> If you check the session statistics of your session running the query
> after committing the direct-path insert you'll notice that the first
> time the session accesses a block from the table it will perform a
> block cleanout and generate redo for that single block, from then on
> it will re-use the "cached" commit SCN for accessing / processing the
> remaining blocks.
>
> The corresponding statistics are:
> cleanouts only - consistent read gets (increases by one after first
> access in the session)
> Commit SCN cached (increases depending on the number of blocks
> accessed)
>
> That is also the explanation why you don't see a difference in the
> amount of redo generated if you use a larger table - it will still do
> a cleanout of a single block to determine the commit SCN.
>
> But you're right that my explanation above was not really sound
> regarding direct-path inserts and cleanout since running a query
> afterwards does not really clean out every block.
Randolf,
I misunderstood what he meant, with my test I simply wanted to
demonstrate the
difference, the delayed block cleanout using inserts in direct path,
compared to a conventional insert.
I am aware of the existence of a mechanism of "SCN caching" but
honestly
I have helped to identify a" flaw " in the reasoning I used in my
tests and
for this, thank you a lot.
In summary, then ..
settembrino@ORA11> create table t (x varchar2(4000), y varchar2(4000),
z varchar2(4000), w varchar2(4000));
Table created.
settembrino@ORA11>insert /*+ append */ into t
2 select lpad('x', 4000, 'x'), lpad('y', 4000, 'y'), lpad('z',4000,
'z'), lpad('w', 4000, 'w')
3 from dual
4 connect by level <= 100;
100 rows created.
settembrino@ORA11>select * from t;
select * from t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in
parallel
settembrino@ORA11>commit;
Commit complete.
-- control the values of the moment for the statistics "cleanouts only
- consistent read gets" and "Commit SCN cached"
settembrino@ORA11>select n.name, m.value
2 from v$mystat m, v$statname n
3 where m.STATISTIC# = n.STATISTIC#
4 and n.name in ('cleanouts only - consistent read gets' ,
'Commit SCN cached');
NAME VALUE
----------------------------------------------------------------
----------
Commit SCN cached 4
cleanouts only - consistent read gets 11
-- I run a full table scan on table
select count(*) from t ;
COUNT(*)
---------------
100
-- At this point I capture the image of these blocks in buffer cache
settembrino@ORA11>select v.dirty, count(*)
2 from v$bh v
3 where v.objd = (select object_id from user_objects where
object_name = 'T')
4 group by v.dirty;
D COUNT(*)
--- ---------------
Y 8
N 99
-- and note that only one block is dirty (I think that the other seven
(8-1) are those used by ASSM to manage the space)
-- after the full table scan, I check the values of the moment for the
statistics "cleanouts only - consistent read gets" and "Commit SCN
cached"
settembrino@ORA11>select n.name, m.value
2 from v$mystat m, v$statname n
3 where m.STATISTIC# = n.STATISTIC#
4 and n.name in ('cleanouts only - consistent read gets' ,
'Commit SCN cached');
NAME VALUE
----------------------------------------------------------------
----------
Commit SCN cached 5
cleanouts only - consistent read gets 12
I think that's you want to tell me in your reply,
or am I wrong?
Regards,
Donatello Settembrino
Yes, quite exactly. You might want to check your test case setup -
your table and data definition very likely causes chained rows and I
wonder if this has an impact on the number of dirty blocks found.
Randolf
Randolf,
I do not think that in my tests are chained rows.
If I look at the statistics before and after the
full table scan
settembrino@ORA11>select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and a.name in ('table scan rows gotten', 'table fetch continued
row', 'table scan blocks gotten');
NAME
VALUE
----------------------------------------------------------------
----------
table scan rows gotten
984564
table scan blocks gotten
14699
table fetch continued
row 17
settembrino@ORA11>select count(*) from t;
COUNT(*)
----------
100
settembrino@ORA11>select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and a.name in ('table scan rows gotten', 'table fetch continued
row', 'table scan blocks gotten');
NAME
VALUE
----------------------------------------------------------------
----------
table scan rows gotten
984664
table scan blocks gotten
14799
table fetch continued
row 17
which confirms that there are no chained rows.
Moreover, using the procedure of Thomas Kyte to
monitor the space used, observe:
settembrino@ORA11>set serveroutput on;
settembrino@ORA11>exec show_space('T', 'SETTEMBRINO', 'TABLE');
Unformatted Blocks ..................... 0
FS1 Blocks (0-25) ..................... 0
FS2 Blocks (25-50) ..................... 0
FS3 Blocks (50-75) ..................... 0
FS4 Blocks (75-100)..................... 0
Full Blocks ..................... 100
Total Blocks............................ 128
Total Bytes............................. 2,097,152
Total MBytes............................ 2
Unused Blocks........................... 21
Unused Bytes............................ 344,064
Last Used Ext FileId.................... 40
Last Used Ext BlockId................... 861,248
Last Used Block......................... 43
PL/SQL procedure successfully completed.
which means that have been allocated 128 blocks,
100 blocks are full("Total Blocks", 1 row
for block), 21 are unused ("Unused Blocks") and
7 (128-100-21) are those used by ASSM to manage
the space
which are also the seven dirty blocks additional
in buffer cache(at least I think)
settembrino@ORA11>select v.dirty, count(*)
2 from v$bh v
3 where v.objd = (select object_id from user_objects where
object_name = 'T')
4 group by v.dirty;
D COUNT(*)
--- ---------
Y 8
N 99
Regards,
Donatello Settembrino
Thanks, I assumed a 8KB default block size but you seem to have 16KB
according to the output, given the default PCTFREE of 10 (I don't see
any non-default setting in your CREATE TABLE statement) your 4 times
4000 bytes must be very close to the limit of free space available in
a 16KB block...
Randolf
Exactly, that's right
By the way - we digress - my comment about PCTFREE is irrelevant in
this case since Oracle will stuff this row into a single block no
matter what the PCTFREE setting is as long as the row fits into the
block.
And: You don't have chained rows but your test case is flawed: A
select count(*) never reports "table fetch continued row" because it
does not have to visit the columns but only accesses the row entries
in the row directory. So even with chained rows your test will not
report any "table fetch continued row" statistics. You would need to
count/access a column that is part of the "chained" row pieces - and
it would have to be nullable, otherwise recent versions of Oracle will
transform a count(col) to a count(*).
Randolf
Thanks for the correction,
when I did the second time the test
I forgot to change the query
(select count (col) from table) then,
the test may be incorrect
for the reasons you highlighted.
Regards,
Donatello