Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

DBMS_STATS vs. manual ANALYZE ... desupported?

0 views
Skip to first unread message

Domenic

unread,
Sep 16, 2004, 4:53:12 PM9/16/04
to
Does anyone know *why* Oracle is saying we shouldn't do manual
analyzing via our own scripts with or without dynamic PL/SQL loops?

Instead, they're saying if we don't use dbms_stats, we're going
straight to hell. I find using my own scripts allows me to create
histograms where needed, to analyze more active tables and partitions
more frequently, etc.

Instead, in the 10g docs it says if we don't switch over to
dbms_stats, the optimizer won't *see* our stats collected directly via
ANALYZE in a future release.

What does that mean? I thought dbms_stats just generated ANALYZEs in
a loop behind the scenes. Am I missing something? I thought the end
result was the same and that it could not be determined which method
was used.

Thanks in advance for all replies ...

Domenic

Niall Litchfield

unread,
Sep 16, 2004, 5:44:45 PM9/16/04
to
"Domenic" <dome...@hotmail.com> wrote in message
news:c7e08a19.0409...@posting.google.com...

> Does anyone know *why* Oracle is saying we shouldn't do manual
> analyzing via our own scripts with or without dynamic PL/SQL loops?

Because it is more sensible to have one standard flexible way of collecting
stats?

> Instead, they're saying if we don't use dbms_stats, we're going
> straight to hell.

I'd like to see the reference.

> I find using my own scripts allows me to create
> histograms where needed, to analyze more active tables and partitions
> more frequently, etc.

I don't see any reason why you can't write scripts that use dbms_stats
yourself? We do. All of your requirements above can be met by dbms_stats.

> Instead, in the 10g docs it says if we don't switch over to
> dbms_stats, the optimizer won't *see* our stats collected directly via
> ANALYZE in a future release.
>
> What does that mean? I thought dbms_stats just generated ANALYZEs in
> a loop behind the scenes. Am I missing something? I thought the end
> result was the same and that it could not be determined which method
> was used.

In 8i (might not be true of 817) your understanding is correct. In 9 and
above it isn't.

SQL> create table t2
2 as select * from all_objects
3 ;

Table created.

SQL> alter session set sql_trace=true;

Session altered.

SQL> exec dbms_stats.gather_table_stats(USER,'T2',CASCADE => TRUE);

PL/SQL procedure successfully completed.

SQL> alter session set sql_trace=FALSE;

Session altered.


Then look at the trace file, you'll see a whole bunch of plain old sql that
is used to gather the stats. Stuff like

PARSING IN CURSOR #57 len=2257 dep=1 uid=68 oct=3 lid=68 tim=221966981316
hv=3081725647 ad='67a23b1c'
select /*+ cursor_sharing_exact use_weak_name_resl dynamic_sampling(0)
no_monitoring */ count(*),count("OWNER"),count(distinct
"OWNER"),sum(vsize("OWNER")),substrb(dump(min("OWNER"),16,0,32),1,120),substrb(dump(max("OWNER"),16,0,32),1,120),count("OBJECT_NAME"),count(distinct
"OBJECT_NAME"),sum(vsize("OBJECT_NAME")),substrb(dump(min("OBJECT_NAME"),16,0,32),1,120),substrb(dump(max("OBJECT_NAME"),16,0,32),1,120),count("SUBOBJECT_NAME"),count(distinct
"SUBOBJECT_NAME"),sum(vsize("SUBOBJECT_NAME")),substrb(dump(min("SUBOBJECT_NAME"),16,0,32),1,120),substrb(dump(max("SUBOBJECT_NAME"),16,0,32),1,120),count("OBJECT_ID"),count(distinct
"OBJECT_ID"),sum(vsize("OBJECT_ID")),substrb(dump(min("OBJECT_ID"),16,0,32),1,120),substrb(dump(max("OBJECT_ID"),16,0,32),1,120),count("DATA_OBJECT_ID"),count(distinct
"DATA_OBJECT_ID"),sum(vsize("DATA_OBJECT_ID")),substrb(dump(min("DATA_OBJECT_ID"),16,0,32),1,120),substrb(dump(max("DATA_OBJECT_ID"),16,0,32),1,120),count("OBJECT_TYPE"),count(distinct
"OBJECT_TYPE"),sum(vsize("OBJECT_TYPE")),substrb(dump(min("OBJECT_TYPE"),16,0,32),1,120),substrb(dump(max("OBJECT_TYPE"),16,0,32),1,120),count("CREATED"),count(distinct
"CREATED"),substrb(dump(min("CREATED"),16,0,32),1,120),substrb(dump(max("CREATED"),16,0,32),1,120),count("LAST_DDL_TIME"),count(distinct
"LAST_DDL_TIME"),substrb(dump(min("LAST_DDL_TIME"),16,0,32),1,120),substrb(dump(max("LAST_DDL_TIME"),16,0,32),1,120),count("TIMESTAMP"),count(distinct
"TIMESTAMP"),sum(vsize("TIMESTAMP")),substrb(dump(min("TIMESTAMP"),16,0,32),1,120),substrb(dump(max("TIMESTAMP"),16,0,32),1,120),count("STATUS"),count(distinct
"STATUS"),sum(vsize("STATUS")),substrb(dump(min("STATUS"),16,0,32),1,120),substrb(dump(max("STATUS"),16,0,32),1,120),count("TEMPORARY"),count(distinct
"TEMPORARY"),sum(vsize("TEMPORARY")),substrb(dump(min("TEMPORARY"),16,0,32),1,120),substrb(dump(max("TEMPORARY"),16,0,32),1,120),count("GENERATED"),count(distinct
"GENERATED"),sum(vsize("GENERATED")),substrb(dump(min("GENERATED"),16,0,32),1,120),substrb(dump(max("GENERATED"),16,0,32),1,120),count("SECONDARY"),count(distinct
"SECONDARY"),sum(vsize("SECONDARY")),substrb(dump(min("SECONDARY"),16,0,32),1,120),substrb(dump(max("SECONDARY"),16,0,32),1,120)
from "NIALL"."T2" sample ( 18.7074829932) t
END OF STMT

etc etc. (the above is from 10 but the move started in (at least) 9). As to
what it means - move to DBMS_STATS.

DBMS_STATS is able to gather both more and more accurate information than
analyse (there are exceptions but these are generally minor). In principle
giving the CBO better information to use is A Good Thing.


--
Niall Litchfield
Oracle DBA
http://www.niall.litchfield.dial.pipex.com

--
Niall Litchfield
Oracle DBA
http://www.niall.litchfield.dial.pipex.com


Howard J. Rogers

unread,
Sep 16, 2004, 6:11:18 PM9/16/04
to
Niall Litchfield wrote:

> "Domenic" <dome...@hotmail.com> wrote in message
> news:c7e08a19.0409...@posting.google.com...
>> Does anyone know *why* Oracle is saying we shouldn't do manual
>> analyzing via our own scripts with or without dynamic PL/SQL loops?
>
> Because it is more sensible to have one standard flexible way of
> collecting stats?


I didn't see the original post, I'm sorry... but whilst I agree with Niall's
answer, I'll offer a slightly different one:

Because you can't parallelise the manual analysing (dbms_stats can)
Because you can't dynamically and automatically work out whether you should
collect histograms for a column suffering from data skew or not (dbms_stats
can).

Because, in short, dbms_stats is more functional.

>> Instead, they're saying if we don't use dbms_stats, we're going
>> straight to hell.
>
> I'd like to see the reference.

It *is* strongly recommended to stop using 'analyze table EMP' statements.
And for the sort of good reasons I mentioned.

>> I find using my own scripts allows me to create
>> histograms where needed, to analyze more active tables and partitions
>> more frequently, etc.

Then you are highly unusual, because most people don't know their data that
well to be able to say with confidence 'this needs a histogram, this
doesn't'. So they end up either over- or under-doing it. Dbms_stats has the
smarts to be able to work it out for them, mostly pretty accurately.

Even if I accepted you were that unusual, though: does your manual script
consider histograms are needed just because there is a data skew? Or does
it additionally take account of the contents of the library cache and only
consider a histogram is needed where there is a data skew AND some evidence
of the fact that the affected column has been the target of several
queries?

Because that's what dbms_stats can do, and if your manual script isn't
equally as subtle, then you've been collecting too many histograms, and
slowing your database down as a result. What's the point of collecting a
histogram for a skewed column that is never used as a WHERE predicate??

[Snip the rest of Niall's excellent advice]

Anyway, Domenic: this is the new world, of automated and self-managing
everything. The days of home-brew scripts for all sorts of things are
numbered. You, like the rest of us, are just going to have to grit our
teeth and get used to it. :-)

But it's not just Oracle playing big brother in this particular instance:
there is a real reason for the shift.

Regards
HJR


Niall Litchfield

unread,
Sep 17, 2004, 4:54:12 AM9/17/04
to
Howard J. Rogers wrote:

> Niall Litchfield wrote:
> >> Instead, they're saying if we don't use dbms_stats, we're going
> >> straight to hell.
> >
> > I'd like to see the reference.
>
> It *is* strongly recommended to stop using 'analyze table EMP'
statements.
> And for the sort of good reasons I mentioned.

What I was trying to get at was that I thought Domenic's question was
good because I have seen the advance warning that ANALYZE will be
removed at a later date, but I haven't seen much *from Oracle* saying
why. I agree with all your statements by the way - I should have
explained what I meant by flexible and didn't.

Brian Dick

unread,
Sep 17, 2004, 8:04:57 AM9/17/04
to
It doesn't help matters that Tom Kyte still uses ANALYZE in his Oracle
Magazine articles.

"Niall Litchfield" <niall.li...@dial.pipex.com> wrote in message
news:cie8nk$t...@odah37.prod.google.com...

Niall Litchfield

unread,
Sep 17, 2004, 2:23:31 PM9/17/04
to
"Brian Dick" <bd...@cox.net> wrote in message
news:XkA2d.37$Nq3...@news-srv1.fmr.com...

> It doesn't help matters that Tom Kyte still uses ANALYZE in his Oracle
> Magazine articles.

This surprises me somewhat. Do you have a reference?

Thomas Kyte

unread,
Sep 17, 2004, 3:33:31 PM9/17/04
to
"Brian Dick" <bd...@cox.net> wrote in message news:<XkA2d.37$Nq3...@news-srv1.fmr.com>...
> It doesn't help matters that Tom Kyte still uses ANALYZE in his Oracle
> Magazine articles.
>

i promise not to do that anymore (and all future examples on the site
will use dbms_stats as well).

I instituted that policy recently. it's a couple of extra keystrokes
- but it can make a material difference in the statistic values, and
just confused people.

Howard J. Rogers

unread,
Sep 17, 2004, 5:03:35 PM9/17/04
to
Brian Dick wrote:

> It doesn't help matters that Tom Kyte still uses ANALYZE in his Oracle
> Magazine articles.

Don't we all? (Rhetorical question!)

I know I still do, because it's a bit quicker to type, it's hardwired in the
brain, the syntax is easier, and I'm a curmudgeonly old timer anyway.

It takes up less room on the page, too.

And, not least, analyze works for every version since at least 7. Analyze
commands only kicked in with 8.0 (I think... maybe 8i), and underwent some
significant enhancements in 9i. So if I'm writing a generic paper, I like
to keep things version-agnostic.

I expect Tom did/does similarly.

Regards
HJR

Daniel Morgan

unread,
Sep 17, 2004, 6:01:11 PM9/17/04
to
Domenic wrote:

For the same reason they have said stop using the RBO, stop using LONG,
stop using LONG RAW, etc. Because it has been deprecated and all new
functionality is being incorporated into DBMS_STATS. So, for example,
if you are working with clusters, partitions, IOTs, etc. DBMS_UTILITY
and its procedures are clueless.

--
Daniel A. Morgan
University of Washington
damo...@x.washington.edu
(replace 'x' with 'u' to respond)

Thomas Kyte

unread,
Sep 17, 2004, 8:38:19 PM9/17/04
to
"Niall Litchfield" <niall.li...@dial.pipex.com> wrote in message news:<414b2b9f$0$20248$cc9e...@news-text.dial.pipex.com>...

> "Brian Dick" <bd...@cox.net> wrote in message
> news:XkA2d.37$Nq3...@news-srv1.fmr.com...
> > It doesn't help matters that Tom Kyte still uses ANALYZE in his Oracle
> > Magazine articles.
>
> This surprises me somewhat. Do you have a reference?

I'm not sure if it ever made it into print in the magazine -- could be
-- but I know I'm guilty of doing it with the Q&D (quick and dirty)
examples on asktom in the past......

I shouldn't have (was just being lazy, for an example, "analyze table
t" is less keystrokes).

so, i won't anymore.

Howard J. Rogers

unread,
Sep 17, 2004, 8:55:31 PM9/17/04
to
Thomas Kyte wrote:

Dang! That means I suppose I've got to behave now, too...

:-(
HJR

Jonathan Lewis

unread,
Sep 18, 2004, 2:13:28 AM9/18/04
to

One reason why it's "a good thing" to use dbms_stats
instead of analyze is that some of the results are
different, and the optimizer is likely to be engineered
to assume that the results it uses come from dbms_stats.

For example: avg_col_len in user_tab_columns
is one larger after a dbms_stats call than it is after
an analyze - and it is a figure used in the calculation
of the cost of a hash join or a sort. Changing from
analyze to dbms_stats could (at the boundary cases)
change some of your execution paths.

--
Regards

Jonathan Lewis

http://www.jlcomp.demon.co.uk/faq/ind_faq.html
The Co-operative Oracle Users' FAQ

http://www.jlcomp.demon.co.uk/seminar.html
Optimising Oracle Seminar - schedule updated Sept 2nd

"Niall Litchfield" <niall.li...@dial.pipex.com> wrote in message

news:cie8nk$t...@odah37.prod.google.com...

0 new messages