Follow Up: DETERMINISTIC in user functions

721 views
Skip to first unread message

David

unread,
Jan 10, 2005, 3:28:04 PM1/10/05
to
Hi, two years and one month ago I posted a question in this NG that
was very satisfactorily answered. I am contemplating upgrading the
firms version of Oracle to 10g and find myself asking myself the same
question again, not that it is relevant to the upgrade but more out of
interest. After Googl'ing through the Oracle 10g docs on-line, I've
decided to ask the same question again and importantly seek expert
opinion whether the situation has changed (especially the scenario
outlined by Connor). Highlights from the original message thread
pasted in-line for those of you whose newsservers don't go back this
far.
Thank you
David

From: David (dfair...@hotmail.com)
Subject: DETERMINISTIC in user functions
Date: 2002-12-20 07:53:33 PST

All, just reading Tom Kytes "Expert one-to-one Oracle" and I note on
page 291/2 he says

CREATE OR REPLACE FUNCTION my_soundex(p_string IN VARCHAR2) RETURN
VARCHAR2
DETERMINISTIC
AS
...
...
...
END;
/

Quote: "Notice in this function, I am using a new keyword
DETERMINISTIC. This declares that the above function - when given the
sample inputs - will always return the exact same output. It is needed
in order to create an index on a user written function. You must tell
Oracle that the function is DETERMINISTIC ....."

I've written all my Oracle stored functions in the past without the
DETERMINISTIC keyword, and of-course everything has worked okay. And
I'm far from being alone. There seems to be a minimal amount of Oracle
documentation & other newsgroup discussion available on this, but
Oracle do say quote: "In some cases the optimizer can use a previously
calculated value rather than executing a user-written function. This
is only safe for functions that behave in a restricted manner. The
...".

So, if I am pondering correctly, the behind-the-scenes implementation
of the DETERMINISTIC keyword allows Oracle to cache function results.
Is that it or is there something a little deeper I'm missing. And what
does "In some cases the optimizer...." mean, what cases, or more
specifically what cases doesn't it not cache the output for the same
input.

Comments most welcome.
David


From: Jonathan Lewis (jona...@jlcomp.demon.co.uk)
Subject: Re: DETERMINISTIC in user functions
Date: 2002-12-20 08:10:18 PST

The implication of the manual is that it is
possible for two consecutive calls to the
function with identical parameters to be
satisfied by a single call followed by a
cached look-up.

I've never yet seen anything to suggest
that this ever happens - and Connor
McDonald sent me a note of something
he'd spotted on Metalink recently that
seemed to corroborate this view.

The idea, of course, is that if you are
using a function-based index, then
(for example) you may not need to
call the function for every row you
process when you do 'create index'.


--
Regards

Jonathan Lewis

From: Connor McDonald (connor_...@yahoo.com)
Subject: Re: DETERMINISTIC in user functions
Date: 2002-12-20 13:20:20 PST

Even in the create index scenario I've not managed to get the number
of
executions lower than the number of rows in the table. The marketing
engine of Oracle tell me that:

when you issue

select ...
where function(col) = value

then

a) if there is no index, then you run 'function(' for each row in the
table
b) if there is an equivalent function based index, then the function
does not need to be executed...Voila - determinism in action

Ugh! What a cop out!

hth
connor

Thomas Kyte

unread,
Jan 11, 2005, 8:44:49 AM1/11/05
to
In article <b4cefdce.0501...@posting.google.com>, David says...

>
>Hi, two years and one month ago I posted a question in this NG that
>was very satisfactorily answered. I am contemplating upgrading the
>firms version of Oracle to 10g and find myself asking myself the same
>question again, not that it is relevant to the upgrade but more out of
>interest. After Googl'ing through the Oracle 10g docs on-line, I've
>decided to ask the same question again and importantly seek expert
>opinion whether the situation has changed (especially the scenario
>outlined by Connor). Highlights from the original message thread
>pasted in-line for those of you whose newsservers don't go back this
>far.
>Thank you
>David
>

No, it hasn't changed, however -- there is and has been "scalar subquery
caching" which I frequently use to cut down the calls. Consider:

ops$tkyte@ORA10G> /*
ops$tkyte@ORA10G> drop table t;
ops$tkyte@ORA10G> create table t as select * from all_objects;
ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T' );
ops$tkyte@ORA10G> */
ops$tkyte@ORA10G>


1* select count(*), count(distinct object_name), count(distinct owner) from t
ops$tkyte@ORA10G> /

COUNT(*) COUNT(DISTINCTOBJECT_NAME) COUNT(DISTINCTOWNER)
---------- -------------------------- --------------------
48039 28399 27


so, there are 48,039 rows in there -- object_name repeats, only has 28,399
unique ones and owner really repeats. we create a function:


ops$tkyte@ORA10G> create or replace function f( x in varchar2 ) return number
2 DETERMINISTIC
3 as
4 begin
5 dbms_application_info.set_client_info( userenv('client_info')+1 );
6 return 0;
7 end;
8 /

Function created.


it'll just count "how often I've been called". Now we are ready to test. I ran
a set of 4 queries (well, 8 really -- once for object_name, again for owner)
using

a) reset client_info to zero
b) run query
c) remember client_info value in a table for later


Note the queries -- I run basically the same query twice, once with and once
without an inline view:

ops$tkyte@ORA10G> set autotrace traceonly
ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> select object_name, f(object_name) from t;
48039 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=154 Card=48039 Bytes=1152936)
1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=154 Card=48039 Bytes=1152936)

ops$tkyte@ORA10G> exec insert into msg values ( 'First => ' ||
userenv('client_info') );
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> select object_name, f(object_name) from (select object_name
from t order by object_name);
48039 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=502 Card=48039 Bytes=816663)
1 0 VIEW (Cost=502 Card=48039 Bytes=816663)
2 1 SORT (ORDER BY) (Cost=502 Card=48039 Bytes=1152936)
3 2 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=154 Card=48039
Bytes=1152936)

ops$tkyte@ORA10G> exec insert into msg values ( 'Second => ' ||
userenv('client_info') );
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> select object_name, (select f(object_name) from dual) from t;
48039 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=154 Card=48039 Bytes=1152936)
1 0 FAST DUAL (Cost=2 Card=1)
2 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=154 Card=48039 Bytes=1152936)

ops$tkyte@ORA10G> exec insert into msg values ( 'Third => ' ||
userenv('client_info') );
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> exec dbms_application_info.set_client_info(0);
PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> select object_name, (select f(object_name) from dual) from
(select object_name from t order by object_name);
48039 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=502 Card=48039 Bytes=816663)
1 0 FAST DUAL (Cost=2 Card=1)
2 0 VIEW (Cost=502 Card=48039 Bytes=816663)
3 2 SORT (ORDER BY) (Cost=502 Card=48039 Bytes=1152936)
4 3 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=154 Card=48039
Bytes=1152936)


Now, the counts:

ops$tkyte@ORA10G> select * from msg;

TXT
--------------------------------------------------------------------------------
First => 48039
Second => 48039
Third => 33206
Fourth => 28399
First with owner => 48039
Second with owner => 48039
Third with owner => 66
Fourth with owner => 27

8 rows selected.


the "select f(x) from anything" has the function getting called once per row --
no caching happening.

However, the "select (select f(x) from dual) from anything" -- it seriously
reduced the calls

In looking at the first run with object_name -- first we reduced the count to
33,206. That is because the object names were arriving "randomly" and the
database is only caching the last "N" scalar subquery results (i've never
bothered to figure out what N is). We had to rerun the scalar subquery a couple
of times with the same inputs as it was out of the cache. However, ordering the
interior inline view by object_name resulted in perfect "cache hit" in this case
- 28,399 invocations, identical to the number of distinct object_names.

We see a simlar effect with owner -- down to 66 (random arrival) and then we
sorted down to 27 (which implies the cache is smaller than 27, greater than 0 :)

You will see similar results running this example in 9i and 8i.


--
Thomas Kyte
Oracle Public Sector
http://asktom.oracle.com/
opinions are my own and may not reflect those of Oracle Corporation

Jonathan Lewis

unread,
Jan 11, 2005, 10:19:54 AM1/11/05
to

Cute trick.

This looks as if it may be similar to the
mechanism Oracle uses for caching the
results of subqueries that have not been
unnested and therefore run as a FILTER
operation. (If so, the cache is probably
a hash table of 256 rows in 8i and 9i,
and 1024 rows in 10g).

I would ascribe the drop from 48,000
to 33,000 as being due to a side effect
of the large number of times that Oracle
creates an object and a synonym with the
same name as consecutive steps (combined
with the way you created the table, of course).

The 66 / 27 is a bit of a puzzle, though - but
it could be explained if
a) there were a couple of owner names
that collided in the hash table but
b) the table was almost sorted by owner
name anyway.

I can't emulate your test, as I obviously have
a different 10g install - but it would be interesting
to get the owner name output each time the
function was called, just to check the theory.
After about the first 25 calls, I would expect
to see just a couple of names alternating.


--
Regards

Jonathan Lewis

http://www.jlcomp.demon.co.uk/faq/ind_faq.html
The Co-operative Oracle Users' FAQ

http://www.jlcomp.demon.co.uk/seminar.html
Public Appearances - schedule updated Dec 23rd 2004


"Thomas Kyte" <tk...@oracle.com> wrote in message
news:115451089.0...@drn.newsguy.com...

Thomas Kyte

unread,
Jan 11, 2005, 10:41:37 AM1/11/05
to
In article <cs0qqq$eho$1...@titan.btinternet.com>, Jonathan Lewis says...

>
>
>
>Cute trick.
>
>This looks as if it may be similar to the
>mechanism Oracle uses for caching the
>results of subqueries that have not been
>unnested and therefore run as a FILTER
>operation. (If so, the cache is probably
>a hash table of 256 rows in 8i and 9i,
>and 1024 rows in 10g).
>
>I would ascribe the drop from 48,000
>to 33,000 as being due to a side effect
>of the large number of times that Oracle
>creates an object and a synonym with the
>same name as consecutive steps (combined
>with the way you created the table, of course).
>
>The 66 / 27 is a bit of a puzzle, though - but
>it could be explained if
> a) there were a couple of owner names
> that collided in the hash table but
> b) the table was almost sorted by owner
> name anyway.
>

yes it is...


>I can't emulate your test, as I obviously have
>a different 10g install - but it would be interesting
>to get the owner name output each time the
>function was called, just to check the theory.
>After about the first 25 calls, I would expect
>to see just a couple of names alternating.
>

run this -- flipping 76 to 75. 75 seems "magic", but that doesn't explain the
27/66 above.

But basically -- it is a cache that works best if the bind variables to the
scalar subquery comes to it sorted.

set echo on
set linesize 121

drop table t;
create table t as
select rownum x, mod(rownum,76) y
from all_objects
where rownum <= 5000;


create or replace function f( x in number ) return number
as
begin


dbms_application_info.set_client_info( userenv('client_info')+1 );

return 0;
end;
/

set autotrace traceonly
exec dbms_application_info.set_client_info(0);
select (select f(y) from dual) from (select * from t order by x);
exec dbms_output.put_line( userenv('client_info') );

exec dbms_application_info.set_client_info(0);
select (select f(y) from dual) from (select * from t order by y);
exec dbms_output.put_line( userenv('client_info') );

set autotrace off

Jonathan Lewis

unread,
Jan 11, 2005, 4:10:02 PM1/11/05
to

It looks like the mechanism is the one
used for filter subqueries. The number
76 isn't special, it just happens to be
unlucky.

Oracle is building a hash table of
(driving value, subquery result)
When you create the table with mod(rownum,76)
the value 75 appears in the column, and 75 happens
to collide with 48. On a collision, Oracle doesn't
create an overflow, it simply doesn't store the
newer result.

Try:
update t set y = 76 where y = 75;
or
update t set y = 76 where y = 75;

Either change stops the collision.

Your 75/76 test doesn't reproduce on 8i or 9i,
by the way because they use a hash table of
256 elements, and 10g uses a hash table of 1,024
(according to my best guess).


Regards

Jonathan Lewis

http://www.jlcomp.demon.co.uk/faq/ind_faq.html
The Co-operative Oracle Users' FAQ

http://www.jlcomp.demon.co.uk/seminar.html
Public Appearances - schedule updated Dec 23rd 2004


"Thomas Kyte" <tk...@oracle.com> wrote in message

news:115458097.0...@drn.newsguy.com...

Thomas Kyte

unread,
Jan 11, 2005, 4:28:41 PM1/11/05
to
In article <cs1fba$c41$1...@titan.btinternet.com>, Jonathan Lewis says...

>
>
>It looks like the mechanism is the one
>used for filter subqueries. The number
>76 isn't special, it just happens to be
>unlucky.
>
>Oracle is building a hash table of
>(driving value, subquery result)
>When you create the table with mod(rownum,76)
>the value 75 appears in the column, and 75 happens
>to collide with 48. On a collision, Oracle doesn't
>create an overflow, it simply doesn't store the
>newer result.
>
>Try:
> update t set y = 76 where y = 75;
>or
> update t set y = 76 where y = 75;
>

do (a) or do (a) :)

>Either change stops the collision.
>
>Your 75/76 test doesn't reproduce on 8i or 9i,
>by the way because they use a hash table of
>256 elements, and 10g uses a hash table of 1,024
>(according to my best guess).
>
>
>Regards
>
>Jonathan Lewis
>
>http://www.jlcomp.demon.co.uk/faq/ind_faq.html
>The Co-operative Oracle Users' FAQ
>
>http://www.jlcomp.demon.co.uk/seminar.html
>Public Appearances - schedule updated Dec 23rd 2004
>
>

thanks -- that's neat. never could figure out what the cache size was, that
clears it up rather nicely....


(this will be an example I use when I say "i learn something *new* about Oracle
each and every day :)

Reply all
Reply to author
Forward
0 new messages