If you have :
2 columns on a hi-traffic transactional table
- one ("some_id") high cardinality
- and one ("yes_no") low cardinality
You use both columns in SQL - "some_id" is used in JOIN and "yes_no" as a
predicate
Would you create normal index on BOTH , just "some_id" or a composite indice
on "some_id" + "yes_no" ?
thanks
I would be testing the possibilities and determine which one provides
the 'best' execution plan. 10gR2 doesn't say much, as that covers the
base release (10.2.0.1) through 10.2.0.3; optimizer behaviour may have
changed between the base and the most current release.
David Fitzjarrell
In the same SQL, or in separate SQLs?
> Would you create normal index on BOTH , just "some_id" or a composite
> indice on "some_id" + "yes_no" ?
I generally wouldn't bother with a single column index on "yes_no".
I'd make one on (some_id,yes_no) and forget it. If it forces itself
back into your attention due to performance issues, then I'd look
at explain plans.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
> "userjohn" <user...@mailinator.com> wrote:
>> 10gR2
>>
>> If you have :
>> 2 columns on a hi-traffic transactional table - one ("some_id") high
>> cardinality
>> - and one ("yes_no") low cardinality
>>
>> You use both columns in SQL - "some_id" is used in JOIN and "yes_no" as
>> a predicate
>
> In the same SQL, or in separate SQLs?
Always in the same?
Always in separate?
You want to optimize all separate SQL?
You'll have to be more precise about your purpose.
Can't really answer the question without a bit more info but some things to
consider include:
Is this one SQL statement or are we talking about separate / differing SQL
statements ?
Could the some_id column also be used as a predicate or will all resultant
values of some_id be of interest ?
Is some_id a PK or a FK in this table ?
Are there other columns in the table (or could an index organized table be
considered) ? If there are other columns, how frequently are they referenced
?
How large is this table and how large is the table (or tables) that it joins
with ?
What's the data distribution in the yes_no column ? Is there a value which
constitutes the vast percentage of rows or is yes and no reasonably evenly
distributed ?
Note by placing the yes_no column first in a composite column, you might be
able to compress the index a tad (depends on actual length of column
values). Unlikely the other way around.
Note by placing the yes_no column first in a composite column, you can
potentially use a skip-scan access if you happen to not reference yes_no in
a predicate. Unlikely the other way around. So I would consider a composite
index on yes_no, some_id a more likely combination than the other way around
if you go down the composite index path although it does depend on the
distribution of data on the yes_no column and how likely the column is
referenced.
Some things to consider anyways.
Cheers
Richard
Anyway you have to consult the execution plans of actual statements.
Jan
Maybe. But if you look only for some_id, then a regular range scan
on (some_id,yes_no) would be better than the skip scan on (yes_no,some_id),
wouldn't it?
And if you specify just yes_no, the cluster factor would probably dictate
against using the index at all, no matter which way around it is.
> And if you specify just yes_no, the cluster factor would probably dictate
> against using the index at all, no matter which way around it is.
It depends on the value distribution of the yes_no column. I think it's
more a matter of the histogram than the clustering factor. If you have
only a few yes values i.e. the CBO should use the index in col = 'Y'
conditions. The clustering factor could be very good even in a two value
column if you sort the table by the value. Imagine a "to be processed
flag" where new lines to be processed are always appended and afterwards
processed. This gives a perfect low clustering factor, but using the
index is only usefull for "to be processed" = yes.
Jan
> Note by placing the yes_no column first in a composite column, you can
> potentially use a skip-scan access if you happen to not reference yes_no
> in a predicate. Unlikely the other way around. So I would consider a
> composite index on yes_no, some_id a more likely combination than the
> other way around if you go down the composite index path although it does
> depend on the distribution of data on the yes_no column and how likely
> the column is referenced.
>
> Some things to consider anyways.
>
> Cheers
>
> Richard
>
Just a little warning - last time I checked the skip-scan was
not considered as an option for the second table in a nested
loop join with index.
--
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html
Best regards.
Jan
dammit I already tried real hard to be precise ! ;)