Separate Index or Composite Indice ?

userjohn

unread,

Aug 30, 2007, 12:15:39 PM8/30/07

to

10gR2

If you have :
2 columns on a hi-traffic transactional table
- one ("some_id") high cardinality
- and one ("yes_no") low cardinality

You use both columns in SQL - "some_id" is used in JOIN and "yes_no" as a
predicate

Would you create normal index on BOTH , just "some_id" or a composite indice
on "some_id" + "yes_no" ?

thanks

fitzj...@cox.net

unread,

Aug 30, 2007, 12:43:44 PM8/30/07

to

I would be testing the possibilities and determine which one provides
the 'best' execution plan. 10gR2 doesn't say much, as that covers the
base release (10.2.0.1) through 10.2.0.3; optimizer behaviour may have
changed between the base and the most current release.

David Fitzjarrell

xho...@gmail.com

unread,

Aug 30, 2007, 3:47:19 PM8/30/07

to

"userjohn" <user...@mailinator.com> wrote:
> 10gR2
>
> If you have :
> 2 columns on a hi-traffic transactional table
> - one ("some_id") high cardinality
> - and one ("yes_no") low cardinality
>
> You use both columns in SQL - "some_id" is used in JOIN and "yes_no" as a
> predicate

In the same SQL, or in separate SQLs?

> Would you create normal index on BOTH , just "some_id" or a composite
> indice on "some_id" + "yes_no" ?

I generally wouldn't bother with a single column index on "yes_no".
I'd make one on (some_id,yes_no) and forget it. If it forces itself
back into your attention due to performance issues, then I'd look
at explain plans.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB

Jerome Vitalis

unread,

Aug 31, 2007, 7:01:08 AM8/31/07

to

On Thu, 30 Aug 2007 19:47:19 +0000, xhoster wrote:

> "userjohn" <user...@mailinator.com> wrote:
>> 10gR2
>>
>> If you have :
>> 2 columns on a hi-traffic transactional table - one ("some_id") high
>> cardinality
>> - and one ("yes_no") low cardinality
>>
>> You use both columns in SQL - "some_id" is used in JOIN and "yes_no" as
>> a predicate
>
> In the same SQL, or in separate SQLs?

Always in the same?

Always in separate?
You want to optimize all separate SQL?

You'll have to be more precise about your purpose.

Richard Foote

unread,

Aug 31, 2007, 9:24:45 AM8/31/07

to

"userjohn" <user...@mailinator.com> wrote in message
news:13ddrbf...@corp.supernews.com...

Can't really answer the question without a bit more info but some things to
consider include:

Is this one SQL statement or are we talking about separate / differing SQL
statements ?

Could the some_id column also be used as a predicate or will all resultant
values of some_id be of interest ?

Is some_id a PK or a FK in this table ?

Are there other columns in the table (or could an index organized table be
considered) ? If there are other columns, how frequently are they referenced
?

How large is this table and how large is the table (or tables) that it joins
with ?

What's the data distribution in the yes_no column ? Is there a value which
constitutes the vast percentage of rows or is yes and no reasonably evenly
distributed ?

Note by placing the yes_no column first in a composite column, you might be
able to compress the index a tad (depends on actual length of column
values). Unlikely the other way around.

Note by placing the yes_no column first in a composite column, you can
potentially use a skip-scan access if you happen to not reference yes_no in
a predicate. Unlikely the other way around. So I would consider a composite
index on yes_no, some_id a more likely combination than the other way around
if you go down the composite index path although it does depend on the
distribution of data on the yes_no column and how likely the column is
referenced.

Some things to consider anyways.

Cheers

Richard

Jan Krueger

unread,

Aug 31, 2007, 2:58:00 PM8/31/07

to

xho...@gmail.com wrote:
> "userjohn" <user...@mailinator.com> wrote:
>> 10gR2
>>
>> If you have :
>> 2 columns on a hi-traffic transactional table
>> - one ("some_id") high cardinality
>> - and one ("yes_no") low cardinality
>>
>> You use both columns in SQL - "some_id" is used in JOIN and "yes_no" as a
>> predicate
>
> In the same SQL, or in separate SQLs?
>
>> Would you create normal index on BOTH , just "some_id" or a composite
>> indice on "some_id" + "yes_no" ?
>
> I generally wouldn't bother with a single column index on "yes_no".
> I'd make one on (some_id,yes_no) and forget it. If it forces itself
> back into your attention due to performance issues, then I'd look
> at explain plans.
>

I would consider creating it the other way around (yes_no, some_id)
because it
allows an index skip scan to be performed if you need to look only for
some_id. Don't forget to have histograms on the columns.

Anyway you have to consult the execution plans of actual statements.

Jan

xho...@gmail.com

unread,

Aug 31, 2007, 3:20:36 PM8/31/07

to

Jan Krueger <j...@stud.uni-hannover.de> wrote:
> xho...@gmail.com wrote:
> > "userjohn" <user...@mailinator.com> wrote:
> >> 10gR2
> >>
> >> If you have :
> >> 2 columns on a hi-traffic transactional table
> >> - one ("some_id") high cardinality
> >> - and one ("yes_no") low cardinality
> >>
> >> You use both columns in SQL - "some_id" is used in JOIN and "yes_no"
> >> as a predicate
> >
> > In the same SQL, or in separate SQLs?
> >
> >> Would you create normal index on BOTH , just "some_id" or a composite
> >> indice on "some_id" + "yes_no" ?
> >
> > I generally wouldn't bother with a single column index on "yes_no".
> > I'd make one on (some_id,yes_no) and forget it. If it forces itself
> > back into your attention due to performance issues, then I'd look
> > at explain plans.
> >
> I would consider creating it the other way around (yes_no, some_id)
> because it
> allows an index skip scan to be performed if you need to look only for
> some_id.

Maybe. But if you look only for some_id, then a regular range scan
on (some_id,yes_no) would be better than the skip scan on (yes_no,some_id),
wouldn't it?

And if you specify just yes_no, the cluster factor would probably dictate
against using the index at all, no matter which way around it is.

Jan Krueger

unread,

Sep 1, 2007, 3:05:17 AM9/1/07

to

True, skip scan would double the effort - one range scan for yes, one
for no. But you have only one index to maintain. It depends...

> And if you specify just yes_no, the cluster factor would probably dictate
> against using the index at all, no matter which way around it is.

It depends on the value distribution of the yes_no column. I think it's
more a matter of the histogram than the clustering factor. If you have
only a few yes values i.e. the CBO should use the index in col = 'Y'
conditions. The clustering factor could be very good even in a two value
column if you sort the table by the value. Imagine a "to be processed
flag" where new lines to be processed are always appended and afterwards
processed. This gives a perfect low clustering factor, but using the
index is only usefull for "to be processed" = yes.

Jan

Jonathan Lewis

unread,

Sep 2, 2007, 10:48:40 AM9/2/07

to

"Richard Foote" <richar...@nospam.bigpond.com> wrote in message
news:xEUBi.29001$4A1....@news-server.bigpond.net.au...

> Note by placing the yes_no column first in a composite column, you can
> potentially use a skip-scan access if you happen to not reference yes_no
> in a predicate. Unlikely the other way around. So I would consider a
> composite index on yes_no, some_id a more likely combination than the
> other way around if you go down the composite index path although it does
> depend on the distribution of data on the yes_no column and how likely
> the column is referenced.
>
> Some things to consider anyways.
>
> Cheers
>
> Richard
>

Just a little warning - last time I checked the skip-scan was
not considered as an option for the second table in a nested
loop join with index.

--
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Jan Krueger

unread,

Sep 2, 2007, 2:56:10 PM9/2/07

to

Jonathan Lewis wrote:
> "Richard Foote" <richar...@nospam.bigpond.com> wrote in message
> news:xEUBi.29001$4A1....@news-server.bigpond.net.au...
>
>> Note by placing the yes_no column first in a composite column, you can
>> potentially use a skip-scan access if you happen to not reference yes_no
>> in a predicate. Unlikely the other way around. So I would consider a
>> composite index on yes_no, some_id a more likely combination than the
>> other way around if you go down the composite index path although it does
>> depend on the distribution of data on the yes_no column and how likely
>> the column is referenced.
>>
>> Some things to consider anyways.
>>
>> Cheers
>>
>> Richard
>>
>
> Just a little warning - last time I checked the skip-scan was
> not considered as an option for the second table in a nested
> loop join with index.
>

Was not considered means you have traced the optimizer (10053 event) or
you haven't seen it in the execution plan?

Best regards.

Jan

userjohn

unread,

Sep 4, 2007, 2:50:30 PM9/4/07

to

> In the same SQL, or in separate SQLs?

Yes, in SAME SQL

dammit I already tried real hard to be precise ! ;)