This is a question about database design philosophy and opinion more
than anything else.
Imagine we have a person table
ID (int), Title(VARCHAR(22)), Gender(Char(1)), State(Char(2)),
First_Name..........
Something along those lines.
Now, we all know that it makes no sense to have a gender table - a
simple check constraint = 'M' or 'F' is all that's required.
For Title, the situation is slightly different. You could have Ms.,
Mr., Mrs. - but what happens if you want to add 'Fr.', 'Col.',
'Excellency', 'Her Imperial Highness', 'His Holiness'? All, of course
in the knowledge that you would never have more than 20-25 entries in
the Title field.
What's better in this case?
To create a table called Person_Title and develop a user screen to
maintain it (adding one record every year or so, and then very few for
years), or to permit in these circumstances the maintainer of the
tables to issue DDL against the Person table and add to the
check_constraint clause (said maintainance being done at night or
something like that). What about indexes on this table - IOT?
Bitmap index on the Person_Table on the Gender field?
Assuming the State field has the American States - say 50 - 55 entries
which are unlikely to change - which is better - a long check
constraint or a referencing table?
Where does one start drawing the line between check constraints and
tables - i.e. the maintainance and verification of Foreign Keys?
Any ideas, discussions, URL's, references or other stuff welcome.
Paul...
--
plinehan __at__ yahoo __dot__ __com__
XP Pro, SP 2,
Oracle, 9.2.0.1.0 (Enterprise Ed.)
Interbase 6.0.1.0;
When asking database related questions, please give other posters
some clues, like operating system, version of db being used and DDL.
The exact text and/or number of error messages is useful (!= "it didn't work!").
Thanks.
Furthermore, as a courtesy to those who spend
time analysing and attempting to help, please
do not top post.
I personally like to have everything in lookup tables; this acts as
documentation as well as allowing for easy updating.
On the other hand I would be making Gender a constraint (what about
Unknown, depending on you system and its audience you may need this).
Gender is a nicely self describing field and M, F and U would make
sense as a value to someone running SQL.
As for states... I must take issue with you saying the state of the USA
are unlikely to change. On a historical basis the US is well overdue
for another state, haven't had any newbies since 1959... Australia on
the other hand has had the same states since its inception 105 years
ago and is unlikely to change; so based on that you should definitely
put that in a table ;-)
Seriously, I think we just need to take pragmatic approaches to these
things
Hardcoded check constraints should be limited to two or three values.
The basic question is:
where do I put it? Do I put in a function, or do I put it in a table?
If you think the # of values is going to change, or the values are
going to alter, and you don't want to have to change the app or the
check constraints: put them in a table.
If the number of values is more than 3: always put them in a table.
You might forget a state or miscount them.
--
Sybrand Bakker, Senior Oracle DBA
> If you think the # of values is going to change, or the values are
> going to alter, and you don't want to have to change the app or the
> check constraints: put them in a table.
Well, what I was asking was when it is appropriate to
change the app by modifying check constraints on
a table - is the rule about no DDL in an application
an *_ABSOLUTE_* one, and if not, what are the
circumstances under which it should be allowed?
> If the number of values is more than 3: always put them in a table.
> You might forget a state or miscount them.
Ah, I see. This is a perspective I had forgotten about
when framing my original post - I forgot to think about
those application programmers for example who will
have to write reports on the data in the app - it's true
enough that anything beyond two or three is probably
too much - I mean gender is OK but what about title?
The part of my post which hasn't been addressed is
what sort of indexes to put on the Person table - i.e.
a bitmap index if it's only two values - what about higher
numbers - what's the rule of thumb for the cardinality
of bitmap indexes?
Also, the construction of these "minor" tables - should
they be indexed, and if so, what sort of index should be
used - and if there are differences, why?
Paul...
<snip>
>The part of my post which hasn't been addressed is
>what sort of indexes to put on the Person table - i.e.
>a bitmap index if it's only two values - what about higher
>numbers - what's the rule of thumb for the cardinality
>of bitmap indexes?
>
>Also, the construction of these "minor" tables - should
>they be indexed, and if so, what sort of index should be
>used - and if there are differences, why?
>
>
>Paul...
>
You use bitmap indexes on foreign key fields of a table when you want
to access the table by this fields combinedly (if that's an english
word), for instance with the question:
how many persons are 'Female' with the title 'His Holyness'.
Your query would look up these descriptions in gender table and title
table and with the related codes access the persons table. These
queries are typical for a datawarehouse. In this case access on the
foreign key fields could be solved by the optimizer by first combining
the bitmap indexes (for which these indexes are extremely suited) and
with the resulting rowid's accessing the table itself.
In an OLTP environment the access is usually the other way around: you
insert/update a record in the persons table and the codes are checked
against the title table and gender table. For this you don't need any
indexes on the foreign key fields. And if you want your foreign key
fields indexed anyway (there are issues when updating the primary key
in a minor table), you probably never will do a combined search so a
normal index will do. Apart from this, the persons table will be
subject to many changes, for which bitmap indexes are not suited. Even
locking would no longer be row based!
If you use the database schema to enforce the key field values in the
persons table, you have to define the key fields as foreign keys and
the related fields in the minor tables as primary keys, which will
create an index for each primary key, explicitly or implicitly.
If you use code to check if a key field value is allowed (which I
would strongly advice against), you can choose if you want to create
an index on the minor tables. If the table is big, this is the smart
thing to do (so the optimizer can use the index for a lookup), if it
is small, who cares? So do it in any case, that way you don't have to
wreck your brains on this issue for every table.
Jaap.
> You use bitmap indexes on foreign key fields of a table when you want
> to access the table by this fields combinedly (if that's an english
> word), for instance with the question:
> how many persons are 'Female' with the title 'His Holyness'.
Yes, I understand this, but my question is really about when
do you start/stop doing this? As I understand it, bitmap
indices are to be used when the cardinality of the column
is low - but what is low? Is it 10 values? 20? 50? Or is it
the cardinality with respect to the total number of fields?
50% (M/F)? 33% (M/F/U)? 10% (Title)? 2% (US States)?
> Your query would look up these descriptions in gender table and title
> table and with the related codes access the persons table. These
> queries are typical for a datawarehouse. In this case access on the
> foreign key fields could be solved by the optimizer by first combining
> the bitmap indexes (for which these indexes are extremely suited) and
> with the resulting rowid's accessing the table itself.
I understand this - my question about *_when_* it should
be done remains.
> In an OLTP environment the access is usually the other way around: you
> insert/update a record in the persons table and the codes are checked
> against the title table and gender table. For this you don't need any
> indexes on the foreign key fields. And if you want your foreign key
> fields indexed anyway (there are issues when updating the primary key
> in a minor table), you probably never will do a combined search so a
> normal index will do. Apart from this, the persons table will be
> subject to many changes, for which bitmap indexes are not suited. Even
> locking would no longer be row based!
Why would you not do the index? When creating a new person
for example, while they might change other attributes like their
address, their gender is hardly likely to change! Nor is title,
(apart from Ms. to Mrs., which is less and less nowadays anyway).
As I understand it, Oracle doesn't automatically create an
index against foreign key columns - there are RDBMS's
that do.
> If you use the database schema to enforce the key field values in the
> persons table, you have to define the key fields as foreign keys and
> the related fields in the minor tables as primary keys, which will
> create an index for each primary key, explicitly or implicitly.
Does it do this automatically? Suppose you have a gender field
which you don't want indexed, for example because it might
be skewed?
> If you use code to check if a key field value is allowed (which I
> would strongly advice against),
It would be OK in a trigger, surely?
> you can choose if you want to create
> an index on the minor tables. If the table is big, this is the smart
> thing to do (so the optimizer can use the index for a lookup), if it
> is small, who cares? So do it in any case, that way you don't have to
> wreck your brains on this issue for every table.
My question is what sort of index do you have on the minor
tables - IOT or ordinary? My understanding is that it should
be IOT - am I wrong?
Paul...
> Jaap.
On 25 May 2006 06:13:34 -0700, plin...@yahoo.com wrote:
>
>
>> You use bitmap indexes on foreign key fields of a table when you want
>> to access the table by this fields combinedly (if that's an english
>> word), for instance with the question:
>
>> how many persons are 'Female' with the title 'His Holyness'.
>
>
>Yes, I understand this, but my question is really about when
>do you start/stop doing this? As I understand it, bitmap
>indices are to be used when the cardinality of the column
>is low - but what is low? Is it 10 values? 20? 50? Or is it
>the cardinality with respect to the total number of fields?
>50% (M/F)? 33% (M/F/U)? 10% (Title)? 2% (US States)?
>
The cardinality is the number of distinct values in that particular
key. For me: anything less below 5 but definitely 10.
>
>> Your query would look up these descriptions in gender table and title
>> table and with the related codes access the persons table. These
>> queries are typical for a datawarehouse. In this case access on the
>> foreign key fields could be solved by the optimizer by first combining
>> the bitmap indexes (for which these indexes are extremely suited) and
>> with the resulting rowid's accessing the table itself.
>
>
>I understand this - my question about *_when_* it should
>be done remains.
>
>
>> In an OLTP environment the access is usually the other way around: you
>> insert/update a record in the persons table and the codes are checked
>> against the title table and gender table. For this you don't need any
>> indexes on the foreign key fields. And if you want your foreign key
>> fields indexed anyway (there are issues when updating the primary key
>> in a minor table), you probably never will do a combined search so a
>> normal index will do. Apart from this, the persons table will be
>> subject to many changes, for which bitmap indexes are not suited. Even
>> locking would no longer be row based!
>
>
>Why would you not do the index?
Bitmapped indices have their penalty as the entire key (in the index)
needs to be updated.
Apart from that: in this particular case, assuming a non-skewed
distribution, the index isn't going to be used in queries anyway.
When creating a new person
>for example, while they might change other attributes like their
>address, their gender is hardly likely to change! Nor is title,
>(apart from Ms. to Mrs., which is less and less nowadays anyway).
>
>
>As I understand it, Oracle doesn't automatically create an
>index against foreign key columns - there are RDBMS's
>that do.
>
>> If you use the database schema to enforce the key field values in the
>> persons table, you have to define the key fields as foreign keys and
>> the related fields in the minor tables as primary keys, which will
>> create an index for each primary key, explicitly or implicitly.
>
>Does it do this automatically?
If there is no index on that column it does.
Suppose you have a gender field
>which you don't want indexed, for example because it might
>be skewed?
If the column is skewed it is a *perfect* candidate for an index.
But then of course you could index only individual values by using a
function based index instead of a bitmap. Works like a charm.
>
>> If you use code to check if a key field value is allowed (which I
>> would strongly advice against),
>
>
>It would be OK in a trigger, surely?
>
That would be very inefficient as compared to a check constraint or a
foreign key.
>
>> you can choose if you want to create
>> an index on the minor tables. If the table is big, this is the smart
>> thing to do (so the optimizer can use the index for a lookup), if it
>> is small, who cares? So do it in any case, that way you don't have to
>> wreck your brains on this issue for every table.
>
>
>My question is what sort of index do you have on the minor
>tables - IOT or ordinary? My understanding is that it should
>be IOT - am I wrong?
>
>
It probably should be IOT.
>
>Paul...
>
>
>> Jaap.
Is the data a simple attribute or an entity? If the data is an entity,
then put it in its own table. If the data is a simple attribute, then a
check constraint is fine.
Take for example, a state. There are 50 of them in the US. To me, a
state is an entity in its own right. The state has a name (attribute
#1), and can have other attributes as well if you care to use them. The
only attribute you care about may be just the state name. Or maybe you
want the 2-letter abbreviation as well.
To me, GENDER is not an entity. It is an attribute of the person. Maybe
that's just my opinion....but I'd use a check constraint on the GENDER
attribute.
So entities deserve their own table. Attributes can be enforced with a
simple CHECK constraint. But then there may be some things that you may
have difficulty deciding if it is just an attribute or an entity on its
own......
HTH,
Brian
--
===================================================================
Brian Peasland
oracl...@nospam.peasland.net
http://www.peasland.net
Remove the "nospam." from the email address to email me.
"I can give it to you cheap, quick, and good.
Now pick two out of the three" - Unknown
That's an interesting question.
It reflects the fact that Oracle does not allow
you to define Domains - so tables are the
next best thing (although you might be able
to do something with user-defined types).
However, if you guarantee that a certain list
of values is the domain for just one column in
just one table then a constraint makes sense.
If you want to use the same domain for multiple
columns, then the only safe option is to make
it a table so that you can avoid the risk errors
in maintaining multiple copies of the same list
of values in different constraints.
As a table, the domain probably ought to be
a single column IOT. Of course, if you have
a 'translation' or 'description' for each
domain value, then it isn't really a domain,
it is an entity and ought to be represented
by a table anyway.
--
Regards
Jonathan Lewis
http://www.oracle.com/technology/community/oracle_ace/ace1.html#lewis
The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html
Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
>Or maybe you want the 2-letter abbreviation as well.
Funny how this can suddenly be a major attribute change.
http://zip4.usps.com/zip4/zcl_0_landing_state.htm
jg
--
@home.com is bogus.
Needed for robot wars:
http://news.yahoo.com/s/ap/20060525/ap_on_hi_te/honda_robot