Do you object to AutoNumber type for Primary Key?

Danny Lesandrini

unread,

Sep 1, 1999, 3:00:00 AM9/1/99

to

I just read Peter Vogel's editorial in this month's copy of SMART Access.
I'm new to the journal and to Peter Vogel's opinions.

He stated:
"If you've read some of my Working SQL columns, you know that I
object_strongly_to_generating_arbitrary_data to use as a key field (for
instance, AutoNumber fields). My feeling is that your primary keys should
always be based on real data in the table."

He went on to discuss an example where he helped a conference attendee strip
the AutoNumber field out of a table to make a relationship to a parent table
more direct.

I don't mean to start any sort of Primary Key religious argument, but I
would like to get a feeling from the NG about the selection of fields for
Primary Key.

Personally, I ALWAYS lean toward AutoNumber. I inherited one database where
the primary key of the primary table is FullName. This string value is
propagated throughout all child tables and is a key search field. The
database is a hog and takes forever to prepare some reports. After seeing
the results of such a poor decision, I would NEVER make that mistake myself.

That's my opinion, but I'm I could be misled. Let me know your views.

Danny Lesandrini
dlesa...@hotmail.com

Jim Gray

unread,

Sep 1, 1999, 3:00:00 AM9/1/99

to

This is a religous war.

I also read Smart Access, and really like it. However, they are not the
database gods they would like to believe they are. I have inherited
databases from some of the so called gods (not Smart Access's, but very
famous in the Access Community), and they SUCK. The gods that created my
stuff did not understand things like normalization, indexes, primary keys,
and so forth. The are however, friends of Microsoft, they are insiders, and
they have written books. In short, they have made a name for themselves.

I believe you will find people split pretty much evenly in both camps. I
don't really mind either method, except in the case of concatenated primary
keys. I am currently (this minute) struggling with an inherited database
that has a relationship that depends on three fields (among others).
tblDevice and tblDeviceModel. The Device table has all kinds of useful
information like Location, Department, and so forth. The DeviceModel table
identifies the characteristics, Name,ModelNumber,DeviceType, NumberPorts.
It takes the first three fields to come up with a unique key (according to
the users). The origianl developer did not like autonumber keys, so he
carried all three fields in both tables. Therefore, the DeviceModel table
has three fields requried for identification and one field that is useful
information. Doesn't make sense to me.

Another thing is the issue you mention about propagation through the child
tables. What about when the name changes, now you have to make sure that
you have some way of performing a cascading update through all of the child
tables. Access makes this easy, but SQL Server requires triggers to be
used. If you modify the database structure then you must also remember to
modify the trigger. Other products may require some sort of code to be
modified that isn't really obvious (Thats the way it was in the old days
working on mainframe products). I have seen changes implemented where these
updates were forgotten, causing data problems later on.

One database class that I attended suggested that the primary key should be
completely arbitrary, it should not have any meaning at all to anybody.
However, your table should still have a field or combination of fields that
can serve as a candidate key (possible primary) in order to be properly
normalized, and you should implement constraints to make sure that
uniqueness is enforced on the candidate key.

Just some thoughts from a dummy,
Jim Gray

Danny Lesandrini <data...@goldeninter.net> wrote in message
news:o6az3.150$916....@news.uswest.net...

Abbot Cooper

unread,

Sep 1, 1999, 3:00:00 AM9/1/99

to

Danny Lesandrini wrote in message ...

>I just read Peter Vogel's editorial in this month's copy of SMART Access.
>I'm new to the journal and to Peter Vogel's opinions.
>
>He stated:
>"If you've read some of my Working SQL columns, you know that I
>object_strongly_to_generating_arbitrary_data to use as a key field (for
>instance, AutoNumber fields). My feeling is that your primary keys should
>always be based on real data in the table."

I would disagree vehemently... I've been down that road before. I had a
patient tracking database that I wrote about two years ago (actually not
quite that long ago -- I'm embarrassed to say how recently it really was
<g>) which of course includes the possibility that a patient will receive
care more than once, and often times more than once in the same *day*. So, I
used the patient's ID# and the time of entry into the system as my primary
key. What a nightmare...

To anyone reading this, I implore you to never use a date/time field as a
primary key, it is a disaster and a waste of your time. AutoNumbers serve
the purpose of linking data from one table to another as a PK/FK
relationship. This is (IMNSHO) *entirely* a seperate matter from unique
indexes of fields, which is a question of enforcing business rules. Want to
make it so that each patient can only have one record for each day? Fine,
make a unique index on their patient#, but use the AutoNumber as your PK.
Otherwise, what happens when you later decide that you want to allow
multiple records for each patient day? Well, you've got to *entirely*
re-design your db in order to account for this because your FK's are all
linked back to something *other than an AutoNumber*!! If you had used the
AutoNumber in the first place you would only need to deal with those issues
related to the additional column of information (namely the new date/time
field) and set the unique index to include that extra field as well.

>He went on to discuss an example where he helped a conference attendee
strip
>the AutoNumber field out of a table to make a relationship to a parent
table
>more direct.
>
>I don't mean to start any sort of Primary Key religious argument, but I
>would like to get a feeling from the NG about the selection of fields for
>Primary Key.
>
>Personally, I ALWAYS lean toward AutoNumber. I inherited one database
where
>the primary key of the primary table is FullName. This string value is
>propagated throughout all child tables and is a key search field. The
>database is a hog and takes forever to prepare some reports. After seeing
>the results of such a poor decision, I would NEVER make that mistake
myself.

Agreed -- go with what you are doing now.

>That's my opinion, but I'm I could be misled. Let me know your views.

I don't think you are misled, but others may disagree. Stick to yer guns,
gather them thar' wagons! Let the PK/AutoNumber Wars commence!!!!
(again.....)

Seriously though, this *has* been discussed ad nauseum here... you can check
www.deja.com for a rousing thread about it, I think it was a year ago or so.
Nevermind -- here it is:
<URL:http://x22.deja.com/getdoc.xp?AN=441069196>

Another good thread was "AutoNumber Madness" from last October which got
into all sorts of yucky issues of which everyone should be mindful:
<URL:http://x22.deja.com/getdoc.xp?AN=402005407>

HTH.

--
Abbot Cooper
Web: http://www.cyriv.com
Email: cooper_N...@mediaone.net
Remove the "_NoSpam_" from above

George Shears

unread,

Sep 1, 1999, 3:00:00 AM9/1/99

to

Abbot Cooper <cooper_N...@mediaone.net> wrote in message
news:rsqetq...@news.supernews.com...

> Danny Lesandrini wrote in message ...

Recap -
Danny : What about those Autonumbers as Primary Keys?
Abbot: They can be a very good thing.

Abbot, you said it quite well. I can't add more. Just adding my thumbs-up
FWIW.

George

Steven

unread,

Sep 1, 1999, 3:00:00 AM9/1/99

to

Depends on the situation, and objectives. I would never use a name,
description, date/time as a primary key, or part of a primary key. Many
systems I use AutoNumber, some I use a Code/Id. Example

Security Portfolio System:

Transaction Table - many records - use autonumber.

Transaction Type - few types - use code (e.g. "sl" for sale). Code stored
in transaction table.

Reason - user will be designing own queries. Much simplier for them to use
a transaction code to filter, sort records; then have them join two tables
based on a system generate transaction type ID, and the filter on the joined
table. Given the few number of types, and that the types are static, it
makes more sense to break the "strick" rule of using autonumbers.

Steven

Danny Lesandrini wrote in message ...

[snip]

Tkdjr

unread,

Sep 2, 1999, 3:00:00 AM9/2/99

to

>Personally, I ALWAYS lean toward AutoNumber. I inherited one database where
>the primary key of the primary table is FullName. This string value is
>propagated throughout all child tables and is a key search field. The
>database is a hog and takes forever to prepare some reports. After seeing
>the results of such a poor decision, I would NEVER make that mistake myself.

>That's my opinion, but I'm I could be misled. Let me know your views.
>

Have been doing db design/implementation for 16 years and specifically access
for 5 years. Almost Always use autonumber as key field. It provides an
absolute, non duplicateable key. If other indexes are needed I build and use
those as necessary. Often the end user does not see the autonumbeer key.

Cheers
Tracy

peter...@my-deja.com

unread,

Sep 5, 1999, 3:00:00 AM9/5/99

to

Well, since it's my opinion that started this,
let me get involved.

The first thing to realize is that my objection
to an autonumber field is design based. In the
relational database theory, a table can not have
two identical records--if you do, then the table
is not a relation. That being the case, my choice
is to turn the fields that cause the records to
be unique into the primary key, rather than
introduce another another field. It seems to me
that the additional "made up" autonumber data
item is just being added to make the design work,
a bad reason to do anything.

Notice some of the results of this attitude: I
don't mind a record with three key fields and one
data field. It's not clear to me why the data
fields have to outnumber the key fields. Nor do I
see why records with fewer key fields are
inherently better than records with lots of key
fields. Heck, I've had some junction tables that
created many-to-many relationships between two
tables where the primary key consisted of all the
fields in the table. So, given that kind of
attitude towards multiple key fields, you can see
why I don't rush towards autonumber fields.

Here's some practical reasons for not using
autonumber fields:
- autonumber fields increase the accesses to the
database. Let us say that I want to add a sales
order header record and a sales order detail
record. If the primary key for the sales order is
just another piece of data, I can build both
records independently of each other (though
referential integrity may force me to add the
header before the detail). If the PK is an
autonumber field, then I have to add the order
header first. But then I have to retrieve the
order header record to find out what value was
generated for the primary key. Given that primary
key, I can complete the order detail record and
add it. Over a slow link like the internet that's
awkward.
- when I go into to rent a video the clerk behind
the counter always asks me my membership number.
Why? Because the primary key for the database is
some arbitrary number. I say that I don't know my
membership number, so they ask for my phone
number, and I get my video. Why don't we just use
my phone number and name as the primary key and
save a step when I want a video?
- people who use autonumber fields as, for
instance, the sales order number, frequently end
up writing a lot of roccoco code because they
want to avoid gaps in the series or start the
number at a specific point, or other issues. If
your primary key is real data, these problems
tend to disappear.

I think that arbitrary numbers, like the
autonumber field, made a lot of sense when we
were trying to make things easy on the computer
and didn't care to make it easy for the user. Now
that we have the computing power available, I
think that we should make life easy for the user.

Having said all that, Richard Campbell has (I
think) a reasonably compelling argument for using
autonumber fields. Richard's argument is that the
requirement for records to be unique doesn't
reflect anything about the real world. The
requirement for records to be unique, Richard
points out, is part of the relational database
theory. As a result, it shouldn't be surprising
that the real data doesn't meet our database
design needs and, so, we must generate data to
meet the requirements of the theory. My claim is
that relational database theory reflects the real
nature of data but I'm not sure how to
demonstrate that.

Finally, I think I muddied the waters in my
editorial. I felt that the problem with the
database wasn't so much that it used an
autonumber field (though I think that's a bad
idea) but that it didn't use a foreign key back
to the first table in the series.

Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.

rod_c...@my-deja.com

unread,

Sep 6, 1999, 3:00:00 AM9/6/99

to

Been quite a bit said, but here's a couple of extra thoughts.

Record deletion leaves gaps.

Migrating tables with autonumbers is a disaster. We found this out
during development of a project where a series of tables had to be
modified. Customer's data needed to be migrated to the new table
structures, but the Autonumber keys caused problems. Solution - used
ordinary number, and on the forms for the ID's used the dMax()+1 as the
default. Problems solved.

Fallon, Tom

unread,

Sep 6, 1999, 3:00:00 AM9/6/99

to

Rod

I agree with DMAX()+1.

If you include it as a default on a form it works a treat and you can
avoid having gaps in the numbers.

If you use AutoNum you get gaps if the user escapes out from data entry.
The next record is assigned a number 2 higher than the last actual
record - only way to avoid that is to compact/repair.

Tom

Doug Hutcheson

unread,

Sep 7, 1999, 3:00:00 AM9/7/99

to

Peter,
I use magic numbers only as a last resort and hate autonumbers.
Like you, I prefer indexing on real data.
I do, however, tend to restrict my keys to short fields, to avoid index
bloat - for instance, I NEVER include a name or phone number in a PK,
because it is inefficient in storage.
If the data in question cannot easily be differentiated without resorting to
long strings or decimal numbers, I then invent a number, referring back to a
table of 'next numbers' and updating as I go.
I hate autonumbers also because of the referential integrity problems if the
table is exported, appended or converted to replicable - all your relational
logic goes bye-bye in the new version/copy/replica.
Just my $0.02
Doug

--
Dev Ashish's FAQ & Help site: http://www.mvps.org/access/
Michael (michka) Kaplan's site http://www.trigeminal.com
Microsoft support: http://support.microsoft.com/support
News archive: www.deja.com.
------------------------------------------
I heed advice, but ignore attacks, so don't bother to flame.
------------------------------------------
peter...@my-deja.com wrote in message <7qutg1$jui$1...@nnrp1.deja.com>...

Danny Lesandrini

unread,

Sep 7, 1999, 3:00:00 AM9/7/99

to

Peter:

Thank you for responding.

It is impressive to me that I can ask a question about an editorial in a
magazine and get a response from the author. It gives me a feeling of
connection and it makes your opinion more credible, in as much as you are
willing to engage in an interactive discussion with peer-professionals.

I would also like to, once again, express my appreciation for this
newsgroup. It has become an important, daily resource for me that has
contributed to my development as an IT professional and I am grateful to all
contributors.

--

Danny Lesandrini
dlesa...@hotmail.com

<peter...@my-deja.com> wrote in message
news:7qutg1$jui$1...@nnrp1.deja.com...

George Shears

unread,

Sep 7, 1999, 3:00:00 AM9/7/99

to

<rod_c...@my-deja.com> wrote in message
news:7qv2hg$n3r$1...@nnrp1.deja.com...

> Been quite a bit said, but here's a couple of extra thoughts.
>
> Record deletion leaves gaps.

Autonumbers should never be thought of as having any "meaning" to the user.
Gaps are therefore irrelevant. Used properly, they avoid (a) bloat in the
subsidiary tables resulting from multiple field foreign keys and (b) the
necessity to implement obscure user generated ID Numbers or fragile code
generated (and still ultimately meaningless) artificial ID numbers.

> Migrating tables with autonumbers is a disaster. We found this out
> during development of a project where a series of tables had to be
> modified. Customer's data needed to be migrated to the new table
> structures, but the Autonumber keys caused problems. Solution - used
> ordinary number, and on the forms for the ID's used the dMax()+1 as the
> default. Problems solved.

I have done a number of the migrations you describe using autonumbers.
Never had a major problem. It _does_ need to be thought out in advance.

Is DMax() +1 working well for you? Is this a multiuser environment?

George

Danny Lesandrini

unread,

Sep 7, 1999, 3:00:00 AM9/7/99

to

Well said! I agree wholeheartedly.

Danny Lesandrini
dlesa...@hotmail.com

peter...@my-deja.com

unread,

Sep 8, 1999, 3:00:00 AM9/8/99

to

I haven't participated in this community for a number of years (other
than lurking) but, you're right, its always been a group more concerned
about doing good work than doing each other. Not a lot of ego, lot of
commitment to helping.

peter...@my-deja.com

unread,

Sep 8, 1999, 3:00:00 AM9/8/99

to

I tend to feel that the reason that relational theory works so well is
because it matches well to the real world of most data (e.g. it doesn't
work well with hierarchical data). In the same way that a geometry of
curved space does a great job of describing figures on a curved
surface, and simple arithmetic does well with groups of objects, I feel
that the relational theory is a close model for reality. As a result,
when I add surrogates and depart from the real data, I worry. Is there
really a mismatch between reality and the needs of the data model or
have I failed to understand the data?

peter...@my-deja.com

unread,

Sep 8, 1999, 3:00:00 AM9/8/99

to

Robin Stoddart-Stones

unread,

Sep 8, 1999, 3:00:00 AM9/8/99

to

I was always taught the formula K=DU3

A Key is data-independent, unique, unlimited and unchanging.

The problem with multiple field keys is that they often confuse the
intention, a unique entity, with the attributes (Name, address,
Phonenumber etc). When the attributes change, the Key changes even if
the entity has not; for example, change of name, address or phone
number, none of which change the fundamental identity). This violates
data-independence and unchangeability.

A standard autonumber is neither data-independent ( it relies on
number of records / deletions / compactions) nor unchanging. It can be
altered by deleting the data and restoring from text backup thus
changing existing information even though it purports to be identical

My preferred method is to create a 'permanent' field to contain the
key with a synchronisation routine to check for data integrity. I
have used the synchronised autonumber (for convenience), the DMAX
function where the autonumber is likely to need frequent,
re-synchronisation and a NextValue table where Dmax is slow or not
usable and multiple keys only where the attributes are unchanging. ( I
got caught on partnumbers once, where the part can have multiple
sources, same part, different manufacturers' part numbers.)

Have Fun
Robin

On Wed, 1 Sep 1999 07:43:30 -0600, "Danny Lesandrini"
<data...@goldeninter.net> wrote:

>I just read Peter Vogel's editorial in this month's copy of SMART Access.
>I'm new to the journal and to Peter Vogel's opinions.
>
>He stated:
>"If you've read some of my Working SQL columns, you know that I
>object_strongly_to_generating_arbitrary_data to use as a key field (for
>instance, AutoNumber fields). My feeling is that your primary keys should
>always be based on real data in the table."
>

>Personally, I ALWAYS lean toward AutoNumber. I inherited one database where

>the primary key of the primary table is FullName. This string value is
>propagated throughout all child tables and is a key search field. The
>database is a hog and takes forever to prepare some reports. After seeing
>the results of such a poor decision, I would NEVER make that mistake myself.
>
>That's my opinion, but I'm I could be misled. Let me know your views.
>

>Danny Lesandrini
>dlesa...@hotmail.com
>
>
>
I

Radu Lascae

unread,

Sep 8, 1999, 3:00:00 AM9/8/99

to

Not having read the editorial that began this thread, and missing the
original message, I can't judge whether you failed to understand the
data.

What I strongly believe is that each and everyone of us usually fails to
understand the _reality_. Without trying to be philosophycal here, what
my opinion boils down to is this: given the fact that no model can match
reality exactly, and also the available technology - which, in effect,
provides the means for modelling - there may be occasions when finding a
natural key for a given set of data is impossible. If it is not a
limitation of the mathematical model - of which I don't know enough, I
don't have Cobb's book next to my bed <g> - then it is a limitation of
the modelling tool.

An example: the maximum number of fields in a Jet defined index is 10.
If it's not 10, there will always be some arbitrary number. If the
natural key requires more then 10 fields, a surogate is necessary.
Whether the key in this case should be an Autonumber or something else
is not relevant to the issue at hand. And I'm not going into the issues
of speed or handling such indeces here.

Thanks for an interesting thread,
Radu Lascae

Abbot Cooper

unread,

Sep 8, 1999, 3:00:00 AM9/8/99

to

PaulvG wrote in message <7r644v$j02$1...@zonnetje.nl.uu.net>...
>
>E.g. we can ask ourselves: is this desk I'm sitting at really the
>same desk that I left some minutes ago when I went for a coffee?

Reminds me of a debate in a philosophy class I had some years back. Someone
opined that I could not say with absolute certainty that there really was a
blizzard outside (there was). I insisted that I did indeed know for a fact
that snow was outside our window at that very moment, and that it was not
something else other than snow. This led to an interesting existential
argument, until I offered to go outside and bring back some of the erstwhile
substance being debated and throw it into the face of anyone who wanted to
insist that it was not *really* snow... Not surprisingly, the
existentialists came rapidly crashing back to earth.

--
Abbot Cooper
Cyber-River Solutions: The knowledge, experience, and ingenuity
to meet your unique business needs. http://www.cyriv.com
Email: cooper_N...@mediaone.net

peter...@my-deja.com

unread,

Sep 8, 1999, 3:00:00 AM9/8/99

to

Another good case for an arbitrary key, though not an autonumber field.
I've always assumed that the drive for an unchanging key was a
practical one: Prior to database vendors implementing Cascade Update,
keeping foriegn keys in sync when a primary key changed was a royal
pain. With the advent of Cascade Update (or triggers), I had assumed
that the need for an unchanging key went away.

Your approach is that there is an entity out there and data about that
entity shows up in several tables. As a result you need a surrogate--
the single, unchanging key--to represent that entity and tie all of the
information about it together.

You post made me realize that my approach is that the database
represents a web of data, the relationships between the items giving
rise to information. For me, a database is a world unto itself that has
useful references back to the world that we live in. As a result, I
don't think of the data as representing entities: For me, its just
data. This probably accounts for my dislike for entity relationship
modeling as opposed to other methods like FORML.

peter...@my-deja.com

unread,

Sep 9, 1999, 3:00:00 AM9/9/99

to

It's interesting, I find, that all discussions of databases (or maybe
it's just the ones that I get involved with) end up bouncing back and
forth between big issues and little issues. I teach a relational
database course for Learning Tree and whenever we do the course the
class always end up having to discuss:
- the business problem (little)
- the social and political structure of the company (big)
- technical performance issues (little)
- the structure of reality (big)
I think that's why I like database design so much: It's all about
everything.

Ian Burns

unread,

Sep 9, 1999, 3:00:00 AM9/9/99

to

Abbot

This reminds me of the story about Dr Samuel Johnson who, sitting in a coffee
house in 18th century London with some cronies, was discussing the
substantiality or otherwise of things. His pals argued that the table only
existed in their heads, but old Sam got up and kicked the table saying 'I
refute it thus!'.

History doesn't relate whether the table had an autonumber primary key.

Ian

In article <rtdbjl...@news.supernews.com>,

"Abbot Cooper" <cooper_N...@mediaone.net> wrote:
> PaulvG wrote in message <7r644v$j02$1...@zonnetje.nl.uu.net>...
> >
> >E.g. we can ask ourselves: is this desk I'm sitting at really the
> >same desk that I left some minutes ago when I went for a coffee?
>
> Reminds me of a debate in a philosophy class I had some years back. Someone
> opined that I could not say with absolute certainty that there really was a
> blizzard outside (there was). I insisted that I did indeed know for a fact
> that snow was outside our window at that very moment, and that it was not
> something else other than snow. This led to an interesting existential
> argument, until I offered to go outside and bring back some of the erstwhile
> substance being debated and throw it into the face of anyone who wanted to
> insist that it was not *really* snow... Not surprisingly, the
> existentialists came rapidly crashing back to earth.
>
> --
> Abbot Cooper

Abbot Cooper

unread,

Sep 9, 1999, 3:00:00 AM9/9/99

to

PaulvG wrote in message <7r7vib$ljv$1...@zonnetje.nl.uu.net>...
>Abbot Cooper wrote in message ...

>>PaulvG wrote in message <7r644v$j02$1...@zonnetje.nl.uu.net>...
>>>
>>>E.g. we can ask ourselves: is this desk I'm sitting at really the
>>>same desk that I left some minutes ago when I went for a coffee?
>>
>>Reminds me of a debate in a philosophy class I had some years back.
Someone
>>opined that I could not say with absolute certainty that there really was
a
>>blizzard outside (there was). I insisted that I did indeed know for a fact
>>that snow was outside our window at that very moment, and that it was not
>>something else other than snow. This led to an interesting existential
>>argument, until I offered to go outside and bring back some of the
erstwhile
>>substance being debated and throw it into the face of anyone who wanted to
>>insist that it was not *really* snow... Not surprisingly, the
>>existentialists came rapidly crashing back to earth.
>

>Sorry for expressing myself so poorly, that's just the way it
>is. I definitely didn't mean this problem, which is hardly
>related to the matter.The one I meant is known as the labeling
>problem.

I don't think you expressed yourself poorly at all. Yes, this was the
argument being made in the class: whether it should be labeled snow. In
other words, is this snow as we know it or is it something else. "It's snow,
trust me..." was my reaction. "Yes, it's the same desk -- I'm pretty darn
sure". But we digress...

>But let's skip metaphysics and let's look into the Patient
>entity in a hospital. What are candidate keys here?

There aren't any.

>Keep in mind that people get born, die, marry, move and lie.
>Sometimes items like SSN, name, age and sex are indecisive or
>loose. There's no way around some kind of surrogate.

*Many* of our patients are foreigners (including many illegal aliens I am
sure) and as such do not have SSN's. We also get many Jane/John Doe's who
are mystery people from some horrible accident and we don't know *anything*
about them save for their sex.

>AFAICS this is exactly what hospital information systems do.
>I suspect the patient# in the example you posted in this thread
>is also a surrogate, although it may be based on some
>provisional patient data.

Bottom line -- you make a surrogate and that's that. You fill in the related
information as best you can. It is of course not beyond the bounds of reason
to expect that some people may wind up with multiple surrogates, but that is
life in a large operation.

>Even more compelling arguments for surrogate keys can be made
>in more technical areas like dimensional modeling, data
>warehousing and OLAP.Is there really an alternative to
>surrogates for PK's in dimension tables?
>
>BTW, I support Peter Vogel's position that users shouldn't be
>bothered with surrogates, if possible.

Agreed - the surrogate is generally not seen by the user, although the
patient ID example above is an exception. It is the only feasible way to
refer to a patient for recordkeeping/internal communication purposes
(although obviously the patient is always referred to as Mr./Ms. FooFoo in
their presence in order to be polite).

Abbot Cooper

unread,

Sep 9, 1999, 3:00:00 AM9/9/99

to

Ian Burns wrote in message <7r8793$7j6$1...@nnrp1.deja.com>...

>Abbot
>
>This reminds me of the story about Dr Samuel Johnson who, sitting in a
coffee
>house in 18th century London with some cronies, was discussing the
>substantiality or otherwise of things. His pals argued that the table only
>existed in their heads, but old Sam got up and kicked the table saying 'I
>refute it thus!'.
>
>History doesn't relate whether the table had an autonumber primary key.

Another story comes to mind:

A young Vietnamese woman in Saigon in the 60's came home to the countryside
to explain to her elderly mother that she needed a birth certificate or some
other documentation of her birth in order to obtain papers to work on a U.S.
military base. The mother's response (paraphrasing of course):

"That's the stupidest think I ever heard 'needing proof that you were born'.
If they doubt that you were born, poke them in the eye with a sharp stick.
What more proof do they need?"

peter...@my-deja.com

unread,

Sep 9, 1999, 3:00:00 AM9/9/99

to

I can certainly see a need for surrogates (though not autonumbers) and,
as Paul suggests, we're on a continuum here from "only as a last
resort" on my side to "a very useful tool" on the other. This thread
(like many others in this group) isn't so much about arguing as
clarifying and understanding what we each think--even when we disagree.

Just to be clear on how far out on the continuum I am, I would prefer
not to use a surrogate even in the patient system. I was involved in a
similair situation with a system that tracked criminals who,
surprisingly enough, don't give their real name when arrested (or
sometimes give no name at all). So, a preliminary name is assigned
until the real name reveals itself. When the real name shows up,
Cascade Update (actually, triggers since the DBMS didn't support
Cascade Update) took care of updating all the foreign keys and allowed
this particular arrest to be tied in with the offenders other arrests.
So, we avoided a surrogate on the offender record (name, birthdate, and
date of first arrest, in that order, was our key: should we ever have
two offenders with the same name and birthdate and first arrest, we're
baked).

Of course, there is a problem of how to refer to this arrest. While an
arrest might start life as "Officer Friendly's" arrest of "Huckleberry
Hound" on "September 7, 1999", that data might change. We also had a
counter field to indicate how many times that particular person had
been arrested by that officer on that day (I did a search for the
highest value this achieved and turned up 7 on one offender: this guy
was either really stupid or really unlucky).

Anyway, let's say the offender's name turns out not to be Huckleberry
Hound, we find out the real name, and correct the arrest record. The
record now becomes "Officer Friendly's" arrest of "Jim Jones"
on "September 7, 1999". If you weren't aware of the offender's name
change how would you find the case again? I mean, even the officer and
date of arrest could be changed if that information had been entered in
error. So, we assigned each arrest a case number that wouldn't change
as the information in the case changed. People would always be able to
find their way back to the arrest using the case number.

Interesting enough, I added some instrumentation to see how the case
record was accessed. Every once in a while, someone would type in a
case number and go directly to the case. More often, someone would list
all the cases for an officer or day or offender and, from that list, go
to the case they wanted. I've often wondered what would have happened
if we had eliminated the case number and just used Officer, Offender,
Date, Counter as our primary key.

Abbot Cooper

unread,

Sep 9, 1999, 3:00:00 AM9/9/99

to

peter...@my-deja.com wrote in message <7r8mv1$jnn$1...@nnrp1.deja.com>...

>I can certainly see a need for surrogates (though not autonumbers) and,
>as Paul suggests, we're on a continuum here from "only as a last
>resort" on my side to "a very useful tool" on the other. This thread
>(like many others in this group) isn't so much about arguing as
>clarifying and understanding what we each think--even when we disagree.

Amen to that...

>Just to be clear on how far out on the continuum I am, I would prefer
>not to use a surrogate even in the patient system.

Well, when you are dealing with a hospital system which is nearly 190 years
old now (obviously not all that data is in the _current_ system <bg>) and
which counts patient visits in the *millions* per year, deciding between
surrogates vs. composite keys quickly becomes an easy decision!!

<SNIP>

>Of course, there is a problem of how to refer to this arrest. While an
>arrest might start life as "Officer Friendly's" arrest of "Huckleberry
>Hound" on "September 7, 1999", that data might change. We also had a
>counter field to indicate how many times that particular person had
>been arrested by that officer on that day (I did a search for the
>highest value this achieved and turned up 7 on one offender: this guy
>was either really stupid or really unlucky).

ROFLMAO.

George Shears

unread,

Sep 9, 1999, 3:00:00 AM9/9/99

to

Something about your reply reminded me of a fellow I did some work for in
the 80's. Name was Jim Smith. Every year (or was it every other year??)
Jim would attend the National (or was it International??) "Jim Smith
Convention". Yep. Thousands (or was it Hundreds??) of Jim Smiths
descending upon some hapless community. Makes ya think. :)

George

<peter...@my-deja.com> wrote in message
news:7r8mv1$jnn$1...@nnrp1.deja.com...

<snip>

> should we ever have
> two offenders with the same name and birthdate and first arrest, we're
> baked).

<snip>

Robin Stoddart-Stones

unread,

Sep 10, 1999, 3:00:00 AM9/10/99

to

The truth lies somewhere between the absolutes at either end.

Part of the philosophy of K=DU3 is that you have to look at the
essential nature of what you are doing. If changing the attributes
produces changes in the entity recorded, not reflected in the actual
entity, then your model is not reflecting the independent status
correctly.

You can tie yourself up with semantics as well. If the original entity
is unique then to describe that entity through its attributes is to
create the surrogate. Even the word entity can vary from the
particular object, my usage in this context, to the class of object
which appears to be yours. Both are correct in the context of this
discussion,

I describe you as Peter Vogel, a unique entity. I don't describe you
as the male that lives on the hill with the yellow hair and the
Corvette. ( Sorry if any of these details are correct :-) )

Reality also employs both data and relationship analysis. That someone
is popular is an evaluation of the one to many relationship of that
entity to many others, with the attribute on the relationship.

However few ( if any) databases make the relationship available for
analysis. One tends to make junction tables to embody the explicit
relationship. In the same way a record is only the expression of a
permanent relationship between entities.

Have Fun
Robin

m_a...@my-deja.com

unread,

Sep 10, 1999, 3:00:00 AM9/10/99

to

Peter:
I now know why I do this work. Thanks.
Moshe
In article <7r6tdo$atv$1...@nnrp1.deja.com>,

peter...@my-deja.com wrote:
> It's interesting, I find, that all discussions
of databases (or maybe
> it's just the ones that I get involved with)
end up bouncing back and
> forth between big issues and little issues. I
teach a relational
> database course for Learning Tree and whenever
we do the course the
> class always end up having to discuss:
> - the business problem (little)
> - the social and political structure of the
company (big)
> - technical performance issues (little)
> - the structure of reality (big)
> I think that's why I like database design so
much: It's all about
> everything.
>

peter...@my-deja.com

unread,

Sep 11, 1999, 3:00:00 AM9/11/99

to

Whether an entity has an existance or is just an acclomeration of
attributes (all of which, potentially, could change over time) is the
real essence of the existential dilemma that's cropped in one of the
discussions of this thread. If the table is real, then it has an
existance independent of anyone sensing its attributes. On the other
hand, if the table is just a collection of sensory attributes (which is
all that Dr. Johnson proved) and all of those attributes can change
(size, color, shape), where's the table?

Not a theoretical issue, either. I started work on a system to track
valves for a refinery. Originally, I had the impression that every
valve had a brass tag welded to it that identified valve. Well, it
turned out that the valves were regularly opened up and the components
inside swapped with other valves or parts out of the warehouse. The
valve "shell" (with the tag) could also be swapped out and a new shell
wrapped around the components. A valve was really a collection of spare
parts (attributes) all located in the same physical space. And, by the
way, the whole valve could be picked up and moved somewhere else so
even the physical space could change. There really was no "valve"
there. Fortunately, the project got cancelled. A real life example of
the "grandfather's axe" problem: the axe hung over the fireplace
is "grandfather's axe", though the handle has been changed three times
and the head twice since grandfather died.

peter...@my-deja.com

unread,

Sep 11, 1999, 3:00:00 AM9/11/99

to

Fabulous thread.

Mark Clements

unread,

Sep 12, 1999, 3:00:00 AM9/12/99

to

I have been working in a hospital (although on a much smaller scale) dealing
with patients in a local psycotherapy department. I too had the problem
that the patients may have several names (due to marriage, alias, nickname
etc.). I found the only real way to ensure that they could be located
adequately by the users was to store the current 'default' name (ie the one
used in all correspondance, reports etc.) in the main patient record, but
also store old or alternative names in a separate table. Thus when a user
searches for a patient using an out of date name the correct record can
still be found.

Mark Clements

<peter...@my-deja.com> wrote in message
news:7r8mv1$jnn$1...@nnrp1.deja.com...

> I can certainly see a need for surrogates (though not autonumbers) and,
> as Paul suggests, we're on a continuum here from "only as a last
> resort" on my side to "a very useful tool" on the other. This thread
> (like many others in this group) isn't so much about arguing as
> clarifying and understanding what we each think--even when we disagree.
>

> Just to be clear on how far out on the continuum I am, I would prefer

> not to use a surrogate even in the patient system. I was involved in a
> similair situation with a system that tracked criminals who,
> surprisingly enough, don't give their real name when arrested (or
> sometimes give no name at all). So, a preliminary name is assigned
> until the real name reveals itself. When the real name shows up,
> Cascade Update (actually, triggers since the DBMS didn't support
> Cascade Update) took care of updating all the foreign keys and allowed
> this particular arrest to be tied in with the offenders other arrests.
> So, we avoided a surrogate on the offender record (name, birthdate, and

> date of first arrest, in that order, was our key: should we ever have

> two offenders with the same name and birthdate and first arrest, we're
> baked).
>

> Of course, there is a problem of how to refer to this arrest. While an
> arrest might start life as "Officer Friendly's" arrest of "Huckleberry
> Hound" on "September 7, 1999", that data might change. We also had a
> counter field to indicate how many times that particular person had
> been arrested by that officer on that day (I did a search for the
> highest value this achieved and turned up 7 on one offender: this guy
> was either really stupid or really unlucky).
>

> Anyway, let's say the offender's name turns out not to be Huckleberry
> Hound, we find out the real name, and correct the arrest record. The
> record now becomes "Officer Friendly's" arrest of "Jim Jones"
> on "September 7, 1999". If you weren't aware of the offender's name
> change how would you find the case again? I mean, even the officer and
> date of arrest could be changed if that information had been entered in
> error. So, we assigned each arrest a case number that wouldn't change
> as the information in the case changed. People would always be able to
> find their way back to the arrest using the case number.
>
> Interesting enough, I added some instrumentation to see how the case
> record was accessed. Every once in a while, someone would type in a
> case number and go directly to the case. More often, someone would list
> all the cases for an officer or day or offender and, from that list, go
> to the case they wanted. I've often wondered what would have happened
> if we had eliminated the case number and just used Officer, Offender,
> Date, Counter as our primary key.
>
>

rod_c...@my-deja.com

unread,

Sep 12, 1999, 3:00:00 AM9/12/99

to

In article <7r3ck4$3tv0$1...@newssvr04-int.news.prodigy.com>,

"George Shears" <gjsh...@prodigy.net> wrote:
> <rod_c...@my-deja.com> wrote in message
> news:7qv2hg$n3r$1...@nnrp1.deja.com...
> > Been quite a bit said, but here's a couple of extra thoughts.
> >

> > Record deletion leaves gaps.
>
> Autonumbers should never be thought of as having any "meaning" to the
user.
> Gaps are therefore irrelevant. Used properly, they avoid (a) bloat
in the subsidiary tables resulting from multiple field foreign keys and
(b) the necessity to implement obscure user generated ID Numbers or
fragile code generated (and still ultimately meaningless) artificial ID
numbers.

True, autonumbers should have precicely NO MEANING at all. i believe
they should be used solely for "internal tracking" - i.e. they are
never seen by the user. However, when I am developing a database, I
like everything to have some semblance of a "meaningful relationship"
[in the plutonic database way :-)] so that it is easier forme to
understand the data

>
> > Migrating tables with autonumbers is a disaster. We found this out
> > during development of a project where a series of tables had to be
> > modified. Customer's data needed to be migrated to the new table
> > structures, but the Autonumber keys caused problems. Solution -
used
> > ordinary number, and on the forms for the ID's used the dMax()+1 as
the
> > default. Problems solved.
>
> I have done a number of the migrations you describe using autonumbers.
> Never had a major problem. It _does_ need to be thought out in
advance.
>
> Is DMax() +1 working well for you? Is this a multiuser environment?
>
> George
>

> Yes, but the problems occur when migrating different tables which
have used Autonumbers as IDs. Access renumbers the autonumber fields
and hence all our relationships were gone. What a disater that was:
hence I shy away from Autonumbers.

Most of our clients like to see the ID number continued from their
previous systems (manual or computerised). A typicla ID is, for
example, D99-145, where the subsequent ID is D99-146. We use this
formula on the forms to creat ID's:

=DMax("[NextID]","qryNextID")

where the query is:

SELECT "D" & Max(Year(Date()) & "-" & Format((Val(Right
([EnquirerID],4)))+1,"0000")) AS NextID
FROM tblEnquirer
GROUP BY Val(Mid([EnquirerID],2,4))
HAVING (((Val(Mid([EnquirerID],2,4)))=Year(Date())));

We had to concer the year (99) to the full year (1999) so that next
year (2000) will still be in sequence.

Regards,

Rod

rod_c...@my-deja.com

unread,

Sep 12, 1999, 3:00:00 AM9/12/99

to

In article <6YaB3.481$IP5....@news.uswest.net>,

"Danny Lesandrini" <dlesa...@hotmail.com> wrote:
> Well said! I agree wholeheartedly.
>
> Danny Lesandrini
> dlesa...@hotmail.com
>

You agree with what??? or whom???

Read my response to George.

rod_c...@my-deja.com

unread,

Sep 12, 1999, 3:00:00 AM9/12/99

to

In article <37d4e1b4...@news.btinternet.com>,

rst...@gmsproject.win-uk.net wrote:
> I have used the synchronised autonumber (for convenience), the DMAX
> function where the autonumber is likely to need frequent,
> re-synchronisation and a NextValue table where Dmax is slow or not
> usable and multiple keys only where the attributes are unchanging.

OK I'm ignorant of this. So what's the "NextValue table" all about?

Robin Stoddart-Stones

unread,

Sep 12, 1999, 3:00:00 AM9/12/99

to

The NextValue table is a single record table with a field to represent
every indepedent key that you need to generate.

You program a procedure (for a particular table) to open the nextValue
table , add 1 to the particular field for that key, return the value
and close the table.

Because the recordset is small and the operation is very brief the
procedure can check if the table is open , delay and restrike quite
quickly, so conflict time is reduced. All it does is issue sequential
unique IDs (sort of DMax without the big table overhead).

If you have multiusers, you can use unbound transactions and generate
unique keys fairly safely for use in transactions.

While it reduces open and locktimes in generation of keys, it doesn't
stop other user problems like two people entering the same record,
updating the same record or deleting the same record simultaneously!

Have Fun
Robin

Robin Stoddart-Stones

unread,

Sep 12, 1999, 3:00:00 AM9/12/99

to

I think we are agreed, there are occasions when the changing nature of
the entity warrants a foreign key and cases where the essentially
unchanging nature of the entity demands a unique key.

One client I worked for was generating foreign keys for milestones for
a major project. The key was based on 2 # for the performing
authority, 2# for the signatory authority 2# for the accepting
authority and a numeric for the position of the milestone on the
performing authorities milestone chart. (which would change as each
performing authority had multiple projects, multiple milestones of
which this was one)

At the milestone review ( over 1700 of these in the project) I was
appalled / convulsed to find the discussion went like this:

We are now on page 6 item 7, APADFS21, Finish out Terminal, which last
month was CPADFS21 (page 5 item 17) , construct Terminal. Progress?

We are 90% finished and expect to finish next week. That's 2 weeks
ahead of schedule. We are handing the next stage to department FS.

Department FS comment?

We will take over the Milestone which will become FSDFTR17 Set to work
Terminal. We anticipate that the setting to work will have problems,
so we are extending the duration by 4 weeks to 16 weeks. This will
give us an end date for the milestone of 6 July next year. We estimate
that it will be Page 9 item 8 in the report.

Nothing I could do could convince these people that
1) They were talking about a sequence of activities, not a single
milestone ( The milestone was the date at the end, hence 16 weeks gave
a date in the middle of next year)
2) That for comparison purposes, you should not have to keep track of
the page number /item (last month) in order to compare elements of the
same thing.
3) That the achievement they were talking about was a constant unique
item.

And all because someone had persuaded them that changeable foreign
keys was the way to go!

(It took 18 months of persuasion to get the system changed, and the
moment I left the project the permanent employee changed it all back.
I'd failed to notice the real purpose of the system...clearing
yardarms)

have fun
Robin

Cathy Morgan

unread,

Sep 12, 1999, 3:00:00 AM9/12/99

to

As a new Access learner, I'd appreciate an explanation of what it means to
"migrate tables." Can't find it in Help or any of the books I have. Is
this something I'm likely to want to do? (I'm making a database for my own
single PC, single user business.)

George Shears

unread,

Sep 12, 1999, 3:00:00 AM9/12/99

to

Responses Interspersed.

rod_c...@my-deja.com wrote in message <7rfa14$7sd$1...@nnrp1.deja.com>...
->In article <7r3ck4$3tv0$1...@newssvr04-int.news.prodigy.com>,
-> "George Shears" <gjsh...@prodigy.net> wrote:
->> <rod_c...@my-deja.com> wrote in message
->> news:7qv2hg$n3r$1...@nnrp1.deja.com...
->> > Been quite a bit said, but here's a couple of extra thoughts.
->> >
->> > Record deletion leaves gaps.
->>
->> Autonumbers should never be thought of as having any "meaning" to the
->user.
->> Gaps are therefore irrelevant. Used properly, they avoid (a) bloat
->in the subsidiary tables resulting from multiple field foreign keys and
->(b) the necessity to implement obscure user generated ID Numbers or
->fragile code generated (and still ultimately meaningless) artificial ID
->numbers.
->
->True, autonumbers should have precicely NO MEANING at all. i believe
->they should be used solely for "internal tracking" - i.e. they are
->never seen by the user. However, when I am developing a database, I
->like everything to have some semblance of a "meaningful relationship"
->[in the plutonic database way :-)] so that it is easier forme to
->understand the data

->

Not a problem. If it helps, do it. Didn't mean to say that AutoNumber
"should" be used. Only that there are plenty of times when it could.

->>
->> > Migrating tables with autonumbers is a disaster. We found this out
->> > during development of a project where a series of tables had to be
->> > modified. Customer's data needed to be migrated to the new table
->> > structures, but the Autonumber keys caused problems. Solution -
->used
->> > ordinary number, and on the forms for the ID's used the dMax()+1 as
->the
->> > default. Problems solved.
->>
->> I have done a number of the migrations you describe using autonumbers.
->> Never had a major problem. It _does_ need to be thought out in
->advance.
->>
->> Is DMax() +1 working well for you? Is this a multiuser environment?
->>
->> George
->>
-> Yes, but the problems occur when migrating different tables which
->have used Autonumbers as IDs. Access renumbers the autonumber fields
->and hence all our relationships were gone. What a disater that was:
->hence I shy away from Autonumbers.

->

You can append to an Autonumber field and preserve the original values. For
that matter, if the Autonumber Field is not the PK (or at least
Indexed/Unique) you can even append Duplicates into the Autonumber field.
Now merging different, but similarly structured, data sets (i.e. individual
employee contact DBs into a central company wide Contact DB) is more complex
but the issue of duplicate entries (i.e. individual contacts) essentially
overshadows any PK issues.

->Most of our clients like to see the ID number continued from their
->previous systems (manual or computerised). A typicla ID is, for
->example, D99-145, where the subsequent ID is D99-146.

No objection there whatsoever. Whether a client desire or your own instinct
that this approach makes the DB more usable, this is a fine approach.

->We use this formula on the forms to creat ID's:

->
->=DMax("[NextID]","qryNextID")
->
->where the query is:
->
->
->SELECT "D" & Max(Year(Date()) & "-" & Format((Val(Right
->([EnquirerID],4)))+1,"0000")) AS NextID
->FROM tblEnquirer
->GROUP BY Val(Mid([EnquirerID],2,4))
->HAVING (((Val(Mid([EnquirerID],2,4)))=Year(Date())));
->
->
->We had to concer the year (99) to the full year (1999) so that next
->year (2000) will still be in sequence.

The only real concern here is the possibility of collisions when 2 users are
adding a record at or about the same time. Both Arvin Meyer and Henry
Craven (among others) have posted some really good comments and approaches
on this in the past (and I think possibly in connection with this thread).

George

rod_c...@my-deja.com

unread,

Sep 13, 1999, 3:00:00 AM9/13/99

to

In article <7rg310$6bn$1...@news.cyberhighway.net>,

Migration is simply moving data from one table to another. For
example, if a table structure has to be changed during development,
migrate the data from the old table to the new. This usually happens
when development takes place off-site, and there are either many
changes or lots of data.

That's all. The only problem we've ever encountered is with auto-
number fields.

Regards,

Rod

rod_c...@my-deja.com

unread,

Sep 13, 1999, 3:00:00 AM9/13/99

to

In article <7rgds4$2g6k$1...@newssvr03-int.news.prodigy.com>,
"George Shears" <gjsh...@prodigy.net> wrote:
> Responses Interspersed.

>
>
> You can append to an Autonumber field and preserve the original
values. For
> that matter, if the Autonumber Field is not the PK (or at least
> Indexed/Unique) you can even append Duplicates into the Autonumber
field.
> Now merging different, but similarly structured, data sets (i.e.
individual
> employee contact DBs into a central company wide Contact DB) is more
complex
> but the issue of duplicate entries (i.e. individual contacts)
essentially
> overshadows any PK issues.

OK - accepted, but the cross-table issue still exists.

> The only real concern here is the possibility of collisions when 2
users are
> adding a record at or about the same time. Both Arvin Meyer and Henry
> Craven (among others) have posted some really good comments and
approaches
> on this in the past (and I think possibly in connection with this
thread).
>
> George
>

Thanks. Currently single user, but will have to reconsider whenever
going multi-user. BTW, I like the option from Robin Stoddart-Stones.

Regards,

ROD

Chris Georgiou - TZOTZIOY

unread,

Sep 13, 1999, 3:00:00 AM9/13/99

to

On Sun, 12 Sep 1999 08:45:12 GMT, rumours say that
rst...@gmsproject.win-uk.net (Robin Stoddart-Stones) might have
written:

>The NextValue table is a single record table with a field to represent
>every indepedent key that you need to generate.

Hm... no need to be a single row table; why not be a table with one row
for every counter you need? Wouldn't it be easier to add counters
should the need be? Table structure something like CounterID,
CounterValue

>You program a procedure (for a particular table) to open the nextValue
>table , add 1 to the particular field for that key, return the value
>and close the table.

Having a multiple row table, simplifies the function; just pass it a
CounterID as an argument, you get a CounterValue for result.
--
I am Greeks and I speak England very best,
TZOTZIOY, ICQ# 13397953
(when e-mailing delete the 'deletethispart' part)

Doug Hutcheson

unread,

Sep 14, 1999, 3:00:00 AM9/14/99

to

Chris,
This works well, but as concurrency increases, so does that chance of user A
locking the record for id1, thus locking the page when user B tries to get
id 2.
Not serious, but worth remembering.
I actually use distinct tables for each of the incrementable keys I maintain
this way, to avoid even that remote possibility.
Cheers,
Doug

--
Dev Ashish's FAQ & Help site: http://www.mvps.org/access/
Michael (michka) Kaplan's site http://www.trigeminal.com
Microsoft support: http://support.microsoft.com/support
News archive: www.deja.com.
------------------------------------------
I heed advice, but ignore attacks, so don't bother to flame.
------------------------------------------
Chris Georgiou - TZOTZIOY wrote in message
<6g7dN35LEsG4zZ...@4ax.com>...