should entities in a database be named using singular or plural

metaperl

unread,

Aug 14, 2009, 8:37:20 AM8/14/09

to

Is there a good *reason* why entities in a database should be named
using singular instead of plural?

I can certainly see why columns should be singular, but I'm not so
sure about entities.

For example, in this schema:
http://github.com/metaperl/DBIx--Shootout/blob/555a5dc775690fd55a0b9d6237d2f3e2ec8905b4/lib/DBIx/Shootout/GMAX/schema.pdf

we see that the database designer chose plural names for entities and
singular names for columns.

A similar issue exists in programming languages. You dont have an
array of integers. You have an array in which each element is an
integer.

I guess it all starts with "wrong english" - a bucket of apples is
really a bucket in which each element is an apple.

Terrence Brannon

unread,

Aug 14, 2009, 8:48:19 AM8/14/09

to

On Aug 14, 8:37 am, metaperl <metap...@gmail.com> wrote:
> Is there a good *reason* why entities in a database should be named
> using singular instead of plural?

Microsoft seems to think there is not: http://msdn.microsoft.com/en-us/library/dd129510(VS.85).aspx

"""
Tables contain collections of entities and should have a plural name
to reflect this. For example, a table of addresses should be named
AddressesTable rather than AddressTable.
"""

Walter Mitty

unread,

Aug 14, 2009, 10:37:36 AM8/14/09

to

"metaperl" <meta...@gmail.com> wrote in message
news:7bc8bd4c-18e7-47c8...@c29g2000yqd.googlegroups.com...

> Is there a good *reason* why entities in a database should be named
> using singular instead of plural?
>

It's subjective. I use plural names for tables, most of the time. There
are other people who insist on singular table names.

About the only thing I'm going to support is consistency. Once a choice has
been made about naming conventions for a given project, people who name
things should adhere to the convention. That makes for easier reading later
on.

Terrence Brannon

unread,

Aug 14, 2009, 11:41:06 AM8/14/09

to

On Aug 14, 10:37 am, "Walter Mitty" <wami...@verizon.net> wrote:

>
> About the only thing I'm going to support is consistency.

Well, one thing I noticed in that sample schema <http://github.com/
metaperl/DBIx--Shootout/blob/555a5dc775690fd55a0b9d6237d2f3e2ec8905b4/
lib/DBIx/Shootout/GMAX/schema.pdf> is that they made the m-to-n
linkage tables singular, because there is only one linking record
between the two tables.

Terrence Brannon

unread,

Aug 14, 2009, 11:44:59 AM8/14/09

to

On Aug 14, 8:37 am, metaperl <metap...@gmail.com> wrote:

> Is there a good *reason* why entities in a database should be named
> using singular instead of plural?
>
> I can certainly see why columns should be singular, but I'm not so
> sure about entities.

I think I see the reasoning for singular entity names... the entity is
the ROW, not the TABLE of rows...

Walter Mitty

unread,

Aug 14, 2009, 4:51:23 PM8/14/09

to

"Terrence Brannon" <meta...@gmail.com> wrote in message
news:bb973e60-0d5a-46b3...@c34g2000yqi.googlegroups.com...

The entity instance is the row. The entity type is the table.

dami

unread,

Aug 17, 2009, 4:53:54 AM8/17/09

to

On 14 août, 22:51, "Walter Mitty" <wami...@verizon.net> wrote:
> "Terrence Brannon" <metap...@gmail.com> wrote in message

In "Data Modeling Essentials" (Simsion&Witt, Elsevier, §10.18.1
"Naming Conventions") : "Entity class names must be singular and
noncollective (e.g. "Employee" or "Employee Transaction", but not
"Employees", "Employee Table", nor "Employee History").

But not everybody agrees --- google for "data modeling table name
singular", there are many links. See for example
http://discuss.joelonsoftware.com/default.asp?joel.3.685443.15
and http://discuss.fogcreek.com/joelonsoftware/default.asp?cmd=show&ixPost=5904

--CELKO--

unread,

Aug 18, 2009, 8:51:44 AM8/18/09

to

>> we see that the database designer chose plural names for entities and singular names for columns. <<

That is straight out of ISO-11179. The rules are based on the
principle that you name a data element for WHAT it is by its nature,
so most things are scalars and they are singular. If it is a plural
or collective thing like a table, then it gets a collective or plural
name.

Tables are used as sets, not one at a time like records in a file.

Employee = Bad
Employees = Better
Personnel = Best

I have a few chapters on this in SQL PROGRAMMING STYLE, if you want
the details.

Lennart

unread,

Aug 19, 2009, 7:03:47 AM8/19/09

to

On 14 Aug, 14:48, Terrence Brannon <metap...@gmail.com> wrote:
[...]

>
> Microsoft seems to think there is not:http://msdn.microsoft.com/en-us/library/dd129510(VS.85).aspx
>

A page full of ... surprises

* Append the Word “Table” to the End of Table Names
* Create a Primary Key Named Id

perhaps there is more, I did not read any further ...

/Lennart

Walter Mitty

unread,

Aug 19, 2009, 8:01:58 AM8/19/09

to

"Lennart" <erik.lenna...@gmail.com> wrote in message
news:ac943c75-81d8-46ac...@c1g2000yqi.googlegroups.com...

/Lennart

It's interesting to compare the rationale for creating a primary key named
ID with the relational view of data described by Codd in 1970.

Here's a note from the paper:

"An arbitrary primary key, such as Id, is referred to as a surrogate key.
The alternative would be to use a natural key that is part of the data
itself (such as a government-issued identification number). Surrogate keys
are preferred, because it is sometimes difficult to accurately find a column
or combination of columns that will maintain uniqueness. In addition,
natural keys can be much larger than surrogate keys, especially if the
natural keys are text based or composed of multiple columns. Large keys
increase I/O demands on the table and its indexes, decreasing database
throughput."

And here's an excerpt from the 1970 paper introducing the relational model
of data for use with databases.

"The relational view (or model) of data described in Section 1 appears to be
superior in several respects to the
graph or network model [3,4] presently in vogue for noninferential systems.
It provides a means of describing data
with its natural structure only-that is, without superimposing any
additional structure for machine representation
purposes."

The automatic assignment of an unnatural key called ID in every table raises
the question of whether the Oslo design is really following the relational
model of data, or whether ID is not being used in a way that mimics the
record location (pointer) that would have been used in a graph or network
model. It's also instructive to note that, in the paper, the ID is
described as identifying the row in which it resides. Natural keys might be
used to uniquely specify a row, but they identify the subject matter item
that the data in the row represents. This subject matter item would be an
instance of an entity or an instance of a relationship among entities if you
follow the earlier comments in the paper.

It is, as the paper states, sometimes difficult to accurately find a column
or combination of column that will maintain uniqueness. Yes, it is
difficult. Data analysis is difficult. Building a database around data
that is inadeqately understood is also problematic, but at a different
level. If an ID PK will prevent duplicate rows, but not prevent duplicate
table entries for a single instance of an entity, you've defeated one of the
major reasons for having a primary key. If you add a unique constraint on a
natural key to the PK constraint on ID, you lose some of the perfromance
advantages claimed for the use of ID.

--CELKO--

unread,

Aug 19, 2009, 2:44:27 PM8/19/09

to

The paper also states that the surrogate key is created and maintained
by the system, not the user nor is it ever seen by the user. Think of
idexing, perfect hashing, etc. as ways to locate physical data.
Unfortunately, auto-incremenation in many SQLs is not a surrogate key
at all -- it is an exposed phsyical locator (physical row number,
record count or whatever).

Ed Prochak

unread,

Aug 20, 2009, 11:48:04 PM8/20/09

to

On Aug 19, 8:01 am, "Walter Mitty" <wami...@verizon.net> wrote:
> "Lennart" <erik.lennart.jons...@gmail.com> wrote in message

I think the comments reflects Codd's evolving development of the
relational model. So I do not think he really advocated use of ID.

And I especially agree with the last paragraph. Database design is
difficult. It is not simply a question of throwing a few tables with
ID columns together. And an ID key can be more trouble than it is
worth.

Ed

Walter Mitty

unread,

Aug 21, 2009, 8:32:57 AM8/21/09

to

"Ed Prochak" <edpr...@gmail.com> wrote in message
news:2dcf3426-a361-4ef4...@z31g2000yqd.googlegroups.com...

On Aug 19, 8:01 am, "Walter Mitty" <wami...@verizon.net> wrote:

>I think the comments reflects Codd's evolving development of the
> relational model. So I do not think he really advocated use of ID.

Right. Nothing I've read or cited suggests otherwise. Certainly not the
paper from which I cited the paragraph.

My intent was to make a contrast between Codd's idea of the relational model
and the design patterns advocated in the "Oslo paper".

Codd's idea of the relational model did evolve over time. But, to my
knowledge, none of that evolution affects the use of ID keys.

>And I especially agree with the last paragraph. Database design is
>difficult. It is not simply a question of throwing a few tables with
>ID columns together. And an ID key can be more trouble than it is
>worth.

What I said is that analysis is difficult. A lot of people confuse data
analysis with database design. Database design is also somewhat difficult,
but it can be learned. Someone once said that it's easier to understand 100
tables than 100,000 lines of code. It's also easier to design 100 tables
than to design the object model that will eventually result in 100,000 line
of code. Some learners settle for easy rules like the Oslo pattern instead
of learning harder rules that give better results.

And I agree that the universal ID key can be more trouble than it's worth.

Today's situation is complicated by the fact that many developers are
developing products for sale to some presumed future market of buyers.
There is no available existing body of data, and there are no subject matter
experts to interview, in order to find out how they really use the data. So
the developer doesn't do analysis the way the classical database designer
would have. Instead of talking to people who understand the data, he
engages in "imagineering", and invents his own understanding of the data,
along with inventing the database design.

Database design involves invention. Data analysis involves discovery. They
ain't the same.

--CELKO--

unread,

Aug 21, 2009, 11:03:49 AM8/21/09

to

>> I think the comments reflects Codd's evolving development of the relational model. So I do not think he really advocated use of ID. <<

You cabn see a lot of Codd's changes over time. PRIMARY KEY was based
on the keys needed for a tape file merge or other sequential file to
work; later, he realizes that "no key is more equal" than others, to
make fun of Orwell. In Relational Model #2, he has two kinds of
NULLs. Etc. He became more abstract over time, as he realized the
assumptions that we were carrying into RDBMS thinking.