Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Help me convince them this is bad!!!

1 view
Skip to first unread message

Molly

unread,
Oct 4, 2006, 10:08:57 AM10/4/06
to
I have an IT manager that thinks he knows everything when it comes to
database design. Unfortunately, he's not familiar with relational
databases but thinks he knows it all. He's come up with a really crazy
design, and I've got to come up with all the reasons it's not right. I
want to have as much ammo as possible so we're not stuck supporting
something that will never perform well.

We're in the process of moving some functionality from a Z/OS DB2
database to a RS/6000 DB2. Currently the Z/OS database has a table
that has a four column primary key. We've determined that one of these
keys is not needed on the new system, which would leave us with a three
column key. However, the manager wants to make it a two column key, be
concatenating two of the source columns (separated by double commas)
into just one column on the new system. Searches can be done on either
of the source columns or both.

I know searching this will be horrific, with tons of LIKE statements,
and indexing will be next to useless. But I need as many reasons as
possible to not take this approach. I don't want to end up being
blamed for poor performance down the road.

Thanks in advance!

Serge Rielau

unread,
Oct 4, 2006, 10:23:27 AM10/4/06
to
What is is rational behind the idea?
It can't be memory since all his savings (the length) are thrown away
again in the ',,'

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ondemandbusiness/conf2006/

Molly

unread,
Oct 4, 2006, 10:34:04 AM10/4/06
to
He wants to make this a generic table, that can be used for different
types of data in the future. If he keeps the columns generic he can
use if for different data elements later down the road, which is how a
lot of the Z/OS systems were developed. We're in the process of moving
to a new system, and data mapping is next to impossible, because
different areas use the same columns for very different things. There
is no current plan to use this table for anything else, and if there
are additional requirements in the future, it would more than likely be
new tables or new columns.

Serge Rielau

unread,
Oct 4, 2006, 11:40:20 AM10/4/06
to
*shudder*
If he wants schema-chaos, why not go all out and use XML?

P. Adhia

unread,
Oct 4, 2006, 12:18:00 PM10/4/06
to
Molly wrote:
> He wants to make this a generic table, that can be used for different
> types of data in the future. If he keeps the columns generic he can
> use if for different data elements later down the road, which is how a
> lot of the Z/OS systems were developed.

I have had my share of this struggle with "Data Analysts/Modellers" (I
am a DBA) who want to create so called "scalable/flexible data models".
>From my experience, 80% flexibility is never used during application's
useful life and 80% of the new requirements are the ones that
weren't/couldn't be envisioned and cannot be handled by the built in
flexibility anyway. I share your concerns about performance and also
unnecessary complexity that it will add to applications.

People need to be realistic when designing database looking at current
and short-term business needs (and not 10 years down the road).
Personally, I believe applications need to be generic and not
databases. I bet your manager is more a programmer (who probably just
discovered OO) than a DBA. If your manager insists on genericizing
database, maybe using OO extensions (typed tables) is the way to go
rather than reusing tables/columns for multiple purposes.

Just my two cents.

P Adhia

--CELKO--

unread,
Oct 5, 2006, 9:30:58 AM10/5/06
to
>> He wants to make this a generic table, that can be used for different types of data in the future. <<

Pretty clearly the concatenated column is a total violation of 1NF.
Likewise, a "generic table" a violation of logic and RDBMS. I call
such things "Britney Spears, Automobiles and Squid" tables to show that
they cannot even have a basic ISO-11179 data element name. "To be is
to be something in particular; to be nothing in particular is to be
nothing at all" -- Aristotle

Here is a little "cut *& paste" I give people who want ot do a EAV
design. Someone like your boss posted this:

CREATE TABLE EAV -- no key declared
(key_col VARCHAR (10) NULL,
attrib_value VARCHAR (50) NULL);

INSERT INTO EAV VALUES ('LOCATION','Bedroom');
INSERT INTO EAV VALUES ('LOCATION','Dining Room');
INSERT INTO EAV VALUES ('LOCATION','Bathroom');
INSERT INTO EAV VALUES ('LOCATION','courtyard');
INSERT INTO EAV VALUES ('EVENT','verbal aggression');
INSERT INTO EAV VALUES ('EVENT','peer');
INSERT INTO EAV VALUES ('EVENT','bad behavior');
INSERT INTO EAV VALUES ('EVENT','other');

CREATE TABLE EAV_DATA -note lack of constraints, defaults, DRI
(id INTEGER IDENTITY (1,1) NOT NULL,
bts_id INTEGER NULL,
key_col VARCHAR (10) NULL,
attrib_value VARCHAR (50) NULL );

INSERT INTO EAV_DATA VALUES (1, 'LOCATION', 'Bedroom');
INSERT INTO EAV_DATA VALUES (1, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (1, 'EVENT', 'bad behavior');
INSERT INTO EAV_DATA VALUES (2, 'LOCATION', 'Bedroom');
INSERT INTO EAV_DATA VALUES (2, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (2, 'EVENT', 'verbal aggression');
INSERT INTO EAV_DATA VALUES (3, 'LOCATION', 'courtyard');
INSERT INTO EAV_DATA VALUES (3, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (3, 'EVENT', 'peer');

Ideally, the result set of the query would be Location Event count
(headings if possible)

Bedroom verbal aggression 1
Bedroom peer 0
Bedroom bad behavior 0
Bedroom other 2
Dining Room verbal aggression 0
Dining Room peer 0
Dining Room bad behavior 0
Dining Room other 0
Bathroom verbal aggression 0
Bathroom peer 0
Bathroom bad behavior 0
Bathroom other 0
courtyard verbal aggression 0
courtyard peer 1
courtyard bad behavior 0
courtyard other 1

Also, if possible, another query would return this result set. (I think
I know how to do this one.)

Location Event count
Bedroom verbal aggression 1
Bedroom other 2
courtyard peer 1
courtyard other 1

Here is an answer From: Thomas Coleman

SELECT Locations.locationvalue, Events.eventvalue,
(SELECT COUNT(*)
FROM (SELECT LocationData.locationvalue, EventData.eventvalue

FROM (SELECT TD1.bts_id, TD1.value AS locationvalue
FROM eav_data AS TD1
WHERE TD1.key = 'location') AS LocationData
INNER JOIN
(SELECT TD2.bts_id, TD2.value AS eventvalue
FROM eav_data AS TD2
WHERE TD2.key = 'event'
) AS EventData
ON LocationData.bts_id = EventData.bts_id
) AS CollatedEventData
WHERE CollatedEventData.locationvalue = Locations.locationvalue
AND CollatedEventData.eventvalue = Events.eventvalue
FROM (SELECT T1.value AS locationvalue
FROM EAV AS T1
WHERE T1.key = 'location') AS Locations,
(SELECT T2.value AS eventvalue
FROM EAV AS T2
WHERE T2.key = 'event') AS Events
ORDER BY Locations.locationvalue, Events.eventvalue ,
SELECT Locations.locationvalue, Events.eventvalue
(SELECT COUNT(*)
FROM (SELECT LocationData.locationvalue, EventData.eventvalue

FROM (SELECT TD1.bts_id, TD1.value AS locationvalue
FROM eav_data AS TD1
WHERE TD1.key = 'location') AS LocationData
INNER JOIN
(SELECT TD2.bts_id, TD2.value AS eventvalue
FROM eav_data AS TD2
WHERE TD2.key = 'event') AS EventData
ON LocationData.bts_id = EventData.bts_id)
AS CollatedEventData
WHERE CollatedEventData.locationvalue = Locations.locationvalue
AND CollatedEventData.eventvalue = Events.eventvalue)
FROM (SELECT T1.value AS locationvalue
FROM EAV AS T1
WHERE T1.key = 'location') AS Locations,
(SELECT T2.value AS eventvalue
FROM EAV AS T2
WHERE T2.key = 'event') AS Events;

Is the same thing in a proper schema as:

SELECT L.locationvalue, E.eventvalue, COUNT(*)
FROM Locations AS L, Events AS E
WHERE L.btd_id = E.btd_id
GROUP BY L.locationvalue, E.eventvalue;

The reason that I had to use so many subqueries is that those entities
are all plopped into the same table. There should be separate tables
for Locations and Events.

The column names are seriously painful Don't use "key" and "value" for
column names. It means that the developer *has* surround the column
name with double quotes for everything which is a serious pain.

There is such a thing as "too" generic. There has to be some structure
or everything becomes nothing more than a couple of tables called
"things". The real key (no pun intended) is commonality. Is there a
pattern to the data that they want to store? It may not be possible to
create one structure to rule them all and in the darkness bind them.

"To be is to be something in particular; to be nothing in particular is
to be nothing." --Aristole

All data integrity is destroyed. Any typo becomes a new attribute or
entity. Entities are found missing attributes, so all the reports are
wrong.

Try to write a single CHECK() constraint that works for all the
attributes of those 30+ entities your users created because you were
too dumb or too lazy to do your job. It can be done! You need a case
expression almost 70 WHEN clauses for a simple invoice and order system
when I tried it as an exercise.

Try to write a single DEFAULT clause for 30+ entities crammed into one
column. Impossible!

Try to set up DRI actions among the entities. If you thought the WHEN
clauses in the single CASE expression were unmaintainable, wait until
you see the "TRIGGERs from Hell" -- Too bad that they might not fit
into older SQL Server which had some size limits. Now maintain it.

For those who are interested, there are couple of links to articles I
found on the net:

Generic Design of Web-Based Clinical Databases
http://www.jmir.org/2003/4/e27­/

The EAV/CR Model of Data Representation
http://ycmi.med.yale.edu/nadka­rni/eav_CR_contents.htm

An Introduction to Entity-Attribute-Value Design for Generic
Clinical Study Data Management Systems
http://ycmi.med.yale.edu/nadka­rni/Introduction%20to%20EAV%20­systems.htm


Data Extraction and Ad Hoc Query of an Entity- Attribute- Value
Database
http://www.pubmedcentral.nih.g­ov/articlerender.fcgi?tool=pub­med&pubme...


Exploring Performance Issues for a Clinical Database Organized Using
an Entity-Attribute-Value Representation
http://www.pubmedcentral.nih.g­ov/articlerender.fcgi?tool=pub­med&pubme...

Brian Tkatch

unread,
Oct 5, 2006, 9:52:43 AM10/5/06
to

I had that once. I had to fight everything to get the database designed
my way, though i did have to give up on one or two points he was
adamant on. Basically, i told him that either he was the DBA and i was
his lackey, or i was the DBA and he'd submit requests to me which i
might reject. It was a small company, so YMMV.

I like to look at (most) TABLEs as objects. Each TABLE is a class, each
row an instantiation of that class, which means that each COLUMN is an
attribute of the class. Each COLUMN, therefore, must define one thing,
and one thing only, because an amalgamation is not an attribute. Also,
having a copy of a COLUMN is a very "Bad Thing"(tm), because it will
most likely change, and soon become a nightmare. If a composite column
is desired, use GENERATED ALWAYS AS to have the database itself keep it
correct. Otherwise, the *possibility* for contradiction of data is an
option, and that destroys the very integrity of a database.

B.

0 new messages