CREATE TABLE [dbo].[Tracking] (
[key_] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[value_] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
insert into tracking VALUES ('LOCATION','Bedroom')
insert into tracking VALUES ('LOCATION','Dining Room')
insert into tracking VALUES ('LOCATION','Bathroom')
insert into tracking VALUES ('LOCATION','courtyard')
insert into tracking VALUES ('EVENT','verbal aggression')
insert into tracking VALUES ('EVENT','peer')
insert into tracking VALUES ('EVENT','bad behavior')
insert into tracking VALUES ('EVENT','other')
CREATE TABLE [dbo].[Tracking_DATA] (
[ID_] [int] IDENTITY (1, 1) NOT NULL ,
[bts_ID_] [int] NULL ,
[key_] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[value_] [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (1, 'LOCATION', 'Bedroom')
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (1, 'EVENT', 'other')
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (1, 'EVENT', 'bad behavior')
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (2, 'LOCATION', 'Bedroom')
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (2, 'EVENT', 'other')
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (2, 'EVENT', 'verbal aggression')
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (3, 'LOCATION', 'courtyard')
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (3, 'EVENT', 'other')
insert into tracking_DATA (bts_ID_, key_, value_) VALUES (3, 'EVENT', 'peer')
Ideally, the result set of the query would be (with headings if possible):

Location     Event              count
Bedroom      verbal aggression  1
Bedroom      peer               0
Bedroom      bad behavior       1
Bedroom      other              2
Dining Room  verbal aggression  0
Dining Room  peer               0
Dining Room  bad behavior       0
Dining Room  other              0
Bathroom     verbal aggression  0
Bathroom     peer               0
Bathroom     bad behavior       0
Bathroom     other              0
courtyard    verbal aggression  0
courtyard    peer               1
courtyard    bad behavior       0
courtyard    other              1
Also, if possible, another query would return this result set. (I think I
know how to do this one.)

Location     Event              count
Bedroom      verbal aggression  1
Bedroom      bad behavior       1
Bedroom      other              2
courtyard    peer               1
courtyard    other              1
Jens Suessmeyer.
"Jack" <ja...@jack.net> schrieb im Newsbeitrag
news:ACqee.514$2J6.319@lakeread06...
"Jens Süßmeyer" <Jens@Remove_this_For_Contacting.sqlserver2005.de> wrote in
message news:%23vt8ObY...@TK2MSFTNGP14.phx.gbl...
Here's a solution:
Select Locations.LocationValue, Events.EventValue,
    (
    Select Count(*)
    From (
        Select LocationData.LocationValue, EventData.EventValue
        From (
            Select TD1.Bts_Id, TD1.[value] As LocationValue
            From dbo.Tracking_Data As TD1
            Where TD1.[key] = 'LOCATION'
            ) As LocationData
        Join (
            Select TD2.Bts_Id, TD2.[value] As EventValue
            From dbo.Tracking_Data As TD2
            Where TD2.[key] = 'EVENT'
            ) As EventData
            On LocationData.Bts_Id = EventData.Bts_Id
        ) As CollatedEventData
    Where CollatedEventData.LocationValue = Locations.LocationValue
        And CollatedEventData.EventValue = Events.EventValue
    ) As EventCount
From (
    Select T1.[value] As LocationValue
    From dbo.Tracking As T1
    Where T1.[key] = 'LOCATION'
    ) As Locations
Cross Join (
    Select T2.[value] As EventValue
    From dbo.Tracking As T2
    Where T2.[key] = 'EVENT'
    ) As Events
Order By Locations.LocationValue, Events.EventValue
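For comparison, a sketch of the same result without a correlated subquery, using the full location/event grid joined back to the collated data (this assumes the renamed columns from my example above; untested against a live server):

```sql
Select Locations.LocationValue, Events.EventValue,
       Count(EventData.Bts_Id) As EventCount  -- counts matches; 0 where the left join finds none
From (Select T1.[value] As LocationValue
      From dbo.Tracking As T1
      Where T1.[key] = 'LOCATION') As Locations
Cross Join
     (Select T2.[value] As EventValue
      From dbo.Tracking As T2
      Where T2.[key] = 'EVENT') As Events
Left Join
     (Select TD1.Bts_Id,
             TD1.[value] As LocationValue,
             TD2.[value] As EventValue
      From dbo.Tracking_Data As TD1
      Join dbo.Tracking_Data As TD2
        On TD1.Bts_Id = TD2.Bts_Id
      Where TD1.[key] = 'LOCATION'
        And TD2.[key] = 'EVENT') As EventData
  On EventData.LocationValue = Locations.LocationValue
 And EventData.EventValue = Events.EventValue
Group By Locations.LocationValue, Events.EventValue
Order By Locations.LocationValue, Events.EventValue
```

Counting the nullable Bts_Id from the left join is what turns missing combinations into zeros rather than dropping them.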
Let me make a few suggestions:
1. The reason I had to use so many subqueries is that those entities are
all lumped into the same table. There should be separate tables for Locations
and Events.
2. The column names are seriously painful. Beyond the fact that I personally
hate underscores in column names, putting an underscore at the end of the
column name is really non-intuitive. I removed them for my example and then
came across the next column-name faux pas: don't use "key" and "value" for
column names. It means that the developer *has* to surround the column name
with square brackets everywhere, which is a serious pain.
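To illustrate suggestion 1, separate tables might look something like this (the table and column names here are illustrative, not from the original post):

```sql
CREATE TABLE dbo.Locations (
    LocationID   int IDENTITY (1, 1) PRIMARY KEY,
    LocationName nvarchar(50) NOT NULL UNIQUE )

CREATE TABLE dbo.Events (
    EventID   int IDENTITY (1, 1) PRIMARY KEY,
    EventName nvarchar(50) NOT NULL UNIQUE )

-- Each tracked incident references one location and one event,
-- so typos cannot invent new locations or events.
CREATE TABLE dbo.Incidents (
    IncidentID int IDENTITY (1, 1) PRIMARY KEY,
    LocationID int NOT NULL REFERENCES dbo.Locations (LocationID),
    EventID    int NOT NULL REFERENCES dbo.Events (EventID) )
```

With that structure the counting query collapses to a join and a GROUP BY, with no self-joins on key strings.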
HTH
Thomas
@OriginalPoster: Perhaps you have the possibility to redesign your database
structure. As time goes by you will run into more than one issue with this
DB structure. So if you have the chance and the power to make some design
changes, it would make life easier for you in the future. (...just my two
pence...)
Jens Suessmeyer.
"Thomas Coleman" <replyi...@anywhere.com> schrieb im Newsbeitrag
news:e5qhKGZU...@TK2MSFTNGP09.phx.gbl...
I agree with you, but the challenge I am having is that I have to make this
application generic. The 'Locations' and 'Events' tables may be called
something different for another customer. Plus, we will not know how many
different items will get tracked. In my example there are only two, but in
the real world there could be thirty. That is why I have everything in one
table keyed by 'key_' (which, I agree with you, is a name that should
change).
I definitely need help with data structure stuff, so your suggestions are
welcome and appreciated to make this work.
"Thomas Coleman" <replyi...@anywhere.com> wrote in message
news:e5qhKGZU...@TK2MSFTNGP09.phx.gbl...
Thomas
"Jack" <ja...@jack.net> wrote in message news:DLsee.522$2J6.105@lakeread06...
ARRRRGHH!! There is no such thing as a "generic" application in an RDBMS.
You have missed all the foundations. The idea is that you have a known
data model of an
external reality that you manipulate. Users do not get to make
elephants fall out of the sky.
>> The 'Locations' and 'Events' table may be called something different
for another customer. <<
Then you would name them appropriately, not generically!!
>> Plus, we will not know how many different items that will get
tracked. <<
Then you have not done your job as a data modeler, have you?
>> That is why I have everything in one table keyed by 'key_' (which I
agree with you that the name should change.) <
"To be is to be something in particular; to be nothing in particular is
to be nothing." --Aristotle
The name of this design flaw is EAV (entity-attribute-value). It falls
apart within a year in production work several ways:
1) The queries look like the nightmare that Mr. Coleman posted, only
*orders of magnitude* -- did you hear that?? repeat: *orders of
magnitude* -- more complex and slower. The overhead of type conversion
alone will kill a real application. Or did you write an entire
temporal and arithmetic library for VARCHAR(255) strings that does not
use CAST()?
2) All data integrity is destroyed. Any typo becomes a new attribute
or entity. Entities are found missing attributes, so all the reports
are wrong.
3) Try to write a single CHECK() constraint that works for all the
attributes of those 30+ entities your users created because you were
too dumb or too lazy to do your job. It can be done! I needed a CASE
expression with almost 70 WHEN clauses for a simple invoice and order
system when I tried it as an exercise.
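For point 3, such a constraint against the posted table might start like this. This is only a fragment I am sketching from the sample data above, not the 70-WHEN monster; every further entity/attribute pair crammed into the table needs its own branch:

```sql
ALTER TABLE dbo.Tracking_DATA
ADD CONSTRAINT CK_Tracking_DATA_value CHECK (
    CASE
        WHEN [key_] = 'LOCATION'
             AND [value_] IN (N'Bedroom', N'Dining Room', N'Bathroom', N'courtyard')
        THEN 1
        WHEN [key_] = 'EVENT'
             AND [value_] IN (N'verbal aggression', N'peer', N'bad behavior', N'other')
        THEN 1
        -- ...one WHEN per attribute of every entity stored in this table...
        ELSE 0
    END = 1 )
```

And every time a user invents a new "entity", the constraint has to be altered, which is exactly the maintenance trap being described.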
4) Try to write a single DEFAULT clause for 30+ entities crammed into
one column. Impossible!
5) Try to set up DRI actions among the entities. If you thought the
WHEN clauses in the single CASE expression were unmaintainable, wait
until you see the "TRIGGERs from Hell" -- too bad they might not
fit into older SQL Server, which had some size limits. Now maintain them.
I bill $1000-$2000 per day to look at and fix this exact design
problem. Let me know if your boss wants to pay me, fix the problem in
house, or go out of business. It is that kind of mistake.
I know I am entering dangerous territory, looking like a mouse
in front of a lion, but let me add my two cents.
Yes, OO is crap; RDBMS is not for object inheritance and all those
'generic' things; if one needs EAV that means he doesn't know his
data model and so on.
Unfortunately, in reality there are situations when you don't know
all the entity attributes beforehand and/or most of them may or may
not apply to a specific entity. Yet still you want to keep your data
in a database. What is the answer given the fact, that popular
available databases are relational?
I understand what you say about why you think this is bad. But many
problems you mention can be addressed, and there are successful
production examples which run for decades now.
For those who are interested, here are a couple of links to articles
I found on the net:
Generic Design of Web-Based Clinical Databases
http://www.jmir.org/2003/4/e27/
The EAV/CR Model of Data Representation
http://ycmi.med.yale.edu/nadkarni/eav_CR_contents.htm
An Introduction to Entity-Attribute-Value Design for Generic
Clinical Study Data Management Systems
http://ycmi.med.yale.edu/nadkarni/Introduction%20to%20EAV%20systems.htm
Data Extraction and Ad Hoc Query of an Entity-Attribute-Value Database
http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=9824799
Exploring Performance Issues for a Clinical Database Organized Using
an Entity-Attribute-Value Representation
http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=10984467
All those junk articles go into great detail about how to manage your data
in EAV form, but fall short of writing a simple query like
select Receptor_Type, Efferent_Neuron, count(*)
from NeuronNetwork
group by Receptor_Type, Efferent_Neuron
Try writing this query in EAV representation (without pivoting EAV to
normal relation).
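For the record, here is roughly what that three-line query turns into against an EAV table. I am assuming a hypothetical table EAV(entity_id, attribute, [value]); this is a sketch, untested:

```sql
-- One self-join per attribute; a third attribute would need a third
-- copy of the table, a fourth attribute a fourth copy, and so on.
Select R.[value] As Receptor_Type,
       E.[value] As Efferent_Neuron,
       Count(*) As NeuronCount
From EAV As R
Join EAV As E
  On R.entity_id = E.entity_id
Where R.attribute = 'Receptor_Type'
  And E.attribute = 'Efferent_Neuron'
Group By R.[value], E.[value]
```

And the inner joins silently drop any entity missing one of the attributes, so you need outer joins (and NULL handling) the moment attributes become optional.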
> All those junk articles go into great detail about how to manage your data
> in EAV form, but fall short of writing a simple query like
>
> select Receptor_Type, Efferent_Neuron, count(*)
> from NeuronNetwork
> group by Receptor_Type, Efferent_Neuron
>
> Try writing this query in EAV representation (without pivoting EAV to
> normal relation).
I understand. They say somewhere that this design is not
particularly good for data analysis, but is rather for data
collection and storing. For efficient complex queries data
must be converted to conventional form.
Other than that, why do you think it is junk?
When gathering requirements, I like to use plain text myself. I have
good search engines for text, things that support an outline, etc.
However, I see more people starting to use XML tools for this phase of
the project. The advantage they get is that they can manipulate their
hierarchy easily and regroup their notes.
Keep in mind that SQL is not a "data collection" tool. Take a look at the
ANSI Standard, or even BOL. Notice the complete lack of specialized input
and output support that you get in application languages. All this is
handled by applications on the front end, which feed the data into your SQL
Server for later retrieval and analysis.
All this is why - as Mikito so eloquently stated - the EAV model is junk.
"Sericinus hunter" <ser...@flash.net> wrote in message
news:NKPee.2831$Yg4...@newssvr17.news.prodigy.com...