Account Options

  1. Sign in
Google Groups Home
« Groups Home
design question - so many columns....
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  6 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
CoreyB  
View profile  
 More options Jun 14 2007, 3:05 pm
Newsgroups: microsoft.public.sqlserver.server
From: CoreyB <unc27...@yahoo.com>
Date: Thu, 14 Jun 2007 12:05:53 -0700
Local: Thurs, Jun 14 2007 3:05 pm
Subject: design question - so many columns....
Got a situation where our main, core data in our main tables consist
of 15-20 normal columns (dates, integers, varchar, etc.) and then
several "sets" of booleans.  Example - health risk factors.....high
blood pressure, diabetes, depression.....up to 20 risk factors, let's
say.  The user can choose none, all, or any combo in between in these
sets of booleans.  Well each of those fundamentally is just a bit
column, with a zero or one, attached to the record concerning the
individual person/event record.  Well what if I have 8 or 9 "sets" of
these boolean questions?  This results in 150-200+ columns in my
table.  I know sql can handle up to 1024, and these data really belong
on this record with this individual person/event.

I just wanted to see what others thought from the design perspective.
Any suggestions???


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
SB  
View profile  
 More options Jun 15 2007, 12:09 am
Newsgroups: microsoft.public.sqlserver.server
From: SB <othell...@yahoo.com>
Date: Thu, 14 Jun 2007 21:09:08 -0700
Local: Fri, Jun 15 2007 12:09 am
Subject: Re: design question - so many columns....
On Jun 15, 1:05 am, CoreyB <unc27...@yahoo.com> wrote:

> Got a situation where our main, core data in our main tables consist
> of 15-20 normal columns (dates, integers, varchar, etc.) and then
> several "sets" of booleans.  Example - health risk factors.....high
> blood pressure, diabetes, depression.....up to 20 risk factors, let's
> say.  The user can choose none, all, or any combo in between in these
> sets of booleans.  Well each of those fundamentally is just a bit
> column, with a zero or one, attached to the record concerning the
> individual person/event record.  Well what if I have 8 or 9 "sets" of
> these boolean questions?  This results in 150-200+ columns in my
> table.  I know sql can handle up to 1024, and these data really belong
> on this record with this individual person/event.

> I just wanted to see what others thought from the design perspective.
> Any suggestions???

That's quite easy. Let's say you have one column called 'Risk Factor'
that can take 20 possible values. You can store values like
'Diabetes,High Blood Pressure,Depression'. This is one solution and I
would prefer this. Another solution is you can have a varchar(20) with
values like 10001000 etc where you set a bit for particular risk
factor. You have to know the ordinal position for a particular risk
and if that bit is set or not.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
SB  
View profile  
 More options Jun 15 2007, 12:29 am
Newsgroups: microsoft.public.sqlserver.server
From: SB <othell...@yahoo.com>
Date: Thu, 14 Jun 2007 21:29:56 -0700
Local: Fri, Jun 15 2007 12:29 am
Subject: Re: design question - so many columns....
On Jun 15, 10:09 am, SB <othell...@yahoo.com> wrote:

and of course if you want to be relational then you have a row for
each 'Risk Factor'. So instead of growing the table horizontally you
grow them vertically. So if a person has 'Diabetes' and 'High Blood
Pressure' there are 2 rows for that person. Then you can group by a
particular 'Risk Factor'. For example, find all patients where 'Risk
Factor' is 'Diabetes' and sum how much they spent etc on medication
etc.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
CoreyB  
View profile  
 More options Jun 15 2007, 8:41 am
Newsgroups: microsoft.public.sqlserver.server
From: CoreyB <unc27...@yahoo.com>
Date: Fri, 15 Jun 2007 05:41:29 -0700
Local: Fri, Jun 15 2007 8:41 am
Subject: Re: design question - so many columns....
I've seen that approach used in some other places in reading online,
mainly with surveys & questionarres.  But some issues I have with it
are.....

1 - The relational form of a tall skinny table, with each answer as a
row seems good if you may not have data for every question/element, so
you save space on the questions that aren't answered.  We will likely
have data for each element.  Which means we'll have billions of
rows......per year.  The size will start to become an issue after
several years.
2 - Storage.....At first glance the wide tables seem like they'll be
larger.  But a lot of the columns are bits, which are optimized for
storage, so if I have a wide table with 32 bit columns, they'll only
take up 4 bytes of storage.  But if I store the same answers (1 or 0)
in another table as rows in a generic catch-all varchar column, and
then have two IDs of int or bigint tying them back to the question &
respondent, that's already 8 or 9 bytes per row minimum * 32 answers =
250+ bytes just for the one respondent.  And there's way more than 32
- probably 100 or so.
3 - Data integrity.  If everything has its own column in a wide table,
then you can make sure that a bit is a bit, and a date is a datetime,
and an integer is an int.  But if everything is put in one generic
column, then you give up a little bit of ground on the data integrity,
and then I'm depending on the ETL process, or the developer to
validate all data types.

As crappy as it sounds, the wide table looks better to me in this
situation.  Unless someone out here can talk me out of it.

On Jun 15, 12:29 am, SB <othell...@yahoo.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Yura Shalak  
View profile  
 More options Jun 15 2007, 3:58 pm
Newsgroups: microsoft.public.sqlserver.server
From: Yura Shalak <yu...@nospam.nospam>
Date: Fri, 15 Jun 2007 15:58:58 -0400
Local: Fri, Jun 15 2007 3:58 pm
Subject: Re: design question - so many columns....

Well, bit fields create a searching an filtering nightmares (ask me how
I know :-).)

I don't believe there is a bullet-proof solution.
I would consider even a de-normalized 1-to-1 relationship architecture
option.
For example, having a Patient table (PatientID + personal data), then
RiskFactors table (PatientID + 20+ risk factor bit fields), etc. Since
in most cases  you search either for a singe patient or for group of
patients that meet certain criteria, massive multi-table joins will not
be required too often.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
SB  
View profile  
 More options Jun 17 2007, 11:32 pm
Newsgroups: microsoft.public.sqlserver.server
From: SB <othell...@yahoo.com>
Date: Sun, 17 Jun 2007 20:32:13 -0700
Local: Sun, Jun 17 2007 11:32 pm
Subject: Re: design question - so many columns....
On Jun 15, 6:41 pm, CoreyB <unc27...@yahoo.com> wrote:

I think the first solution should work where you store the values as
coma (or any other delimiter) separated values. For example for
certain risk factor, for a customer values can be: diabetes,high blood
pressure.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »