I've been asked to present an alternate schema. My first thought was
to normalize the 5 tables into 1 with a source andI could just include
the source in the ORDER BY of queries to get the preferred source.
But then I thought since each source could have 30Million rows maybe I
would loose performance over the existing schema. So, could somebody
point me towards a reference source that may cover this type of
topic? Of course, any opinions (on this issue) would be appreciated
as well.
Thanks,
David
I am giving a try and I do not know how it works.. but here I go..
Create a View combining all different Tables and do a query on the view for
your search data. Does this makes any sense?
--
Srinivas Nadella
Sr. App Analyst - Hyderabad
India
Something ike this? Cram all the data that you have into staging
table
CREATE TABLE Foobar
(foo_key INTEGER NOT NULL,
source_nbr INTEGER NOT NULL
CHECK (source_nbr BETWEEN 1 AND 5),
PRIMARY KEY (foo_key, source_nbr),
etc.);
Assuming sources are ranks from 1 to 5, pull out the most trusted for
each key
SELECT foo_key, etc.
FROM Foobar AS F1
WHERE source_nbr =
(SELECT MIN(source_nbr)
FROM Foobar AS F2
WHERE F1.foo_key = F2.foo_key);
This is the best we can do without more specs.
Thank you Mr. Celko, that is similar to (and probably better than)
what I was thinking, although I would appreciate it if you could point
me to some reference source that may examine the performance
differences between the 5 table/join/coalesce solution they are using
and 1 large table.
As far as more specs, it is a financial services company and the
sources are bloomberg, s&p, etc. Storing data for asset backed
securities so more than simply ticker and price. Current tables have
150+ columns.
I would have posted current DDL but it would just make you cry. No
foriegn keys, IDENTITY primary keys, stores computed columns in
tables, stores a column that is a concatination of computed columns,
bleh.
David