Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss
Groups keyboard shortcuts have been updated
Dismiss
See shortcuts

Question regarding multiple data sources and coalesce

3 views
Skip to first unread message

david

unread,
Jun 13, 2007, 12:06:32 AM6/13/07
to
I inherited a db that is pulling data from 5 different sources and
storing them in 5 different tables. The tables all have the same
basic data, minor differences based on source. The company currently
creates a "summary" table by joining the 5 source tables and using
coalesce( ) to display data from preferred data sources if it is
available from there.

I've been asked to present an alternate schema. My first thought was
to normalize the 5 tables into 1 with a source andI could just include
the source in the ORDER BY of queries to get the preferred source.
But then I thought since each source could have 30Million rows maybe I
would loose performance over the existing schema. So, could somebody
point me towards a reference source that may cover this type of
topic? Of course, any opinions (on this issue) would be appreciated
as well.

Thanks,

David

Srinivas Nadella

unread,
Jun 13, 2007, 4:00:02 AM6/13/07
to
Hi

I am giving a try and I do not know how it works.. but here I go..

Create a View combining all different Tables and do a query on the view for
your search data. Does this makes any sense?

--
Srinivas Nadella
Sr. App Analyst - Hyderabad
India

--CELKO--

unread,
Jun 13, 2007, 5:51:08 PM6/13/07
to
>> I've been asked to present an alternate schema. My first thought was to normalize the 5 tables into 1 with a source and I could just include the source in the ORDER BY of queries to get the preferred source. <<

Something ike this? Cram all the data that you have into staging
table

CREATE TABLE Foobar
(foo_key INTEGER NOT NULL,
source_nbr INTEGER NOT NULL
CHECK (source_nbr BETWEEN 1 AND 5),
PRIMARY KEY (foo_key, source_nbr),
etc.);

Assuming sources are ranks from 1 to 5, pull out the most trusted for
each key

SELECT foo_key, etc.
FROM Foobar AS F1
WHERE source_nbr =
(SELECT MIN(source_nbr)
FROM Foobar AS F2
WHERE F1.foo_key = F2.foo_key);

This is the best we can do without more specs.

david

unread,
Jun 13, 2007, 10:43:56 PM6/13/07
to
On Jun 13, 5:51 pm, --CELKO-- <jcelko...@earthlink.net> wrote:
>
> This is the best we can do without more specs.

Thank you Mr. Celko, that is similar to (and probably better than)
what I was thinking, although I would appreciate it if you could point
me to some reference source that may examine the performance
differences between the 5 table/join/coalesce solution they are using
and 1 large table.

As far as more specs, it is a financial services company and the
sources are bloomberg, s&p, etc. Storing data for asset backed
securities so more than simply ticker and price. Current tables have
150+ columns.

I would have posted current DDL but it would just make you cry. No
foriegn keys, IDENTITY primary keys, stores computed columns in
tables, stores a column that is a concatination of computed columns,
bleh.

David


0 new messages