Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

Question regarding multiple data sources and coalesce

7 views

Skip to first unread message

david

unread,

Jun 13, 2007, 12:06:32 AM6/13/07

I inherited a db that is pulling data from 5 different sources and
storing them in 5 different tables. The tables all have the same
basic data, minor differences based on source. The company currently
creates a "summary" table by joining the 5 source tables and using
coalesce( ) to display data from preferred data sources if it is
available from there.

I've been asked to present an alternate schema. My first thought was
to normalize the 5 tables into 1 with a source andI could just include
the source in the ORDER BY of queries to get the preferred source.
But then I thought since each source could have 30Million rows maybe I
would loose performance over the existing schema. So, could somebody
point me towards a reference source that may cover this type of
topic? Of course, any opinions (on this issue) would be appreciated
as well.

Thanks,

David

Srinivas Nadella

unread,

Jun 13, 2007, 4:00:02 AM6/13/07

I am giving a try and I do not know how it works.. but here I go..

Create a View combining all different Tables and do a query on the view for
your search data. Does this makes any sense?

--
Srinivas Nadella
Sr. App Analyst - Hyderabad
India

--CELKO--

unread,

Jun 13, 2007, 5:51:08 PM6/13/07

>> I've been asked to present an alternate schema. My first thought was to normalize the 5 tables into 1 with a source and I could just include the source in the ORDER BY of queries to get the preferred source. <<

Something ike this? Cram all the data that you have into staging
table

CREATE TABLE Foobar
(foo_key INTEGER NOT NULL,
source_nbr INTEGER NOT NULL
CHECK (source_nbr BETWEEN 1 AND 5),
PRIMARY KEY (foo_key, source_nbr),
etc.);

Assuming sources are ranks from 1 to 5, pull out the most trusted for
each key

SELECT foo_key, etc.
FROM Foobar AS F1
WHERE source_nbr =
(SELECT MIN(source_nbr)
FROM Foobar AS F2
WHERE F1.foo_key = F2.foo_key);

This is the best we can do without more specs.

david

unread,

Jun 13, 2007, 10:43:56 PM6/13/07

On Jun 13, 5:51 pm, --CELKO-- <jcelko...@earthlink.net> wrote:
>
> This is the best we can do without more specs.

Thank you Mr. Celko, that is similar to (and probably better than)
what I was thinking, although I would appreciate it if you could point
me to some reference source that may examine the performance
differences between the 5 table/join/coalesce solution they are using
and 1 large table.

As far as more specs, it is a financial services company and the
sources are bloomberg, s&p, etc. Storing data for asset backed
securities so more than simply ticker and price. Current tables have
150+ columns.

I would have posted current DDL but it would just make you cry. No
foriegn keys, IDENTITY primary keys, stores computed columns in
tables, stores a column that is a concatination of computed columns,
bleh.

David

0 new messages