Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Staging Area Design

0 views
Skip to first unread message

Chris Leroquais

unread,
Aug 24, 2006, 3:33:49 AM8/24/06
to
Hi there,

For ETL purposes, I'm wondering whether it would be better to:
- grouping all my heterogenous source systems tables into a same Staging
Database

or

- Using a dedicated Staging Database for each source system

Thanks,

Chris


lu...@iqato.com

unread,
Aug 28, 2006, 3:31:04 PM8/28/06
to
Chris Leroquais wrote:
> For ETL purposes, I'm wondering whether it would be better to:
> - grouping all my heterogenous source systems tables into a same Staging
> Database
> or
> - Using a dedicated Staging Database for each source system

It depends on your hardware and on your version of SQL Server.

In the best scenario you have SQL Server 2005 Enterprise and your
databases are located on a RAID-10 group striping on a lot of disks
(like a SAN). In this case you should use a single database and use
table partitions as needed.

If you don't have SQL 2005 Enterprise then you should make different
databases. If you don't have a RAID-10 array you should also locate the
databases on different disks to optimize I/O.

Myles.M...@gmail.com

unread,
Aug 29, 2006, 7:27:20 AM8/29/06
to
Hello Chris,

I usually go for one staging database. Staging is a useful way of
validating records as they come into to the ETL process. It's the
scratch or work space for the ETL to use, before populating your ODS or
Star schemas.

Make sure you select the correct recovery model of the staging db.
Simple recovery model is the easiest to implement, however bulk load
will give you the best prefromance.

I am not sure about the SAN advice. Other than it is good advice for
setting up mid-large (>300 GB) scale system. However there are a lot of
discussions on whether dedicated disk is better for database solutions.
So if you don't have a SAN for you system I would not worry about it.

What version of SQL server are you using?

Hope this helps,

Myles Matheson
Data Warehouse Architect
http://bi-on-sql-server.blogspot.com/

lu...@iqato.com

unread,
Aug 29, 2006, 9:26:03 AM8/29/06
to
> I am not sure about the SAN advice. Other than it is good advice for
> setting up mid-large (>300 GB) scale system.

SAN is not a solution designed only for large databases, it is also
very handy when performance is critical. Size does not always matter.

> However there are a lot of
> discussions on whether dedicated disk is better for database solutions.

Yep, there is also a lot of discussions about Elvis whereabouts...

Seriously, a LUN of 14 disks in a dedicated RAID-10 array will offer a
performance that a single disk can't possibly achieve. Even a 4-disks
array will be faster. Thanks to the striping, the workload is spread
across all the disks (actually half of them).

Good performance can be achieved with a traditional RAID controller;
however the read/write cache and the impressive speed (up to 4G/s on FC
these days) are features that make SAN the best choice for databases
storage. As long as there is some room in the budget of course.

In many (or most) scenarios the SAN can be just too expensive. But as
far as the performance goes, this is the best choice.

> So if you don't have a SAN for you system I would not worry about it.

I agree that SAN can be overkill in some situations. However, you have
no idea what are the load and performance requirements, so I wonder how
you can make such a statement.

> Hope this helps,
>
> Myles Matheson
> Data Warehouse Architect

A data warehouse architect that is not sure that SAN are a good
solution for ETL databases... Quite puzzling.

0 new messages