Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

How to handle physical deletion from source system

155 views
Skip to first unread message

Doug

unread,
Oct 16, 2005, 10:39:59 PM10/16/05
to
I would like to see if someone can share their experienc in handling
deleted records from legacy source system in data warehouse. I don't see
much coverage on this topic in Kimball literatures on how to handle this in
ETL and model design. Did I miss something?


Nenshad Bardoliwalla

unread,
Oct 16, 2005, 11:25:20 PM10/16/05
to
Hi Doug,

This is an excellent question. In my experience, the event of deleting
records can be a very valuable source of information about a business
process, and thus, it is very useful to capture the event in the data
warehouse. I have typically handled this by adding a DELETED column to
the fact or dimension table that stores a value of Y or N (or 0 or 1)
for deleted versus valid records.

Then, from the end-user tool, during query execution, you can modify
your query criteria to check for records that are marked deleted versus
valid. To ensure adequate performance, make sure that the DELETED
column is indexed using the appropriate technique for your database.
In Oracle DBs, low cardinality columns like this are usually retrieved
most efficiently by using bitmap indexes.

Note that for auditing purposes, this method works very well, because
the underlying integrity of the systems are not challenged by using
flags to mark deleted records. It is entirely possible to see the
deleted records with the appropriate query.

Good Luck,

Nenshad

Doug

unread,
Oct 17, 2005, 12:19:26 AM10/17/05
to
Did you have to create an offseting measures in the deleted fact record?
Was the end/power user receptive or remember to the fact that they have to
include this flag?

Thanks in advance.
Doug
"Nenshad Bardoliwalla" <nen...@gmail.com> wrote in message
news:1129519520.0...@o13g2000cwo.googlegroups.com...

Myles.M...@gmail.com

unread,
Oct 18, 2005, 6:56:35 AM10/18/05
to
Hello Doug,

I would look into Time Variance as a concept to handle this situation.
have a look at this post on my blog it will help you get started.

check out:
http://bi-on-sql-server.blogspot.com/2005/07/news-group-post-slowly-changing.html

Myles Matheson
Data Warehouse Architect
http://bi-on-sql-server.blogspot.com/

Doug

unread,
Oct 20, 2005, 12:15:52 AM10/20/05
to
Thanks. I totally agree the Type1/2/3 slowing changing dimension approach.
Are you suggesting to use the slowing change dimension approach to solve the
"slowing change/deleted" fact?

I was thinking using an offseting fact record because the the fact tables
has a lot more rows (in the order of 100 million) than that of dimension (in
the order of 1000 to 10,000) typically.
It seems to be much more expensive in terms of storage and space to maintain
the Start Data, End Date and the flag in the fact.
Also, it will require two role playing time dimensions, one for start date
and another for end date, unless I degenerate the dates directly into the
fact record.

This is why I was leaning towards using an offseting fact record to achieve
the deletion.

e.g. Surrogate key, fkey1, fkey2, fkey3 that link to 1/3/2005, 100 will
become
Surrogate key +1, fkey1, fkey2, fkey3 that link to 1/4/2005, -100.

Any comments or thoughts? Thanks in advance.


<Myles.M...@gmail.com> wrote in message
news:1129632995.2...@g44g2000cwa.googlegroups.com...

Myles.M...@gmail.com

unread,
Oct 20, 2005, 4:02:17 AM10/20/05
to
Hello Doug,

Deletes can be a real pain. I tend not to have them in the fact table
and identify them in ODS in my data warehouses. The main reason is
performance. Having a flag tagging the status of a record will force a
table scan with some RDBMS. This is expensive with large tables.

In your case I would use the type 2 concept. This would allow you to
identify records that are no longer current. But be warned you will
have to educate your users to understand the types of transactions in
the fact table.

I have one question do your users want to report on delete
transactions?

0 new messages