Our customer (of our ecommerce system) wants to be able to preserve deleted entities in the database so that they can do reporting, auditing etc.
The system is quite complex where each end user can belong to multiple institutional affiliations (which can purchase on behalf of the user). The end user also has a rich trail of past transactions affiliations etc. Thus in the schema each user entity is related to many others which in turn relate to yet others and so on.
In the past when a user was deleted all of his complex relationships were also deleted in a cascading fashion. But now the customer wants us to add a "deleted" flag to each user so that a user is never _really_ deleted but instead his "deleted" flag is set to true. The system subsequently behaves as if the user did not exist but the customer can still do reports on deleted users.
I pointed out that it is not as simple as that because the user entity is related to many, many others so we would have to add this "deleted" flag to every relationship and every other entity and thus have "deleted" past purchases, "deleted" affiliations - a whole shadow schema full of such ghost entities. This would overtime degrade performance since now each query in the system has to add a clause: "where deleted = 0".
I assume this is a standard problem since many organizations must have this need of preserving deleted records (for legal or other reasons). I tried to talk them into creating a simple audit file where all the deletions will be recorded in XML but they were not too happy with that.
Is there a more satisfying solution to this than have this "deleted" flag?
In Oracle you could partition the main table (from where all 'cascaded' is coming from) into deleted/undeleted records. Put a view on top of the table just as original table would look like with where clause 'undeleted' and the undeleted partition would always be used. Only certain versions of oracle allow rows to migrate from partition to partition, u can look it up.
However, you would still suffer from 'performance' issues on the other tables, if your data is truly huge.
Best solution would probably be to separate all deleted/undeleted data into separate tables, and built a union view on top of them for reporting purposes.
my2c Someone might have a better idea.
On 13 May 2004 11:32:21 -0700, robertbrown1...@yahoo.com (Robert
Brown) wrote: >Our customer (of our ecommerce system) wants to be able to preserve >deleted entities in the database so that they can do reporting, >auditing etc.
>The system is quite complex where each end user can belong to multiple >institutional affiliations (which can purchase on behalf of the user). >The end user also has a rich trail of past transactions affiliations >etc. Thus in the schema each user entity is related to many others >which in turn relate to yet others and so on.
>In the past when a user was deleted all of his complex relationships >were also deleted in a cascading fashion. But now the customer wants >us to add a "deleted" flag to each user so that a user is never >_really_ deleted but instead his "deleted" flag is set to true. The >system subsequently behaves as if the user did not exist but the >customer can still do reports on deleted users.
>I pointed out that it is not as simple as that because the user entity >is related to many, many others so we would have to add this "deleted" >flag to every relationship and every other entity and thus have >"deleted" past purchases, "deleted" affiliations - a whole shadow >schema full of such ghost entities. This would overtime degrade >performance since now each query in the system has to add a clause: >"where deleted = 0".
>I assume this is a standard problem since many organizations must have >this need of preserving deleted records (for legal or other reasons). >I tried to talk them into creating a simple audit file where all the >deletions will be recorded in XML but they were not too happy with >that.
>Is there a more satisfying solution to this than have this "deleted" >flag?
>Thanks for your help,
>- robert
....... We use Oracle 8.1.7.4 on Solaris 2.7 boxes remove NSPAM to email
Em Thu, 13 May 2004 11:32:21 -0700, Robert Brown escreveu:
> I pointed out that it is not as simple as that because the user > entity is related to many, many others so we would have to add this > "deleted" flag to every relationship and every other entity and thus > have "deleted" past purchases, "deleted" affiliations - a whole > shadow schema full of such ghost entities.
I really see no problem. It is just a few entities that are, not quite deleted but actually 'deactivated'. All the others exist in relation to these, thus a simple join with a selection clause will do.
There is a performance hit, but this is the price of doing business...
The alternative would be to have a real shadow database or schema where data would be moved when deleted from the main one, but I guess the performance hit would be worse, besides all the application changes and added maintainance.
-- Leandro Guimarães Faria Corsetti Dutra +55 (11) 5685 2219 Av Sgto Geraldo Santana, 1100 6/71 +55 (11) 5686 9607 04.674-000 São Paulo, SP BRASIL http://br.geocities.com./lgcdutra/
> I pointed out that it is not as simple as that because the user entity > is related to many, many others so we would have to add this "deleted" > flag to every relationship and every other entity and thus have > "deleted" past purchases, "deleted" affiliations - a whole shadow > schema full of such ghost entities. This would overtime degrade > performance since now each query in the system has to add a clause: > "where deleted = 0".
Blanket statements like this are rarely true.
> I assume this is a standard problem since many organizations must have > this need of preserving deleted records (for legal or other reasons). > I tried to talk them into creating a simple audit file where all the > deletions will be recorded in XML but they were not too happy with > that.
So you think file is better than DBMS?
> Is there a more satisfying solution to this than have this "deleted" > flag?
Can I suggest that there is a modelling problem? I can imagine customer having multiple timestamp columns, for example
table customer ( ... DOB DATE, married DATE, divorced DATE, died DATE )
but can't possibly see why you need "is_alive", "is_married" boolean columns.
> In Oracle you could partition the main table (from where all > 'cascaded' is coming from) into deleted/undeleted records. Put a view > on top of the table just as original table would look like with where > clause 'undeleted' and the undeleted partition would always be used. > Only certain versions of oracle allow rows to migrate from partition > to partition, u can look it up.
> However, you would still suffer from 'performance' issues on the other > tables, if your data is truly huge.
> Best solution would probably be to separate all deleted/undeleted data > into separate tables, and built a union view on top of them for > reporting purposes.
> my2c > Someone might have a better idea.
> On 13 May 2004 11:32:21 -0700, robertbrown1...@yahoo.com (Robert > Brown) wrote:
> >Our customer (of our ecommerce system) wants to be able to preserve > >deleted entities in the database so that they can do reporting, > >auditing etc.
> >The system is quite complex where each end user can belong to multiple > >institutional affiliations (which can purchase on behalf of the user). > >The end user also has a rich trail of past transactions affiliations > >etc. Thus in the schema each user entity is related to many others > >which in turn relate to yet others and so on.
> >In the past when a user was deleted all of his complex relationships > >were also deleted in a cascading fashion. But now the customer wants > >us to add a "deleted" flag to each user so that a user is never > >_really_ deleted but instead his "deleted" flag is set to true. The > >system subsequently behaves as if the user did not exist but the > >customer can still do reports on deleted users.
> >I pointed out that it is not as simple as that because the user entity > >is related to many, many others so we would have to add this "deleted" > >flag to every relationship and every other entity and thus have > >"deleted" past purchases, "deleted" affiliations - a whole shadow > >schema full of such ghost entities. This would overtime degrade > >performance since now each query in the system has to add a clause: > >"where deleted = 0".
> >I assume this is a standard problem since many organizations must have > >this need of preserving deleted records (for legal or other reasons). > >I tried to talk them into creating a simple audit file where all the > >deletions will be recorded in XML but they were not too happy with > >that.
> >Is there a more satisfying solution to this than have this "deleted" > >flag?
> >Thanks for your help,
> >- robert
> ....... > We use Oracle 8.1.7.4 on Solaris 2.7 boxes > remove NSPAM to email
>Thanks for your answer. This particular customer is using Oracle but >our software is supported on SQL server as well.
>> In Oracle you could partition the main table (from where all >> 'cascaded' is coming from) into deleted/undeleted records. Put a view >> on top of the table just as original table would look like with where >> clause 'undeleted' and the undeleted partition would always be used. >> Only certain versions of oracle allow rows to migrate from partition >> to partition, u can look it up.
>> However, you would still suffer from 'performance' issues on the other >> tables, if your data is truly huge.
>> Best solution would probably be to separate all deleted/undeleted data >> into separate tables, and built a union view on top of them for >> reporting purposes.
>> my2c >> Someone might have a better idea.
>> On 13 May 2004 11:32:21 -0700, robertbrown1...@yahoo.com (Robert >> Brown) wrote:
>> >Our customer (of our ecommerce system) wants to be able to preserve >> >deleted entities in the database so that they can do reporting, >> >auditing etc.
>> >The system is quite complex where each end user can belong to multiple >> >institutional affiliations (which can purchase on behalf of the user). >> >The end user also has a rich trail of past transactions affiliations >> >etc. Thus in the schema each user entity is related to many others >> >which in turn relate to yet others and so on.
>> >In the past when a user was deleted all of his complex relationships >> >were also deleted in a cascading fashion. But now the customer wants >> >us to add a "deleted" flag to each user so that a user is never >> >_really_ deleted but instead his "deleted" flag is set to true. The >> >system subsequently behaves as if the user did not exist but the >> >customer can still do reports on deleted users.
>> >I pointed out that it is not as simple as that because the user entity >> >is related to many, many others so we would have to add this "deleted" >> >flag to every relationship and every other entity and thus have >> >"deleted" past purchases, "deleted" affiliations - a whole shadow >> >schema full of such ghost entities. This would overtime degrade >> >performance since now each query in the system has to add a clause: >> >"where deleted = 0".
>> >I assume this is a standard problem since many organizations must have >> >this need of preserving deleted records (for legal or other reasons). >> >I tried to talk them into creating a simple audit file where all the >> >deletions will be recorded in XML but they were not too happy with >> >that.
>> >Is there a more satisfying solution to this than have this "deleted" >> >flag?
>> >Thanks for your help,
>> >- robert
>> ....... >> We use Oracle 8.1.7.4 on Solaris 2.7 boxes >> remove NSPAM to email
> In Oracle you could partition the main table (from where all > 'cascaded' is coming from) into deleted/undeleted records. Put a view > on top of the table just as original table would look like with where > clause 'undeleted' and the undeleted partition would always be used. > Only certain versions of oracle allow rows to migrate from partition > to partition, u can look it up.
> However, you would still suffer from 'performance' issues on the other > tables, if your data is truly huge.
> Best solution would probably be to separate all deleted/undeleted data > into separate tables, and built a union view on top of them for > reporting purposes.
> my2c > Someone might have a better idea.
> On 13 May 2004 11:32:21 -0700, robertbrown1...@yahoo.com (Robert > Brown) wrote:
> >Our customer (of our ecommerce system) wants to be able to preserve > >deleted entities in the database so that they can do reporting, > >auditing etc.
> >The system is quite complex where each end user can belong to multiple > >institutional affiliations (which can purchase on behalf of the user). > >The end user also has a rich trail of past transactions affiliations > >etc. Thus in the schema each user entity is related to many others > >which in turn relate to yet others and so on.
> >In the past when a user was deleted all of his complex relationships > >were also deleted in a cascading fashion. But now the customer wants > >us to add a "deleted" flag to each user so that a user is never > >_really_ deleted but instead his "deleted" flag is set to true. The > >system subsequently behaves as if the user did not exist but the > >customer can still do reports on deleted users.
> >I pointed out that it is not as simple as that because the user entity > >is related to many, many others so we would have to add this "deleted" > >flag to every relationship and every other entity and thus have > >"deleted" past purchases, "deleted" affiliations - a whole shadow > >schema full of such ghost entities. This would overtime degrade > >performance since now each query in the system has to add a clause: > >"where deleted = 0".
> >I assume this is a standard problem since many organizations must have > >this need of preserving deleted records (for legal or other reasons). > >I tried to talk them into creating a simple audit file where all the > >deletions will be recorded in XML but they were not too happy with > >that.
> >Is there a more satisfying solution to this than have this "deleted" > >flag?
> >Thanks for your help,
> >- robert
> ....... > We use Oracle 8.1.7.4 on Solaris 2.7 boxes > remove NSPAM to email
As Leandro and Mikito point out, you have flaws in your design. First, deleting the user entity is the only thing that is logically deleted, but rather than make it an indicator, use a date, as in DELETE_DATE as nullable. Non-null entities are to be ignored. In addition, all relationships to the user entity should be evaluated to determine if they need to be sensitive to the user entity's status. Some may, and others may not. This is a business question, not a technical question. For instance, if there are invoices and inventory tables linked in some way to the user entity, would you really want to *not* display that information just because an associated user entity had been deleted?
I'd really need to see your design and talk to your business to know exactly what impact this design change would have on your database.
> Our customer (of our ecommerce system) wants to be able to preserve > deleted entities in the database so that they can do reporting, > auditing etc.
> The system is quite complex where each end user can belong to multiple > institutional affiliations (which can purchase on behalf of the user). > The end user also has a rich trail of past transactions affiliations > etc. Thus in the schema each user entity is related to many others > which in turn relate to yet others and so on.
> In the past when a user was deleted all of his complex relationships > were also deleted in a cascading fashion. But now the customer wants > us to add a "deleted" flag to each user so that a user is never > _really_ deleted but instead his "deleted" flag is set to true. The > system subsequently behaves as if the user did not exist but the > customer can still do reports on deleted users.
> I pointed out that it is not as simple as that because the user entity > is related to many, many others so we would have to add this "deleted" > flag to every relationship and every other entity and thus have > "deleted" past purchases, "deleted" affiliations - a whole shadow > schema full of such ghost entities. This would overtime degrade > performance since now each query in the system has to add a clause: > "where deleted = 0".
> I assume this is a standard problem since many organizations must have > this need of preserving deleted records (for legal or other reasons). > I tried to talk them into creating a simple audit file where all the > deletions will be recorded in XML but they were not too happy with > that.
> Is there a more satisfying solution to this than have this "deleted" > flag?
robertbrown1...@yahoo.com (Robert Brown) wrote: > Our customer (of our ecommerce system) wants to be able to preserve > deleted entities in the database so that they can do reporting, > auditing etc.
> The system is quite complex where each end user can belong to multiple > institutional affiliations (which can purchase on behalf of the user). > The end user also has a rich trail of past transactions affiliations > etc. Thus in the schema each user entity is related to many others > which in turn relate to yet others and so on.
> In the past when a user was deleted all of his complex relationships > were also deleted in a cascading fashion.
Users are not deleted. They may die, they may be incarcerated, they may be fired, but they cannot be deleted. Only data is deleted.
> But now the customer wants > us to add a "deleted" flag to each user so that a user is never > _really_ deleted but instead his "deleted" flag is set to true. The > system subsequently behaves as if the user did not exist but the > customer can still do reports on deleted users.
> I pointed out that it is not as simple as that because the user entity > is related to many, many others so we would have to add this "deleted" > flag to every relationship and every other entity and thus have > "deleted" past purchases,
If I am hit by a bus tomorrow, will the office furniture I ordered (but which is owned by the company) disappear? Will the patents I've generated for the company no longer be valid?
> "deleted" affiliations - a whole shadow > schema full of such ghost entities.
Why would each of these need it's own deleted flag? If they are being joined against the user table, then they can just rely on the user table to provide the necessary logic. If they are not being joined against the user table, then why would you want them to be excluded based on a condition of a user?
> This would overtime degrade > performance since now each query in the system has to add a clause: > "where deleted = 0".
Do they want the reporting to go back indefinately, or would they like the data to be "really" deleted after a certain time of "virtual" deletion?
> I assume this is a standard problem since many organizations must have > this need of preserving deleted records (for legal or other reasons).
It is a standard problem, but there is not a standard solution.
Xho
-- -------------------- http://NewsReader.Com/ -------------------- Usenet Newsgroup Service $9.95/Month 30GB
Robert Brown (robertbrown1...@yahoo.com) writes: > Our customer (of our ecommerce system) wants to be able to preserve > deleted entities in the database so that they can do reporting, > auditing etc.
> The system is quite complex where each end user can belong to multiple > institutional affiliations (which can purchase on behalf of the user). > The end user also has a rich trail of past transactions affiliations > etc. Thus in the schema each user entity is related to many others > which in turn relate to yet others and so on.
> In the past when a user was deleted all of his complex relationships > were also deleted in a cascading fashion. But now the customer wants > us to add a "deleted" flag to each user so that a user is never > _really_ deleted but instead his "deleted" flag is set to true. The > system subsequently behaves as if the user did not exist but the > customer can still do reports on deleted users.
> I pointed out that it is not as simple as that because the user entity > is related to many, many others so we would have to add this "deleted" > flag to every relationship and every other entity and thus have > "deleted" past purchases, "deleted" affiliations - a whole shadow > schema full of such ghost entities. This would overtime degrade > performance since now each query in the system has to add a clause: > "where deleted = 0".
Indeed, this is quite a big change if this was not in the system from the beginning.
In our system, many items are not deletable, because even if an item goes away, there might still be plenty of references to it. So in these tables - accounts, customers, instruments, currencies, to name a few - there is a deregdate column. This column serves the double purpose of telling us if the item is still active, and if it is not, when the entity was deregistered. And, yes it happens that deregistered entities are revived too!
A non-trivial issue here is to know when a deregistered item should be included and when it should not. If you are producing a list of last month's tranactions, it obviously should. But if you are populating a list of available products to order, deregistered products should not be included. So when you introduce this concept in your system, you have a lot to write in your functional specification too.
Another issue which becomes complicated, is referential integrity. accounts.accresponsible may refer to the users table, but if the account is active, the user must be too. Currently we do this in triggers, which is a bit complex, and more difficult than foreign-key constraints. One thought I've been playing with is to have tables like active_accounts, active_users etc. This would not be the partition suggested by others, since active_accounts would only hold the account number, and foreign keys to other deregisterable items. Thus, active_accounts.accresponsible would refer to active_users.userid. The full data would still be in acconts and users, for both active and deregistered items.
I have never considered the performance cost for "AND deregdate IS NULL", but I would suggest that if you need to access that column, you probably already access some column which is not in any index, so there is already a bookmark lookup, so I would not expect any particular penalty. (This applies to MS SQL Server. Not that I really expect Oracle to be different, but I don't know Oracle.)
-- Erland Sommarskog, SQL Server MVP, som...@algonet.se