Guidance would be most welcome

4 views
Skip to first unread message

Paul Childs

unread,
Jun 13, 2006, 7:54:45 AM6/13/06
to Django users
I have been using Django for about a month now and I just moved over to
the development version.

I am creating an application in which the client has dictated very
stringent database record management in order to have a detailed audit
trail on all changes made data. This does not necessarily mean logging.

The rules are quite simple.

· once saved, records are immutable.
· records can not be deleted. they must be marked for deletion only.
· to "edit" a record a new record is created with the new changes
and a reference must point back to the old record
· a comment must be attached to every edit or delete to explain why
the action was taken.

I was planning on doing this with a combination of Managers, Views and
overriding the delete() and save() methods of my models. I have done
some experimentation and found this to be awkward, not very elegant and
to contravention of the DRY principle.

Is there a better way of achieving this or am I on the right track?

Waylan Limberg

unread,
Jun 13, 2006, 10:36:59 AM6/13/06
to django...@googlegroups.com
I have been thinking about how to do something similar - specifically,
wikipage history. Note that I have not actually tried any of this yet
but I would split the history out into a second table. Perhaps this
incomplete example to illustrate:

class Page(models.Model):
title = models.CharField(maxlength=30)
content = models.TextField()
status = models...

class PageHistory(models.Model):
title = models.CharField(maxlength=30)
content = models.TextField()
page = models.ForeignKey(Page)
date = models.DateField()
comment = models.textField()
author = models.ForeignKey(User)

Whenever a page is saved, the data in Page is overwritten and a new
record is added to PageHistory with the ForeignKey pointing back to
page. (Should PageHistory contain the current version of Page or only
previous versions? I'm not sure. The former feels more secure but the
later more DRY.) The only time PageHistory is accessed is if one needs
to review the history. I suppose this is a fancy form of logging.

The problem I'm not sure how to address is if a Page is deleted. To
preserve the history, the records should stay in PageHistory, but if
the record is deleted from Page, then we leave a ForeignKey hanging.
Perhaps a status field should be altered on delete. Subclassing the
save() and delete() methods for Page should do the trick.

In my case this only needs to be done for one model. Sounds like you
need to do this for every model which could get a little unweildy. I
don't know that I helped much.


--
----
Waylan Limberg
way...@gmail.com

Paul Childs

unread,
Jun 13, 2006, 10:48:58 AM6/13/06
to Django users
Thanks for the input Waylan.

It certainly gives me another angle on the solution.

I have been digging into the docs again and it seems to me that Django
is flexible enough to handle this without having to dive into the
source.

Of course it would be nice just to call the save() and delete() methods
and everything would behave as listed in my first post.

Cheers

DavidA

unread,
Jun 13, 2006, 12:13:57 PM6/13/06
to Django users
I have a similar problem where I want to maintain an audit trail for
"manual overrides" to inputs to our risk loop. I haven't implemented it
in Django yet, but the way I've done this in the past is similar to
Waylan: using a history version of the table. There are a few
differences, however. My table's will probably look more like this:

ParameterOverride
id
name
value
user
note

ParameterOverrideHist
id
name
value
user
note
time_from
time_thru

The "hist" table simply adds the time_from/thru fields to track the
date range that this value was valid. Whenever a new row is added to
the main table, it is also added to the hist table (the id is *not*
auto_increment on the hist table). Then time_from is set to the current
time and time_thru is set to null. When a row is edited, a new row is
inserted into the hist table with time_from set to now and time_thru
set to null, but the old row that had time_thru null is updated so its
time_thru is the same as the new row's time_from. Deleted rows simply
update the row in the hist table with time_thru = null to time_thru =
the current time.

Thus you can query the hist table on any give date/time to get the
active row at that moment: "where _date_ >= time_from and (_date_ <
time_thru or time_thru is null)". While these queries are a little
ugly, this design allows you to get the row at a given time more
effectively that only storing the "updated" date in it (you have to
find the row with max(date) where date >= _eval_date_)

In my C#/Sql Server life, I handled this update logic using triggers on
the main table. In Django I'd probably override save() and delete()
like Waylan suggested.

I've been noodling on how to do this more easily since I'll have need
this approach quite a few times in my applications. Maybe a decorator?
It would be way cool to just say:

@track_audits()
class ParameterOverride(models.Model):
name = models.CharField(...)
value = models.FloatField(...)
user = models.ForeignKey(...)
note = models.TextField(...)

And it creates the hist table, overrides the save/delete methods and
maybe adds a few convenience methods like get_audit_trail(),
get_on_date(...)

Anyway, food for thought...
-Dave

Carlos Yoder

unread,
Jun 15, 2006, 3:21:23 AM6/15/06
to django...@googlegroups.com
Hey there Paul.

I'm working with a price comparison website, and we have a similar
system going on. You see, products have 1+ prices, stored in a prices
table. We get prices daily from dealers, and handle historical prices
in a different table, just like Waylan said. The prices_hist table
holds the entire history for every product, and the "prices" table
just the current ones. On the front end of our app we only access the
"prices" table, leaving "prices_hist" just for report generation.

As to deletions (we call them discontinued offers), we have another
table called "prices_deleted" with fields idProducts, idDealers and a
timestamp. Every time a dealer stops selling a product, we delete it
from the 'prices' table, and insert a record to 'prices_deleted'. This
way, any history for the offer is maintained, but the snapshot table
("prices") stays current.

Sorry for the convoluted, incomplete text. I didn't have my morning
caffeine yet.

On 6/13/06, Paul Childs <pch...@gmail.com> wrote:
>


--
Carlos Yoder
http://carlitosyoder.blogspot.com

GrumpySimon

unread,
Jun 15, 2006, 4:59:46 AM6/15/06
to Django users
Why bother shuffling data between tables?

I've got a wiki app I've been working on (slowly - hoping to release
soon!), and I add a revision field ( IntegerField ), and just add a new
entry with revision+1. The latest version has the latest revision
number. Easy :-)

Then it's just a matter of something like this:

page = Page.objects.all().filter( title="foo" ).order_by( '-revision'
)[ : 1 ]

Personally, I would add all the revision info (timestamp, username) to
that same table as well. If you did feel the overwhelming urge to split
it up, then I'd move the content field into a table by itself with a
1:1 relationship to the history table, for a few reasons - it's going
to be a large amount of info, and lots of the times we don't want the
content, but just a page/history list etc. Getting the content and
ignoring it is a waste of database energy, to ignore this, we have to
use Django's values( ) filter, but this de-objectifies things, and we
lose stuff like get_absolute_url(). More importantly, in MySQL's MyISAM
tables rows are read inline, so even if we don't SELECT content,
MySQL's still reading it ( although maybe this is more of an argument
for using InnoDB tables or postgres ).

For deleting/locking/etc, it'd depend on whether a deletion/lock means
ALL Pages with that title are deleted/locked ( here I'd use a status
table which links a status to a title - i.e. all entries with title
"foo" ), or whether the status change belongs to one page only - in
which case, you may as well just add it to the main Page table.

Cheers,
Simon

Waylan Limberg

unread,
Jun 15, 2006, 4:24:09 PM6/15/06
to django...@googlegroups.com
On 6/13/06, Paul Childs <pch...@gmail.com> wrote:
>
> Of course it would be nice just to call the save() and delete() methods
> and everything would behave as listed in my first post.
>
This is one of the Summer of Code projects in the works. You can read
about it and offer some input on the dev-list:
http://groups.google.com/group/django-developers/browse_thread/thread/c36683c6bef6105f/2d5aeadbbe3d072d#2d5aeadbbe3d072d

Paul Childs

unread,
Jun 16, 2006, 6:12:07 AM6/16/06
to Django users
Sweet! Thanks for the heads up.

Reply all
Reply to author
Forward
0 new messages