History, history exclusions, and soft delete

145 views
Skip to first unread message

Karsten Sperling

unread,
Oct 20, 2016, 11:20:00 PM10/20/16
to Ebean ORM
I'm looking into enabling @History on a number of key entities in my data model (essentially all the mutable entities that make up the core data model about users and accounts of my service). In this context I've got a number of questions

1) It seems History already implicitly retains a copy of deleted entities, so enabling @SoftDelete on top of it seems unnecessarily complex. However with just History there doesn't seem to be a built-in way to get the latest version of a possibly-deleted entity. What's considered best practice in a situation like this?

2) Some of the entities I'd like to enable History on have a few fields that I'd like to exclude from the history (mostly because they change frequently and where history is not relevant, one other case being the password hash for a user where I'd like to avoid retaining old values for security reasons). I've tried marking them with @HistoryExclude, but that seems to only work on many-to-many fields and is ignored on anything else. A workaround I can think of would be to split the 'transient' fields of each entity off into a separate table that doesn't have history. Is that the best approach?

3) A @WhenModified column seems somewhat superfluous in a table with @History it should always exactly match the lower bound of the sys_period field?

Cheers, Karsten

Rob Bygrave

unread,
Oct 21, 2016, 5:50:40 PM10/21/16
to ebean@googlegroups
> 1)

@SoftDelete has Foreign Key implications which is a significant difference between the two.  The soft deleted row is still in the "main" table etc and hence the delete does not HAVE to cascade - there are no PK or FK relationships on the history tables.  So history tables are not a direct alternative to soft delete as I see it based on the difference in delete cascade behavior.



> with just History there doesn't seem to be a built-in way to get the latest version of a possibly-deleted entity.

Correct, there is just the findVersions() at this stage.


> 2) - one other case being the password hash for a user where I'd like to avoid retaining old values

Probably the nice approach would be to be able to put @HistoryExclude on those properties and for the underlying trigger DDL to just set null (or exclude the column from the trigger etc).  That would be the best thing to do in that case and would be a nice enhancement to do.


> 2) - mostly because they change frequently and where history is not relevant,

This is more tricky.  If the trigger can easily determine that the changed columns are all ignorable then it as option to keep both sets of columns in the same table (so this depends on the DB trigger language and features).  However, in practice to date I've always seen the separate table in a 1-1 relationship and that can have benefits if the tables have lots of rows (index, space management etc).

Right now, you have to split the two (unless you modify the generated triggers etc).



> 3) A @WhenModified column seems somewhat superfluous

Yes it does and I need to look at this some more.  In theory the sys_period column(s) are 'internal' and @WhenModified is part of the 'public' model.  In practice it looks like we could map @WhenModified to the sys_period column and this might be beneficial from the perspective of indexes (only indexing sys_period and not additionally having an index on @WhenModified) etc.

So yes this is something to look at.



Cheers, Rob.




--

---
You received this message because you are subscribed to the Google Groups "Ebean ORM" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ebean+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Rob Bygrave

unread,
Oct 21, 2016, 5:55:32 PM10/21/16
to ebean@googlegroups
FYI: Logged https://github.com/ebean-orm/ebean/issues/837
ENH: Support using @HistoryExclude on normal properties (other than @ManyToMany)

Karsten Sperling

unread,
Oct 21, 2016, 10:35:40 PM10/21/16
to Ebean ORM


On Saturday, October 22, 2016 at 10:50:40 AM UTC+13, Rob Bygrave wrote:
> 1)

@SoftDelete has Foreign Key implications which is a significant difference between the two.  The soft deleted row is still in the "main" table etc and hence the delete does not HAVE to cascade - there are no PK or FK relationships on the history tables.  So history tables are not a direct alternative to soft delete as I see it based on the difference in delete cascade behavior.
So have you used both together on the same entity in practice? My immediate use case (an events table that  references users of an account, which may subsequently be removed) is really more soft delete than history, however I'm quite certain that I'll want full history on these tables pretty soon anyway for auditing purposes.
 
> 2) - one other case being the password hash for a user where I'd like to avoid retaining old values

Probably the nice approach would be to be able to put @HistoryExclude on those properties and for the underlying trigger DDL to just set null (or exclude the column from the trigger etc).  That would be the best thing to do in that case and would be a nice enhancement to do.
Yeah I was wondering about that. I guess for trigger-based implementations it would complicate the triggers a fair bit, at least if you're trying to ignore changes that only have excluded properties entirely. I guess for the security use case nulling out the values (or not having the columns) would be sufficient, whereas for the "frequent irrelevant updates" case the separate table is probably the better solution.

Another question mark is that this presumably would not be supported on native SQL2011 temporal tables.
 

> 2) - mostly because they change frequently and where history is not relevant,

This is more tricky.  If the trigger can easily determine that the changed columns are all ignorable then it as option to keep both sets of columns in the same table (so this depends on the DB trigger language and features).  However, in practice to date I've always seen the separate table in a 1-1 relationship and that can have benefits if the tables have lots of rows (index, space management etc).

Right now, you have to split the two (unless you modify the generated triggers etc).


> 3) A @WhenModified column seems somewhat superfluous

Yes it does and I need to look at this some more.  In theory the sys_period column(s) are 'internal' and @WhenModified is part of the 'public' model.  In practice it looks like we could map @WhenModified to the sys_period column and this might be beneficial from the perspective of indexes (only indexing sys_period and not additionally having an index on @WhenModified) etc.

So yes this is something to look at.
Another question in that context is what the actual time stamp used is... @WhenModified seems to use System.currentTimeMills() at the time of the actual operation, whereas the history triggers use the transaction timestamp on the DB side. It seems like it would be nice if those values agreed. For example this would make it possible to look at the current @WhenModified value and lookup the previous state in the history table (assuming the change don't happen more frequently than the resolution of the timestamp)

I guess using 'now()' or similar db-side when setting @When... columns would have a similar penalty to db-generated ids; it would need to be fetched back separately if you also want to have the correct value in the bean itself.

Rob Bygrave

unread,
Oct 25, 2016, 7:27:30 AM10/25/16
to ebean@googlegroups
On 22 October 2016 at 15:35, Karsten Sperling <kar...@sperling.co.nz> wrote:


On Saturday, October 22, 2016 at 10:50:40 AM UTC+13, Rob Bygrave wrote:
> 1)

@SoftDelete has Foreign Key implications which is a significant difference between the two.  The soft deleted row is still in the "main" table etc and hence the delete does not HAVE to cascade - there are no PK or FK relationships on the history tables.  So history tables are not a direct alternative to soft delete as I see it based on the difference in delete cascade behavior.
So have you used both together on the same entity in practice? My immediate use case (an events table that  references users of an account, which may subsequently be removed) is really more soft delete than history, however I'm quite certain that I'll want full history on these tables pretty soon anyway for auditing purposes.

I think having both is fine.  I personally have not done that yet in that thus far @SoftDeleted rows tend to mean that they are never deleted but just in a permanent "non active" mode.
 
 
> 2) - one other case being the password hash for a user where I'd like to avoid retaining old values

Probably the nice approach would be to be able to put @HistoryExclude on those properties and for the underlying trigger DDL to just set null (or exclude the column from the trigger etc).  That would be the best thing to do in that case and would be a nice enhancement to do.
Yeah I was wondering about that. I guess for trigger-based implementations it would complicate the triggers a fair bit, at least if you're trying to ignore changes that only have excluded properties entirely. I guess for the security use case nulling out the values (or not having the columns) would be sufficient, whereas for the "frequent irrelevant updates" case the separate table is probably the better solution.

Yes, that is how I see it.
 

Another question mark is that this presumably would not be supported on native SQL2011 temporal tables.

That is right. I don't think this is supported in native SQL2011 implementations per say.  It is still worth supporting for Postgres and MySql etc though.
 
 

> 2) - mostly because they change frequently and where history is not relevant,

This is more tricky.  If the trigger can easily determine that the changed columns are all ignorable then it as option to keep both sets of columns in the same table (so this depends on the DB trigger language and features).  However, in practice to date I've always seen the separate table in a 1-1 relationship and that can have benefits if the tables have lots of rows (index, space management etc).

Right now, you have to split the two (unless you modify the generated triggers etc).


> 3) A @WhenModified column seems somewhat superfluous

Yes it does and I need to look at this some more.  In theory the sys_period column(s) are 'internal' and @WhenModified is part of the 'public' model.  In practice it looks like we could map @WhenModified to the sys_period column and this might be beneficial from the perspective of indexes (only indexing sys_period and not additionally having an index on @WhenModified) etc.

So yes this is something to look at.
Another question in that context is what the actual time stamp used is... @WhenModified seems to use System.currentTimeMills() at the time of the actual operation, whereas the history triggers use the transaction timestamp on the DB side. It seems like it would be nice if those values agreed. For example this would make it possible to look at the current @WhenModified value and lookup the previous state in the history table (assuming the change don't happen more frequently than the resolution of the timestamp)

Yes. In fact we (the sponsor and I) discussed this exact issue.  We both felt it would be nice to be able to configure the @WhenModified to be able to use the underlying DB now() function for @History tables.
 

I guess using 'now()' or similar db-side when setting @When... columns would have a similar penalty to db-generated ids; it would need to be fetched back separately if you also want to have the correct value in the bean itself.

 
Yes exactly.  However, if we insert (or update) but then don't NEED to return or use the generated value then using the DB now() would be good for the reasons you have stated. 

So yes, I think it would be a really nice option to be able to choose perhaps along the lines of - @WhenModified(useDbNow=true) ... or a global option like "all @WhenModified on @History tables use DB now()" ... or both.
 
Reply all
Reply to author
Forward
0 new messages