Major new django-reversion release - opinions needed

46 views
Skip to first unread message

Dave Hall

unread,
Feb 16, 2011, 6:44:19 AM2/16/11
to django-r...@googlegroups.com
Hello all,

I've put together a specification for a major new release of django-reversion. This release would be backwards-incompatible, and aim to fix a load of long-term niggles and desirable features in one fell swoop.

The specification is here:


I'd love to hear your feedback and comments over whether some of these changes are a good/bad idea. From my point of view, implementing all of the proposed changes would make reversion smaller, faster and more reliable. However, one of the changes in particular could be pretty controversial, so I'd love to hear your views.

None of the changes I'm proposing would affect the simple, plug-n-play admin integration (aside from providing a much nicer revision browser interface). They all affect the low-level api, so are only really of interest to those of you who are using it to add version control to your own view code.

I'm going to actually implement these changes in about a week, so now's the time to complain if you don't like what you see coming! :P

Best,

Dave.

--
Dave Hall
Etianen.com

Telephone: 07525 452381
Web: http://www.etianen.com

ses1984

unread,
Feb 16, 2011, 10:48:18 PM2/16/11
to Django Reversion Discussion
I read the spec and everything makes sense going forward, especially
the part about grouping diff helpers as a separate project.

One part confused me though. In your spec you wrote:

"I currently register all models in models.py but instead of the
middleware I use with reversion.revision(): to group revisions
together."

I'm not quite sure what you mean here.

DistantParts

unread,
Feb 17, 2011, 5:39:59 AM2/17/11
to Django Reversion Discussion
Just read through the list of changes, and nothing there looks like it
will be an issue for me, mostly because I'm already using JSON and
integer primary keys.

Best Wishes,
Steven

Andy Baker

unread,
Feb 17, 2011, 6:17:17 AM2/17/11
to django-r...@googlegroups.com
My thoughts are thus:

1. Django fully supports non-integer primary keys.
2. Many relational db classes teach that non-natural primary keys are bad
3. There are many other legitimate reasons for choosing non-integer
primary keys. (legacy schema, GUIDs etc.)

therefore:

People are always going to develop apps with non-integer primary keys.
These would be excluded if you dropped support for them from
Reversion.

> --
> You received this message because you are subscribed to the Google Groups
> "Django Reversion Discussion" group.
> To post to this group, send an email to django-r...@googlegroups.com.
> To unsubscribe from this group, send email to
> django-reversi...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-reversion?hl=en-GB.
>

David Hall

unread,
Feb 17, 2011, 6:41:16 AM2/17/11
to django-r...@googlegroups.com
Yeah, I can see those points. In fact, that's why I originally included support for non-integer primary keys.

However...

The way that django does generic relations to non-integer foreign keys is to store the object_id as a TextField. This has the following negative side effects:
  • Increased storage cost (small, mostly trivial)
  • Joins become impossible, requiring inefficient application code to compensate.
  • Indexes become impossible, requiring inefficient table scans to loop over models.
  • Unique clauses become impossible, requiring inefficient application hashes.
  • It still only works for fields that can be trivially serialized to strings. If your primary key is something like a date, or worse, a combination of two columns, it won't work.
At the moment, django-reversion (and django) only really support primary keys that are string or integer-based. This comes at the cost of inefficient application code. The question is, are these trade-offs worth it? It it worth slowing down the entire application just because the a small number of database tables that for some reason can't support an auto-incrementing primary key?

I originally followed django's lead and stored object_id references as TextFields. Having had to maintain reversion for two years, and seen the massive inefficiencies it produces, I'm not convinced that Django made the right choice here.

But I'm completely open to counter arguments! :P

Dave.

Andy Baker

unread,
Feb 17, 2011, 11:38:06 AM2/17/11
to django-r...@googlegroups.com
> I originally followed django's lead and stored object_id references as
> TextFields. Having had to maintain reversion for two years, and seen the
> massive inefficiencies it produces, I'm not convinced that Django made the
> right choice here.
> But I'm completely open to counter arguments! :P
> Dave.

I think the problem lies in not following Django's lead.

Agree or not, that's the design decision embedded in Django and if you
don't follow it then you'll be incompatible with an unknown number of
Django apps.

So I suppose the question is - how many apps don't use integer keys?

Oliver George

unread,
Feb 17, 2011, 6:45:34 PM2/17/11
to django-r...@googlegroups.com, Andy Baker
Chances are this is overly simplistic but you could give people the option to patch the models which use non-pk primary fields.  

I imagine it'd be easy enough to do:
1. Find models with non-integer primary keys
2. Change the non-integer PK to be a standard unique field (not-pk)
3. Add an integer primary key

That might boil down to "remove PK flag on non-integer primary keys and let django create an integer pk as per usual".

This wouldn't work for people unable to update their database structure (say querying a legacy system) but perhaps they are a corner case you're not overly worried about.





--

Andy Baker

unread,
Feb 18, 2011, 12:38:15 PM2/18/11
to django-r...@googlegroups.com
On Thu, Feb 17, 2011 at 11:45 PM, Oliver George <oliver...@gmail.com> wrote:
> Chances are this is overly simplistic but you could give people the option
> to patch the models which use non-pk primary fields.

Aren't there situations where adding an integer key would change the
behaviour of 3rd party apps?

David Hall

unread,
Feb 18, 2011, 12:44:01 PM2/18/11
to django-r...@googlegroups.com
Indeed.

I think that the non-integer primary keys will have to stay. They do add some pretty massive inefficiencies to some queries, but following Django's lead is probably the way forward here.

The downside is that reversion archives will always be relatively slow and resource-intensive to access. This is fine for last-resort recovery of old versions, but not so great for the people building it into public-facing code.

Ah well.

Dave.


--
You received this message because you are subscribed to the Google Groups "Django Reversion Discussion" group.
To post to this group, send an email to django-r...@googlegroups.com.
To unsubscribe from this group, send email to django-reversi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-reversion?hl=en-GB.

Andy Baker

unread,
Feb 18, 2011, 1:14:07 PM2/18/11
to django-r...@googlegroups.com
> The downside is that reversion archives will always be relatively slow and
> resource-intensive to access. This is fine for last-resort recovery of old
> versions, but not so great for the people building it into public-facing
> code.

Could you have a fast code path that covers the majority of apps that
DO have integer primary keys? I know it's an added coding burden but
it looks like it's a case of "performance, simplicity, compatibility:
pick any two..."

brillgene

unread,
Feb 19, 2011, 2:35:08 PM2/19/11
to Django Reversion Discussion
Here is some feedback:

Does reversion need to support different serialization backends?
Absolutely! We're using YAML currently and really like the fact that
we can switch anytime without impacting earlier or future reversions.
Having said that I assume the flag on each reversion is only 1
character (to store the serialization type)....since we have to
serialize entire objects, saving a char for serialization format
doesn't sound like much.
Also IMHO, the flexibility is not worth giving up. Lets say there is a
new serialization format on the block tomorrow, the current format
allows seamless backward compatible change.

Does reversion need to support non-integer primary keys?
Not, IMHO. As you have said, integer PKs can be significantly
optimized due to faster DB indexing and the concept it pretty much a
standard for SQL at least for relational databases.
Also, existing tables can be relatively easily migrated to add a PK
field as an integer.
However since others have expressed a need for it, I like the idea of
2 different code paths....maybe the code path selection can be manual
and specified at reversion config time at the project level to
simplify the complexity of maintaining which tables have integer PKs
and which do not.

Does reversion need to provide diff-match-patch helpers?
No opinion since we're not using these, but keeping them in the code
might ensure that we don't have to maintain separate documentation for
different versions of the code base in case changes in reversion core
have an impact on these helpers.
Spinning off into another project may be better but would add lots of
overhead and maintenance work.

Does reversion need to support such an advanced low-level api?
Well, the current API was dead simple to understand and use (great
job) but so is your new proposed one.
If the API needs to be changed to simplify the underlying code and
better thread safety, I'm all for it but here are my 2 cents:)

Django docs indicate that objects (Model Manager class) are best used
for dealing with multiple objects of that model. Using
Revision.objects.commit seems to indicate that multiple revisions are
being saved. May I recommend that instead of using the Model Manager,
we use another static function on the revision class? Revision.commit
directly? or use a manager class under a different name since
'objects' manager is typically used to manipulate multiple objects of
model Revision?
Having said that, its only a convention and with simple documentation
can be used that way easily enough.
Reply all
Reply to author
Forward
0 new messages