Google Groups

Re: Deprecate change pk + save behavior for 1.4


Ian Clelland Nov 30, 2011 1:15 PM
Posted in group: Django developers (Contributions to Django itself)


On Wed, Nov 30, 2011 at 12:40 PM, Kääriäinen Anssi <anssi.ka...@thl.fi> wrote:

"""
Is this referring exclusively to natural, or user-specified primary key columns? Despite Luke's reference to nullable primary keys (are these even allowed by SQL?), a common idiom for copying objects is this:
"""

Not allowed by SQL specification, but many databases do allow them (source wikipedia).

/me runs off to go correct Wikipedia ;) 

I checked the Wikipedia article on Primary Key first, and didn't see that, but I did note this:

A table can have at most one primary key, but more than one unique key. A primary key is a combination of columns which uniquely specify a row. It is a special case of unique keys. One difference is that primary keys have an implicit NOT NULL constraint while unique keys do not.


 A PRIMARY KEY is a unique index where all key columns must be defined as NOT NULL. If they are not explicitly declared as NOT NULL, MySQL declares them so implicitly (and silently).
 

Technically, a primary key constraint is simply a combination of a unique constraint and a not-null constraint.


A primary key constraint combines a NOT NULL constraint and a unique constraint in a single declaration. That is, it prohibits multiple rows from having the same value in the same column or combination of columns and prohibits values from being null. 
 


"""
   obj.pk = None
   obj.save()

I have used use this pattern in more instances than I can remember; whether for duplicating objects, or for making new variants of existing objects. I would hate to see the behaviour deprecated, or worse, for the old object to simply get reassigned a new (or null) id.
"""

If nullable primary keys are going to be allowed, then the above can not work.

That would be a huge backwards compatibility issue (from my perspective, at least).
 
You would need to use NoneMarker in there, or .save() would need a kwarg for backwards compatibility mode. obj.clone() is still another possibility. Maybe nullable primary keys should be forbidden?

(I like that last one best :) )
 

"""
For changing natural primary key fields, I would prefer to see a pattern like this:

class User:
  firstname = models.CharField
  lastname = models.CharField
  pk = (firstname, lastname)

u = User.objects.get(firstname='Anssi', lastname='Kääriäinen')
u.firstname='Matti'
u.save(force_update=True)
"""

That is a possibility, although currently that has well defined meaning: try to update the object with pk ('Matti', 'Kääriäinen'), error if it does not exist in the DB.

That is also true, and makes the whole situation more difficult to resolve.
The trouble stems from the fact that once the user changes any of the primary key fields, we currently lose the ability to refer uniquely to the row in the database that represents "the current object". (This may actually go against one of Codd's rules, but then, the ORM isn't actually a database itself).

Originally, Django ORM objects had essentially no metadata to tie them to the database, once retrieved. They could be only associated with a particular row by their implicit id field, which made the force_insert and force_update (and the mess of logic in Model.save()) necessary, and made idioms like "clear the pk and save to copy" possible.

With multi-db, Django has moved part-way to marrying an ORM object with the actual database storage location -- objects have a hidden record of the database from which they were retrieved, but id is still just a mutable field like any other. It sounds like what is being proposed is that model instances should not only know which database, but which table row their backing data is stored in -- and that should persist despite any changes made in code to any field in the instance.

(I'm still not sure whether I should support or oppose this, but my knee-jerk reaction is "Hey! That's going to break half of my management commands!")
 

"""
(specifically, with the force_update parameter being required for a PK change). Then, as long as we store the original PK values, the object can be updated in place. A bare save() would work just as currently changing the id field does -- create a new row if possible, otherwise, update the row whose PK matches the new values.
"""

IMHO forbidding creation of a new object while leaving the old object in place when calling save() is needed. Current behavior is unintuitive. One clear indication of this being unintuitive is that even Django's admin does not get it right. If bare save() will be deprecated, then an upgrade path for current uses is needed. A new kwarg for save (old_pk=None would be a possibility) or obj.clone() would be needed.

I'm not sure that I agree -- I don't know if there needs to be a fundamental distinction between a new model instance and one that was retrieved from the database. I do agree that there should be a way to specify "change the primary key on this object" vs "save a new object with this primary key".

(And I'm waiting for someone to come in and say "The current behaviour is documented, leave it alone. Use filter(pk=old_value).update(pk=new_value) if you want to change a primary key. It's all in code anyway, and you should know which fields are PK fields before you go messing with them." :) )

Solving all these problems before 1.4 seems hard.

Agreed.
 

 - Anssi

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-d...@googlegroups.com.
To unsubscribe from this group, send email to django-develop...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.




--
Regards,
Ian Clelland
<clel...@gmail.com>