Pre-DEP: Meta.without_primary_key (related to CompositeFields)


sky.d...@moveon.org

May 22, 2017, 2:52:11 PM
to Django developers (Contributions to Django itself)
Hi,

We have several legacy database tables that don't have primary keys. With older versions of Django, we hacked around this by declaring a field to be the primary key even though it was not one, but recent Django versions validate pks more strictly.

Some (but not all) of our legacy tables have composite primary keys -- i.e. rows are unique only across a combination of fields.  This harks back to the CompositeField work and discussion [0].

But CompositeFields are not enough for us: some of our tables are essentially append-only and have no uniqueness constraints across any or all fields.  It also seems that the CompositeField work has stalled several times precisely because it aims directly at a very complex end goal.

I'd like to propose, both as an incremental step toward CompositeFields and as something useful in itself, a model Meta option called `without_primary_key`: if Meta.without_primary_key=True, it would turn off the pk-related complaints during model validation, etc.  One might object that things like get/delete/caching can't work with such a model.  However, those features can't be supported on tables without a primary key anyway.
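
To make the intent concrete, here is a toy sketch in plain Python of how such a flag could gate a pk check. It is not Django code: `ToyField`, `ToyMeta` and `check_primary_key` are hypothetical names invented for illustration, and `without_primary_key` is only the option proposed above, not an existing Meta option.

```python
# Toy illustration only: none of these classes exist in Django.
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyField:
    name: str
    primary_key: bool = False


@dataclass
class ToyMeta:
    fields: List[ToyField] = field(default_factory=list)
    without_primary_key: bool = False  # the proposed option (hypothetical)


def check_primary_key(meta: ToyMeta) -> List[str]:
    """Return pk-related validation errors for a toy model definition."""
    pk_count = sum(1 for f in meta.fields if f.primary_key)
    if meta.without_primary_key:
        # The model opted out of having a pk: only complain about a contradiction.
        if pk_count:
            return ["without_primary_key=True conflicts with a primary_key field"]
        return []
    if pk_count != 1:
        return [f"expected exactly one primary key field, found {pk_count}"]
    return []


legacy = ToyMeta(fields=[ToyField("list_id"), ToyField("user_id")],
                 without_primary_key=True)
strict = ToyMeta(fields=[ToyField("list_id"), ToyField("user_id")])
```

Under these assumptions, `check_primary_key(legacy)` reports nothing, while `check_primary_key(strict)` complains about the missing pk.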

Incrementally, after without_primary_key is implemented/supported, we could then add features for models that set without_primary_key but also have a Meta.unique_together value across some fields -- i.e. start trying to support inheritance and/or ForeignKey references to those tables, building up support.

I've started looking at how deep a change this would be, and believe it's pretty tractable.
Before I get too involved with a DEP and PR, what do people think?

/sky


Shai Berger

May 23, 2017, 4:31:47 AM
to django-d...@googlegroups.com
Hi,

Thank you for making this suggestion.

It is my guess that allowing pk-less models will place quite a burden on many
parts of Django, which assume a PK exists. There may also be other solutions
to the problem you raise -- e.g. changing the legacy table to add a PK,
perhaps while providing a pk-less view to any legacy systems which need to
access it.
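
A rough sketch of the "add a PK, expose a pk-less view" idea, using SQLite through Python's `sqlite3` module purely for illustration. The table, view and trigger names are hypothetical, and a real legacy database would likely be a different engine with its own rules for writable views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- New physical table: the legacy columns plus a surrogate key.
    CREATE TABLE vote_with_pk (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        list_id INTEGER NOT NULL,
        user_id INTEGER NOT NULL
    );
    -- View with the legacy shape (no id column) for the old application.
    CREATE VIEW vote AS SELECT list_id, user_id FROM vote_with_pk;
    -- SQLite views are read-only, so route legacy INSERTs to the real table.
    CREATE TRIGGER vote_insert INSTEAD OF INSERT ON vote
    BEGIN
        INSERT INTO vote_with_pk (list_id, user_id)
        VALUES (NEW.list_id, NEW.user_id);
    END;
""")

# The legacy application keeps appending through the pk-less view,
# including rows that are exact duplicates of each other.
conn.execute("INSERT INTO vote (list_id, user_id) VALUES (1, 42)")
conn.execute("INSERT INTO vote (list_id, user_id) VALUES (1, 42)")

# What the legacy application sees, and what a pk-based model would see.
legacy_rows = conn.execute("SELECT * FROM vote").fetchall()
new_rows = conn.execute(
    "SELECT id, list_id, user_id FROM vote_with_pk ORDER BY id"
).fetchall()
```

The legacy side still sees two identical rows, while the underlying table distinguishes them by `id`.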

In general, SQL database tables without any uniqueness guarantee are an
antipattern, which I don't believe Django should support. The question remains
how much such a feature can be made to contribute towards composite keys.

All in all, I would like to know more about your use case -- if you are going
to have no get/delete, no Admin, no updating save, how exactly are you going
to use these models? As you may be aware, since the Meta API formalization, it
is possible to create pseudo-models which are good enough for many purposes,
without changing Django and with much less strict adherence to "real" models'
behavior. Perhaps that is the way to go?

HTH,
Shai.

sky.d...@moveon.org

May 23, 2017, 2:45:30 PM
to Django developers (Contributions to Django itself)
Hi Shai,

Thanks for your feedback.

Our 'real' use case is that we have an opaque legacy application that we are rewriting in Django.  Adding columns to the tables before we have migrated away from the old application is too risky.  I'm at the PyCon 2017 sprints now, and as JKM put it -- having to deal with non-primary-key databases might be due to poor life choices :-).  I agree that non-uniqueness is an antipattern -- my argument is purely about what systems the ORM can interact with/support rather than about a feature that would/should be used to build new applications/models.  I would hope that, just like with RawSQL or @csrf_exempt, the docs would make that clear.

The huge value we get even without get() or admin/etc is making multi-join queries super-easy through the amazing queryset api.  I'm not very familiar with the Meta API formalization, so maybe you could elaborate how we would use it in such a circumstance -- would I somehow be setting _meta = CustomThing() the way we add model managers by setting objects?  Would that work on the same backend/db connection?

I originally feared that removing the PK would be a complex and burdensome project -- which is why I have been content with hacks until now. From a preliminary look, pk is not as tightly bound into the code as I thought.  This is mostly because unsaved model objects are already possible, so a lot of code already tests for model.pk before doing something with it.  I was also surprised to see that, e.g., db.migrations doesn't seem like it would need any changes at all.

To go back to use-cases, and the relation to composite fields, a lot of our keyless tables seem to be about set membership.  Most of these *are* basically elaborate many-to-many intermediate tables, where two (or three) fields link several tables together as connected. When we have those multiple ids, I have been using models.ForeignObject to make ORM links.

One is something like:

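from django.db import models

class Vote(models.Model):
    # on_delete is spelled out; CASCADE was the implicit default in Django < 2.0
    list = models.ForeignKey("List", on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    # ... a bunch of other fields

    # Two-column link to VoteComment via the (list_id, user_id) pair
    comment = models.ForeignObject("VoteComment", on_delete=models.CASCADE,
                                   from_fields=['list_id', 'user_id'],
                                   to_fields=['list_id', 'user_id'])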

I know ForeignObject isn't an externally supported API.  However, I think it does gesture at how composite foreign keys would emerge from this.  The first step is to get rid of places that depend on a (unique and singular) primary key -- from there, we can (even slowly) add support for a ForeignKey pointing to an object that has no primary key but does have a matching unique_together set of fields.  Once we support ForeignKey() that way, we can work on support for inheritance and admin support.
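
For illustration, this is roughly the kind of two-column join such a link amounts to at the SQL level, sketched with SQLite via Python's `sqlite3`. The table and column names (`vote`, `vote_comment`, `choice`, `body`) are hypothetical stand-ins rather than our actual schema, and the `UNIQUE` constraint plays the role of unique_together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No primary key on either table.
    CREATE TABLE vote (list_id INTEGER, user_id INTEGER, choice TEXT);
    CREATE TABLE vote_comment (
        list_id INTEGER, user_id INTEGER, body TEXT,
        UNIQUE (list_id, user_id)   -- the unique_together analogue
    );
    INSERT INTO vote VALUES (1, 10, 'yes'), (1, 11, 'no'), (2, 10, 'yes');
    INSERT INTO vote_comment VALUES (1, 10, 'strongly agree'), (2, 10, 'fine');
""")

# Join on the (list_id, user_id) pair instead of a single-column key.
rows = conn.execute("""
    SELECT v.list_id, v.user_id, v.choice, c.body
    FROM vote AS v
    JOIN vote_comment AS c
      ON c.list_id = v.list_id AND c.user_id = v.user_id
    ORDER BY v.list_id, v.user_id
""").fetchall()
```

Because the pair is unique on the `vote_comment` side, each vote matches at most one comment, which is what a composite ForeignKey would need to rely on.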

/sky

Roger Gammans

May 25, 2017, 12:47:24 PM
to django-d...@googlegroups.com
Hi,

This would be useful for us too. Our use case is, again, a legacy schema:
we are rebuilding the system to use Django, but there are some models for
which we rely on composite-pk support, for the reasons below. With this
feature we might be able to use a vanilla Django (i.e. one without local
patches).

Many of the tables in our system are effectively what Django refers to
as a through table, i.e. the join table in a many-to-many relationship
or, more often than not, a multi-way join.  We almost always fetch by the
set of joined PKs, and those have been used as a composite primary key on
the table.

In this use case we accept not having admin support for these tables, as
many of them are computed detail tables, but we still have uniqueness
guarantees at the SQL level since the DB still has the tables defined
with a PK.

Django's query builder as standard cannot interact with these tables;
adding this option would allow us to code models which partially capture
the semantics and build queries against them. I don't think we lose the
ability to use get(), as there is still a uniqueness guarantee on the
table; it is just partially hidden from the ORM.
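
As a small illustration of that last point, here is a sketch with SQLite via Python's `sqlite3`. The `membership` table and the `get_membership` helper are hypothetical, not part of our schema or of Django. The database enforces the composite key, so a lookup by the full key can return at most one row even though no single-column pk exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE membership (
        group_id INTEGER NOT NULL,
        person_id INTEGER NOT NULL,
        role TEXT,
        PRIMARY KEY (group_id, person_id)
    )
""")
conn.execute("INSERT INTO membership VALUES (1, 7, 'owner')")


def get_membership(group_id, person_id):
    """Fetch by the full composite key; returns one row tuple or None."""
    rows = conn.execute(
        "SELECT group_id, person_id, role FROM membership "
        "WHERE group_id = ? AND person_id = ?",
        (group_id, person_id),
    ).fetchall()
    # The composite primary key guarantees at most one matching row.
    return rows[0] if rows else None


# A second row with the same key pair is rejected by the database itself.
try:
    conn.execute("INSERT INTO membership VALUES (1, 7, 'member')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```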

In some cases these are our largest tables, with 10^7 or even 10^8 rows,
so adding an additional field is not undertaken lightly. We also have
an additional restriction in that the legacy code is still active in
some parts of our system and still needs to interact with these tables as
well.

I hope that might explain one way this option could be helpful.