Composite key development

422 views
Skip to first unread message

Craig Kerstiens

unread,
Nov 29, 2016, 7:10:38 AM11/29/16
to django-d...@googlegroups.com
Hi all,

My company (Citus Data) is interested in sponsoring some Django work. In particular work on support for composite primary keys. From what I understand this wouldn't be the first time the work has been explored and it sounds like it has a number of intricacies to it (https://github.com/django/deps/blob/master/draft/0191-composite-fields.rst and https://github.com/django/deps/blob/master/draft/0192-standalone-composite-fields.rst). Our hope here is that it would be done in line with something that could eventually become part of an official Django release vs. a one-off work around. 

While we know there's no guarantee of it being accepted, we'd love to find someone with some knowledge of all the existing areas that would have to be touched and has some experience contributing to Django core to improve that likelihood. As part of the work, we would want the two existing DEPs to be completed and work towards getting those accepted by the broader community. And hopefully it goes without saying, but we'd fully expect all the work to done in the open similar to other Django development. 

If you're interested in doing this work please reach out as we'd love to discuss further. And if we have enough interest we'll be doing a full CFP for the work to try to keep the process as fair as possible. 

~ Craig

Julien Phalip

unread,
Feb 27, 2017, 2:10:55 PM2/27/17
to Django developers (Contributions to Django itself)

Thank you Craig for bringing this up and for offering support. This is a long-wanted feature —the original ticket is 12 (!) years old— and as you know there has been extensive work done on this in the past, so it'd be great to find a way to push this all the way across the finish line.


I just took a bit of time and tried to review all past discussions and implementations on the subject. See my recap and questions below. If anyone here on this list has any answers or thoughts about this, please jump in! I will then compile all new feedback and try to help move this forward.


Composite fields DEPs

==================


Currently the consensus as a first step is to finalize these two draft DEPs:


https://github.com/django/deps/blob/master/draft/0191-composite-fields.rst

https://github.com/django/deps/blob/master/draft/0192-standalone-composite-fields.rst


Those drafts were written by Thomas Stephenson in 2015 and were discussed on this thread: https://groups.google.com/forum/#!topic/django-developers/MZUcOE6-7GY as well as in this pull request: https://github.com/django/deps/pull/12


It seems that some extra topics might need to be included and fleshed out in those DEPs, in particular:

- Inspectdb integration

- Migration implications

- Form fields handling

- Admin integration

- Serialization/deserialization


Some questions:

- Do you agree that all the topics above should be covered in the DEPs?

- Should other topics also be covered?

- Are the current design details (observer pattern, lookup functions, constraints management, etc) already accepted or do they still need more discussion?


Thomas had also started an implementation for these DEPs in this branch: https://github.com/ovangle/django/tree/dep_191


I've just merged current master into that branch and pushed it to my fork. You can see the details here: https://github.com/django/django/compare/eb97af0402536a86d5d8c916859986288bb88dc9...jphalip:dep_191


Some questions:

- What is the status of this implementation work?

- What are the pieces that still need to be implemented in order to fulfill the existing DEPs?

- Is this implementation at a stage where it can formally be reviewed? If so I can create a PR to initiate a more detailed discussion.


Composite primary keys (GSoC)

=========================


Michal Petrucha participated in two Google Summer of Code (GSoC) projects in 2011 and 2013, during which some extensive code, tests and documentation were written. Unfortunately this project didn't reach the point where it could be merged into core and it eventually fell behind. Some relevant resources about these projects can be found here:

- GSoC'2013 proposal: https://gist.github.com/koniiiik/5408673

- GSoC'2013 status updates: https://groups.google.com/forum/#!topic/django-developers/CD7OrkJ63zc

- Discussion about some implementation details: https://github.com/django/django/pull/1407

- Status update from Michal from last year: https://groups.google.com/forum/#!topic/django-developers/SunyXc_BTVM


Michal's implementation work is divided in two separate branches ("composite-fields" and "foreignkey-refactor"), presented below with their corresponding list of commits:


"composite-fields" branch:

- Branch: https://github.com/koniiiik/django/tree/soc2013/composite-fields

- Commits: https://github.com/django/django/compare/041a076dadce547d450cf73d97401d63cde8891d...koniiiik:soc2013/composite-fields


"foreign-refactor" branch:

- Branch: https://github.com/koniiiik/django/tree/soc2013/foreignkey-refactor

- Commits: https://github.com/django/django/compare/041a076dadce547d450cf73d97401d63cde8891d...koniiiik:soc2013/foreignkey-refactor


I took a quick stab at merging current master into those branches but didn't make it too far as it turned out to be a little tedious. The main reason is that since 2013 there have been extensive changes in the ORM internals, in particular around the Meta api. So I'm not sure if it's worth doing that merge at this point. However, it'd be a shame to let all that great work go to waste and hopefully some of it can still be reused.


Some questions:

- Are there any portions of code or some solutions from Michal's implementation that could be ported over to the DEPs and to Thomas' more recent implementation?

- What is the difference or relationship between the "composite-fields" and "foreignkey-refactor" branches?

- Should a new DEP be submitted to specifically tackle composite primary keys and foreign keys?


Other resources

============


Open tickets relating to this subject:

- Original ticket for composite primary keys: https://code.djangoproject.com/ticket/373

- Support for Virtual Fields: https://code.djangoproject.com/ticket/16508


Third-party implementations of composite fields and composite primary keys:

- https://github.com/simone/django-compositekey

- https://github.com/vporton/django-composite-fields

- https://bitbucket.org/bikeshedder/django-composite-field


Other relevant links:

- Out-dated Django wiki page: https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys

- SQL Alchemy's PrimaryKeyConstraint API: http://docs.sqlalchemy.org/en/latest/core/constraints.html#sqlalchemy.schema.PrimaryKeyConstraint


That's about it for now. Again, please anyone feel free to jump in with any questions or feedback, or if you're interested in contributing to this work.


Thanks!


Julien 

Michal Petrucha

unread,
Feb 27, 2017, 3:45:39 PM2/27/17
to django-d...@googlegroups.com
Hi everyone,

I can't speak for Thomas' implementation, as I haven't had the time
and energy to look into it too much, but I can try to clarify some
aspects of what I did years ago.
Sadly, I don't see how a merge of current master would be feasible
with those branches. This all happened before migrations, the explicit
AppRegistry, and all the meta changes were merged, all of which cause
quite some conflicts. It should be feasible to port significant
portions of my old work onto master, though, commit by commit.

> Some questions:
>
> - Are there any portions of code or some solutions from Michal's
> implementation that could be ported over to the DEPs and to Thomas' more
> recent implementation?

I would hope so, but unfortunately, can't say with any certainty. Even
if I just look at the interface, Thomas' approach seems quite
different from mine, but without going through the code, it's hard for
me to tell what parts could be relevant.

> - What is the difference or relationship between the "composite-fields" and
> "foreignkey-refactor" branches?

Right, so the gist of it is that if we ever want to support composite
foreign keys, it will be necessary to turn ForeignKey into a virtual
field, which is exactly what I attempted to achieve in the
“foreignkey-refactor” branch. This would also have the neat side
effect of clearly defining what even is a virtual field in the ORM,
since that was definitely not a very well defined concept at the time.
(I think it's gotten better with the formalization of _meta, but it's
still very heavily influenced by the early approach of “any aspects of
GenericForeignKey that don't closely match regular fields”.)

My thinking was that I'd first try to get the refactoring part as
solid as possible, and only then start to build on that by introducing
the new concept of composite fields. Unfortunately, I seem to have
crammed too much into the GSoC project, so I couldn't really stop and
take the time to get the FK refactor over the finish line, and instead
focused on getting as much of the new-feature code finished as
possible.

So yeah, foreignkey-refactor is the part where I turned ForeignKey
into a virtual field, and composite-fields is the part based on the
first branch where I added composite fields to the equation.

FWIW, there's https://code.djangoproject.com/ticket/16508 where I've
tried to introduce some prerequisites for the ForeignKey refactor in
smaller chunks.

> - Should a new DEP be submitted to specifically tackle composite primary
> keys and foreign keys?

Good question. At a glance, Thomas' DEPs do not mention composite
foreign keys at all, so I guess if we want support for those at some
point, the complexity of this task would definitely warrant a DEP.

Anyway, most of this email focuses on virtual FKs, rather than
composite fields and composite primary keys. In my mind, that's the
order in which those things make sense, but I may be overly ambitious
here (and history seems to agree with that sentiment).

Cheers,

Michal
signature.asc

Asif Saifuddin

unread,
Feb 28, 2017, 12:03:56 AM2/28/17
to Django developers (Contributions to Django itself)
Hi Julian,

I have been also reviewing and studying the previous works, deps, discussions, codes and tickets. I have also been working to prepare a new dep based on the previous works.

Like what Michal said, from my observation, I found the works and approaches of Michal is quite better and it's possible to break the work down into smaller parts to implement gradually.

I am not quite sure how much work or which approaches of  Thomas Stephenson in 2015 could be mixed or needed with Michal's approach. In my humble opinion Michal spent more time in working on this and I also found his insights and suggestions on this topic more sensible.

I would also like to work on this. My Dep is not yet ready for a push.

You could certainly review and give you input on the Dep. Some core members suggested me to finalize a Dep before div into code.

Let me know what your thoughts.

Regards,

Asif

I have also contact with 

Roger Gammans

unread,
Feb 28, 2017, 2:36:17 AM2/28/17
to django-d...@googlegroups.com
On Mon, 2017-02-27 at 21:03 -0800, Asif Saifuddin wrote:
Hi Julian,

I have been also reviewing and studying the previous works, deps, discussions, codes and tickets. I have also been working to prepare a new dep based on the previous works.

I haven't had a chance to review Michael's code (is there a github link?) but I think the composite field approach is almost certainly a sensible route. Which is a shame because it adds quite a bit of extrope tha refactor work for me. ;-).

The code I have got[2], does not take that route[1], shows the switch between single field and composite primary adds a lot of complexity to far too many (IMHO) code routes. If it was a handful I wouldn't worry a great deal, but there is a lot of complexities which fall on the admin code which don't quite fall out as neatly as I would like. 

Hence I hope that by providing a named composite field, then we mitigate the issues above as arguable we still only have single field PK, but that that field is itself composite which may well mitigate the issues I have been seeing.


[1] We took a patch from somewhere which allowed multiple primary_key=True specifications in the model.
[2] Initially developed by by one of by colleagues based on some other patches he found  - We may unfortunately have lost the attribution.
-- 
Roger Gammans <rgam...@gammascience.co.uk>

Asif Saifuddin

unread,
Feb 28, 2017, 8:14:36 AM2/28/17
to Django developers (Contributions to Django itself)
Hi Roger,

I do agree with your points. I have some thoughts which I will share in Dep/mailing list soon.

Thanks


On Tuesday, November 29, 2016 at 6:10:38 PM UTC+6, Craig Kerstiens wrote:

Julien Phalip

unread,
Feb 28, 2017, 6:58:50 PM2/28/17
to django-d...@googlegroups.com
Hi Asif,

That sounds good. On first look I did have some reservations about some of the design details in the current DEP, especially around the observer pattern for data binding. But I’m going to have to dig deeper into the implementation to get a clearer idea.

It’d be great if you could publish your work-in-progress at some point so we can discuss the approach in more detail.

Thanks,

Julien

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/cc29383b-b930-4d4c-9f48-51fa10909ecd%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Julien Phalip

unread,
Feb 28, 2017, 7:13:41 PM2/28/17
to django-d...@googlegroups.com
Hi Michal,

Thanks for your detailed reply. Very helpful.

I was also curious, during your work did you look into SQLAlchemy? They seem to have a pretty elaborate system for multi-column fields. I know that overall the SQLAlchemy and Django ORMs follow a different approach, but I’m wondering if perhaps some inspiration could be found in how they specifically handle composite fields, both API-wise and implementation-wise.

Julien

--
You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.

Asif Saifuddin

unread,
Mar 1, 2017, 12:08:35 AM3/1/17
to Django developers (Contributions to Django itself)
Hi Julien,

I will publish it very soon.

Thanks
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.

Anssi Kääriäinen

unread,
Mar 20, 2017, 4:14:19 AM3/20/17
to Django developers (Contributions to Django itself)
Just my late 2 cents to this...

First, I wouldn't put too much weight on the DEP proposals. Looking back to them now, I don't think they went to the right direction.

For Michal Petrucha's work, it was really close to being merged some years ago. The problem was that migrations happened at the same time, and that resulted in need of substantial changes to Michal's work, and unfortunately this never happened. So, I'd put a lot of weight on this work.

If I'd were to take on this task, I'd try to develop the feature with the idea of merging the work in multiple batches. For example this might be a possible plan:
  1. Introduce virtual/composite field, only CREATE TABLE migrations support at this point, not documented as public API
  2. Add support for primary key and unique composite fields, again non-public
  3. Work on foreign keys, again non-public
  4. Migrations support
  5. Make the work public API (that is, document the work)

This would give a chance to get the work done in incremental pieces. Doing this all in one go is such a large task that there is a big risk of not getting this done at all.

Whatever approach you take, I really hope you'll be able to work on this and get this long waited feature in to Django.

 - Anssi

Julien Phalip

unread,
Mar 24, 2017, 4:47:46 PM3/24/17
to django-d...@googlegroups.com
Thank you Anssi. It’s very useful to have your perspective as you’ve done a lot of oversight work on this specific feature before and have lots of experience working with the ORM internals.

So it seems like the consensus at this point is to use Michal’s original work as a basis. I like the way you’ve broken down the different steps. This makes sense and would likely increase chances for eventually reaching the end goal. Each one of those steps could potentially be fleshed out as independent DEPs at some point.

Personally I’m also very keen to make this happen so I’m volunteering to provide oversight and help with documentation, testing and code reviews.

This approach also seems in line with what Asif mentioned earlier about his own work. Let’s give him a chance to provide some updates on this when he’s ready, and then we can take it from there.

Thanks all!

Julien

To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.

Asif Saifuddin

unread,
Mar 25, 2017, 1:32:12 PM3/25/17
to Django developers (Contributions to Django itself)
Hi Julien,

actually I am following ansi and michals works and suggestions regarding this issues so my plan is quite in line with their suggested plan of implementation.

I have opened a PR in dep which is very very WIP right now. I haven't been able to address everything nicely till now, but you could check you the dep and put some questions/suggestion there so that I can address them well.

Currently I am working on first steps [Fields/Relations API clean up+VirtualField based refactor, composite key might come after that of may be initial work of composite key based on virtualField could also be addressed]

I will work on next few days to complete it.

Thanks,
Asif
Reply all
Reply to author
Forward
0 new messages