Default to BigAutoField

Kenneth Reitz

Jun 9, 2017, 2:24:37 PM
to Django developers (Contributions to Django itself)

Dear Django Dev,

 

At Heroku, we have the privilege of seeing an extremely broad range of customers utilizing tools like Django to build their applications and companies. One of the things that we’ve seen customers hit, time and time again when using tools like Django, is integer overflows for primary keys. Their application starts behaving unpredictably once they reach the overflow, not even knowing such a constraint exists, and they often think the problem is with their database provider, rather than with their schema. Once they realize what is wrong, it’s a relatively trivial fix, but a migration can take several hours to complete, which results in unacceptable amounts of downtime.

 

Because of this, Heroku, as a company, would like to encourage bigints as a sane, reasonable default for primary keys in application models. If the Django project agrees with this idea, Django would provide BigAutoField as the default for ‘id’ instead of AutoField.

 

Rails made this change recently, and has seen success in doing so.

 

I’m happy to provide the code to do this, but I wanted to discuss it here before doing so, to hear what the general consensus was to the proposal of such a change.

 

 

Pros:

  • Users of Django, following the docs, won’t accidentally hit the int overflow barrier.

  • Encourages best practices from the beginning.

  • Bigints don’t take up much more storage than ints when using Postgres.

  • In line with other frameworks moving forward on this issue, like Rails.
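
To put the overflow barrier in numbers, here is a rough sketch. It assumes the usual signed 32-bit and 64-bit integer columns behind AutoField and BigAutoField; the insert rate is a made-up figure chosen only for illustration.

```python
# Largest values of signed 32-bit and 64-bit integer columns.
INT_MAX = 2**31 - 1       # 2,147,483,647
BIGINT_MAX = 2**63 - 1    # 9,223,372,036,854,775,807

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_until_overflow(max_id: int, inserts_per_second: float) -> float:
    """Years until sequential IDs starting at 1 reach max_id."""
    return max_id / inserts_per_second / SECONDS_PER_YEAR


# Hypothetical workload: 100 new rows per second, around the clock.
print(f"int:    {years_until_overflow(INT_MAX, 100):.2f} years")
print(f"bigint: {years_until_overflow(BIGINT_MAX, 100):.3g} years")
```

At that assumed rate an int key lasts roughly eight months, while a bigint key lasts billions of years.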

 

Cons:

  • Backwards compatibility would need to be considered.

 

Why not UUID?

 

I agree! I love using UUID for my primary keys, and I think a patch to Django which provides an AutoUUIDField would be wonderful. However, there are a few major drawbacks to making this the new default:

 

  1. It’s confusing to new users and would make the onboarding process more difficult.

  2. UUID is difficult to implement in MySQL.

  3. UUID has larger storage requirements.

  4. Incrementing IDs are actually useful.

 


So, my proposal is to simply lift the int barrier to a bigint barrier for new Django applications, and I think it will save a lot of developers a lot of pain in the long run.

 

Many thanks,

 

Kenneth Reitz

Heroku Python


Jacob Kaplan-Moss

Jun 9, 2017, 3:43:28 PM
to django-developers
I think this would be a good improvement, and I'd like to see it. I've been bitten by integers overflowing at least twice I can remember in my career, which is two times too many.

However, a major thing we'd have to work out is the upgrade path. Consider a simple model:

    class Person(Model):
        name = CharField()

In Django 1.11, this actually generates a model with an integer `id` field. But if we change it, in Django vNext that `id` field would "turn into" a bigint magically without the underlying table changing. That'd be confusing: you'd expect the model to be "fixed" by upgrading to vNext, but it wouldn't be. I think the migrations engine would detect this as a migration (?), so perhaps that's the path forward, but it could still be super-confusing. We've never shipped a release of Django that required a migration to _all_ your models.

Have you thought about what the upgrade path should look like, Kenneth?

Jacob

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/6fe3401c-4404-4bd8-9d22-58df95cd1348%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tom Forbes

Jun 9, 2017, 3:53:19 PM
to django-d...@googlegroups.com
How would this work with generic foreign keys as well? Those migrations couldn't be automatic.

Kenneth Reitz

Jun 9, 2017, 3:54:43 PM
to django-d...@googlegroups.com
My initial thought was to just have this apply to *new* Django applications, if that's possible.

Running this migration could take a non-trivial amount of time on larger apps.

--
Kenneth Reitz

Kenneth Reitz

Jun 9, 2017, 3:59:58 PM
to django-d...@googlegroups.com
However, it should also be noted that those same larger applications are the ones that are likely to run into this problem eventually, so perhaps forcing the migration is the best path moving forward. 

Interested in hearing thoughts about this.

--
Kenneth Reitz

Curtis Maloney

Jun 9, 2017, 7:11:05 PM
to Jacob Kaplan-Moss, django-developers
I know people hate settings, but what about one for auto id field type?

It would let you handle backwards compatibility, uuid, and bigint...

--
C

Tim Graham

Jun 9, 2017, 9:37:04 PM
to Django developers (Contributions to Django itself)
I'm not sure how this could work with migrations. In a sense, it would involve making the auto-generated primary key "swappable", including foreign keys that point to it. This sounds like a headache.

I haven't thought of a possible solution since Kenneth floated this idea in #django-dev yesterday.

Curtis Maloney

Jun 9, 2017, 9:53:46 PM
to Tim Graham, Django developers (Contributions to Django itself)
I don't see it helping once you hit the problem, but like the custom user model, it's a recommendable setting for pre-planning.

--
C

Collin Anderson

Jun 9, 2017, 9:53:47 PM
to django-d...@googlegroups.com
I might be wrong, but if the default changes, won't the migrations detect it and migrate it just fine, including foreign keys?

All of my migrations have this:
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),

So everyone would either need to manually specify the AutoField to keep the old behavior, or run makemigrations to auto-generate migrations to BigAutoField. This seems similar to increasing the max_length of EmailField, user.username, and user.last_name, though would affect a lot more models in this case.

(I'm not sure what I think about the setting idea.)

(While we're at it, maybe we could make it a (new) PositiveBigAutoField to help out the mysql folks and close the oldest open ticket:




Andrew Godwin

Jun 10, 2017, 2:59:53 AM
to Django developers (Contributions to Django itself)
As long as you changed the actual field that ended up under the "id" column, then yes, migrations should detect it and apply all the changes during an upgrade, including to foreign keys.

Generic foreign keys are another problem, however, not to mention the prospect of how many migrations and ALTER TABLES this would result in on a large project. We could include in the upgrade notes how to stave it off (explicitly specify an ID field), but I'm sort of on the fence as to if that's good enough.

Andrew

Claude Paroz

Jun 10, 2017, 5:33:35 AM
to Django developers (Contributions to Django itself)
I think we should first discuss if it makes sense to set a BigAutoField by default for all tables. I'm not convinced of that yet. I looked at my projects with this perspective and found for example a case where I have many small lookup tables (containing between 2-20 entries) for which I know I would never use BigAutoField if I'd design the schema on paper.

For me, it's a bit like the `on_delete` parameter for foreign keys: not a "one size fits all" situation.
For example, a quick analysis of contrib models:

Model                      BigAutoField appropriate
=========================  ========================
admin.LogEntry             yes
auth.User                  no
auth.Group                 no
auth.Permission            no
contenttypes.ContentType   no
flatpages.FlatPage         no
redirects.Redirect         no
sessions.Session           yes
sites.Site                 no

Shouldn't we treat that issue by better documentation instead?
Another idea is to leverage the system check framework (--deploy part) to warn when the max id is over 50% of available range.
We are perfectionists, aren't we :-)
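
The 50% warning idea could look roughly like the sketch below. It is only the threshold logic, under the assumption that AutoField and BigAutoField map to signed 32-bit and 64-bit columns; the function name, the input format and the sample numbers are all made up, and fetching the real max ids and registering the check with the system check framework are left out.

```python
# Upper bounds of the signed integer columns assumed for each field type.
FIELD_MAX = {
    "AutoField": 2**31 - 1,
    "BigAutoField": 2**63 - 1,
}


def id_range_warnings(max_ids, threshold=0.5):
    """Return warning strings for tables whose highest id exceeds
    `threshold` of the range available to their primary key type.

    `max_ids` maps a table name to (field_type, current_max_id).
    """
    warnings = []
    for table, (field_type, max_id) in sorted(max_ids.items()):
        used = max_id / FIELD_MAX[field_type]
        if used > threshold:
            warnings.append(
                f"{table}: {used:.0%} of {field_type} id range used"
            )
    return warnings


# Made-up numbers for illustration.
sample = {
    "app_event": ("AutoField", 1_500_000_000),
    "app_country": ("AutoField", 250),
}
print(id_range_warnings(sample))
```

Only the first table would be flagged here; the small lookup table stays far below the threshold.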

Claude

Curtis Maloney

Jun 10, 2017, 5:40:42 AM
to django-d...@googlegroups.com
Right, hence my point of having a global setting to say "the default
auto-field is ..."

This would:
a) let people continue to use AutoField
b) let people opt for BigAutoField
c) let 3rd party projects be agnostic
d) let people use UUIDField(default=uuid.uuid4)
e) possibly make it feasible to change mid-stream from AutoField to
BigAutoField...
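
A minimal sketch of how such a setting might be resolved, purely to make the idea concrete. The setting name DEFAULT_AUTO_FIELD_TYPE, the lookup table and the function are all hypothetical stand-ins, not anything Django provides.

```python
import uuid

# Hypothetical project settings; the setting name is an assumption.
SETTINGS = {"DEFAULT_AUTO_FIELD_TYPE": "BigAutoField"}

# Stand-in descriptions of each primary key type (not Django classes).
PK_TYPES = {
    "AutoField": {"column": "integer", "default": None},
    "BigAutoField": {"column": "bigint", "default": None},
    "UUIDField": {"column": "uuid", "default": uuid.uuid4},
}


def default_pk_type(settings, explicit=None):
    """Pick the pk type for a model: an explicit per-model choice wins,
    then the project-wide setting, then the historical AutoField."""
    name = explicit or settings.get("DEFAULT_AUTO_FIELD_TYPE", "AutoField")
    return name, PK_TYPES[name]


print(default_pk_type(SETTINGS)[0])                        # project default
print(default_pk_type({})[0])                              # no setting present
print(default_pk_type(SETTINGS, explicit="UUIDField")[0])  # per-model override
```

Falling back to AutoField when the setting is absent is what would cover the backwards compatibility case in (a).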

--
C

Curtis Maloney

Jun 10, 2017, 5:41:59 AM
to django-d...@googlegroups.com
f) let MySQL users opt for PositiveBigAutoField if they want...

Tom Forbes

Jun 10, 2017, 6:48:15 AM
to django-d...@googlegroups.com
I really like the idea of a global configurable setting. One problem with a setting is that it's not always changeable, which settings kind of imply (IMO). Going from int->bigint is always possible, but going backwards may not be, nor might going from int->guid.

I attempted an int->guid migration on one of our systems and ran into some migrations and postgres-specific issues. Seems like it might be incredibly difficult to get a simple toggle setting to work with it, and there are lots of pitfalls.

A warning to the system check framework could be added to check if the primary key type is changed and potentially incompatible generic foreign key columns exist. And in the docs rather than use a plain IntegerField, just use a field that changes based on the setting.


Claude Paroz

Jun 10, 2017, 8:21:49 AM
to Django developers (Contributions to Django itself)
On Saturday, June 10, 2017 at 11:40:42 UTC+2, Curtis Maloney wrote:
> Right, hence my point of having a global setting to say "the default
> auto-field is ..."
>
> This would:
> a) ...

I see, but this conforms to the pattern "use the same id field type for all models of my project". I'm not sure we should encourage that.

Another idea would be to offer variants of models.Model which models could inherit from, like models.BigIntModel or models.UUIDModel.

Curtis Maloney

Jun 10, 2017, 8:25:49 AM
to django-d...@googlegroups.com


On 10/06/17 22:21, Claude Paroz wrote:
> On Saturday, June 10, 2017 at 11:40:42 UTC+2, Curtis Maloney wrote:
>
> Right, hence my point of having a global setting to say "the default
> auto-field is ..."

> I see, but this conforms to the pattern "use the same id field type for
> all models of my project". I'm not sure we should encourage that.

Yeah... naming would be key with "DEFAULT_AUTO_FIELD_TYPE" or whatever...

> Another idea would be to offer variants of models.Model which models
> could inherit from, like models.BigIntModel or models.UUIDModel.

Ah, well... now you're talking. But then, you can do this already as an
abstract base with TYPE as id...

    from django.db import models

    class BigIntModel(models.Model):
        id = models.BigAutoField(primary_key=True)

        class Meta:
            abstract = True


--
C

Claude Paroz

Jun 10, 2017, 9:31:10 AM
to Django developers (Contributions to Django itself)

Absolutely! It would only be a convenience thing. The more I think about this, the more I think that it might be addressed by better documentation (and maybe some convenience shortcuts).

emo...@mozilla.com

Jun 11, 2017, 11:59:40 AM
to Django developers (Contributions to Django itself)
On Saturday, 10 June 2017 10:33:35 UTC+1, Claude Paroz wrote:
> Another idea is to leverage the system check framework (--deploy part) to warn when the max id is over 50% of available range.

I was about to suggest the same. Seems like something worth doing regardless of whether we change the default (since it would help catch issues in existing installations who haven't yet migrated).

Ed

Ash Christopher

Jun 12, 2017, 10:19:48 AM
to Django developers (Contributions to Django itself)
I really like this idea. We created a custom BigAutoField for our systems with sharded databases, and fast growing data.

I understand the desire to fix it for new projects moving forward, but this smells a little like what happened with custom User models - it was introduced for new projects, but there was no clear upgrade/migration path for existing projects. 


Melvyn Sopacua

Jun 14, 2017, 10:36:36 AM
to django-d...@googlegroups.com

On Friday 09 June 2017 15:59:50 Kenneth Reitz wrote:

> However, it should also be noted that those same larger applications
> are the ones that are likely to run into this problem eventually, so
> perhaps forcing the migration is the best path moving forward.

 

 

Existing models are the problem. Then again the database knows the truth. So with a little inspection during apps.get_models we might be able to do the right thing and even allow migrating in steps.

 

Apps is also the place to mark an app as migrated.

 

In fact - couldn't an AppConfig grow a method "get_autoid_type()" and inject the right one?

 

You asked for thoughts, so there's my 2c stream.

--

Melvyn Sopacua

Kenneth Reitz

Aug 17, 2017, 5:36:30 PM
to Django developers (Contributions to Django itself)
I have opened a pull request:


Andrew and I came up with a good solution for migrations, together at DjangoCon. 

Andrew Godwin

Aug 17, 2017, 5:43:41 PM
to Django developers (Contributions to Django itself)
To elaborate on the solution we eventually came up with: models default to a new BigAutoField, which migrations will pick up on and generate migrations to alter columns to. For safety, for those who don't read release notes, the migration autodetector asks whether you want to make these migrations and warns that they may be slow.

It also tells you how to preserve the old behaviour and avoid new migrations if you wish (manually set id = SmallAutoField)
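
As a rough illustration of the prompt behaviour described above (not the code in the pull request), a confirmation step might look like this. The function name, the wording of the prompt and the injectable `ask` parameter are all assumptions made for the sketch.

```python
def confirm_pk_migrations(models_to_alter, ask=input):
    """Ask before generating id-column migrations; return the models to alter.

    `ask` is injectable so the prompt can be scripted or tested.
    """
    if not models_to_alter:
        return []
    prompt = (
        f"{len(models_to_alter)} model(s) would have their id column altered "
        "to a bigint. This can be slow on large tables.\n"
        "To keep the old type, declare the id field explicitly on the model.\n"
        "Generate these migrations now? [y/N] "
    )
    answer = ask(prompt).strip().lower()
    return list(models_to_alter) if answer in ("y", "yes") else []


# Scripted answer instead of a real terminal prompt.
print(confirm_pk_migrations(["blog.Post"], ask=lambda prompt: "n"))
```

Defaulting to "no" on an empty answer matches the stated goal of not letting people wander into giant changes unwittingly.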

I like this approach as it means no new settings or Meta options or anything, has a reasonable upgrade path, and won't let people unwittingly wander into giant changes. The downside is that it does add slightly more friction to the upgrade process.

Andrew


Markus Holtermann

Aug 18, 2017, 8:44:23 AM
to django-d...@googlegroups.com
Thanks for taking the effort to work on this, Kenneth!

I don't fully agree with the approach. This essentially forces 3rd
party package authors to make the call about the primary key field size,
while for small to medium size projects BigAutoField is unlikely to be
required and only comes with additional (storage) costs. Given that the
migrations would need to be part of the 3rd party package there's also
no (trivial) way for project developers to force or change to
SmallAutoField for those packages. The same thing holds the other way
round.

Unfortunately, I don't have another solution at hand.

I realized that I'm a bit late to the discussion and should've chimed
in before all that work was done. Please accept my apologies for that.

/Markus

Andrew Godwin

Aug 18, 2017, 12:14:44 PM
to Django developers (Contributions to Django itself)
On Fri, Aug 18, 2017 at 5:43 AM, Markus Holtermann <in...@markusholtermann.eu> wrote:
> I don't fully agree with the approach. This essentially forces 3rd
> party package authors to make the call about the primary key field size,
> while for small to medium size projects BigAutoField is unlikely to be
> required and only comes with additional (storage) costs. Given that the
> migrations would need to be part of the 3rd party package there's also
> no (trivial) way for project developers to force or change to
> SmallAutoField for those packages. The same thing holds the other way
> round.
>
> Unfortunately, I don't have another solution at hand.


This is also true of changing the primary key of third-party packages in general though - e.g. there's no way I can make everything use UUIDs even if my database would be way better at those.

I don't see any other solutions that aren't settings doing spooky-action-at-a-distance to primary keys, and that's something I really don't want to see.

Andrew
 


Adam Johnson

Aug 27, 2017, 5:31:47 AM
to django-d...@googlegroups.com
I don't think "primary key size" is something that's even within consideration for 99% of Django apps. Sure, a few more bytes are going to be wasted here, but you could equally argue that the default AutoField was already too big for the many models that have only hundreds of instances and could use even a one-byte primary key.

Defaulting to BigAutoField everywhere is the simple solution that stops everyone from ever worrying about their tables filling up. Additionally using compressed tables helps reclaim nearly all those unused bytes, at least on MySQL.
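
For a sense of scale, here is a back-of-the-envelope sketch of the extra raw bytes involved. It assumes 4-byte int and 8-byte bigint columns and ignores indexes, alignment padding and compression, so treat the results as a rough bound rather than a measurement; the row counts are hypothetical.

```python
INT_BYTES = 4     # 32-bit integer column
BIGINT_BYTES = 8  # 64-bit integer column


def extra_bytes(rows: int, key_columns: int = 1) -> int:
    """Extra raw bytes when `key_columns` integer columns per row
    (the primary key plus any foreign keys to it) widen to bigint."""
    return rows * key_columns * (BIGINT_BYTES - INT_BYTES)


# Hypothetical tables: a small lookup table and a large event log.
print(extra_bytes(500))                 # small table, in bytes
print(extra_bytes(100_000_000) / 1e6)   # large table, in megabytes
```

Under these assumptions the small table grows by about 2 kB and the 100-million-row table by about 400 MB per widened column.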





--
Adam

Tim Graham

Aug 31, 2017, 11:13:10 AM
to Django developers (Contributions to Django itself)
Glancing at the PR, one thing I'm not sure about is renaming AutoField to SmallAutoField. I think that's going to cause needless additional complexity and confusion among people who've worked with Django a long time. For example, you might think that "Small" there has a similar meaning as SmallIntegerField. Did I miss the discussion about the benefits of the rename? If we agree to proceed with it, please split it out into a separate ticket/PR for easier review.


Adam Johnson

Aug 31, 2017, 12:11:45 PM
to django-d...@googlegroups.com
I agree with Tim. I also think the rename has potential to mess with historical migrations, or other uses of the field classes, since the output of deconstruct() will change the class name.
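
To illustrate the deconstruct() concern with a toy example: the classes below are stand-ins, not Django's fields, and this simplified deconstruct() returns a 3-tuple where Django's returns (name, path, args, kwargs). The point is only that the serialized path follows the class name.

```python
class Field:
    """Stand-in for a model field; not Django's implementation."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs

    def deconstruct(self):
        # The import path is derived from the class name, so a rename
        # changes what gets written into newly generated migrations.
        path = f"{self.__class__.__module__}.{self.__class__.__qualname__}"
        return path, [], self.kwargs


class AutoField(Field):
    pass


class SmallAutoField(Field):  # the proposed new name
    pass


old_path, _, _ = AutoField(primary_key=True).deconstruct()
new_path, _, _ = SmallAutoField(primary_key=True).deconstruct()
print(old_path.rsplit(".", 1)[-1], "->", new_path.rsplit(".", 1)[-1])
```

Historical migrations that reference the old name would still need that name to resolve to something after the rename.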




--
Adam