{{{
class BigAutoField(AutoFieldMixin, BigIntegerField):
def get_internal_type(self):
return 'BigAutoField'
def rel_db_type(self, connection):
return BigIntegerField().db_type(connection=connection)
}}}
Many projects prefer UUID field over big int for pk. So far we had to use
a mixin for that but I am under the impression `DEFAULT_AUTO_FIELD` exists
to lift this burden.
I'd expect something like this to work (additional adjustments might be
needed)
{{{
from django.db import models
from django.db.models.fields import AutoFieldMixin
class UUIDAutoField(AutoFieldMixin, models.UUIDField):
def get_internal_type(self):
return 'UUIDAutoField'
def rel_db_type(self, connection):
return models.UUIDField().db_type(connection=connection)
}}}
But if I use this for `DEFAULT_AUTO_FIELD` I get the error
{{{
ValueError: Primary key 'common.fields.UUIDAutoField' referred by
DEFAULT_AUTO_FIELD must subclass AutoField.
}}}
I noticed two things I want to share.
First, the default auto field `BigAutoField` is not subclassing
`AutoField` so it's not consistent with the error message.
Second, `AutoField` is subclassing `IntegerField`. If I understand this
correctly, users are limited to subclass `IntegerField` for
`DEFAULT_AUTO_FIELD`. If so, there's no way to implement `UUIDAutoField`
as `DEFAULT_AUTO_FIELD`.
I understand auto fields need to share a common interface but having to
subclass `IntegerField` is a big constraint.
I'm happy to open a PR once I know what's the desired functionality.
Possible solutions I see:
- enforce subclassing from `AutoFieldMixin` instead of `AutoField`
- I don't think this field has to be very generic because DB pk types are
very limited. As far as I know, only ints and UUIDs make sense for pk.
Maybe adding `UUIDAutoField` to django fields would make sense. That way
it'd have to enforce subclass of `AutoField` or `UUIDAutoField`
Mentioned also in [https://groups.google.com/g/django-
developers/c/MBPEPKlibGQ/m/uJVZbMpqBAAJ here],
[https://groups.google.com/g/django-
developers/c/MBPEPKlibGQ/m/ntEoLCCMBAAJ here] and
[https://github.com/django/django/pull/13179#issuecomment-742473840 here].
{{{
Django==3.2rc1
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/32577>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:1>
Old description:
New description:
{{{
}}}
{{{
- enforce subclassing of `AutoFieldMixin` instead `AutoField`
- I don't think this field has to be very generic because DBs pk types are
very limited. As far as I know, only ints and UUIDs make sense for pk.
Maybe adding `UUIDAutoField` to django fields would make sense. That way
it'd have to enforce subclass of `AutoField` or `UUIDAutoField`
Mentioned also in [https://groups.google.com/g/django-
developers/c/MBPEPKlibGQ/m/uJVZbMpqBAAJ here],
[https://groups.google.com/g/django-
developers/c/MBPEPKlibGQ/m/ntEoLCCMBAAJ here] and
[https://github.com/django/django/pull/13179#issuecomment-742473840 here].
{{{
Django==3.2rc1
}}}
--
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:2>
* type: Uncategorized => New feature
* stage: Unreviewed => Accepted
Comment:
OK, yes, part of the discussion was possible extension to ''other kinds''
if field, so we can accept this as a new feature.
(To be clear, I consider it out-of-scope for the original ticket, and so
not a release blocker for 3.2)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:3>
Comment (by Tomasz Wójcik):
I understand it's out of scope for 3.2 as you already have a RC though I
think docs could mention it has to subclass `AutoField`.
{{{
The type of this implicit primary key can now be controlled via the
DEFAULT_AUTO_FIELD setting and AppConfig.default_auto_field attribute. No
more needing to override primary keys in all models.
}}}
This could be more specific as it doesn't apply to UUID pk projects.
----
I'm glad it's accepted as a feature request, thanks. I'm looking forward
to hearing how you want this to be tackled. And as I said, I'm happy to
open a PR as I want to use it with UUIDs in my next project.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:4>
* cc: Diego Lima (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:5>
Comment (by Ian Foote):
> I don't think this field has to be very generic because DBs pk types are
very limited. As far as I know, only ints and UUIDs make sense for pk.
While I'm not sure it would make sense in an {{{AutoField}}}, string
primary keys are possible. For example, you could use a language code
({{{en}}}, {{{fr}}}, {{{de}}}, etc) as a primary key.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:6>
Comment (by Tomasz Wójcik):
Replying to [comment:6 Ian Foote]:
> > I don't think this field has to be very generic because DBs pk types
are very limited. As far as I know, only ints and UUIDs make sense for pk.
>
> While I'm not sure it would make sense in an {{{AutoField}}}, string
primary keys are possible. For example, you could use a language code
({{{en}}}, {{{fr}}}, {{{de}}}, etc) as a primary key.
Sure, my phrasing wasn't the best. What I meant is I think users only use
numeric types and uuids for generic app-wide primary keys so I don't think
it's necessary to allow other types for {{{AutoField}}}, such as string.
So instead of refactoring the entire {{{AutoField}}}, another type -
{{{UUIDAutoField}}} could be added.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:7>
Comment (by Antonin Grêlé):
Replying to [comment:7 Tomasz Wójcik]:
> What I meant is I think users only use numeric types and uuids for
generic app-wide primary keys so I don't think it's necessary to allow
other types for {{{AutoField}}}, such as string. So instead of refactoring
the entire {{{AutoField}}}, another type - {{{UUIDAutoField}}} could be
added.
Could I argue that IDs based on human-readable strings, like
[https://coolname.readthedocs.io coolname], are a valid use case?
It's easier to reference '"soft-cuddly-shrew-of-expertise" than "a1e1b1c5
-5d5e-4eef-9ee9-e74e87ce71d6" or "2467584".
Would this be seen as bad practice and we should use a slugfield that's
not the primary key for something like this instead?
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:8>
Comment (by Tomasz Wójcik):
Replying to [comment:8 Antonin Grêlé]:
> Replying to [comment:7 Tomasz Wójcik]:
> > What I meant is I think users only use numeric types and uuids for
generic app-wide primary keys so I don't think it's necessary to allow
other types for {{{AutoField}}}, such as string. So instead of refactoring
the entire {{{AutoField}}}, another type - {{{UUIDAutoField}}} could be
added.
>
> Could I argue that IDs based on human-readable strings, like
[https://coolname.readthedocs.io coolname], are a valid use case?
>
> It's easier to reference '"soft-cuddly-shrew-of-expertise" than
"a1e1b1c5-5d5e-4eef-9ee9-e74e87ce71d6" or "2467584".
>
> Would this be seen as bad practice and we should use a slugfield that's
not the primary key for something like this instead?
Popular use case for a "readable uuid" like that is a randomly generated
username. It's usually not a pk, just has a unique constraint, because you
often perform many operations against users table and you don't want this
table to be slow on joins. For other cases, why does it need to be
readable at all?
To my understanding there's no significant performance hit from using UUID
as a primary key. It is 4x larger than int32 (128bit). So also index is
bigger, cache will store less data etc.
To my knowledge, varchar is a completely different story. It's way bigger
and more difficult to sort, because it's very difficult to compare.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:9>
Comment (by Mariusz Felisiak):
Note: Fixing this ticket will also resolve #17337.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:10>
Comment (by Nick Pope):
I've been involved in some of the changes and discussions related to
`AutoField` improvements and `DEFAULT_AUTO_FIELD` and wanted to clarify a
few things, as well as what needs to be done to achieve the creation of
`UUIDAutoField`.
----
Some points about `DEFAULT_AUTO_FIELD` which was added in
b5e12d490af3debca8c55ab3c1698189fdedbbdb:
- The current implementation restricts to values that are a subclass of
`AutoField`. (See
[https://github.com/django/django/commit/b5e12d490af3debca8c55ab3c1698189fdedbbdb
#diff-11ee95e0230bcff70a542128891516b8022177e0af58f00257aee9e8ef83a933R245
here].)
- It was named `DEFAULT_AUTO_FIELD` to indicate that it referred to the
''automatically generated'' `id` field.
- This was instead of `DEFAULT_AUTOFIELD` which would imply that it
could only ever be related to `AutoField`.
- Another suggestion had been `DEFAULT_MODEL_PK`, but we weren't ready
to allow other fields without `AutoField`-like semantics.
- We could lift the restriction to allow other non-`AutoField` types, but
the concept of an ''automatically generated'' `id` field needs to be
decoupled from `AutoField` a bit.
----
Some history and thoughts about `AutoField`:
- Much of the logic related to the ''automatically generated'' `id` field
is hardwired to look for `AutoField` subclasses. This needs to be fixed.
- `AutoField` has essentially always been like an `IntegerField`, but
historically copy-pasted some of that rather than inherit from
`IntegerField` directly.
- `BigAutoField` inherited from `AutoField`, but overrode with parts of
`BigIntegerField`, but it didn't originally inherit from
`BigIntegerField`.
- As a result, these `*AutoFields`s didn't support the system checks or
validators of their `*IntegerField` equivalents.
- I addressed some of this in 21e559495b8255bba1e8a4429cd083246ab90457 for
#29979:
- Moved the logic about automatically generated fields into
`AutoFieldMixin`.
- Fixed the `*AutoField` classes to inherit from `AutoFieldMixin` and
their appropriate `*IntegerField` type.
- Added `AutoFieldMeta` to maintain backward compatibility for
`isinstance(..., AutoField)` checks and provide a hook point for
deprecation of it in the future.
'''One more thing to mention about an ''automatically generated'' `id`
field is that the value is generated by the database and returned - it
must not be generated in Python.'''
----
What needs to be done to get to `UUIDAutoField`? Well, here is what I
think the path is...
- Add a `db_generated`/`is_auto` (or another better name) flag to
`AutoFieldMixin`. (See
[https://code.djangoproject.com/ticket/24042#comment:5 my comment]. I'm
not convinced #24042 should have been closed.)
- Replace `isinstance(..., AutoField)` checks with the new flag. (I seem
to recall that `isinstance(..., AutoFieldMixin)` wasn't really favoured.)
- This will also open up `DEFAULT_AUTO_FIELD` to a wider range of non-
integer automatically generated field types.
- Deprecate `isinstance` (and `issubclass`?) checks on `AutoField` within
`AutoFieldMeta`. (The metaclass can be scrapped when the deprecation
period ends.)
- Implement support for `.db_default` so that we can specify a function
for generating a UUID on the database rather than in Python, e.g.
[https://docs.djangoproject.com/en/3.2/ref/contrib/postgres/functions/#randomuuid
RandomUUID].
- Work is ongoing in this [https://github.com/django/django/pull/13709
PR] to address #470 which will implement this.
- It is also possible that this
[https://github.com/django/django/pull/11783 PR] to address #30032 is an
acceptable alternative path.
- Make `AutoFieldMixin` part of the public API and document it so that
others can use it for their own non-integer, non-UUID implementations.
- Finally, implement `UUIDAutoField` itself. This is a start, there will
be more pieces to the puzzle:
{{{#!python
class UUIDAutoField(AutoFieldMixin, UUIDField):
def __init__(self, *args, **kwargs):
kwargs['db_default'] = RandomUUID()
super().__init__(*args, **kwargs)
def deconstruct(self):
name, path, args, kwargs = super().deconstruct()
del kwargs['db_default']
return name, path, args, kwargs
def get_internal_type(self):
return 'UUIDAutoField'
def rel_db_type(self, connection):
return UUIDField().db_type(connection=connection)
}}}
It is likely that `AutoFieldMixin.get_db_prep_value()` would need
attention because of `connection.ops.validate_autopk_value()`.
Note that this field would also be limited to databases that have a
function to generate a UUID.
As Mariusz said, implementing a non-integer auto field such as
`UUIDAutoField` would close #17337.
----
Here are some responses to some of the other things mentioned in the
comments above.
> First, the default auto field BigAutoField is not subclassing AutoField
so it's not consistent with the error message.
This is a little confusing as it is done a bit magically for now - see the
bit about `AutoFieldMeta` above.
There was, however, a bug for subclasses of `SmallAutoField` and
`BigAutoField` which was fixed by
45a58c31e64dbfdecab1178b1d00a3803a90ea2d.
> While I'm not sure it would make sense in an AutoField, string primary
keys are possible. For example, you could use a language code (en, fr, de,
etc) as a primary key.
Yes, you can create a textual primary key, but you have to override `id`
for every model, e.g.
{{{#!python
id = models.CharField(max_length=64, primary_key=True)
}}}
It is unlikely that it'd make sense as an "`AutoField`" unless you can
generate the value on the database-side.
> Could I argue that IDs based on human-readable strings, like coolname,
are a valid use case?
>
> It's easier to reference '"soft-cuddly-shrew-of-expertise" than
"a1e1b1c5-5d5e-4eef-9ee9-e74e87ce71d6" or "2467584".
>
> Would this be seen as bad practice and we should use a slugfield that's
not the primary key for something like this instead?
Again, unless this were generated by the database, you'd probably be
better off doing something like:
{{{#!python
def generate_coolname():
... # Implementation here.
class MyModel(models.Model):
id = models.CharField(default=generate_coolname, max_length=64,
primary_key=True)
}}}
Guaranteeing uniqueness is always a fun challenge...
> To my understanding there's no significant performance hit from using
UUID as a primary key. It is 4x larger than int32 (128bit). So also index
is bigger, cache will store less data etc.
I dredged up some articles on this topic I recalled seeing a while back:
- https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/
- https://www.2ndquadrant.com/en/blog/sequential-uuid-generators-ssd/
> And as I said, I'm happy to open a PR as I want to use it with UUIDs in
my next project.
You can use UUID primary keys today. You just have to define a custom
primary key so that the auto field isn't generated:
{{{#!python
class MyModel(models.Model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4,
editable=False)
}}}
Granted, this is less convenient, but it is possible and
[https://docs.djangoproject.com/en/3.2/ref/models/fields/#uuidfield
documented].
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:11>
Comment (by Samuel Bishop):
Replying to [comment:10 Mariusz Felisiak]:
> Note: Fixing this ticket will also resolve #17337.
This would be an interesting development, and it will be great to make
some progress on the nonrel front.
----
Replying to [comment:11 Nick Pope]:
> - It was named `DEFAULT_AUTO_FIELD` to indicate that it referred to the
''automatically generated'' `id` field.
> - This was instead of `DEFAULT_AUTOFIELD` which would imply that it
could only ever be related to `AutoField`.
> - Another suggestion had been `DEFAULT_MODEL_PK`, but we weren't ready
to allow other fields without `AutoField`-like semantics.
> - We could lift the restriction to allow other non-`AutoField` types,
but the concept of an ''automatically generated'' `id` field needs to be
decoupled from `AutoField` a bit.
While the situation currently doesn't support it, having a
`DEFAULT_MODEL_PK` which accepts any field which sets `primary_key=True`
and `unique=True` would be very very useful as there are lots of use
cases, such as legacy and interoperability scenarios, where having the
primary key generation logic in python is going to be necessary, and its
good to hear that the situation is "were not ready" not "we don't want to
do this".
> - Add a `db_generated`/`is_auto` (or another better name) flag to
`AutoFieldMixin`. (See
[https://code.djangoproject.com/ticket/24042#comment:5 my comment]. I'm
not convinced #24042 should have been closed.)
Reading through that, I agree, it feels like closing the ticket was based
on the smallest possible interpretation of the issue.
> Note that this field would also be limited to databases that have a
function to generate a UUID.
This is another one of reasons why I think `DEFAULT_MODEL_PK` (and thus
lifting the need to have the database generate the primary key value),
should be the ultimate goal, as someone maintaining a ULID Field library
for use as a primary key, even if I go the extra mile and write/maintain
multiple independent database functions/stored procedures (and all the
extra management code to ensure they are loaded, updated, etc) for each
supported database (currently Sqlite, MySQL, MariaDB, PostgreSQL,
CockroachDB, with MSSQL on the horizon), I would "still" like to support
generating the ULID value in python code for the broadest possible
compatibility.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:12>
* cc: Brian Helba (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:13>
* cc: raydeal (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:14>
* cc: johnthagen (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:15>
Comment (by raydeal):
In my opinion allowing users to use uuid as DEFAULT_AUTO_FIELD isn't good
unless top db providers (Oracle, MS SQL, Postgres, SQLite) will support it
in db as a kind of sequence and they will introduce uuid as auto primary
key type.
DEFAULT_AUTO_FIELD is dedicated to create id field in every model. I can
imagine some situation where UUID is better than INT as Primary Key, but
in most cases it isn't. If it was, the top db providers would implement it
a long time ago.
UUID is random generated value by default and it doesn't guarantee unique
values like a sequence based on INT does. Generation a value is slower
than from sequence.
AutoField in Django that can be DEFAULT_AUTO_FIELD uses db sequence that
gives 100% unique values.
If UUIDAutoField is created in Django based on @Nick Pope proposal
(RandomUUID), it will be worth to add in documentation that **it doesn't
qurantee that value is unique**.
But, definitely, it is worth to allow users to create a custom AutoField
based on any other field type (as reported in #17337), maybe except
Text/Boolean types, and use it as PK. Anyone can then create their own
UUIDAutoField and take responsibility for uniques.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:16>
* owner: nobody => raydeal
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:17>
* cc: Michael Kuron (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:18>
* owner: raydeal => (none)
* status: assigned => new
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:19>
* cc: Alireza Safari (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:20>
Comment (by Mathieu Poussin):
> "So yes, having a sequence PK (that stays hidden) and a GUID UK (that
customers see) can be a good option"
It depends, on centralized databases this may be fine because everything
is local, however on distributed systems (for example cockroachdb), having
to handle incremental integer is a much slower (and exponentially slower
with more nodes), because you basically need to coordinate all the nodes
that can write data to stop, get the last sequence value, insert your
line, increment the sequence and then unlock it (so you lock insert on the
table during this time, even on others nodes from the cluster).
Example: https://github.com/cockroachdb/cockroach/issues/41258
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:21>
Comment (by raydeal):
Replying to [comment:21 Mathieu Poussin]:
> It depends, on centralized databases this may be fine because everything
is local, however on distributed systems (for example cockroachdb), having
to handle incremental integer is a much slower (and exponentially slower
with more nodes), because you basically need to coordinate all the nodes
that can write data to stop, get the last sequence value, insert your
line, increment the sequence and then unlock it (so you lock insert on the
table during this time, even on others nodes from the cluster).
>
> Example: https://github.com/cockroachdb/cockroach/issues/41258
Good point, in distributed system it could be slower.
Maybe it is only cocroachdb problem? Why algorithm of coordination of
nodes for sequence is slow? is it much slower then uuid or only little?
There is more questions to ask:
- What happen if in more than one node uuid will be the same for different
data? do they have mechanism to sort it out?
- Is there more "writes" or "reads" from db? is "writing" really too slow
in my project?
- why chose distributed database (cockroachdb) instead PostgresSQL (with
sharding)?
and more questions, depending on project, to chose db and avoid headache
in future.
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:22>
* cc: Petr Přikryl (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:23>
* cc: Wilson E. Husin (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:24>
* cc: Luiz Braga (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:25>
* cc: Dmytro Litvinov (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:26>
Small note that we might see a renewed interest for this feature with the
[https://commitfest.postgresql.org/47/4388/ upcoming support for UUIDv7 in
Postgres].
Not that it changes a lot of the arguments made here against allowing this
to be default but it seems that UUIDv7, due to fact it creates less sparse
and fragmented btrees, [https://ardentperf.com/2024/02/03/uuid-benchmark-
war/ performs reasonably well when compared to bigint].
--
Ticket URL: <https://code.djangoproject.com/ticket/32577#comment:27>