By the time I opened the issue ticket I had become convinced that the DB Compiler was effectively an impossible route. I completely agree with your sentiments about implementing Compiler. I'd go as far as to suggest that a few small documentation changes may be warranted to explain to future developers that they should not take this route if their database is not "relational enough".
Using different models has the advantage of being able to exploit the underlying database's capabilities fully. But it sacrifices compatibility with a significant number of existing Django packages, so even putting aside the additional complexity, it's not the target I'm aiming for.
I'm definitely coming to the conclusion that QuerySet is the correct place to start work.
I think this will expose or create a number of issues, such as those related to UUID usage, especially as primary keys, and from just a few minutes re-reading the QuerySet class I also think there may be a need to clarify
So far, I've found the following problems related to UUIDs that might get in the way of 'finishing' this work.
Existing issues:
I've identified one new "issue".
The current QuerySet API methods `.first()` and `.last()` implicitly assume that primary keys are useful for ordering.
I'll raise an issue for this after allowing some time for discussion here, since I'd like a better idea of how these two methods are typically used. I'm currently unsure how often they are called on unordered QuerySet objects. If the implicit fallback to ordering by the primary key is in heavy use, I will need to take that into account. In the shorter term I have a few possible workarounds in mind to replicate the existing behaviour, but their performance implications become significantly more important if the implicit ordering is heavily relied upon. Longer term, it might be good to deprecate this behaviour by documenting that it cannot be relied upon without an integer primary key, and then removing any workarounds that emulate integer-style ordering.
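To make the concern concrete, here is a minimal pure-Python sketch of the fallback. It does not use Django; the `first()` helper and the row dictionaries are hypothetical stand-ins that only mimic "if the queryset is unordered, order by pk", and the UUID values are hand-picked for illustration.

```python
import uuid


def first(rows, ordered=False):
    """Return the first row, falling back to primary-key order if unordered.

    Hypothetical stand-in for the fallback discussed above, not Django code.
    """
    if not ordered:
        rows = sorted(rows, key=lambda row: row["pk"])
    return rows[0] if rows else None


# Both lists are written in insertion order: "a" first, then "b", then "c".
int_rows = [
    {"pk": 1, "name": "a"},
    {"pk": 2, "name": "b"},
    {"pk": 3, "name": "c"},
]
uuid_rows = [
    {"pk": uuid.UUID("c0000000-0000-4000-8000-000000000000"), "name": "a"},
    {"pk": uuid.UUID("10000000-0000-4000-8000-000000000000"), "name": "b"},
    {"pk": uuid.UUID("70000000-0000-4000-8000-000000000000"), "name": "c"},
]

# With auto-incrementing integers, pk order matches insertion order.
first_int = first(int_rows)

# With UUIDs, pk order is unrelated to insertion order.
first_uuid = first(uuid_rows)
```

With integer keys the fallback returns the earliest inserted row; with these UUID keys it returns the second one inserted, which is the kind of surprise existing code relying on the implicit ordering could run into.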
Ticket 6663 was closed quite some time ago. However, to get the most from any attempt to support non-relational databases, via QuerySet or otherwise, it will need to be revisited and either reopened or replaced by a new issue that addresses the point below, which I feel is encompassed by 6663. I hope I can avoid confusion and be clear about what I feel it covers.
The current UUIDField that was recently added to Django is not always suitable for use as a database primary key because:
- The UUIDField generates the UUID with Python code and this is less than optimal in some circumstances. Many databases can or do generate document or row UUID 'primary keys' automatically. It should be possible to let Django defer the creation of the UUID and rely on the database for the creation of UUID primary keys just like it currently does for automatically incrementing integer primary keys.
- Existing Django applications/libraries were not written with UUID primary keys. Supporting existing Django applications and models is one of my goals, so requiring explicit use of something like `id = UUIDField(primary_key=True)` on a model to make it compatible is a problem for me.
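The difference between the two approaches in the first point can be sketched with nothing but the standard library. This is only an illustration under stated assumptions: SQLite stands in for a database that generates keys itself, the table names are made up, and `lower(hex(randomblob(16)))` produces a random 128-bit identifier rather than a standards-compliant UUID.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# Client-side generation: the application creates the key before the INSERT,
# which is what the usual default=uuid.uuid4 pattern amounts to.
conn.execute("CREATE TABLE client_doc (id TEXT PRIMARY KEY, body TEXT)")
client_id = str(uuid.uuid4())
conn.execute(
    "INSERT INTO client_doc (id, body) VALUES (?, ?)", (client_id, "hello")
)

# Database-side generation: the INSERT omits the key and a column default
# fills it in (a random 128-bit value here, standing in for a real UUID).
conn.execute(
    "CREATE TABLE db_doc ("
    " id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),"
    " body TEXT)"
)
cursor = conn.execute("INSERT INTO db_doc (body) VALUES (?)", ("hello",))

# The application has to read the generated key back after the insert.
db_id = conn.execute(
    "SELECT id FROM db_doc WHERE rowid = ?", (cursor.lastrowid,)
).fetchone()[0]
```

The second half is the integer AutoField workflow with a different key type: insert without a key, then learn the key from the database.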
Ticket 6663 was about the ability to use a UUID as the primary key. On the surface this appears solved: we can write `id = UUIDField(primary_key=True)` and we have a UUID as the primary key. What hasn't been addressed is the ability to say "I want to use UUIDs for primary keys". I feel this was the intent behind Ticket 6663, and it should be reopened with an explicit focus on fixing the following two things:
- The default AutoField that Django provides to any model that doesn't explicitly create its own id field should not "force" the use of an automatically incrementing integer-based primary key.
- A mechanism for configuring what kind of primary keys should be used. The two most likely configurations are all integer primary keys and all UUID primary keys, so my initial thoughts are that this mechanism should reside at the public QuerySet API layer, probably as a boolean value set during QuerySet class `__init__`.
In addition, for this to be most effective, there needs to be a way to specify that you want to use an alternative QuerySet class. There are several places where one could override this very easily for one's own application and models, but there is no convenient way to modify the 'default QuerySet' class provided by `Manager`. My first thought is to modify `BaseManager.from_queryset()` here
https://github.com/django/django/blob/stable/1.9.x/django/db/models/manager.py#L143 so that the definition of `Manager` no longer has to explicitly pass QuerySet, as it does here
https://github.com/django/django/blob/stable/1.9.x/django/db/models/manager.py#L238 but the potential impact of such a change is something I'm unfamiliar with, so I would greatly appreciate any feedback on how appropriate this approach would be.
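To show roughly what I mean, here is a simplified, Django-free sketch. The classes below are stand-ins that only mimic the shape of the real ones; the `default_queryset_class` attribute and the optional `queryset_class` argument are hypothetical and do not exist in Django today, and the real `from_queryset()` also copies the QuerySet's methods onto the manager, which is omitted here.

```python
class QuerySet:
    """Stand-in for django.db.models.QuerySet."""

    def __init__(self, model=None):
        self.model = model


class UUIDQuerySet(QuerySet):
    """Hypothetical alternative QuerySet for UUID-keyed backends."""


class BaseManager:
    # Hypothetical hook: used when from_queryset() gets no explicit class.
    default_queryset_class = QuerySet
    _queryset_class = None

    def __init__(self, model=None):
        self.model = model

    @classmethod
    def from_queryset(cls, queryset_class=None, class_name=None):
        # Fall back to the configurable default instead of requiring
        # every caller to pass QuerySet explicitly.
        if queryset_class is None:
            queryset_class = cls.default_queryset_class
        if class_name is None:
            class_name = "%sFrom%s" % (cls.__name__, queryset_class.__name__)
        return type(class_name, (cls,), {"_queryset_class": queryset_class})

    def get_queryset(self):
        return self._queryset_class(model=self.model)


# Manager no longer has to name QuerySet explicitly.
class Manager(BaseManager.from_queryset()):
    pass


# A backend could change the default in one place.
class UUIDBaseManager(BaseManager):
    default_queryset_class = UUIDQuerySet


class UUIDManager(UUIDBaseManager.from_queryset()):
    pass
```

In this sketch `Manager().get_queryset()` still returns a plain `QuerySet`, while `UUIDManager().get_queryset()` returns a `UUIDQuerySet`. Whether a class attribute is the right place for the default is exactly the kind of feedback I'm after.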
- Sam